hrishikeshrt / google_drive_ocr

Perform OCR using Google's Drive API v3

Google OCR (Drive API v3)

Google's Drive API can be used to perform OCR on images containing text in any language. google-drive-ocr is a Python package that lets you do this easily, right from the terminal.

  • Free software: GNU General Public License v3
  • Documentation: https://google-drive-ocr.readthedocs.io.

Features

  • Perform OCR using Google's Drive API v3
  • Class GoogleOCRApplication() for use in projects
  • Highly configurable CLI
  • Run OCR on a single image file
  • Run OCR on multiple image files
  • Run OCR on all images in directory
  • Use multiple workers (multiprocessing)
  • Work on a PDF document directly

Install

To install Google OCR (Drive API v3), run this command in your terminal:

pip install google-drive-ocr

Note: You must set up a Google application and download the client_secret.json file before using google_drive_ocr.

Setup

  • Create a project on Google Cloud Platform
  • Wizard: https://console.developers.google.com/start/api?id=drive

Instructions

  • https://cloud.google.com/genomics/downloading-credentials-for-api-access
  • Select application type as "Installed Application"
  • Create credentials "OAuth consent screen" --> "OAuth client ID"
  • Save client_secret.json
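
Before handing the saved file to the package, it can help to confirm that it is the installed-application kind of credential. The sketch below is not part of google_drive_ocr; the helper name is hypothetical, and it assumes the usual layout of Google's downloaded OAuth client files, where installed-application credentials sit under a top-level "installed" key and web-application credentials under "web".

```python
import json
from pathlib import Path


def is_installed_app_secret(path):
    """Return True if the file looks like an installed-application OAuth client file.

    Hypothetical helper for illustration; google_drive_ocr does not provide it.
    """
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON.
        return False
    # Assumption: installed-application credentials live under "installed".
    section = data.get("installed") if isinstance(data, dict) else None
    return (
        isinstance(section, dict)
        and "client_id" in section
        and "client_secret" in section
    )
```

A False result usually means the wrong application type was selected or the file was not saved where expected.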

Usage

Using in a Project

Create a GoogleOCRApplication instance:

from google_drive_ocr import GoogleOCRApplication

app = GoogleOCRApplication('client_secret.json')

Perform OCR on a single image:

app.perform_ocr('image.png')

Perform OCR on multiple images:

app.perform_batch_ocr(['image_1.png', 'image_2.png', 'image_3.png'])

Perform OCR on multiple images using multiple workers (multiprocessing):

app.perform_batch_ocr(['image_1.png', 'image_3.png', 'image_2.png'], workers=2)
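
In a project, the list of images often comes from a directory rather than being typed out. The following is a minimal sketch of that pattern, assuming only the perform_batch_ocr call and workers argument shown above; collect_images and ocr_directory are hypothetical helper names, not part of the package.

```python
from pathlib import Path


def collect_images(image_dir, extension=".png"):
    """Return sorted paths of files in image_dir whose suffix matches extension."""
    suffix = extension.lower()
    return sorted(
        str(p) for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )


def ocr_directory(app, image_dir, extension=".png", workers=1):
    """Run batch OCR on every matching image.

    `app` is expected to be a GoogleOCRApplication instance (or any object
    with a compatible perform_batch_ocr method).
    """
    images = collect_images(image_dir, extension)
    if images:
        # Same call as in the examples above, with the collected paths.
        app.perform_batch_ocr(images, workers=workers)
    return images
```

Usage would look like ocr_directory(app, 'images/', '.png', workers=2). Note that the paths are sorted as strings, so image_10.png sorts before image_2.png.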

Using Command Line Interface

Typical usage with several options:

google-ocr --client-secret client_secret.json \
--upload-folder-id <google-drive-folder-id>  \
--image-dir images/ --extension .jpg \
--workers 4 --no-keep
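
When the CLI is driven from a script, the same invocation can be assembled as an argument list. This is only a sketch: the function and its parameter names are hypothetical, it uses nothing beyond the flags shown in the command above, and FOLDER_ID stands in for a real Google Drive folder ID.

```python
import shlex


def build_google_ocr_command(client_secret, upload_folder_id, image_dir,
                             extension=".jpg", workers=4, keep=False):
    """Assemble the google-ocr invocation shown above as an argv list."""
    command = [
        "google-ocr",
        "--client-secret", client_secret,
        "--upload-folder-id", upload_folder_id,
        "--image-dir", image_dir,
        "--extension", extension,
        "--workers", str(workers),
    ]
    if not keep:
        # keep=False simply appends the --no-keep flag from the example.
        command.append("--no-keep")
    return command


command = build_google_ocr_command("client_secret.json", "FOLDER_ID", "images/")
print(shlex.join(command))  # shell-quoted form, handy for logging
# To actually run it: import subprocess; subprocess.run(command, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with paths that contain spaces.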

Show help message with the full set of options:

google-ocr --help

Configuration

The default location for the configuration file is ~/.gdo.cfg.
If a set of options is written to this location,
those options need not be specified again on subsequent runs.

Save configuration and exit:

google-ocr --client-secret client_secret.json --write-config ~/.gdo.cfg

Read configuration from a custom location (if it was written to a custom location):

google-ocr --config ~/.my_config_file ..
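
A wrapper script can apply the same logic when deciding which options to pass. The sketch below is an illustration only: base_command is a hypothetical helper, and it assumes that a configuration file found at the default location already contains the client-secret option, which it does not verify.

```python
from pathlib import Path

# Default configuration location named in the text above.
DEFAULT_CONFIG = Path("~/.gdo.cfg").expanduser()


def base_command(config_path=None, client_secret="client_secret.json",
                 default_config=DEFAULT_CONFIG):
    """Return the leading google-ocr arguments for a wrapper script."""
    if config_path is not None:
        # A custom configuration file was requested explicitly.
        return ["google-ocr", "--config", str(config_path)]
    if Path(default_config).is_file():
        # Assumption: the saved configuration already holds the needed options.
        return ["google-ocr"]
    # No saved configuration: pass the client secret on the command line.
    return ["google-ocr", "--client-secret", client_secret]
```

The OCR options, such as -i image.png, would then be appended to the returned list.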

Performing OCR

Note: The examples below assume that the client-secret option is saved in the configuration file.

Single image file:

google-ocr -i image.png

Multiple image files:

google-ocr -b image_1.png image_2.png image_3.png

All image files from a directory with a specific extension:

google-ocr --image-dir images/ --extension .png

Multiple workers (multiprocessing):

google-ocr -b image_1.png image_2.png image_3.png --workers 2

PDF files:

google-ocr --pdf document.pdf --pages 1-3 5 7-10 13
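
The --pages argument mixes single pages and ranges. As a rough illustration of how such a specification reads, the sketch below expands it into individual page numbers. It assumes ranges are inclusive, which is the usual convention, and it is not the tool's own parser; expand_pages is a hypothetical name.

```python
def expand_pages(specs):
    """Expand page specs such as ["1-3", "5"] into a flat list of page numbers."""
    pages = []
    for spec in specs:
        if "-" in spec:
            start, end = (int(part) for part in spec.split("-", 1))
            if start > end:
                raise ValueError(f"invalid page range: {spec}")
            # Assumption: both ends of the range are included.
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(spec))
    return pages


print(expand_pages("1-3 5 7-10 13".split()))
# prints [1, 2, 3, 5, 7, 8, 9, 10, 13]
```

Under that reading, the command above would process nine pages of document.pdf.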