Extract Text from Scanned Copy or PDF using Google Docs

There are many free programs that extract text from images using OCR technology. If you don’t want to install new software on your computer, try Google Docs to pull the text out of a scanned copy (JPEG, PNG, etc.) of a document or out of a PDF file.

Sometimes we need to send information as plain text because of attachment restrictions or other limitations. For example, if you want to send your university results to a friend and email attachments are not allowed on your campus computer, you can’t send the scanned file itself. Don’t worry, you can still send the information using OCR (Optical Character Recognition), which lets you pull the text out of any scanned image, including your mark sheet. In another scenario, OCR is great for turning scanned pages of your favorite printed book into a text-format eBook.

Google Docs gives every user a free upload limit of 1 GB. That means you can upload and save up to 1 GB of documents and files in Google Docs for free, and you can also access them from your Android phone. This may change after the expected release of Google Drive, where you may get 5 GB of free storage. To extract the text from an image or PDF, select the option named “Convert text from PDF and image files to Google documents” while uploading the file to Docs.


Once the upload completes, open the uploaded file in Google Docs. The extracted text appears below the image.

That’s it. You can copy the text from that Google Docs file and paste it into the compose window of your mailbox. For books, you can upload all the scanned images at once by selecting them together and choosing the “Convert text from PDF and Image” option. Each image is uploaded as its own file, and you can copy the text from those files to reuse in ePub or another eBook format.
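For the book scenario, the text copied from each page has to be put back together in page order. Here is a minimal Python sketch, assuming you have saved each page’s extracted text to a local file with a numbered name such as page1.txt, page2.txt (the file names and the helper functions are hypothetical, not something Google Docs produces for you):

```python
import re
from pathlib import Path
from typing import Dict, List, Union


def natural_key(name: str) -> List[Union[int, str]]:
    """Split a file name into text and number parts so page10 sorts after page2."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def combine_pages(pages: Dict[str, str]) -> str:
    """Join per-page text in natural file-name order, separated by blank lines."""
    ordered = sorted(pages, key=natural_key)
    return "\n\n".join(pages[name].strip() for name in ordered)


def combine_folder(folder: str, pattern: str = "*.txt") -> str:
    """Read every matching text file in a folder and combine them in page order."""
    pages = {p.name: p.read_text(encoding="utf-8")
             for p in Path(folder).glob(pattern)}
    return combine_pages(pages)
```

Calling combine_folder on the folder that holds the saved pages returns one string you can paste into an eBook editor. Sorting by the number inside the name, rather than alphabetically, keeps page10 from landing before page2.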

Language Support

English is the default language for the “Convert text from PDF and Image” option in Google Docs, but you can select the language used in the document so that the text is recognized correctly. Google Docs OCR supports all 38 languages available in Google Translate. As you know, translation in Google Translate is not perfect yet, and Google Docs uses the same technology, so you may not get a perfect extraction in languages other than English. It is recommended to review the extracted text before using it.
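To speed up that review, a small script can point out words that look like typical OCR slips. This is only a rough Python sketch: the two patterns are heuristics assumed here for illustration, not rules used by Google Docs, and they will miss some errors and flag some correct words.

```python
import re

# Heuristic patterns (assumptions for illustration only).
# A token that contains both a Latin letter and a digit, e.g. "t0tal".
MIXED = re.compile(r"(?=.*[A-Za-z])(?=.*\d)")
# Any character that is not a letter, digit, whitespace, or common punctuation.
ODD = re.compile(r"[^\w\s.,;:!?'\"()\-’‘“”]")


def suspicious_tokens(text):
    """Return the whitespace-separated tokens that match either heuristic."""
    flagged = []
    for token in text.split():
        if MIXED.search(token) or ODD.search(token):
            flagged.append(token)
    return flagged


if __name__ == "__main__":
    # Prints the tokens worth a second look in a short sample.
    print(suspicious_tokens("The t0tal is 42 marks"))
```

Paste the extracted text in place of the sample string and check each flagged word against the original scan.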

Sanjeev Mishra is a professional blogger and an Internet Marketing Consultant based in India. He built Internet Techies to bring you updates on technology and web applications.
