Google Docs Adds OCR: Converts Images and PDFs to Text
By Pallab De on June 22nd, 2010

Optical character recognition (OCR) utilities are a handy breed of applications that extract text from images. Unfortunately, most desktop solutions are prohibitively expensive for casual use. Free online OCR services offer a way out with simple, quick conversion. Sure, they don’t have all the features you can find in a full-fledged desktop suite, but they are sufficient for most users.

Google Docs is the latest online service to get OCR capabilities. Google has been experimenting with OCR in its Docs API since last year, but the feature was added to the Docs frontend only a little while ago. Now, while uploading any document, you are given the option to “convert text from PDF or image files to Google Docs documents”.

To test the OCR service, I used an image extracted from an AV-Comparatives report. While Google managed to detect most of the text correctly, it failed to retain any formatting. It correctly detected even non-dictionary words like ‘Kingsoft’, but failed to detect special characters like ‘&’ and superscripts.

[Image: Sample Document Used]
[Image: Output Returned by Google Docs]
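
For readers who want to put a number on a test like this one, here is a minimal sketch that compares a hand-typed reference text against OCR output using Python's standard `difflib` module. The function names `ocr_similarity` and `dropped_characters` are hypothetical, and the sample strings are made up for illustration; they are not the text from the test image above.

```python
from difflib import SequenceMatcher


def ocr_similarity(expected: str, recognized: str) -> float:
    """Return a 0..1 similarity ratio between reference text and OCR output."""
    return SequenceMatcher(None, expected, recognized, autojunk=False).ratio()


def dropped_characters(expected: str, recognized: str) -> list[str]:
    """List the reference substrings that the OCR output deleted or replaced."""
    matcher = SequenceMatcher(None, expected, recognized, autojunk=False)
    return [
        expected[i1:i2]
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
        if tag in ("delete", "replace")
    ]


# Hypothetical strings for illustration, not the actual test document.
expected = "Kingsoft & Partners"
recognized = "Kingsoft  Partners"

print(f"similarity: {ocr_similarity(expected, recognized):.3f}")
print("dropped:", dropped_characters(expected, recognized))
```

A character-level ratio like this only measures text accuracy. It says nothing about lost formatting, which has to be checked by eye.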

Overall, the accuracy was quite good and, to be honest, a lot better than I had expected. However, there are obvious limitations: formatting is lost, and special characters and superscripts are missed. Nevertheless, it’s a handy addition to an already impressive service.

Author: Pallab De
Pallab De is a blogger from India who has a soft spot for anything techie. He loves trying out new software and spends most of his day breaking and fixing his PC. Pallab loves participating in the social web; he has been active in technology forums since he was a teenager and is an active user of both Twitter (@indyan) and Facebook.

Pallab De can be contacted at pallab@techie-buzz.com.

Copyright 2006-2012 Techie Buzz. All Rights Reserved. Our content may not be reproduced on other websites.