Not Your Father’s OCR

By NE Docs | January 10, 2019

You need to edit the text in a document, but the only copy you have is a bitmap file or a paper version. What can you do? You can type all the text in by hand, which is painfully tedious if there is a lot of text. Or, you can use Optical Character Recognition (OCR) to scan and convert the document to editable text.

Some Technological History

The idea behind OCR has been around since the 1870s (see Wikipedia’s “Timeline of optical character recognition”). Early in its history, it was used to help the blind read. By the 1930s, inventors had built machines that converted printed characters into telegraph code or read text out loud. Later uses included reading coupons, postal addresses, price tags, and passports. The technology eventually became a way to work with text that was not already in digital format. OCR has come a long way from its optical-mechanical origins to serve us in the digital age.

About 25 years ago, OCR reached a much wider audience through commercially available software, such as Adobe Acrobat, that could scan and extract text from an image or paper file. This saved the time otherwise spent retyping all that text. However, the results were often unsatisfactory, with character “guesses” and misspellings. Graphics and symbols caused problems as well.

Today, OCR technology has advanced significantly: it can scan images and quickly produce editable text.

Overcoming OCR Challenges

OCR is a huge timesaver. However, extracting information from existing paper documents can be challenging. For example, the document may be faded from exposure to light, or it may contain tears. The quality of the font, its contrast with the paper color, and other text patterns all affect the results of the OCR scan.
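To see why contrast matters, here is a minimal Python sketch of two common cleanup steps that can be applied to a scan before character recognition: stretching the contrast and then converting the image to pure black and white. It is an illustration only, not a description of any particular OCR product. The function names, the threshold of 128, and the tiny 3x3 "scan" are all assumptions made up for this example.

```python
def stretch_contrast(pixels):
    """Rescale grayscale values so the darkest pixel becomes 0 and the lightest 255."""
    flat = [value for row in pixels for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A completely uniform image has no contrast to stretch.
        return [row[:] for row in pixels]
    scale = 255 / (hi - lo)
    return [[round((value - lo) * scale) for value in row] for row in pixels]


def binarize(pixels, threshold=128):
    """Map every pixel to 0 (ink) or 255 (paper) using a fixed threshold (an assumed value)."""
    return [[0 if value < threshold else 255 for value in row] for row in pixels]


# A made-up faded scan: the "ink" (150) is only slightly darker than the "paper" (190).
faded = [
    [190, 150, 190],
    [190, 150, 190],
    [190, 150, 190],
]

without_stretch = binarize(faded)                # the faint ink is lost entirely
with_stretch = binarize(stretch_contrast(faded)) # the ink column survives
```

In this toy case, thresholding the faded scan directly turns every pixel white, while stretching the contrast first preserves the dark column. Real scans call for more robust methods, but the principle is the same.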

Working with an expert in document management and OCR technology will increase your likelihood of retrieving the content from those aged or less-than-pristine documents.

Using Smart OCR

NEdocs scanning services and software programs implement what we call “smart OCR,” using artificial intelligence (AI). This technology does more than recognize alphanumeric characters and symbols: it scans and learns formatting and patterns, including such items as icons, logos, bars, lines, graphics, and even handwriting, based on a huge knowledge database that keeps growing.

Content on paper such as invoices, packing slips, labels, and even brochures can be easily captured, indexed, and converted to clean, editable PDF or Word files by simply scanning the documents, referencing the database, and applying the various AI rules and algorithms.

AI simplifies your interaction with documents by removing manual steps you would otherwise have to perform. The AI engine can quickly learn a variety of attributes about your documents: language, types, structure, and context. It can also be configured to read low-resolution scans, photocopies, faded documents, and damaged paper documents. The scan process can include your validation of the results, followed by automatic integration with your existing document storage system.
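As a rough illustration of the validation step, the Python sketch below routes recognized fields either to automatic acceptance or to a person for review, based on a confidence score. This is a hypothetical example, not the NEdocs implementation: the RecognizedField class, the split_for_review function, the 0.90 cutoff, and the sample invoice values are all assumptions invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class RecognizedField:
    name: str          # label of the field on the document
    text: str          # text the (hypothetical) OCR engine produced
    confidence: float  # engine's confidence in the text, from 0.0 to 1.0


def split_for_review(fields, min_confidence=0.90):
    """Separate fields that can be accepted as-is from those a person should check."""
    accepted = [f for f in fields if f.confidence >= min_confidence]
    needs_review = [f for f in fields if f.confidence < min_confidence]
    return accepted, needs_review


# Made-up results from scanning an invoice; the total contains a likely misread ("S" for "5").
fields = [
    RecognizedField("invoice_number", "INV-1042", 0.98),
    RecognizedField("total", "$1,2S0.00", 0.61),
]

accepted, needs_review = split_for_review(fields)
```

Here the invoice number would pass straight through to storage, while the low-confidence total would be shown to a person to confirm or correct.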

Today’s “smart” OCR technology is helping many companies go paperless, even while they complete all their required “paperwork”!

For further information about using OCR powered by the magic of artificial intelligence, give our document specialists a call at (603) 625-1171.
