For years we have been using computers with OCR and ICR software to read documents. Today these software programs have matured. Accuracy has increased to very acceptable levels and costs have dropped. Many organizations are receiving benefits and improved productivity with OCR/ICR technologies.
However, OCR and ICR still have limits on what they can read. OCR reads only machine-printed characters. ICR is severely limited when it comes to human handwriting: characters must be hand-printed separately, one per box.
Today new technology has arrived that will have a major impact on reading documents. This technology is known as Natural Handwriting Recognition (NHR). NHR allows computers to read and recognize handwriting with a high degree of accuracy. Unlike its cousins OCR and ICR, NHR uses a complex series of algorithms to compare and recognize each character. Along with the field image, NHR needs knowledge of what it is reading, just as a human could not read this article without knowing the language it is written in. That knowledge takes the form of a dictionary of possible values for the field and a definition of the field type, such as Name or SS#, as described below.
The following outlines the basic NHR tasks:
Form Identification: This task consists of identifying certain expected features on each form image presented for recognition. The output of the task is either the rejection of the form as unrecognizable or a set of locations of key features that identified the form as acceptable for further processing.
Field Isolation: This task consists of extracting the text image of each data field from the form. The output of this task consists of one or more images of text minus the surrounding portions of the form.
Segmentation: This task consists of breaking each image of text into smaller units for recognition. The output of this task is one or more image segments. Each segment is either the image of an isolated character, the image of an isolated piece of a character, or the image of an isolated group of connected or otherwise under-segmented characters. In most cases each segment will be the image of an isolated character or symbol.
Recombining Segments: This task consists of selecting various combinations of segments as plausible candidates for isolated character images. The output of this task is one or more isolated character-image candidates.
Recognition: This task consists of assigning relative confidences to all of the allowed classes for each character-image candidate. The output of this task is either a single class or a set of ordered pairs consisting of a character class and its associated confidence. This output is called raw HWR (handwriting recognition) to emphasize that it has been generated without the help of any context other than that existing in the isolated character-image candidates.
Organizing Character Candidates: This task consists of organizing the output of the Recombining Segments and Recognition tasks into a form useful for the dictionary input stage. The output of this task is just the output of those tasks, in a format suitable for the particular dictionary look-up method being used.
Dictionary-Based Correction: This task consists of selecting the dictionary entries that best match the properly organized character-image candidates according to some set criteria. The output of this task is the hypothetical answer provided by the HWR system as its final result and a confidence for that field.
Level of Acceptance: This task consists of comparing the final result's confidence level with set accuracy levels. The output of this task is either a rejection (the confidence level is too low) or an acceptance of the final result as the conversion of the written data.
Rejection of the Result: This task consists of transmitting the field image and the final result to a workstation for human correction or acceptance.
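The Recognition and Dictionary-Based Correction tasks can be illustrated with a minimal Python sketch. It is not ParaScript's method, which this article does not describe. It assumes that segmentation has already produced exactly one character-image candidate per position, that raw HWR output is a list of class-to-confidence mappings, and that a dictionary entry is scored by multiplying the confidences of its letters. The function name dictionary_correct and the floor value are hypothetical.

```python
from math import prod

def dictionary_correct(raw, dictionary, floor=0.01):
    """Return (best dictionary entry, field confidence).

    raw: one dict per character position, mapping each allowed class
         to its relative confidence (the "raw HWR" output).
    floor: assumed confidence for a class the recognizer did not propose.
    """
    scores = {}
    for word in dictionary:
        if len(word) != len(raw):
            continue  # this sketch ignores alternative segmentations
        scores[word] = prod(pos.get(ch, floor) for pos, ch in zip(raw, word))
    if not scores:
        return "", 0.0
    best = max(scores, key=scores.get)
    # Field confidence: the best score as a share of all matching entries.
    return best, scores[best] / sum(scores.values())

# Hypothetical raw output for a three-character field.
raw = [
    {"C": 0.6, "G": 0.4},
    {"A": 0.7, "O": 0.3},
    {"T": 0.9, "I": 0.1},
]
word, confidence = dictionary_correct(raw, ["CAT", "COT", "GOT", "DOG", "CART"])
print(word, round(confidence, 2))
```

Normalizing over the matching entries is only one possible way to turn scores into a field confidence. A real system would also have to weigh competing segmentations, which is the job of the Recombining Segments task.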
The output of an NHR system will be a word from a dictionary and a confidence level (the degree of confidence that the answer is accurate). The answer can be accepted or rejected at this point. The rejected answer can be sent to a workstation along with the field image for human correction or acceptance. Manual input may also be used for those words that can only be read by a human.
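The accept-or-reject decision can be sketched in a few lines of Python. The FieldResult type, the route function, and the 0.9 threshold are illustrative assumptions; the article does not state what acceptance level a real installation would use.

```python
from dataclasses import dataclass

@dataclass
class FieldResult:
    field_id: str      # hypothetical identifier of the field image
    answer: str        # word chosen from the dictionary
    confidence: float  # 0.0 to 1.0

def route(results, accept_threshold=0.9):
    """Split results into those accepted automatically and those sent,
    with their field images, to a workstation for human review."""
    accepted, review_queue = [], []
    for result in results:
        if result.confidence >= accept_threshold:
            accepted.append(result)
        else:
            review_queue.append(result)
    return accepted, review_queue

results = [
    FieldResult("name-001", "SMITH", 0.97),
    FieldResult("name-002", "JONES", 0.62),
    FieldResult("name-003", "BROWN", 0.91),
]
accepted, review_queue = route(results)
print([r.field_id for r in accepted])      # accepted without review
print([r.field_id for r in review_queue])  # need human verification
```

Raising the threshold lowers the error rate among accepted answers at the cost of sending more fields to operators, so the setting is a trade-off each application has to make.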
It should be pointed out that an answer with a 50% to 75% confidence level can still be correct and would only need to be verified. Even at a 50% confidence level the productivity for key entry will be increased by 50%, lowering by one half the number of human workstations needed. A recent test was conducted using 500 images under conditions similar to an office application. The test used the National Institute of Standards and Technology's (NIST) Special Database 1, which contains real data (words written by 500 independent writers). Words averaged 3 symbols per field, and the dictionary held 50 entries.
In this test we showed a 98% recognition rate per field. The reject rate (the percentage of images to be processed by a human operator) was only 2%. The error rate was 1%. Our experience shows that a typical human operator makes 0.7% to 1% errors per field in such applications as mail sorting and check processing. Hand keying this information took 32 man-hours. The computer performed this test in 58 minutes.
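The figures reported for this test imply a speed-up that is easy to work out. The short Python calculation below uses only the numbers quoted above; the variable names are illustrative.

```python
# Figures reported for the test.
manual_hours = 32     # time taken to hand key the data
machine_minutes = 58  # time the computer took
images = 500          # images in the test
reject_rate = 0.02    # share of images sent to a human operator

speedup = manual_hours * 60 / machine_minutes
rejected_images = round(images * reject_rate)

print(f"The computer was about {speedup:.1f} times faster than hand keying")
print(f"Images still needing an operator: {rejected_images}")
```

In other words, the computer finished roughly 33 times faster than hand keying, and about 10 of the 500 images would still be passed to an operator.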
One of the most advanced NHR products today is ParaScript, developed by ParaGraph International. ParaScript began in Russia in the late 1960s, when the first program for handwritten text recognition was written for the SM4 minicomputer. There was then a long break in this work until the mid-1980s, when the problem was brought to life again. At that time the USSR Academy of Science hosted scientific seminars in mathematics and biology led by a prominent scientist, I.M. Gel'fand, academician of the Academy of Science and professor of mathematics. The seminars were very popular among capable young scientists and programmers who were keen on solving new, complicated problems.
Among the audience was a group of young scientists, called by I.M. Gel'fand "a group for solving unsolvable problems," who decided to write a program for human handwriting recognition. These people, led by Stephan Pachikov, ParaGraph's President and CEO, were among those present at the outset of ParaGraph International. In a few months the program was able to read a page of handwritten text. It was the first step in developing ParaScript technology.
Applications for NHR:
Postal Address Recognition: City-State-Zip SDK is an MS Windows SDK developed on the basis of the ParaScript NHR technology. It can recognize cursive, hand-printed, and mixed cursive/hand-printed City-State-Zip address fields. Once the individual elements of the fields have been identified, it achieves its highest accuracy levels by "double checking" them through cross validation with the US Postal Service (USPS) standard databases.
The SDK can recognize City, State, and Zip values automatically with more than 99% accuracy, or it will recommend manual processing when the field is illegible. In that case an operator can process the address conventionally. This reduces the manual work needed for processing. Even if the recognizer is not confident of an answer, the answer can be correct. Usually 50% to 75% of the answers sent to a human operator are correct.
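The idea of "double checking" an address against a reference database can be sketched in Python as follows. This is not the City-State-Zip SDK's API, which the article does not show. ZIP_TABLE is a tiny illustrative stand-in for the USPS data, and the function name and the product-of-confidences scoring are assumptions.

```python
from itertools import product

# Tiny illustrative stand-in for a postal reference database.
ZIP_TABLE = {
    "10001": ("NEW YORK", "NY"),
    "60601": ("CHICAGO", "IL"),
    "90210": ("BEVERLY HILLS", "CA"),
}

def cross_validate(city_cands, state_cands, zip_cands):
    """Pick the (city, state, zip) combination that agrees with ZIP_TABLE
    and has the highest product of confidences.

    Each argument is a list of (value, confidence) candidates.
    Returns (None, 0.0) when no combination is consistent, which is the
    case where manual processing would be recommended.
    """
    best, best_score = None, 0.0
    for (city, pc), (state, ps), (zip_code, pz) in product(
        city_cands, state_cands, zip_cands
    ):
        if ZIP_TABLE.get(zip_code) == (city, state):
            score = pc * ps * pz
            if score > best_score:
                best, best_score = (city, state, zip_code), score
    return best, best_score

# Hypothetical recognizer output: the top city reading is a misspelling.
city_cands = [("CHICAQO", 0.5), ("CHICAGO", 0.4)]
state_cands = [("IL", 0.8), ("IN", 0.2)]
zip_cands = [("60601", 0.6), ("80601", 0.4)]

print(cross_validate(city_cands, state_cands, zip_cands))
```

Here the recognizer's first choice for the city is not a valid value, but the second choice agrees with the state and Zip candidates, so cross validation recovers the correct address.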
The performance of City-State-Zip SDK is as follows:
More than 70% reading rate for 99% accuracy on real mail streams
More than 80% reading rate for 99% accuracy on NIST Special Database
Average recognition speed of City-State-Zip SDK:
1-2 seconds per address using a 90 MHz Pentium
0.5-1 second per address using a 150 MHz DEC Alpha 21064a
Check Reading: Check SDK locates the legal amount field (the amount written in words) and recognizes the amount in it, and additionally locates the courtesy amount field (the amount written in numerals) and recognizes the amount in it. These two fields are then cross validated to ensure the appropriate confidence of the recognition. To deliver superior recognition accuracy, parts of the legal and courtesy fields are also cross validated.
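Cross validation of the two amount fields can be sketched in Python. This is only an illustration of the idea, not Check SDK's algorithm. It assumes that each field recognizer returns candidate amounts (in cents) with confidences, and that an amount is scored by multiplying the two confidences. The function name is hypothetical.

```python
def cross_validate_amounts(legal, courtesy):
    """Return (amount_in_cents, score) for the amount both fields support best.

    legal, courtesy: dicts mapping a candidate amount in cents to the
    confidence assigned by that field's recognizer.
    Returns (None, 0.0) when the two fields share no candidate amount.
    """
    agreed = {
        amount: legal[amount] * courtesy[amount]
        for amount in legal.keys() & courtesy.keys()
    }
    if not agreed:
        return None, 0.0
    best = max(agreed, key=agreed.get)
    return best, agreed[best]

# Hypothetical recognizer output for one check.
legal = {12500: 0.55, 12600: 0.30, 72500: 0.15}     # amount written in words
courtesy = {17500: 0.50, 12500: 0.45, 72500: 0.05}  # amount written in numerals

print(cross_validate_amounts(legal, courtesy))
```

In this example the courtesy field alone would have picked 175.00, but only 125.00 is well supported by both fields, so that is the amount returned. The score is an unnormalized product, so it would need calibrating before being compared with an acceptance level.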
The performance of check reading is as follows:
More than 60%-65% reading rate for 99% accuracy on real checks.
Recognition of the legal amount field and cross validation of courtesy and legal amount fields improves reading rate up to 78% - 85%.
Other areas such as Forms Recognition, Medical Records, Insurance Claims, Fax Recognition and Routing, Batch Indexing, and Legacy Data Capture are all performed with similar results. As time goes forward new applications will be designed using HWR technology, giving increased productivity and some relief from the mounds of paper we all work with each day.
C. Edward Rawson, CIM - Document Management Consultant,
Technology Lead - NOAA Legacy Data Rescue Research Project
can be reached at cerawson@ovnet.com or 304-564-5805