OCR Server for Form Processing

—————————————————————————————————-

“ExperVision® has one big advantage: SPEED. This corporate-level OCR application processes faster than any product of its type we’ve ever tested: It converted a scanned image of a 700-page book into an editable Word file in a startling 6 minutes! ExperVision® is worth considering for enterprise-level high-volume, high-speed OCR.”

PC Magazine is a registered trademark of Ziff Davis Publishing Holdings Inc.

—————————————————————————————————-

OCR is increasingly applied in the form recognition field to help people automatically process paper forms such as insurance claims, medical forms, applications and resumes, invoices and receipts, orders and checks, accounting and asset records, tax returns, business cards, working logs and worksheets. Many of our customers have run into the same problem: “We buy Forms Processing Software that is alleged to handle all the forms, but apparently it does not work for many of our forms.”

The reason for this problem is that forms processing and document processing are essentially different. Document processing software needs to recognize the full content of a page quickly and accurately and export the whole-page results in PDF or other popular formats. Forms processing does something different: a Forms Processing Solution (FPS) should pay attention only to the “areas of interest” on the page, recognize the contents of those areas, and export the results to application databases. To do this, the following problems need to be resolved:

  • How can the user define the areas of interest on a standard form image?
  • How are the area-of-interest definitions managed for the same user or the same group of users?
  • How does the software find, or “locate”, the areas of interest on other images of the same form? The difficulty comes from variance between images in darkness, size, shift, skew, etc., caused by the physical condition of the paper forms and by the scanning process.
  • How are the contents of the located areas of interest recognized? This is a typical OCR function, handled by an engine such as OpenRTK®.
  • How is the recognized text mapped to the fields of each record in the application database?
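The questions above can be made concrete with a small sketch. It is not ExperVision®'s implementation and it does not use the OpenRTK® API. The names Zone, locate and build_record are hypothetical, the recognizer is a stand-in callback, and the sketch assumes that a scan differs from the template only by a uniform scale, a rotation about the image origin (skew) and a shift.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """One area of interest, in template-image pixel coordinates."""
    field: str      # hypothetical database column fed by this zone
    x: float
    y: float
    width: float
    height: float


def locate(zone, dx=0.0, dy=0.0, skew_deg=0.0, scale=1.0):
    """Map a template zone onto a scanned image.

    Rotates the four corners about the origin, scales and shifts them,
    and returns the bounding box (left, top, right, bottom).
    """
    angle = math.radians(skew_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    corners = [
        (zone.x, zone.y),
        (zone.x + zone.width, zone.y),
        (zone.x, zone.y + zone.height),
        (zone.x + zone.width, zone.y + zone.height),
    ]
    moved = [
        (scale * (cx * cos_a - cy * sin_a) + dx,
         scale * (cx * sin_a + cy * cos_a) + dy)
        for cx, cy in corners
    ]
    xs = [px for px, _ in moved]
    ys = [py for _, py in moved]
    return (min(xs), min(ys), max(xs), max(ys))


def build_record(template, recognize, **placement):
    """Run the recognizer on every located zone; return {field: text}."""
    return {zone.field: recognize(locate(zone, **placement)) for zone in template}


# Illustrative template and a fake recognizer standing in for an OCR engine.
template = [Zone("invoice_no", 400, 50, 150, 30), Zone("total", 400, 700, 150, 30)]
fake_ocr = lambda box: f"text@{box[0]:.0f},{box[1]:.0f}"
record = build_record(template, fake_ocr, dx=12, dy=-8)
```

In a real system the shift, skew and scale would have to be estimated from each scanned image, which is the hard part the text describes. Here they are simply passed in.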

To address the issues above, a solution needs, at the very minimum, a complex and “smart” module in the FPS, which could be called Form Layout Understanding (FLUS). The biggest problem customers find is that a FLUS module can only be designed and developed for a particular form or group of forms. In other words, the FLUS must be developed specifically for each customer to solve that customer’s own forms processing problem.

The conclusion is that no off-the-shelf software can provide a satisfactory solution for every customer.

ExperVision®’s solution is as follows:

  • In its early years, ExperVision® developed the proprietary in-house GTS Block® technology in our champion OCR engine, OpenRTK®, to support page layout analysis functions.
  • ExperVision® has developed an FPS framework for general customer needs in forms processing, based on OpenRTK®, with the FLUS module supported by the GTS Block® technology.
  • For each customer, our R&D team specifically trains the intelligent FLUS module and customizes the other modules in the FPS framework, resulting in a newly developed FPS that works for that customer’s particular business needs.

ExperVision®’s FPS has been continuously improved over the past decade and successfully applied to many customers’ business processes.

OpenRTK® SDK 6.0 / 7.0

ExperVision® provides the OCR SDK OpenRTK®, currently in versions 6.0 and 7.0, which is based on our award-winning OCR technology, research and development work, and customization experience over the past two decades. It has the following unique characteristics and advantages:

“Overall, ExperVision Recognition Toolkit (the OCR engine of ExperVision) performed the best in this year’s test (among OmniPage, WordScan and other OCR software). It demonstrated consistently high accuracy. It performs especially well on proportional pitch text, and is least affected by low resolution (200 dpi). It also provides an excellent automatic zoning capability.”

UNLV is a registered trademark of the University of Nevada, Las Vegas.
DOE is a registered trademark of the U.S. Department of Energy.

FLUS® Module

ExperVision®’s Forms Processing Solution includes a FLUS Module, which provides a framework for form pre-processing and form layout analysis. It can be customized for particular forms to meet the customer’s business needs. It also helps users define the attributes of each area on form images, e.g. type of information, data property, value ranges, judging conditions, geographic constraints, etc., with which the OCR engine will work more accurately and effectively.
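As an illustration of how area attributes such as type of information and value ranges could be used to check recognized text, here is a minimal sketch. It is not the FLUS Module's actual data model. FieldRule, PATTERNS and check are hypothetical names, and the three field kinds are assumptions chosen for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldRule:
    """Attributes of one area of interest (hypothetical structure)."""
    name: str
    kind: str                           # "digits", "amount" or "text"
    min_value: Optional[float] = None   # optional value range
    max_value: Optional[float] = None


# One full-string pattern per field kind.
PATTERNS = {
    "digits": re.compile(r"\d+"),
    "amount": re.compile(r"\d+(\.\d{1,2})?"),
    "text": re.compile(r".+"),
}


def check(rule, text):
    """Return a list of problems found in one recognized value (empty if none)."""
    text = text.strip()
    if not PATTERNS[rule.kind].fullmatch(text):
        return [f"{rule.name}: {text!r} is not a valid {rule.kind} value"]
    problems = []
    if rule.kind in ("digits", "amount"):
        value = float(text)
        if rule.min_value is not None and value < rule.min_value:
            problems.append(f"{rule.name}: {text} is below {rule.min_value}")
        if rule.max_value is not None and value > rule.max_value:
            problems.append(f"{rule.name}: {text} is above {rule.max_value}")
    return problems


age_rule = FieldRule("age", "digits", min_value=0, max_value=120)
```

Calling check(age_rule, "4Z") reports a type problem, which is the kind of signal that can send a field back for re-recognition or manual review.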
Please contact our consultant to request the complete solution.

GTS Block®

Included in OpenRTK®, GTS Block® is a unique technology that analyzes the page layout bottom-up from the word level. It was developed by ExperVision®’s R&D team in the early 1990s. Essentially different from the ordinary paragraph layout analysis technologies employed by other OCR vendors, GTS Block® constitutes a solid base on which intelligent “area locating” algorithms can be developed for forms processing. We have employed GTS Block® in the FLUS Module of ExperVision®’s FPS, where it is particularly helpful in customizing the system for customers’ needs in the forms processing field.
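To show what “bottom-up from the word level” means in general terms, the following sketch groups word bounding boxes into text lines by vertical overlap. It is a generic textbook-style illustration, not the GTS Block® algorithm, whose details are not described here. The function name and the overlap threshold are assumptions.

```python
def group_words_into_lines(words, min_overlap=0.5):
    """Group word boxes into text lines, bottom-up.

    `words` is a list of (text, left, top, right, bottom) tuples. A word joins
    an existing line when their vertical extents overlap by at least
    `min_overlap` of the shorter of the two heights; otherwise it starts
    a new line.
    """
    lines = []
    # Visit words from top to bottom, then left to right.
    for word in sorted(words, key=lambda w: (w[2], w[1])):
        _, _, top, _, bottom = word
        for line in lines:
            overlap = min(bottom, line["bottom"]) - max(top, line["top"])
            shorter = min(bottom - top, line["bottom"] - line["top"])
            if shorter > 0 and overlap / shorter >= min_overlap:
                line["words"].append(word)
                line["top"] = min(line["top"], top)
                line["bottom"] = max(line["bottom"], bottom)
                break
        else:
            lines.append({"top": top, "bottom": bottom, "words": [word]})
    # Within each line, order the words by their left edge.
    return [
        " ".join(w[0] for w in sorted(line["words"], key=lambda w: w[1]))
        for line in lines
    ]


words = [
    ("World", 60, 11, 100, 31),
    ("Hello", 10, 10, 50, 30),
    ("Total", 10, 50, 50, 70),
]
lines = group_words_into_lines(words)
```

A full layout analyzer would repeat the same idea one level up, merging lines into blocks, and those blocks are what an area-locating step can then match against a form template.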
Please contact our consultant to request the complete solution.

Form Template Training

ExperVision® cares about the performance of its Forms Processing Solution in the customer’s actual business environment. We carefully study the customer’s forms processing workflow, examine a large number of real form images, analyze the key issues that arise during processing, make adjustments, and train flexible format templates so that the FLUS module works for the customer. Give us your form examples, and our R&D team will customize ExperVision®’s FPS and guarantee that it works to your satisfaction.

1. Unique features of OpenRTK®:

  • A unique mathematical model that keeps font information for a high recognition rate and speed,
  • A data-driven recognition scheme for easy inclusion of additional languages, fonts and/or character sets,
  • Automatic training of the OCR engine based on the data-driven scheme,
  • Backtrack control and many other optimization algorithms fused into the engine,
  • A thoughtful architecture that can take in complementary methods and/or modules, so that performance keeps improving along with market needs, etc.

2. Open Data Structure: Unlike other OCR vendors, who provide only general APIs, we open our internal data structures to developers. This gives you rich engine information, so you can develop a unique OCR application for your own business.

3. Architecture & APIs: An introduction to the 200 APIs based on TypeReader® 2008, the latest version, which has OpenRTK® integrated to realize the most flexible desktop and/or server OCR functions. Some code samples are provided for developers to call OpenRTK® through the API.

4. Application Cases: OpenRTK® has been widely applied to DIM (Document Imaging Management), FPS (Form Processing Solution), Embedded and Server Applications.

5. Multi-OS Support: OpenRTK® currently runs on more than ten operating systems and can be ported to any others at the customer’s request.

6. Extended OCR Service: ExperVision provides extended OCR R&D services to meet each customer’s unique business needs.

TextProofer Module

OCR servers have become more and more popular in enterprise applications, where many users work with them concurrently. When receiving results from the OCR server, each user may need a proofreading tool to make sure the final output is 100% correct. ExperVision’s award-winning TextProofer® is such a proofreading tool, which may be of interest to you. Following is a list of its basic features.

  • display both image and OCR result on screen;
  • mark suspected characters for the user to verify;
  • allow the user to make easy corrections;
  • export the OCR results with common or special file formats; and
  • provide software interface for integration with other applications.
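The “mark suspected characters” feature can be pictured with a minimal sketch. It assumes that the OCR engine reports a confidence value between 0 and 1 for each character, and that anything under a threshold is flagged. This is an assumption made for the example, not a description of how TextProofer® decides, and mark_suspects is a hypothetical name.

```python
def mark_suspects(chars, threshold=0.80):
    """Flag low-confidence characters for the proofreader.

    `chars` is a list of (character, confidence) pairs. Returns the text with
    each suspect character wrapped in square brackets, plus the suspects'
    positions in the original sequence.
    """
    marked = []
    suspects = []
    for index, (char, confidence) in enumerate(chars):
        if confidence < threshold:
            suspects.append(index)
            marked.append(f"[{char}]")
        else:
            marked.append(char)
    return "".join(marked), suspects


ocr_output = [("T", 0.99), ("0", 0.55), ("t", 0.97), ("a", 0.98), ("l", 0.60)]
text, positions = mark_suspects(ocr_output)
```

With this input the second and fifth characters fall below the threshold, so the proofreader's attention goes straight to the likely “0”/“o” and “l”/“1” confusions.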

TextProofer® can be customized for OpenRTK® clients to their specifications in the DIM (Document Imaging Management) and FPS (Forms Processing Solution) fields, since both TextProofer® and OpenRTK® are integral parts of TypeReader.

Core Technology R&D Service

ExperVision provides extensive core technology service in OCR and Forms Processing fields. For more details, please send your request to OCR_Consulting_Team@ExperVision.com.
