STOIK Document Image Enhancement SDK - Improve OCR Quality of Document Images
Prepare document images taken with scanners, digital cameras or camera phones for text recognition, and improve OCR quality by correcting common issues in fully automatic and semi-automatic modes.
STOIK Document Image Enhancement SDK is designed to keep paperless workflows running smoothly by pre-processing document images, making them easier for text recognition (OCR) programs to handle and enhancing the visual quality of documents.
Features at a Glance
The SDK removes digital noise, corrects common exposure problems, rotation and geometric distortion, cleans the background and extracts the document area, so that subsequent OCR runs more smoothly, with higher recognition quality and fewer text recognition errors. STOIK Document Image Enhancement SDK also enhances the text legibility of document images and reduces their size. The features include:
- Automatic, highly accurate document area extraction
- Geometric distortion correction
- 3D perspective distortion correction (e.g. trapezoid distortion) with different interpolation schemes
- Automatic image alignment and rotation
- Automatic noise reduction
- Adaptive sharpening and de-blurring of noisy documents and images with background patterns
- Automatic brightness equalization
- Automatic boost of brightness and contrast for easier text recognition
- Adaptive local contrast enhancement in grayscale documents
- Adaptive binarization equalizing bad lighting conditions
- Automatic cleaning of document background
- Whitening document background
- Threshold operation for producing strictly black/white documents with character thickness control and spot/line removal
- Common pre-sets include A4/letter, postcard, business card, ID card, receipt, etc.
Pricing and Availability
For a price quotation and an evaluation version of STOIK Document Image Enhancement SDK, please contact us.
STOIK Document Image Enhancement SDK is currently available for:
- 32-bit and 64-bit versions of Microsoft Windows;
- Apple iOS;
- Symbian S60 and Symbian S^3;
- Android (v2.0 and higher).
The mobile version of STOIK Document Image Enhancement SDK offers quality, fast operation combined with resource-savvy performance. The latest release further boosts the performance of the mobile version.
Adaptive Local Contrast Enhancement in Grayscale Documents
The function makes text and fine detail more legible by automatically increasing local contrast without affecting digital noise or introducing image artefacts. Two parameters are available to fine-tune the performance for different font sizes and background noise levels.
Sample images (100% zoom, taken with Apple iPhone 4):
- Original (1.9 Mb)
- Processed with Adaptive Local Contrast Enhancement (1.2 Mb)
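The idea of raising local contrast can be illustrated with a minimal sketch. It is not the SDK's algorithm or API: the function names and the `radius` and `gain` parameters are hypothetical stand-ins for the kind of tuning described above, and the image is assumed to be a list of rows of 0..255 gray values. Unlike the SDK function as described, this naive version would also amplify noise.

```python
def local_mean(img, radius):
    """Mean of each pixel's (2*radius+1)^2 neighbourhood, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            total = sum(img[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def enhance_local_contrast(img, radius=1, gain=2.0):
    """Push each pixel away from its local mean by `gain`, clamped to 0..255.

    `radius` and `gain` are illustrative parameters, not SDK settings.
    """
    means = local_mean(img, radius)
    return [
        [min(255, max(0, round(m + gain * (p - m)))) for p, m in zip(row, mrow)]
        for row, mrow in zip(img, means)
    ]
```

A larger `radius` would suit larger fonts, while a lower `gain` limits how much background noise is amplified along with the text.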
Cleaning Document Background
This feature automatically detects and removes distracting background elements such as texture, dots and lines, as well as photocopying and scanning artefacts common in documents such as ID cards and passports, bank checks, tickets, and so on.
Cleaning the background before processing the document with an OCR program can dramatically increase text recognition quality. In one case, a security suite component that captures passport information saw an almost 50% increase in text recognition quality.
The feature also effectively removes heavy noise and vertical and horizontal lines from microfiche scans. It is specifically adapted for black-and-white image processing, and can also be useful for preprocessing images that contain tabular data.
Whitening Document Background
Whitening the document background is yet another means of reducing document size and improving text legibility in scanned documents. The function makes the whole document background white while presenting text characters in black, even if inverse fonts (bright text on a dark background) or light colors were originally used.
The function can be adjusted for different strengths and character sizes. The feature comes in handy when faxing or printing complex documents, as it extends cartridge life and significantly reduces transmission time and costs.
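As a rough illustration of the behaviour described above, the sketch below whitens a grayscale page and handles inverse text by checking whether the page is mostly dark. It is an assumption-laden toy, not the SDK's method: the function names are hypothetical, `strength` only loosely mirrors the adjustable strength mentioned above, polarity is decided once for the whole page rather than locally, and the image is a list of rows of 0..255 gray values.

```python
def median_level(img):
    """Median gray level; on a typical page this is the background level."""
    flat = sorted(p for row in img for p in row)
    return flat[len(flat) // 2]


def whiten_background(img, strength=0.5):
    """Return a page with a pure white background and pure black text.

    `strength` is a hypothetical knob: higher values turn more of the
    mid-tones white.
    """
    if median_level(img) < 128:
        # Mostly dark page: assume bright text on a dark background and invert.
        img = [[255 - p for p in row] for row in img]
    cutoff = median_level(img) * (1.0 - strength)
    return [[0 if p < cutoff else 255 for p in row] for row in img]
```

Because the output contains only two levels, it also compresses well, which is in line with the size reduction mentioned above.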
Document Area Extraction
This function automatically detects the document area in the document image and determines the coordinates of the document corners. Extracting the document area from a document image can reduce image size and make the document appear larger and more legible on screen for full-size viewing. The latest release further enhances Document Area Extraction, giving it much higher accuracy when documents have a patterned background or very low contrast.
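To make the notion of "coordinates of document corners" concrete, here is a deliberately naive sketch. It assumes a bright document on a dark, plain background and simply picks the extreme bright pixels along the two diagonals; the function name and the `level` parameter are hypothetical, and this heuristic would fail on exactly the patterned or low-contrast backgrounds that the SDK is described as handling.

```python
def document_corners(img, level=128):
    """Return [top-left, top-right, bottom-right, bottom-left] as (x, y) tuples.

    Assumption: pixels with value >= `level` belong to the document.
    Returns None when no such pixel exists.
    """
    pts = [(x, y) for y, row in enumerate(img) for x, p in enumerate(row) if p >= level]
    if not pts:
        return None
    top_left = min(pts, key=lambda p: p[0] + p[1])      # smallest x + y
    bottom_right = max(pts, key=lambda p: p[0] + p[1])  # largest x + y
    top_right = max(pts, key=lambda p: p[0] - p[1])     # largest x - y
    bottom_left = min(pts, key=lambda p: p[0] - p[1])   # smallest x - y
    return [top_left, top_right, bottom_right, bottom_left]
```

The four points returned here are the kind of input a perspective correction step needs.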
Geometric and Perspective Distortion Correction
Distortion correction fixes document perspective distortions by ensuring the four corners of the document fit into a rectangular area.
The latest release adds 3D perspective correction, applying 3D compensation of perspective distortions. Perspective distortions are unavoidable when shooting documents at an angle with camera phones; STOIK Document Image Enhancement SDK helps developers obtain a plain, rectangular version of a document even if the image is severely misaligned and features a trapezoid distortion at the same time.
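Fitting four document corners into a rectangle is commonly modelled as a planar projective transform (homography). The sketch below shows that general technique in plain Python; it is not the SDK's implementation, and all function names are hypothetical. It solves the standard eight-equation linear system for the 3x3 matrix with the last entry fixed to 1.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return 9 row-major matrix entries mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve(a, b) + [1.0]


def warp_point(h, x, y):
    """Apply the homography to one point."""
    d = h[6] * x + h[7] * y + h[8]
    return ((h[0] * x + h[1] * y + h[2]) / d, (h[3] * x + h[4] * y + h[5]) / d)
```

To resample a whole image, one would typically compute the inverse mapping, `homography(dst, src)`, visit every output pixel, and sample the source with nearest-neighbour or bilinear interpolation; the choice of interpolation trades speed against smoothness.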
Auto Align and Rotation
This feature automatically detects whether portrait or landscape orientation should be used, and ensures the document is correctly aligned vertically and horizontally.
Automatic Noise Reduction and Sharpening
The fully automatic mode removes digital noise while preserving and enhancing text detail and edges. Several presets and manual adjustments are available to fine-tune the performance.
This feature is based on the powerful noise reduction engine used in STOIK Noise AutoFix, and adapted to document images. Specific algorithms are included for certain image capturing devices such as Nokia smartphones and iPhone 4.
The newest release adds several configuration options, allowing developers to control the level and scale of sharpening depending on the size and noisiness of document images.
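As a point of reference for edge-preserving noise removal, the sketch below implements a plain median filter, a textbook technique that removes isolated speckles while keeping sharp steps such as character edges. It is not the STOIK Noise AutoFix engine, and the function name and `radius` parameter are hypothetical.

```python
def median_filter(img, radius=1):
    """Replace each pixel with the median of its neighbourhood (clipped at borders)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = sorted(
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            )
            row.append(window[len(window) // 2])
        out.append(row)
    return out
```

A single dark speck on white paper disappears, while a straight black-to-white edge passes through unchanged.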
Brightness Equalization
Brightness equalization automatically reduces brightness variations throughout the document by making dark (e.g. printed text) and bright (e.g. white paper) areas appear equally lit even in mixed lighting conditions, such as shots taken in a library. Brightness equalization can also enhance the visual quality of images appearing in the document.
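One simple way to even out lighting, shown here only as an illustrative sketch, is to estimate the local paper brightness as the brightest pixel in a neighbourhood and divide each pixel by it. The function name and `radius` parameter are hypothetical, the approach assumes every neighbourhood contains some blank paper, and nothing here is taken from the SDK itself.

```python
def equalize_brightness(img, radius=2):
    """Rescale each pixel relative to the brightest pixel nearby (assumed to be paper)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            paper = max(
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            )
            # Guard against an all-black neighbourhood.
            row.append(round(255 * img[y][x] / paper) if paper else 0)
        out.append(row)
    return out
```

With this normalisation, the same text photographed in bright light and in shadow ends up with the same gray levels.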
Brightness and Contrast Correction
Brightness and contrast enhancement automatically corrects document brightness and contrast to provide the best image quality. Manual adjustment is an option.
Producing a fully legible black-and-white copy of a color document has never been easy. The newly added adaptive de-saturation employs a locally adaptive color-to-grayscale conversion, optionally inverting the colors in the process.
Adaptive Binarization
Adaptive binarization further enhances document processing, performing reliable black-and-white conversion regardless of lighting and shooting conditions. Documents converted with this technique produce fewer recognition errors when put through the OCR process.
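Why a local threshold copes with uneven lighting can be seen in a short sketch. The version below compares each pixel with the mean of its neighbourhood, computed from an integral image, in the spirit of the well-known Bradley and Roth approach. It is a generic illustration, not the SDK's algorithm; the function name and the `radius` and `t` parameters are assumptions.

```python
def adaptive_binarize(img, radius=7, t=0.15):
    """Mark a pixel black if it is more than `t` (a fraction) darker than its local mean."""
    h, w = len(img), len(img[0])
    # Integral image with a zero border: integ[y][x] is the sum of img[0..y-1][0..x-1].
    integ = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        run = 0
        for x in range(w):
            run += img[y][x]
            integ[y + 1][x + 1] = integ[y][x + 1] + run
    out = []
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        row = []
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            total = integ[y1][x1] - integ[y0][x1] - integ[y1][x0] + integ[y0][x0]
            mean = total / ((y1 - y0) * (x1 - x0))
            row.append(0 if img[y][x] < mean * (1 - t) else 255)
        out.append(row)
    return out
```

A single global threshold of 128 would turn the whole of a dimly lit line black, whereas the local comparison recovers the same character shape at either brightness.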
Threshold Control
Threshold control converts images into strict black and white with no shades of gray or smooth gradients. The function can be used to adjust the thickness of lines and characters, and to remove black spots and lines of smaller size.
Threshold control is essential for quality text recognition, and should be used right after cleaning document background. Combined with other features, threshold control provides maximum suppression of unwanted detail, and produces clean, legible and easily recognizable documents.
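The combination of a hard threshold with spot removal can be sketched as follows. This is an illustrative toy rather than the SDK's function: the name and the `level` and `min_size` parameters are hypothetical, blobs are found with a plain 4-connected flood fill, and character thickness control is not covered.

```python
from collections import deque


def threshold_and_despeckle(img, level=128, min_size=3):
    """Binarize at `level`, then erase ink blobs smaller than `min_size` pixels."""
    h, w = len(img), len(img[0])
    ink = [[p < level for p in row] for row in img]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not ink[y][x] or seen[y][x]:
                continue
            # Flood-fill one 4-connected blob of ink pixels.
            blob, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and ink[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) < min_size:
                for by, bx in blob:
                    ink[by][bx] = False  # too small to be text: treat as a spot
    return [[0 if v else 255 for v in row] for row in ink]
```

In this sketch a three-pixel stroke survives while an isolated dark pixel is removed; a real pipeline would choose `min_size` relative to the expected character size.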