OCR Extraction

Content Type: Module

Categories: Utility, Data

Overview

This component provides a ready-to-use OCR (Optical Character Recognition) solution for Mendix applications using the OCR.Space API. It allows applications to extract readable text from uploaded images in a reliable and configurable manner.

The module accepts images as Base64 input and performs automatic preprocessing to improve recognition quality. It includes blur detection to identify low-quality images before OCR execution and returns the extracted text along with a calculated confidence score to indicate result quality.
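The listing does not say how the blur check is implemented. One common approach is the variance of a Laplacian filter over the grayscale image, where a low variance suggests few sharp edges. The sketch below illustrates that idea only; the class name `BlurCheck`, its method names, and any threshold passed in are hypothetical and are not taken from the module.

```java
import java.awt.image.BufferedImage;

public class BlurCheck {

    /** Variance of a 4-neighbour Laplacian over the grayscale image. */
    public static double laplacianVariance(BufferedImage img) {
        int w = img.getWidth();
        int h = img.getHeight();
        if (w < 3 || h < 3) {
            return 0.0; // too small to have any interior pixels
        }
        // Convert to grayscale using the usual luma weights.
        double[][] gray = new double[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                gray[y][x] = 0.299 * r + 0.587 * g + 0.114 * b;
            }
        }
        // Accumulate the Laplacian response over interior pixels.
        double sum = 0.0;
        double sumSq = 0.0;
        long n = 0;
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                double lap = gray[y - 1][x] + gray[y + 1][x]
                           + gray[y][x - 1] + gray[y][x + 1]
                           - 4.0 * gray[y][x];
                sum += lap;
                sumSq += lap * lap;
                n++;
            }
        }
        double mean = sum / n;
        return sumSq / n - mean * mean;
    }

    /** The threshold is an assumption; tune it on sample images from your own app. */
    public static boolean isBlurry(BufferedImage img, double threshold) {
        return laplacianVariance(img) < threshold;
    }
}
```

A suitable threshold depends on image size and content, so it is best calibrated against a handful of known sharp and known blurry uploads.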

The component is designed for multi-user and multi-environment usage. No API key is bundled with the module. Users must supply their own OCR.Space API key through Mendix App Settings, enabling secure per-environment configuration (Local, Test, Acceptance, Production).

This module is suitable for use cases such as document digitization, ID and invoice scanning, form processing, and image-based data extraction. It is optimized for English-language text and works best with clear, well-aligned images using standard fonts.

The solution is implemented as a Java action and can be integrated into existing Mendix microflows without additional dependencies.

Documentation

Typical usage scenario

This module allows Mendix applications to extract readable text from images using the OCR.Space API.

Typical use cases include:

Uploading ID cards, invoices, bills, receipts, certificates, or forms
Extracting text from scanned documents or photos
Automatically converting image content into searchable and editable text
Showing extracted text and confidence score in a popup or page
Using OCR results for validation, automation, or further processing

This solution is useful in:

Document digitization systems
Invoice or receipt processing apps
Any Mendix app requiring image-to-text conversion

Features and limitations

Features

Extracts text from images using the OCR.Space API
Supports Base64 image input
Automatic image preprocessing (resize, format handling)
Returns:
- Extracted text
- Estimated OCR confidence score (%)
- Confidence level (Low / Medium / High)
Designed for multi-user Mendix applications
API key is configurable per environment
Clean Java Action implementation
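To illustrate how a confidence percentage might map to the Low / Medium / High level listed above, here is a minimal sketch. The cut-off values (50 and 80) and the class name `ConfidenceLevel` are assumptions for illustration; the module's actual thresholds are not documented here.

```java
public class ConfidenceLevel {

    /** Maps a confidence percentage to a coarse level. Thresholds are hypothetical. */
    public static String fromPercent(double percent) {
        // Clamp to the valid range so out-of-range inputs still get a level.
        double p = Math.max(0.0, Math.min(100.0, percent));
        if (p >= 80.0) {
            return "High";
        }
        if (p >= 50.0) {
            return "Medium";
        }
        return "Low";
    }
}
```

In a microflow, the level can then drive simple decisions, for example asking the user to re-upload when the result is Low.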

Limitations

OCR accuracy depends on:
- Image quality
- Lighting
- Text clarity and font
The confidence score is an estimated quality indicator, not an exact measure of accuracy
Java ImageIO does not support WEBP format by default
Requires an active internet connection (cloud OCR)
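Because format support depends on which ImageIO readers are installed in the runtime, it can help to check before decoding. The sketch below uses only standard `javax.imageio` and `java.util.Base64` calls; the class and method names (`ImageFormatCheck`, `canDecode`, `decodeBase64`) are hypothetical and not part of the module.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Base64;
import javax.imageio.ImageIO;

public class ImageFormatCheck {

    /** True if an ImageIO reader is registered for the given file suffix, e.g. "png" or "webp". */
    public static boolean canDecode(String suffix) {
        return ImageIO.getImageReadersBySuffix(suffix).hasNext();
    }

    /** Decodes plain Base64 (no data-URI prefix) into an image, failing loudly if no reader matches. */
    public static BufferedImage decodeBase64(String base64) throws IOException {
        byte[] bytes = Base64.getDecoder().decode(base64);
        // ImageIO.read returns null when no registered reader understands the data.
        BufferedImage img = ImageIO.read(new ByteArrayInputStream(bytes));
        if (img == null) {
            throw new IOException("Unsupported or corrupt image format");
        }
        return img;
    }
}
```

Calling `canDecode("webp")` on a stock JRE is expected to return false, which matches the limitation above; converting such uploads to JPG or PNG before OCR avoids the problem.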

Dependencies

OCR.Space API key (free or paid)
Mendix Studio Pro 9.x or above
Java Runtime Environment (default Mendix runtime)
Internet access from Mendix runtime

No third-party Mendix modules are required.

Installation

Download the module from the Mendix Marketplace
Import the module into your Mendix project
Resolve any consistency errors
Add the module to your project dependencies
Deploy the application once to initialize the module

Configuration

Obtain an API key from OCR.Space
Store the API key in one of the following:
- The app constant OCRExtraction.OCRSpaceApiKey (App Settings → Constants), or
- An environment variable (recommended for production)
Configure the image upload entity to pass:
- File document
- Base64 image data
Call the provided Java Action from:
- Microflow
- Button click
- Popup action
Map the output:
- Extracted text
- Confidence percentage
- Confidence level
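For readers curious about what the Java action has to send, the sketch below builds a request in the shape described by the public OCR.Space documentation as I understand it: a POST to `https://api.ocr.space/parse/image` with the key in an `apikey` header and the image as a data URI in the `base64Image` form field. The class `OcrSpaceRequest` and its methods are hypothetical and are not the module's own code; the timeout value is also an assumption.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

public class OcrSpaceRequest {

    static final String ENDPOINT = "https://api.ocr.space/parse/image";

    /** Builds the URL-encoded form body; the image is sent as a data URI. */
    public static String formBody(String base64, String mimeType) {
        String dataUri = "data:" + mimeType + ";base64," + base64;
        return "language=eng"
             + "&base64Image=" + URLEncoder.encode(dataUri, StandardCharsets.UTF_8);
    }

    /** Builds the POST request; the API key travels in the "apikey" header. */
    public static HttpRequest build(String apiKey, String base64, String mimeType) {
        if (apiKey == null || apiKey.isBlank()) {
            throw new IllegalArgumentException("OCR.Space API key is not configured");
        }
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .timeout(Duration.ofSeconds(30)) // assumed value, adjust as needed
                .header("apikey", apiKey)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(base64, mimeType)))
                .build();
    }

    /** Sends the request and returns the raw JSON response body. */
    public static String send(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Failing fast on a blank key makes a missing per-environment configuration visible immediately instead of surfacing as an opaque API error.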

Known bugs

WEBP image format is not supported by default Java ImageIO
Very low-quality images may result in:
- Low confidence score
- Partial text extraction
Large images may increase response time
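One way to limit the impact of large images is to downscale them before upload. The following is a minimal sketch of that idea, assuming a caller-chosen maximum side length; the class name `ImageResize` and the method `fitWithin` are hypothetical, and the module's own resize step may work differently.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

public class ImageResize {

    /** Scales the image down so its longest side is at most maxSide, preserving aspect ratio. */
    public static BufferedImage fitWithin(BufferedImage src, int maxSide) {
        if (maxSide < 1) {
            throw new IllegalArgumentException("maxSide must be positive");
        }
        int w = src.getWidth();
        int h = src.getHeight();
        int longest = Math.max(w, h);
        if (longest <= maxSide) {
            return src; // already small enough, never upscale
        }
        double scale = (double) maxSide / longest;
        int newW = Math.max(1, (int) Math.round(w * scale));
        int newH = Math.max(1, (int) Math.round(h * scale));
        BufferedImage out = new BufferedImage(newW, newH, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            // Bilinear interpolation keeps text edges smoother than nearest-neighbour.
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, newW, newH, null);
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

Shrinking too aggressively can make small print unreadable, so the limit is a trade-off between response time and recognition quality.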

Releases

Version: 1.0.0

Framework Version: 9.24.34

Release Notes:

Initial release of the OCR Extraction module using the OCR.Space API.

Features:

• Extracts text from images using OCR.Space

• Supports Base64 image input (JPG, PNG)

• Includes automatic image preprocessing and blur detection

• Returns extracted text with a calculated confidence score

• Designed for multi-user usage with per-environment API key configuration

Language & Font Support:

• Optimized for English (ENG) text recognition

• Best accuracy achieved with clear printed English text

• Decorative, handwritten, or non-English text may result in lower accuracy

Configuration:

• Users must provide their own OCR.Space API key

• Set the API key in App Settings → Constants → OCRExtraction.OCRSpaceApiKey

• Restart the application after updating the key

Notes:

• No API key is bundled with this module

• Compatible with free OCR.Space trial keys