Universal OCR: Image to Text

Content Type: Widget

Categories: Utility, Visualization, Artificial Intelligence

Overview

About

Universal OCR is a pluggable web widget that extracts text from images directly in the browser. Users can drag & drop files, browse images, capture photos using the camera, or paste directly from the clipboard — no server‑side setup required.

Use Cases

Digitize invoices, receipts, and documents by snapping a photo or uploading a scan
Extract text from screenshots, whiteboards, handwritten notes, or ID cards
Build self‑service portals where users upload images and text automatically populates form fields
Automate data entry workflows by chaining the On Complete event to a microflow

How It Works

The widget preprocesses every image before recognition (auto resize, grayscale conversion, auto‑contrast adjustment, sharpening, and adaptive thresholding) to improve OCR accuracy.
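
As a rough illustration of three of these steps (grayscale, auto‑contrast, adaptive thresholding), the sketch below works on a raw RGBA pixel buffer such as the one a browser canvas exposes. The function names, the window radius, and the offset are assumptions made for this example, not the widget's actual source. Resizing and sharpening are left out for brevity.

```typescript
// Illustrative helpers only; names and default parameters are assumptions.

/** Convert an RGBA buffer (4 bytes per pixel) to one luminance byte per pixel. */
function toGrayscale(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const gray = new Uint8ClampedArray(rgba.length / 4);
  for (let i = 0; i < gray.length; i++) {
    // ITU-R BT.601 luma weights; the typed array rounds and clamps on assignment.
    gray[i] = 0.299 * rgba[i * 4] + 0.587 * rgba[i * 4 + 1] + 0.114 * rgba[i * 4 + 2];
  }
  return gray;
}

/** Stretch the darkest pixel to 0 and the brightest to 255. */
function autoContrast(gray: Uint8ClampedArray): Uint8ClampedArray {
  let min = 255;
  let max = 0;
  for (let i = 0; i < gray.length; i++) {
    if (gray[i] < min) min = gray[i];
    if (gray[i] > max) max = gray[i];
  }
  const out = new Uint8ClampedArray(gray.length);
  if (max === min) {
    out.set(gray); // flat image: nothing to stretch
    return out;
  }
  for (let i = 0; i < gray.length; i++) {
    out[i] = ((gray[i] - min) * 255) / (max - min);
  }
  return out;
}

/** White (255) if a pixel is brighter than its local mean minus `offset`, else black (0). */
function adaptiveThreshold(
  gray: Uint8ClampedArray,
  width: number,
  height: number,
  radius: number = 7,
  offset: number = 10
): Uint8ClampedArray {
  const out = new Uint8ClampedArray(gray.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let sum = 0;
      let count = 0;
      // Average the neighbours inside the window, skipping positions outside the image.
      for (let dy = -radius; dy <= radius; dy++) {
        for (let dx = -radius; dx <= radius; dx++) {
          const ny = y + dy;
          const nx = x + dx;
          if (ny < 0 || ny >= height || nx < 0 || nx >= width) continue;
          sum += gray[ny * width + nx];
          count++;
        }
      }
      out[y * width + x] = gray[y * width + x] > sum / count - offset ? 255 : 0;
    }
  }
  return out;
}

/** Chain the three steps in the order the text lists them. */
function preprocess(rgba: Uint8ClampedArray, width: number, height: number): Uint8ClampedArray {
  return adaptiveThreshold(autoContrast(toGrayscale(rgba)), width, height);
}
```

The thresholding step compares each pixel with its neighbourhood rather than with one global cut‑off, which is why it tends to cope better with unevenly lit photos. The naive window loop here is chosen for readability, not speed.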

The processed image is then sent to the OCR.space cloud API for high‑accuracy text extraction.
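
For orientation, a request to OCR.space and the handling of its response might look roughly like the sketch below. The endpoint and field names (apikey, base64Image, language, OCREngine, ParsedResults, ParsedText, IsErroredOnProcessing, ErrorMessage) follow the public OCR.space API as best understood here and should be checked against its current documentation. The helper names are hypothetical, and how the widget itself builds the call is not stated in this listing.

```typescript
// Hypothetical helpers; they only assemble and interpret data and perform no network call.

interface OcrSpaceRequest {
  url: string;
  headers: Record<string, string>;
  fields: Record<string, string>;
}

/** Describe a POST to the OCR.space parse endpoint for a base64-encoded PNG. */
function buildOcrSpaceRequest(
  apiKey: string,
  base64Png: string,
  language: string = "eng",
  engine: number = 2
): OcrSpaceRequest {
  return {
    url: "https://api.ocr.space/parse/image",
    headers: { apikey: apiKey },
    fields: {
      base64Image: `data:image/png;base64,${base64Png}`,
      language: language,
      OCREngine: String(engine),
    },
  };
}

// Only the response fields used below are modelled.
interface OcrSpaceResponse {
  IsErroredOnProcessing?: boolean;
  ErrorMessage?: string | string[] | null;
  ParsedResults?: Array<{ ParsedText?: string }> | null;
}

/** Join the text of all parsed results, or throw if the API reported an error. */
function extractParsedText(res: OcrSpaceResponse): string {
  if (res.IsErroredOnProcessing) {
    const message = Array.isArray(res.ErrorMessage)
      ? res.ErrorMessage.join("; ")
      : (res.ErrorMessage ?? "unknown error");
    throw new Error(`OCR.space error: ${message}`);
  }
  return (res.ParsedResults ?? [])
    .map((r) => r.ParsedText ?? "")
    .join("\n")
    .trim();
}
```

In a browser, the fields would typically be copied into a FormData object and sent with fetch, and the JSON body of the reply passed to extractParsedText.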

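If no API key is configured, or if all keys are exhausted, the widget automatically falls back to Tesseract.js, which performs recognition entirely inside the browser, so the image is not sent to any server. The Tesseract.js library itself is loaded from a CDN at runtime.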
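
The fallback order described above can be sketched as a small function that tries each key in turn and only then uses the local engine. The types and the names recognize, CloudOcr, and LocalOcr are invented for this illustration. The two OCR backends are passed in as plain functions, so the sketch makes no claim about how the widget calls OCR.space or Tesseract.js internally.

```typescript
// Hypothetical types: a cloud call either succeeds or reports an exhausted key.
type CloudResult =
  | { status: "ok"; text: string }
  | { status: "quota_exhausted" };

type CloudOcr = (image: string, apiKey: string) => Promise<CloudResult>;
type LocalOcr = (image: string) => Promise<string>;

interface OcrOutcome {
  text: string;
  source: "cloud" | "local";
}

/** Try each API key in order; fall back to the in-browser engine if none succeeds. */
async function recognize(
  image: string,
  apiKeys: string[],
  cloud: CloudOcr,
  local: LocalOcr
): Promise<OcrOutcome> {
  for (const key of apiKeys) {
    const result = await cloud(image, key);
    if (result.status === "ok") {
      return { text: result.text, source: "cloud" };
    }
    // This key has hit its quota: move on to the next one.
  }
  // No keys configured, or every key exhausted: use the local engine.
  return { text: await local(image), source: "local" };
}
```

Returning the source alongside the text lets the calling code tell the user whether the image left the browser.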

Multi‑Key Support (Unlimited Free OCR)

OCR.space provides 25,000 free OCR requests per month per API key, with no credit card required.

The widget supports multiple API keys, separated by semicolons. When one key reaches its monthly limit, the widget seamlessly switches to the next available key.

Example:
- 4 free API keys = 100,000 OCR requests per month
- Total cost: ₹0 / $0
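
The semicolon‑separated setting and the capacity arithmetic can be illustrated with the short sketch below. The function names are hypothetical. Trimming whitespace and dropping empty or duplicate entries are assumptions about sensible handling, not documented widget behaviour.

```typescript
// Free-tier allowance per key, as quoted in this listing.
const FREE_REQUESTS_PER_KEY = 25000;

/** Split "key1;key2;key3" into a clean list of distinct keys, preserving order. */
function parseApiKeys(setting: string): string[] {
  const keys: string[] = [];
  for (const part of setting.split(";")) {
    const key = part.trim();
    if (key !== "" && keys.indexOf(key) === -1) {
      keys.push(key);
    }
  }
  return keys;
}

/** Combined monthly allowance across all configured keys. */
function monthlyCapacity(keys: string[]): number {
  return keys.length * FREE_REQUESTS_PER_KEY;
}
```

For example, monthlyCapacity(parseApiKeys("key1;key2;key3;key4")) evaluates to 100000, matching the figure above.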

How to Get Free API Keys

Go to: https://ocr.space/ocrapi/freekey
Enter your email address and submit the form
Check your inbox for a confirmation email from OCR.space
Click the confirmation link
Your free API key will be sent to your email
- No credit card
- No billing
- No expiry

You can repeat this process with different email addresses to obtain multiple keys.

Paste them into the widget configuration separated by semicolons:

key1;key2;key3;key4

Features

Supports 30+ languages, including:
- English, Hindi, French, German, Spanish
- Japanese, Korean, Arabic, and more
Multiple image input methods:
- Drag & drop
- File browse
- Camera capture (mobile native + desktop webcam)
- Clipboard paste (Ctrl + V)
Three OCR.space engine options:
- Engine 1 – Fast
- Engine 2 – Most accurate
- Engine 5 – Latest
Split‑panel review UI with:
- Scanning animation
- Editable extracted text
- Word and character count
Copy extracted text to clipboard
Download OCR output as a .txt file
Fully responsive design for desktop, tablet, and mobile
On Complete and On Change events to trigger microflows or nanoflows

Quick Setup

Place the widget inside a Data View
Bind a String attribute to Output text
Paste your OCR.space API key(s) into OCR.space API keys
Run your app — done

Compatibility

Works with Mendix Studio Pro 9.24.28+ on all modern web browsers.

Documentation

Typical Usage Scenario

A common scenario is a document‑processing application where end users need to extract text from photos or scanned documents.

For example, a field worker takes a photo of a delivery receipt using a mobile phone. The widget instantly extracts the text from the image. The user reviews and edits the result if needed, then clicks Confirm & Save. The extracted text is stored in a String attribute, and a microflow is triggered to handle downstream processing such as creating records, sending notifications, or updating dashboards.

Other common scenarios include:

Digitizing handwritten notes taken during meetings
Extracting data from ID cards for KYC flows
Reading text from screenshots submitted in helpdesk tickets
Converting whiteboard photos into actionable digital text

Features and Limitations

Features

Multi‑key OCR.space integration
- Pass multiple API keys separated by semicolons
- Widget automatically rotates keys when one hits its quota
Offline fallback with Tesseract.js
- Works without API keys, with recognition running locally in the browser once the library has loaded
- Slightly lower accuracy for camera‑captured photos
Multiple image input methods
- Drag & drop
- File browse
- Camera capture (mobile native + desktop webcam)
- Clipboard paste (Ctrl + V)
Advanced image preprocessing
- Auto resize
- Grayscale conversion
- Auto contrast
- Sharpening
- Adaptive thresholding
Multi‑language OCR
- Supports 30+ languages including:
  - English, Hindi, French, German, Spanish
  - Japanese, Korean, Chinese
  - Arabic, Russian, Turkish, and more
OCR.space engine selection
- Engine 1 – Fast
- Engine 2 – Most accurate
- Engine 5 – Latest
Review‑friendly UI
- Split‑panel layout
- Editable extracted text
- Word and character count
- Copy to clipboard
- Download as .txt file
Workflow integration
- On Complete action
- On Change action
- Supports microflows and nanoflows
Responsive design
- Optimized for desktop, tablet, and mobile
- Studio Pro editor preview supported

Limitations

Web platform only
- Not supported in Mendix Native Mobile apps
Requires entity context
- Widget must be placed inside a Data View
Handwriting recognition accuracy depends on legibility
- Works best with printed or clean handwritten text
OCR.space free tier limitation
- 1 MB maximum file size per request
- Images are auto‑compressed, but very large photos may still exceed this limit
Offline OCR limitations
- Tesseract.js mode is slower and less accurate than OCR.space, especially for camera photos
PDF files are not supported
- Image formats only: PNG, JPG, JPEG
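
The format and size limits above can be expressed as a simple pre‑upload check, sketched below. The names are hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption. An oversized image is reported as needing compression rather than rejected, since the widget compresses images automatically.

```typescript
// Assumed byte value of the 1 MB free-tier limit.
const MAX_UPLOAD_BYTES = 1024 * 1024;

// PNG plus JPEG; .jpg and .jpeg files share the image/jpeg MIME type.
const ALLOWED_MIME_TYPES = ["image/png", "image/jpeg"];

interface ImageInfo {
  mimeType: string;
  sizeBytes: number;
}

type ImageCheck = "ok" | "unsupported_type" | "needs_compression";

/** Classify an image before it is sent to the cloud OCR service. */
function checkImage(info: ImageInfo): ImageCheck {
  if (ALLOWED_MIME_TYPES.indexOf(info.mimeType.toLowerCase()) === -1) {
    return "unsupported_type"; // e.g. application/pdf
  }
  if (info.sizeBytes > MAX_UPLOAD_BYTES) {
    return "needs_compression";
  }
  return "ok";
}
```

In a browser, the two values would usually come from the type and size properties of the File object the user selected.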

Dependencies

Mendix Studio Pro 9.24.28 or higher
Tesseract.js v5
- Loaded automatically from CDN at runtime
OCR.space API key (optional but recommended)
- Free keys available at: https://ocr.space/ocrapi/freekey
No additional Mendix modules, Java actions, or marketplace widgets required

Installation

Download the .mpk file from the Marketplace
Copy the file into your Mendix project’s widgets folder
Open the project in Studio Pro
- The widget appears in the toolbox as Image To Text
If upgrading:
- Remove the old .mpk file before adding the new version
Run the project locally to verify the widget loads correctly

Configuration

Create or open a page with a Data View containing a String attribute
Drag the Image To Text widget into the Data View
Set Output text to your String attribute
Obtain a free OCR.space API key:
- Go to https://ocr.space/ocrapi/freekey
- Enter your email and submit
- Confirm via the email you receive
- Copy the API key from the email
Paste the API key into OCR.space API keys
- Multiple keys can be used:

key1;key2;key3

Choose an OCR engine
- Engine 2 is recommended for best accuracy
Optionally set:
- OCR language (default: eng)
- Multiple languages (e.g. eng+hin)
- On Complete action to trigger a microflow
Run your app
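
As an illustration of the language setting, the sketch below splits a value such as eng+hin into individual codes and falls back to eng when nothing usable is given. The function name and the validation pattern are assumptions. The sketch only checks the shape of each code and does not confirm that a given OCR engine supports it.

```typescript
/** Split "eng+hin" into ["eng", "hin"]; fall back to the default when no valid code is present. */
function parseLanguages(setting: string, fallback: string = "eng"): string[] {
  const codes = setting
    .split("+")
    .map((code) => code.trim().toLowerCase())
    // Three letters, optionally followed by underscore-separated suffixes (e.g. chi_sim).
    .filter((code) => /^[a-z]{3}(_[a-z]+)*$/.test(code));
  return codes.length > 0 ? codes : [fallback];
}
```

Under these assumptions, parseLanguages("eng+hin") yields two codes, while an empty setting yields the single default code.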

Known Bugs

On some Android devices, the camera capture button may open the file picker instead of the camera.
- This is a browser limitation
- Chrome on Android works correctly; some OEM browsers do not support the capture attribute
If OCR.space returns empty results for valid images:
- Try switching between Engine 2 and Engine 5
- Certain engines perform better for specific document types
Very dark or over‑exposed images may produce poor results
- Best results are achieved with good lighting and contrast

Frequently Asked Questions

Does this widget work without an API key?

Yes. Without OCR.space keys, the widget uses Tesseract.js running entirely in the browser. Accuracy is good for screenshots and scans, but lower for photos taken with a camera.

How do I get unlimited free OCR?

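Request free API keys at https://ocr.space/ocrapi/freekey, using a different email address for each key.

Each key provides 25,000 requests per month. Paste the keys into OCR.space API keys, separated by semicolons: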

key1;key2;key3;key4

The widget automatically rotates keys when one is exhausted.

Which OCR engine should I use?

Engine 2: Best overall accuracy (recommended)
Engine 1: Faster but less accurate
Engine 5: Newest engine, useful if Engine 2 performs poorly for certain documents

Does it support handwriting?

Yes, but accuracy depends on clarity and legibility. Printed text yields the best results.

Does the image leave the browser?

With OCR.space: the image is sent to their API for processing
Without API keys: processing is fully local using Tesseract.js

What languages are supported?

30+ languages including English, Hindi, French, German, Spanish, Portuguese, Italian, Japanese, Korean, Chinese (Simplified & Traditional), Arabic, Russian, Turkish, and many more. Configure using language codes in the OCR language property.

Can this be used in a Mendix Native Mobile app?

No. This widget is web‑only and relies on browser APIs (Canvas, getUserMedia) that are not available in the Mendix Native runtime.

Releases

Version: 1.0.0

Framework Version: 9.24.18

Release Notes: