Document Metadata
Overview
Read, write, and enrich metadata in Word, Excel, and PDF files. This module lets you manage document properties directly in Mendix: extract titles, authors, and custom fields, or add your own metadata to DOCX, XLSX, and PDF documents. Use it to build smarter document workflows, improve searchability, and support compliance by embedding meaningful metadata in the files themselves.
Documentation
Typical usage scenario
Use this module when you need to read or write metadata of Office documents (DOCX, XLSX) and PDFs directly in your Mendix applications.
Typical scenarios include:
- Extracting metadata (title, author, subject, keywords, etc.) for search and indexing.
- Adding or updating custom metadata fields (e.g., CustomerID, CaseNumber, Classification).
- Enriching documents before distribution (e.g., embedding workflow tags, compliance information).
- Supporting business processes where metadata is critical for archiving, auditing, or discovery.
Features and limitations
Features
- Read metadata from DOCX, XLSX, and PDF files.
- Write custom string metadata back into documents.
- Overwrite existing custom properties or add new ones.
- Core/extended properties (title, subject, author, etc.) are read automatically.
- Dates are returned in ISO-8601 format (UTC).
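To illustrate the date format mentioned above, here is a minimal sketch of how an ISO-8601 UTC string could be parsed or produced in a custom Java action. The class name `MetadataDates` and the sample values are hypothetical and not part of the module. The precision the module emits (seconds or milliseconds) is not stated here, so the sketch relies on `java.time.Instant`, which accepts both.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class MetadataDates {

    // Parse an ISO-8601 UTC timestamp such as "2024-03-15T09:30:00Z".
    public static Instant parseUtc(String value) {
        return Instant.parse(value);
    }

    // Render a date-time from any zone as an ISO-8601 string in UTC.
    public static String formatUtc(ZonedDateTime dateTime) {
        return dateTime.withZoneSameInstant(ZoneOffset.UTC).toInstant().toString();
    }

    public static void main(String[] args) {
        // Hypothetical value, as it might be returned for a creation date.
        Instant created = parseUtc("2024-03-15T09:30:00Z");
        System.out.println(created.toEpochMilli());

        // A local time with a +01:00 offset is normalised to UTC.
        ZonedDateTime local = ZonedDateTime.parse("2024-03-15T10:30:00+01:00");
        System.out.println(formatUtc(local)); // prints 2024-03-15T09:30:00Z
    }
}
```

Working with `Instant` rather than raw strings keeps comparisons and sorting independent of the server time zone.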
Limitations
- Only string values are supported when writing.
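Because only strings can be written, typed values such as numbers, booleans, or dates need to be converted before they are passed to the module. The following is a minimal sketch of one way to do that in a custom Java action. The class `MetadataValues` and its methods are hypothetical helpers, not part of the module, and the property names are taken from the examples above.

```java
import java.math.BigDecimal;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

public class MetadataValues {

    // Convert a typed value to the string that will be stored as metadata.
    public static String toMetadataString(Object value) {
        if (value == null) {
            return ""; // assumption: a missing value is stored as an empty string
        }
        if (value instanceof BigDecimal) {
            // Avoid scientific notation such as "1E+3".
            return ((BigDecimal) value).toPlainString();
        }
        if (value instanceof Date) {
            // Store dates as ISO-8601 in UTC.
            return ((Date) value).toInstant().toString();
        }
        return String.valueOf(value);
    }

    // Convert a map of typed values to string-only properties, keeping order.
    public static Map<String, String> toStringProperties(Map<String, Object> typed) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : typed.entrySet()) {
            result.put(entry.getKey(), toMetadataString(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> typed = new LinkedHashMap<>();
        typed.put("CustomerID", 10042L);
        typed.put("Classification", "Confidential");
        typed.put("Archived", Boolean.FALSE);
        System.out.println(toStringProperties(typed));
    }
}
```

When such a property is read back, the consuming application has to parse the string into the expected type again, so it helps to agree on one textual format per property.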
Dependencies
This module relies on the Apache POI and Apache PDFBox libraries for handling Office and PDF documents.
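For background on what these libraries take care of: a DOCX or XLSX file is a ZIP package, and files written by Office conventionally keep the core properties (title, creator, subject, and so on) in the part `docProps/core.xml`. The sketch below reads three of them using only the Java standard library. It is not how the module is implemented, the class name `CorePropertiesSketch` is hypothetical, and a strict reader would locate the part through the package relationships instead of a fixed path.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class CorePropertiesSketch {

    // Dublin Core namespace used for title, creator and subject.
    private static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    public static Map<String, String> readCoreProperties(InputStream ooxml) {
        Map<String, String> result = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(ooxml)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"docProps/core.xml".equals(entry.getName())) {
                    continue;
                }
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // Reject DOCTYPE declarations to guard against XXE.
                factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
                // Buffer the entry so the XML parser cannot close the ZIP stream.
                byte[] xml = zip.readAllBytes();
                Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
                for (String name : new String[] {"title", "creator", "subject"}) {
                    NodeList nodes = doc.getElementsByTagNameNS(DC_NS, name);
                    if (nodes.getLength() > 0) {
                        result.put(name, nodes.item(0).getTextContent());
                    }
                }
                break;
            }
        } catch (IOException | ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not read core properties", e);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: CorePropertiesSketch <file.docx|file.xlsx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(readCoreProperties(in));
        }
    }
}
```

Libraries such as Apache POI additionally handle extended and custom properties, writing, and edge cases in the package structure, which is why the module builds on them instead of parsing the files by hand.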
Known bugs
- None known at the time of release.