Document Metadata

Content Type: Module
Categories: Utility, Data

Overview

Easily read, write, and enrich metadata in Word, Excel, and PDF files. This module lets you manage document properties directly in Mendix: extract titles, authors, and custom fields, or add your own metadata to DOCX, XLSX, and PDF documents. It is well suited to building smarter document workflows, improving searchability, and supporting compliance by embedding meaningful metadata in the documents themselves.

Documentation

Typical usage scenario

Use this module when you need to read or write the metadata of Office documents (DOCX, XLSX) and PDFs directly in your Mendix application.

Typical scenarios include:

  • Extracting metadata (title, author, subject, keywords, etc.) for search and indexing.

  • Adding or updating custom metadata fields (e.g., CustomerID, CaseNumber, Classification).

  • Enriching documents before distribution (e.g., embedding workflow tags, compliance information).

  • Supporting business processes where metadata is critical for archiving, auditing, or discovery.

 

Features and limitations

Features

  • Read metadata from DOCX, XLSX, and PDF files.

  • Write custom string metadata back into documents.

  • Overwrite existing custom properties or add new ones.

  • Core/extended properties (title, subject, author, etc.) are read automatically.

  • Dates are returned in ISO-8601 format (UTC).
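
Because dates come back as ISO-8601 strings in UTC, consuming code can parse them with the standard java.time API. The sketch below is only an illustration: the class and method names are hypothetical, and the sample timestamp assumes a form such as 2024-03-15T09:30:00Z, which may differ in precision from what the module actually returns.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the module.
class MetadataDates {

    // Parse an ISO-8601 UTC timestamp such as "2024-03-15T09:30:00Z".
    static Instant parseUtc(String value) {
        return Instant.parse(value);
    }

    // Format an instant as an ISO-8601 string in UTC.
    static String formatUtc(Instant instant) {
        return DateTimeFormatter.ISO_INSTANT.format(instant);
    }

    // Normalize a timestamp that carries an explicit offset to UTC.
    static String toUtc(String valueWithOffset) {
        return formatUtc(OffsetDateTime.parse(valueWithOffset).toInstant());
    }

    public static void main(String[] args) {
        Instant created = parseUtc("2024-03-15T09:30:00Z");
        System.out.println(formatUtc(created));                  // 2024-03-15T09:30:00Z
        System.out.println(toUtc("2024-03-15T11:30:00+02:00"));  // 2024-03-15T09:30:00Z
    }
}
```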

Limitations

  • Only string values are supported when writing.
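
Since only strings can be written, numbers, booleans, and dates need to be converted before they are passed to the module. One possible way to do that in a Java action is sketched below; the class and method names are hypothetical, and the chosen string forms (plain decimal notation, ISO-8601 for instants, empty string for null) are assumptions rather than requirements of the module.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the module.
class StringMetadata {

    // Convert a single typed value to the string that will be stored.
    static String toMetadataString(Object value) {
        if (value == null) {
            return ""; // assumption: represent missing values as an empty string
        }
        if (value instanceof BigDecimal) {
            // Avoid scientific notation such as "1E+3".
            return ((BigDecimal) value).toPlainString();
        }
        if (value instanceof Instant) {
            // Instant.toString() yields ISO-8601 in UTC.
            return value.toString();
        }
        return String.valueOf(value);
    }

    // Convert a map of typed values, preserving insertion order.
    static Map<String, String> toStringProperties(Map<String, ?> values) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, ?> entry : values.entrySet()) {
            result.put(entry.getKey(), toMetadataString(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> values = new LinkedHashMap<>();
        values.put("CustomerID", 42);
        values.put("Classification", "Internal");
        values.put("Amount", new BigDecimal("1E+3"));
        System.out.println(toStringProperties(values));
    }
}
```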

 

Dependencies

This module relies on the Apache POI and Apache PDFBox libraries to handle Office documents and PDFs, respectively.
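
For orientation, DOCX and XLSX files are ZIP packages in which the core properties (title, creator, and so on) are conventionally stored in the part docProps/core.xml. The sketch below reads that part using only the JDK. It is not the module's implementation, which uses Apache POI; the class name, method names, and sample values are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration, not part of the module.
class CorePropertiesSketch {

    // Return the child elements of docProps/core.xml as local-name -> text.
    static Map<String, String> readCoreProperties(InputStream ooxml) throws Exception {
        Map<String, String> props = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(ooxml)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"docProps/core.xml".equals(entry.getName())) {
                    continue;
                }
                // Copy the entry first so the XML parser cannot close the ZIP stream.
                byte[] xml = zip.readAllBytes();
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // Reject DOCTYPE declarations to guard against XXE.
                factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
                Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
                NodeList children = doc.getDocumentElement().getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    Node child = children.item(i);
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        props.put(child.getLocalName(), child.getTextContent());
                    }
                }
                break;
            }
        }
        return props;
    }

    // Build a tiny in-memory package that contains only a core properties part.
    static byte[] samplePackage() throws Exception {
        String xml = "<cp:coreProperties"
                + " xmlns:cp=\"http://schemas.openxmlformats.org/package/2006/metadata/core-properties\""
                + " xmlns:dc=\"http://purl.org/dc/elements/1.1/\">"
                + "<dc:title>Report</dc:title><dc:creator>Ada</dc:creator>"
                + "</cp:coreProperties>";
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry("docProps/core.xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readCoreProperties(new ByteArrayInputStream(samplePackage())));
    }
}
```

In practice a library such as Apache POI is preferable to this approach, because it resolves the package relationships and handles custom and extended properties as well.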

 

Known bugs

  • None known at the time of release.

 

Releases