CSV Splitter
Overview
The CSV Splitter module for Mendix splits large CSV files into smaller chunks so that each chunk can be processed separately. You define the number of rows per chunk. Each chunk is stored as a FileDocument, so it can be retrieved individually without loading the whole file into memory.
The module also includes an Excel Splitter for large Excel files. It supports two splitting methods, selected through user-defined parameters:
- Row-Based Splitting: Divides the Excel file into files containing a fixed number of rows.
- Thread-Based Splitting: Splits the file into a specified number of chunks, distributing rows evenly across them.
The Excel Splitter retains the header row in each output file (if enabled) and writes chunks in .xlsx format. It also logs processing times so that performance can be tracked.
Key Features:
- Splitting of CSV and Excel files into smaller pieces.
- Flexible splitting methods: by rows or by threads (a fixed number of chunks).
- Chunks stored as FileDocument objects to limit memory usage.
- Optional header row in output files.
- Input validation, error logging, and processing-time logging.
Additional Notes:
Custom Java code in the module is kept in the sections that Mendix Studio Pro preserves when it regenerates action files, so it is not lost between regeneration cycles. Special characters are supported in code comments.
Documentation
The CSV and Excel Splitter module is a utility for Mendix applications that work with large datasets. It splits CSV and Excel files into smaller chunks that can be processed one at a time. The module supports row-based and thread-based splitting, can retain the header row, and saves each output file as a FileDocument object.
Features
The module provides the following core functionality:
- CSV Splitting
  - Splits large CSV files into smaller chunks based on a user-defined number of rows per chunk.
  - Each chunk is stored as a FileDocument for streamlined retrieval and reduced memory overhead.
- Excel Splitting
  - Supports splitting Excel files in two ways:
    - Row-Based Splitting: Divides the file into smaller files containing a fixed number of rows.
    - Thread-Based Splitting: Distributes rows evenly into a specified number of chunks.
  - Output files retain the header row if enabled by the user.
  - Chunks are generated in .xlsx format for compatibility with modern spreadsheet tools.
- Error Handling
  - Validates input parameters such as file content, split quantity, and method to ensure correct operation.
  - Logs detailed error messages for troubleshooting issues.
- Performance Monitoring
  - Logs processing details, including chunk generation times, to facilitate performance tracking.
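The input checks listed under Error Handling could look roughly like the following sketch. It is plain Java and does not reflect the module's actual code: the class name `SplitRequestValidator`, the `SplitMethod` enum, and the parameter names are all hypothetical, chosen only to mirror the three inputs named above (file content, split quantity, and method).

```java
import java.util.Locale;

// Hypothetical validator for illustration; not part of the module's API.
final class SplitRequestValidator {

    // Assumed names for the two splitting methods described in this document.
    enum SplitMethod { ROWS, THREADS }

    /**
     * Checks the three inputs and returns the parsed split method.
     * Throws IllegalArgumentException with a descriptive message on bad input.
     */
    static SplitMethod validate(byte[] fileContent, int splitQuantity, String method) {
        if (fileContent == null || fileContent.length == 0) {
            throw new IllegalArgumentException("File content is empty");
        }
        if (splitQuantity <= 0) {
            throw new IllegalArgumentException("Split quantity must be positive, got " + splitQuantity);
        }
        if (method == null) {
            throw new IllegalArgumentException("Split method is missing");
        }
        try {
            // Accept the method name case-insensitively, ignoring surrounding whitespace.
            return SplitMethod.valueOf(method.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unknown split method: " + method, e);
        }
    }
}
```

Failing fast with a specific message, before any chunk is written, keeps the logged error close to the cause.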
Installation
- Import the module into your Mendix project from the Mendix Marketplace (formerly the App Store) or as a downloaded package.
- Add the required dependencies for FileDocument and system proxies if they are not already available in your project.
- Configure the required user roles and permissions so that users can access the splitter functionality.
Usage Instructions
CSV Splitting
- Provide the CSV file as a FileDocument in Mendix.
- Specify the number of rows per chunk.
- Call the splitter logic, which outputs individual chunks as separate FileDocument objects.
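The core of the steps above, grouping rows into chunks of a fixed size, can be sketched in plain Java. This is an illustration under stated assumptions, not the module's implementation: `CsvChunkSketch` and its parameters are hypothetical names, the input is assumed to hold one record per line (no quoted fields containing line breaks), and the optional header repetition is an assumption for CSV output. In a Mendix Java action, each returned string would be written to its own FileDocument.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the module's API.
final class CsvChunkSketch {

    /**
     * Reads CSV text line by line and returns chunks of at most rowsPerChunk data rows.
     * If repeatHeader is true, the first line is treated as a header and is
     * written at the top of every chunk.
     */
    static List<String> split(Reader source, int rowsPerChunk, boolean repeatHeader) {
        if (rowsPerChunk <= 0) {
            throw new IllegalArgumentException("rowsPerChunk must be positive");
        }
        List<String> chunks = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String header = repeatHeader ? reader.readLine() : null;
            StringBuilder current = new StringBuilder();
            int rowsInCurrent = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                if (rowsInCurrent == 0 && header != null) {
                    current.append(header).append('\n'); // start each chunk with the header
                }
                current.append(line).append('\n');
                rowsInCurrent++;
                if (rowsInCurrent == rowsPerChunk) {
                    chunks.add(current.toString()); // chunk is full, start a new one
                    current.setLength(0);
                    rowsInCurrent = 0;
                }
            }
            if (rowsInCurrent > 0) {
                chunks.add(current.toString()); // last, partially filled chunk
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chunks;
    }
}
```

Because the reader is consumed one line at a time, only the current chunk is held in memory rather than the whole file.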
Excel Splitting
- Upload an Excel file as a FileDocument in Mendix.
- Choose the splitting method:
  - Row-Based Splitting: Specify the number of rows per chunk.
  - Thread-Based Splitting: Define the number of chunks the file should be split into.
- Enable or disable header retention based on your requirements.
- Call the splitting logic, which returns the resulting chunks as .xlsx files saved in Mendix.
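To make the difference between the two methods concrete, the sketch below computes which data rows would land in each chunk. It is plain Java with hypothetical names (`RowRangeSketch`, `byRows`, `byChunkCount`) and does not read or write .xlsx files. How the module handles a row count that does not divide evenly is not documented here, so giving the extra rows to the first chunks is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the module's API.
final class RowRangeSketch {

    /** Row-based: consecutive ranges of at most rowsPerChunk rows. Each int[] is {start, endExclusive}. */
    static List<int[]> byRows(int totalRows, int rowsPerChunk) {
        if (totalRows < 0 || rowsPerChunk <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and rowsPerChunk > 0");
        }
        List<int[]> ranges = new ArrayList<>();
        int start = 0;
        while (start < totalRows) {
            int end = start + Math.min(rowsPerChunk, totalRows - start);
            ranges.add(new int[] {start, end});
            start = end;
        }
        return ranges;
    }

    /** Thread-based: at most chunkCount ranges whose sizes differ by at most one row. */
    static List<int[]> byChunkCount(int totalRows, int chunkCount) {
        if (totalRows < 0 || chunkCount <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and chunkCount > 0");
        }
        int base = totalRows / chunkCount;      // rows every chunk receives
        int remainder = totalRows % chunkCount; // the first 'remainder' chunks receive one extra row
        List<int[]> ranges = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < chunkCount; i++) {
            int size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                break; // fewer rows than chunks: skip chunks that would be empty
            }
            ranges.add(new int[] {start, start + size});
            start += size;
        }
        return ranges;
    }
}
```

For 10 data rows, row-based splitting with 4 rows per chunk gives three chunks of 4, 4, and 2 rows, while thread-based splitting into 3 chunks gives 4, 3, and 3 rows.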