- 08 Nov 2023
- 5 Minutes to read
- PDF
Configuring & using the Batch Sync Client
The Batch Sync Client (referred to below as the Sync Utility) is a companion application that automates DocFusion's Enterprise Batch Processing functionality. It is a stand-alone application that monitors filesystem folders for XML or JSON payloads whose filenames match a configured pattern, then creates batches for processing. Payloads can be validated against an XML schema before processing. Once batches are processed, the results are written to an output folder in the format specified in the configuration. Log files for batches and their batch records are also written to the configured output folders, so you can diagnose generation errors and arrange for payload resubmission.
The Sync Utility is useful when implementing batch processing for DocFusion without explicitly integrating with the API. The utility is configured using the appSettings.json file, where folder paths, file patterns and CRON schedules are specified. Optionally, you can configure the chunk size for file streaming to the batch engine in order to use network and hardware resources more efficiently. Process logging can also be integrated with external analytical tools to gain insights into performance.
Contact your DocFusion sales representative to get access to the Sync Utility.
Starting the Sync Utility
The Sync Utility can be started via the .exe provided by DocFusion (AIS.DocBatch.BatchClient.exe) or within a Docker container. In both cases, the appSettings.json configuration file is located in the /app path.
Configuring Settings
- Download a sample configuration file: appSettings.External.json
- or view the sample configuration file below...
1. Configure Batch Tenancy
JSON OBJECT: BatchTenant
Connects the batch tenant to your DocFusion business unit (1:1 relationship).
PROPERTY | SETTING |
---|---|
BatchTenantId | Set to the Guid of the batch tenant. |
2. Configure Batch Client
JSON OBJECT: BatchClient
Sets up OAuth 2.0 authentication for the Sync Utility (client).
PROPERTY | SETTING |
---|---|
Url | URL endpoint for the DocFusion Batch Engine. Default: https://online.docfusion-paas.com |
TokenEndpoint | Endpoint of the authentication service to retrieve a bearer token using OAuth2.0 client credential flow. Default: https://online.docfusion-paas.com/core/connect/token |
ClientId | OAuth2.0 client ID. Provided by DocFusion. |
ClientSecret | OAuth2.0 client secret. Provided by DocFusion. |
Scope | Set to "DocFusion" |
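The Sync Utility requests its bearer token itself, but it can help to check credentials independently before troubleshooting the configuration. The following is a minimal sketch of a standard OAuth 2.0 client-credentials request built from the BatchClient properties above. The function name `build_token_request` and the placeholder credentials are hypothetical, and sending the client ID and secret in the form body (rather than in a Basic authorization header) is an assumption about what the token endpoint accepts.

```python
import urllib.parse
import urllib.request


def build_token_request(token_endpoint, client_id, client_secret, scope="DocFusion"):
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode("ascii")
    return urllib.request.Request(
        token_endpoint,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_token_request(
    "https://online.docfusion-paas.com/core/connect/token",  # default TokenEndpoint
    client_id="my-client-id",          # placeholder, provided by DocFusion
    client_secret="my-client-secret",  # placeholder, provided by DocFusion
)
# To send it, pass `request` to urllib.request.urlopen and parse the JSON body.
# A standard OAuth 2.0 token response carries the bearer token in "access_token".
```

The request is only constructed here, so the sketch can be run without network access or real credentials.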
3. Configure Batch Status Folder
JSON OBJECT: BatchStatusFolder
Configures how the status of a batch is monitored. When a batch is created, a status file is created in the InputFolder path to record state information. Once the batch is processed, the status file is moved from the InputFolder to the OutputFolder. The BatchFolderMonitors object can contain many FolderMonitors JSON objects, but only one BatchStatusFolder should be configured.
PROPERTY | SETTING |
---|---|
InputFolder | Path of folder that stores status files for input monitors. E.g. "/temp/batch/batches-state/in" (Linux) or "\\temp\\batch\\batches-state\\in" (Windows) |
OutputFolder | Path of folder that stores status files for processed batches. E.g. "/temp/batch/batches-state/out" (Linux) or "\\temp\\batch\\batches-state\\out" (Windows) |
AutoApproveWhenNeeded | Sets whether the Sync Client should send an approval for the batch if it's in the AwaitingApproval status. Whether batches require approval or not is configured on the Batch Type using the processRecordsWithoutApproval setting. Values: True / False |
Schedule | CRON expression specifying the interval at which to poll the InputFolder. When submitting many batches, offset this schedule from the BatchFolderMonitors schedule to allow time for processing. A standard CRON expression has 5 space-separated fields: Minutes (0-59), Hours (0-23), Day of Month (1-31), Month (1-12), Day of Week (0-6). A forward slash specifies recurrence. E.g. the CRON expression */5 * 1 * * specifies polling every 5 minutes on the 1st day of every month. DocFusion's Batch Engine also supports sub-minute (i.e. seconds) definitions for CRON schedules; the sample configuration uses 6-field expressions such as "*/1 * * * * *" for this. |
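To make the field order concrete, here is a small sketch that labels each part of a CRON expression. It is not how the Batch Engine parses schedules. The function name `label_cron_fields` is hypothetical, and treating the extra field of a 6-field expression as a leading seconds field is an assumption (it is the common convention, but the source does not state where the seconds field goes).

```python
FIVE_FIELDS = ["minute", "hour", "day_of_month", "month", "day_of_week"]


def label_cron_fields(expression):
    """Map each space-separated CRON field to its name."""
    parts = expression.split()
    if len(parts) == 5:
        names = FIVE_FIELDS
    elif len(parts) == 6:
        # Assumption: the sixth field is a leading seconds field.
        names = ["second"] + FIVE_FIELDS
    else:
        raise ValueError(f"expected 5 or 6 fields, got {len(parts)}")
    return dict(zip(names, parts))


# Five-field example from the table: every 5 minutes on the 1st of the month.
print(label_cron_fields("*/5 * 1 * *"))
# Six-field example in the style of the sample configuration.
print(label_cron_fields("*/10 * * * * *"))
```

Labelling the fields this way is a quick check that a schedule has the number of parts you intended before it goes into appSettings.json.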
4. Configure Batch Folder Monitors
JSON OBJECT: BatchFolderMonitors / FolderMonitors
Each FolderMonitor is associated with a Batch Type and configures settings for payloads and processed files. The InputFolder path is scanned for files matching the FilePattern on the defined CRON schedule, then processed files are written to the OutputFolder path using the FileNameTemplate setting. There is no limit on the number of FolderMonitors you can add to the BatchFolderMonitors object, though it's important to consider what your hardware allows.
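The scan-and-group behaviour can be pictured with a short sketch: find the files in an input folder that match a pattern, then split them into groups of ChunkSize files, one group per HTTP call. This only illustrates the idea and is not the Sync Utility's implementation. The function names `matching_payloads` and `chunked` are hypothetical, and case-sensitive matching sorted by filename is a simplifying assumption.

```python
import tempfile
from fnmatch import fnmatchcase
from pathlib import Path


def matching_payloads(input_folder, file_pattern):
    """Return files in input_folder whose names match file_pattern, sorted by name."""
    folder = Path(input_folder)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and fnmatchcase(p.name, file_pattern)
    )


def chunked(items, chunk_size):
    """Split items into consecutive groups of at most chunk_size elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


# Demo in a throwaway folder: three payloads and one file that does not match.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["a.xml", "b.xml", "c.xml", "notes.txt"]:
        (Path(tmp) / name).write_text("<payload/>")
    payloads = matching_payloads(tmp, "*.xml")
    batches = chunked([p.name for p in payloads], chunk_size=2)

print(batches)  # [['a.xml', 'b.xml'], ['c.xml']]
```

With a ChunkSize of 2, the three matching payloads would be sent in two calls instead of three, which is the trade-off the ChunkSize setting controls.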
PROPERTY | SETTING |
---|---|
Enabled | Enables or disables monitoring of the specified folders. Values: True / False |
Name | Name of the folder monitor. Typically set to the name of the DocFusion template associated with the batch type. |
BatchTypeId | Unique Guid of the associated Batch Type. |
Schedule | CRON expression specifying the interval at which to poll the InputFolder. Each poll creates a new batch. Any files that are dropped after a poll will be processed in the next batch. A standard CRON expression has 5 space-separated fields: Minutes (0-59), Hours (0-23), Day of Month (1-31), Month (1-12), Day of Week (0-6). A forward slash specifies recurrence. E.g. the CRON expression */5 * 1 * * specifies polling every 5 minutes on the 1st day of every month. DocFusion's Batch Engine also supports sub-minute (i.e. seconds) definitions for CRON schedules; the sample configuration uses 6-field expressions such as "*/10 * * * * *" for this. |
InputFolder | Path of the folder to monitor for input payloads. E.g. "/temp/batch/batch/in" (Linux) or "\\temp\\batch\\batch\\in" (Windows) |
OutputFolder | Path of the folder to write processed batch records (documents) to. E.g. "/temp/batch/batch/out" (Linux) or "\\temp\\batch\\batch\\out" (Windows) |
OutputFileExtension | File extension for output (processed) files. When the BatchType has the performMerge flag enabled, this property is used to specify the resulting file extension. E.g. ".PDF" |
FilePattern | Pattern that files in the input folder must match in order to be processed, e.g. "*.xml" (as used in the sample configuration). The Sync Utility only processes files that match this pattern. |
ChunkSize | Sets the number of batch records (files) to transmit to the server in each HTTP call to the batch engine. Limits the number of HTTP calls by combining batch records into a file-transfer package. E.g. 10 (10 files are sent to the engine in one HTTP call). The limit on payload file size is 100MB per file. The ChunkSize setting itself has no upper limit, though larger values affect network throughput, and the number of available CPU processes can impose throttling. |
RecordProcessedExtension | The file extension of processed documents stored in the OutputFolder path. Default: ".processed" |
FileNameTemplate | Sets the filename pattern for processed files stored in the OutputFolder path. Allows you to map the output document to the payload that was sent. E.g. "{RecordId}_{RecordIdentifier}_{FileName}.pdf" Tag Reference: {RecordId}, {RecordIdentifier}, {FullFileName}, {FileName}, {FileExtension} |
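As an illustration of how a FileNameTemplate maps an output document back to its payload, the sketch below substitutes the tags listed in the sample configuration. The function name `render_file_name` is hypothetical, and the tag meanings are assumptions not confirmed by the source: {FullFileName} is taken to be the payload name with its extension, {FileName} the name without it, and {FileExtension} the extension including the dot.

```python
from pathlib import PurePath


def render_file_name(template, record_id, record_identifier, payload_name):
    """Replace FileNameTemplate tags with values for one batch record."""
    payload = PurePath(payload_name)
    tags = {
        "{RecordId}": str(record_id),
        "{RecordIdentifier}": str(record_identifier),
        "{FullFileName}": payload.name,     # assumed: name with extension
        "{FileName}": payload.stem,         # assumed: name without extension
        "{FileExtension}": payload.suffix,  # assumed: extension including the dot
    }
    result = template
    for tag, value in tags.items():
        result = result.replace(tag, value)
    return result


name = render_file_name(
    "{RecordId}_{RecordIdentifier}_{FileName}.pdf",
    record_id=42,                  # hypothetical record values
    record_identifier="ACC-001",
    payload_name="statement.xml",
)
print(name)  # 42_ACC-001_statement.pdf
```

Including both a record identifier and the payload filename in the template keeps output names unique and traceable when many payloads land in the same OutputFolder.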
Sample appSettings Config File
{
  "AppSettings": {
    "AppInsights": true
  },
  "ApplicationInsights": {
    "ConnectionString": "InstrumentationKey=<Your-Key>;IngestionEndpoint=<URL>;LiveEndpoint=<URL>",
    "LogLevel": {
      "Default": "Information"
    }
  },
  "BatchTenant": {
    "BatchTenantId": "{BATCH-TENANT-GUID}"
  },
  "BatchClient": {
    "Url": "{URL}",
    "Port": 443,
    "HttpMode": "Https",
    "TokenEndpoint": "{SERVER-TOKEN-ENDPOINT}",
    "ClientId": "{CLIENT-ID-FROM-DOCFUSION}",
    "ClientSecret": "{CLIENT-SECRET-FROM-DOCFUSION}",
    "Scope": "DocFusion"
  },
  "BatchStatusFolder": {
    "InputFolder": "/temp/batch/batches-state/in",
    "OutputFolder": "/temp/batch/batches-state/out",
    "AutoApproveWhenNeeded": true,
    "RecordBlockSize": 20,
    "RetryCount": 5,
    "RetryBackoffSeconds": 5,
    "Schedule": "*/1 * * * * *"
  },
  "BatchFolderMonitors": {
    "FolderMonitors": [
      {
        "Enabled": true,
        "Name": "Rewards Statement",
        "BatchTypeId": "<BATCH-TYPE-GUID>",
        "Schedule": "*/10 * * * * *",
        "InputFolder": "/temp/batch/in",
        "OutputFolder": "/temp/batch/out",
        "OutputFileExtension": ".pdf",
        "FilePattern": "*.xml",
        "ChunkSize": 10,
        "RecordProcessedExtension": ".processed",
        "FileNameTemplate": "{RecordId}_{RecordIdentifier}_{FileName}.pdf" // {RecordId} {RecordIdentifier} {FullFileName} {FileName} {FileExtension}
      }
    ]
  },
  "Logging": {
    "LogLevel": {
      "Default": "Trace"
    },
    "Console": {
      "FormatterName": "simple-console",
      "FormatterOptions": {
        "SingleLine": false,
        "IncludeScopes": true,
        "TimestampFormat": "yyyy/MM/dd HH:mm:ss "
      }
    }
  },
  "Serilog": {
    "Using": [
      "Serilog.Sinks.Console",
      "Serilog.Sinks.File",
      "Serilog.Sinks.ApplicationInsights"
    ],
    "_MinimumLevel": "Verbose", // Debug, Information, Warning, Error, Fatal
    "MinimumLevel": {
      "Default": "Verbose",
      "Override": {
        "Microsoft": "Warning",
        "System": "Warning"
      }
    },
    "WriteTo": [
      {
        "Name": "Console"
      },
      {
        "Name": "File",
        "Args": {
          "path": "logs/log-.txt",
          "rollingInterval": "Day",
          "fileSizeLimitBytes": 5242880,
          "rollOnFileSizeLimit": true,
          "retainedFileCountLimit": 31
        }
      },
      {
        "Name": "ApplicationInsights",
        "Args": {
          "connectionString": "InstrumentationKey=<Your-Key>;IngestionEndpoint=<URL>;LiveEndpoint=<URL>",
          "telemetryConverter": "Serilog.Sinks.ApplicationInsights.TelemetryConverters.TraceTelemetryConverter, Serilog.Sinks.ApplicationInsights",
          "restrictedToMinimumLevel": "Warning"
        }
      }
    ]
  }
}
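The sample file contains // comments, which strict JSON parsers reject, so checking an edited appSettings.json with a generic tool may fail even when the file is fine for the Sync Utility. The sketch below strips // comments outside string literals and confirms that the four documented sections are present. The function names and the choice of required sections are assumptions made for illustration, and this is not DocFusion's own validation.

```python
import json

# Sections documented in steps 1 to 4 of this article.
REQUIRED_SECTIONS = ["BatchTenant", "BatchClient", "BatchStatusFolder", "BatchFolderMonitors"]


def strip_line_comments(text):
    """Remove // comments that sit outside JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, leaving the newline in place.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def load_settings(text):
    """Parse settings text and check that the documented sections exist."""
    settings = json.loads(strip_line_comments(text))
    for section in REQUIRED_SECTIONS:
        if section not in settings:
            raise KeyError(f"missing section: {section}")
    return settings


sample = """
{
  "BatchTenant": {"BatchTenantId": "{BATCH-TENANT-GUID}"},
  "BatchClient": {"Url": "https://online.docfusion-paas.com"},  // slashes inside the URL are kept
  "BatchStatusFolder": {"Schedule": "*/1 * * * * *"},
  "BatchFolderMonitors": {"FolderMonitors": []}
}
"""
settings = load_settings(sample)
print(settings["BatchClient"]["Url"])  # https://online.docfusion-paas.com
```

Tracking whether the scanner is inside a string matters here because URLs such as the Url and TokenEndpoint values contain // themselves.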