Configuring & using the Batch Sync Client
  • 08 Nov 2023


Article Summary

The Batch Sync Client is companion software that automates DocFusion's Enterprise Batch Processing functionality. It is a stand-alone application that monitors filesystem folders for XML or JSON payloads whose filenames match a configured pattern, then creates batches for processing. Payloads can be validated against an XML schema before processing. Once a batch is processed, the processed batch records are written to an output folder in the format specified in the configuration. Log files for batches and their batch records are also available in the configured output folders, so you can diagnose generation errors and arrange for payload resubmission.

The Batch Sync Client (also referred to in this article as the Sync Utility) is useful when implementing batch processing for DocFusion without explicitly integrating with the API. The utility is configured in the appSettings.json file, where folder paths, file patterns and CRON schedules are specified. Optionally, you can configure the chunk size for file streaming to the batch engine in order to use network and hardware resources more efficiently. Process logging can also be integrated with external analytics tools to gain insight into performance.


   Contact your DocFusion sales representative to get access to the Sync Utility.

   

Starting the Sync Utility

The Sync Utility can be started via the .exe provided by DocFusion (AIS.DocBatch.BatchClient.exe) or within a Docker container. In both cases, the appSettings.json configuration file is located in the /app path.

         

Configuring Settings

NOTE: Restart the Sync Utility after any configuration changes.

1. Configure Batch Tenancy

JSON OBJECT: BatchTenant

Connects the batch tenant to your DocFusion business unit (1:1 relationship).

PROPERTY: SETTING
BatchTenantId: Set to the GUID of the batch tenant.

     

2. Configure Batch Client

JSON OBJECT: BatchClient

Sets up OAuth 2.0 authentication for the Sync Utility (client).

PROPERTY: SETTING
Url: URL endpoint for the DocFusion Batch Engine.
  Default: https://online.docfusion-paas.com

TokenEndpoint: Endpoint of the authentication service used to retrieve a bearer token via the OAuth 2.0 client credentials flow.
  Default: https://online.docfusion-paas.com/core/connect/token

ClientId: OAuth 2.0 client ID. Provided by DocFusion.
ClientSecret: OAuth 2.0 client secret. Provided by DocFusion.
Scope: Set to "DocFusion".
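
To illustrate what these settings feed into, the following is a minimal Python sketch of an OAuth 2.0 client credentials token request. It is not the Sync Utility's own code: the function name build_token_request and the placeholder credentials are hypothetical, and sending the credentials in the form body (rather than in an HTTP Basic header) is an assumption about what the token endpoint accepts. The request is only built, not sent.

```python
import urllib.parse
import urllib.request


def build_token_request(token_endpoint, client_id, client_secret, scope="DocFusion"):
    """Build (but do not send) an OAuth 2.0 client credentials token request."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }
    # urlencode percent-encodes the values, so the result is plain ASCII.
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(
        token_endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_token_request(
    "https://online.docfusion-paas.com/core/connect/token",
    client_id="example-client-id",          # placeholder, not a real credential
    client_secret="example-client-secret",  # placeholder, not a real credential
)
# Sending it would be urllib.request.urlopen(request); in standard OAuth 2.0
# the bearer token is returned in the "access_token" field of the JSON response.
```

A request like this can help confirm that the ClientId, ClientSecret and TokenEndpoint values are correct before they are placed in appSettings.json.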

   

3. Configure Batch Status Folder

JSON OBJECT: BatchStatusFolder

Configuration to monitor the status of a batch. When a batch is created, a status file is created in the InputFolder path to record state information. Once the batch is processed, the status file is moved from the InputFolder to the OutputFolder. The configuration can contain many FolderMonitors JSON objects (inside BatchFolderMonitors), but only one BatchStatusFolder should be configured.

PROPERTY: SETTING
InputFolder: Path of the folder that stores status files for input monitors.
  E.g. "/temp/batch/batches-state/in" (Linux)
  or "\\temp\\batch\\batches-state\\in" (Windows)

OutputFolder: Path of the folder that stores status files for processed batches.
  E.g. "/temp/batch/batches-state/out" (Linux)
  or "\\temp\\batch\\batches-state\\out" (Windows)

AutoApproveWhenNeeded: Sets whether the Sync Client should send an approval for a batch that is in the AwaitingApproval status. Whether batches require approval is configured on the Batch Type using the processRecordsWithoutApproval setting.
  Values: True / False

Schedule: CRON expression specifying the interval at which to poll the InputFolder. When submitting many batches, offset this schedule from the BatchFolderMonitors schedule to allow time for processing.

  DocFusion's Batch Engine supports sub-minute (i.e. seconds) definitions for CRON schedules; the sample configuration in this article uses six-field expressions such as "*/1 * * * * *".
  E.g. the CRON expression */5 * 1 * * specifies polling every 5 minutes on the 1st day of every month. The forward slash specifies recurrence.

  A standard CRON expression consists of 5 parts, each specifying an interval as follows:
  * Minutes (0-59)
  * Hours (0-23)
  * Day of Month (1-31)
  * Month (1-12)
  * Day of Week (0-6)
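
The five-part layout above can be made concrete with a small Python sketch that expands each field of an expression into the values it matches. This is an illustrative parser only, not the Batch Engine's scheduler: the names expand_field and describe are hypothetical, and the sketch handles only the five-field form (it does not cover the seconds field).

```python
def expand_field(field, low, high):
    """Expand one CRON field into the sorted list of values it matches."""
    values = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        step = int(step) if step else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = int(base)
            # "5/15" means "from 5 up to the maximum"; a bare "5" is just 5.
            end = high if "/" in part else start
        if not (low <= start <= end <= high):
            raise ValueError(f"{part!r} is outside {low}-{high}")
        values.update(range(start, end + 1, step))
    return sorted(values)


# Field order and ranges as listed in the table above.
FIELD_RANGES = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 6),
]


def describe(expression):
    """Map each field name of a five-field expression to its matching values."""
    fields = expression.split()
    if len(fields) != len(FIELD_RANGES):
        raise ValueError("this sketch expects exactly 5 fields")
    return {
        name: expand_field(field, low, high)
        for field, (name, low, high) in zip(fields, FIELD_RANGES)
    }


schedule = describe("*/5 * 1 * *")
print(schedule["minute"])        # [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
print(schedule["day of month"])  # [1]
```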

      

   

   

4. Configure Batch Folder Monitors

JSON OBJECT: BatchFolderMonitors / FolderMonitors

FolderMonitors are associated with a Batch Type and configure settings for payloads and processed files. Each InputFolder path is scanned for files matching the FilePattern on the defined CRON schedule, then processed files are written to the OutputFolder path using the FileNameTemplate setting. There is no limit on the number of FolderMonitors you can add to the BatchFolderMonitors object, though it is important to consider what your hardware allows.
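
As a rough picture of one polling cycle, the Python sketch below lists the files in an input folder that match a pattern and groups them into chunks, in the spirit of the FilePattern and ChunkSize properties described in the table that follows. It is a sketch under assumptions, not the Sync Utility's implementation: the function names find_payloads and chunked are hypothetical, and glob-style matching of "*.xml" is assumed from the sample configuration.

```python
import fnmatch
import os


def find_payloads(input_folder, file_pattern):
    """Return the sorted names of files in input_folder that match file_pattern."""
    names = [
        name
        for name in os.listdir(input_folder)
        if os.path.isfile(os.path.join(input_folder, name))
        and fnmatch.fnmatch(name, file_pattern)
    ]
    return sorted(names)


def chunked(items, chunk_size):
    """Split items into consecutive lists of at most chunk_size entries."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
```

For example, with 25 matching payloads and a chunk size of 10, chunked(find_payloads(folder, "*.xml"), 10) yields three groups of 10, 10 and 5 files, which would correspond to three HTTP calls.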
    

PROPERTY: SETTING
Enabled: Enables or disables monitoring of the specified folders.
  Values: True / False

Name: Name of the folder monitor. Typically set to the name of the DocFusion template associated with the batch type.

BatchTypeId: Unique GUID of the associated Batch Type.
   
Schedule: CRON expression specifying the interval at which to poll the InputFolder. Each poll creates a new batch. Any files that are dropped after the interval will be processed in the next batch.

  DocFusion's Batch Engine supports sub-minute (i.e. seconds) definitions for CRON schedules; the sample configuration in this article uses six-field expressions such as "*/10 * * * * *".
  E.g. the CRON expression */5 * 1 * * specifies polling every 5 minutes on the 1st day of every month. The forward slash specifies recurrence.

  A standard CRON expression consists of 5 parts, each specifying an interval as follows:
  * Minutes (0-59)
  * Hours (0-23)
  * Day of Month (1-31)
  * Month (1-12)
  * Day of Week (0-6)

InputFolder: Path of the folder to monitor for input payloads.
  E.g. "/temp/batch/batch/in" (Linux)
  or "\\temp\\batch\\batch\\in" (Windows)

OutputFolder: Path of the folder to write processed batch records (documents) to.
  E.g. "/temp/batch/batch/out" (Linux)
  or "\\temp\\batch\\batch\\out" (Windows)

OutputFileExtension: File extension for output (processed) files. When the BatchType has the performMerge flag enabled, this property is used to specify the resulting file extension.
  E.g. ".PDF"
   
FilePattern: File pattern (by extension) of the files in the input folder that will be processed, e.g. "*.xml" as used in the sample configuration. The Sync Utility will only process files with this extension.

ChunkSize: Sets the number of batch records (files) to transmit to the server in each HTTP call to the batch engine. Limits the number of HTTP calls by combining batch records into a file-transfer package. The limit on payload file size is 100MB per file.
  E.g. 10 (10 files will be sent to the engine in one HTTP call).

  There is no upper limit on the ChunkSize setting, though its value affects network throughput and is also affected by the number of available CPU processes, which can impose throttling.

   
RecordProcessedExtension: The file extension of processed documents stored in the OutputFolder path.
  Default: ".processed"

FileNameTemplate: Sets the filename pattern for processed files stored in the OutputFolder path. Allows you to map the output document to the payload that was sent.
  E.g. "{RecordId}_{RecordIdentifier}_{FileName}.pdf"

  Tag reference:

  • {RecordId}: Unique ID of the batch record.
  • {RecordIdentifier}: Value extracted from the XML or JSON payload (via the XPath or JSON path).
  • {FullFileName}: Original filename of the payload before chunks were created.
  • {FileName}: Original filename of the payload without the extension.
  • {FileExtension}: Original file extension.
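
To show how the tags in a FileNameTemplate combine into an output filename, here is a minimal Python sketch. It only mimics the substitution described in the tag reference: the function name render_file_name and the sample values are hypothetical, and whether {FileExtension} includes the leading dot is an assumption.

```python
import os


def render_file_name(template, record_id, record_identifier, payload_name):
    """Fill the documented tags of a FileNameTemplate for one batch record."""
    file_name, file_extension = os.path.splitext(payload_name)
    tags = {
        "RecordId": record_id,
        "RecordIdentifier": record_identifier,
        "FullFileName": payload_name,
        "FileName": file_name,
        "FileExtension": file_extension,  # assumed to include the leading dot
    }
    return template.format(**tags)


print(render_file_name(
    "{RecordId}_{RecordIdentifier}_{FileName}.pdf",
    record_id="1001",            # hypothetical record ID
    record_identifier="ACC-42",  # hypothetical value extracted from the payload
    payload_name="statement.xml",
))
# prints: 1001_ACC-42_statement.pdf
```

Including {RecordIdentifier} or {FileName} in the template is what makes it possible to trace an output document back to the payload that produced it.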
   

   


Sample appSettings Config File

{
  "AppSettings": {
    "AppInsights": true
  },
  "ApplicationInsights": {
    "ConnectionString": "InstrumentationKey=<Your-Key>;IngestionEndpoint=<URL>;LiveEndpoint=<URL>",
    "LogLevel": {
      "Default": "Information"
    }
  },
  "BatchTenant": {
    "BatchTenantId": "{BATCH-TENANT-GUID}"
  },
  "BatchClient": {
    "Url": "{URL}",
    "Port": 443,
    "HttpMode": "Https",
    "TokenEndpoint": "{SERVER-TOKEN-ENDPOINT}",
    "ClientId": "{CLIENT-ID-FROM-DOCFUSION}",
    "ClientSecret": "{CLIENT-SECRET-FROM-DOCFUSION}",
    "Scope": "DocFusion"
  },
  "BatchStatusFolder": {
    "InputFolder": "/temp/batch/batches-state/in",
    "OutputFolder": "/temp/batch/batches-state/out",
    "AutoApproveWhenNeeded": true,
    "RecordBlockSize": 20, 
    "RetryCount": 5,
    "RetryBackoffSeconds": 5,
    "Schedule": "*/1 * * * * *"
  },
  "BatchFolderMonitors": {
    "FolderMonitors": [
      
      {
        "Enabled": true,
        "Name": "Rewards Statement",
        "BatchTypeId": "<BATCH-TYPE-GUID>",
        "Schedule": "*/10 * * * * *",
        "InputFolder": "/temp/batch/in",
        "OutputFolder": "/temp/batch/out",
        "OutputFileExtension": ".pdf",
        "FilePattern": "*.xml",
        "ChunkSize": 10,
        "RecordProcessedExtension": ".processed",
        "FileNameTemplate": "{RecordId}_{RecordIdentifier}_{FileName}.pdf" // {RecordId} {RecordIdentifier} {FullFileName} {FileName} {FileExtension}
      }
    ]
  },
  "Logging": {
    "LogLevel": {
      "Default": "Trace"
    },
    "Console": {
      "FormatterName": "simple-console",
      "FormatterOptions": {
        "SingleLine": false,
        "IncludeScopes": true,
        "TimestampFormat": "yyyy/MM/dd HH:mm:ss "
      }
    }
  },
  "Serilog": {
    "Using": [
      "Serilog.Sinks.Console",
      "Serilog.Sinks.File",
      "Serilog.Sinks.ApplicationInsights"
    ],
    "_MinimumLevel": "Verbose", // Debug, Information, Warning, Error, Fatal
    "MinimumLevel": {
      "Default": "Verbose",
      "Override": {
        "Microsoft": "Warning",
        "System": "Warning"
      }
    },
    "WriteTo": [
      {
        "Name": "Console"
      },
      {
        "Name": "File",
        "Args": {
          "path": "logs/log-.txt",
          "rollingInterval": "Day",
          "fileSizeLimitBytes": 5242880,
          "rollOnFileSizeLimit": true,
          "retainedFileCountLimit": 31
        }
      },
      {
        "Name": "ApplicationInsights",
        "Args": {
          "connectionString": "InstrumentationKey=<Your-Key>;IngestionEndpoint=<URL>;LiveEndpoint=<URL>",
          "telemetryConverter": "Serilog.Sinks.ApplicationInsights.TelemetryConverters.TraceTelemetryConverter, Serilog.Sinks.ApplicationInsights",
          "restrictedToMinimumLevel": "Warning"
        }
      }
    ]
  }
}
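
Before restarting the Sync Utility with a new configuration, it can help to check that the properties documented in this article are present. The Python sketch below does this for an already-parsed configuration (a dict). It is an illustrative helper, not part of the Sync Utility: the names DOCUMENTED, MONITOR_DOCUMENTED and missing_settings are hypothetical, and treating every documented property as mandatory is an assumption. Note that Python's json module rejects the // comments used in the sample above, so they would need to be removed before loading the file with json.load.

```python
# Properties documented in this article, grouped by JSON object.
DOCUMENTED = {
    "BatchTenant": ["BatchTenantId"],
    "BatchClient": ["Url", "TokenEndpoint", "ClientId", "ClientSecret", "Scope"],
    "BatchStatusFolder": [
        "InputFolder", "OutputFolder", "AutoApproveWhenNeeded", "Schedule",
    ],
}
MONITOR_DOCUMENTED = [
    "Enabled", "Name", "BatchTypeId", "Schedule", "InputFolder", "OutputFolder",
    "OutputFileExtension", "FilePattern", "ChunkSize",
    "RecordProcessedExtension", "FileNameTemplate",
]


def missing_settings(config):
    """Return dotted names of documented properties absent from config."""
    missing = []
    for section, keys in DOCUMENTED.items():
        block = config.get(section, {})
        missing += [f"{section}.{key}" for key in keys if key not in block]
    monitors = config.get("BatchFolderMonitors", {}).get("FolderMonitors", [])
    for index, monitor in enumerate(monitors):
        missing += [
            f"FolderMonitors[{index}].{key}"
            for key in MONITOR_DOCUMENTED
            if key not in monitor
        ]
    return missing
```

An empty result means every documented property was found; each entry otherwise names the object and property to add.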
