BusinessGPT File loader

Service Purpose: To upload files from a folder into the BusinessGPT dashboard

Preparations: The FirewallAPI service must be installed in advance

Installation

Copy the desired version from \\fs\Build Versions\Production\BGFileLoader or \\fs\Build Versions\Test\BGFileLoader to C:\Agat on the machine

Configuration

Open C:\Agat\BGFileLoader\Configuration\ApplicationSettings.config and configure the following:

  1. FolderPath - The folder from which the service uploads files; can be a local folder or a UNC path.

  2. ApiKey - See here

  3. UploadApiUrl - The BusinessGPT upload service API URL. See here

  4. PerformanceFileCount - The number of files after which the service writes performance statistics (total file count and processing time) to the log. Default: 50.

  5. DBPath - Local path to the SQLite DB file. In general, nothing should be changed other than whether the file resides on drive C or D.

  6. RunningIntervalMinutes - The interval, in minutes, at which the service scans the folder for newly added files.

  7. ReprocessAllFiles - Default: False. Set to True to reprocess all files in the folder and upload them again.

  8. EnableCopilotProcessing - Enables Copilot eDiscovery data processing. Yet to be released.

  9. CopilotDataFolderLocation - The location of the ZIP file exported from eDiscovery. Yet to be released.
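Assuming a standard .NET appSettings-style configuration file (the exact layout is not shown in this document, so the structure below is an assumption and all values are illustrative), the settings above might look like:

```xml
<appSettings>
  <!-- Folder to scan; local path or UNC path (illustrative value) -->
  <add key="FolderPath" value="\\fileserver\Shared\Documents" />
  <add key="ApiKey" value="YOUR-API-KEY" />
  <add key="UploadApiUrl" value="https://businessgpt.example.com/api/upload" />
  <add key="PerformanceFileCount" value="50" />
  <add key="DBPath" value="C:\Agat\BGFileLoader\FileLoader.db" />
  <add key="RunningIntervalMinutes" value="5" />
  <add key="ReprocessAllFiles" value="False" />
</appSettings>
```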

Installing the service

Open cmd as administrator, go to the service installation folder, and run the following command:
>BusinessGPTFileLoader.exe install

To uninstall the service run the command:
>BusinessGPTFileLoader.exe remove


Note: The service can run in console mode (not as a Windows Service) with the command:
>BusinessGPTFileLoader.exe console

How it works

Sanity test

Upload a file to the folder and check that it was successfully uploaded to the dashboard. The file can be found under Data Sources > Documents.

Local Folder

We can see that the file has been successfully uploaded to the dashboard



Upload

Files are uploaded into a collection whose name is the original folder name plus the date of the first run. Any files added to this folder are uploaded to the same collection.
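The naming rule above can be sketched as follows; the separator and date format are assumptions for illustration, as the document does not specify them:

```python
from datetime import date
from pathlib import PureWindowsPath

def collection_name(folder: str, first_run: date) -> str:
    # Collection = original folder name + date of the first run.
    # Separator and ISO date format are illustrative assumptions.
    return f"{PureWindowsPath(folder).name}_{first_run.isoformat()}"

print(collection_name(r"\\fs\Shared\Contracts", date(2024, 6, 25)))
```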

Along with each file, the lists of users and groups that have read access to the file are uploaded as well.

If one of these users or groups changes, an update is sent.

Error handling

On an upload error, the loader marks the record's status as Failed.

Each failed record goes through 3 retry attempts at intervals of 1 min, 10 min, and 1 hour.
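The retry schedule above can be expressed as a small lookup; the function name is illustrative, not the service's actual code:

```python
# Retry schedule described above: 3 attempts at 1 min, 10 min, and 1 hour.
RETRY_DELAYS_MINUTES = [1, 10, 60]

def next_retry_delay(attempt: int):
    """Delay in minutes before retry `attempt` (1-based); None once the
    3 attempts are exhausted and the record stays marked as failed."""
    if 1 <= attempt <= len(RETRY_DELAYS_MINUTES):
        return RETRY_DELAYS_MINUTES[attempt - 1]
    return None
```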

In addition, you can retry all failed records by setting the “RetryAllFailed” configuration option, which requires a restart of the service. Once restarted with this option set, the service resets all failed records to their first retry attempt.

Specification

The service includes a SQLite DB as part of the installation.

For each file, the service writes the handling status (Failed/Finished), ContentID, and time into the DB (table name: FileLoaderStatus).
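A minimal sketch of how such a status table could be inspected with Python's sqlite3 module; the column names below are assumptions, since the actual schema is not documented here:

```python
import sqlite3

# Hypothetical FileLoaderStatus schema -- the real column names are
# not documented here, so these are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE FileLoaderStatus (
        FilePath    TEXT PRIMARY KEY,
        Status      TEXT,   -- 'Finished' or 'Failed'
        ContentID   TEXT,
        ProcessedAt TEXT
    )""")
conn.execute(
    "INSERT INTO FileLoaderStatus VALUES (?, ?, ?, ?)",
    (r"\\fs\Shared\a.pdf", "Finished", "c-123", "2024-06-25T12:00:00"))
conn.commit()

# Count the records marked as Failed (candidates for retry).
failed = conn.execute(
    "SELECT COUNT(*) FROM FileLoaderStatus WHERE Status = 'Failed'"
).fetchone()[0]
print(failed)
```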

When the service starts, it writes the total number of files to load into the log.

Files are processed in order of modification date, starting with the newest files.

It scans recursively, expanding all sub-folders.

If the service restarts, it will check again how many files need to be uploaded.

The service writes into a log the progress statistics every 50 files (PerformanceFileCount):

  • Total files loaded, succeeded, failed

  • Processing time of the last 50 files.

  • Total processing time
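The batch statistics above can be sketched like this; the class and log-line format are illustrative, not the service's actual code:

```python
import time

class ProgressLogger:
    """Sketch of the batch statistics described above; names and
    log format are illustrative assumptions."""

    def __init__(self, batch_size=50):
        self.batch_size = batch_size
        self.succeeded = 0
        self.failed = 0
        self.start = self.batch_start = time.monotonic()

    def record(self, ok):
        """Record one processed file; return a log line every batch_size files."""
        if ok:
            self.succeeded += 1
        else:
            self.failed += 1
        total = self.succeeded + self.failed
        if total % self.batch_size == 0:
            now = time.monotonic()
            line = (f"loaded={total} succeeded={self.succeeded} "
                    f"failed={self.failed} "
                    f"batch_sec={now - self.batch_start:.1f} "
                    f"total_sec={now - self.start:.1f}")
            self.batch_start = now  # reset the per-batch timer
            return line
        return None
```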

 

In the next version, the service will update the ContentSourceName in the content info table to “FileLoader” and the ContentSourceIP to the IP of the host from which the service is running.

Measuring performance

The file uploader sends each document to the Firewall API. Once the Firewall API inserts the document into the content list, it returns control to the loader, which fetches the next file.

Based on AGAT typical files, it takes around 0.2 seconds per file.
So, if you have 10,000 files, it would take around 0.5 hours.
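The estimate above works out as follows:

```python
seconds_per_file = 0.2   # typical, per the estimate above
file_count = 10_000

total_seconds = file_count * seconds_per_file
total_hours = total_seconds / 3600
print(f"{total_seconds:.0f} s = {total_hours:.2f} h")  # 2000 s = 0.56 h
```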

Syncing performance to the dashboard

Every interval (based on PerformanceFileCount), the service writes the performance into the log.

In the next version, it will sync the data into the performance table. https://agatsoftware.atlassian.net/wiki/spaces/SKYP/pages/3184656385/BusinessGPT-+BG+Tables+SRS#TABLE---BG_PERFORMANCE

ParameterName:

  • Time to Load X files

  • Total Files loaded

createdBy: File Loader (IP of the host machine)

Online tool for SQLite database access

https://sqliteonline.com/

Use File menu => Open DB => select the FileLoader.db file.

You can see a list of DB tables and run SQL queries like SELECT/UPDATE/etc.

Copilot Auditing

When configured to process Copilot data, the file uploader can process the results of a manual export of the eDiscovery prompts.

When enabled (EnableCopilotProcessing), the service looks for the files in the location specified by “CopilotDataFolderLocation”. Each file is parsed, and the prompt and response are sent to the Firewall API (the same way the proxy sends the data).

The FileLoader sends the user's API key, which is manually updated in the config file.

If the user is missing, the data is sent as Unknown@company.com.

The parser of CoPilot review sets is used by both the File Uploader and the ingestor.

The CopilotReviewSetParser is a DLL that unzips the Review Set and extracts the prompt, responses, users, time, and files.
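A rough sketch of what such a parser does, in Python rather than the actual DLL; the export's file layout and JSON field names are not documented here, so the structure below (JSON files inside the ZIP, `user`/`prompt`/`response`/`time` keys) is an assumption:

```python
import io
import json
import zipfile

def parse_review_set(zip_bytes):
    """Illustrative sketch only: unzip a Review Set export and collect
    prompt/response records. Field names are assumptions."""
    records = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if not name.endswith(".json"):
                continue
            item = json.loads(zf.read(name))
            records.append({
                # Fall back to Unknown@company.com when the user is
                # missing, as described above.
                "user": item.get("user", "Unknown@company.com"),
                "prompt": item.get("prompt"),
                "response": item.get("response"),
                "time": item.get("time"),
            })
    return records
```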

 
