BusinessGPT File Loader

 

The role of the loader is to take files from a folder and upload them into the dashboard.

Configuration

The loader runs as a Windows service named “BusinessGPTFileLoader” and has the following configuration settings:

  1. FolderPath - can be a local folder or a UNC path

  2. ApiKey - See here

  3. UploadApiUrl - BusinessGPT upload service API URL - See here

  4. PerformanceFileCount - the number of files after which the service writes performance statistics (total number of files and processing time) to the log. Default: 50.

  5. DBPath - local path to SQLite DB file

  6. RunningIntervalMinutes - the interval, in minutes, at which the service scans the folder for newly added files

  7. ReprocessAllFiles - default False. Set to True to reprocess all the files in the folder and upload them again.

  8. EnableCopilotProcessing - enables Copilot eDiscovery data processing

  9. CopilotDataFolderLocation - the location of the ZIP file exported from eDiscovery

Notes: FolderPath must be accessible to the service. The service runs under the Network Service account and authenticates to remote servers with the computer's credentials. If FolderPath is a network shared folder, grant the computer account read permissions on that folder.
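
The settings above and the folder-access requirement can be sketched in Python as follows. This is only an illustration: the class name, the field names, and the access check are assumptions, and the service's real config file format is not described here. Only the defaults for PerformanceFileCount and ReprocessAllFiles come from the list above.

```python
import os
from dataclasses import dataclass


@dataclass
class LoaderConfig:
    """Hypothetical in-memory view of the settings listed above."""
    folder_path: str                         # FolderPath: local folder or UNC path
    api_key: str                             # ApiKey
    upload_api_url: str                      # UploadApiUrl
    db_path: str                             # DBPath: local SQLite DB file
    running_interval_minutes: int            # RunningIntervalMinutes (no default documented)
    performance_file_count: int = 50         # PerformanceFileCount, documented default
    reprocess_all_files: bool = False        # ReprocessAllFiles, documented default
    enable_copilot_processing: bool = False  # default value is an assumption
    copilot_data_folder_location: str = ""   # CopilotDataFolderLocation


def check_folder_access(config: LoaderConfig) -> list[str]:
    """Return a list of problems; an empty list means the folder looks readable."""
    problems = []
    if not os.path.isdir(config.folder_path):
        problems.append(f"FolderPath is not a reachable folder: {config.folder_path}")
    elif not os.access(config.folder_path, os.R_OK):
        problems.append(f"No read permission on FolderPath: {config.folder_path}")
    return problems
```

A check like this only reflects the permissions of the account that runs it, so it would have to run under the same account as the service to be meaningful.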

Specification

The service includes an SQLite DB as part of the installation.

For each file, the service writes the handling status (Failed/Finished), the ContentID, and the time into the DB (table name: FileLoaderStatus).
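
As a rough illustration of such a status table, the sketch below uses Python's built-in sqlite3 module. Only the table name FileLoaderStatus and the recorded items (status, ContentID, time) come from the text; the column names, types, the primary key, and the record_status helper are assumptions.

```python
import sqlite3
from datetime import datetime, timezone

# Assumed schema: only the table name and the kinds of values are documented.
SCHEMA = """
CREATE TABLE IF NOT EXISTS FileLoaderStatus (
    FilePath    TEXT PRIMARY KEY,   -- assumed key column
    Status      TEXT NOT NULL,      -- 'Failed' or 'Finished'
    ContentID   TEXT,               -- assumed to be empty for failed files
    ProcessedAt TEXT NOT NULL       -- ISO-8601 timestamp (assumed format)
)
"""


def record_status(conn, file_path, status, content_id=None):
    """Insert or overwrite the status row for one file."""
    if status not in ("Failed", "Finished"):
        raise ValueError(f"unexpected status: {status}")
    conn.execute(
        "INSERT OR REPLACE INTO FileLoaderStatus VALUES (?, ?, ?, ?)",
        (file_path, status, content_id, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # the service would open the DBPath file instead
conn.execute(SCHEMA)
record_status(conn, r"C:\docs\a.pdf", "Finished", "content-1")
record_status(conn, r"C:\docs\b.pdf", "Failed")
print(conn.execute("SELECT FilePath, Status FROM FileLoaderStatus ORDER BY FilePath").fetchall())
```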

When the service starts, it writes the total number of files to load into the log.

Files are processed in order of their modification date, starting with the newest files.

The service scans the folder recursively, including all sub-folders.
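
The scan order described above (recursive, newest modification date first) could look roughly like this in Python. The function name is hypothetical, and the real service may collect and sort files differently.

```python
import os


def files_newest_first(folder_path):
    """Return every file under folder_path (recursively), newest modification time first."""
    found = []
    for root, _dirs, names in os.walk(folder_path):
        for name in names:
            full_path = os.path.join(root, name)
            found.append((os.path.getmtime(full_path), full_path))
    # Sort by modification time only, newest first.
    found.sort(key=lambda item: item[0], reverse=True)
    return [path for _mtime, path in found]
```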

If the service restarts, it will check again how many files need to be uploaded.

The service writes progress statistics to the log every PerformanceFileCount files (50 by default):

  • Total files loaded, succeeded, failed

  • Processing time of the last 50 files

  • Total processing time
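
A minimal sketch of this periodic logging, assuming a hypothetical upload callable that raises an exception when a file fails. The function name, the log format, and the counters are illustrative and are not taken from the service itself.

```python
import logging
import time

log = logging.getLogger("file_loader_sketch")


def load_files(paths, upload, performance_file_count=50):
    """Upload paths one by one; log statistics every performance_file_count files."""
    stats = {"loaded": 0, "succeeded": 0, "failed": 0}
    started = batch_started = time.monotonic()
    for path in paths:
        try:
            upload(path)  # hypothetical callable, assumed to raise on failure
            stats["succeeded"] += 1
        except Exception:
            stats["failed"] += 1
        stats["loaded"] += 1
        if stats["loaded"] % performance_file_count == 0:
            now = time.monotonic()
            log.info(
                "loaded=%d succeeded=%d failed=%d last_%d_files=%.1fs total=%.1fs",
                stats["loaded"], stats["succeeded"], stats["failed"],
                performance_file_count, now - batch_started, now - started,
            )
            batch_started = now  # start timing the next batch
    return stats
```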

 

In the next version, the service will update the ContentSourceName in the content info table to “FileLoader” and the ContentSourceIP to the IP of the host on which the service is running.

Measuring performance

The File Loader sends each document to the Firewall API. Once the Firewall API has inserted the document into the content list, it returns, and the loader sends the next file.

Based on typical AGAT files, uploading takes around 0.2 seconds per file.
So, if you have 10,000 files, it would take around 2,000 seconds, which is roughly 33 minutes (a little over 0.5 hours).
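
The estimate above is a simple multiplication. The helper below is a hypothetical illustration that uses the 0.2 seconds per file figure from the text as its default; actual throughput depends on the files and the environment.

```python
def estimated_load_time_seconds(file_count, seconds_per_file=0.2):
    """Estimate the total upload time for file_count files, processed one after another."""
    return file_count * seconds_per_file


seconds = estimated_load_time_seconds(10_000)
print(f"{seconds:.0f} s = {seconds / 60:.1f} min = {seconds / 3600:.2f} h")
# prints: 2000 s = 33.3 min = 0.56 h
```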

Syncing performance to the dashboard

Each time PerformanceFileCount files have been processed, the service writes the performance data to the log.

In the next version, it will sync the data into the performance table: https://agatsoftware.atlassian.net/wiki/spaces/SKYP/pages/3184656385/BusinessGPT-+BG+Tables+SRS#TABLE---BG_PERFORMANCE

ParameterName:

Time to Load X files

Total Files loaded

 

createdBy: File Loader (IP of the host machine)

Installing the service

Open cmd as administrator, go to the service installation folder, and run the following command:
>BusinessGPTFileLoader.exe install

To uninstall the service, run the following command:
>BusinessGPTFileLoader.exe remove


Note: The loader can also run in console mode (not as a Windows service) with the command:
>BusinessGPTFileLoader.exe console

Online tool for SQLite database access

https://sqliteonline.com/

Use File menu => Open DB => select the FileLoader.db file.

You can see a list of the DB tables and run SQL queries such as SELECT and UPDATE.
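
As an alternative to the online tool, the same file can be inspected locally. The sketch below uses Python's built-in sqlite3 module to list the tables in a database file; the function name is hypothetical, and the path you pass in should be the file configured in DBPath.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Note that sqlite3.connect creates an empty file if the path does not exist, so check the path before calling list_tables("FileLoader.db").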

 

 

Copilot Auditing

When configured to process Copilot data, the File Loader can process the results of a manual export of the eDiscovery prompts.

When enabled (EnableCopilotProcessing), the service looks for the files in the location set in “CopilotDataFolderLocation”. Each file is parsed, and the prompt and response are sent to the Firewall API (in the same way that the proxy sends the data).

The File Loader sends the user's API key, which is updated manually in the config file.

If the user is missing, the data is sent as Unknown@company.com.

The parser of Copilot review sets is used by both the File Loader and the ingestor.

The CopilotReviewSetParser is a DLL that unzips the Review Set and extracts the prompts, responses, users, times, and files.
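
The first steps of this flow can be sketched in Python, with heavy assumptions. The layout of the exported Review Set is not described here, so the sketch only finds the ZIP files, lists their members, and applies the Unknown@company.com fallback. All function names and the "user" record key are hypothetical and do not reflect the actual DLL.

```python
import os
import zipfile

UNKNOWN_USER = "Unknown@company.com"  # fallback user named in the text above


def review_set_archives(copilot_data_folder):
    """Return the ZIP files found directly in the configured folder, sorted by name."""
    return sorted(
        os.path.join(copilot_data_folder, name)
        for name in os.listdir(copilot_data_folder)
        if name.lower().endswith(".zip")
    )


def archive_members(zip_path):
    """List the file names stored inside one exported ZIP."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


def resolve_user(record):
    """Return the user of a parsed record (assumed dict), or the fallback when missing."""
    return record.get("user") or UNKNOWN_USER
```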