Containers
| Purpose | Option 1 (GPU) | Option 2 (CPU) https://agatsoftware.atlassian.net/wiki/spaces/VA/pages/3556311192 |
|---|---|---|
| Gateway (Linux) (Vector DB, SQL DB, Gateway API) | 16 GB RAM (32 GB for larger databases), 4 cores, 150 GB SSD | 16 GB RAM, 4 cores |
| Embedder (Linux) | 16 GB RAM, 16 GB GPU, 4 cores, 60 GB SSD | 16 GB RAM, 4 cores |
| LLM (Linux) | 16 GB RAM, 24 GB GPU, 4 cores, 80 GB SSD | 16 GB RAM, 8 cores |
| Dashboard/Ingestor (Windows Server, not a container) | 8 GB RAM, 4 cores, 80 GB SSD | 8 GB RAM, 4 cores |
Example CPUs: Intel Xeon Platinum 8000.
Example GPU for embedding: NVIDIA T4 Tensor Core.
Example GPU for LLM: NVIDIA L4 Tensor Core.
The database can be installed on the Windows server or in any other convenient location, including an existing instance of Microsoft SQL Server.
The Linux containers may be deployed on one server or spread over multiple servers.
Recommended host OS: Ubuntu Linux 22.04 LTS or later.
Sizing
The server hardware requirements above are estimated to be sufficient for up to:
50 GB of ingested data
300 licensed users
20 concurrent users asking questions, or 75 questions within a 15-minute period
Add more LLM containers with GPUs to support more concurrent users.
For sizing beyond this baseline (small → XL → enterprise tables, GPU compatibility matrix, HA), see <a href="https://agatsoftware.atlassian.net/wiki/spaces/VA/pages/3741450244">Hardware Sizing</a>.
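As a rough illustration, the baseline figures above can be encoded as a simple check of a planned workload. This is a minimal sketch only: the names `BASELINE` and `within_baseline` are hypothetical and not part of the product, and the check does nothing more than compare a workload against the four stated limits.

```python
# Baseline limits taken from the Sizing figures above.
# BASELINE and within_baseline are illustrative names, not product APIs.
BASELINE = {
    "ingested_gb": 50,
    "licensed_users": 300,
    "concurrent_users": 20,
    "questions_per_15_min": 75,
}


def within_baseline(workload: dict) -> list:
    """Return the names of the limits the workload exceeds (empty if it fits)."""
    return [key for key, limit in BASELINE.items() if workload.get(key, 0) > limit]


if __name__ == "__main__":
    planned = {
        "ingested_gb": 40,
        "licensed_users": 250,
        "concurrent_users": 30,
        "questions_per_15_min": 60,
    }
    exceeded = within_baseline(planned)
    print("Exceeds baseline on:", exceeded or "nothing")
```

An empty result means the workload fits the baseline hardware. A workload that exceeds only the concurrency figures is the case where adding LLM containers with GPUs applies; anything else is a reason to consult the Hardware Sizing page.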
AI model servers
The main hardware cost of a BusinessGPT deployment is the AI servers responsible for answering questions, which use GPUs.
These servers are also used to create the embeddings that process and prepare the data for AI.
Server spec:
Linux server (Ubuntu), 2 CPUs, 8 GB RAM, HDD/SSD with a read/write speed of at least 100 MB/s.
GPU: CUDA 11.8 or later, minimum 24 GB of GPU memory.
The disk size should be 30% larger than the total file size of the original content base.
Multiple servers behind a load balancer may be used for higher performance or high availability.
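The 30% disk rule in the server spec can be worked through as a small calculation. This is a minimal sketch: `required_disk_gb` is a hypothetical helper, and rounding up to whole gigabytes is an assumption, not something the spec states.

```python
import math

# 30% headroom over the original content size, per the server spec above.
HEADROOM = 0.30


def required_disk_gb(content_gb: float) -> int:
    """Disk size in whole GB: 30% larger than the content, rounded up."""
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(content_gb * (1 + HEADROOM), 6))


if __name__ == "__main__":
    for size in (10, 50, 200):
        print(f"{size} GB of content -> {required_disk_gb(size)} GB disk")
```

For example, 50 GB of original content calls for a 65 GB disk under this rule.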
Graphics Cards
For the full GPU comparison table (purchase costs, answer speed, supported models, compatibility matrix), see <a href="https://agatsoftware.atlassian.net/wiki/spaces/VA/pages/3741450244">Hardware Sizing</a> and <a href="https://agatsoftware.atlassian.net/wiki/spaces/VA/pages/3972530177">GPU & LLM Hardware Costs</a>.
Dashboard website
Windows Server 2016 Enterprise (or higher), 4 CPUs, 8 GB RAM, 50 GB disk space, IIS installed. HDD/SSD with a read/write speed of at least 50 MB/s.
Ingestor service
May be co-located with the Dashboard server (above).
Windows Server 2016 Enterprise (or higher), 4 CPUs, 8 GB RAM, 50 GB disk space, IIS installed. HDD/SSD with a read/write speed of at least 50 MB/s.
Dashboard SQL DB
Configured with SQL file storage
Windows Server 2016 Enterprise (or higher), 4 CPUs, 8 GB RAM, IIS installed. HDD/SSD with a read/write speed of at least 50 MB/s.
In general, the integrated content is not saved in this DB. Instead, the product keeps a link to the source file.
Files that are uploaded manually to the Dashboard are saved in the Dashboard DB as is (same size).
Gateway server
Linux server (Ubuntu), 4 CPUs, 8 GB RAM, HDD/SSD with a read/write speed of at least 100 MB/s.
The disk size should be 30% larger than the total size of the original content files.
The server can optionally have an NVIDIA GPU with a minimum of 14 GB VRAM for embedding the content.
Distributed Server Deployment
Recommended values for a multi-server distributed deployment (GPU servers not included — see <a href="https://agatsoftware.atlassian.net/wiki/spaces/VA/pages/3741450244">Hardware Sizing</a> for GPU specs).
| Server | OS | RAM (GB) | CPU Cores | Disk |
|---|---|---|---|---|
| Dashboard, Windows Services | Windows | 32 | 32 | 100 GB |
| Gateway | Linux | 32 | 32 | 100 GB |
| Microsoft SQL Server | Windows/Linux | 32 | 32 | 500 GB |
| Postgres Vector Store | Linux | 32 | 32 | 500 GB |
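For capacity planning, the table above can be summed to get the overall footprint of the non-GPU servers. This is a minimal sketch: the `SERVERS` list and `totals` function are hypothetical names, and the RAM column is assumed to be in GB because the table does not state a unit.

```python
# Rows from the table above: (server, RAM, CPU cores, disk in GB).
# RAM is assumed to be in GB; the source table gives no unit.
SERVERS = [
    ("Dashboard, Windows Services", 32, 32, 100),
    ("Gateway", 32, 32, 100),
    ("Microsoft SQL Server", 32, 32, 500),
    ("Postgres Vector Store", 32, 32, 500),
]


def totals(servers):
    """Sum RAM, CPU cores, and disk across all listed servers."""
    return {
        "ram_gb": sum(row[1] for row in servers),
        "cpu_cores": sum(row[2] for row in servers),
        "disk_gb": sum(row[3] for row in servers),
    }


if __name__ == "__main__":
    print(totals(SERVERS))
```

With the recommended values this comes to 128 GB of RAM, 128 CPU cores, and 1200 GB of disk, excluding the GPU servers.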
Offline Installation
File / Container sizes
Gateway Image: 7.5 GB
Vector DB Image: 500 MB
Microsoft SQL DB Image (Optional): 1.6 GB
LLM Image with Model: 17 GB
BG Service + Dashboard installation package: 250 MB
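To plan transfer media or staging space for an offline installation, the sizes above can be added up. This is a minimal sketch: `ARTIFACTS_MB` and `total_mb` are hypothetical names, and 1 GB is treated as 1000 MB, which is an assumption about how the sizes were measured.

```python
# Sizes in MB from the list above; 1 GB is treated as 1000 MB (assumption).
ARTIFACTS_MB = {
    "Gateway image": 7500,
    "Vector DB image": 500,
    "Microsoft SQL DB image (optional)": 1600,
    "LLM image with model": 17000,
    "BG Service + Dashboard installation package": 250,
}
OPTIONAL = {"Microsoft SQL DB image (optional)"}


def total_mb(include_optional: bool = True) -> int:
    """Total download size in MB, with or without the optional SQL image."""
    return sum(
        size
        for name, size in ARTIFACTS_MB.items()
        if include_optional or name not in OPTIONAL
    )


if __name__ == "__main__":
    print(f"With optional SQL image:    {total_mb() / 1000:.2f} GB")
    print(f"Without optional SQL image: {total_mb(False) / 1000:.2f} GB")
```

Under these assumptions the full set is about 26.85 GB, or 25.25 GB without the optional Microsoft SQL DB image.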