AI model servers
The main hardware cost of a BusinessGPT deployment is the AI servers that answer questions. These servers use GPUs.
The same servers also create the embeddings used to process and prepare data for the AI model.
Graphics Cards
The GPU options below are available for purchase for standard servers. The cards differ in the number of questions per minute they can process.
The system supports multiple servers with a load balancer to boost performance.
Card | Answer speed** | Time to answer 10 questions asked simultaneously** | Purchase cost* (one-time) |
---|---|---|---|
H100 80GB SXM5 | 1.9 sec | 15 sec | $33,000 |
Nvidia RTX 4090, 24 GB VRAM, 4 vCPU | 2.1 sec (29 questions/min) | 17 sec | $2,289 |
Nvidia RTX 4070 Ti, 12 GB VRAM, 8 vCPU | 2.5 sec (14 questions/min) | 25 sec | |
*Costs were taken from Amazon.com; these graphics cards can also be purchased elsewhere.
**Approximate times when using the Wizard-Vicuna-13B model.
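For rough capacity planning, the per-card throughput figures in the table can be turned into an estimate of how many servers to place behind the load balancer. The sketch below is illustrative only: the function name `servers_needed` and the target of 60 questions per minute are hypothetical, and it assumes throughput scales linearly with the number of servers, which real deployments may not achieve.

```python
import math

# Questions per minute per card, copied from the table above.
# The H100 row lists no questions-per-minute figure, so it is left out.
QUESTIONS_PER_MINUTE = {
    "RTX 4090": 29,
    "RTX 4070 Ti": 14,
}


def servers_needed(card: str, target_qpm: float) -> int:
    """Estimate how many single-GPU servers are needed for a target load.

    Assumption: throughput scales linearly with server count.
    """
    if target_qpm <= 0:
        raise ValueError("target_qpm must be positive")
    return math.ceil(target_qpm / QUESTIONS_PER_MINUTE[card])


if __name__ == "__main__":
    # Hypothetical target load of 60 questions per minute.
    for card in QUESTIONS_PER_MINUTE:
        print(f"{card}: {servers_needed(card, target_qpm=60)} server(s)")
```

Treat the result as a starting point and validate it with a load test on the actual model and data.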
Dashboard website
Windows Server 2016 Enterprise (or higher), 4 CPUs, 8 GB RAM, 50 GB disk space, IIS installed. HDD/SSD with a read/write speed of at least 50 MB/s.
Ingestor service
May be co-located with the Dashboard server (above).
Windows Server 2016 Enterprise (or higher), 4 CPUs, 8 GB RAM, 50 GB disk space, IIS installed. HDD/SSD with a read/write speed of at least 50 MB/s.
Dashboard SQL DB
Configured with SQL file storage
Windows Server 2016 Enterprise (or higher), 4 CPUs, 8 GB RAM, IIS installed. HDD/SSD with a read/write speed of at least 50 MB/s.
In general, integrated content is not saved in this DB. Instead, the product keeps a link to the source file.
Files uploaded manually to the Dashboard are saved in the Dashboard DB as is (at their original size).
Gateway server
Linux server (Ubuntu), 4 CPUs, 8 GB RAM. HDD/SSD with a read/write speed of at least 100 MB/s.
The disk size should be 30% larger than the original content file size.
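The 30% rule can be applied as a simple multiplication. The snippet below is a minimal sketch; the function name `required_disk_gb` and the 100 GB example are hypothetical, and the same calculation applies to any server whose disk is sized relative to the content.

```python
def required_disk_gb(content_gb: float, headroom: float = 0.30) -> float:
    """Return the disk size needed for a given content size.

    headroom=0.30 reflects the "30% larger" rule stated above.
    """
    if content_gb < 0:
        raise ValueError("content_gb must not be negative")
    return content_gb * (1 + headroom)


if __name__ == "__main__":
    # Hypothetical example: 100 GB of original content files.
    print(f"{required_disk_gb(100):.0f} GB")
```

For 100 GB of content this gives 130 GB of disk.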
AI Server - Docker container
Linux server (Ubuntu), 2 CPUs, 4 GB RAM. HDD/SSD with a read/write speed of at least 100 MB/s.
GPU: CUDA 11.8+, minimum 12 GB GPU memory (VRAM).
The disk size should be 30% larger than the original content base file size.
Multiple Docker containers may be used behind a load balancer for higher performance or high availability.
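Before deploying a container, it can help to confirm that the host GPU meets the memory minimum. The following is a rough sketch, not part of the product: it assumes the NVIDIA driver and the `nvidia-smi` tool are installed on the host, the helper names are hypothetical, and it does not check the CUDA version. Cards may report slightly less than their nominal memory, so the check rounds to the nearest whole GiB.

```python
import subprocess

MIN_VRAM_GB = 12  # minimum GPU memory from the requirement above


def parse_gpus(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory.total' rows (memory in MiB) into (name, MiB) pairs."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, mem = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mem.strip())))
    return gpus


def meets_minimum(mem_mib: int, min_gb: int = MIN_VRAM_GB) -> bool:
    """Compare reported memory, rounded to the nearest GiB, with the minimum."""
    return round(mem_mib / 1024) >= min_gb


def query_gpus() -> list[tuple[str, int]]:
    """Ask nvidia-smi for each GPU's name and total memory in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpus(result.stdout)


if __name__ == "__main__":
    for name, mem_mib in query_gpus():
        status = "OK" if meets_minimum(mem_mib) else "below minimum"
        print(f"{name}: {mem_mib} MiB ({status})")
```

Run the script on each host that will run an AI Server container; a host without an NVIDIA driver will raise an error from the `nvidia-smi` call.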