
AI model servers

The main cost of the BusinessGPT deployment is the AI servers responsible for answering questions. These servers use graphics card GPUs (Graphics Processing Units).

Graphics Cards

Below are two graphics cards that can be purchased for standard servers. The main difference between them is the number of questions per minute they can process.

The system supports multiple servers with a load balancer to boost performance.
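As a rough illustration of the load-balanced scaling described above, the sketch below estimates total questions per minute for a pool of AI servers. It assumes near-linear scaling behind the load balancer (an assumption, not a measured result) and uses the per-card rates from the pricing table on this page.

```python
# Per-card answer rates (questions per minute) from the table on this page.
RATES_QPM = {"RTX 4090": 29, "RTX 4070 Ti": 14}

def aggregate_throughput(cards):
    """Estimate total questions/min for a load-balanced pool of AI servers.

    Assumes near-linear scaling behind the load balancer.
    `cards` is a list of card names, one entry per server.
    """
    return sum(RATES_QPM[card] for card in cards)

# Example: two RTX 4090 servers plus one RTX 4070 Ti server
print(aggregate_throughput(["RTX 4090", "RTX 4090", "RTX 4070 Ti"]))  # 72
```

Real-world throughput also depends on model size, question length, and load-balancer overhead, so treat this as a capacity-planning starting point only.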

| Card               | Specs              | Answer Speed               | Purchase cost* (one-time) |
|--------------------|--------------------|----------------------------|---------------------------|
| Nvidia RTX 4090    | 24 GB VRAM, 4 vCPU | 2.1 sec (29 questions/min) | $2,289                    |
| Nvidia RTX 4070 Ti | 12 GB VRAM, 8 vCPU | 4.2 sec (14 questions/min) | $790                      |

*Costs were taken from Amazon.com; these graphics cards can also be purchased elsewhere.
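When comparing the two cards, it can help to normalize the one-time purchase cost by answering capacity. The sketch below computes cost per (question/minute) from the figures in the table above; it is illustrative only, since prices vary by vendor and over time.

```python
# Cost-effectiveness of each card: one-time purchase cost divided by
# answering capacity (questions per minute), using the table above.
CARDS = {
    "RTX 4090":    {"cost_usd": 2289, "qpm": 29},
    "RTX 4070 Ti": {"cost_usd": 790,  "qpm": 14},
}

for name, c in CARDS.items():
    per_qpm = c["cost_usd"] / c["qpm"]
    print(f"{name}: ${per_qpm:.2f} per question/min")
# RTX 4090: $78.93 per question/min
# RTX 4070 Ti: $56.43 per question/min
```

By this measure the RTX 4070 Ti is cheaper per unit of throughput, while the RTX 4090 answers each individual question roughly twice as fast.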

Other standard servers

In addition to the AI servers, it is recommended to have three additional standard servers for the following components:

  • Dashboard web site

  • Ingestor service

  • Database (can be shared with an existing SQL instance)

In case of budget constraints, all components can be deployed on the same server.

Embedding servers

The embedding servers are responsible for processing the data and preparing it for the AI.

Their throughput is measured in words / MB of data processed per minute.
