Containers

Each component is listed with two options: Option 1 (GPU) and Option 2 (CPU; POC with LLM running on a CPU).

Gateway (Linux)

Purpose: Vector DB, SQL DB, Gateway API

  • Option 1 (GPU): 16 GB RAM (32 GB for larger databases), 4 cores, 150 GB SSD

  • Option 2 (CPU): 16 GB RAM, 4 cores

Embedder (Linux)

  • Option 1 (GPU): 16 GB RAM, 16 GB GPU, 4 cores, 60 GB SSD

  • Option 2 (CPU): 16 GB RAM, 4 cores

LLM (Linux)

  • Option 1 (GPU): 16 GB RAM, 24 GB GPU, 4 cores, 80 GB SSD

  • Option 2 (CPU): 16 GB RAM, 8 cores

Dashboard/Ingestor (Windows Server, not a container)

  • Option 1 (GPU): 8 GB RAM, 4 cores, 80 GB SSD

  • Option 2 (CPU): 8 GB RAM, 4 cores

Info
  1. Example CPU processors: Intel Xeon Platinum 8000.

  2. Example GPU for embedding: NVIDIA T4 Tensor Core.

  3. Example GPU for LLM: NVIDIA L4 Tensor Core.

  4. Database can be installed on the Windows server or any other convenient location, including an existing instance of Microsoft SQL Server.

  5. The Linux containers may be deployed on one server or spread over multiple servers.

  6. Recommended host OS: Ubuntu Linux 22.04 LTS or later.
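
A host can be checked against the figures above before containers are deployed. The following is a minimal sketch, not part of the product: it assumes a Linux host and uses the Gateway Option 1 figures (4 CPUs, 16 GB RAM, 150 GB SSD). The names `GATEWAY_REQUIREMENTS`, `measure_host`, and `find_shortfalls`, and the 10% tolerance, are hypothetical choices made for illustration.

```python
import os
import shutil

# Gateway container, Option 1 (GPU) figures from the Containers section.
GATEWAY_REQUIREMENTS = {"cpus": 4, "ram_gb": 16, "disk_gb": 150}


def measure_host(path="/"):
    """Return CPU count, physical RAM (GiB), and total disk size (GiB).

    Uses os.sysconf, so this works on Linux (and other POSIX hosts), not Windows.
    """
    page_size = os.sysconf("SC_PAGE_SIZE")
    phys_pages = os.sysconf("SC_PHYS_PAGES")
    return {
        "cpus": os.cpu_count() or 0,
        "ram_gb": page_size * phys_pages / 1024**3,
        "disk_gb": shutil.disk_usage(path).total / 1024**3,
    }


def find_shortfalls(measured, required, tolerance=0.10):
    """Return {resource: (measured, required)} for each resource below spec.

    ASSUMPTION: a 10% tolerance is allowed, because a host sold as
    "16 GB RAM" typically reports slightly less usable memory.
    """
    return {
        name: (measured.get(name, 0), minimum)
        for name, minimum in required.items()
        if measured.get(name, 0) < minimum * (1 - tolerance)
    }
```

Calling `find_shortfalls(measure_host(), GATEWAY_REQUIREMENTS)` on the target server returns an empty dictionary when the host meets the figures; any entry names the resource that falls short.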

Sizing

The server hardware requirements above are estimated to be sufficient for up to:

  • 50 GB of ingested data

  • 300 licensed users

  • 20 concurrent users asking questions / 75 questions within a 15-minute period

Add more LLM containers with GPUs to support more concurrent users.
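
The sizing figures above can be turned into a rough planning estimate. The sketch below assumes that capacity scales linearly with the number of LLM containers, so that each one supports about 20 concurrent users. The document does not state this, so treat the result as a starting point only. The function names are hypothetical.

```python
import math

# Baseline load from the Sizing section for the hardware as specified.
BASELINE_CONCURRENT_USERS = 20
BASELINE_QUESTIONS = 75
BASELINE_WINDOW_MINUTES = 15


def baseline_questions_per_minute():
    """Average question rate implied by 75 questions in 15 minutes."""
    return BASELINE_QUESTIONS / BASELINE_WINDOW_MINUTES


def llm_containers_needed(concurrent_users,
                          users_per_container=BASELINE_CONCURRENT_USERS):
    """Estimate how many LLM containers a given concurrency needs.

    ASSUMPTION: linear scaling, one LLM container per 20 concurrent users.
    Always returns at least one container.
    """
    if concurrent_users < 0:
        raise ValueError("concurrent_users must not be negative")
    return max(1, math.ceil(concurrent_users / users_per_container))
```

Under this assumption, 50 concurrent users would call for three LLM containers with GPUs; real throughput depends on the GPU model and question length, so measure before committing to hardware.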

AI model servers

The main hardware cost of a BusinessGPT deployment is the AI servers responsible for answering questions. These servers use GPUs.

These servers will also be used to create embeddings for processing and preparing the data for AI.

Server spec:

Linux server (Ubuntu), 2 CPUs, 8 GB RAM, HDD/SSD with a read/write speed of at least 100 MB/s.

...

Multiple servers behind a load balancer may be used for higher performance or high availability.

Graphics Cards

Below are GPU options that are available for purchase for standard servers. The difference between the cards is the number of questions per minute that can be processed.

...

*** Multiple mid-range cards are preferable to a smaller number of more expensive cards.

Dashboard website

Windows Server 2016 Enterprise (or higher), 4 CPUs, 8 GB RAM, 50 GB disk space, IIS installed. HDD/SSD with a read/write speed of at least 50 MB/s.

Ingestor service

May be co-located with the Dashboard server (above).

Windows Server 2016 Enterprise (or higher), 4 CPUs, 8 GB RAM, 50 GB disk space, IIS installed. HDD/SSD with a read/write speed of at least 50 MB/s.

Dashboard SQL DB

Configured with SQL file storage.
Windows Server 2016 Enterprise (or higher), 4 CPUs, 8 GB RAM, IIS installed. HDD/SSD with a read/write speed of at least 50 MB/s.

...

If files are uploaded manually to the Dashboard, they are saved in the Dashboard DB as is (same size).

Gateway server

Linux server (Ubuntu), 4 CPUs, 8 GB RAM. HDD/SSD with a read/write speed of at least 100 MB/s.
The disk size should be 30% larger than the original content file size.

The server can optionally have an NVIDIA GPU with a minimum of 14 GB VRAM for embedding the content.
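
The 30% disk rule for the Gateway server is a simple multiplication. A minimal sketch, with a hypothetical function name, covering content storage only (it does not account for the operating system or container images):

```python
def gateway_disk_gb(content_gb, headroom=0.30):
    """Disk size for Gateway content: original content size plus 30%.

    The 30% figure comes from the Gateway server section; the function
    name and the parameterized headroom are illustrative assumptions.
    """
    if content_gb < 0:
        raise ValueError("content_gb must not be negative")
    return content_gb * (1 + headroom)
```

For the 50 GB of ingested data mentioned under Sizing, this gives roughly 65 GB of disk for content.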

Offline Installation

File / Container sizes

Gateway Image: 7.5 GB

Vector DB Image: 500 MB

...