...

Purpose

Option 2 (CPU) is intended for a POC with the LLM running on a CPU.

Gateway (Linux): Vector DB, SQL DB, Gateway API

  • Option 1 (GPU): 16 GB RAM (32 GB for larger databases), 4 cores, 150 GB SSD
  • Option 2 (CPU): 16 GB RAM, 4 cores

Embedder (Linux)

  • Option 1 (GPU): 16 GB RAM, 16 GB GPU, 4 cores, 60 GB SSD
  • Option 2 (CPU): 16 GB RAM, 4 cores

LLM (Linux)

  • Option 1 (GPU): 16 GB RAM, 24 GB GPU, 4 cores, 80 GB SSD
  • Option 2 (CPU): 16 GB RAM, 8 cores

Dashboard/Ingestor (Windows Server - not a container)

  • Option 1 (GPU): 8 GB RAM, 4 cores, 80 GB SSD
  • Option 2 (CPU): 8 GB RAM, 4 cores
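For quick capacity planning, the per-server figures above can be totaled programmatically. A minimal sketch, where the dictionary simply restates the Option 1 (GPU) column from the table (adjust it for Option 2 or for additional LLM containers):

```python
# Per-server hardware figures for Option 1 (GPU), taken from the table above.
OPTION_1 = {
    "gateway":   {"ram_gb": 16, "gpu_gb": 0,  "cores": 4, "ssd_gb": 150},
    "embedder":  {"ram_gb": 16, "gpu_gb": 16, "cores": 4, "ssd_gb": 60},
    "llm":       {"ram_gb": 16, "gpu_gb": 24, "cores": 4, "ssd_gb": 80},
    "dashboard": {"ram_gb": 8,  "gpu_gb": 0,  "cores": 4, "ssd_gb": 80},
}

def totals(spec):
    """Sum each resource across all servers in the deployment."""
    keys = ("ram_gb", "gpu_gb", "cores", "ssd_gb")
    return {k: sum(server[k] for server in spec.values()) for k in keys}

print(totals(OPTION_1))
# {'ram_gb': 56, 'gpu_gb': 40, 'cores': 16, 'ssd_gb': 370}
```

This gives an at-a-glance total (56 GB RAM, 40 GB GPU memory, 16 cores, 370 GB SSD for Option 1) when the containers are consolidated onto one host, per note 5 below.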

Info
  1. Example CPU processors: Intel Xeon Platinum 8000 series.

  2. Example GPU for embedding: NVIDIA T4 Tensor Core.

  3. Example GPU for LLM: NVIDIA L4 Tensor Core.

  4. Database can be installed on the Windows server or any other convenient location, including an existing instance of Microsoft SQL Server.

  5. The Linux containers may be deployed on one server or spread over multiple servers.

  6. Recommended host OS: Ubuntu Linux 22.04 LTS or later.

Sizing

The server hardware requirements above are estimated to be sufficient for up to:

  • 50 GB of ingested data

  • 300 licensed users

  • 20 concurrent users asking questions

Add more LLM containers with GPUs to support more concurrent users.
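The scaling rule above can be sketched as a back-of-the-envelope estimate. The per-container capacity of ~20 concurrent users comes from the sizing list; treating it as linear per added container is an assumption, not a product specification:

```python
import math

# Assumption: one GPU-backed LLM container serves roughly 20 concurrent users,
# the figure given in the sizing list above.
USERS_PER_LLM_CONTAINER = 20

def llm_containers_needed(concurrent_users: int) -> int:
    """Estimate how many LLM containers (each with its own GPU) a load requires."""
    return max(1, math.ceil(concurrent_users / USERS_PER_LLM_CONTAINER))

print(llm_containers_needed(20))  # 1
print(llm_containers_needed(50))  # 3
```

In practice, measured throughput depends on the model, prompt sizes, and GPU, so this estimate should be validated under realistic load before provisioning.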

AI model servers

The main hardware cost of a BusinessGPT deployment is the AI servers responsible for answering questions, since these require GPUs.

...