...

| Component | Small | Medium | Large | Bigger (Storage Optimised) | Faster (Speed Optimised) |
|---|---|---|---|---|---|
| CPU | 12 cores, i7/Xeon | 16 cores, i7/Xeon | 32 cores, i7/Xeon | 64 cores, i7/Xeon | 20 cores, i7/Xeon |
| Memory | 48 GB DDR4+ | 64 GB DDR4+ | 128 GB DDR4+ | 256 GB DDR4+ | 128 GB DDR4+ |
| SSD | 256 GB | 500 GB | 4 TB | 48 TB | 4 TB |
| Consumer grade GPU | 1 x RTX 4090 24 GB; 1 x RTX 4070 Ti 12 GB | 4 x RTX 4090 24 GB; 1 x RTX 4070 Ti | 8 x RTX 4090 24 GB; 2 x RTX 4070 Ti | 10 x RTX 4090 24 GB | 1 x H100 NVL; 2 x RTX 4070 Ti |
| Data Center grade GPU | 2 x L4 24 GB | 1 x H100 NVL | 2 x H100 NVL | 2 x H100 NVL | 3 x H100 NVL |
| LLM Supported Options (Consumer Grade GPU) | 1x Llama 3.1 8B | 4x Llama 3.1 8B OR 1x Llama 3.3 70B (20k tokens) | 8x Llama 3.1 8B OR 2x Llama 3.3 70B (20k tokens) OR 1x Llama 3.3 70B (128k tokens), or a combination of the above | 8x Llama 3.1 8B OR 2x Llama 3.3 70B (20k tokens) OR 1x Llama 3.3 70B (128k tokens), or a combination of the above | 1x Llama 3.1 70B (20k tokens). Advantage: single-query speed is 2x as fast as with RTX 4090 |
| Embedding GPUs | 1 embedding GPU | 1 embedding GPU | 2 embedding GPUs | GPUs allocated for embedding/LLM usage according to requirements | 2 embedding GPUs |
| Number of Users Supported | 150 | 800 (8B) OR 150 (70B) | 2000 (8B) OR 400 (70B) | 2000 | 400 |
| Amount of data supported | 500 GB | 700 GB | 1.4 TB | 30 TB | 1.4 TB |
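The user and data limits above can be turned into a quick lookup when shortlisting a configuration. The following is only an illustrative sketch, not part of the sizing guidance: the names `Tier`, `TIERS` and `matching_tiers` are hypothetical, terabytes are assumed to be decimal (1 TB = 1000 GB), and for Medium and Large the higher (8B) user figure is used, so 70B deployments need the lower figures from the table instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_users: int    # highest user count listed for the tier (8B figure where two are given)
    max_data_gb: int  # "Amount of data supported", assuming 1 TB = 1000 GB


# Values copied from the sizing table, in table order.
TIERS = [
    Tier("Small", 150, 500),
    Tier("Medium", 800, 700),
    Tier("Large", 2000, 1400),
    Tier("Bigger (Storage Optimised)", 2000, 30000),
    Tier("Faster (Speed Optimised)", 400, 1400),
]


def matching_tiers(users: int, data_gb: float) -> list[str]:
    """Return the names of all tiers whose listed limits cover the request."""
    return [
        t.name
        for t in TIERS
        if users <= t.max_users and data_gb <= t.max_data_gb
    ]


if __name__ == "__main__":
    # Example: 500 users and 600 GB of data.
    print(matching_tiers(500, 600))
```

The function returns every tier that fits rather than a single "best" one, because the table does not rank the Large, Bigger and Faster configurations against each other; the final choice still depends on model size, context length and query-speed requirements.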

...