Single Unit

Small
  CPU: 12 cores, i7/Xeon
  Memory: 48 GB DDR4+
  SSD: 256 GB
  GPU: 1 x RTX 4090 24 GB; 1 x RTX 4070 Ti 12 GB
  LLM supported options: runs 1x Llama 3.1 8B; 1 embedding GPU
  Users supported: 150
  Data supported: 500 GB

Medium
  CPU: 16 cores, i7/Xeon
  Memory: 64 GB DDR4+
  SSD: 500 GB
  GPU: 4 x RTX 4090 24 GB; 1 x RTX 4070 Ti
  LLM supported options: 4x Llama 3.1 8B OR 1x Llama 3.3 70B (20k tokens); 1 embedding GPU
  Users supported: 800 (8B) or 150 (70B)
  Data supported: 700 GB

Large
  CPU: 32 cores, i7/Xeon
  Memory: 128 GB DDR4+
  SSD: 4 TB
  GPU: 8 x RTX 4090 24 GB; 2 x RTX 4070 Ti
  LLM supported options: 8x Llama 3.1 8B OR 2x Llama 3.3 70B (20k tokens) OR 1x Llama 3.3 70B (128k tokens), or a combination of the above; 2 embedding GPUs
  Users supported: 2000 (8B) or 400 (70B)
  Data supported: 1.4 TB

Bigger (Storage Optimised)
  CPU: 64 cores, i7/Xeon
  Memory: 256 GB DDR4+
  SSD: 48 TB
  GPU: 10 x RTX 4090 24 GB
  LLM supported options: 8x Llama 3.1 8B OR 2x Llama 3.3 70B (20k tokens) OR 1x Llama 3.3 70B (128k tokens), or a combination of the above; GPUs allocated for embedding/LLM usage according to requirements
  Users supported: 2000
  Data supported: 30 TB

Faster (Speed Optimised)
  CPU: 20 cores, i7/Xeon
  Memory: 128 GB DDR4+
  SSD: 4 TB
  GPU: 1 x H100 NVL; 2 x RTX 4070 Ti
  LLM supported options: 1x Llama 3.1 70B (20k tokens); 2 embedding GPUs. Advantage: single-query speed is 2x as fast as with an RTX 4090
  Users supported: 400
  Data supported: 1.4 TB
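The GPU counts above can be sanity-checked with a rough weights-only VRAM estimate. This is a minimal sketch, assuming typical bytes-per-parameter figures (roughly 2 bytes at FP16, roughly 0.5 bytes at 4-bit quantisation); it ignores KV cache and activation memory, which grow with context length and push real requirements higher.

```python
# Rough VRAM sizing sketch (assumed rules of thumb, not vendor figures).
# Weight memory only: parameter count (in billions) times bytes per parameter
# gives an estimate in GB. KV cache and activations are deliberately ignored.

def model_vram_gb(params_billion, bytes_per_param):
    """Estimate weight memory in GB for a model of the given size."""
    return params_billion * bytes_per_param

# An 8B model at FP16 (~2 bytes/param) needs roughly 16 GB,
# which is why a single 24 GB RTX 4090 can host one Llama 3.1 8B.
print(model_vram_gb(8, 2))     # ~16 GB

# A 70B model at FP16 needs roughly 140 GB; even 4-bit quantisation
# (~0.5 bytes/param) still needs ~35 GB plus KV cache, which is why
# 70B deployments span multiple 24 GB GPUs or a single H100 NVL.
print(model_vram_gb(70, 2))    # ~140 GB
print(model_vram_gb(70, 0.5))  # ~35 GB
```

The longer 128k-token option consumes substantially more KV-cache memory than the 20k-token option, which is why it occupies the same GPUs that would otherwise serve two 20k-token instances.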

Llama 3.1 70B is recommended for code generation and data analysis
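As a quick way to apply the table, the tier limits can be encoded in a small lookup. This is an illustrative sketch only: the tier names, user counts (for the 8B model), and data limits are taken from the table above, and the selection rule (smallest tier that covers both limits) is an assumption, not a stated policy.

```python
# Tier limits from the sizing table above: (name, max users on 8B, max data in GB).
TIERS = [
    ("Small",  150,  500),
    ("Medium", 800,  700),
    ("Large",  2000, 1400),
    ("Bigger", 2000, 30000),
]

def pick_tier(users, data_gb):
    """Return the first (smallest) tier whose limits cover the workload."""
    for name, max_users, max_data_gb in TIERS:
        if users <= max_users and data_gb <= max_data_gb:
            return name
    return None  # workload exceeds all single-unit tiers

print(pick_tier(100, 300))   # Small
print(pick_tier(500, 1000))  # Large
```

Note that this ignores the Faster tier, which trades capacity for single-query latency rather than scaling users or data.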
