| Component | Small | Medium | X Large | Faster |
|---|---|---|---|---|
| CPU | 12 cores, i7/Xeon | 16 cores, i7/Xeon | 32 cores, i7/Xeon | 20 cores, i7/Xeon |
| Memory | 48 GB DDR4+ | 64 GB DDR4+ | 128 GB DDR4+ | 128 GB DDR4+ |
| SSD | 256 GB | 500 GB | 4 TB | 4 TB |
| GPU | 1 x RTX 4090 24GB, 1 x RTX 4070 Ti | 4 x RTX 4090 24GB, 1 x RTX 4070 Ti | 8 x RTX 4090 24GB, 2 x RTX 4070 Ti | 1 x H100 NVL, 2 x RTX 4070 Ti |
| LLM Supported Options | 1x Llama 3.1 8B; 1 embedding GPU | 4x Llama 3.1 8B or 1x Llama 3.1 70B (20k tokens); 1 embedding GPU | 8x Llama 3.1 8B or 2x Llama 3.1 70B (20k tokens) or 1x Llama 3.1 70B (128k tokens), or a combination of the above; 2 embedding GPUs | 1x Llama 3.1 70B (20k tokens); advantage: single-query speed is 2x as fast as with RTX 4090; 2 embedding GPUs |
| Number of Users Supported | 150 | 800 (8B) or 150 (70B) | 2000 (8B) or 400 (70B) | 400 |
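
As a rough aid for reading the table, the sketch below encodes the "Number of Users Supported" row and lists which configurations cover a given user count for a given model size. The names `USER_CAPACITY` and `tiers_for` are hypothetical and introduced only for this illustration. It assumes the third column is the X Large tier, that the single figure for Small refers to Llama 3.1 8B, and that the single figure for Faster refers to Llama 3.1 70B, since those are the only models listed for those tiers.

```python
# Supported users per tier and model size, copied from the table above.
# Assumption: Small's "150" is for the 8B model and Faster's "400" is for
# the 70B model, because those are the only models listed for those tiers.
USER_CAPACITY = {
    "Small": {"8B": 150},
    "Medium": {"8B": 800, "70B": 150},
    "X Large": {"8B": 2000, "70B": 400},
    "Faster": {"70B": 400},
}


def tiers_for(users: int, model: str) -> list[str]:
    """Return the tiers (in table order) that support `users` for `model`."""
    return [
        tier
        for tier, capacity in USER_CAPACITY.items()
        if capacity.get(model, 0) >= users
    ]


if __name__ == "__main__":
    print(tiers_for(400, "70B"))
    print(tiers_for(100, "8B"))
```

An empty list means no configuration in the table covers that combination. The lookup treats the listed user counts as hard upper bounds, which is a simplification: the table does not say how capacity changes when 8B and 70B models are combined on the same machine.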