8B
Benchmark Question:
Prompt: 7000 tokens
Completion: 214 tokens
...
* Discounts available for different usage patterns
70B
Option A
Model: neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8
GPU: 4 x RTX 4090, 96 GB VRAM total (84 GB used)
Disk: 72 GB used by model
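
As a rough sanity check on the Option A figures, the VRAM numbers can be worked through in a few lines. This is only a sketch: the 24 GB per-card value is the RTX 4090's published VRAM size rather than something stated in the listing, and the variable and function names are made up for illustration.

```python
# Figures taken from the Option A listing. VRAM_PER_GPU_GB is an assumption
# based on the RTX 4090's published 24 GB of VRAM, not on the listing itself.
GPU_COUNT = 4
VRAM_PER_GPU_GB = 24
VRAM_USED_GB = 84


def total_vram_gb(gpu_count: int, vram_per_gpu_gb: int) -> int:
    """Combined VRAM across all cards, in GB."""
    return gpu_count * vram_per_gpu_gb


def utilization(used_gb: float, total_gb: float) -> float:
    """Fraction of the combined VRAM that is in use."""
    return used_gb / total_gb


total = total_vram_gb(GPU_COUNT, VRAM_PER_GPU_GB)
print(f"Total VRAM: {total} GB")
print(f"VRAM in use: {utilization(VRAM_USED_GB, total):.1%}")
print(f"Free VRAM: {total - VRAM_USED_GB} GB")
```

Under these assumptions the four cards add up to the 96 GB quoted, and the 84 GB in use leaves about 12 GB of headroom across the whole setup.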