Benchmark Question:
Prompt: 7000 tokens
Completion: 214 tokens
GPU | vRAM (GB) | Answer Speed (s) | 4 questions (s) | Price | AWS* ($/hr) | Runpod* ($/hr) |
---|---|---|---|---|---|---|
L4 | 24 | 16 | 37 | | G6 $0.80 | $0.43 |
4090 | 24 | 5 | 8 | $2,000 | NA | $0.69 |
L40s | 48 | 6.5 | 9 | $9,800 | G6e $1.861 | $1.03 |
H100 SXM | 80 | 2.4 | 3.4 | $31,000 | NA | $2.99 |
\* Discounts available for different usage patterns
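
The table puts purchase price next to rental rates, so two derived figures are easy to sketch: the rental cost of a single answer, and the number of rental hours that would equal the purchase price. The snippet below is only an illustration under stated assumptions: answer speed is read as seconds per answer, the Runpod column as US dollars per hour, the L4 row's $0.43 as its Runpod rate, and billing is assumed to cover only the time spent answering. The function names `cost_per_answer` and `break_even_hours` are made up for this example.

```python
# Figures copied from the table; None marks a price the table does not give.
# Assumptions: answer speed is in seconds, Runpod rates are USD per hour.
GPUS = {
    # name: (answer_seconds, purchase_price_usd, runpod_usd_per_hour)
    "L4": (16.0, None, 0.43),
    "4090": (5.0, 2000.0, 0.69),
    "L40s": (6.5, 9800.0, 1.03),
    "H100 SXM": (2.4, 31000.0, 2.99),
}


def cost_per_answer(answer_seconds, usd_per_hour):
    """Rental cost of one answer, assuming billing only while answering."""
    return answer_seconds * usd_per_hour / 3600.0


def break_even_hours(purchase_usd, usd_per_hour):
    """Rental hours whose total cost equals the purchase price.

    Ignores power, hosting, and resale value.
    """
    return purchase_usd / usd_per_hour


for name, (seconds, price, rate) in GPUS.items():
    line = f"{name}: ${cost_per_answer(seconds, rate):.5f} per answer"
    if price is not None:
        line += f", break-even after {break_even_hours(price, rate):,.0f} rental hours"
    print(line)
```

Under these assumptions the faster GPUs are not automatically cheaper per answer, since the higher hourly rate offsets part of the speedup.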