Supported Self-Hosted LLM Models

All AWS prices shown below are for on-demand instances. AWS Savings Plans can reduce these prices by around 30%.

Each entry below lists, in order: Model*, VRAM, GPU examples for purchase, AWS instance and cost per hour**, AWS 24x7 monthly cost***, AWS 200 hours monthly cost***, features (ratings out of 10), and cloud API option.

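OpenAI GPT-OSS 20B
  VRAM: 16 GB
  GPU examples for purchase:
    Data center GPU: L4, 24 GB ($2600)
    Consumer GPU: RTX 3090/4090, 24 GB ($2400)
  AWS instance:
    N. Virginia / Stockholm: g4dn.xlarge, $0.52 per hour**, $400 for 24x7 monthly***, $100 for 200 hours monthly***
    Israel: g5.xlarge (1×A10G, 24 GB), $1.179 per hour**, $900 for 24x7 monthly***, $240 for 200 hours monthly***
  Features (ratings / 10):
    Answer Accuracy: 7
    Code Generation: 8
    Agent: -
    Translation: -
    Vision: No
  Cloud API option: Bedrock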

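OpenAI GPT-OSS 120B
  VRAM: 80 GB
  GPU examples for purchase:
    Data center GPU: 2x L40S, 96 GB total ($16,000)
    Consumer GPU: 4x RTX 4090, 96 GB total ($9600)
  AWS instance:
    N. Virginia: g6.12xlarge, $4.60 per hour**, $3500 for 24x7 monthly***, $900 for 200 hours monthly***
    Stockholm: g5.12xlarge, $6 per hour**, $4500 for 24x7 monthly***, $1200 for 200 hours monthly***
    Israel: g5.12xlarge, $6.64 per hour**, $5000 for 24x7 monthly***, $1300 for 200 hours monthly***
  Features (ratings / 10):
    Answer Accuracy: 8
    Code Generation: 9
    Agent: -
    Translation: -
    Vision: No
  Cloud API option: Bedrock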

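Gemma 3 27B
  VRAM: 80 GB
  GPU examples for purchase:
    Data center GPU: 2x L40S, 96 GB total ($16,000; 65,000 NIS in Israel)
    Consumer GPU: Information upon request
  AWS instance:
    N. Virginia: g6.12xlarge, $4.60 per hour**, $3500 for 24x7 monthly***, $900 for 200 hours monthly***
    Stockholm: g5.12xlarge, $6 per hour**, $4500 for 24x7 monthly***, $1200 for 200 hours monthly***
    Israel: g5.12xlarge, $6.64 per hour**, $5000 for 24x7 monthly***, $1300 for 200 hours monthly***
  Features (ratings / 10):
    Answer Accuracy: 7
    Code Generation: 8
    Agent: -
    Vision: Yes
    Translation: Yes
  Cloud API option: OpenRouter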

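Llama 4 Maverick
  VRAM: 640 GB
  GPU examples for purchase: 8× H100 80 GB ($200,000)
  AWS instance: p5.48xlarge, $55 per hour**, $41,000 for 24x7 monthly***, $11,000 for 200 hours monthly***
  Features (ratings / 10):
    Answer Accuracy: 8
    Code Generation: 9
    Agent: Yes
    Vision: Yes
    Translation: Yes
  Cloud API option: Bedrock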

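Llama 3.1 8B (Deprecated)
  VRAM: 24 GB
  GPU examples for purchase:
    Data center GPU: L4, 24 GB ($2600)
    Consumer GPU: RTX 4090, 24 GB ($2400)
  AWS instance:
    N. Virginia: g6.xlarge, $0.80 per hour**, $600 for 24x7 monthly***, $160 for 200 hours monthly***
  Features (ratings / 10):
    Answer Accuracy: 5
    Code Generation: N/A
    Agent: No
    Translation: No
    Vision: No
  Cloud API option: Bedrock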

* The models listed above are official releases only.

** Cost of the LLM instance only. Windows Dashboard and Linux Gateway instances are also required.

*** Prices are approximate.
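
The monthly figures above appear to follow from the hourly rates: roughly 730 hours for a 24x7 month and 200 hours for the part-time column, rounded. A minimal Python sketch of that arithmetic follows. The 730-hour month, the flat 30% Savings Plans multiplier, and the function name `monthly_cost` are assumptions made for illustration, not part of any AWS tooling.

```python
# Assumed average number of hours in a month (8760 hours per year / 12).
HOURS_24X7 = 730
# Part-time usage column from the table.
HOURS_PART_TIME = 200
# "Around 30%" per the note at the top of the page; real discounts vary.
SAVINGS_PLAN_DISCOUNT = 0.30


def monthly_cost(hourly_rate, hours, discount=0.0):
    """Estimate the monthly cost in USD of one on-demand instance.

    hourly_rate: on-demand price per hour in USD
    hours:       hours the instance runs per month
    discount:    fractional discount, e.g. 0.30 for a Savings Plan
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in the range [0, 1)")
    return hourly_rate * hours * (1.0 - discount)


if __name__ == "__main__":
    # Hourly rates copied from the table above.
    rates = {
        "g4dn.xlarge (N. Virginia / Stockholm)": 0.52,
        "g5.xlarge (Israel)": 1.179,
        "g6.12xlarge (N. Virginia)": 4.60,
        "p5.48xlarge": 55.0,
    }
    for name, rate in rates.items():
        full = monthly_cost(rate, HOURS_24X7)
        part = monthly_cost(rate, HOURS_PART_TIME)
        discounted = monthly_cost(rate, HOURS_24X7, SAVINGS_PLAN_DISCOUNT)
        print(f"{name}: 24x7 ${full:,.0f}, 200 h ${part:,.0f}, "
              f"24x7 with Savings Plan ${discounted:,.0f}")
```

For example, $0.52 per hour over 730 hours comes to about $380, which the table rounds to $400. Remember that these estimates cover the LLM instance only.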

 

Also tested:

GPT-OSS 20B: RTX 2000 Ada, RTX A4500 (the RTX A4000 did not pass testing)
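
To compare the purchase prices listed on this page with the AWS hourly rates, a rough break-even point can be estimated by dividing the GPU price by the hourly rate. The Python sketch below makes that simplification explicit. It is an assumption-laden estimate: it ignores power, hosting, the rest of the server, and the additional Dashboard and Gateway instances, and the function name `break_even_hours` is hypothetical.

```python
def break_even_hours(purchase_price, hourly_rate):
    """Return the number of cloud hours whose cost equals the purchase price.

    purchase_price: one-time GPU price in USD
    hourly_rate:    on-demand instance price per hour in USD
    """
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return purchase_price / hourly_rate


if __name__ == "__main__":
    # Figures taken from this page: L4 at $2600, g4dn.xlarge at $0.52 per hour.
    hours = break_even_hours(2600, 0.52)
    # Express the result in months at the two usage levels used in the table.
    print(f"Break-even after about {hours:,.0f} hours")
    print(f"  at 200 hours per month: about {hours / 200:.1f} months")
    print(f"  at 730 hours per month (24x7): about {hours / 730:.1f} months")
```

With the L4 and g4dn.xlarge figures, the break-even point is about 5000 hours of use, so it arrives much sooner for a 24x7 workload than for one that runs 200 hours a month.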