POC with LLM running on a CPU

Running Business GPT locally without a GPU is not recommended, but customers may wish to run a proof-of-concept (POC) installation using only CPUs.

Both embedding and chatbot usage are supported on CPU.

The LLM offered for use with CPU is a quantised version of Llama 3, available on Hugging Face:

bartowski/Meta-Llama-3-8B-Instruct-GGUF (main branch)

CPU vs GPU Comparison

| Action | CPU Hardware | CPU Time (mm:ss) | GPU Hardware | GPU Time (mm:ss) | Speed Factor |
|---|---|---|---|---|---|
| Public Data Question | 8 cores, 8 GB RAM | 02:00 | Nvidia L4, 24 GB VRAM | 00:08 | x15 |
| Questions based on sources (3000 tokens total) | 8 cores, 8 GB RAM | 09:02 | Nvidia L4, 24 GB VRAM | 00:12 | x45 |
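
The speed factors in the comparison above are the CPU time divided by the GPU time, rounded to a whole number. The short Python sketch below reproduces that arithmetic from the mm:ss timings. The helper names to_seconds and speed_factor are illustrative only and are not part of Business GPT.

```python
def to_seconds(mmss: str) -> int:
    """Convert an mm:ss string such as "09:02" to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speed_factor(cpu_time: str, gpu_time: str) -> float:
    """Return how many times longer the CPU run took than the GPU run."""
    return to_seconds(cpu_time) / to_seconds(gpu_time)


# Timings copied from the CPU vs GPU comparison (CPU time, GPU time).
timings = {
    "Public Data Question": ("02:00", "00:08"),
    "Questions based on sources": ("09:02", "00:12"),
}

for action, (cpu, gpu) in timings.items():
    # 120 s / 8 s = 15 and 542 s / 12 s is roughly 45.2
    print(f"{action}: x{speed_factor(cpu, gpu):.0f}")
```

The same helpers can be reused to estimate the expected slowdown for other workloads once a CPU and a GPU timing have been measured for them.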