POC with LLM running on a CPU

Although running Business GPT locally without a GPU is not recommended, customers may wish to run a proof-of-concept (POC) installation using only CPUs.

Both embedding and chat bot usage are supported on CPU.

The LLM offered for use with CPU is a quantised (GGUF Q5_K_S) build of Meta Llama 3 8B Instruct:

https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/tree/main?show_file_info=Meta-Llama-3-8B-Instruct-Q5_K_S.gguf
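Business GPT installs and serves this model itself; the snippet below is only a minimal sketch, outside the product, of how the same GGUF file can be loaded on a CPU-only host with the llama-cpp-python library. The local file name, thread count, context size and prompt are illustrative assumptions, not product configuration.

```python
from llama_cpp import Llama

# Assumed local file name of the GGUF file downloaded from the Hugging Face link above.
MODEL_PATH = "Meta-Llama-3-8B-Instruct-Q5_K_S.gguf"

# n_threads roughly matches the 8-core POC host in the comparison table below;
# n_ctx is an assumed context size for this sketch.
llm = Llama(
    model_path=MODEL_PATH,
    n_ctx=4096,
    n_threads=8,
)

# Run a single chat completion entirely on CPU.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is a proof of concept?"},
    ],
    max_tokens=256,
)

print(response["choices"][0]["message"]["content"])
```

Setting n_threads to the number of physical cores generally gives the best CPU throughput; oversubscribing threads tends not to help.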


CPU vs GPU Comparison

| Action | CPU Hardware | CPU Time (mm:ss) | GPU Hardware | GPU Time (mm:ss) | Speed Factor |
| --- | --- | --- | --- | --- | --- |
| Public Data Question | 8 cores, 8 GB RAM | 02:00 (2 mins) | Nvidia L4 24 GB VRAM | 00:08 | x15 |
| Questions based on sources (3000 tokens total) | 8 cores, 8 GB RAM | 09:02 (9 mins) | Nvidia L4 24 GB VRAM | 00:12 | x45 |
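The Speed Factor is the CPU time divided by the GPU time for the same action: roughly 120 s ÷ 8 s ≈ x15 for the public data question and 542 s ÷ 12 s ≈ x45 for the source-based questions.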