POC with LLM running on a CPU
Although running Business GPT locally without a GPU is not recommended, customers may wish to run a POC trial installation using only CPUs.
Both embedding and chatbot usage are supported on CPU.
The LLM offered for CPU use is a quantised version of Llama 3:
[bartowski/Meta-Llama-3-8B-Instruct-GGUF](https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF) on Hugging Face
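The Business GPT runtime itself is not shown here, but as an illustration, a quantised GGUF model such as the one above can be exercised on CPU with the open-source llama-cpp-python bindings. The model path and quantisation level below are placeholder assumptions; download the desired GGUF file from the repository first. This is a minimal sketch, not the product implementation.

```python
# Minimal CPU-only inference sketch using llama-cpp-python (assumed, not the
# Business GPT runtime). Install with: pip install llama-cpp-python
from llama_cpp import Llama

# Hypothetical local path to a quantised GGUF file downloaded from the repo above.
MODEL_PATH = "./models/Meta-Llama-3-8B-Instruct-Q4_K_M.gguf"

# n_gpu_layers=0 keeps every layer on the CPU; n_threads should roughly match
# the number of physical cores (8 in the comparison table below).
llm = Llama(
    model_path=MODEL_PATH,
    n_ctx=4096,
    n_threads=8,
    n_gpu_layers=0,
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```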
CPU vs GPU Comparison
| Action | CPU Hardware | CPU Time (mm:ss) | GPU Hardware | GPU Time (mm:ss) | Speed Factor |
| --- | --- | --- | --- | --- | --- |
| Public data question | 8 cores, 8 GB RAM | 02:00 | Nvidia L4, 24 GB VRAM | 00:08 | x15 |
| Question based on sources (3,000 tokens total) | 8 cores, 8 GB RAM | 09:02 | Nvidia L4, 24 GB VRAM | 00:12 | x45 |
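The speed factor is the CPU time divided by the GPU time (for example, 120 s / 8 s gives x15). To gather comparable wall-clock figures on your own hardware, a simple timing wrapper around a chat completion is enough; the sketch below reuses the `llm` object from the earlier example and is illustrative only, not a product benchmarking tool.

```python
# Hedged sketch for timing a single question on CPU; the function name is
# illustrative and not part of any product API.
import time

def timed_answer(llm, question: str) -> tuple[str, float]:
    """Run one chat completion and return the answer plus elapsed seconds."""
    start = time.perf_counter()
    result = llm.create_chat_completion(
        messages=[{"role": "user", "content": question}],
        max_tokens=256,
    )
    elapsed = time.perf_counter() - start
    return result["choices"][0]["message"]["content"], elapsed

answer, seconds = timed_answer(llm, "What is the capital of France?")
print(f"Answered in {seconds:.1f} s")
```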