...
Components overview
Dashboard / Ingestor Windows Server
Dashboard
.NET website for the end user and admin UI
The Dashboard manages synced content items, permissions, collections, chat interfaces, and settings at the user and site levels.
LLM
Customers can choose between OpenAI's LLM models, such as GPT-4o, and Meta's Llama 3.1. Llama 3.1 is royalty-free for commercial use in applications with fewer than 700 million monthly users.
Ingestor service
The Ingestor service has connectors to various data sources. It pulls the content and its permissions from each source and sends them to the Gateway embedding queue.
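As a rough illustration of the hand-off described above, the sketch below packages one content item together with its permissions and places it on a queue. The `ContentItem` fields, the `enqueue_item` helper, and the use of an in-process `queue.Queue` with JSON messages are all assumptions made for this example; the actual message format and queue technology are not specified here.

```python
import json
import queue
from dataclasses import dataclass, field, asdict


@dataclass
class ContentItem:
    """Hypothetical shape of one synced item (field names are assumptions)."""
    source: str
    item_id: str
    body: str
    allowed_principals: list[str] = field(default_factory=list)


def enqueue_item(embedding_queue: queue.Queue, item: ContentItem) -> None:
    """Serialize the item with its permissions and put it on the queue."""
    embedding_queue.put(json.dumps(asdict(item)))


# Stand-in for the Gateway embedding queue (an in-process queue for the sketch).
embedding_queue: queue.Queue = queue.Queue()
enqueue_item(
    embedding_queue,
    ContentItem("example-source", "doc-1", "Quarterly report", ["group:finance"]),
)
```

Keeping the permissions in the same message as the content means the downstream services never see an item without knowing who may read it.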
Firewall Service
This service classifies and analyses data at rest against policies.
Firewall API
This is a website exposing an API for real-time inspection and classification.
Gateway Linux Server / Containers
Content loader
...
The loader extracts the text from the content items and removes unnecessary parts, such as signatures and disclaimers in emails.
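The cleaning step can be pictured with a small heuristic sketch. It assumes that a signature starts at the conventional "-- " delimiter line and that a disclaimer starts with a phrase such as "This email ... confidential"; both patterns and the `clean_email_body` name are hypothetical, and a real loader would likely need more robust rules.

```python
import re

# Hypothetical markers for where removable trailing content begins.
SIGNATURE_DELIMITER = re.compile(r"^-- ?$", re.MULTILINE)
DISCLAIMER_START = re.compile(
    r"^(this e-?mail|this message) .*confidential",
    re.IGNORECASE | re.MULTILINE,
)


def clean_email_body(text: str) -> str:
    """Cut the text at the earliest signature or disclaimer marker."""
    cut = len(text)
    for pattern in (SIGNATURE_DELIMITER, DISCLAIMER_START):
        match = pattern.search(text)
        if match:
            cut = min(cut, match.start())
    return text[:cut].strip()
```

Text without any marker passes through unchanged apart from whitespace trimming.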
Embedding service
...
Gets data from the Loader, splits it into chunks, and transforms the chunked content into vectors.
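A minimal sketch of the split-then-embed flow follows. It assumes word-based chunks with a small overlap between neighbours; the chunk size, the overlap, and the `embed` callable (a stand-in for whichever embedding model is configured) are assumptions for illustration, not the service's actual parameters.

```python
from typing import Callable


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of up to chunk_size words, sharing overlap words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("chunk_size must be positive and overlap smaller than it")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed_chunks(
    chunks: list[str], embed: Callable[[str], list[float]]
) -> list[list[float]]:
    """Turn each chunk into a vector using the supplied embedding function."""
    return [embed(chunk) for chunk in chunks]
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one of the two neighbouring chunks.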
...
The embedding vectors are stored in a vector DB. We've elected to use Chroma DB / Postgres Vector DB.
Embedding AI model
For the private AI, BusinessGPT uses GTE Qwen2 (by Alibaba).
This model supports 29 languages, among them Hebrew:
https://ollama.com/library/qwen2:1.5b-instruct
https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct
According to the MTEB benchmarks (link), this model outperforms the one used by OpenAI (ChatGPT) and is more cost-effective in terms of the resources it needs.
Vector DB
The Vector DB stores the embeddings as vectors, together with the chunked text and its metadata.
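To make the stored record concrete, here is an in-memory sketch of one entry (vector, chunk text, metadata) and a cosine-similarity lookup over such entries. The `VectorRecord` fields and the `top_k` helper are hypothetical; a real vector DB provides its own schema and indexed search rather than the linear scan shown here.

```python
import math
from dataclasses import dataclass, field


@dataclass
class VectorRecord:
    """Hypothetical stored entry: embedding, chunk text, and metadata."""
    vector: list[float]
    text: str
    metadata: dict = field(default_factory=dict)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(records: list[VectorRecord], query_vector: list[float], k: int = 3) -> list[VectorRecord]:
    """Return the k records most similar to the query vector (linear scan)."""
    ranked = sorted(
        records,
        key=lambda r: cosine_similarity(r.vector, query_vector),
        reverse=True,
    )
    return ranked[:k]
```

Storing the chunk text and metadata next to the vector lets a search result be shown, and checked against permissions, without a second lookup.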
...
The API gets the query and decides the best algorithm to generate the answer.
Local LLM Linux Server / Container
Customers can choose between OpenAI's LLM models, such as GPT-4o, and Meta's Llama 3.1. Llama 3.1 is royalty-free for commercial use in applications with fewer than 700 million monthly users.
Token classification model
An AI model used for specific data classification tasks, such as detecting PII.
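The sketch below shows one way the output of such a model could be consumed, assuming the model returns labelled character spans. The `EntitySpan` type, the label names, and the `redact` helper are hypothetical and only illustrate the idea; the model call itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class EntitySpan:
    """Hypothetical model output: a labelled character range in the text."""
    start: int
    end: int
    label: str


def redact(text: str, spans: list[EntitySpan]) -> str:
    """Replace each labelled span with a [LABEL] placeholder."""
    parts = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s.start):
        if span.start < cursor:
            continue  # skip spans that overlap one already handled
        parts.append(text[cursor:span.start])
        parts.append(f"[{span.label}]")
        cursor = span.end
    parts.append(text[cursor:])
    return "".join(parts)
```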
Bastion Proxy
Network proxy supporting HTTP and WebSockets, needed for analysing the traffic of the AI services.
Firewall Service
This is a BL that is run by the BG Service to classify and analyse data at rest against policies.
Firewall API
This is a website exposing an API for real-time inspection and classification.