AI firewall user guide

Understanding what the AI Firewall is

The firewall allows the company to monitor and manage the risks of using Generative AI services, focusing on public AI services.

The BusinessGPT firewall allows companies to enforce AI policies by analyzing the use case, the users' objectives for using AI, and the sensitivity of the relevant data.

The firewall safeguards against misuse of AI that could pose a business risk or violate company policies. It also enforces AI guardrails to ensure the AI response is aligned with company ethics and the responsible use of AI.

Policies can be tailored to suit the specific needs and characteristics of individual users or groups, optimizing the advantages of AI utilization while mitigating associated risks.

BusinessGPT gives the company complete visibility and analytics of what the users are doing with GenAI, whether ChatGPT, Copilot, Gemini, or a custom AI model.

BusinessGPT also offers complete end-to-end Private/On-prem AI solutions for highly regulated companies, ensuring zero data exposure. You can read more about this solution here.

Accessing the BusinessGPT dashboard

To access the dashboard, follow this link - BGD10 Demo Dashboard

You can sign in with your Microsoft or Google account.

How do you get your prompts into the system?

The system relies on analyzing the interactions with the public AI service. For this reason, you need to configure your demo environment to get the interactions (prompts and responses) into the system.

You can demo the firewall using the BusinessGPT chatbot or your ChatGPT / Copilot in the browser. The easiest way is to use the browser extension.

There are several interfaces available for the system.

  1. Use the system's built-in Chatbot. The chatbot can interact with ChatGPT directly, or you can ask questions about the content you upload or sync to the system.

  2. Install a browser extension (Chrome / Edge).
    For Chrome, go to the Chrome web store here: https://chromewebstore.google.com/detail/businessgpt-ai-firewall/nmclpkidicoigfkjbphinajeekefeoel?hl=en-US&utm_source=ext_sidebar
    For Edge, go to the Microsoft Edge Add-ons page here: BusinessGPT AI Firewall - Microsoft Edge Addons

    Notes:
    a. Currently, the extension supports the Chrome and Edge browsers and the ChatGPT service.
    b. The next versions will support Copilot and Gemini.

  3. Configure a network proxy. Contact support for this option.

 

Try it out: Install the BusinessGPT AI Firewall extension.

  1. Open the Chrome web store https://chromewebstore.google.com/detail/businessgpt-ai-firewall/nmclpkidicoigfkjbphinajeekefeoel?hl=en-US&utm_source=ext_sidebar

  2. Add the extension

  3. Go to ChatGPT. Make sure you are signed in to OpenAI with the same email you use to log in to the dashboard at agatsoftware.ai/demo

  4. Once the extension is installed, you will see a link to the dashboard on the left and a disclaimer line in the footer.

  5. You can pin the extension so it stays visible, though its only UI is a toggle to enable or disable it.

Who can access the account Firewall settings?

The first user who signs in to the Dashboard from a specific domain automatically becomes the account admin. Any other users with the same domain who sign in become members of that account. You can change this by contacting our support.

The firewall components

All components are visible to the company admin under the Account Firewall section.

 

Dashboard - The product's web UI.

Data classification rules - Rules classifying the data included in the prompt/response and in relevant content (documents). Each rule has a sensitivity level.

Usage classification rules - Rules classifying the intention of using the GenAI services.

Firewall policies - Policies that define each scenario's risk and action (block/allow/monitor) based on groups/users/data classification rules/usage classification rules.

Firewall auditing - Report of all prompts and responses with the policy applied and the matching classification rules.

Firewall dashboard - A graphical management summary of all Firewall results.

Classification rules - overview

There are two types of classification rules in the system:

  1. Data classification rules - These rules cover data written in the prompt or response and files uploaded to the prompt as part of the interaction.

  2. Usage classification rules - These rules apply only to the prompt to analyze the user’s objectives/use case.

For each classification rule (data and usage), there are two types of rules: Regular expression or AI.

  1. Regular expression: The engine searches for the text defined in the value. You can also enter more advanced patterns using regular expressions. To match plain text, type it in the field without any other characters.

  2. AI: Define the rule in your own words, using natural language (NLP).
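To illustrate the difference between a plain-text value and a pattern in a regular-expression rule, here is a minimal Python sketch using the standard `re` module. The sample prompt, both rule values, and the `rule_matches` helper are invented for this example. The firewall's own engine may use a different regex dialect or different case handling.

```python
import re

# Hypothetical prompt text, invented for this example.
prompt = "Please summarize the contract for card 4111 1111 1111 1111."

# Plain-text rule value: the word is typed as-is, with no special characters.
plain_rule = "contract"

# Pattern rule value: four groups of four digits, optionally separated by a
# space or hyphen (a simplified card-number shape, not a full validator).
pattern_rule = r"\b(?:\d{4}[ -]?){3}\d{4}\b"


def rule_matches(rule_value: str, text: str) -> bool:
    """Return True if the rule value is found anywhere in the text."""
    return re.search(rule_value, text) is not None


plain_hit = rule_matches(plain_rule, prompt)       # literal word found
pattern_hit = rule_matches(pattern_rule, prompt)   # card-like number found
no_hit = rule_matches(pattern_rule, "No numbers here.")
```

Testing a pattern against a few sample prompts like this before saving it as a rule can help catch patterns that are too broad or too narrow.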

Configure Data classification rules

The data classification rules classify content, including the prompt, response, and data involved.

The data can be a file added to the prompt or any data type synced into the system.

Each rule has an assigned sensitivity level.

When content matches a rule with a critical sensitivity level, the firewall blocks the interaction regardless of any policy.
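The precedence of critical rules over policies can be pictured with the short Python sketch below. It is only an illustration of the behavior described above. The function name, the level name "Low", and the action strings are assumptions made for the example and do not reflect the product's internal code.

```python
from typing import Iterable


def final_action(matched_sensitivities: Iterable[str], policy_action: str) -> str:
    """Return "Block" when any matched rule is critical, else the policy's action.

    matched_sensitivities: sensitivity levels of the data classification rules
    that matched the content (level names here are illustrative).
    policy_action: the action of the matching policy, e.g. "Allow" or "Flag".
    """
    if any(level.lower() == "critical" for level in matched_sensitivities):
        return "Block"  # critical content is blocked regardless of any policy
    return policy_action


critical_case = final_action(["Low", "Critical"], "Allow")
normal_case = final_action(["Low"], "Flag")
```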

Every new account includes default rules, and you can add more manually.

Default rules are marked with a green background and cannot be deleted. If you don’t want to use the default rules, disable them from the edit rule page.

 

Try it yourself:

  1. Go to the Account AI Firewall > Data Classification page.

  2. Click the Add button to define a new rule.

  3. For AI data classification rules, define the terms of classification in natural language.

 

Data classification language

Each rule has a language setting. By default, it is “Any”, which means the rule is checked regardless of the document language.

To achieve better results with non-English documents, write the rules in the local language and then set the rule language to the relevant value.

When an account contains non-English rules, the system checks the language of each document and only applies rules that are set to “Any” or to the document's language.
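The language filter described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `Rule` class, its field names, the rule names, and the language labels are all hypothetical, and the way the product detects a document's language is not shown.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    language: str  # "Any" or a specific language label (labels are illustrative)


def rules_for_document(rules, document_language):
    """Keep only rules set to "Any" or to the document's detected language."""
    return [r for r in rules if r.language in ("Any", document_language)]


rules = [
    Rule("Credit card numbers", "Any"),
    Rule("Confidential contracts (German wording)", "German"),
    Rule("Confidential contracts (French wording)", "French"),
]

# For a German document, the French rule is skipped.
selected = [r.name for r in rules_for_document(rules, "German")]
```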

 

 

Configure your usage classification rules

Usage classification rules identify the objectives and intentions of using the AI service to manage the risks.

These rules only apply to the prompts and not the response or the attached files.

Usage classification rules are typically defined using Natural Language (AI) but can also be defined using regular expressions to find words.

 

Firewall Auditing

 

 

Firewall auditing is available at the user or account level.

 

Auditing parts

Each prompt creates a record in the auditing. When a chat contains several prompts, it is presented as several lines with the same chat name.
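If you export or copy audit lines for review, the one-record-per-prompt layout means you can regroup them by chat name. The Python sketch below shows the idea. The record structure and the field names `chat_name` and `prompt` are assumptions for illustration, not the product's export format.

```python
from collections import defaultdict

# Hypothetical audit records: one per prompt (field names are assumptions).
records = [
    {"chat_name": "Q3 planning", "prompt": "Draft an agenda"},
    {"chat_name": "Q3 planning", "prompt": "Shorten it"},
    {"chat_name": "Legal review", "prompt": "Summarize this NDA"},
]

# Group the prompts of each chat back together, preserving their order.
prompts_by_chat = defaultdict(list)
for record in records:
    prompts_by_chat[record["chat_name"]].append(record["prompt"])
```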

Each auditing record has the following parts:

 

At the very top, there is the policy that was applied and an icon indicating its risk.

In the second section, there are details of the prompt, response, and data sources. Data sources are either files attached to the prompt when using public AI services or the data the AI has based its response on when using the BusinessGPT chatbot with company data.

The prompt contains the following parts:
Data classification: The data classification of the prompt text
Usage classification: The usage classification rules that matched the prompt
Specific usage: A summary of the usage regardless of any classifications

 

Below that is the response data classification, followed by the data source classifications.

 

In the last section, there are technical details, including the Firewall type. For example, BusinessGPT API indicates that the interaction came from the BusinessGPT chatbot.

The Prompt ID is used for troubleshooting.

Firewall actions

The action can have the following values:

Allowed - The activity was allowed after validating the policies.

Inspecting - The activity was allowed, but the policies have not yet been validated. This action will change shortly and be set to Allowed, or to Flagged if it matches a policy set to Flag.

Flagged - The activity was allowed but found to match a policy set to “Flag.” This means the activity should be reviewed but is not necessarily risky enough to require blocking.

Blocked - The activity was blocked in real-time.
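The transition out of the Inspecting state can be summarized with the small Python sketch below. It only mirrors the description above (Inspecting becomes Flagged if a matching policy is set to Flag, otherwise Allowed). The function name and the action strings are assumptions for illustration.

```python
def resolve_inspecting(matched_policy_actions):
    """Final action for an interaction that was allowed while still under inspection.

    matched_policy_actions: actions of the policies that matched once validation
    finished, e.g. ["Allow", "Flag"] (strings are illustrative).
    """
    if "Flag" in matched_policy_actions:
        return "Flagged"
    return "Allowed"


flagged_result = resolve_inspecting(["Allow", "Flag"])
allowed_result = resolve_inspecting([])
```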

 

Firewall risks

The risks are the results of the matching policies. Each risk is represented by a filled circle.

Empty: When no risk is displayed, the policy validation has not yet been done.
None: The policy validation was completed, but no matching policies were found.
High: The activity matched a high-risk policy
Medium: The activity matched a medium-risk policy
Low (Yellow): The activity matched a low-risk policy

 

Firewall policy Engine

This field can have the following values:

Firewall Service - The BG Service analyzed the interaction at rest, after the interaction was completed.

Real-time Firewall - The interaction was analyzed in real-time.

 

Firewall Type

The type of component that proxies the data into the dashboard.

It can have the following values:

  • ChromeExtention

  • Proxy

  • BusinessGPTApi (Dashboard)

  • FileLoader

  • CopilotManualUpload

  • CopilotAPI

  • FirewallAPI (For external integration)

 

Chat AI service

  • ChatGPT

  • Gemini

  • BusinessGPTDashboard

  • MicrosoftCopilot

  • MicrosoftWordCopilot

  • MicrosoftTeamsCopilot

  • MicrosoftBrowserCopilot

  • MicrosoftExcelCopilot

  • MicrosoftPowerPointCopilot

  • MicrosoftOutlookCopilot

  • Cohere

  • Claude

  • SlackBot

ChatContext

The context of the chat, meaning where in the application the AI was used. For example, the name of the Slack channel or the name of the Word document the AI was used from.

Sensitivity Levels

Sensitivity levels are a result of data classification rules.

If data classification is not completed, the system displays a message.

Light blue means no matching data classification rules were found. Under the rules, the system writes:

“The content did not match any data classification rules.”

For other classifications, it shows the risk as a yellow, orange, or red ring.

Configure Firewall policies

The firewall policies define the risk and the firewall action (Block/Allow/Flag) to be applied for every use of AI.
The policy has two parts:
1. Details
2. Conditions

On the policy page, you set the criteria for policy matching using data and usage classification rules. You can select existing rules or create new ones directly on the policy page. You also define the specific users or groups the policy applies to.

The policy also sets the risk for each scenario and the firewall's response: block, allow, or flag.

Firewall Details

In this part, you set the policy's name and description and whether it is enabled.

Risk:
Set the risk of this policy as you see it in your company. Note that the risk can change based on the user performing the activity. For example, a member of the legal group poses less risk than any other user when asking for legal advice.

Action:
Block - blocks the prompt or response

Allow - allows the interaction

Flag - allows the interaction but flags it in the auditing for further inspection.

Firewall Conditions:

Start by configuring the identity to which the policy applies. It can be a group or a user.
Groups can be created in the system and/or synced from your Azure AD.
Contact support to learn how to sync your Azure AD groups and users.

Below this, you choose or create the data and usage classification rules. When you create a rule from this page, it is added to your classification rules list.
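To make the relationship between identity, classification rules, risk, and action more concrete, here is a simplified Python sketch of policy matching. It is not the product's engine. The `Policy` class, the `applies` and `first_match` helpers, the group and rule names, and the risk values are all hypothetical. The sketch also assumes that all listed conditions must hold together, that an empty condition matches everything, and that the first matching policy wins. The guide does not specify these details, so check the actual behavior in your account.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    name: str
    risk: str     # e.g. "Low", "Medium", "High"
    action: str   # "Block", "Allow", or "Flag"
    groups: set = field(default_factory=set)       # empty = any user (assumption)
    data_rules: set = field(default_factory=set)   # empty = any data (assumption)
    usage_rules: set = field(default_factory=set)  # empty = any usage (assumption)


def applies(policy, user_groups, data_hits, usage_hits):
    """True when identity, data, and usage conditions all hold (assumed AND logic)."""
    identity_ok = not policy.groups or bool(policy.groups & user_groups)
    data_ok = not policy.data_rules or bool(policy.data_rules & data_hits)
    usage_ok = not policy.usage_rules or bool(policy.usage_rules & usage_hits)
    return identity_ok and data_ok and usage_ok


def first_match(policies, user_groups, data_hits, usage_hits):
    """Return the first policy that applies, or None (ordering is an assumption)."""
    for policy in policies:
        if applies(policy, user_groups, data_hits, usage_hits):
            return policy
    return None


# The same usage carries a different risk depending on who the user is.
policies = [
    Policy("Legal advice (legal team)", "Low", "Allow",
           groups={"Legal"}, usage_rules={"Legal advice"}),
    Policy("Legal advice (other users)", "High", "Flag",
           usage_rules={"Legal advice"}),
]

legal_user = first_match(policies, {"Legal"}, set(), {"Legal advice"})
other_user = first_match(policies, {"Sales"}, set(), {"Legal advice"})
```

In this sketch, a member of the Legal group asking for legal advice matches the low-risk policy, while any other user matches the high-risk policy and is flagged.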

 

 

Firewall dashboard

See here: AI Firewall Dashboard

Data classification report

See here: Data Classification Report

Usage management

See here: Usage management