Smart search

BusinessGPT offers two types of similarity search:

AWS similarity search, invoked through Amazon Bedrock on data ingested in AWS as a knowledge base
BusinessGPT similarity search, invoked on data ingested in BusinessGPT

BusinessGPT supports searching your content using semantic search.

This allows retrieval of documents containing content similar to the search term.

Search modes available

Exact - Identical match to your search term

Keywords for flexible matching

Stemming: PostgreSQL reduces words to their root forms, enabling matches across different word variations. For example, searching for "run" would also match "runs" and "running". Irregular forms such as "ran" are generally not reduced to "run" by PostgreSQL's default English stemmer.
Tokenization: The database breaks down text into individual words or tokens, allowing for more granular searching.
Stop word removal: Common words like "the" or "an" are automatically ignored to improve search relevance.
Partial word matching: Using wildcards, users can search for word fragments, increasing search flexibility.
Normalization: Words are converted to standardized forms, allowing for matches across different spellings or variations.
Boolean operators: Searches can be refined using AND, OR, and NOT operators to create more complex queries.
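
The keyword features above can be illustrated with a small, self-contained Python sketch. It is not how PostgreSQL or BusinessGPT implements them: the stop-word list and the `naive_stem`, `to_terms`, and `matches` helpers are hypothetical simplifications, used only to show how tokenization, stop word removal, stemming, and AND matching fit together.

```python
import re

# Tiny illustrative stop-word list (assumption); real lists are much longer.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "is", "and", "or"}


def naive_stem(word: str) -> str:
    """Strip a few common English suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Collapse a doubled final consonant, e.g. "runn" -> "run".
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeiouls":
                stem = stem[:-1]
            return stem
    return word


def to_terms(text: str) -> set[str]:
    """Tokenize, lowercase, drop stop words, and stem what remains."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {naive_stem(t) for t in tokens if t not in STOP_WORDS}


def matches(query: str, document: str) -> bool:
    """AND semantics: every query term must appear in the document."""
    return to_terms(query) <= to_terms(document)
```

With these helpers, `matches("run", "She is running the report")` is true because "running" and "run" reduce to the same term, while `matches("invoice", "Running a report")` is false.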

Semantic for context-based search using AI

Smart to blend both of the above methods for optimal results.
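
How Smart mode combines results is not specified here. One common way to blend two ranked result lists is reciprocal rank fusion, sketched below in Python. This is an assumption for illustration, not the BusinessGPT implementation; it also assumes the two blended methods are Keywords and Semantic. The function name, the `smoothing` constant, and the document IDs are hypothetical.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], smoothing: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (smoothing + rank) from every list it appears in,
    so documents ranked highly by more than one method rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (smoothing + rank)
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical keyword ranking
semantic_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical semantic ranking
blended = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Here `doc_b` and `doc_a` appear in both lists, so they rank ahead of `doc_d` and `doc_c`, which each appear in only one.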

Search configurations

Number of items to return

The maximum number of search results that can be downloaded in a CSV file.

Number of items to display

How many items are displayed in the table view shown in the dashboard search results.

Note that when you return to a search, it always shows the top 10 sample records.

K Parameter

In semantic search, the value of K typically refers to the number of top results (or "top-K" results) returned from a similarity search, such as finding the nearest neighbors of a query in a vector space. The ideal value for K depends on various factors like the size of the dataset, the use case, and the model's accuracy. However, here are some common ranges for K in different scenarios:

Small datasets or highly relevant results:
- K = 1 to 10
- This is useful when you need highly precise results, often used in question-answering systems or search engines where only the most relevant answers are needed.
Moderate-sized datasets or moderate recall needs:
- K = 10 to 50
- For broader search tasks, where a user might need to consider a few options before finding a relevant match (e.g., document retrieval, product recommendations).
Large datasets or exploratory search:
- K = 50 to 200 or more
- In exploratory searches, users may want to browse a wide array of results. This is more common in research or creative search tasks.

In practice, K is often tuned based on performance metrics (e.g., precision, recall, F1-score) specific to your domain, as a smaller K yields fewer but highly relevant results, while a larger K offers more diversity at the potential cost of relevance.
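
To make the role of K concrete, here is a minimal Python sketch of top-K retrieval by cosine similarity over a small in-memory set of vectors. It is an illustration under stated assumptions, not the BusinessGPT or Amazon Bedrock implementation: the `cosine_similarity` and `top_k` functions, the document IDs, and the two-dimensional vectors are all hypothetical.

```python
import heapq
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Return up to k (doc_id, score) pairs, most similar first."""
    scored = ((cosine_similarity(query, vec), doc_id) for doc_id, vec in index.items())
    best = heapq.nlargest(k, scored)
    return [(doc_id, score) for score, doc_id in best]


# Hypothetical embeddings; real ones have hundreds of dimensions.
index = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0], "doc_c": [1.0, 1.0]}
results = top_k([1.0, 0.0], index, k=2)
```

With K = 2 the call returns `doc_a` and `doc_c` and leaves out `doc_b`, the least similar vector. Raising K adds lower-scoring documents to the result, which is the trade-off between breadth and relevance described above.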

Defaults

Defaults are configured in your user settings, but some of them can be changed per search, as shown in the table below.

Parameter	Default in user settings	Change per search
Max number of items to return	Yes	Yes
Max number of items to display	Yes	No
K parameter	Yes	Yes
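
The relationship between user-level defaults and per-search changes in the table above can be sketched in Python as follows. This is only an illustration: the `SearchSettings` class, its field names, the `resolve_settings` function, and the numeric values are hypothetical and do not reflect actual BusinessGPT settings or defaults.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SearchSettings:
    max_items_to_return: int
    max_items_to_display: int
    k: int


# Per the table, only these two parameters can be changed per search.
PER_SEARCH_OVERRIDABLE = {"max_items_to_return", "k"}


def resolve_settings(defaults: SearchSettings, overrides: dict[str, int]) -> SearchSettings:
    """Apply per-search overrides on top of the user's defaults."""
    not_allowed = set(overrides) - PER_SEARCH_OVERRIDABLE
    if not_allowed:
        raise ValueError(f"Cannot be changed per search: {sorted(not_allowed)}")
    return replace(defaults, **overrides)


# Hypothetical default values, for illustration only.
user_defaults = SearchSettings(max_items_to_return=100, max_items_to_display=10, k=20)
this_search = resolve_settings(user_defaults, {"k": 5})
```

In this sketch `this_search` keeps the user's default for the number of items to return and display and uses K = 5 for this search only. Passing `max_items_to_display` as an override raises an error, mirroring the table.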