Smart search
BusinessGPT offers two types of similarity search:
Invoking AWS similarity search using Amazon Bedrock on data ingested in AWS as a knowledge base
Invoking BusinessGPT similarity search on data ingested in BusinessGPT
BusinessGPT supports searching your content using semantic search.
This allows retrieval of documents containing content similar to the search term.
Search modes available
Exact - Identical match to your search term
Keywords for flexible matching
Stemming: PostgreSQL reduces words to their root forms, enabling matches across different word variations. For example, searching for "run" also matches "runs" and "running"; irregular forms such as "ran" match only if the configured dictionary maps them to the same root.
Tokenization: The database breaks down text into individual words or tokens, allowing for more granular searching.
Stop word removal: Common words like "the" or "an" are automatically ignored to improve search relevance.
Partial word matching: Using wildcards, users can search for word fragments, increasing search flexibility.
Normalization: Words are converted to standardized forms, allowing for matches across different spellings or variations.
Boolean operators: Searches can be refined using AND, OR, and NOT operators to create more complex queries.
Semantic for context-based search using AI
Smart to blend the keyword and semantic methods for optimal results.
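The keyword features listed above (tokenization, stop word removal, stemming, and Boolean operators) can be illustrated with a small Python sketch. This is a toy model for intuition only: it does not call PostgreSQL and does not reproduce its stemmer or stop word list. The names `STOP_WORDS`, `stem`, `to_terms`, and `matches` are hypothetical and are not part of BusinessGPT.

```python
import re

# Illustrative subset only; PostgreSQL ships its own stop word lists.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def stem(word):
    """Toy suffix stripping; a stand-in for a real stemmer, not PostgreSQL's algorithm."""
    for suffix in ("ning", "ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)]
    return word

def to_terms(text):
    """Tokenize, drop stop words, and stem what remains."""
    return {stem(token) for token in tokenize(text) if token not in STOP_WORDS}

def matches(document, all_of=(), any_of=(), none_of=()):
    """Boolean matching: all_of acts as AND, any_of as OR, none_of as NOT."""
    terms = to_terms(document)

    def normalize(words):
        return {stem(word.lower()) for word in words}

    if not normalize(all_of) <= terms:
        return False
    if any_of and not (normalize(any_of) & terms):
        return False
    if normalize(none_of) & terms:
        return False
    return True

doc = "The team is running quarterly reports"
print(to_terms(doc))
print(matches(doc, all_of=["run", "report"]))        # AND across word variations
print(matches(doc, all_of=["runs"], none_of=["team"]))  # NOT excludes the document
```

In this sketch a search for "run" matches a document containing "running" because both reduce to the same root before comparison, which is the effect stemming has in keyword mode.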
Search configurations
Number of items to return
The number of search results that can be downloaded in a CSV file.
Number of items to display
How many items are displayed in the table view shown in the dashboard search results.
Note that returning to a search always shows the top 10 sample records.
K Parameter
In semantic search, the value of K typically refers to the number of top results (or "top-K" results) returned from a similarity search, such as finding the nearest neighbors of a query in a vector space. The ideal value for K depends on various factors like the size of the dataset, the use case, and the model's accuracy. However, here are some common ranges for K in different scenarios:
Small datasets or highly relevant results:
K = 1 to 10
This is useful when you need highly precise results, often used in question-answering systems or search engines where only the most relevant answers are needed.
Moderate-sized datasets or moderate recall needs:
K = 10 to 50
This suits broader search tasks, where a user might need to consider a few options before finding a relevant match (e.g., document retrieval, product recommendations).
Large datasets or exploratory search:
K = 50 to 200 or more
In exploratory searches, users may want to browse a wide array of results. This is more common in research or creative search tasks.
In practice, K is often tuned based on performance metrics (e.g., precision, recall, F1-score) specific to your domain, as a smaller K yields fewer but highly relevant results, while a larger K offers more diversity at the potential cost of relevance.
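To make the role of K concrete, here is a minimal Python sketch of a top-K similarity search with a precision and recall check. It assumes cosine similarity over small hand-written vectors; the document IDs, vectors, and function names are hypothetical, and BusinessGPT's actual embedding model and ranking are not shown here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, doc_vecs, k):
    """Return the k document IDs most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Precision and recall over the first k ranked results."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k, hits / len(relevant_ids)

# Hypothetical embeddings; real ones have hundreds of dimensions.
docs = {
    "invoice_policy": [0.9, 0.1, 0.0],
    "travel_policy": [0.7, 0.3, 0.1],
    "lunch_menu": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]
relevant = {"invoice_policy", "travel_policy"}  # assumed ground truth

ranking = top_k(query, docs, k=3)
for k in (1, 3):
    precision, recall = precision_recall_at_k(ranking, relevant, k)
    print(f"K={k}: precision={precision:.2f}, recall={recall:.2f}")
```

With these toy values, K=1 returns only the best match (full precision, half the relevant documents), while K=3 returns every relevant document along with one irrelevant one, which is the trade-off described above.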
Defaults
Defaults are configured in your user settings, but some of them can be changed per search, as shown in the table below.
Parameter | Default in user settings | Change per search |
---|---|---|
Max number of items to return | V | V |
Max number of items to display | V | |
K parameter | V | V |
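The relationship between user-level defaults and per-search overrides in the table can be sketched as follows. This is an illustration only: the parameter keys, the default values, and the `resolve_search_params` function are assumptions made for the example and do not reflect BusinessGPT's actual settings API.

```python
# Hypothetical defaults, as they might be stored in user settings.
USER_DEFAULTS = {
    "max_items_to_return": 100,
    "max_items_to_display": 10,
    "k": 20,
}

# Per the table, only these parameters can be changed for a single search.
PER_SEARCH_OVERRIDABLE = {"max_items_to_return", "k"}

def resolve_search_params(user_defaults, overrides):
    """Merge per-search overrides onto user defaults, rejecting disallowed keys."""
    rejected = set(overrides) - PER_SEARCH_OVERRIDABLE
    if rejected:
        raise ValueError(f"Cannot be changed per search: {sorted(rejected)}")
    return {**user_defaults, **overrides}

params = resolve_search_params(USER_DEFAULTS, {"k": 5})
print(params)
```

A search that supplies only `k` keeps the other two values from the user settings, while an attempt to override the display limit for a single search is rejected.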