Pricing / Speed
Currently paying a minimum of 5.88 USD/day + VAT = 213 USD/month. In process; put in wiki.
Hope this email finds you well! Thank you for your patience while we work towards a solution regarding your query. I am writing to share an update I received from the Service Team. As per the Service Team, this is a difficult question to answer precisely, since various customer-specific factors can affect both the time and the cost of running an ingestion job. At a high level, however, you can use the following details.
Throughput/Time (impacting the time of an ingestion job):
>> KB ingestion throughput depends on the number of tokens we process. We process customer content at approximately the same rate as the embedding model throughput [1], i.e. around 300K tokens per minute for the Titan V2 embeddings model. You can use this figure to calculate the total time to process your content, since you know your corpus best.
NOTE: This throughput limit can be increased with Provisioned Throughput [2].
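The time estimate above is simple arithmetic; a minimal sketch, assuming the ~300K tokens/minute Titan V2 figure cited above and an illustrative corpus size:

```python
# Back-of-the-envelope KB ingestion time estimate.
# TOKENS_PER_MINUTE is the approximate Titan V2 embedding throughput
# quoted by the service team; treat it as a rough planning number.
TOKENS_PER_MINUTE = 300_000

def ingestion_minutes(total_tokens: int,
                      tokens_per_minute: int = TOKENS_PER_MINUTE) -> float:
    """Estimate minutes to ingest a corpus of total_tokens."""
    return total_tokens / tokens_per_minute

# Example (hypothetical corpus): 10,000 documents averaging 3,000 tokens each
total_tokens = 10_000 * 3_000  # 30M tokens
print(ingestion_minutes(total_tokens))  # 100.0 minutes, i.e. ~1.7 hours
```

With Provisioned Throughput [2] you would simply substitute the higher tokens-per-minute figure you have purchased.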
Cost:
>> The KB service itself does not charge customers anything extra. The cost comes from the downstream services that KB calls on your behalf. For example, the cost will differ depending on whether you select OSS or RDS as your vector store, and similarly it may differ depending on the embedding model you choose. After reaching out to the Product Team for a detailed calculation, the team has requested a few additional details from you.
The Product Team has requested the following details to produce a cost estimate:
Region, e.g. us-east-1
Embedding model, e.g. Titan Text Embeddings V2
Vector store, e.g. OpenSearch Serverless
Generation model, e.g. Claude 3 Haiku
For embedding cost:
How many documents need to be processed?
What is the average number of tokens per document?
What is the average document size (in MB)?
Incremental update (monthly), e.g. 10%
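Once you have these numbers, the embedding cost follows directly. A minimal sketch, assuming a placeholder per-1K-token price (substitute the current Bedrock price for your chosen embedding model and region; the value below is illustrative, not a real quote):

```python
# Hypothetical embedding-cost estimate from the inputs requested above.
# PRICE_PER_1K_TOKENS is a PLACEHOLDER, not actual Bedrock pricing.
PRICE_PER_1K_TOKENS = 0.00002  # USD per 1K tokens (placeholder)

def embedding_cost(num_docs: int,
                   avg_tokens_per_doc: int,
                   monthly_update_pct: float = 0.10,
                   price_per_1k: float = PRICE_PER_1K_TOKENS):
    """Return (initial_cost, monthly_update_cost) in USD."""
    total_tokens = num_docs * avg_tokens_per_doc
    initial = total_tokens / 1_000 * price_per_1k
    monthly = initial * monthly_update_pct       # re-embedding the changed 10%
    return initial, monthly

# Example (hypothetical): 10,000 docs, 3,000 tokens each, 10% monthly churn
initial, monthly = embedding_cost(10_000, 3_000)
```

Vector-store charges (OSS/RDS capacity and storage) are billed separately and are not covered by this sketch.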
For generation cost:
How many questions per minute?
What is the expected duration in hours?
Average input tokens per query
Average input tokens in retrieved context per request
Do you want chat history included in the request? If yes, average input tokens per history
Average output tokens per request
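The generation-cost inputs combine the same way. A minimal sketch, again with placeholder per-1K-token prices (substitute current Bedrock on-demand pricing for your generation model and region):

```python
# Hypothetical generation-cost estimate from the inputs requested above.
# Both prices are PLACEHOLDERS, not actual Bedrock pricing.
IN_PRICE_PER_1K = 0.00025   # USD per 1K input tokens (placeholder)
OUT_PRICE_PER_1K = 0.00125  # USD per 1K output tokens (placeholder)

def generation_cost(questions_per_minute: float,
                    duration_hours: float,
                    in_tokens_per_query: int,
                    context_tokens_per_request: int,
                    history_tokens_per_request: int,
                    out_tokens_per_request: int,
                    in_price: float = IN_PRICE_PER_1K,
                    out_price: float = OUT_PRICE_PER_1K) -> float:
    """Total USD cost for the generation model over the stated duration."""
    requests = questions_per_minute * 60 * duration_hours
    input_tokens = requests * (in_tokens_per_query
                               + context_tokens_per_request
                               + history_tokens_per_request)
    output_tokens = requests * out_tokens_per_request
    return (input_tokens / 1_000 * in_price
            + output_tokens / 1_000 * out_price)

# Example (hypothetical): 10 questions/min for 1 hour, 100-token queries,
# 2,000 tokens of retrieved context, no chat history, 300-token answers
cost = generation_cost(10, 1, 100, 2000, 0, 300)
```

Set `history_tokens_per_request` to 0 if chat history is not included.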
Please be assured that we are making every sincere effort to address your request and avoid any unnecessary delays.
Please write back with the above information so that we can estimate the cost. Feel free to let us know if you have any questions or doubts; we will be glad to assist you.
Have a wonderful day!!
[1] https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html
[2] https://docs.aws.amazon.com/bedrock/latest/userguide/prov-throughput.html
We value your feedback. Please share your experience by rating this and other correspondences in the AWS Support Center. You can rate a correspondence by selecting the stars in the top right corner of the correspondence.
Best regards,
Manisha B.
Amazon Web Services