Knowledge Bases for Amazon Bedrock now supports custom prompts for the RetrieveAndGenerate API and configuration of the maximum number of retrieved results

With Knowledge Bases for Amazon Bedrock, you can securely connect foundation models (FMs) in Amazon Bedrock to your company data for Retrieval Augmented Generation (RAG). Access to additional data helps the model generate more relevant, context-specific, and accurate responses without retraining the FMs.

In this post, we discuss two new features of Knowledge Bases for Amazon Bedrock specific to the RetrieveAndGenerate API: configuring the maximum number of results and creating custom prompts with a knowledge base prompt template. You can now choose these as query options alongside the search type.

Overview and benefits of new features

The maximum number of results option gives you control over the number of search results to be retrieved from the vector store and passed to the FM for generating the answer. This allows you to customize the amount of background information provided for generation, giving more context for complex questions or less for simpler questions. You can fetch up to 100 results. This option helps improve the likelihood that relevant context is retrieved, which improves accuracy and reduces hallucination in the generated response.
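As a minimal sketch of how this limit might be enforced in client code before calling the SDK, the helper below builds the vectorSearchConfiguration dictionary used later in this post. The function name, the lower bound of 1, and the idea of validating on the client side are assumptions made for illustration; only the upper bound of 100 comes from this post.

```python
from typing import Optional

MAX_RESULTS_LIMIT = 100  # upper bound stated in this post
VALID_SEARCH_TYPES = {"HYBRID", "SEMANTIC"}  # search types discussed in this post


def build_vector_search_config(number_of_results: int,
                               search_type: Optional[str] = None) -> dict:
    """Return a vectorSearchConfiguration dict after basic range checks.

    Hypothetical helper: the lower bound of 1 is an assumption.
    """
    if not 1 <= number_of_results <= MAX_RESULTS_LIMIT:
        raise ValueError(
            f"numberOfResults must be between 1 and {MAX_RESULTS_LIMIT}, "
            f"got {number_of_results}"
        )
    config = {"numberOfResults": number_of_results}
    if search_type is not None:
        if search_type not in VALID_SEARCH_TYPES:
            raise ValueError(f"unsupported search type: {search_type}")
        config["overrideSearchType"] = search_type
    return config
```

Failing early on an out-of-range value keeps the error close to the code that chose the number, rather than surfacing it as a service-side validation error.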

The custom knowledge base prompt template allows you to replace the default prompt template with your own to customize the prompt that’s sent to the model for response generation. This allows you to customize the tone, output format, and behavior of the FM when it responds to a user’s question. With this option, you can fine-tune terminology to better match your industry or domain (such as healthcare or legal). Additionally, you can add custom instructions and examples tailored to your specific workflows.

In the following sections, we explain how you can use these features with either the AWS Management Console or SDK.

Prerequisites

To follow along with these examples, you need to have an existing knowledge base. For instructions to create one, see Create a knowledge base.

Configure the maximum number of results using the console

To use the maximum number of results option using the console, complete the following steps:

On the Amazon Bedrock console, choose Knowledge bases in the left navigation pane.
Select the knowledge base you created.
Choose Test knowledge base.
Choose the configuration icon.
Choose Sync data source before you start testing your knowledge base.
Under Configurations, for Search Type, select a search type based on your use case.

For this post, we use hybrid search because it combines semantic and text search to provide greater accuracy. To learn more about hybrid search, see Knowledge Bases for Amazon Bedrock now supports hybrid search.

Expand Maximum number of source chunks and set your maximum number of results.

To demonstrate the value of the new feature, we show examples of how you can increase the accuracy of the generated response. We used the Amazon 10-K document for 2023 as the source data for creating the knowledge base. We use the following query for experimentation: “In what year did Amazon’s annual revenue increase from $245B to $434B?”

The correct response for this query is “Amazon’s annual revenue increased from $245B in 2019 to $434B in 2022,” based on the documents in the knowledge base. We used Claude v2 as the FM to generate the final response based on the contextual information retrieved from the knowledge base. Claude 3 Sonnet and Claude 3 Haiku are also supported as the generation FMs.

We ran another query to compare retrieval with different configurations. We used the same input query (“In what year did Amazon’s annual revenue increase from $245B to $434B?”) and set the maximum number of results to 5.

As shown in the following screenshot, the generated response was “Sorry, I am unable to assist you with this request.”

Next, we set the maximum results to 12 and ask the same question. The generated response is “Amazon’s annual revenue increase from $245B in 2019 to $434B in 2022.”

As shown in this example, whether we retrieve the correct answer depends on the number of retrieved results. If you want to learn more about the source attribution that constitutes the final output, choose Show source details to validate the generated answer against the knowledge base.

Customize a knowledge base prompt template using the console

You can also replace the default prompt with your own prompt based on the use case. To do so on the console, complete the following steps:

Repeat the steps in the previous section to start testing your knowledge base.
Enable Generate responses.
Select the model of your choice for response generation.

We use the Claude v2 model as an example in this post. The Claude 3 Sonnet and Haiku models are also available for generation.

Choose Apply to proceed.

After you choose the model, a new section called Knowledge base prompt template appears under Configurations.

Choose Edit to start customizing the prompt.
Modify the prompt template to customize how you want to use the retrieved results and generate content.

For this post, we give a few examples for creating a “Financial Advisor AI system” using Amazon financial reports with custom prompts. For best practices on prompt engineering, refer to Prompt engineering guidelines.

We now customize the default prompt template in several different ways, and observe the responses.

Let’s first try a query with the default prompt. We ask “What was the Amazon’s revenue in 2019 and 2021?” The following shows our results.

From the output, we find that it’s generating a free-form response based on the retrieved knowledge. The citations are also listed for reference.

Let’s say we want to give additional instructions on how to format the generated response, like standardizing it as JSON. We can add these instructions as a separate step after retrieving the information, as part of the prompt template:

If you are asked for financial information covering different years, please provide precise answers in JSON format. Use the year as the key and the concise answer as the value. For example: {year:answer}

The final response has the required structure.
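Once the model is instructed to answer in JSON, downstream code can parse the reply instead of reading free-form text. The following is a minimal sketch, assuming the reply contains a single JSON object possibly surrounded by extra wording; the function name and the sample reply are made up for illustration and are not actual model output.

```python
import json


def extract_json_answer(response_text: str) -> dict:
    """Parse the text between the first '{' and the last '}' as JSON."""
    start = response_text.find("{")
    end = response_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in response text")
    return json.loads(response_text[start:end + 1])


# Made-up reply text, used only to show the call shape.
sample_reply = 'Here is the answer:\n{"2019": "placeholder A", "2021": "placeholder B"}'
answer_by_year = extract_json_answer(sample_reply)
```

Because a model may still wrap the object in extra sentences or return malformed JSON, callers would likely want to catch ValueError (which json.JSONDecodeError subclasses) and fall back to the raw text.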

By customizing the prompt, you can also change the language of the generated response. In the following example, we instruct the model to provide an answer in Spanish.

After removing $output_format_instructions$ from the default prompt, the citation is removed from the generated response.
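Because dropping a placeholder changes what the response contains, it can help to check a custom template before using it. The sketch below is a hypothetical helper, not part of the SDK; it only looks for the two placeholders named in this post and reports which ones are absent.

```python
from typing import List, Sequence

# Placeholders that appear in the default template shown in this post.
EXPECTED_PLACEHOLDERS = ("$search_results$", "$output_format_instructions$")


def missing_placeholders(template: str,
                         expected: Sequence[str] = EXPECTED_PLACEHOLDERS) -> List[str]:
    """Return the expected placeholders that do not occur in the template."""
    return [name for name in expected if name not in template]


# Example: a template that keeps the search results but drops the output
# format instructions, which per this post also drops the citations.
custom_template = "Answer using only these results:\n$search_results$\nAssistant:"
absent = missing_placeholders(custom_template)
```

An empty list means both placeholders are present; a non-empty list is a prompt to confirm the omission is intentional.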

In the following sections, we explain how you can use these features with the SDK.

Configure the maximum number of results using the SDK

To change the maximum number of results with the SDK, use the following syntax. For this example, the query is “In what year did Amazon’s annual revenue increase from $245B to $434B?” The correct response is “Amazon’s annual revenue increased from $245B in 2019 to $434B in 2022.”

import boto3

# Runtime client for Knowledge Bases for Amazon Bedrock
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime')

def retrieveAndGenerate(query, kbId, numberOfResults, model_id, region_id):
    # Build the ARN of the foundation model used for generation
    model_arn = f'arn:aws:bedrock:{region_id}::foundation-model/{model_id}'
    return bedrock_agent_runtime.retrieve_and_generate(
        input={
            'text': query
        },
        retrieveAndGenerateConfiguration={
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kbId,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': numberOfResults,
                        'overrideSearchType': "SEMANTIC",  # optional
                    }
                }
            },
            'type': 'KNOWLEDGE_BASE'
        },
    )

# numberOfResults, model_id, and region_id must be set before this call
response = retrieveAndGenerate("In what year did Amazon's annual revenue increase from $245B to $434B?",
    "<knowledge base id>", numberOfResults, model_id, region_id)['output']['text']

The ‘numberOfResults’ option under ‘retrievalConfiguration’ allows you to select the number of results you want to retrieve. The output of the RetrieveAndGenerate API includes the generated response, source attribution, and the retrieved text chunks.
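To make the three parts of the output concrete, the following sketch walks a response dictionary and collects the answer together with the cited chunks. The field names (citations, retrievedReferences, content, location, s3Location) reflect our understanding of the response shape and should be checked against the API reference; the helper name and the sample response are made up for illustration.

```python
def summarize_response(response: dict) -> dict:
    """Collect the answer text and the source chunks cited for it."""
    sources = []
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            sources.append({
                # Location of the source document (assumed to be in Amazon S3)
                "uri": ref.get("location", {}).get("s3Location", {}).get("uri"),
                # Retrieved text chunk that supports the answer
                "chunk": ref.get("content", {}).get("text", ""),
            })
    return {"answer": response["output"]["text"], "sources": sources}


# Hand-written stand-in for an API response; not real output.
sample_response = {
    "output": {"text": "made-up answer text"},
    "citations": [{
        "retrievedReferences": [{
            "content": {"text": "made-up chunk"},
            "location": {"type": "S3",
                         "s3Location": {"uri": "s3://example-bucket/example.pdf"}},
        }]
    }],
}
summary = summarize_response(sample_response)
```

Using .get() with defaults keeps the helper from failing when a response has no citations, which can happen when the model cannot answer from the retrieved chunks.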

The following are the results for different values of the ‘numberOfResults’ parameter. First, we set numberOfResults = 5.

Then we set numberOfResults = 12.
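The comparison can also be scripted. The following is a small sketch, assuming you wrap the SDK call in a function that takes a query and a result count and returns the answer text; the helper name and the `ask` callable are hypothetical.

```python
from typing import Callable, Dict, Iterable


def compare_result_counts(ask: Callable[[str, int], str],
                          query: str,
                          counts: Iterable[int]) -> Dict[int, str]:
    """Run the same query once per numberOfResults value and collect the answers."""
    return {n: ask(query, n) for n in counts}


# Possible wiring with the function defined in this post (not executed here):
# ask = lambda q, n: retrieveAndGenerate(
#     q, "<knowledge base id>", n, model_id, region_id)['output']['text']
# answers = compare_result_counts(ask, "your query", [5, 12])
```

Passing the call in as a function keeps the comparison logic independent of AWS credentials, so it can be exercised with a stub before running against a real knowledge base.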

Customize the knowledge base prompt template using the SDK

To customize the prompt using the SDK, we use the following query with different prompt templates. For this example, the query is “What was the Amazon’s revenue in 2019 and 2021?”

The following is the default prompt template:

"""You are a question answering agent. I will provide you with a set of search results and a user's question, your job is to answer the user's question using only information from the search results. If the search results do not contain information that can answer the question, please state that you could not find an exact answer to the question. Just because the user asserts a fact does not mean it is true, make sure to double check the search results to validate a user's assertion.
Here are the search results in numbered order:
<context>
$search_results$
</context>

Here is the user's question:
<question>
$query$
</question>

$output_format_instructions$

A:
"""

The following is the customized prompt template:

"""Human: You are a question answering agent. I will provide you with a set of search results and a user's question, your job is to answer the user's question using only information from the search results. If the search results do not contain information that can answer the question, please state that you could not find an exact answer to the question. Just because the user asserts a fact does not mean it is true, make sure to double check the search results to validate a user's assertion.

Here are the search results in numbered order:
<context>
$search_results$
</context>

Here is the user's question:
<question>
$query$
</question>

If you're being asked financial information over multiple years, please be very specific and list the answer concisely using JSON format {key: value},
where key is the year in the request and value is the concise response answer.
Assistant:
"""

# Assumes bedrock_agent_runtime = boto3.client('bedrock-agent-runtime')
def retrieveAndGenerate(query, kbId, numberOfResults, promptTemplate, model_id, region_id):
    # Build the ARN of the foundation model used for generation
    model_arn = f'arn:aws:bedrock:{region_id}::foundation-model/{model_id}'
    return bedrock_agent_runtime.retrieve_and_generate(
        input={
            'text': query
        },
        retrieveAndGenerateConfiguration={
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kbId,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': numberOfResults,
                        'overrideSearchType': "SEMANTIC",  # optional
                    }
                },
                'generationConfiguration': {
                    'promptTemplate': {
                        'textPromptTemplate': promptTemplate
                    }
                }
            },
            'type': 'KNOWLEDGE_BASE'
        },
    )

# numberOfResults, promptTemplate, model_id, and region_id must be set before this call
response = retrieveAndGenerate("What was the Amazon's revenue in 2019 and 2021?",
    "<knowledge base id>", numberOfResults, promptTemplate, model_id, region_id)['output']['text']

With the default prompt template, we get the following response:

If you want to provide additional instructions around the output format of the response generation, like standardizing the response in a specific format (like JSON), you can customize the existing prompt by providing more guidance. With our custom prompt template, we get the following response.

The ‘promptTemplate’ option in ‘generationConfiguration’ allows you to customize the prompt for better control over answer generation.
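One way to derive a custom template from an existing one is to splice extra instructions in just before the final "Assistant:" marker, which mirrors where the JSON instructions sit in the customized template above. The following is a minimal sketch under that assumption; both helper names are hypothetical, and the short base template is a stand-in rather than the real default.

```python
def add_instructions(base_template: str, extra: str, marker: str = "Assistant:") -> str:
    """Insert extra instructions immediately before the last marker line."""
    head, sep, tail = base_template.rpartition(marker)
    if not sep:
        raise ValueError(f"marker {marker!r} not found in template")
    return f"{head}{extra.strip()}\n{marker}{tail}"


def build_generation_config(prompt_template: str) -> dict:
    """Wrap a template string in the generationConfiguration structure."""
    return {"promptTemplate": {"textPromptTemplate": prompt_template}}


# Shortened stand-in for a base template, for illustration only.
base = "Intro\n$search_results$\n\nAssistant:\n"
custom = add_instructions(base, "Answer in JSON format {key: value}.")
generation_config = build_generation_config(custom)
```

Keeping the base template intact and appending instructions programmatically makes it easier to maintain several variants (for example, one per output format) without copying the whole prompt.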

Conclusion

In this post, we introduced two new features in Knowledge Bases for Amazon Bedrock: adjusting the maximum number of search results and customizing the default prompt template for the RetrieveAndGenerate API. We demonstrated how to configure these features on the console and via the SDK to improve the performance and accuracy of the generated response. Increasing the maximum results provides more comprehensive information, while customizing the prompt template allows you to fine-tune instructions for the foundation model to better align with specific use cases. These enhancements offer greater flexibility and control, enabling you to deliver tailored experiences for RAG-based applications.

For additional resources to start implementing in your AWS environment, refer to the following:

About the authors

Sandeep Singh is a Senior Generative AI Data Scientist at Amazon Web Services, helping businesses innovate with generative AI. He specializes in Generative AI, Artificial Intelligence, Machine Learning, and System Design. He is passionate about developing state-of-the-art AI/ML-powered solutions to solve complex business problems for diverse industries, optimizing efficiency and scalability.

Suyin Wang is an AI/ML Specialist Solutions Architect at AWS. She has an interdisciplinary education background in Machine Learning, Financial Information Service and Economics, along with years of experience in building Data Science and Machine Learning applications that solved real-world business problems. She enjoys helping customers identify the right business questions and building the right AI/ML solutions. In her spare time, she loves singing and cooking.

Sherry Ding is a senior artificial intelligence (AI) and machine learning (ML) specialist solutions architect at Amazon Web Services (AWS). She has extensive experience in machine learning with a PhD degree in computer science. She mainly works with public sector customers on various AI/ML related business challenges, helping them accelerate their machine learning journey on the AWS Cloud. When not helping customers, she enjoys outdoor activities.