RAG with Tool Calls in AWS Bedrock Knowledge Bases
This tutorial covers the basics of building a conversational RAG application that supports tool calls, via the AWS Bedrock Knowledge Bases and Converse APIs. Here we demonstrate how to build the specific RAG app used in our Integrate Codex as-a-Tool with AWS Bedrock Knowledge Bases tutorial. Remember that Codex works with any RAG app; you can easily translate these ideas to more complex RAG pipelines.
Here’s a typical architecture for RAG apps with tool calling:
Let’s first install the packages required for this tutorial and set up the necessary AWS configurations.
%pip install -U boto3 # we used package-version 1.36.0
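You can verify the installed version in your environment (we developed this tutorial with boto3 1.36.0; nearby versions should also work):

import boto3
print(boto3.__version__)  # we used 1.36.0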
Optional: Set up AWS configurations
import os
import boto3
from botocore.client import Config
os.environ["AWS_ACCESS_KEY_ID"] = (
"<YOUR_AWS_ACCESS_KEY_ID>" # Your permament access key (not session access key)
)
os.environ["AWS_SECRET_ACCESS_KEY"] = (
"<YOUR_AWS_SECRET_ACCESS_KEY>" # Your permament secret access key (not session secret access key)
)
os.environ["MFA_DEVICE_ARN"] = (
"<YOUR_MFA_DEVICE_ARN>" # If your organization requires MFA, find this in AWS Console under: settings -> security credentials -> your mfa device
)
os.environ["AWS_REGION"] = "us-east-1" # Specify your AWS region
# Load environment variables
aws_access_key_id = os.getenv("AWS_ACCESS_KEY_ID")
aws_secret_access_key = os.getenv("AWS_SECRET_ACCESS_KEY")
region_name = os.getenv("AWS_REGION", "us-east-1") # Default to 'us-east-1' if not set
mfa_serial_number = os.getenv("MFA_DEVICE_ARN")
# Ensure required environment variables are set
if not all([aws_access_key_id, aws_secret_access_key, mfa_serial_number]):
    raise EnvironmentError(
        "Missing required environment variables. Ensure AWS_ACCESS_KEY_ID, "
        "AWS_SECRET_ACCESS_KEY, and MFA_DEVICE_ARN are set."
    )
# Enter MFA code in case your AWS organization requires it
mfa_token_code = input("Enter your MFA code: ")
print("MFA code entered: ", mfa_token_code)
sts_client = boto3.client(
    "sts",
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region_name,
)
try:
    # Request temporary credentials
    response = sts_client.get_session_token(
        DurationSeconds=3600 * 24,  # Valid for 24 hours
        SerialNumber=mfa_serial_number,
        TokenCode=mfa_token_code,
    )
    temp_credentials = response["Credentials"]
    temp_access_key = temp_credentials["AccessKeyId"]
    temp_secret_key = temp_credentials["SecretAccessKey"]
    temp_session_token = temp_credentials["SessionToken"]

    # Create a Bedrock Agent Runtime client
    client = boto3.client(
        "bedrock-agent-runtime",
        aws_access_key_id=temp_access_key,
        aws_secret_access_key=temp_secret_key,
        aws_session_token=temp_session_token,
        region_name=region_name,
    )
    print("Bedrock client successfully created.")
except Exception as e:
    print(f"Error creating Bedrock client: {e}")
Initialize Bedrock retrieval and generation clients.
bedrock_config = Config(connect_timeout=120, read_timeout=120, retries={'max_attempts': 0})

BEDROCK_RETRIEVE_CLIENT = boto3.client(
    "bedrock-agent-runtime",
    config=bedrock_config,
    aws_access_key_id=temp_access_key,
    aws_secret_access_key=temp_secret_key,
    aws_session_token=temp_session_token,
    region_name=region_name,
)

BEDROCK_GENERATION_CLIENT = boto3.client(
    service_name='bedrock-runtime',
    aws_access_key_id=temp_access_key,
    aws_secret_access_key=temp_secret_key,
    aws_session_token=temp_session_token,
    region_name=region_name,
)
Example RAG App: Product Customer Support
Consider a customer support / e-commerce RAG use case where the Knowledge Base contains product listings like the following:
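Simple Water Bottle - Amber (limited edition launched Jan 1st 2025)

A water bottle designed with a perfect blend of functionality and aesthetics in mind. Crafted from high-quality, durable plastic with a sleek honey-colored finish.

Price: $24.99
Dimensions: 10 inches height x 4 inches width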
Creating a Knowledge Base
To keep our example simple, we upload the product description to AWS S3 as a single file: simple_water_bottle.txt. This is the sole file our Knowledge Base will contain, but you can populate your actual Knowledge Base with many heterogeneous documents.
To create a Knowledge Base using Amazon Bedrock, refer to the official documentation. After you’ve created it, add your KNOWLEDGE_BASE_ID below.
KNOWLEDGE_BASE_ID = 'DASYAHIOKX'  # replace with your own Knowledge Base ID
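Optionally, you can confirm your Knowledge Base is reachable via the bedrock-agent control-plane client (this check assumes your credentials allow the bedrock:GetKnowledgeBase action):

# Optional: verify the Knowledge Base exists and is ready
bedrock_agent_client = boto3.client(
    "bedrock-agent",
    aws_access_key_id=temp_access_key,
    aws_secret_access_key=temp_secret_key,
    aws_session_token=temp_session_token,
    region_name=region_name,
)
kb_info = bedrock_agent_client.get_knowledge_base(knowledgeBaseId=KNOWLEDGE_BASE_ID)
print(kb_info["knowledgeBase"]["status"])  # Should print 'ACTIVE' once ingestion has finished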
Implement a standard RAG pipeline
A RAG pipeline has two key steps, retrieval and generation, which we implement here using AWS Bedrock APIs. We’ll add tool calling support to the generation step.
Retrieval in AWS Knowledge Bases
We define helper methods for retrieving context from our Knowledge Base.
Optional: Helper methods for Retrieval in AWS Knowledge Bases
def retrieve(query, knowledgebase_id, numberOfResults=3):
    """Fetches relevant document chunks to query from Knowledge Base using AWS Bedrock Agent Runtime"""
    return BEDROCK_RETRIEVE_CLIENT.retrieve(
        retrievalQuery={
            'text': query
        },
        knowledgeBaseId=knowledgebase_id,
        retrievalConfiguration={
            'vectorSearchConfiguration': {
                'numberOfResults': numberOfResults,
                'overrideSearchType': "HYBRID"
            }
        }
    )

def retrieve_and_get_contexts(query, kbId, numberOfResults=3, threshold=0.0):
    """Fetches relevant contexts and properly formats them for the subsequent LLM response generation step."""
    retrieval_results = retrieve(query, kbId, numberOfResults)
    contexts = []
    for retrievedResult in retrieval_results['retrievalResults']:
        if retrievedResult['score'] >= threshold:
            text = retrievedResult['content']['text']
            if text.startswith("Document 1: "):
                text = text[len("Document 1: "):]  # Remove prefix if present
            contexts.append(text)
    return contexts
SCORE_THRESHOLD = 0.3 # Similarity score threshold for retrieving context to use in our RAG app
Let’s run our retrieval with a query.
query = "What is the Simple Water Bottle?"
print(retrieve_and_get_contexts(query, KNOWLEDGE_BASE_ID)[0])
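With our single-document Knowledge Base, the top retrieved context should resemble the product description (exact chunking and formatting may vary depending on your ingestion settings):

Simple Water Bottle - Amber (limited edition launched Jan 1st 2025) A water bottle designed with a perfect blend of functionality and aesthetics in mind. Crafted from high-quality, durable plastic with a sleek honey-colored finish. Price: $24.99 Dimensions: 10 inches height x 4 inches width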
Response generation with tool calling
To generate responses with an LLM that can also call tools, we pass the user query and retrieved context from our Knowledge Base into the AWS Converse API.
This API either returns a string response from the LLM or a tool call. If the output is a tool call, our method keeps prompting the Converse API until the LLM returns a string response after processing the results of any tool calls.
Optional: Helper methods for response generation with tool calling via AWS Converse API
import json
def form_prompt(user_question: str, contexts: list) -> str:
    """Forms the prompt to be used for querying the model."""
    context_strings = "\n\n".join([f"Context {i + 1}: {context}" for i, context in enumerate(contexts)])
    query_with_context = f"{context_strings}\n\nQUESTION:\n{user_question}"
    indented_question_with_context = "\n".join(f" {line}" for line in query_with_context.splitlines())
    return indented_question_with_context
def generate_text(user_question: str, model: str, tools: list[dict], system_prompts: list, messages: list[dict], bedrock_client) -> list[dict]:
    """Generates text, dynamically handling tool use within Amazon Bedrock.
    Params:
        user_question: The user's original question.
        model: Identifier for the Amazon Bedrock model.
        tools: List of tools the model can call.
        system_prompts: System prompt(s) guiding the model.
        messages: List of message history in the desired format.
        bedrock_client: Client to interact with Bedrock API.
    Returns:
        messages: Final updated list of messages including tool interactions and responses.
    """
    # Initial call to the model
    response = bedrock_client.converse(
        modelId=model,
        messages=messages,
        toolConfig=tools,
        system=system_prompts,
    )

    output_message = response["output"]["message"]
    stop_reason = response["stopReason"]
    messages.append(output_message)

    while stop_reason == "tool_use":
        # Extract tool requests from the model response
        tool_requests = output_message.get("content", [])
        for tool_request in tool_requests:
            if "toolUse" in tool_request:
                tool = tool_request["toolUse"]
                tool_name = tool["name"]
                tool_input = tool["input"]
                tool_use_id = tool["toolUseId"]

                try:
                    # Override the 'question' argument with the user's original question,
                    # so the LLM cannot rephrase it (remove this if you prefer the model-generated value)
                    if 'question' in tool['input'].keys():
                        tool['input']['question'] = user_question

                    print(f"[internal log] Requesting tool {tool_name} with arguments: {tool_input}.")
                    tool_output_json = _handle_any_tool_call_for_stream_response(tool_name, tool_input)
                    tool_result = json.loads(tool_output_json)
                    print(f"[internal log] Tool response: {tool_result}")

                    # If tool call resulted in an error
                    if "error" in tool_result:
                        tool_result_message = {
                            "role": "user",
                            "content": [{"toolResult": {
                                "toolUseId": tool_use_id,
                                "content": [{"text": tool_result["error"]}],
                                "status": "error"
                            }}]
                        }
                    else:
                        # Format successful tool response
                        tool_result_message = {
                            "role": "user",
                            "content": [{"toolResult": {
                                "toolUseId": tool_use_id,
                                "content": [{"json": {"response": tool_result}}]
                            }}]
                        }
                except Exception as e:
                    # Handle unexpected exceptions during tool handling
                    tool_result_message = {
                        "role": "user",
                        "content": [{"toolResult": {
                            "toolUseId": tool_use_id,
                            "content": [{"text": f"Error processing tool: {str(e)}"}],
                            "status": "error"
                        }}]
                    }

                # Append the tool result to messages
                messages.append(tool_result_message)

        # Send the updated messages back to the model
        response = bedrock_client.converse(
            modelId=model,
            messages=messages,
            toolConfig=tools,
            system=system_prompts,
        )
        output_message = response["output"]["message"]
        stop_reason = response["stopReason"]
        messages.append(output_message)

    return messages
def _handle_any_tool_call_for_stream_response(function_name: str, arguments: dict) -> str:
    """Handles any tool dynamically by calling the function by name and passing in collected arguments.
    Returns a JSON string of the tool output.
    Returns an error message if the tool is not found, not callable, or called incorrectly.
    """
    tool_function = globals().get(function_name) or locals().get(function_name)

    if callable(tool_function):
        try:
            # Dynamically call the tool function with arguments
            tool_output = tool_function(**arguments)
            return json.dumps(tool_output)
        except Exception as e:
            return json.dumps({
                "error": f"Exception while calling tool '{function_name}': {str(e)}",
                "arguments": arguments,
            })
    else:
        return json.dumps({
            "error": f"Tool '{function_name}' not found or not callable.",
            "arguments": arguments,
        })
Define single-turn RAG app
We integrate the above helper methods into a standard RAG app that can respond to any user query, calling tools as the LLM deems necessary. Our rag() method can be called multiple times in a conversation, as long as a messages variable is provided each time to track conversation history.
def rag(model: str, user_question: str, system_prompt: str, tools: list[dict], messages: list, knowledgebase_id: str) -> str:
    """Performs Retrieval-Augmented Generation using the provided model and tools.
    Params:
        model: Model name or ID.
        user_question: The user's question or query.
        system_prompt: System message to set context or behavior.
        tools: List of tools the model can call.
        messages: List of prior conversation history (pass an empty list to start a new conversation).
        knowledgebase_id: Knowledge Base ID for retrieving contexts.
    Returns:
        Final response text generated by the model.
    """
    # Retrieve contexts based on the user query and knowledge base ID
    contexts = retrieve_and_get_contexts(user_question, knowledgebase_id, threshold=SCORE_THRESHOLD)
    query_with_context = form_prompt(user_question, contexts)
    print(f"[internal log] Invoking LLM with prompt + context\n{query_with_context}\n\n")

    # Construct the user message with the retrieved contexts
    user_message = {
        "role": "user",
        "content": [{"text": query_with_context}]
    }
    messages.append(user_message)
    system_prompts = [{'text': system_prompt}]

    # Call generate_text with the updated messages
    final_messages = generate_text(
        user_question=user_question,
        model=model,
        tools=tools,
        system_prompts=system_prompts,
        messages=messages,
        bedrock_client=BEDROCK_GENERATION_CLIENT,
    )

    # Extract and return the final response text
    return final_messages[-1]["content"][-1]["text"]
Example tool: get_todays_date
Let’s define an example tool, get_todays_date(), to use in our RAG system. We provide the corresponding function and instructions on how to use it in the JSON format required by the AWS Converse API.
from datetime import datetime
def get_todays_date(date_format: str) -> str:
    """A tool that returns today's date in the date format requested."""
    datetime_str = datetime.now().strftime(date_format)
    return datetime_str
todays_date_tool_json = {
    "toolSpec": {
        "name": "get_todays_date",
        "description": "A tool that returns today's date in the date format requested. Options are: '%Y-%m-%d', '%d', '%m', '%Y'.",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "date_format": {
                        "type": "string",
                        "description": "The format that the tool requests the date in."
                    }
                },
                "required": [
                    "date_format"
                ]
            }
        }
    }
}
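As a quick optional sanity check, you can call the tool directly and also exercise the dynamic dispatch helper defined earlier:

print(get_todays_date("%Y-%m-%d"))  # e.g. '2025-02-13'

# The same call routed through the dynamic tool handler (returns a JSON string)
print(_handle_any_tool_call_for_stream_response("get_todays_date", {"date_format": "%Y-%m-%d"}))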
System prompt with tool use instructions
For the best performance, add clear instructions on when to use the tool into the system prompt that governs your LLM. Below we simply add Step 4. to our list of instructions, which otherwise represents a typical RAG system prompt. In most RAG apps, one instructs the LLM on what fallback answer to respond with when it does not know how to answer a user’s query. Such fallback instructions help you reduce hallucinations and more precisely control the AI.
fallback_answer = "Based on the available information, I cannot provide a complete answer to this question."
system_prompt = f"""You are a helpful assistant designed to help users navigate a complex set of documents for question-answering tasks. Answer the user's Question based on the following possibly relevant Context and previous chat history using the tools provided if necessary. Follow these rules in order:
1. NEVER use phrases like "according to the context," "as the context states," etc. Treat the Context as your own knowledge, not something you are referencing.
2. Use only information from the provided Context. Your purpose is to provide information based on the Context, not to offer original advice.
3. Give a clear, short, and accurate answer. Explain complex terms if needed.
4. If the answer to the question requires today's date, use the following tool: get_todays_date. Return the date in the exact format the tool provides it.
5. If you remain unsure how to answer the Question then only respond with: "{fallback_answer}".
Remember, your purpose is to provide information based on the Context, not to offer original advice.
""".format(
fallback_answer=fallback_answer
)
Conversational RAG with tool calling
We track conversation history in a messages variable that is updated each time we call the rag() method to respond to a user query.
Let’s also select an LLM for our RAG pipeline and specify which tools are available.
After that, we can chat with our RAG app! Here we try a few user queries to evaluate different scenarios.
messages = []
model = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0'
tool_config = {
    "tools": [todays_date_tool_json]
}
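By default, the model decides for itself when to call a tool. If you want more control, the Converse API's toolConfig also accepts an optional toolChoice field (supported by some models, including Anthropic Claude); for example, this hypothetical variant would force the model to call at least one tool on each turn:

tool_config_forced = {
    "tools": [todays_date_tool_json],
    "toolChoice": {"any": {}}  # Force a tool call; use {"auto": {}} for the default behavior
}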
Scenario 1: RAG can answer the question without tools
user_question = "How big is the water bottle?"
rag_response = rag(model=model, user_question=user_question, system_prompt=system_prompt, tools=tool_config, messages=messages, knowledgebase_id=KNOWLEDGE_BASE_ID)
print(f'[RAG response] {rag_response}')
For this user query, the necessary information is available in the Knowledge Base (as part of the product description).
Scenario 2: RAG can answer the question using tools
user_question = "Has the limited edition Amber water bottle already launched?"
rag_response = rag(model=model, user_question=user_question, system_prompt=system_prompt, tools=tool_config, messages=messages, knowledgebase_id=KNOWLEDGE_BASE_ID)
print(f'[RAG response] {rag_response}')
For this user query, the LLM chose to call our get_todays_date tool to obtain necessary information. Note that properly answering this question also requires information from the Knowledge Base (the product's launch date).
Scenario 3: RAG can answer the question considering conversation history
user_question = "What is the full name of it?"
rag_response = rag(model=model, user_question=user_question, system_prompt=system_prompt, tools=tool_config, messages=messages, knowledgebase_id=KNOWLEDGE_BASE_ID)
print(f'[RAG response] {rag_response}')
This user query only makes sense in light of the prior conversation history.
Scenario 4: RAG cannot answer the question
user_question = "Can I return my simple water bottle?"
rag_response = rag(model=model, user_question=user_question, system_prompt=system_prompt, tools=tool_config, messages=messages, knowledgebase_id=KNOWLEDGE_BASE_ID)
print(f'[RAG response] {rag_response}')
Note that the Knowledge Base does not contain information about the return policy, and the get_todays_date tool does not help either. In this case, the best our RAG app can do is return our fallback response to the user.
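Because the fallback answer is a fixed string, you can detect these cases programmatically, for instance to escalate the query to a human. A minimal sketch (the is_fallback helper is our own illustration, not part of the pipeline above):

def is_fallback(response: str) -> bool:
    """Returns True if the RAG app resorted to the fallback answer."""
    return fallback_answer in response

if is_fallback(rag_response):
    print("[internal log] RAG could not answer; consider escalating this query.")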
Optional: Review full message history (includes tool calls)
# For educational purposes, we passed `messages` into every RAG call and logged every step in this variable.
for message in messages:
    print(message)
{'role': 'user', 'content': [{'text': ' Context 1: Simple Water Bottle - Amber (limited edition launched Jan 1st 2025) A water bottle designed with a perfect blend of functionality and aesthetics in mind. Crafted from high-quality, durable plastic with a sleek honey-colored finish. Price: $24.99 \\nDimensions: 10 inches height x 4 inches width\n \n QUESTION:\n How big is the water bottle?'}]}
{'role': 'assistant', 'content': [{'text': "The Simple Water Bottle - Amber has the following dimensions:\n\n10 inches in height\n4 inches in width\n\nThese dimensions indicate that it's a fairly standard-sized water bottle, tall enough to hold a good amount of liquid while still being easy to carry and fit into most cup holders or bag pockets."}]}
{'role': 'user', 'content': [{'text': ' Context 1: Simple Water Bottle - Amber (limited edition launched Jan 1st 2025) A water bottle designed with a perfect blend of functionality and aesthetics in mind. Crafted from high-quality, durable plastic with a sleek honey-colored finish. Price: $24.99 \\nDimensions: 10 inches height x 4 inches width\n \n QUESTION:\n Has the limited edition Amber water bottle already launched?'}]}
{'role': 'assistant', 'content': [{'text': "To answer this question accurately, I need to know today's date and compare it with the launch date of the Simple Water Bottle - Amber limited edition. Let me use the available tool to get today's date."}, {'toolUse': {'toolUseId': 'tooluse_yjXK7j33T7yCaR-cXHVsvQ', 'name': 'get_todays_date', 'input': {'date_format': '%Y-%m-%d'}}}]}
{'role': 'user', 'content': [{'toolResult': {'toolUseId': 'tooluse_yjXK7j33T7yCaR-cXHVsvQ', 'content': [{'json': {'response': '2025-02-13'}}]}}]}
{'role': 'assistant', 'content': [{'text': "Based on the information provided and today's date, I can answer your question:\n\nThe limited edition Amber water bottle has already launched. The context states that it was launched on January 1st, 2025, and today's date is February 13, 2025. This means the water bottle has been available for about a month and a half."}]}
{'role': 'user', 'content': [{'text': ' Context 1: Simple Water Bottle - Amber (limited edition launched Jan 1st 2025) A water bottle designed with a perfect blend of functionality and aesthetics in mind. Crafted from high-quality, durable plastic with a sleek honey-colored finish. Price: $24.99 \\nDimensions: 10 inches height x 4 inches width\n \n QUESTION:\n What is the full name of it?'}]}
{'role': 'assistant', 'content': [{'text': 'The full name of the product is:\n\nSimple Water Bottle - Amber\n\nThis name encompasses both the product type and its specific color variant, which is described as a limited edition.'}]}
{'role': 'user', 'content': [{'text': ' Context 1: Simple Water Bottle - Amber (limited edition launched Jan 1st 2025) A water bottle designed with a perfect blend of functionality and aesthetics in mind. Crafted from high-quality, durable plastic with a sleek honey-colored finish. Price: $24.99 \\nDimensions: 10 inches height x 4 inches width\n \n QUESTION:\n Can I return my simple water bottle?'}]}
{'role': 'assistant', 'content': [{'text': "Based on the available information, I cannot provide a complete answer to this question. The given context does not include any details about return policies or procedures for the Simple Water Bottle - Amber. To answer this question accurately, we would need additional information about the company's return policy or specific terms and conditions for this product."}]}
Next Steps
Adding tool calls to your RAG system expands what your AI can do and the types of questions it can answer.
Once you have a RAG app with tools set up, adding Codex as-a-Tool takes only a few lines of code. Codex enables your RAG app to answer questions it previously could not (like Scenario 4 above). Learn how via our tutorial: Integrate Codex as-a-Tool with AWS Bedrock Knowledge Bases.
Need help? Check the FAQ or email us at: support@cleanlab.ai