Integrate Codex as-a-Tool into any RAG framework
To demonstrate how to integrate Codex with any RAG framework, we’ll consider a toy example RAG app built from scratch using OpenAI LLMs. You can translate the same ideas to any RAG framework, assuming basic familiarity with RAG and LLMs.
This tutorial presumes your RAG app can already perform tool calls. If you are unsure how to do RAG with tool calls, follow our tutorial: Adding Tool Calls to RAG.
Once you have a RAG app that supports tool calling, adding Codex as an additional tool takes minimal effort and can markedly improve responses from your AI application.
If you prefer to integrate Codex without adding tool calls to your application, check out our other integrations.
Let’s first install packages required for this tutorial.
%pip install --upgrade cleanlab_codex
Optional: Helper methods for basic RAG from prior tutorial (Adding Tool Calls to RAG)
import os
import json
from datetime import datetime
from openai import OpenAI
fallback_answer = "Based on the available information, I cannot provide a complete answer to this question." # desired RAG response when query cannot be answered
system_prompt_without_codex = f"""
Answer the user's Question based on the following possibly relevant Context. Follow these rules:
1. Never use phrases like "according to the context," "as the context states," etc. Treat the Context as your own knowledge, not something you are referencing.
2. Give a clear, short, and accurate answer. Explain complex terms if needed.
3. If the answer to the question requires today's date, use the following tool: get_todays_date.
4. If the Context doesn't adequately address the Question, say: "{fallback_answer}" only, nothing else.
Remember, your purpose is to provide information based on the Context, not to offer original advice.
"""
def get_todays_date(date_format: str) -> str:
    """A tool that returns today's date in the date format requested."""
    datetime_str = datetime.now().strftime(date_format)
    return datetime_str
todays_date_tool_json = {
    "type": "function",
    "function": {
        "name": "get_todays_date",
        "description": "A tool that returns today's date in the date format requested. Options for date_format parameter are: '%Y-%m-%d', '%d', '%m', '%Y'.",
        "parameters": {
            "type": "object",
            "properties": {
                "date_format": {
                    "type": "string",
                    "enum": ["%Y-%m-%d", "%d", "%m", "%Y"],
                    "default": "%Y-%m-%d",
                    "description": "The date format to return today's date in.",
                }
            },
            "required": ["date_format"],
        },
    },
}
tools_without_codex = [todays_date_tool_json]
def retrieve_context(user_question: str) -> str:
    """Toy retrieval that returns the same context for any user question. Replace this with actual retrieval in your RAG system."""
    contexts = """Simple Water Bottle - Amber (limited edition launched Jan 1st 2025)
A water bottle designed with a perfect blend of functionality and aesthetics in mind. Crafted from high-quality, durable plastic with a sleek honey-colored finish.
Price: $24.99 \nDimensions: 10 inches height x 4 inches width"""
    return contexts
def form_prompt(user_question: str, retrieved_context: str) -> str:
    question_with_context = f"Context:\n{retrieved_context}\n\nUser Question:\n{user_question}"
    indented_question_with_context = "\n".join(f" {line}" for line in question_with_context.splitlines())  # this just indents the final prompt for readability in the tutorial
    return indented_question_with_context
def simulate_response_as_message(response: str) -> dict:
    """Commits the response to a conversation history to return back to the model."""
    return {"role": "assistant", "content": response}
def simulate_tool_call_as_message(tool_call_id: str, function_name: str, function_arguments: str) -> dict:
    """Commits the tool call to a conversation history to return back to the model."""
    tool_call_message = {
        "role": "assistant",
        "tool_calls": [{
            "id": tool_call_id,
            "type": "function",
            "function": {
                "arguments": function_arguments,
                "name": function_name,
            },
        }],
    }
    return tool_call_message
def simulate_tool_call_response_as_message(tool_call_id: str, function_response: str) -> dict:
    """Commits the result of the function call to a conversation history to return back to the model."""
    function_call_result_message = {
        "role": "tool",
        "content": function_response,
        "tool_call_id": tool_call_id,
    }
    return function_call_result_message
def stream_response(client, messages: list[dict], model: str, tools: list[dict]) -> dict:
    """Processes a streaming model response dynamically, handling any tool calls that were made.
    Params:
        messages: message history list in openai format
        model: model name
        tools: list of tools the model can call
    Returns:
        response: final response message in openai format
    """
    response_stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,
        tools=tools,
        parallel_tool_calls=False,  # prevents OpenAI from making multiple tool calls in a single response
    )

    collected_messages = []
    final_tool_calls = {}

    for chunk in response_stream:
        if chunk.choices[0].delta.content:
            collected_messages.append(chunk.choices[0].delta.content)
        for tool_call in chunk.choices[0].delta.tool_calls or []:
            index = tool_call.index
            if index not in final_tool_calls:
                final_tool_calls[index] = tool_call
            final_tool_calls[index].function.arguments += tool_call.function.arguments

        if chunk.choices[0].finish_reason == "tool_calls":
            for tool_call in final_tool_calls.values():
                function_response = _handle_any_tool_call_for_stream_response(tool_call.function.name, json.loads(tool_call.function.arguments))
                print(f'[internal log] Called {tool_call.function.name} tool, with arguments: {tool_call.function.arguments}')
                print(f'[internal log] Tool response: {str(function_response)}')
                tool_call_response_message = simulate_tool_call_response_as_message(tool_call.id, function_response)

                # If the tool call resulted in an error, return the message instead of continuing the conversation
                if "error" in tool_call_response_message["content"]:
                    return tool_call_response_message

                response = [
                    simulate_tool_call_as_message(tool_call.id, tool_call.function.name, tool_call.function.arguments),
                    tool_call_response_message,
                ]

                # If needed, extend messages and re-call the stream response
                messages.extend(response)
                response = stream_response(client=client, messages=messages, model=model, tools=tools)  # This recursive call handles the case when a tool calls another tool until all tools are resolved and a final response is returned
        else:
            collected_messages = [m for m in collected_messages if m is not None]
            full_str_response = "".join(collected_messages)
            response = simulate_response_as_message(full_str_response)
    return response
def _handle_any_tool_call_for_stream_response(function_name: str, arguments: dict) -> str:
    """Handles any tool dynamically by calling the function by name and passing in the collected arguments.
    Returns a JSON string of the tool output.
    Returns an error message if the tool is not found, not callable, or called incorrectly.
    """
    try:
        tool_function = globals().get(function_name) or locals().get(function_name)
        if callable(tool_function):
            # Dynamically call the tool function with arguments
            tool_output = tool_function(**arguments)
            return json.dumps(tool_output)
        else:
            return json.dumps({
                "error": f"Tool '{function_name}' not found or not callable.",
                "arguments": arguments,
            })
    except Exception as e:
        return json.dumps({
            "error": f"Exception in handling tool '{function_name}': {str(e)}",
            "arguments": arguments,
        })
Example RAG App: Product Customer Support
Let’s revisit our RAG app built in the RAG With Tool Calls tutorial, which has the option to call a get_todays_date() tool. This example represents a customer support / e-commerce use-case where the Knowledge Base contains product listings like the Simple Water Bottle returned by the retrieve_context helper above.
The details of this toy RAG app are unimportant if you are already familiar with RAG and Tool Calling; otherwise, refer to the RAG With Tool Calls tutorial. That tutorial walks through the RAG method defined below, which uses the OpenAI LLM API for single-turn Q&A with token-streaming. To run this method, we instantiate our OpenAI client. Subsequently, we integrate Codex-as-a-Tool and demonstrate its benefits.
Optional: Helper RAG method from prior tutorial (Adding Tool Calls to RAG)
def rag(client, model: str, user_question: str, system_prompt: str, tools: list[dict]) -> str:
    retrieved_context = retrieve_context(user_question)
    question_with_context = form_prompt(user_question, retrieved_context)
    print(f"[internal log] Invoking LLM with prompt\n{question_with_context}\n\n")

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question_with_context},
    ]
    response_messages = stream_response(client=client, messages=messages, model=model, tools=tools)
    return f"\n[RAG response] {response_messages.get('content')}"
os.environ["OPENAI_API_KEY"] = "<YOUR-KEY-HERE>" # Replace with your OpenAI API key
model = "gpt-4o" # which LLM to use
client = OpenAI()
Create Codex Project
To use Codex, first create a Project.
Here we assume some common (question, answer) pairs about the Simple Water Bottle have already been added to a Codex Project. Learn how that was done via our tutorial: Populating Codex.
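If you are populating a Project yourself, the sketch below shows roughly how it can be done programmatically with the cleanlab_codex client. Treat it as a sketch only: the method names used here (create_project, add_entries, create_access_key) and their exact signatures are assumptions that may differ across cleanlab_codex versions, so follow the Populating Codex tutorial for the authoritative steps.

from cleanlab_codex import Client

codex_client = Client()  # assumes your Codex API key is already configured (e.g. via the CODEX_API_KEY environment variable)
project = codex_client.create_project(name="Product FAQs", description="Customer support questions")  # hypothetical project name/description
project.add_entries(
    entries=[
        {"question": "Can I return my Simple Water Bottle?", "answer": "<YOUR-RETURN-POLICY-ANSWER>"},  # placeholder answer provided by an SME
    ]
)
access_key = project.create_access_key("tutorial access key")  # the access key used below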
Our existing Codex Project already contains entries answering common questions about the Simple Water Bottle, including whether it can be returned.
access_key = "<YOUR-PROJECT-ACCESS-KEY>" # Obtain from your Project's settings page: https://codex.cleanlab.ai/
Integrate Codex as an additional tool
Integrating Codex into a RAG app that supports tool calling requires minimal code changes:
- Import Codex and add it into your list of tools.
- Update your system prompt to include instructions for calling Codex, as demonstrated below in system_prompt_with_codex.
After that, call your original RAG pipeline with these updated variables to start experiencing the benefits of Codex!
Note: This tutorial uses a Codex tool description in OpenAI format, provided via the to_openai_tool()
function. For certain non-OpenAI LLMs, you can import the Codex tool description in other provided formats as well, or manually write it yourself if no provided format is available. Check the Codex API Docs for other formats.
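For illustration only, a manually written description would mirror the structure of todays_date_tool_json above. In the sketch below, the tool name and parameter name are assumptions; the output of to_openai_tool() is the source of truth for your version of Codex.

# Sketch of a hand-written Codex tool description in OpenAI format (field values are assumptions)
manual_codex_tool_json = {
    "type": "function",
    "function": {
        "name": "consult_codex",  # should match codex_tool.tool_name in your setup
        "description": "Consults a database of SME-provided answers for questions the Context cannot answer.",
        "parameters": {
            "type": "object",
            "properties": {
                "question": {
                    "type": "string",
                    "description": "The user's question, rephrased as a self-contained query if needed.",
                }
            },
            "required": ["question"],
        },
    },
}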
from cleanlab_codex import CodexTool
codex_tool = CodexTool.from_access_key(access_key=access_key, fallback_answer=fallback_answer)
codex_tool_openai = codex_tool.to_openai_tool()
globals()[codex_tool.tool_name] = codex_tool.query # Optional step for convenience: make function to call the tool globally accessible
tools_with_codex = tools_without_codex + [codex_tool_openai] # Add Codex to the list of tools
# Update the RAG system prompt with instructions for handling Codex (adjust based on your needs)
system_prompt_with_codex = f"""
You are a helpful assistant designed to help users navigate a complex set of documents for question-answering tasks. Answer the user's Question based on the following possibly relevant Context and previous chat history using the tools provided if necessary. Follow these rules in order:
1. NEVER use phrases like 'according to the context,' 'as the context states,' etc. Treat the Context as your own knowledge, not something you are referencing.
2. Use only information from the provided Context. Your purpose is to provide information based on the Context, not to offer original advice.
3. Give a clear, short, and accurate answer. Explain complex terms if needed.
4. If the answer to the question requires today's date, use the following tool: get_todays_date. Return the date in the exact format the tool provides it.
5. If you remain unsure how to answer the user query, then use the {codex_tool.tool_name} tool to search for the answer. Always call {codex_tool.tool_name} whenever the provided Context does not answer the user query. Do not call {codex_tool.tool_name} if you already know the right answer or the necessary information is in the provided Context. Your query to {codex_tool.tool_name} should match the user's original query, unless minor clarification is needed to form a self-contained query. After you have called {codex_tool.tool_name}, determine whether its answer seems helpful, and if so, respond with this answer to the user. If the answer from {codex_tool.tool_name} does not seem helpful, then simply ignore it.
6. If you remain unsure how to answer the Question (even after using the {codex_tool.tool_name} tool and considering the provided Context), then only respond with: "{fallback_answer}".
"""
RAG with Codex in action
Integrating Codex as-a-Tool allows your RAG app to answer more questions than it was originally capable of.
Example 1
Let’s ask a question to our original RAG app (before Codex was integrated).
user_question = "Can I return my simple water bottle?"
response = rag(client, model=model, user_question=user_question,
               system_prompt=system_prompt_without_codex, tools=tools_without_codex
)
print(response)
The original RAG app is unable to answer, in this case because the required information is not in its Knowledge Base.
Let’s ask the same question to our RAG app with Codex added as an additional tool. Note that we use the updated system prompt and tool list when Codex is integrated in the RAG app.
response = rag(client, model=model, user_question=user_question,
               system_prompt=system_prompt_with_codex, tools=tools_with_codex
)
print(response)
As you can see, integrating Codex enables your RAG app to answer questions it originally struggled with, as long as a similar question was already answered in the corresponding Codex Project.
Example 2
Let’s ask another question to our RAG app with Codex integrated.
user_question = "How can I order the Simple Water Bottle in bulk?"
response = rag(client, model=model, user_question=user_question,
               system_prompt=system_prompt_with_codex, tools=tools_with_codex
)
print(response)
Our RAG app is unable to answer this question because there is no relevant information in its Knowledge Base, nor has a similar question been answered in the Codex Project (see the contents of the Codex Project above).
Codex automatically recognizes this question could not be answered and logs it into the Project, where it awaits an answer from an SME. Navigate to your Codex Project in the Web App where you (or an SME at your company) can enter the desired answer for this query.
As soon as an answer is provided in Codex, our RAG app will be able to answer all similar questions going forward (as seen for the previous query).
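If you want to inspect what the Codex tool itself returns for such an unanswered question, you can call it directly outside of the LLM, the same way the tool handler above invokes it. Depending on your cleanlab_codex version, this may return the fallback answer you configured (or None) until an SME provides an answer.

# Optional: query the Codex tool directly to inspect its raw output for this question
codex_answer = codex_tool.query(question=user_question)
print(codex_answer)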
Example 3
Let’s ask another query to our RAG app with Codex integrated. This is a query the original RAG app was able to correctly answer without Codex (since the relevant information exists in the Knowledge Base).
user_question = "How big is the water bottle?"
response = rag(client, model=model, user_question=user_question,
               system_prompt=system_prompt_with_codex, tools=tools_with_codex
)
print(response)
We see that the RAG app with Codex integrated is still able to correctly answer this query. Integrating Codex has no negative effect on questions your original RAG app could answer.
Next Steps
Now that Codex is integrated with your RAG app, you and SMEs can open the Codex Project and answer questions logged there to continuously improve your AI.
Adding Codex only improves your RAG app. As seen here, integrating Codex into your RAG app requires minimal extra code. Once integrated, the Codex Project automatically logs all user queries that your original RAG app handles poorly. Using a simple web interface, SMEs at your company can answer the highest priority questions in the Codex Project. As soon as an answer is entered in Codex, your RAG app will be able to properly handle all similar questions encountered in the future.
Codex is the fastest way for nontechnical SMEs to directly improve your RAG app. As the Developer, you simply integrate Codex once, and from then on, SMEs can continuously improve how your AI handles common user queries without needing your help. Codex works with any RAG architecture, so Developers can independently improve the RAG system in other ways with their new free time.
This tutorial demonstrated a single-turn Q&A app, but you can easily extend this code into a conversational app (multi-turn chat).
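For example, a minimal multi-turn loop could reuse the helpers above by carrying the message history across turns. This is only a sketch (it retrieves fresh context and re-forms the prompt on every turn, which you may want to refine for your application):

# Sketch of a multi-turn chat loop reusing the variables and helpers defined earlier
messages = [{"role": "system", "content": system_prompt_with_codex}]
while True:
    user_question = input("You: ")
    if user_question.lower() in {"quit", "exit"}:
        break
    retrieved_context = retrieve_context(user_question)
    messages.append({"role": "user", "content": form_prompt(user_question, retrieved_context)})
    response_message = stream_response(client=client, messages=messages, model=model, tools=tools_with_codex)
    messages.append(response_message)
    print(f"Assistant: {response_message.get('content')}")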
Need help, more capabilities, or other deployment options? Check the FAQ or email us at: support@cleanlab.ai