Building an agentic RAG app with OpenAI Assistants

Run in Google Colab

This tutorial demonstrates how to build an OpenAI Assistants application that can call tools, converse with users, and answer their questions based on a Knowledge Base of files you provide. The OpenAI Assistants framework implements agentic RAG, which treats Retrieval as a tool that can be called (rather than as a step hardcoded into every user interaction, as is done in standard RAG).

Building such an Assistant is a prerequisite for our tutorial: Integrate Codex-as-a-tool into OpenAI Assistants, which shows how to greatly improve any existing Assistant.

RAG Workflow

Let’s first install and setup the OpenAI client library.

%pip install openai  # we used package-version 1.59.7

from openai import OpenAI
import os

os.environ["OPENAI_API_KEY"] = "<YOUR-KEY-HERE>"  # Replace with your OpenAI API key
model = "gpt-4o"  # which LLM to use
client = OpenAI()

Example RAG App: Product Customer Support

Let’s revisit our OpenAI Assistants application built in the tutorial: Agentic RAG with OpenAI Assistants, which has the option to call a get_todays_date() tool. This example represents a customer support / e-commerce use-case where the Knowledge Base contains product listings like the following:

Simple water bottle product listing

For simplicity, our Assistant’s Knowledge Base here only contains a single document featuring this one product description. To build a RAG app with OpenAI Assistants: we load documents/files into a Knowledge Base (vector store), and then connect the Assistant to this Knowledge Base.

Optional: Define helper methods for Knowledge Base creation and retreival

from io import BytesIO
import json

from openai.types.beta.threads import Run
from openai.types.beta.assistant import Assistant
from openai.types.beta.assistant_tool_param import AssistantToolParam
from openai.types.beta.thread import Thread
from openai.types.beta.threads.run import Run as RunObject
from openai.types.beta.threads.message_content import MessageContent
from openai.types.beta.threads.run_submit_tool_outputs_params import ToolOutput

DEFAULT_FILE_SEARCH: AssistantToolParam = {"type": "file_search"}

def create_rag_assistant(client: OpenAI, instructions: str, tools: list[AssistantToolParam]) -> Assistant:
    """Create and configure a RAG-enabled assistant."""

    assert any(tool["type"] == "file_search" for tool in tools), "File search tool is required"
    
    return client.beta.assistants.create(
        name="RAG Assistant",
        instructions=instructions,
        model="gpt-4o",
        tools=tools,
    )

def load_documents(client: OpenAI):
    # Create a vector store
    vector_store = client.beta.vector_stores.create(name="Simple Context")

    # This is a highly simplified way to provide document content
    # In a real application, you would likely:
    # - Read documents from files on disk
    # - Download documents from a database or cloud storage
    # - Process documents from various sources (PDFs, web pages, etc.)
    
    documents = {
        "simple_water_bottle.txt": "Simple Water Bottle - Amber (limited edition launched Jan 1st 2025)\n\nA water bottle designed with a perfect blend of functionality and aesthetics in mind. Crafted from high-quality, durable plastic with a sleek honey-colored finish.\n\nPrice: $24.99 \nDimensions: 10 inches height x 4 inches width",
    }

    # Ready the files for upload to OpenAI
    file_objects = []
    for doc_name, doc_content in documents.items():
        # Create BytesIO object from document content
        file_object = BytesIO(doc_content.encode("utf-8"))
        file_object.name = doc_name
        file_objects.append(file_object)

    # Upload files to vector store
    client.beta.vector_stores.file_batches.upload_and_poll(
        vector_store_id=vector_store.id,
        files=file_objects
    )
    
    return vector_store

def add_vector_store_to_assistant(client: OpenAI, assistant, vector_store):
    assistant = client.beta.assistants.update(
        assistant_id=assistant.id,
        tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
    )
    return assistant

Create Chat App that supports Tool Calls

We instantiate a typical chat application to interact with the Assistant. Each instance of our RAGChat class defined below manages a conversation thread (multi-turn user interaction), responding to each user message through its chat method. Our app handles tool calls for any tools registered via a ToolRegistry class.

Optional: Define class for RAG chat with tools

class ToolRegistry:
    """Registry for tool implementations"""
    def __init__(self):
        self._tools = {}
    
    def register_tool(self, tool_name: str, handler):
        """Register a tool handler function"""
        self._tools[tool_name] = handler
        
    def get_handler(self, tool_name: str):
        """Get the handler for a tool"""
        return self._tools.get(tool_name)
        
    def __contains__(self, tool_name: str) -> bool:
        """Allow using 'in' operator to check if tool exists"""
        return tool_name in self._tools

class RAGChat:
    def __init__(self, client: OpenAI, assistant_id: str, tool_registry: ToolRegistry):
        self.client = client
        self.assistant_id = assistant_id
        self.tool_registry = tool_registry

        # Create a thread for the conversation
        self.thread: Thread = self.client.beta.threads.create()

    def _handle_tool_calls(self, run: RunObject) -> list[ToolOutput]:
        """Handle tool calls from the assistant."""
        if not run.required_action or not run.required_action.submit_tool_outputs:
            return []
            
        tool_outputs: list[ToolOutput] = []
        for tool_call in run.required_action.submit_tool_outputs.tool_calls:
            function_name = tool_call.function.name
            function_args = json.loads(tool_call.function.arguments)
            
            if function_name in self.tool_registry:
                print(f"[internal log] Calling tool: {function_name} with args: {function_args}")
                handler = self.tool_registry.get_handler(function_name)
                if handler is None:
                    raise ValueError(f"No handler found for called tool: {function_name}")
                output = handler(**function_args)
            else:
                output = f"Unknown tool: {function_name}"
                
            tool_outputs.append({
                "tool_call_id": tool_call.id,
                "output": output
            })
        
        return tool_outputs

    def _get_message_text(self, content: MessageContent) -> str:
        """Extract text from message content."""
        if hasattr(content, 'text'):
            return content.text.value
        return "Error: Message content is not text"

    def chat(self, user_message: str) -> str:
        """Process a user message and return the assistant's response."""
        # Add the user message to the thread
        self.client.beta.threads.messages.create(
            thread_id=self.thread.id,
            role="user",
            content=user_message
        )

        # Create a run
        run: Run = self.client.beta.threads.runs.create(
            thread_id=self.thread.id,
            assistant_id=self.assistant_id
        )

        # Wait for run to complete and handle any tool calls
        while True:
            run = self.client.beta.threads.runs.retrieve(
                thread_id=self.thread.id,
                run_id=run.id
            )
            
            if run.status == "requires_action":
                # Handle tool calls
                tool_outputs = self._handle_tool_calls(run)
                
                # Submit tool outputs
                run = self.client.beta.threads.runs.submit_tool_outputs(
                    thread_id=self.thread.id,
                    run_id=run.id,
                    tool_outputs=tool_outputs
                )
                
            elif run.status == "completed":
                # Get the latest message
                messages = self.client.beta.threads.messages.list(
                    thread_id=self.thread.id
                )
                if messages.data:
                    return self._get_message_text(messages.data[0].content[0])
                return "Error: No messages found"
                
            elif run.status in ["failed", "expired"]:
                return f"Error: Run {run.status}"

Example tool: get_todays_date

Let’s define an example tool get_todays_date() that our Assistant can rely on. Here we follow OpenAI’s JSON format for representing the tool.

from datetime import datetime

def get_todays_date(date_format: str) -> str:
  "A tool that returns today's date in the date format requested. Options are: 'YYYY-MM-DD', 'DD', 'MM', 'YYYY'."
  datetime_str = datetime.now().strftime(date_format)
  return datetime_str

todays_date_tool_json = {
  "type": "function",
  "function": {
    "name": "get_todays_date",
    "description": "A tool that returns today's date in the date format requested. Options are: 'YYYY-MM-DD', 'DD', 'MM', 'YYYY'.",
    "parameters": {
      "type": "object",
      "properties": {
        "date_format": {
          "type": "string",
          "enum": ["%Y-%m-%d", "%d", "%m", "%Y"],
          "default": "%Y-%m-%d",
          "description": "The date format to return today's date in."
        }
      },
      "required": ["date_format"],
    }
  }
}

Update our system prompt with tool call instructions

For the best performance, add instructions on when to use the tool into the system prompt that governs your LLM. Below we simply added Step 3. in our list of instructions, which otherwise represent a typical RAG system prompt. In most RAG apps, one instructs the LLM what fallback answer to respond with when it does not know how to answer a user’s query. Such fallback instructions help you reduce hallucinations and more precisely control the AI.

fallback_answer = "Based on the available information, I cannot provide a complete answer to this question."

system_prompt = f"""For each question:
    1. Start with file_search tool
    2. If file_search results are incomplete/empty:
        - Inform the user about insufficient file results
        - Use get_todays_date for additional information if the answer to the question requires today's date
        - Present get_todays_date findings without citations
       
    Only use citations (【source】) for information found directly in files via file_search.
    Do not abstain from answering without trying both tools. When you do, say: "{fallback_answer}", nothing else."""

Initialize OpenAI Assistant

We now use the system_prompt, vector store helper methods and RAG classes defined above to initialize our RAG App. We add the get_todays_date tool into the tool_registry. File search (Retrieval) is another tool OpenAI Assistants can invoke during generation, so we want to add it to the function here.

Optional: Code to create the RAG assistant and add the vector store

vector_store = load_documents(client)

# Initialize default file search as a tool
DEFAULT_FILE_SEARCH: AssistantToolParam = {"type": "file_search"}

# Create an empty tool registry (as we're not using any additional tools yet)
tool_registry = ToolRegistry()
tool_registry.register_tool('get_todays_date', get_todays_date)

# Create assistant and configure RAP App with it, tools and the vector store.
assistant = create_rag_assistant(client, system_prompt, [DEFAULT_FILE_SEARCH, todays_date_tool_json])
assistant = add_vector_store_to_assistant(client, assistant, vector_store)
rag = RAGChat(client, assistant.id, tool_registry)

RAG in action

Let’s ask our Assistant common questions from users about the Simple Water Bottle in our example.

Scenario 1: RAG can answer the question using its Knowledge Base

user_question = "How big is the water bottle?"
rag.chat(user_question)

'The water bottle is 10 inches in height and 4 inches in width【4:0†simple_water_bottle.txt】.'

Here the Assistant was able to provide a good answer because its Knowledge Base contains the necessary information.

Scenario 2: RAG can answer the question using other tools

user_question = "Check today's date. Has the limited edition Amber water bottle already launched?"
rag.chat("Check today's date. Has the limited edition Amber water bottle already launched?")

[internal log] Calling tool: get_todays_date with args: {'date_format': '%Y-%m-%d'}

"The limited edition Amber water bottle was launched on January 1st, 2025【8:0†simple_water_bottle.txt】. Since today's date is February 27th, 2025, it has already been launched."

In this case, the assistant chose to call our get_todays_date tool to obtain information necessary for properly answering the user’s query. Note that a proper answer to this question also requires considering information from the Knowledge Base as well.

Scenario 3: RAG can’t answer the question

user_question = "Can I return my simple water bottle?"
rag.chat(user_question)

'Based on the available information, I cannot provide a complete answer to this question.'

This Assistant’s Knowledge Base does not contain information about the return policy, and the get_todays_date tool would not help here either. In this case, the best our Assistant can do is to return our fallback response to the user.

Next steps

Once you have an OpenAI Assistant that can call tools, adding Codex as a Tool takes only a few lines of code. Codex enables your RAG app to answer questions it previously could not (like Scenario 3 above). Learn how via our tutorial: Integrate Codex-as-a-tool into OpenAI Assistants

Need help? Check the FAQ or email us at: support@cleanlab.ai

Example RAG App: Product Customer Support​

Create Chat App that supports Tool Calls​

Example tool: get_todays_date​

Update our system prompt with tool call instructions​

Initialize OpenAI Assistant​

RAG in action​

Scenario 1: RAG can answer the question using its Knowledge Base​

Scenario 2: RAG can answer the question using other tools​

Scenario 3: RAG can’t answer the question​

Next steps​