Integrate Cleanlab with OpenAI Assistants

This tutorial demonstrates how to integrate Cleanlab with a RAG app built using OpenAI Assistants.

This is our recommended integration strategy for developers using OpenAI Assistants. The integration is only a few lines of code.

When integrating Cleanlab, you can automatically detect problematic RAG responses - see the advanced usage section of our validator tutorial for an in-depth look at these detection methods.

RAG Workflow

Let’s first install packages required for this tutorial.

%pip install openai  # we used package-version 1.59.7
%pip install --upgrade cleanlab_codex
from openai import OpenAI

client = OpenAI() # API key is read from the OPENAI_API_KEY environment variable

Example RAG App: Customer Service for a New Product

Consider a customer support use-case, where the RAG application is built on a Knowledge Base with product pages such as the following:

[Image: product page for the Simple Water Bottle]

RAG with OpenAI Assistants

Let’s set up our Assistant! To keep this example simple, our Assistant’s Knowledge Base only has a single document containing the description of the product listed above.

Optional: Helper functions to set up an OpenAI Assistant

from io import BytesIO

from openai.types.beta.threads import Run
from openai.types.beta.assistant import Assistant

def create_rag_assistant(client: OpenAI, instructions: str) -> Assistant:
    """Create and configure a RAG-enabled Assistant."""
    return client.beta.assistants.create(
        name="RAG Assistant",
        instructions=instructions,  # System prompt that governs the Assistant
        model="gpt-4o-mini",
        tools=[{"type": "file_search"}],  # OpenAI Assistants is an agentic RAG framework that treats retrieval as a Tool called file_search
    )

def load_documents(client: OpenAI):
    """A highly simplified way to populate our Assistant's Knowledge Base. You can replace this toy example with many heterogeneous document files (PDFs, web pages, ...)."""
    vector_store = client.beta.vector_stores.create(name="Simple Context")

    documents = {
        "simple_water_bottle.txt": "Simple Water Bottle - Amber (limited edition launched Jan 1st 2025)\n\nA water bottle designed with a perfect blend of functionality and aesthetics in mind. Crafted from high-quality, durable plastic with a sleek honey-colored finish.\n\nPrice: $24.99 \nDimensions: 10 inches height x 4 inches width",
    }  # our toy example only has one short document

    # Upload documents to OpenAI
    file_objects = []
    for name, content in documents.items():
        file_object = BytesIO(content.encode("utf-8"))
        file_object.name = name
        file_objects.append(file_object)

    client.beta.vector_stores.file_batches.upload_and_poll(
        vector_store_id=vector_store.id,
        files=file_objects,
    )
    return vector_store

def add_vector_store_to_assistant(client: OpenAI, assistant: Assistant, vector_store) -> Assistant:
    """Connect a vector store to the Assistant, enabling its file_search tool to retrieve from it."""
    assistant = client.beta.assistants.update(
        assistant_id=assistant.id,
        tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
    )
    return assistant

Now that we have defined basic functionality to set up our Assistant, let’s implement a standard RAG app using the OpenAI Assistants API. Our application will be conversational, supporting multi-turn dialogues. A new dialogue (i.e. thread) is instantiated as a RAGChat object defined below. To have the Assistant respond to each user message in the dialogue, simply call this object’s chat() method. The RAGChat class properly manages conversation history, retrieval, and LLM response generation via the OpenAI Assistants API.

Optional: RAGChat class to orchestrate each conversation with our Assistant

class RAGChat:
    def __init__(self, client: OpenAI, assistant_id: str):
        self.client = client
        self.assistant_id = assistant_id
        self.thread_id = self.client.beta.threads.create().id

    def chat(self, user_message: str) -> str:
        """Process a user message and return the assistant's response."""
        # Add the user message to the thread
        self.client.beta.threads.messages.create(
            thread_id=self.thread_id,
            role="user",
            content=user_message,
        )

        # Invoke the assistant on the current thread
        run: Run = self.client.beta.threads.runs.create_and_poll(
            thread_id=self.thread_id,
            assistant_id=self.assistant_id,
        )

        # Fetch the assistant's response (basic example; modify as necessary for settings like token streaming)
        messages = list(self.client.beta.threads.messages.list(thread_id=self.thread_id, run_id=run.id))

        # Replace file-citation annotations with bracketed indices like [0]
        message_content = messages[0].content[0].text
        annotations = message_content.annotations
        for index, annotation in enumerate(annotations):
            message_content.value = message_content.value.replace(annotation.text, f"[{index}]")

        return message_content.value

Let’s use these helper methods to instantiate an Assistant.

# Ingest files and load them into a Knowledge Base
vector_store = load_documents(client)

# Define instructions the Assistant should generally follow
fallback_answer = "Based on the available information, I cannot provide a complete answer to this question."
system_message = f"""Do not make up answers to questions if you cannot find the necessary information.
If you remain unsure how to accurately respond to the user after considering the available information and tools, then only respond with: "{fallback_answer}".
"""

# Create assistant and connect our vector store for file search (i.e. retrieval)
assistant = create_rag_assistant(client, system_message)
assistant = add_vector_store_to_assistant(client, assistant, vector_store)

# Create RAG app to chat with this Assistant
rag = RAGChat(client, assistant.id)

At this point, you can chat with the Assistant via: rag.chat(your_query) as shown below. Before we demonstrate that, let’s first see how easy it is to integrate Cleanlab.

Create Cleanlab Project

To use the Cleanlab AI Platform, first create a Project.
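If you have not created a Project yet, you can do so programmatically. Below is a minimal sketch using the cleanlab_codex client library; it assumes your personal API key is set in the CODEX_API_KEY environment variable, and the Project name, description, and seed entry are placeholders to adapt for your use case.

import os
from cleanlab_codex import Client

os.environ["CODEX_API_KEY"] = "<YOUR-CODEX-API-KEY>"  # placeholder: personal API key from your account settings

codex_client = Client()  # reads CODEX_API_KEY from the environment
project = codex_client.create_project(
    name="Product FAQs",  # placeholder name
    description="Customer service questions about product pages",  # placeholder description
)

# Optionally seed the Project with known (question, answer) pairs
project.add_entries([
    {
        "question": "Can I return my simple water bottle?",
        "answer": "Return it within 30 days for a full refund-- no questions asked. Contact our support team to initiate your return!",
    },
])

access_key = project.create_access_key("test access key")  # used later as CODEX_ACCESS_KEY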

Here we assume some common (question, answer) pairs about the Simple Water Bottle have already been added to a Cleanlab Project.

Our existing Cleanlab Project contains the following entries:

[Image: example (question, answer) entries in the Cleanlab Project]

Integrate Cleanlab

RAG apps unfortunately sometimes produce bad/unhelpful responses. Instead of providing these to your users, Cleanlab can automatically detect these cases and provide better answers.

Integrating Cleanlab just requires two steps:

  1. Configure the Cleanlab system with your Cleanlab Project credentials and settings that control what sort of responses are detected to be bad.
  2. Enhance your RAG app to:
    • Use Cleanlab to monitor whether each Assistant response is bad.
    • Query Cleanlab for a better answer when needed.
    • Update the conversation with Cleanlab’s answer when needed.

After that, call your enhanced RAG app just like the original app - Cleanlab works automatically in the background.

Below is all the code needed to integrate Cleanlab.

Optional: RAGChat subclass that integrates Cleanlab (RAGChatWithCodexBackup)

from typing import Any, Dict, Optional
from cleanlab_codex import Project
from cleanlab_codex.response_validation import is_bad_response

class RAGChatWithCodexBackup(RAGChat):
    """Determines when to rely on Codex based on `cleanlab_codex.response_validation.is_bad_response()`. Keyword arguments for this method can be provided when instantiating this object via: `is_bad_response_config`."""

    def __init__(
        self,
        client: OpenAI,
        assistant_id: str,
        codex_access_key: str,
        is_bad_response_config: Optional[Dict[str, Any]] = None,
    ):
        super().__init__(client, assistant_id)
        self._codex_project = Project.from_access_key(codex_access_key)
        self._is_bad_response_config = is_bad_response_config

    def _replace_latest_message(self, new_message: str) -> None:
        """Updates the latest assistant message in the thread with the backup response from Codex."""
        client: OpenAI = self.client
        thread_id: str = self.thread_id

        messages = client.beta.threads.messages.list(
            thread_id=thread_id,
        ).data
        latest_message = messages[0]

        client.beta.threads.messages.delete(
            thread_id=thread_id,
            message_id=latest_message.id,
        )
        client.beta.threads.messages.create(
            thread_id=thread_id,
            content=new_message,
            role="assistant",
        )

    def chat(self, user_message: str) -> str:
        response = super().chat(user_message)

        kwargs = {"response": response, "query": user_message}
        if self._is_bad_response_config is not None:
            kwargs["config"] = self._is_bad_response_config

        if is_bad_response(**kwargs):
            codex_response: Optional[str] = self._codex_project.query(user_message)[0]

            if codex_response is not None:
                # You may prefer to utilize Codex answers differently in your app than done here
                self._replace_latest_message(codex_response)
                response = codex_response

        return response

Cleanlab automatically detects when your RAG app would have provided unsafe or untrustworthy responses. Here we provide a basic configuration for this detection that relies on the fallback answer that we instructed the Assistant to output whenever it doesn’t know how to respond accurately. With this configuration, Cleanlab will be consulted whenever your Assistant’s response is estimated to be unhelpful.

Learn more about available detection methods and configurations via our tutorial: Validator - Advanced Usage.

import os

os.environ["CODEX_ACCESS_KEY"] = "<YOUR_PROJECT_ACCESS_KEY>"  # Available from your Project's settings page at: https://codex.cleanlab.ai/

is_bad_response_config = {
    "fallback_answer": fallback_answer,
}

# Instantiate RAG app enhanced with Cleanlab
rag_with_codex = RAGChatWithCodexBackup(
    client=client,
    assistant_id=assistant.id,
    codex_access_key=os.environ["CODEX_ACCESS_KEY"],
    is_bad_response_config=is_bad_response_config,
)

RAG with Cleanlab in action

We can now pose user queries to our original RAG app (rag), as well as to the version of this RAG app enhanced with Cleanlab (rag_with_codex).

Example 1

Let’s ask a question to our original RAG app (before Cleanlab was integrated).

user_question = "Can I return my simple water bottle?"
rag.chat(user_question)
'Based on the available information, I cannot provide a complete answer to this question.'

The original RAG app is unable to answer, in this case because the required information is not in its Knowledge Base.

Let’s ask the same question to the RAG app enhanced with Cleanlab.

rag_with_codex.chat(user_question)
'Return it within 30 days for a full refund-- no questions asked. Contact our support team to initiate your return!'

As you can see, integrating Cleanlab enables your RAG app to answer questions it originally struggled with, as long as a similar question was already answered in the corresponding Cleanlab Project.

Example 2

Let’s ask another question to our RAG app with Cleanlab integrated.

user_question = "How can I order the Simple Water Bottle in bulk?"
rag.chat(user_question)
'Based on the available information, I cannot provide a complete answer to this question.'
rag_with_codex.chat(user_question)
'Based on the available information, I cannot provide a complete answer to this question.'

Our RAG app is unable to answer this question because there is no relevant information in its Knowledge Base, nor has a similar question been answered in the Cleanlab Project (see the contents of the Cleanlab Project above).

Cleanlab automatically recognizes that this question could not be answered and logs it in the Project, where it awaits an answer from an SME. Navigate to your Cleanlab Project in the Web App, where you (or an SME at your company) can enter the desired answer for this query.
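You can observe this mechanism directly by querying the Codex Project yourself. This quick illustration reuses this tutorial’s objects: as in RAGChatWithCodexBackup.chat() above, query() returns a tuple whose first element is the answer, or None while the question remains unanswered.

# Illustration only: query the Codex Project directly
codex_answer = rag_with_codex._codex_project.query(user_question)[0]
print(codex_answer)  # None until an SME enters an answer in the Web App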

As soon as an answer is provided in Cleanlab, our RAG app will be able to answer all similar questions going forward (as seen for the previous query).

Example 3

Let’s ask another query to our two RAG apps.

user_question = "How big is the water bottle?"
rag.chat(user_question)
'The Simple Water Bottle has dimensions of 10 inches in height and 4 inches in width[0].'

The original RAG app was able to correctly answer without Cleanlab (since the relevant information exists in the Knowledge Base).

rag_with_codex.chat(user_question)
'The Simple Water Bottle measures 10 inches in height and 4 inches in width[0].'

We see that the RAG app with Cleanlab integrated is still able to correctly answer this query. Integrating Cleanlab has no negative effect on questions your original RAG app could answer.

Next Steps

Now that Cleanlab is integrated with your RAG app, you and SMEs can open the Cleanlab Project and answer questions logged there to continuously improve your AI.

This tutorial demonstrated how to easily integrate Cleanlab into any OpenAI Assistants application. Unlike tool calls, which are harder to control, you decide exactly when to call Cleanlab. For instance, you can use Cleanlab to automatically detect whenever the Assistant produces hallucinations or unhelpful responses such as “I don’t know”.

Adding Cleanlab only improves your RAG app. Once integrated, Cleanlab automatically logs all user queries that your original RAG app handles poorly. Using a simple web interface, SMEs at your company can answer the highest priority questions in the Cleanlab Project. As soon as an answer is entered in Cleanlab, your RAG app will be able to properly handle all similar questions encountered in the future.

Cleanlab is the fastest way for nontechnical SMEs to directly improve your AI Assistant. As the Developer, you simply integrate Cleanlab once, and from then on, SMEs can continuously improve how your Assistant handles common user queries without needing your help.

Need help, more capabilities, or other deployment options?
Check the FAQ or email us at: support@cleanlab.ai