RAG with Tool Calls in LangChain
This notebook covers the basics of building an agentic RAG application with LangChain (the specific RAG app used in our Integrate Codex as-a-Tool with LangChain tutorial).
The LangChain framework supports agentic RAG, which treats retrieval as a tool the LLM can choose to call (rather than a step hardcoded into every user interaction, as is done in standard RAG). For standard RAG (where retrieval is a hardcoded step), refer to our Adding Tool Calls to RAG tutorial.
Here’s a typical architecture for agentic RAG apps with tool calling:
Let’s first install required packages for this tutorial.
%pip install langchain-text-splitters langchain-community langgraph langchain-openai # we used package-versions 0.3.5, 0.3.16, 1.59.7, 0.3.2
import os
os.environ["OPENAI_API_KEY"] = "<YOUR-KEY-HERE>" # Replace with your OpenAI API key
generation_model = "gpt-4o" # model used by RAG system (has to support tool calling)
embedding_model = "text-embedding-3-small" # any LangChain embeddings model
Example: Customer Service for a New Product
Consider a customer support / e-commerce RAG use-case where the Knowledge Base contains product listings like the following:
To keep this example minimal, we'll use a simple in-memory vector store holding a single document that contains the context (product information) for the product above. This setup can be swapped for any LangChain embeddings model and vector store.
Optional: Initialize vector store + add document
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Initialize vector store
embeddings = OpenAIEmbeddings(model=embedding_model)
vector_store = InMemoryVectorStore(embeddings)
# Sample document to demonstrate Codex integration
product_page_content = """Simple Water Bottle - Amber (limited edition launched Jan 1st 2025)
A water bottle designed with a perfect blend of functionality and aesthetics in mind. Crafted from high-quality, durable plastic with a sleek honey-colored finish.
Price: $24.99 \nDimensions: 10 inches height x 4 inches width"""
documents = [
    Document(
        id="simple_water_bottle.txt",
        page_content=product_page_content,
    ),
]
# Standard LangChain text splitting - use any splitter that fits your docs
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
all_splits = text_splitter.split_documents(documents)
# Add documents to your chosen vector store
_ = vector_store.add_documents(documents=all_splits)
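As a quick, optional sanity check, you can query the vector store directly to confirm the document was indexed (the query string below is just an example):
# Optional sanity check: confirm the document is retrievable (example query)
print(f"Split document into {len(all_splits)} chunk(s)")
results = vector_store.similarity_search("How much does the water bottle cost?", k=1)
print(results[0].page_content[:100])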
Create Chat App with Tool Calls
We now define a tool-calling RAG app.
from langchain_core.tools import BaseTool
from langchain_core.messages import (
    HumanMessage,
    SystemMessage,
    BaseMessage,
)
from typing import List, Optional
class RAGApp:
    def __init__(
        self,
        llm: ChatOpenAI,
        tools: List[BaseTool],
        retriever: BaseTool,
        messages: Optional[List[BaseMessage]] = None,
    ):
        """Initialize RAG application with provided components."""
        _tools = [retriever] + tools
        self.tools = {tool.name: tool for tool in _tools}
        self.llm = llm.bind_tools(_tools)
        self.messages: List[BaseMessage] = messages or []

    def chat(self, user_query: str) -> str:
        """Process user input and handle any necessary tool calls."""
        # Add user query to messages
        self.messages.append(HumanMessage(content=user_query))

        # Get initial response (may include tool calls)
        print(f"[internal log] Invoking LLM with text\n{user_query}\n\n")
        response = self.llm.invoke(self.messages)
        self.messages.append(response)

        # Handle any tool calls
        while response.tool_calls:
            # Process each tool call
            for tool_call in response.tool_calls:
                # Look up the requested tool by name
                tool = self.tools[tool_call["name"].lower()]
                tool_name = tool_call["name"]
                tool_args = tool_call["args"]
                print(f"[internal log] Calling tool: {tool_name} with args: {tool_args}")

                # Invoke the tool; passing the tool call returns a ToolMessage
                tool_result = tool.invoke(tool_call)
                print(f"[internal log] Tool response: {str(tool_result)}")
                self.messages.append(tool_result)

            # Get next response after tool calls
            response = self.llm.invoke(self.messages)
            self.messages.append(response)

        return response.content
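For reference, each entry in response.tool_calls is a dictionary produced by LangChain's tool-calling interface. A minimal sketch of its shape (the values shown here are hypothetical, not real output):
# Illustrative only -- the shape of one entry in `response.tool_calls`
example_tool_call = {
    "name": "get_todays_date",            # which tool the LLM wants to call
    "args": {"date_format": "%Y-%m-%d"},  # arguments the LLM filled in
    "id": "call_abc123",                  # ID used to match the tool's result
}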
Example tool: get_todays_date
Let’s define an example tool get_todays_date() that our RAG app can rely on. LangChain does not require explicit function schemas, which makes adding tools much easier: write a normal Python function, and LangChain automatically reads the function name and docstring, understands the parameters and type hints, and creates the LLM-friendly format for you.
from langchain_core.tools import tool
from datetime import datetime
@tool
def get_todays_date(date_format: str) -> str:
    """A tool that returns today's date in the date format requested. Options for date_format parameter are: '%Y-%m-%d', '%d', '%m', '%Y'."""
    datetime_str = datetime.now().strftime(date_format)
    return datetime_str
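Because @tool wraps the function in a LangChain BaseTool, you can invoke it directly and inspect the schema LangChain auto-generated from the signature and docstring. An optional quick check:
# Optional: invoke the tool directly and inspect its auto-generated schema
print(get_todays_date.invoke({"date_format": "%Y-%m-%d"}))  # e.g. '2025-01-15'
print(get_todays_date.name)         # 'get_todays_date'
print(get_todays_date.description)  # the docstring above
print(get_todays_date.args)         # parameter schema derived from type hints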
Define Retriever Tool
In addition to our example tool, we must explicitly provide the system with a tool that works as a retriever. It searches the vector store for Context relevant to a query, which is needed whenever we want our system to perform context retrieval. Let’s define it here.
@tool
def retrieve(query: str) -> str:
    """Search through available documents to find relevant information."""
    docs = vector_store.similarity_search(query, k=2)
    return "\n\n".join(doc.page_content for doc in docs)
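Since retrieve is itself a tool, it can also be tested in isolation before wiring it into the app (the query below is just an example):
# Optional: test the retriever tool in isolation (example query)
print(retrieve.invoke({"query": "water bottle dimensions"}))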
Update our LLM system prompt with tool call instructions
For the best performance, add instructions on when to use each tool into the system prompt that governs your LLM. Below, we simply added Steps 3 and 4 to a list of instructions that otherwise represents a typical RAG system prompt. In most RAG apps, one instructs the LLM on what fallback_answer to respond with when it does not know how to answer a user’s query. Such fallback instructions help you reduce hallucinations and more precisely control the AI.
fallback_answer = "Based on the available information, I cannot provide a complete answer to this question."
system_message = f"""
Answer the user's Question based on the following possibly relevant Context. Follow these rules:
1. Never use phrases like "according to the context," "as the context states," etc. Treat the Context as your own knowledge, not something you are referencing.
2. Give a clear, short, and accurate answer. Explain complex terms if needed.
3. You have access to the retrieve tool, to retrieve relevant information to the query as Context.
4. If the answer to the question requires today's date, use the following tool: get_todays_date.
5. If the Context doesn't adequately address the Question, say: "{fallback_answer}" only, nothing else.
Remember, your purpose is to provide information based on the Context, not to offer original advice.
"""
Initialize our RAG App
Finally, let’s set up our LLM and initialize our RAG App. Any LangChain-compatible LLM can be used here, as long as it supports tool calling.
llm = ChatOpenAI(model=generation_model)
rag = RAGApp(
    llm=llm,
    tools=[get_todays_date],  # Add your tools here
    retriever=retrieve,
    messages=[SystemMessage(content=system_message)],
)
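Note that RAGApp accumulates conversation history in self.messages, so the scenarios below run as one ongoing conversation; to start over from a clean slate, instantiate a new RAGApp with a fresh SystemMessage.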
RAG in action
Let’s run our RAG application over different questions commonly asked by users about the Simple Water Bottle in our example.
Scenario 1: RAG can answer the question without tools
response = rag.chat("How big is the water bottle?")
print(f"\n[RAG response] {response}")
Scenario 2: RAG can answer the question (using other tools)
response = rag.chat("Check today's date. Has the limited edition Amber water bottle already launched?")
print(f"\n[RAG response] {response}")
Scenario 3: RAG cannot answer the question
response = rag.chat("Can I return my simple water bottle?")
print(f"\n[RAG response] {response}")
Note that the Context does not contain information about the return policy, and the get_todays_date tool would not help either. In this case, we want the app to return our fallback response to the user.
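If you want to trace exactly which tools were called along the way, you can inspect the accumulated message history (a rough sketch; the message types and content vary by step):
# Optional: inspect the conversation history to trace tool calls
for message in rag.messages:
    print(type(message).__name__, "->", str(message.content)[:80])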
Next Steps
Adding tool calls to your RAG system expands what your AI can do and the types of questions it can answer.
Once you have a RAG app with tools set up, adding Codex as-a-Tool takes only a few lines of code. Codex enables your RAG app to answer questions it previously could not (like Scenario 3 above). Learn how via our tutorial: Integrate Codex as-a-Tool with LangChain.
Need help? Check the FAQ or email us at: support@cleanlab.ai