Integrate Cleanlab with AWS Strands Agents
This tutorial demonstrates how to integrate Cleanlab with AI Agents built using the Strands SDK. With minimal changes to your existing Strands Agent code, you can detect bad responses and automatically remediate them in real time.
Setup
The Python packages required for this tutorial can be installed using pip:
%pip install --upgrade cleanlab-codex strands-agents "strands-agents[openai]" tavily-python
This tutorial requires a Cleanlab API key, which you can get at https://codex.cleanlab.ai/.
import os
os.environ["CODEX_API_KEY"] = "<Cleanlab Codex API key>" # Get your API key from: https://codex.cleanlab.ai/
os.environ["OPENAI_API_KEY"] = "<OpenAI API key>" # for using OpenAI models with Strands
os.environ["TAVILY_API_KEY"] = "<TAVILY API KEY>" # for using a web search tool (get your free API key from Tavily)
from cleanlab_codex.client import Client
from tavily import TavilyClient
Overview of this tutorial
This tutorial showcases using Cleanlab’s CleanlabModel wrapper to add real-time validation to Strands Agents.
We’ll demonstrate five key scenarios:
- Conversational Chat Response - Basic agent interaction with validation
- Tool Call Response - Agent response using tools with validation
- Bad AI Response - How Cleanlab detects and scores problematic responses
- Expert Answer Response - A deterministic remediation to problematic responses
- Information Retrieval Tool Call Response - Context-aware validation with web search
Create Cleanlab Project
To use the Cleanlab AI Platform for validation, we must first create a Project. Here we assume no (question, answer) pairs have been added to the Project yet.
User queries for which Cleanlab detected a bad response from your AI app will be logged in this Project for SMEs to answer later.
# Create a Cleanlab project
client = Client()
project = client.create_project(
name="Strands Agent with Cleanlab Validation Tutorial",
description="Tutorial demonstrating validation of a Strands Agent with CleanlabModel wrapper"
)
Example Use Case: Bank Loan Customer Support
We’ll build a customer support agent for bank loans to demonstrate validation scenarios.
Let’s define tools representing different response quality levels:
- a good tool that returns reasonable information
- a bad tool that returns problematic information
- a web search tool that provides additional context to the Agent
Note: The web search tool follows the example information retrieval function defined in the Strands Web Search tutorial.
Optional: Tool definitions for demonstration scenarios
from strands.tools.decorator import tool
# ============ Good Tool: Returns reasonable information ============
@tool
def get_payment_schedule(account_id: str) -> str:
"""Get payment schedule for an account."""
payment_schedule = f"""Account {account_id} has:
Bi-weekly payment plan
Upcoming payment scheduled for next Friday
"""
return payment_schedule
# ============ Bad Tool: Returns problematic information ============
@tool
def get_total_amount_owed(account_id: str) -> dict:
"""A tool that simulates fetching the total amount owed for a loan.
**Note:** This tool returns a hardcoded *unrealistic* total amount for demonstration purposes."""
return {
"account_id": account_id,
"currency": "USD",
"total": 7000000000000000000000000000000000000.00,
}
# ============ Web Search Tool: Provides context for the Agent ============
@tool
def web_search(
    query: str, time_range: str | None = None, include_domains: list[str] | None = None
) -> str:
"""Perform a web search. Returns the search results as a string, with the title, url, and content of each result ranked by relevance.
Args:
query (str): The search query to be sent for the web search.
time_range (str | None, optional): Limits results to content published within a specific timeframe.
Valid values: 'd' (day - 24h), 'w' (week - 7d), 'm' (month - 30d), 'y' (year - 365d).
Defaults to None.
include_domains (list[str] | None, optional): A list of domains to restrict search results to.
Only results from these domains will be returned. Defaults to None.
Returns:
formatted_results (str): The web search results
"""
    def format_search_results_for_agent(search_results: dict) -> str:
"""Format search results into a numbered context string for the agent."""
results = search_results["results"]
parts = []
for i, r in enumerate(results, start=1):
title = r.get("title", "").strip()
content = r.get("content", "").strip()
if title or content:
block = (
f"Context {i}:\n"
f"title: {title}\n"
f"content: {content}"
)
parts.append(block)
return "\n\n".join(parts)
client = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))
formatted_results = format_search_results_for_agent(
client.search(
query=query,
max_results=2,
time_range=time_range,
include_domains=include_domains
)
)
return formatted_results
Strands Integration
To add response validation to your Strands agents, wrap any existing Strands model with a CleanlabModel for real-time validation during Agent execution.
Cleanlab’s wrapper intercepts responses during generation, validates them in real time, and provides automatic expert answer substitution and guardrail enforcement.
Integration steps:
- Wrap your Model with CleanlabModel
- Create your Agent with the wrapped Model
- Call cleanlab_model.set_agent_reference(agent) for full functionality
Context-Aware Validation for Information Retrieval
For agents with tools that retrieve information (e.g., RAG, web search, database queries), Cleanlab can use this retrieved content as context during validation. This enables more accurate evaluation by:
- Checking if the AI response is grounded in the retrieved information
- Measuring context sufficiency (whether enough information was retrieved)
- Detecting hallucinations by comparing the response against actual context
To enable this, specify the names of your context-providing tools in the context_retrieval_tools parameter during CleanlabModel initialization.
import uuid
from strands.agent.agent import Agent
from strands.models.openai import OpenAIModel
from strands.session.file_session_manager import FileSessionManager
from cleanlab_codex.experimental.strands import CleanlabModel
SYSTEM_PROMPT = "You are a customer service agent for bank loans. Be polite and concise in your responses. Always rely on the tool answers."
# Create base model
base_model = OpenAIModel(
model_id="gpt-4o-mini",
)
### New code to add for Cleanlab API ###
FALLBACK_RESPONSE = "Sorry I am unsure. You can try rephrasing your request."
cleanlab_model = CleanlabModel( # Wrap with Cleanlab validation
underlying_model=base_model,
cleanlab_project=project,
fallback_response=FALLBACK_RESPONSE,
context_retrieval_tools=["web_search", "get_payment_schedule", "get_total_amount_owed"] # Specify tool(s) that provide context
)
### End of new code to add for Cleanlab API ###
# Create agent with validated model for normal conversation
agent = Agent(
model=cleanlab_model, # Add your wrapped model here
system_prompt=SYSTEM_PROMPT,
tools=[get_payment_schedule, get_total_amount_owed, web_search],
session_manager=FileSessionManager(session_id=uuid.uuid4().hex), # Persist chat history
callback_handler=None, # Optionally add a callback handler for logging
)
### New code to add for Cleanlab API ###
cleanlab_model.set_agent_reference(agent)
### End of new code to add for Cleanlab API ###
Scenario 1: Conversational Chat Response
Let’s start with a basic agent interaction without tools.
The CleanlabModel wrapper validates the response in real time.
Optional: Helper method to run the agent and print Cleanlab validation results
def display_validation_results(final_output, initial_llm_response, validation_result, query):
    """Helper function to display Strands Agent validation results with consistent formatting"""
print("-" * 30)
print("Response Delivered to User:")
print("-" * 30)
print()
print(final_output)
print()
print()
print("=== Internal Trace (not shown to user) ===")
print()
if validation_result:
# Group core detection metrics
should_guardrail = validation_result.get('should_guardrail', False)
escalated_to_sme = validation_result.get('escalated_to_sme', False)
is_bad_response = validation_result.get('is_bad_response', False)
expert_answer_available = bool(validation_result.get('expert_answer'))
print("-" * 30)
if should_guardrail or expert_answer_available:
print(f"Original AI Response (not delivered to user):")
print("-" * 30)
print()
print(initial_llm_response)
print()
else:
print(f"Original AI Response:")
print("-" * 30)
print()
print("[Same as \"Response Delivered to User\"]")
print()
print("-" * 30)
print("Cleanlab Analysis:")
print("-" * 30)
print()
print(f"Should Guardrail: {should_guardrail}")
print(f"Escalated to SME: {escalated_to_sme}")
print(f"Is Bad Response: {is_bad_response}")
print(f"Expert Answer Available: {expert_answer_available}")
# Show evaluation scores if available
eval_scores = validation_result.get('eval_scores', None)
if eval_scores is not None:
print()
# Access trustworthiness score
if 'trustworthiness' in eval_scores:
trust_score = eval_scores['trustworthiness'].get("score", None)
if trust_score is not None:
print(f"Trustworthiness: {trust_score:.3f} (triggered_guardrail = {eval_scores['trustworthiness'].get('triggered_guardrail', False)})")
# Access response helpfulness score
if 'response_helpfulness' in eval_scores:
help_score = eval_scores['response_helpfulness'].get("score", None)
if help_score is not None:
print(f"Response Helpfulness: {help_score:.3f} (triggered_guardrail = {eval_scores['response_helpfulness'].get('triggered_guardrail', False)})")
# Access context sufficiency score (for retrieval scenarios)
if 'context_sufficiency' in eval_scores:
context_score = eval_scores['context_sufficiency'].get("score", None)
if context_score is not None:
print(f"Context Sufficiency: {context_score:.3f} (triggered_guardrail = {eval_scores['context_sufficiency'].get('triggered_guardrail', False)})")
# Show expert answer if available
if expert_answer_available:
print()
print("-" * 30)
print("Expert Answer Available:")
print("-" * 30)
print()
print(validation_result.get('expert_answer'))
print()
# Show validation status summary
if should_guardrail or is_bad_response or expert_answer_available:
print()
if expert_answer_available:
print("💡 EXPERT ANSWER AVAILABLE: Expert answer was available and delivered to user instead of Original AI Response")
elif should_guardrail:
print("⚠️ GUARDRAIL TRIGGERED: Original AI Response was blocked and a fallback response was delivered to user")
if escalated_to_sme and (not expert_answer_available) and (not should_guardrail):
print("🔄 ESCALATED: This case was flagged as problematic for subject matter expert review in the Cleanlab Project Interface")
else:
print()
print("✅ VALIDATION PASSED: Original AI Response delivered to user")
else:
print("No validation results available")
def run_with_validation(agent: Agent, query: str):
# Prompt the agent and get response
result = agent(query)
# Display results with validation details
final_output = str(result)
validation_results = agent.state.get('cleanlab_validation_results')
initial_llm_response = agent.state.get('initial_model_response')
initial_llm_response = initial_llm_response[0].get('text', str(initial_llm_response))
    display_validation_results(final_output, initial_llm_response, validation_results, query)
run_with_validation(agent, "What is a credit score?")
Without Cleanlab: The agent would deliver its response directly to the user without any validation or safety checks.
With Cleanlab: The above response is automatically validated for trustworthiness and helpfulness before reaching the user. In this case, Cleanlab found the response trustworthy, so it allowed the original response to be delivered to the user.
Scenario 2: Tool Call Response
Now let’s test an agent interaction that uses tools. Cleanlab validation checks both tool usage and the final response.
run_with_validation(agent, "What is the payment schedule for account ID 12345?")
Without Cleanlab: The agent would deliver its tool-based response directly to the user without validation.
With Cleanlab: The response is validated even when tools are used. Cleanlab evaluated both which tools were called and the final response, found them highly trustworthy, and delivered the original response to the user.
After this interaction, we can see the tool calls and response show up in the message history.
agent.messages[-4:]
Scenario 3: Bad AI Response
When an Agent calls an incorrect tool or summarizes problematic information returned from the tool call, Cleanlab automatically:
- Detects the problematic response
- Blocks it from reaching the user
- Substitutes a safe fallback response
- Cleans message history to remove problematic tool calls
Let’s see this in action:
run_with_validation(agent, "How much do I owe on my loan for account ID 12345?")
After this chat turn, we see the message history is updated only with the user query and final agent response.
agent.messages[-6:]
Without Cleanlab: The user would receive the problematic response: “The total amount owed on your loan for account ID 12345 is $7,000,000,000,000,000,000,000,000,000,000,000,000,000…” - clearly an unrealistic and harmful amount that could confuse or alarm the user.
With Cleanlab: Cleanlab’s validation detects the unrealistic amount, assigns a very low trustworthiness score, blocks the problematic response, removes the bad tool interaction from the message history, and delivers a configurable fallback response to the user.
After this interaction, we can see the conversation history is updated with the safe fallback response.
Scenario 4: Expert Answer Response
After setting up the project in the Cleanlab UI, you can add expert answers to common queries, which can be deterministically returned to the user instead of the Agent’s response.
Consider the following user query:
run_with_validation(agent, "How do I add a payee to my Mortgagelender loan? Give me specific steps for Mortgagelender website.")
Since we did not give our Agent specific context on how to perform this action, the original LLM response is hallucinated.
As expected, the Trustworthiness score is low and the (query, answer) pair is marked as an Issue in the Web UI.
Consider adding an expert answer for the question above with the proper steps, such as:
1. Open the Mortgagelender site or app and sign into your profile.
2. Go to the section where you handle billing or transfer details.
3. Look for an option to set up a new recipient for payments.
4. Fill in the recipient’s required details (name, account info, etc.).
5. Confirm the details and complete the setup.
6. Wait for a notice or email confirming the payee has been linked.
Now, when we re-run the exact same query, the expert answer is used, immediately improving the accuracy of the Agent’s responses.
run_with_validation(agent, "How do I add a payee to my Mortgagelender loan? Give me specific steps for Mortgagelender website.")
Scenario 5: Information Retrieval Tool Call Response
Now let’s ask a question that requires our Agent to use web search, which we specified in our context_retrieval_tools list.
What happens with context-aware validation:
- Tool results are automatically passed to Cleanlab as context
- Cleanlab can evaluate whether the AI response is grounded in the retrieved information, represented with the Context Sufficiency score
- You’ll see a “Retrieved Context” section in the Cleanlab Project UI showing what information was available for validation
run_with_validation(agent, "What are current mortgage interest rates?")
Context is now automatically extracted from the web search tool result and passed to Cleanlab validation, improving evaluation accuracy for information retrieval scenarios.
agent.messages[-3:] # Last 3 messages to see web search tool call and context
How Cleanlab Validation Works
Cleanlab evaluates AI responses across multiple dimensions (trustworthiness, helpfulness, reasoning quality, etc.) and provides scores, guardrail decisions, and expert remediation.
For detailed information on Cleanlab’s validation methodology, refer to the Cleanlab documentation.
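To inspect these signals yourself, note that the CleanlabModel wrapper stores the validation results in agent state after each turn. Below is a minimal sketch of reading them, using the same field names as the display helper earlier in this tutorial; treat the exact structure as version-dependent.
# Minimal sketch: read the validation results stored by the CleanlabModel wrapper.
# Field names mirror the display helper above; exact structure may vary by cleanlab-codex version.
validation_result = agent.state.get('cleanlab_validation_results')
if validation_result:
    print("Should Guardrail:", validation_result.get('should_guardrail', False))
    print("Escalated to SME:", validation_result.get('escalated_to_sme', False))
    print("Expert Answer:", validation_result.get('expert_answer'))
    for eval_name, eval_data in (validation_result.get('eval_scores') or {}).items():
        print(f"{eval_name}: score={eval_data.get('score')}, triggered_guardrail={eval_data.get('triggered_guardrail', False)}")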
Message History Management
When Cleanlab detects a problematic response that involved tool calls, it performs the following cleanup:
- Identifies the problematic turn: Finds the conversation turn that produced the bad response
- Removes tool calls: Eliminates the assistant message containing tool calls from history
- Removes tool results: Eliminates the corresponding tool result messages from history
- Preserves user messages: Keeps user queries to maintain conversation context
- Adds clean response: Adds the safe fallback or expert answer to history
This prevents the problematic tool information from contaminating future conversation turns.
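To make this concrete, here is a schematic sketch of the cleanup for the Scenario 3 turn above. The message contents are abbreviated for illustration; inspect agent.messages to see the actual Strands message schema.
# Schematic sketch of the history cleanup for a turn that triggered the guardrail (abbreviated).
#
# Before cleanup, the problematic turn contains:
#   user:      "How much do I owe on my loan for account ID 12345?"
#   assistant: tool call to get_total_amount_owed        # removed
#   tool result from get_total_amount_owed               # removed
#   assistant: "<problematic response>"                  # replaced
#
# After cleanup, only the user query and the safe response remain:
#   user:      "How much do I owe on my loan for account ID 12345?"
#   assistant: FALLBACK_RESPONSE (or an expert answer, if one is available)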
Specifying Context Handling in More Detail
If you want more control over how context is passed to Cleanlab, it’s recommended to create a custom CleanlabModel subclass and override the cleanlab_get_validate_fields method with custom logic that extracts context from tool results and includes it in validation.
from typing import Any
from strands.types.content import Messages
from cleanlab_codex.experimental.strands import CleanlabModel
from cleanlab_codex.experimental.strands.cleanlab_model import get_latest_user_message_content
def custom_get_context_function(messages: Messages) -> str:
# Define your custom context extraction logic here
return "your context extraction logic"
class CleanlabModelWithContext(CleanlabModel):
def __init__(self, **init_args) -> None:
super().__init__(**init_args)
def cleanlab_get_validate_fields(self, messages: Messages) -> dict[str, Any]:
"""Extract fields from messages for cleanlab validation (overridden to also return context)."""
user_message_content = get_latest_user_message_content(messages)
context = custom_get_context_function(messages) # User defined function to extract context
return {
"query": user_message_content,
"context": context,
}
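You can then wire up the subclass exactly as shown earlier, in place of the plain CleanlabModel. Below is a minimal usage sketch, reusing the base_model, project, SYSTEM_PROMPT, and FALLBACK_RESPONSE defined above (context_retrieval_tools is omitted here since the override supplies the context itself).
# Minimal usage sketch: swap the subclass in for CleanlabModel (reuses objects defined earlier).
cleanlab_model_with_context = CleanlabModelWithContext(
    underlying_model=base_model,
    cleanlab_project=project,
    fallback_response=FALLBACK_RESPONSE,
)
agent = Agent(model=cleanlab_model_with_context, system_prompt=SYSTEM_PROMPT, tools=[web_search])
cleanlab_model_with_context.set_agent_reference(agent)  # Remember to set the agent reference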
What’s different if I’m using Amazon Bedrock models with Strands?
The CleanlabModel wrapper works with any Strands-compatible model provider. To use Amazon Bedrock:
from strands.models.bedrock import BedrockModel
# Create Bedrock model
base_model = BedrockModel(
model_id="anthropic.claude-3-sonnet-20240229-v1:0",
params={"temperature": 0.1}
)
# Wrap with CleanlabModel
cleanlab_model = CleanlabModel(
underlying_model=base_model,
cleanlab_project=project,
)
# Use exactly as shown in the examples above
agent = Agent(model=cleanlab_model, tools=[...])
cleanlab_model.set_agent_reference(agent) # Remember to set agent reference
The validation behavior and message history management work identically across all model providers.
Summary
This tutorial demonstrated integrating Cleanlab validation with AWS Strands Agents using the CleanlabModel wrapper.
Key benefits:
- Real-time validation during response generation
- Automatic remediation with expert answers and fallbacks
- Message history cleanup to prevent contamination
- Context-aware validation for retrieval-based agents
- Multi-model support (OpenAI, Anthropic, Amazon Bedrock, etc.)
The CleanlabModel wrapper provides enterprise-grade safety with minimal code changes.