Expert Guidance

Expert Guidance allows Subject Matter Experts (SMEs) to improve their AI and patch agent failures by providing natural language feedback. Expert Guidance is particularly valuable for remediating failures by tool-calling AI agents, where “what good looks like” for a given query is often dynamic and personalized (e.g., adjusting by time, user details, or long-running context).

How Expert Guidance Works

Expert Guidance takes a two-step approach:

1. Human Input via Natural Language: SMEs review Question/Response pairs and provide natural language explanations describing what went wrong and how the AI should behave differently.
2. AI-Optimized Translation: Cleanlab automatically translates these SME explanations into optimized “Guidance” that:
- Is rewritten and formatted for optimal AI performance
- Is served as context to the AI when relevant
- Addresses the specific failure case and generalizes to similar classes of failures

This translation process allows humans to freely instruct or teach the AI about what it did wrong and how it can do better, without requiring them to think about optimal prompting, logic, or technical implementation details. The guidance is automatically optimized and injected into your AI’s prompt to improve its behavior for similar scenarios.

Example: Airline Support Assistant

Consider an airline support agent that fails to search multiple airports when users ask for flights:

Query	Original Failing Response	Human Explanation/Guidance	Desired Response
Find me the shortest flight from the bay to miami for tmrw	The shortest flight from the Bay Area (SFO) to Miami (MIA) for tomorrow is flight WN5613…	Bad: The Bay also means SFO, OAK, SJC. Same for Miami — include MIA, FLL, PBI. Don’t just pick one airport, show the user all their different options	Here are the shortest nonstop economy flights from Bay Area airports (SFO, OAK, SJC) to Miami-area airports (MIA, FLL, PBI) for tomorrow: [shows all options]

When the SME provides this explanation, Cleanlab generates Expert Guidance that instructs the AI to:

Expand city names to include all relevant airports
Present multiple flight options from all relevant airports
Allow users to choose from comprehensive options

This single guidance entry then improves not just the original query, but similar queries like:

“Find me the shortest flight from nyc to la for tmrw”
“I want the cheapest roundtrip flight from chicago to dallas leaving this Sat and coming back next Sun”
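To make the intended behavior concrete, here is a minimal sketch of the airport expansion this guidance asks the agent to perform. The lookup table and both function names are hypothetical; only the airport codes come from the example above. A real agent would presumably get this mapping from its flight-search tool rather than a hard-coded dictionary.

```python
from itertools import product

# Hypothetical metro-area lookup; the airport codes are those named in the example above.
METRO_AIRPORTS: dict[str, list[str]] = {
    "bay area": ["SFO", "OAK", "SJC"],
    "miami": ["MIA", "FLL", "PBI"],
}


def expand_airports(place: str) -> list[str]:
    """Return every airport code for a known metro area, else treat the input as a code."""
    key = place.strip().lower()
    return METRO_AIRPORTS.get(key, [place.strip().upper()])


def route_candidates(origin: str, destination: str) -> list[tuple[str, str]]:
    """All origin/destination airport pairs the agent should search."""
    return list(product(expand_airports(origin), expand_airports(destination)))


pairs = route_candidates("Bay Area", "Miami")
print(len(pairs))  # 9 pairs: 3 Bay Area airports x 3 Miami-area airports
print(pairs[0])    # ('SFO', 'MIA')
```

Searching every pair, rather than one default airport per city, is what lets the agent show the user all of their options.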

SME Experience: Providing Expert Guidance

SMEs can provide Expert Guidance directly from the Expert Workspace or Logs when reviewing Question/Response pairs.

AI Guidance Generation Interface

Step 1: Review the Issue

When reviewing an Issue or Log entry, SMEs see:

The user’s original query
The AI’s problematic response
Relevant context and metadata

Step 2: Assess the Response

SMEs are prompted: “Is this response acceptable?” with three options:

Yes → Marks the response Good (similar Q/R pairs will not be guardrailed)
No → Marks the response Bad (similar Q/R pairs will be guardrailed). The SME can optionally:
- Add an explanation → generates an Expert Guidance remediation
- Provide a correct answer → creates an Expert Answer remediation
Skip + Mark as Addressed → Closes the Issue without judging the response (uses the existing manual “mark as addressed” workflow)

Step 3: Provide Guidance or Expert Answer

When an SME selects No (Bad response), they can optionally:

Add an Explanation → Generates an Expert Guidance remediation
- The SME provides a natural language explanation describing what went wrong and how the AI should behave differently
- Cleanlab automatically generates optimized “Guidance” from this explanation
- The SME can review the generated Guidance and click Submit if it captures their intent
Provide a Correct Answer → Creates an Expert Answer remediation
- This creates a deterministic answer that will be served for similar queries
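The review flow in Steps 2 and 3 can be summarized as a small decision function. This is an illustrative sketch only: `Verdict`, `ReviewOutcome`, and `resolve_review` are hypothetical names, not part of any Cleanlab API, and the sketch assumes an SME may supply an explanation, a correct answer, or both, which the text above does not confirm.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    YES = "yes"    # response is acceptable
    NO = "no"      # response is not acceptable
    SKIP = "skip"  # skip + mark as addressed


@dataclass
class ReviewOutcome:
    label: Optional[str]     # "Good", "Bad", or None when the response is not judged
    remediations: list[str]  # remediation types created by this review


def resolve_review(
    verdict: Verdict,
    explanation: Optional[str] = None,
    correct_answer: Optional[str] = None,
) -> ReviewOutcome:
    """Map an SME's review choices to the label and remediations described above."""
    if verdict is Verdict.YES:
        return ReviewOutcome(label="Good", remediations=[])
    if verdict is Verdict.SKIP:
        # The Issue is closed without judging the response.
        return ReviewOutcome(label=None, remediations=[])
    remediations: list[str] = []
    if explanation:
        remediations.append("Expert Guidance")
    if correct_answer:
        remediations.append("Expert Answer")
    return ReviewOutcome(label="Bad", remediations=remediations)


outcome = resolve_review(Verdict.NO, explanation="Search all Bay Area airports.")
print(outcome.label, outcome.remediations)
```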

Example: Seat Change Request

Here’s another example showing how Expert Guidance helps with more complex scenarios:

Query	Original Failing Response	Human Explanation/Guidance	Desired Response
can you change my seat to a window seat? my confirmation number is 57171 and my name is Charles M	I have changed your seat to a seat with a power outlet, but it is not a window seat. The seat change is confirmed…	Bad: The AI prematurely confirmed a seat change that didn’t match the user’s request (no window). This breaks trust and could incur unwanted fees. For seat-change requests, the AI should: 1. Acknowledge the request and confirm booking details. 2. Clarify preferences (standard vs. extra legroom/exit row, willingness to pay). 3. List actual available options with seat type + fee, if any. 4. Ask the user to choose before committing. Never auto-confirm a seat assignment.	I’ve pulled up booking 57171 for Charles M. I can check available window seats for you. Here are the current options: – 14A: Standard window, no fee – 20F: Window, extra legroom (+$32) – 21A: Window, exit row (+$45) Would you like me to switch you to one of these, or keep your current seat?

This guidance then improves similar queries like:

“Are there any aisle seats for me? I want to switch over to one”
“Can you move me to sit next to my travel companion on booking 57172?”
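The "list options, then wait for a choice" behavior this guidance describes might look like the sketch below on the tool side. `SeatOption`, `format_seat_options`, and `handle_seat_change` are hypothetical names, and the seat data simply mirrors the desired response in the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SeatOption:
    seat: str
    description: str
    fee_usd: int


# Seat data copied from the desired response in the example above.
OPTIONS = [
    SeatOption("14A", "Standard window", 0),
    SeatOption("20F", "Window, extra legroom", 32),
    SeatOption("21A", "Window, exit row", 45),
]


def format_seat_options(options: list[SeatOption]) -> str:
    """Render one line per seat with its type and fee, if any."""
    lines = []
    for option in options:
        fee = "no fee" if option.fee_usd == 0 else f"+${option.fee_usd}"
        lines.append(f"{option.seat}: {option.description} ({fee})")
    return "\n".join(lines)


def handle_seat_change(options: list[SeatOption], chosen_seat: Optional[str]) -> str:
    """Never commit a seat until the user has picked one of the offered options."""
    if chosen_seat is None:
        return "Here are the current options:\n" + format_seat_options(options)
    if chosen_seat not in {option.seat for option in options}:
        return f"Seat {chosen_seat} is not available. Please choose one of the listed seats."
    return f"Seat changed to {chosen_seat}."


print(handle_seat_change(OPTIONS, None))
print(handle_seat_change(OPTIONS, "14A"))
```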

Managing Expert Guidance

Submitted Expert Guidance entries can be managed from the Expert Answers page, which now includes separate tabs for:

Expert Guidance - Remediations that provide advice for the AI
Expert Reviews - Good/Bad annotations for a response
Expert Answers - Remediations that specify a verbatim answer to serve

From the Expert Answers page, you can view and manage Expert Guidance entries. Each entry includes fields for Status (Active, Paused), Query (the query pattern this guidance applies to), Guidance (the guidance text that will be injected), Times Served (usage counter), and metadata (Last Edited, Last Edited By, Date Created).

Expert Guidance can be created from Issues/Logs when reviewing failure cases, or manually from the Expert Answers page. All fields can be modified at any time.
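As a rough mental model of an entry, the fields listed above could be represented as follows. This is a sketch, not Cleanlab's actual schema: the class and attribute names are hypothetical, and the assumption that only Active entries are served is an inference from the Active/Paused status rather than something stated here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class GuidanceStatus(Enum):
    ACTIVE = "Active"
    PAUSED = "Paused"


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class ExpertGuidanceEntry:
    query: str     # the query pattern this guidance applies to
    guidance: str  # the guidance text that will be injected
    status: GuidanceStatus = GuidanceStatus.ACTIVE
    times_served: int = 0  # usage counter
    last_edited_by: str = ""
    date_created: datetime = field(default_factory=_now)
    last_edited: datetime = field(default_factory=_now)


def servable(entries: list[ExpertGuidanceEntry]) -> list[ExpertGuidanceEntry]:
    """Assumption: paused entries are kept but not served."""
    return [entry for entry in entries if entry.status is GuidanceStatus.ACTIVE]


entries = [
    ExpertGuidanceEntry("flights from a multi-airport city", "Search all relevant airports."),
    ExpertGuidanceEntry("seat change", "List options before committing.", status=GuidanceStatus.PAUSED),
]
print([entry.query for entry in servable(entries)])
```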

Integration: Using Expert Guidance in Your Application

Expert Guidance is integrated into your AI application through the consult() API, which runs pre-generation to dynamically inject guidance into your AI’s prompt.

Key Integration Points

consult() API - Pre-Generation Check
- Called before LLM generation
- Checks if the user query matches any Expert Guidance entries (via scenario-based matching)
- Returns Expert Guidance if available (and no Expert Answer exists for that query)
- Note: Expert Answers take priority - if there’s an Expert Answer for a similar query, Expert Guidance is not applied

Dynamic Prompt Injection

The Expert Guidance is appended to the user prompt
This happens pre-generation, allowing the AI to consider the guidance when generating its response

Suggested prompt injection

def update_prompt_with_guidance(prompt: str, guidance_items: list[str]) -> str:
    """Append matched Expert Guidance to the prompt; return it unchanged if there is none."""
    if not guidance_items:
        return prompt
    # Wrap each guidance entry in its own tag so the model can tell entries apart.
    guidance_block = "\n\n".join(f"<guidance>\n{g}\n</guidance>" for g in guidance_items)
    # Tell the model to apply an entry only when its "if" scenario matches the query.
    guidance_instruction = (
        'Consider the following guidance and whether each "if" scenario seems relevant here.\n'
        'When the "if" scenario seems relevant, follow the guidance\'s specified behavior exactly (including any wording suggestions).\n'
        "Otherwise ignore that guidance if it does not seem relevant here."
    )
    return f"{prompt}\n\n<advice_to_consider>\n{guidance_instruction}\n\n{guidance_block}\n\n</advice_to_consider>"
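A short usage sketch for the helper above. The function is repeated so the snippet runs on its own; the query and guidance strings are made-up illustrations, not output from Cleanlab.

```python
def update_prompt_with_guidance(prompt: str, guidance_items: list[str]) -> str:
    """Same behavior as the suggested helper above, repeated for a standalone example."""
    if not guidance_items:
        return prompt
    guidance_block = "\n\n".join(f"<guidance>\n{g}\n</guidance>" for g in guidance_items)
    guidance_instruction = (
        'Consider the following guidance and whether each "if" scenario seems relevant here.\n'
        'When the "if" scenario seems relevant, follow the guidance\'s specified behavior exactly (including any wording suggestions).\n'
        "Otherwise ignore that guidance if it does not seem relevant here."
    )
    return f"{prompt}\n\n<advice_to_consider>\n{guidance_instruction}\n\n{guidance_block}\n\n</advice_to_consider>"


# Illustrative inputs only.
user_prompt = "Find me the shortest flight from the bay to miami for tmrw"
guidance = [
    "If the user names a city or region served by several airports, "
    "search all of them and present every option.",
]

final_prompt = update_prompt_with_guidance(user_prompt, guidance)
print(final_prompt)

# With no matching guidance, the prompt passes through untouched.
assert update_prompt_with_guidance(user_prompt, []) == user_prompt
```

Because the guidance is appended after the user prompt, the original query text is left untouched at the start of the final prompt.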

validate() API - Post-Generation Validation
- Still runs after generation to ensure the response passes configured guardrails
- This ensures that even with Expert Guidance, the response still meets your safety standards
- Guardrails can still block responses if they fail to meet criteria

Priority Order

If you have properly integrated the Cleanlab API, your AI application serves responses to users and guards against bad or untrustworthy ones in the following order:

1. Expert Answer (highest priority) - If an Expert Answer exists for a similar query, it is served verbatim in place of a generated response.
2. Expert Guidance - If no Expert Answer exists, matching Expert Guidance is applied pre-generation. The generated response is still validated against guardrails.
3. Standard Generation - If neither exists, your AI generates a standard response, which is validated against Cleanlab’s guardrails.
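The priority order can be sketched as a single request handler. Everything here is an assumption made for illustration: `ConsultResult`, `answer_query`, and the three callables stand in for your own wrappers around consult(), your LLM, and validate(), whose real signatures are not shown in this document. The fallback message served when a guardrail blocks a response is likewise hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ConsultResult:
    """Hypothetical shape of a pre-generation lookup result."""
    expert_answer: Optional[str] = None
    guidance_items: list[str] = field(default_factory=list)


def answer_query(
    query: str,
    consult: Callable[[str], ConsultResult],
    generate: Callable[[str], str],
    validate: Callable[[str, str], bool],
    fallback: str = "Sorry, I can't help with that request.",
) -> str:
    result = consult(query)

    # 1. An Expert Answer takes priority and bypasses generation.
    if result.expert_answer is not None:
        return result.expert_answer

    # 2. Otherwise append any matched Expert Guidance to the prompt.
    #    (In practice, use the suggested prompt-injection helper for this step.)
    prompt = query
    if result.guidance_items:
        prompt = query + "\n\n" + "\n\n".join(result.guidance_items)

    # 3. Generate, then validate against guardrails in either case.
    response = generate(prompt)
    return response if validate(query, response) else fallback


# Stub callables so the sketch runs without any external service.
reply = answer_query(
    "Find me a flight from the bay to miami",
    consult=lambda q: ConsultResult(guidance_items=["Search all relevant airports."]),
    generate=lambda prompt: f"[response generated from {len(prompt)} prompt characters]",
    validate=lambda q, response: True,
)
print(reply)
```

Passing the three steps in as callables keeps the ordering logic testable without network access.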

Best Practices

Writing Effective Explanations

When providing explanations that will become Expert Guidance:

Be Specific About the Problem: Clearly describe what went wrong with actionable feedback
- Good: “The AI prematurely confirmed a seat change that didn’t match the user’s request”
- Avoid: “The response was bad”
Describe Desired Behavior: Explain how the AI should behave differently
- Good: “For seat-change requests, the AI should: 1. Acknowledge the request. 2. Clarify preferences. 3. List available options. 4. Ask the user to choose before committing.”
- Avoid: “Just ask the user first”
Think About Classes of Failures: Consider what similar queries might benefit from this guidance
- The guidance should generalize to similar scenarios, not just the specific case

When to Use Expert Guidance vs Expert Answers

Use Expert Guidance when:

The correct response depends on dynamic factors (user preferences, time, context)
You want the AI to generate a response, but with better instructions
Multiple similar queries would benefit from the same guidance
The response needs to adapt to user choices or follow-up questions

Use Expert Answers when:

You want a deterministic, verbatim response for specific queries, across all users
The answer is always the same regardless of context
You want to bypass AI generation entirely for certain queries
