Expert Guidance

Expert Guidance allows Subject Matter Experts (SMEs) to improve their AI and patch agent failures by providing natural language feedback. Expert Guidance is particularly valuable for remediating failures by tool-calling AI agents, where “what good looks like” for a given query is often dynamic and personalized (e.g., adjusting by time, user details, or long-running context).

How Expert Guidance Works

Expert Guidance takes a two-step approach:

  1. Human Input via Natural Language: SMEs review Question/Response pairs and provide natural language explanations describing what went wrong and how the AI should behave differently.

  2. AI-Optimized Translation: Cleanlab automatically translates these SME explanations into optimized “Guidance” that:

    • Is rewritten and formatted for optimal AI performance
    • Is served as context to the AI when relevant
    • Addresses the specific failure case and generalizes to similar classes of failures

This translation process allows humans to freely instruct or teach the AI about what it did wrong and how it can do better, without requiring them to think about optimal prompting, logic, or technical implementation details. The guidance is automatically optimized and injected into your AI’s prompt to improve its behavior for similar scenarios.

Example: Airline Support Assistant

Consider an airline support agent that fails to search multiple airports when users ask for flights:

  • Query: Find me the shortest flight from the bay to miami for tmrw
  • Original Failing Response: The shortest flight from the Bay Area (SFO) to Miami (MIA) for tomorrow is flight WN5613…
  • Human Explanation/Guidance: Bad: The Bay also means SFO, OAK, SJC. Same for Miami: include MIA, FLL, PBI. Don’t just pick one airport, show the user all their different options
  • Desired Response: Here are the shortest nonstop economy flights from Bay Area airports (SFO, OAK, SJC) to Miami-area airports (MIA, FLL, PBI) for tomorrow: [shows all options]

When the SME provides this explanation, Cleanlab generates Expert Guidance that instructs the AI to:

  • Expand city names to include all relevant airports
  • Present multiple flight options from all relevant airports
  • Allow users to choose from comprehensive options

This single guidance entry then improves not just the original query, but similar queries like:

  • “Find me the shortest flight from nyc to la for tmrw”
  • “I want the cheapest roundtrip flight from chicago to dallas leaving this Sat and coming back next Sun”

SME Experience: Providing Expert Guidance

SMEs can provide Expert Guidance directly from the Expert Workspace or Logs when reviewing Question/Response pairs.

AI Guidance Generation Interface

Step 1: Review the Issue

When reviewing an Issue or Log entry, SMEs see:

  • The user’s original query
  • The AI’s problematic response
  • Relevant context and metadata

Step 2: Assess the Response

SMEs are prompted: “Is this response acceptable?” with three options:

  • Yes → Marks the response Good (similar Q/R pairs will not be guardrailed)
  • No → Marks the response Bad (similar Q/R pairs will be guardrailed). The SME can optionally:
    • Add an explanation → generates an Expert Guidance remediation
    • Provide a correct answer → creates an Expert Answer remediation
  • Skip + Mark as Addressed → Closes the Issue without judging the response (uses the existing manual “mark as addressed” workflow)

Step 3: Provide Guidance or Expert Answer

When an SME selects No (Bad response), they can optionally:

  1. Add an Explanation → Generates an Expert Guidance remediation

    • The SME provides a natural language explanation describing what went wrong and how the AI should behave differently
    • Cleanlab automatically generates optimized “Guidance” from this explanation
    • The SME can review the generated Guidance and click Submit if it captures their intent
  2. Provide a Correct Answer → Creates an Expert Answer remediation

    • This creates a deterministic answer that will be served for similar queries

Example: Seat Change Request

Here’s another example showing how Expert Guidance helps with more complex scenarios:

  • Query: can you change my seat to a window seat? my confirmation number is 57171 and my name is Charles M
  • Original Failing Response: I have changed your seat to a seat with a power outlet, but it is not a window seat. The seat change is confirmed…
  • Human Explanation/Guidance: Bad: The AI prematurely confirmed a seat change that didn’t match the user’s request (no window). This breaks trust and could incur unwanted fees. For seat-change requests, the AI should: 1. Acknowledge the request and confirm booking details. 2. Clarify preferences (standard vs. extra legroom/exit row, willingness to pay). 3. List actual available options with seat type + fee, if any. 4. Ask the user to choose before committing. Never auto-confirm a seat assignment.
  • Desired Response: I’ve pulled up booking 57171 for Charles M. I can check available window seats for you. Here are the current options:
    • 14A: Standard window, no fee
    • 20F: Window, extra legroom (+$32)
    • 21A: Window, exit row (+$45)
    Would you like me to switch you to one of these, or keep your current seat?

This guidance then improves similar queries like:

  • “Are there any aisle seats for me? I want to switch over to one”
  • “Can you move me to sit next to my travel companion on booking 57172?”

Managing Expert Guidance

Submitted Expert Guidance entries can be managed from the Expert Answers page, which now includes separate tabs for:

  • Expert Guidance - Remediations that provide advice for the AI
  • Expert Reviews - Good/Bad annotations for a response
  • Expert Answers - Remediations that specify a verbatim answer to serve

From the Expert Answers page, you can view and manage Expert Guidance entries. Each entry includes fields for Status (Active, Paused), Query (the query pattern this guidance applies to), Guidance (the guidance text that will be injected), Times Served (usage counter), and metadata (Last Edited, Last Edited By, Date Created).

Expert Guidance can be created from Issues/Logs when reviewing failure cases, or manually from the Expert Answers page. All fields can be modified at any time.
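
To make the entry fields above concrete, here is a minimal Python sketch of how an application might mirror one Expert Guidance entry locally, for example in an audit log. The class name, attribute names, and the edit() helper are illustrative assumptions; they are not Cleanlab's schema or client API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


def _now() -> datetime:
    """Current time in UTC (used for the metadata timestamps)."""
    return datetime.now(timezone.utc)


class Status(Enum):
    ACTIVE = "Active"
    PAUSED = "Paused"


@dataclass
class GuidanceEntry:
    """Hypothetical local mirror of one Expert Guidance entry."""
    query: str                      # query pattern this guidance applies to
    guidance: str                   # guidance text that will be injected
    status: Status = Status.ACTIVE  # Active or Paused
    times_served: int = 0           # usage counter
    last_edited_by: str = ""
    date_created: datetime = field(default_factory=_now)
    last_edited: datetime = field(default_factory=_now)

    def edit(self, editor: str, guidance: Optional[str] = None,
             status: Optional[Status] = None) -> None:
        """Update editable fields and refresh the edit metadata."""
        if guidance is not None:
            self.guidance = guidance
        if status is not None:
            self.status = status
        self.last_edited_by = editor
        self.last_edited = _now()


entry = GuidanceEntry(
    query="Find me the shortest flight from the bay to miami for tmrw",
    guidance="Expand city names to include all relevant airports.",
)
entry.edit("sme@example.com", status=Status.PAUSED)  # hypothetical editor id
print(entry.status.value, entry.times_served)
```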

Integration: Using Expert Guidance in Your Application

Expert Guidance is integrated into your AI application through the consult() API, which runs pre-generation to dynamically inject guidance into your AI’s prompt.

Key Integration Points

  1. consult() API - Pre-Generation Check

    • Called before LLM generation
    • Checks if the user query matches any Expert Guidance entries (via scenario-based matching)
    • Returns Expert Guidance if available (and no Expert Answer exists for that query)
    • Note: Expert Answers take priority - if there’s an Expert Answer for a similar query, Expert Guidance is not applied
  2. Dynamic Prompt Injection

    • The Expert Guidance is injected into your AI’s prompt (either as system instructions or appended to the prompt)
    • This happens pre-generation, allowing the AI to consider the guidance when generating its response
  3. validate() API - Post-Generation Validation

    • Still runs after generation to ensure the response passes configured guardrails
    • This ensures that even with Expert Guidance, the response still meets your safety standards
    • Guardrails can still block responses if they fail to meet criteria
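
The three integration points above can be sketched as a single request handler. This is a rough Python sketch, not Cleanlab's client code: the consult, generate, and validate arguments are hypothetical callables standing in for the consult() API, your own LLM call, and the validate() API, and their signatures here are assumptions. The fallback message returned when guardrails block a response is also an assumption.

```python
from typing import Callable, Optional


def answer_query(
    query: str,
    base_system_prompt: str,
    consult: Callable[[str], Optional[str]],   # stand-in: returns guidance text or None
    generate: Callable[[str, str], str],       # stand-in: (system_prompt, query) -> response
    validate: Callable[[str, str], bool],      # stand-in: (query, response) -> passes guardrails?
    fallback: str = "Sorry, I can't help with that right now.",
) -> str:
    # 1. Pre-generation check: look for matching Expert Guidance.
    guidance = consult(query)

    # 2. Dynamic prompt injection: here the guidance is appended to the
    #    system instructions (one of the two placements mentioned above).
    system_prompt = base_system_prompt
    if guidance:
        system_prompt = f"{base_system_prompt}\n\nExpert guidance:\n{guidance}"

    # 3. Generate the response with the (possibly augmented) prompt.
    response = generate(system_prompt, query)

    # 4. Post-generation validation: guardrails can still block the response.
    if not validate(query, response):
        return fallback
    return response


# Usage with toy stand-ins (no network calls, no real LLM).
def fake_consult(query: str) -> Optional[str]:
    if "flight" in query:
        return "Expand city names to include all relevant airports."
    return None


def fake_generate(system_prompt: str, query: str) -> str:
    return f"[prompt has guidance: {'Expert guidance' in system_prompt}]"


print(answer_query(
    "shortest flight from the bay to miami",
    "You are an airline support assistant.",
    fake_consult,
    fake_generate,
    lambda query, response: True,
))
```

Passing the three steps in as callables keeps the sketch independent of any particular SDK; in a real integration each stand-in would wrap the corresponding API call.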

Priority Order

With the Cleanlab API properly integrated, your AI application should serve responses to users, and guard against bad or untrustworthy ones, in this order:

  1. Expert Answer (highest priority) - If an Expert Answer exists for a similar query, it is served instead of a generated response
  2. Expert Guidance - If no Expert Answer exists, Expert Guidance is applied pre-generation. The generated response is still validated against guardrails.
  3. Standard Generation - If neither exists, generate a standard response from your AI, and it will be validated against Cleanlab’s guardrails
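
The priority order above amounts to a three-way branch. The sketch below is one way to express it in Python under stated assumptions: Remediations is a hypothetical container for whatever a pre-generation lookup found, generate and passes_guardrails are stand-ins for your LLM call and the post-generation validation, and the "blocked" outcome with a fallback message is an assumed way of handling a guardrail failure.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Remediations:
    """Hypothetical result of the pre-generation lookup for one query."""
    expert_answer: Optional[str] = None
    expert_guidance: Optional[str] = None


def respond(
    query: str,
    found: Remediations,
    generate: Callable[[str, Optional[str]], str],   # (query, guidance or None) -> response
    passes_guardrails: Callable[[str, str], bool],   # (query, response) -> bool
    fallback: str = "Sorry, I can't help with that right now.",
) -> Tuple[str, str]:
    """Return (source, response), following the documented priority order."""
    # 1. An Expert Answer wins outright and is served verbatim.
    if found.expert_answer is not None:
        return "expert_answer", found.expert_answer

    # 2. Otherwise generate with Expert Guidance if any exists,
    # 3. or fall back to standard generation.
    if found.expert_guidance is not None:
        source = "expert_guidance"
    else:
        source = "standard_generation"
    response = generate(query, found.expert_guidance)

    # Generated responses (guided or not) are still checked by guardrails.
    if not passes_guardrails(query, response):
        return "blocked", fallback
    return source, response


# Toy stand-ins for illustration only.
def toy_generate(query: str, guidance: Optional[str]) -> str:
    return "guided response" if guidance else "plain response"


print(respond("q", Remediations(expert_guidance="Ask before committing."),
              toy_generate, lambda q, r: True))
```

Returning the source alongside the response makes it easy to log which path served each query.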

Best Practices

Writing Effective Explanations

When providing explanations that will become Expert Guidance:

  1. Be Specific About the Problem: Clearly describe what went wrong with actionable feedback

    • Good: “The AI prematurely confirmed a seat change that didn’t match the user’s request”
    • Avoid: “The response was bad”
  2. Describe Desired Behavior: Explain how the AI should behave differently

    • Good: “For seat-change requests, the AI should: 1. Acknowledge the request. 2. Clarify preferences. 3. List available options. 4. Ask the user to choose before committing.”
    • Avoid: “Just ask the user first”
  3. Think About Classes of Failures: Consider what similar queries might benefit from this guidance

    • The guidance should generalize to similar scenarios, not just the specific case

When to Use Expert Guidance vs Expert Answers

Use Expert Guidance when:

  • The correct response depends on dynamic factors (user preferences, time, context)
  • You want the AI to generate a response, but with better instructions
  • Multiple similar queries would benefit from the same guidance
  • The response needs to adapt to user choices or follow-up questions

Use Expert Answers when:

  • You want a deterministic, verbatim response for specific queries, across all users
  • The answer is always the same regardless of context
  • You want to bypass AI generation entirely for certain queries