module cleanlab_codex.response_validation
Validation functions for evaluating LLM responses and determining if they should be replaced with Codex-generated alternatives.
function is_bad_response
is_bad_response(
response: 'str',
context: 'Optional[str]' = None,
query: 'Optional[str]' = None,
config: 'Union[BadResponseDetectionConfig, Dict[str, Any]]' = BadResponseDetectionConfig(fallback_answer='Based on the available information, I cannot provide a complete answer to this question.', fallback_similarity_threshold=70, trustworthiness_threshold=0.5, format_prompt=default_format_prompt, unhelpfulness_confidence_threshold=None, tlm=None)
) → bool
Run a series of checks to determine if a response is bad. If any check detects an issue (i.e. fails), the function returns True, indicating the response is bad.

This function runs three possible validation checks:
1. Fallback check: Detects whether the response is too similar to a known fallback answer.
2. Untrustworthy check: Assesses response trustworthiness based on the given context and query.
3. Unhelpful check: Predicts whether the response adequately answers the query in a useful way.

Note: Each validation check runs conditionally based on whether the required arguments are provided. As soon as any validation check fails, the function returns True.
Args:
- response (str): The response to check.
- context (str, optional): Optional context/documents used for answering. Required for the untrustworthy check.
- query (str, optional): Optional user question. Required for the untrustworthy and unhelpful checks.
- config (BadResponseDetectionConfig, optional): Optional configuration parameters for validation checks. See BadResponseDetectionConfig for details. If not provided, default values are used.

Returns:
bool: True if any validation check fails, False if all pass.
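For illustration, a minimal sketch of running all three checks. The TLM import and construction below are assumptions (the example uses the cleanlab_tlm package; adapt to however your application obtains the TLM instance this module expects):

```python
from cleanlab_codex.response_validation import (
    BadResponseDetectionConfig,
    is_bad_response,
)
from cleanlab_tlm import TLM  # assumed TLM provider; substitute your own setup

tlm = TLM()  # assumption: default construction reads credentials from the environment

config = BadResponseDetectionConfig(
    trustworthiness_threshold=0.6,           # stricter than the 0.5 default
    unhelpfulness_confidence_threshold=0.8,  # confidence gate for the unhelpful check
    tlm=tlm,                                 # required for untrustworthy/unhelpful checks
)

# context, query, and a TLM are all provided, so all three checks can run.
if is_bad_response(
    response="Based on the available information, I cannot provide a complete answer.",
    context="Acme accepts returns within 30 days of purchase with a receipt.",
    query="What is Acme's return policy?",
    config=config,
):
    print("Response flagged as bad; consider a Codex-generated alternative.")
```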
function is_fallback_response
is_fallback_response(
response: 'str',
fallback_answer: 'str' = 'Based on the available information, I cannot provide a complete answer to this question.',
threshold: 'int' = 70
) → bool
Check if a response is too similar to a known fallback answer.
Uses fuzzy string matching to compare the response against a known fallback answer. Returns True if the response is similar enough to the fallback answer to be considered unhelpful.
Args:
- response (str): The response to check.
- fallback_answer (str): A known unhelpful/fallback response to compare against.
- threshold (int): Similarity threshold (0-100) above which a response is considered to match the fallback answer. Higher values require more similarity. The default of 70 means responses that are 70% or more similar are considered bad.

Returns:
bool: True if the response is too similar to the fallback answer, False otherwise.
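A short usage sketch; the example strings and expected outcomes are illustrative, not guaranteed outputs of the fuzzy matcher:

```python
from cleanlab_codex.response_validation import is_fallback_response

# Near-paraphrase of the default fallback answer -- fuzzy matching should flag it.
print(is_fallback_response(
    "I cannot provide a complete answer to this question based on the available information."
))  # expected: True

# A substantive answer should score well below the default threshold of 70.
print(is_fallback_response("Acme accepts returns within 30 days of purchase."))  # expected: False

# Compare against your own fallback string, tightening the match with a higher threshold.
is_fallback_response("Sorry, I don't know.", fallback_answer="Sorry, I don't know.", threshold=90)
```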
function is_untrustworthy_response
is_untrustworthy_response(
response: 'str',
context: 'str',
query: 'str',
tlm: 'TLM',
trustworthiness_threshold: 'float' = 0.5,
format_prompt: 'Callable[[str, str], str]' = default_format_prompt
) → bool
Check if a response is untrustworthy.
Uses TLM to evaluate whether a response is trustworthy given the context and query. Returns True if TLM’s trustworthiness score falls below the threshold, indicating the response may be incorrect or unreliable.
Args:
- response (str): The response from the assistant to check.
- context (str): The context information available for answering the query.
- query (str): The user’s question or request.
- tlm (TLM): The TLM model to use for evaluation.
- trustworthiness_threshold (float): Score threshold (0.0-1.0) under which a response is considered untrustworthy. Lower values allow less trustworthy responses. The default of 0.5 means responses with scores below 0.5 are considered untrustworthy.
- format_prompt (Callable[[str, str], str]): Function that takes (query, context) and returns a formatted prompt string. Provide the prompt-formatting function from your RAG application here so that the response is evaluated using the same prompt that was used to generate it.

Returns:
bool: True if the response is deemed untrustworthy by TLM, False otherwise.
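A sketch of this check, again assuming a TLM instance from the cleanlab_tlm package; my_rag_prompt is a hypothetical template standing in for whatever prompt your RAG app actually uses:

```python
from cleanlab_codex.response_validation import is_untrustworthy_response
from cleanlab_tlm import TLM  # assumed TLM provider

def my_rag_prompt(query: str, context: str) -> str:
    # Hypothetical template -- reuse the exact prompt your RAG app used at generation time.
    return f"Answer the question using only this context.\n\nContext: {context}\n\nQuestion: {query}"

tlm = TLM()
flagged = is_untrustworthy_response(
    response="Acme accepts returns within 90 days.",  # contradicts the context below
    context="Acme accepts returns within 30 days of purchase with a receipt.",
    query="What is Acme's return policy?",
    tlm=tlm,
    trustworthiness_threshold=0.5,
    format_prompt=my_rag_prompt,
)
print(flagged)  # expected: True, since the response conflicts with the context
```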
function is_unhelpful_response
is_unhelpful_response(
response: 'str',
query: 'str',
tlm: 'TLM',
trustworthiness_score_threshold: 'Optional[float]' = None
) → bool
Check if a response is unhelpful by asking TLM to evaluate it.
Uses TLM to evaluate whether a response is helpful by asking it to make a Yes/No judgment. The evaluation considers both TLM’s binary classification of helpfulness and its confidence score. Returns True only if TLM classifies the response as unhelpful AND is sufficiently confident in that assessment (when a threshold is provided).
Args:
- response (str): The response to check.
- query (str): User query used to evaluate whether the response is helpful.
- tlm (TLM): The TLM model to use for evaluation.
- trustworthiness_score_threshold (float, optional): Optional confidence threshold (0.0-1.0). If provided and TLM determines the response is unhelpful, the TLM confidence score must also exceed this threshold for the response to be considered truly unhelpful.

Returns:
bool: True if TLM determines the response is unhelpful with sufficient confidence, False otherwise.
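A minimal sketch, under the same assumed cleanlab_tlm setup as above:

```python
from cleanlab_codex.response_validation import is_unhelpful_response
from cleanlab_tlm import TLM  # assumed TLM provider

tlm = TLM()
flagged = is_unhelpful_response(
    response="I'm not sure. You could try searching the web for that.",
    query="How do I reset my Acme router to factory settings?",
    tlm=tlm,
    trustworthiness_score_threshold=0.75,  # only flag when TLM is confident in "unhelpful"
)
print(flagged)
```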
class BadResponseDetectionConfig
Configuration for bad response detection functions.
Used by the is_bad_response function, which passes these values to the corresponding downstream validation checks.
class BadResponseDetectionConfig(BaseModel):
    fallback_answer: 'str' = 'Based on the available information, I cannot provide a complete answer to this question.'
    fallback_similarity_threshold: 'int' = 70
    trustworthiness_threshold: 'float' = 0.5
    format_prompt: 'typing.Callable[[str, str], str]' = default_format_prompt
    unhelpfulness_confidence_threshold: 'typing.Optional[float]' = None
    tlm: 'typing.Optional[cleanlab_codex.types.tlm.TLM]' = None
Args:
- fallback_answer (str): Known unhelpful response to compare against.
- fallback_similarity_threshold (int): Fuzzy matching similarity threshold (0-100). Higher values mean responses must be more similar to fallback_answer to be considered bad.
- trustworthiness_threshold (float): Score threshold (0.0-1.0). Lower values allow less trustworthy responses.
- format_prompt (typing.Callable[[str, str], str]): Function to format (query, context) into a prompt string.
- unhelpfulness_confidence_threshold (typing.Optional[float]): Optional confidence threshold (0.0-1.0) for unhelpful classification.
- tlm (typing.Optional[cleanlab_codex.types.tlm.TLM]): TLM model to use for evaluation (required for the untrustworthiness and unhelpfulness checks).
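Since is_bad_response accepts either this model or a plain dict (per its Union[...] signature), both forms below should be equivalent; the dict is presumably validated into the model internally:

```python
from cleanlab_codex.response_validation import (
    BadResponseDetectionConfig,
    is_bad_response,
)

# Explicit model instance:
config = BadResponseDetectionConfig(fallback_similarity_threshold=80)

# Plain-dict form accepted by the signature's Union type:
config_dict = {"fallback_similarity_threshold": 80}

is_bad_response(response="Some response.", config=config)
is_bad_response(response="Some response.", config=config_dict)
```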
property model_extra
Get extra fields set during validation.
Returns:
A dictionary of extra fields, or None if config.extra is not set to "allow".
property model_fields_set
Returns the set of fields that have been explicitly set on this model instance.
Returns: A set of strings representing the fields that have been set, i.e. that were not filled from defaults.