module cleanlab_codex.project
Module for interacting with a Codex project.
class MissingProjectError
Raised when the project ID or access key does not match any existing project.
class Project
Represents a Codex project.
To integrate a Codex project into your RAG/Agentic system, we recommend using the Project.validate() method.
method __init__
__init__(
sdk_client: '_Codex',
project_id: 'str',
verify_existence: 'bool' = True
)
Initialize the Project. This method is not meant to be used directly. Instead, use the Client.get_project(), Client.create_project(), or Project.from_access_key() methods.
Args:
- sdk_client (Codex): The Codex SDK client to use to interact with the project.
- project_id (str): The ID of the project.
- verify_existence (bool, optional): Whether to verify that the project exists.
property id
The ID of the project.
method add_remediation
add_remediation(question: 'str', answer: 'str | None' = None) → None
Add a remediation to the project. A remediation is an expert-verified question and answer pair that should be used to answer future queries to the AI system that are similar to the question.
Args:
- question (str): The question to add to the project.
- answer (str, optional): The expert answer for the question. If not provided, the question will be added to the project without an expert answer.
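For example, a minimal sketch (assumes an existing project handle; the question and answer strings are illustrative):

```python
# Log an expert-verified Q&A pair so that similar future queries
# can be answered with the expert answer instead of the AI response.
project.add_remediation(
    question="What is your refund policy?",
    answer="Refunds are available within 30 days of purchase.",  # omit to log the question without an answer yet
)
```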
classmethod create
create(
sdk_client: '_Codex',
organization_id: 'str',
name: 'str',
description: 'str | None' = None
) → Project
Create a new Codex project. This method is not meant to be used directly. Instead, use the create_project method on the Client class.
Args:
- sdk_client (Codex): The Codex SDK client to use to create the project. This client must be authenticated with a user-level API key.
- organization_id (str): The ID of the organization to create the project in.
- name (str): The name of the project.
- description (str, optional): The description of the project.
Returns:
- Project: The created project.
Raises:
- AuthenticationError: If the SDK client is not authenticated with a user-level API key.
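Since this method is not meant to be called directly, a typical flow goes through the Client class instead. A hedged sketch (assuming Client is importable from cleanlab_codex.client and is constructed with a user-level API key; the exact constructor and parameters are not shown in this reference and may differ):

```python
from cleanlab_codex.client import Client  # assumed import path

client = Client("<your user-level API key>")  # assumed constructor
project = client.create_project(
    name="Support Bot",  # illustrative values
    description="Validates responses from our support assistant",
)
print(project.id)
```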
method create_access_key
create_access_key(
name: 'str',
description: 'str | None' = None,
expiration: 'datetime | None' = None
) → str
Create a new access key for this project. Must be authenticated with a user-level API key to use this method. See Client.create_project() or Client.get_project().
Args:
- name (str): The name of the access key.
- description (str, optional): The description of the access key.
- expiration (datetime, optional): The expiration date of the access key. If not provided, the access key will not expire.
Returns:
- str: The access key token.
Raises:
- AuthenticationError: If the Project was created from a project-level access key instead of a Client instance.
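For example, a minimal sketch (assumes project was obtained via an authenticated Client rather than from a project-level access key; the name and description values are illustrative):

```python
from datetime import datetime, timedelta, timezone

# Create a key that backend services can use via Project.from_access_key().
access_key = project.create_access_key(
    name="backend-service",
    description="Key for the production RAG backend",
    expiration=datetime.now(timezone.utc) + timedelta(days=90),  # omit for a non-expiring key
)
```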
classmethod from_access_key
from_access_key(access_key: 'str') → Project
Initialize a Project from a project-level access key.
Args:
- access_key (str): The access key for authenticating project access.
Returns:
- Project: The project associated with the access key.
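For example (assuming the access key is stored in an environment variable; the variable name is illustrative):

```python
import os

from cleanlab_codex.project import Project

# Authenticate project access with a project-level access key.
project = Project.from_access_key(os.environ["CODEX_ACCESS_KEY"])
```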
method validate
validate(
messages: 'list[ChatCompletionMessageParam]',
response: 'Union[ChatCompletion, str]',
query: 'str',
context: 'str',
rewritten_query: 'Optional[str]' = None,
metadata: 'Optional[object]' = None,
tools: 'Optional[list[ChatCompletionToolParam]]' = None,
eval_scores: 'Optional[Dict[str, float]]' = None
) → ProjectValidateResponse
Evaluate the quality of an AI-generated response based on the exact same inputs that your LLM used to generate the response.
Supply the same messages that your LLM used to generate its response (formatted as OpenAI-style chat messages), including all past user/assistant messages and any preceding system messages (including any retrieved context).
For single-turn Q&A apps, messages can be a minimal list with one user message containing all relevant info that was supplied to your LLM. For multi-turn conversations, provide the full dialog leading up to the final response (not including the final response).
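For example, the two shapes might look like this (message contents are illustrative):

```python
# Single-turn Q&A: one user message carrying everything the LLM saw,
# including any retrieved context.
messages_single_turn = [
    {
        "role": "user",
        "content": "Context: Our return window is 30 days.\n\nQuestion: Can I return an item after 6 weeks?",
    }
]

# Multi-turn: the full dialog leading up to (but not including) the final AI response.
messages_multi_turn = [
    {"role": "system", "content": "You are a helpful support assistant."},
    {"role": "user", "content": "Hi, I bought a jacket last month."},
    {"role": "assistant", "content": "Thanks for reaching out! How can I help with your jacket?"},
    {"role": "user", "content": "Can I still return it after 6 weeks?"},
]
```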
If your AI response is flagged as problematic, then this method will:
- return an expert answer if one was previously provided for a similar query
- otherwise log this query for future SME review (to consider providing an expert answer) in the Web Interface.
Args:
- messages (list[ChatCompletionMessageParam]): The full prompt given to your LLM that generated the response, in the OpenAI Chat Completions format. This must include the final user message that triggered the AI response, as well as all of the state that was supplied to your LLM (including: full conversation history, system instructions/prompt, retrieved context, etc.).
- response (ChatCompletion | str): Your AI response that was generated by the LLM given the same messages. This is the response being evaluated, and should not appear in the messages.
- query (str): The core user query that the response is answering, i.e. the latest user message in messages. Specifying the query (as a part of the full messages object) enables Cleanlab to: match this against other users' queries (e.g. for serving expert answers), run certain Evals, and display the query in the Web Interface.
- context (str): All retrieved context (e.g., from your RAG/retrieval/search system) that was supplied as part of messages for generating the LLM response. Specifying the context (as a part of the full messages object) enables Cleanlab to run certain Evals and display the retrieved context in the Web Interface.
- rewritten_query (str, optional): An optional reformulation of query (e.g. to form a self-contained question out of a multi-turn conversation history) to improve retrieval quality. If you are using a query rewriter in your RAG system, you can provide its output here. If not provided, Cleanlab may internally do its own query rewrite when necessary.
- metadata (object, optional): Arbitrary metadata to associate with this LLM response for logging/analytics inside the Project.
- tools (list[ChatCompletionToolParam], optional): Optional definitions of tools that were provided to the LLM in the response-generation call. Should match the tools argument in OpenAI's Chat Completions API. When provided to the LLM, its response might be to call one of these tools rather than reply in natural language.
- eval_scores (dict[str, float], optional): Pre-computed evaluation scores to bypass automatic scoring. Providing eval_scores for specific evaluations bypasses automated scoring and uses the supplied scores instead. If you already have them pre-computed, this can reduce runtime.
Returns:
- ProjectValidateResponse: A structured object with the following fields:
  - should_guardrail (bool): True if the AI system should suppress or modify the response before returning it to the user. When True, the response is considered problematic and may require further review or modification.
  - escalated_to_sme (bool): True if the query should be escalated to an SME for review. When True, the query is logged and may be answered by an expert.
  - eval_scores (dict[str, ThresholdedEvalScore]): Evaluation scores for different response attributes (e.g., trustworthiness, helpfulness, ...). Each includes a numeric score and a failed flag indicating whether the score falls below the threshold.
  - expert_answer (str | None): If it was auto-determined that this query should be escalated to an SME, and a prior SME answer for a similar query was found, then this will return that expert answer. Otherwise, it is None. When available, consider swapping your AI response with the expert answer before serving the response to your user.
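Putting this together, here is a hedged end-to-end sketch (the access-key environment variable, message contents, and fallback string are all illustrative, not part of the API):

```python
import os

from cleanlab_codex.project import Project

project = Project.from_access_key(os.environ["CODEX_ACCESS_KEY"])

# The exact inputs your LLM saw, plus the response it produced.
context = "Our return window is 30 days from the delivery date."
query = "Can I return an item after 6 weeks?"
messages = [
    {"role": "system", "content": f"Answer using this context:\n{context}"},
    {"role": "user", "content": query},
]
ai_response = "Yes, you can return items at any time."  # string or ChatCompletion

result = project.validate(
    messages=messages,
    response=ai_response,
    query=query,
    context=context,
)

# Act on the structured result before serving anything to the user.
if result.expert_answer is not None:
    final_response = result.expert_answer  # prefer the SME-verified answer
elif result.should_guardrail:
    final_response = "I'm not able to answer that confidently."  # illustrative fallback
else:
    final_response = ai_response
```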