Inference

module cleanlab_studio.studio.inference

Methods for interfacing with deployed ML models (to produce predictions).

This module is not meant to be imported and used directly. Instead, use Studio.get_model() to instantiate a Model object.

The Model Deployment tutorial explains the end-to-end workflow for using Cleanlab Studio’s model deployment functionality.


class Model

Represents a machine learning model instance in a Cleanlab Studio account.

Models should be instantiated using the Studio.get_model() method. Then, using a Model object, you can predict() labels for new data.

method __init__

__init__(api_key: str, model_id: str)

Initializes a model.

Objects of this class are not meant to be constructed directly. Instead, use Studio.get_model().


method predict

predict(
batch: Union[List[str], ndarray[Any, dtype[str_]], Series, DataFrame],
return_pred_proba: bool = False,
timeout: int = 930
) → Union[ndarray[Any, dtype[int64]], ndarray[Any, dtype[str_]], ndarray[Any, dtype[Any]], Tuple[Union[ndarray[Any, dtype[int64]], ndarray[Any, dtype[str_]], ndarray[Any, dtype[Any]]], DataFrame]]

Gets predictions (and, optionally, predicted probabilities) for a batch of examples using the deployed model. Currently supports only tabular and text datasets.

Args:

  • batch: batch of examples to predict classes for
  • return_pred_proba: whether to return predicted class probabilities for each example
  • timeout: optional parameter to set timeout for predictions in seconds (defaults to 930s)

Returns:

  • Predictions: the predicted labels for the batch as a numpy array. For a multi-class model, returns a numpy array of labels whose types match the label types in the original training set (string, integer, or boolean). For a multi-label model, returns a numpy array of lists of strings, where each list contains the labels that are present for the corresponding row.
  • ClassProbabilities: if return_pred_proba is True, also returns a pandas DataFrame of the class probabilities, where the column names correspond to the labels.
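
The following is a minimal sketch of how the return value changes with return_pred_proba. To keep it runnable without an API key, StubModel is a hypothetical stand-in that only mimics the predict() signature above and returns plain Python lists; it is not part of cleanlab_studio. A real Model obtained from Studio.get_model() returns a numpy array and a pandas DataFrame instead.

```python
from typing import Dict, List, Tuple, Union

Predictions = List[str]
ClassProbabilities = List[Dict[str, float]]


class StubModel:
    """Hypothetical stand-in for a deployed Model; not part of cleanlab_studio."""

    def predict(
        self,
        batch: List[str],
        return_pred_proba: bool = False,
        timeout: int = 930,
    ) -> Union[Predictions, Tuple[Predictions, ClassProbabilities]]:
        # Canned output: every example is labeled "cat" with probability 1.
        predictions = ["cat" for _ in batch]
        if not return_pred_proba:
            return predictions
        probabilities = [{"bear": 0.0, "cat": 1.0, "dog": 0.0} for _ in batch]
        return predictions, probabilities


model = StubModel()
batch = ["first example text", "second example text"]

# Default call: only the predicted labels come back.
labels = model.predict(batch)

# With return_pred_proba=True the result is a (predictions, probabilities) tuple.
labels_again, probabilities = model.predict(batch, return_pred_proba=True, timeout=60)
```

The point to carry over to real code is the unpacking: a call with return_pred_proba=True yields two values, while the default call yields only the predictions.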

Example outputs:

Multi-class project: Say we have a dataset with 3 classes, “bear”, “cat” and “dog”. For two example rows, the deployed model predicts “cat” and “dog”, each with probability 1. The outputs will be:

  • Predictions: array(['cat', 'dog'])
  • ClassProbabilities:

         bear  cat  dog
      0   0.0  1.0  0.0
      1   0.0  0.0  1.0

Note that for multi-class predictions, ClassProbabilities rows will sum to 1, and the prediction for each row corresponds to the column with the highest probability.
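
The relationship between the two multi-class outputs can be checked with a short sketch. It uses plain dicts, one per row, as a stand-in for the returned DataFrame (an assumption made so the snippet runs without pandas); the values are the ones from the example.

```python
# Class probabilities from the multi-class example, one dict per row.
class_probabilities = [
    {"bear": 0.0, "cat": 1.0, "dog": 0.0},
    {"bear": 0.0, "cat": 0.0, "dog": 1.0},
]

# Each row of multi-class probabilities sums to 1.
row_sums = [sum(row.values()) for row in class_probabilities]

# The predicted label is the column with the highest probability.
predictions = [max(row, key=row.get) for row in class_probabilities]
```

With an actual pandas DataFrame, class_probabilities.idxmax(axis=1) performs the same column-of-maximum lookup per row.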

Multi-label project: Say we have some text dataset that we want sentiment predictions for, and the model predicts probabilities [0.6, 0.9, 0.1] for the set of possible labels, “happy”, “excited”, “sad”; for a second example, the predicted probabilities are [0.1, 0.3, 0.8]. The outputs will be:

  • Predictions: array([["happy", "excited"], ["sad"]])
  • ClassProbabilities:

         happy  excited  sad
      0    0.6      0.9  0.1
      1    0.1      0.3  0.8

Note that for multi-label predictions, each entry in a row of the ClassProbabilities should be interpreted as the probability that label is present for the example.
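
As a rough illustration of how multi-label probabilities relate to the predicted label lists, the sketch below keeps every label whose probability reaches a cutoff. The cutoff of 0.5 is an assumption: it reproduces the example, but this page does not state which rule the deployed model applies. Plain dicts again stand in for the DataFrame rows.

```python
THRESHOLD = 0.5  # assumed cutoff; not specified in the documentation

# Class probabilities from the multi-label example, one dict per row.
class_probabilities = [
    {"happy": 0.6, "excited": 0.9, "sad": 0.1},
    {"happy": 0.1, "excited": 0.3, "sad": 0.8},
]

# Keep each label whose probability is at least the assumed cutoff.
predictions = [
    [label for label, probability in row.items() if probability >= THRESHOLD]
    for row in class_probabilities
]

# Unlike the multi-class case, rows are independent per-label probabilities
# and need not sum to 1.
row_sums = [sum(row.values()) for row in class_probabilities]
```

Because each column is an independent probability, a row can yield several labels, one label, or none under this kind of rule.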