Getting Started with Cleanlab Studio Python API
Besides the web application interface, you can use Cleanlab Studio programmatically via its Python API, which unlocks many additional capabilities.
This requires:
- Python >= 3.8
- A Cleanlab account
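If you are unsure whether your interpreter meets the Python requirement, a quick check along these lines can help. This is a minimal sketch: the helper name `meets_python_requirement` is made up for illustration and is not part of the Cleanlab Studio client.

```python
import sys


def meets_python_requirement(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the given interpreter version is at least `minimum`.

    Hypothetical helper for illustration; only the (major, minor) pair is compared.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


if not meets_python_requirement():
    print("Python 3.8 or newer is required; found", sys.version.split()[0])
```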
Installation
Install the Cleanlab Studio client from PyPI with:
pip install cleanlab-studio
If you already have the client installed and wish to upgrade to the latest version, run:
pip install --upgrade cleanlab-studio
Some recently added tutorials in this documentation may not work unless you have upgraded to the latest version of the client. The source code for the client is available on GitHub.
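To see which version of the client is currently installed (for example, before deciding whether to upgrade), you can query the package metadata from Python. The sketch below uses only the standard library; the helper name `installed_version` is an illustrative assumption, not part of the client.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "cleanlab-studio") -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print(installed_version() or "cleanlab-studio is not installed")
```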
API Key
To use Cleanlab Studio programmatically, you need your API key.
Open your account page in the Cleanlab Studio web application, logging in first if necessary, and copy the API key from there.
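One common way to keep the key out of your source code is to read it from an environment variable. The following is a minimal sketch under that assumption: the variable name `CLEANLAB_STUDIO_API_KEY` and the helper `load_api_key` are hypothetical choices for illustration, not something the client defines.

```python
import os
from typing import Mapping

# Hypothetical variable name chosen for this example; use whatever fits your setup.
API_KEY_ENV_VAR = "CLEANLAB_STUDIO_API_KEY"


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key stored in the environment, or raise if it is missing."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"Set {API_KEY_ENV_VAR} to the API key copied from your account page."
        )
    return key


# Typical use, once the client is installed and the variable is set:
#   from cleanlab_studio import Studio
#   studio = Studio(load_api_key())
```

Passing `env` explicitly makes the helper easy to exercise without touching your real environment.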
Running Cleanlab Studio programmatically
The starting point for using Cleanlab Studio is loading a Dataset and then creating a Project. When you perform these actions programmatically via the Python API, the resulting data and results are accessible from both your Python session and the web application.
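The dataset-then-project flow can be sketched roughly as follows. This is an illustrative outline, not a definitive recipe: the wrapper `run_project` is hypothetical, and the method names and arguments it calls on the `studio` object reflect our understanding of the client as used in the quickstart tutorials, so they may differ in your installed version. Consult the tutorial for your modality for the exact calls.

```python
from typing import Any


def run_project(
    studio: Any,
    dataset: Any,
    *,
    dataset_name: str,
    project_name: str,
    modality: str,
    task_type: str,
    label_column: str,
    model_type: str = "regular",
) -> Any:
    """Upload a dataset, create a project, wait for results, and fetch them.

    `studio` is assumed to be a `cleanlab_studio.Studio` instance. The method
    names below are assumptions based on the quickstart tutorials.
    """
    # 1. Load the dataset into Cleanlab Studio.
    dataset_id = studio.upload_dataset(dataset, dataset_name=dataset_name)

    # 2. Create a project on top of the uploaded dataset.
    project_id = studio.create_project(
        dataset_id=dataset_id,
        project_name=project_name,
        modality=modality,
        task_type=task_type,
        model_type=model_type,
        label_column=label_column,
    )

    # 3. Wait for the analysis to finish, then download the results.
    cleanset_id = studio.get_latest_cleanset_id(project_id)
    studio.poll_cleanset_status(cleanset_id)
    return studio.download_cleanlab_columns(cleanset_id)
```

Because the function only relies on the `studio` object passed in, the same dataset and project also remain visible in the web application afterwards.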
The generally available version of Cleanlab Studio can be used for image, text, and structured/tabular datasets. Here are introductory tutorials to start using the application programmatically for data from each modality:
- image data quickstart tutorial
- text data quickstart tutorial
- tabular data quickstart tutorial (+ a 2nd tabular data tutorial)
We recommend starting with one of these tutorials and then moving on to the tutorial most relevant to your Data/AI application. Unless explicitly specified otherwise, most of our tutorials can be followed for all data modalities. So if you have, say, text data but the tutorial closest to your use case features image data, that tutorial likely works for text data as well. Even if none of our tutorials covers a similar dataset or application, Cleanlab Studio is a general platform that can likely work for your dataset as long as it is properly formatted.