Getting Started with Cleanlab Studio CLI

This guide steps through how to use Cleanlab Studio programmatically via the command line. This workflow is primarily for those who want to quickly access the cleanlab columns produced by Cleanlab Studio without having to use the web interface or Python API.

Following this guide requires:

  1. Python >= 3.8
  2. A Cleanlab account

Installation

You can install the Cleanlab Studio client from PyPI with:

pip install cleanlab-studio

If you already have the client installed and wish to upgrade to the latest version, run:

pip install --upgrade cleanlab-studio
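
As a quick sanity check after installing, you can confirm that the `cleanlab` executable used throughout this guide is on your PATH. This is a minimal sketch; the helper name `check_cleanlab_installed` is illustrative and not part of the client.

```shell
# Report whether the `cleanlab` command can be found by the current shell.
# `check_cleanlab_installed` is a hypothetical helper name for this guide.
check_cleanlab_installed() {
  if command -v cleanlab >/dev/null 2>&1; then
    echo "cleanlab found at: $(command -v cleanlab)"
  else
    echo "cleanlab not found on PATH; try: pip install cleanlab-studio" >&2
    return 1
  fi
}
```

Run `check_cleanlab_installed` in the same shell (and the same virtual environment, if you use one) in which you ran `pip install`.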

API Key Authentication

To upload datasets, create projects, and more, you need an API key.

  1. Log in (if you’re not already) and open your account page in Cleanlab Studio.
  2. Copy the API key from your account page.
  3. Authenticate yourself (if you haven’t already) by running:
     cleanlab login
  4. Paste your API key when prompted.

Upload Dataset

The starting point for using Cleanlab Studio is uploading a Dataset.

Upload File

If you have a dataset saved to your filesystem in one of Cleanlab Studio’s supported formats, you can upload it from a filepath.

cleanlab dataset upload -f <path to your dataset>
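
If you script the upload, it can help to fail early when the path is wrong. The sketch below wraps the command above in a small function; the name `upload_dataset` and the file-existence check are assumptions of this example, not CLI features.

```shell
# Upload a dataset file only if it exists on disk.
# `upload_dataset` is a hypothetical wrapper; the underlying command is the
# `cleanlab dataset upload -f` call shown in this guide.
upload_dataset() {
  local dataset_path="$1"

  if [ ! -f "$dataset_path" ]; then
    echo "error: dataset file not found: $dataset_path" >&2
    return 1
  fi

  cleanlab dataset upload -f "$dataset_path"
}
```

For example, `upload_dataset ./my_dataset.csv` (a placeholder path) runs the upload only when that file is present.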

Export

Download Cleanlab Columns

You can download the cleanlab columns from your project by providing the cleanset ID and path to the output CSV file where you want to save your results. This output CSV file will contain per-row information about each data point in your dataset, along with the corrections you’ve made.

Ensure that the output file has the .csv file extension.

cleanlab cleanset download --id <cleanset ID> --output <path to output CSV file> --all
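
The .csv requirement above is easy to miss in a script, so here is a minimal sketch that checks the extension before calling the download command. The function name `download_cleanlab_columns` is hypothetical, and the check is done by this wrapper rather than by the CLI.

```shell
# Download cleanlab columns, refusing output paths that do not end in .csv.
# `download_cleanlab_columns` is an illustrative helper, not a CLI command.
download_cleanlab_columns() {
  if [ "$#" -ne 2 ]; then
    echo "usage: download_cleanlab_columns <cleanset ID> <output CSV path>" >&2
    return 2
  fi
  local cleanset_id="$1"
  local output_path="$2"

  case "$output_path" in
    *.csv) ;;  # extension is fine, continue
    *)
      echo "error: output file must have the .csv extension: $output_path" >&2
      return 1
      ;;
  esac

  cleanlab cleanset download --id "$cleanset_id" --output "$output_path" --all
}
```

Calling `download_cleanlab_columns <cleanset ID> results.csv` then behaves like the command above, while `results.txt` is rejected before any request is made.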

Apply Corrections

You can apply the corrections from your project to a local copy of your dataset by providing the dataset file and the cleanset ID. This produces an output file containing your dataset with the corrections applied.

cleanlab cleanset download --id <cleanset ID> --filepath <path to your dataset> --output <path to save to> --all
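
When applying corrections from a script, you may want to guard against a missing input file or accidentally pointing the output at the original dataset. The following sketch adds both checks around the command above; the name `apply_corrections` and the no-overwrite rule are assumptions of this example, not documented CLI behavior.

```shell
# Apply a cleanset's corrections to a local dataset copy, with two safety checks.
# `apply_corrections` is a hypothetical wrapper around the command shown above.
apply_corrections() {
  local cleanset_id="$1"
  local dataset_path="$2"
  local output_path="$3"

  if [ ! -f "$dataset_path" ]; then
    echo "error: dataset file not found: $dataset_path" >&2
    return 1
  fi

  # Precaution chosen for this sketch: never write over the original dataset.
  if [ "$dataset_path" = "$output_path" ]; then
    echo "error: output path must differ from the dataset path" >&2
    return 1
  fi

  cleanlab cleanset download --id "$cleanset_id" \
    --filepath "$dataset_path" --output "$output_path" --all
}
```

Keeping the original file untouched makes it straightforward to compare the corrected output against your input afterwards.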

The Cleanlab Studio command line interface offers less functionality than the Python API, so see the Python API guide to learn what else you can do programmatically with Cleanlab Studio.