Getting Started with Cleanlab Studio CLI
This guide walks through using Cleanlab Studio from the command line. This workflow is primarily for those who want quick access to the cleanlab columns produced by Cleanlab Studio without using the web interface or Python API.
Following this guide requires:
- Python >= 3.8
- A Cleanlab account
Installation
You can install the Cleanlab Studio client from PyPI with:
pip install cleanlab-studio
If you already have the client installed and wish to upgrade to the latest version, run:
pip install --upgrade cleanlab-studio
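After installing, it can help to confirm that the cleanlab command is available in your shell before moving on. The snippet below is a minimal sketch, not part of the Cleanlab Studio client: check_cleanlab_installed is a hypothetical helper name, and it only checks that a command called cleanlab can be found, not which version is installed.

```shell
# Hypothetical helper (not provided by cleanlab-studio): report whether
# a command named "cleanlab" can be found by the current shell.
check_cleanlab_installed() {
    if command -v cleanlab >/dev/null 2>&1; then
        echo "cleanlab CLI found"
    else
        echo "cleanlab CLI not found; try: pip install cleanlab-studio" >&2
        return 1
    fi
}

# Example: check_cleanlab_installed
```

If the check fails right after a successful install, the usual cause is that pip installed into a different Python environment than the one on your PATH.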
API Key Authentication
To upload datasets, create projects, and more, you need an API key.
- Log in (if you haven’t already) and open your account page in Cleanlab Studio.
- Copy the API key from your account page.
- Authenticate yourself (if you haven’t already) by running:
cleanlab login
- Paste your API key when prompted.
Upload Dataset
The starting point for using Cleanlab Studio is uploading a Dataset.
Upload File
If you have a dataset saved to your filesystem in one of Cleanlab Studio’s supported formats, you can upload it from a filepath.
cleanlab dataset upload <path to your dataset>
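When the upload runs from a script, a typo in the path is easier to diagnose if it is caught before the CLI is invoked. The following is a sketch under that assumption: upload_dataset is a hypothetical wrapper, and the only Cleanlab command it uses is the upload command shown above.

```shell
# Hypothetical wrapper (not part of the Cleanlab CLI): upload a dataset
# only if the given path points to an existing regular file.
upload_dataset() {
    dataset_path="$1"
    if [ ! -f "$dataset_path" ]; then
        echo "error: no such file: $dataset_path" >&2
        return 1
    fi
    # Same command as in this guide, with the path quoted so that
    # spaces in file names are handled correctly.
    cleanlab dataset upload "$dataset_path"
}

# Example: upload_dataset ./my_dataset.csv
```

The wrapper does not check whether the file is in one of the supported formats; that validation is left to Cleanlab Studio.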
Export
Download Cleanlab Columns
You can download the cleanlab columns from your project by providing the cleanset ID and path to the output CSV file where you want to save your results. This output CSV file will contain per-row information about each data point in your dataset, along with the corrections you’ve made.
Ensure that the output file has the .csv file extension.
cleanlab cleanset download --id <cleanset ID> --output <path to output CSV file> --all
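The .csv requirement above can be enforced in a script before the download starts. This is a minimal sketch: download_cleanlab_columns is a hypothetical helper name, and the flags it passes are exactly those in the command above.

```shell
# Hypothetical helper (not part of the Cleanlab CLI): refuse to run the
# download unless the output path ends in ".csv".
download_cleanlab_columns() {
    cleanset_id="$1"
    output_path="$2"
    case "$output_path" in
        *.csv) ;;  # extension is fine, continue
        *)
            echo "error: output path must end in .csv: $output_path" >&2
            return 1
            ;;
    esac
    cleanlab cleanset download --id "$cleanset_id" --output "$output_path" --all
}

# Example: download_cleanlab_columns "<cleanset ID>" results.csv
```

The check is case-sensitive, so a path ending in .CSV is rejected; loosen the pattern if your file names vary.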
Apply Corrections
You can apply the corrections from your project to a local copy of your dataset by providing the cleanset ID and the path to the dataset file. This produces an output file containing your dataset with the corrections applied.
cleanlab cleanset download --id <cleanset ID> --filepath <path to your dataset> --output <path to save to> --all
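Because this command reads one file and writes another, a script may want to guard against a missing input or an output path that is identical to the input. The sketch below assumes that keeping the original dataset untouched is desirable; apply_corrections is a hypothetical wrapper, and the guide does not say how the CLI itself behaves when both paths match.

```shell
# Hypothetical wrapper (not part of the Cleanlab CLI): apply corrections
# to a local dataset file without overwriting the original.
apply_corrections() {
    cleanset_id="$1"
    dataset_path="$2"
    output_path="$3"
    if [ ! -f "$dataset_path" ]; then
        echo "error: no such file: $dataset_path" >&2
        return 1
    fi
    # Precaution chosen for this sketch: keep input and output separate.
    if [ "$dataset_path" = "$output_path" ]; then
        echo "error: output path must differ from the dataset path" >&2
        return 1
    fi
    cleanlab cleanset download --id "$cleanset_id" \
        --filepath "$dataset_path" --output "$output_path" --all
}

# Example: apply_corrections "<cleanset ID>" data.csv data_corrected.csv
```

The comparison is a plain string match, so two different spellings of the same path (for example data.csv and ./data.csv) are not detected.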
The Cleanlab Studio command line interface does not offer as much functionality as the Python API, so check out the Python API to see what else you can do programmatically with Cleanlab Studio!