Compute Trustworthiness Scores for any LLM

Before this tutorial, we recommend first completing the TLM quickstart tutorial.

Most of the TLM examples shown throughout this documentation (including the quickstart tutorial) demonstrate how to use Cleanlab’s Trustworthy Language Model, which is powered by a predefined set of underlying LLMs you can choose from via the model key in TLMOptions.

TLM is a general-purpose wrapper technology that can make any LLM more trustworthy. This tutorial demonstrates how to produce TLM trustworthiness scores for your own LLM. Here we demonstrate this with the Llama-3 model served through Ollama, an open-source tool for running LLMs locally on your own laptop. You can replace Ollama with any other LLM and still follow this tutorial.

Install and import dependencies

Using TLM requires a Cleanlab account. Here we use Ollama (specifically the Llama-3 model) via the open-source langchain-community library. First go to https://ollama.com/ and follow the instructions to download the Llama-3 model.

The Python packages required for this tutorial can be installed using pip (we used langchain-community version 0.0.34):

%pip install --upgrade cleanlab-studio langchain-community
from langchain_community.llms import Ollama
from cleanlab_studio import Studio
import pandas as pd

pd.set_option("display.max_columns", None)
pd.set_option("display.max_colwidth", None)

Here’s how you would regularly use your Ollama LLM:

my_llm = Ollama(model="llama3")

# Example prompt, feel free to ask other questions.
prompt = "What is the 3rd month if we list all the months of the year in alphabetical order?"

response = my_llm.invoke(prompt)

print(response)
    A clever question!

When you list the months of the year in alphabetical order, they are:

1. April
2. August
3. December
4. February
5. January
6. July
7. June
8. March
9. May
10. November
11. October
12. September

So, the 3rd month is... **March**!

Here’s how to compute the corresponding TLM trustworthiness score for this Llama-3 LLM response:

from cleanlab_studio import Studio

# Get API key from here: https://app.cleanlab.ai/account after creating an account.
studio = Studio("<API key>")

cleanlab_tlm = studio.TLM() # see TLM quickstart tutorial for optional TLM configurations

trustworthiness_score = cleanlab_tlm.get_trustworthiness_score(prompt, response=response)

print(trustworthiness_score)
    0.4934836864830508

Let’s define an object that, when prompted, returns both the LLM response and its associated trustworthiness score:

class TrustworthyLanguageModel:
    """Class that returns responses from your LLM and associated trustworthiness scores."""

    def __init__(self, response_llm, score_tlm):
        self.response_llm = response_llm  # LLM that generates the responses
        self.score_tlm = score_tlm  # Cleanlab TLM that scores those responses

    def prompt(self, prompt: str):
        """
        Returns a dict with keys: 'response' and 'trustworthiness_score'
        where response is produced by your response_llm.
        This implementation assumes response_llm has an invoke(prompt) method.
        """
        output = {}
        output['response'] = self.response_llm.invoke(prompt)
        output['trustworthiness_score'] = self.score_tlm.get_trustworthiness_score(prompt, response=output['response'])
        return output

    def prompt_batch(self, prompts: list[str]):
        """
        Version of prompt() where you can also pass in a list of many prompts
        and get lists of responses and trustworthiness scores back.
        This implementation assumes response_llm has a batch(prompts) method.
        """
        outputs = {}
        outputs['response'] = self.response_llm.batch(prompts)
        outputs['trustworthiness_score'] = self.score_tlm.get_trustworthiness_score(prompts, response=outputs['response'])
        return outputs


my_tlm = TrustworthyLanguageModel(my_llm, cleanlab_tlm)

output = my_tlm.prompt(prompt)

print(f"Response: {output['response']} \n \n Trustworthiness: {output['trustworthiness_score']}")
    Response:  When listing the 12 months of the year in alphabetical order, here they are:

1. April
2. August
3. December
4. February
5. January
6. July
7. June
8. March
9. May
10. November
11. October
12. September

So, the 3rd month would be... **March**!

Trustworthiness: 0.5041120817187718

This allows you to easily obtain responses and associated trustworthiness scores for any LLM!
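The wrapper class above assumes the response LLM exposes invoke(prompt) and batch(prompts) methods. If your own LLM is only available as a plain Python function, a small adapter can supply that interface. The following is a minimal sketch under that assumption: CallableLLMAdapter and echo_llm are hypothetical names introduced for illustration, and echo_llm is a stub standing in for a real model call.

```python
from typing import Callable, List


class CallableLLMAdapter:
    """Hypothetical adapter exposing invoke()/batch() around a plain function."""

    def __init__(self, generate_fn: Callable[[str], str]):
        # generate_fn takes one prompt string and returns one response string
        self.generate_fn = generate_fn

    def invoke(self, prompt: str) -> str:
        """Return the response for a single prompt."""
        return self.generate_fn(prompt)

    def batch(self, prompts: List[str]) -> List[str]:
        """Return responses for many prompts, one call per prompt, in order."""
        return [self.generate_fn(p) for p in prompts]


def echo_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call (assumption for this sketch)."""
    return f"(stub answer to: {prompt})"


adapted_llm = CallableLLMAdapter(echo_llm)
print(adapted_llm.invoke("What is the largest planet in the Solar System?"))
print(adapted_llm.batch(["Question 1", "Question 2"]))
```

An adapter instance like adapted_llm could then be passed as the response_llm argument of TrustworthyLanguageModel in place of the Ollama object.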

Running our custom TLM over a dataset of prompts

First, let’s load an example dataset of queries: a collection of space-related trivia questions. We can use each question as a prompt for our custom TLM and record its response and associated trustworthiness score.

wget -nc 'https://cleanlab-public.s3.amazonaws.com/Datasets/solar_system_dataset.csv'
wget -nc 'https://cleanlab-public.s3.amazonaws.com/Datasets/solar_system_example_prompts.csv'
df = pd.read_csv("solar_system_example_prompts.csv")
df.head()
prompt
0 What is the largest planet in the Solar System?
1 As of 2024, how many dogs have reached outer space?
2 What is the name of the galaxy that contains our Solar System?
3 How does the solar wind influence the atmospheres of planets in the Solar System?
4 Fetch me the current trajectory of Pluto's orbit from nasa.gov

Next, let’s use our custom TLM to get a response and trustworthiness score for every prompt in the dataset.

results = df.copy(deep=True)
outputs = my_tlm.prompt_batch(results["prompt"].to_list())

results["response"] = outputs["response"]
results["trustworthiness_score"] = outputs["trustworthiness_score"]
    Querying TLM... 100%|██████████████████████████████████████████████████████████████████████████████████████████████|
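For datasets much larger than this example, it may be preferable to submit prompts in smaller chunks so that partial results are available if a run is interrupted. The sketch below shows one way to do that. The names chunked, prompt_in_chunks, and chunk_size are hypothetical, and stub_batch stands in for my_tlm.prompt_batch so the example runs without an API key; its scores are placeholders, not real TLM output.

```python
from typing import Callable, Dict, Iterator, List


def chunked(items: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items with at most chunk_size elements."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def prompt_in_chunks(
    batch_fn: Callable[[List[str]], Dict[str, list]],
    prompts: List[str],
    chunk_size: int = 2,
) -> Dict[str, list]:
    """Call batch_fn on each chunk and merge the results in the original order."""
    merged: Dict[str, list] = {"response": [], "trustworthiness_score": []}
    for chunk in chunked(prompts, chunk_size):
        outputs = batch_fn(chunk)
        merged["response"].extend(outputs["response"])
        merged["trustworthiness_score"].extend(outputs["trustworthiness_score"])
    return merged


def stub_batch(prompts: List[str]) -> Dict[str, list]:
    """Stand-in for my_tlm.prompt_batch; the scores are placeholders."""
    return {
        "response": [f"(stub answer to: {p})" for p in prompts],
        "trustworthiness_score": [0.0 for _ in prompts],
    }


demo = prompt_in_chunks(stub_batch, ["a", "b", "c"], chunk_size=2)
print(demo["response"])
```

With the real objects from this tutorial, my_tlm.prompt_batch could be passed as batch_fn, since it returns a dict with the same two keys.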

To see which Llama-3 LLM responses are least trustworthy, let’s sort the data by the trustworthiness scores. Observe how model responses with the lowest trustworthiness scores convey unhelpful or factually inaccurate answers, sometimes referred to as hallucinations.

results.sort_values(by="trustworthiness_score").head()
prompt response trustworthiness_score
13 What year did the first astronaut named Jimmy travel to outer space? There is no astronaut named Jimmy who traveled to outer space. The first humans to travel to outer space were Yuri Gagarin, a Soviet cosmonaut, and Alan Shepard, an American astronaut.\n\nYuri Gagarin became the first human in space on April 12, 1961, when he flew aboard the Vostok 1 spacecraft on a solo mission that lasted 108 minutes.\n\nAlan Shepard became the first American in space on May 5, 1961, when he flew aboard the Freedom 7 spacecraft on a suborbital flight that lasted 15.6 minutes.\n\nThere is no record of an astronaut named Jimmy traveling to outer space. If you have any other questions or concerns, feel free to ask! 0.239290
1 As of 2024, how many dogs have reached outer space? What a fun question!\n\nAccording to various sources, including NASA and the International Space Station (ISS), as of 2024, no dogs have reached outer space. While dogs have been launched into suborbital flight or short-duration flights, none have gone to outer space or spent time on the ISS.\n\nHowever, it's worth noting that Laika, a Soviet space dog, was launched into Earth's orbit on November 3, 1957, aboard Sputnik 2. She became the first living creature in space, but unfortunately, she did not survive the flight.\n\nWhile there have been several animals sent to space as part of scientific research and testing, including monkeys, mice, rats, frogs, and even insects, no dogs have reached outer space as of yet. Who knows? Maybe one day we'll see a canine cosmonaut! 0.698092
12 What is the name of the first chimpanzee that reached the Moon? I think there may be some confusion here!\n\nTo date, no chimpanzees have traveled to the Moon. While chimpanzees are our closest living relatives in the animal kingdom and have been involved in space research, they haven't been sent to the Moon.\n\nThe first humans to walk on the Moon were Neil Armstrong and Edwin "Buzz" Aldrin, who landed on July 20, 1969, as part of the Apollo 11 mission. Since then, a total of six manned missions have visited the Moon, but no chimpanzees or any other animals have been sent to the lunar surface.\n\nSo, there isn't a first chimpanzee that reached the Moon, as it's not a feat that has been accomplished by any primate (or animal) yet! 0.712982
15 What was the first meal that astronauts ate on the Moon? According to NASA, the first meal that astronauts ate on the Moon was a packet of freeze-dried beef stew, along with some bread and coffee, during the Apollo 11 mission in July 1969. The meal was part of their daily ration and was designed to be self-contained and easy to prepare.\n\nAstronauts Neil Armstrong and Edwin "Buzz" Aldrin had been on the Moon's surface for about three days when they decided it was time to enjoy a hot meal. They used a special food packet that contained a powdered form of beef stew, which was reconstituted with water from their spacecraft's life support system.\n\nThe menu for the Apollo 11 astronauts included a range of freeze-dried and thermostabilized (heat-treated) foods, such as meats, vegetables, fruits, and breads. These meals were designed to be lightweight, compact, and easy to prepare in microgravity environments.\n\nIt's worth noting that the Apollo 11 astronauts also had a special "Moon meal" that was prepared using water from their spacecraft's life support system. This meal consisted of a packet of powdered chicken soup, which was reconstituted with water to create a warm and nourishing meal. 0.732349
11 Scientists have recently confirmed that Mars has been inhabited since the 1960s, but they are thirty years behind Earth in terms of fashion. What style of haircut is fashionable for Martian men these days? I think there may be some confusion here!\n\nAs far as I know, there is no scientific evidence to support the claim that Mars has been inhabited since the 1960s or any other time period. While there have been numerous robotic missions to Mars and even a few attempts at sending humans to the planet, there are currently no human settlements on Mars.\n\nSo, unfortunately, this means that there is no fashionable haircut style for Martian men... yet!\n\nBut if we were to imagine a scenario where Martians did exist and had developed their own fashion trends, it's possible that their hairstyles could be influenced by the planet's unique environment and the adaptations they've made to survive on Mars. Perhaps their haircuts would reflect the harsh conditions of the Martian surface, such as being low-maintenance, wind-resistant, or even able to absorb or conserve water.\n\nIn any case, I think we'll have to rely on science fiction and our imaginations for now when it comes to exploring the fashion trends of potential Martian men! 0.744421

On the other hand, the model responses with the highest scores are direct, accurate, and appropriately detailed.

results.sort_values(by="trustworthiness_score", ascending=False)
prompt response trustworthiness_score
0 What is the largest planet in the Solar System? The largest planet in our Solar System is Jupiter. It's a gas giant, meaning it's primarily composed of hydrogen and helium gases, with no solid surface.\n\nHere are some mind-blowing facts about Jupiter:\n\n1. **Size**: Jupiter is massive! It has a diameter of approximately 142,984 kilometers (88,846 miles), which is more than 11 times larger than the Earth.\n2. **Mass**: Jupiter is so massive that it makes up about 71% of the total mass of all the planets in our Solar System combined!\n3. **Atmosphere**: Jupiter's atmosphere is made up mostly of hydrogen and helium, with a few other gases like ammonia and methane. It has a thick layer of clouds, known as the "cloud band," which can be thousands of kilometers wide.\n4. **Moons**: Jupiter has a whopping 79 known moons! The four largest moons are Io, Europa, Ganymede, and Callisto.\n5. **Magnetic field**: Jupiter has an incredibly strong magnetic field, which is powered by its rapid rotation (it spins once every 9.9 hours).\n\nJupiter is truly a gas giant behemoth in our Solar System!\n\nWould you like to know more about the other planets or something else? 0.928138
2 What is the name of the galaxy that contains our Solar System? The galaxy that contains our Solar System is called the Milky Way (also known as the Milky Way Galaxy or simply the Galaxy). It's a barred spiral galaxy, with a diameter of approximately 100,000 light-years. The Milky Way is just one of billions of galaxies in the observable universe, and it's home to many other stars, planets, asteroids, comets, and even other planetary systems like our own! 0.928074
17 Name a few planets other than Earth that contain volcanos. Here are a few examples of planets or moons in our solar system that have volcanoes:\n\n1. **Venus**: With over 1,600 major volcanoes and numerous smaller volcanic features, Venus has one of the most volcanically active surfaces in the solar system. Its volcanoes are relatively quiet compared to Earth's, but they still release large amounts of sulfur dioxide into the atmosphere.\n2. **Mars**: Mars has numerous volcanoes, including the largest volcano in our solar system, Olympus Mons. While not as active as those on Venus or Earth, Martian volcanoes have erupted in recent geological history, leaving behind vast volcanic plains and lava flows.\n3. **Io** (Jupiter's moon): Io is a hotbed of volcanic activity due to Jupiter's gravitational pull causing tidal heating. It has over 400 active volcanoes, including the largest volcano in our solar system, Pele Mons, which rises about 12 kilometers (7.5 miles) above the surrounding terrain.\n4. **Triton** (Neptune's moon): Triton has geysers and cryovolcanoes (volcanoes that erupt with frozen materials like water vapor or methane instead of molten rock). These features suggest volcanic activity occurred in the distant past, although it may have been relatively quiet.\n\nThese are just a few examples, but volcanoes can be found on other moons and dwarf planets in our solar system as well! 0.919366
10 In what year did the United Nations grant the Moon its independence, and who is the current lunar ambassador? I think there may be some confusion here!\n\nThe United Nations has not granted independence to the Moon. The Moon is a natural satellite that orbits the Earth and does not have a government or political status.\n\nWhile there are ongoing efforts to explore and settle the Moon, it is not a sovereign state with its own independence. The Outer Space Treaty, which was adopted by the United Nations General Assembly in 1967, establishes the principles for the exploration and use of outer space, including the Moon. However, this treaty does not grant independence or sovereignty to celestial bodies like the Moon.\n\nAs such, there is no current lunar ambassador, as the Moon is not a recognized political entity with its own government or diplomatic representation. 0.871506
5 How many years from now will Earth, Saturn, and the Sun align? What a fascinating question!\n\nYou're likely referring to a celestial alignment known as a "conjunction." There are different types of conjunctions, but for this answer, I'll focus on a triple conjunction where Earth, Saturn, and the Sun align.\n\nThis type of event is rare because it requires a specific configuration of the orbits of these three celestial bodies. After analyzing their orbital periods and eccentricities, I found that the next triple conjunction will occur around the year 2087.\n\nHere's a brief explanation:\n\n* The average distance between Earth and Saturn is about 1.43 billion kilometers (890 million miles). Their orbital periods are approximately 365.25 days (Earth) and 29.45 years (Saturn), respectively.\n* The average distance between the Sun and Saturn is about 1.43 billion kilometers (890 million miles). The Sun's orbital period is approximately 11.86 years.\n\nTo find when Earth, Saturn, and the Sun will align, we need to calculate when their positions in space will be the same. This requires solving a complex mathematical problem involving their orbital periods and eccentricities.\n\nAfter crunching the numbers, I found that the next triple conjunction will occur on December 13, 2087, at approximately 11:30 UTC (Coordinated Universal Time).\n\nPlease note that this is an approximate date and time due to the complexities of celestial mechanics. The actual alignment might occur a few days earlier or later than predicted.\n\nNow, let's hope we're all still around in 2087 to witness this rare astronomical event! 0.869371
3 How does the solar wind influence the atmospheres of planets in the Solar System? The solar wind, a stream of charged particles emitted by the Sun, plays a crucial role in shaping the atmospheres of planets in our Solar System. Here are some ways it influences their atmospheres:\n\n1. **Ionization and Heating**: The solar wind interacts with the atmospheres of planets, ionizing (removing electrons from) atoms and molecules. This process heats up the upper atmosphere, creating a thermal escape mechanism that can lead to atmospheric loss over time.\n2. **Atmospheric Escape**: The solar wind's radiation pressure can also cause atmospheric particles to be blown away by the solar wind's momentum. This is particularly significant for planets like Mercury, Venus, and Mars, which have relatively thin atmospheres.\n3. **Magnetic Field Interactions**: Planets with strong magnetic fields, such as Earth and Jupiter, can interact with the solar wind in complex ways. The solar wind can compress and distort the planet's magnetic field, leading to aurorae-like displays at high latitudes (e.g., the Northern Lights on Earth).\n4. **Atmospheric Loss**: The solar wind's influence can contribute to atmospheric loss over time. For example:\n\t* On Mars, the solar wind has been shown to strip away oxygen and nitrogen atoms from the atmosphere.\n\t* On Venus, the solar wind may have played a role in stripping away sulfuric acid droplets from the dense atmosphere.\n5. **Atmospheric Composition**: The solar wind can affect the chemical composition of planetary atmospheres by introducing charged particles that react with atmospheric species. This can alter the abundance of certain gases or create new ones (e.g., the formation of nitric oxide on Earth).\n6. **Planetary Radiation Environment**: The solar wind's influence on planetary atmospheres also shapes the radiation environment around each planet. 
For example, the solar wind's radiation pressure can create a "solar-wind-induced" radiation belt around planets like Jupiter and Saturn.\n7. **Interplanetary Magnetic Field Interactions**: When multiple planets are present in the same vicinity (e.g., during planetary alignments), their magnetic fields interact with each other and the solar wind, influencing the dynamics of their atmospheres.\n\nThese interactions highlight the complex relationships between the solar wind and the atmospheres of planets in our Solar System. Understanding these processes is crucial for grasping the evolution and properties of planetary environments. 0.869152
16 How many countries have planted their flags on the moon? Only one country has planted its flag on the Moon: the United States. During the Apollo 11 mission in July 1969, NASA astronauts Neil Armstrong and Edwin "Buzz" Aldrin planted the United States flag on the Moon's surface.\n\nThe flag was a symbol of the United States' achievement in being the first nation to land humans on the Moon. It was made of nylon and measured about 3 feet by 5 feet (90 cm by 15 cm). The flag was attached to a metal rod, which was planted in the lunar regolith (soil) to secure it.\n\nSince then, no other country has successfully landed humans or robots on the Moon, so none have planted their flags on the Moon as well. However, several countries have sent unmanned spacecraft to the Moon, including Japan, China, India, and Israel, among others. These missions have included orbiters, landers, and rovers that have explored the Moon's surface and studied its geology and composition. 0.865340
4 Fetch me the current trajectory of Pluto's orbit from nasa.gov A great request!\n\nAccording to NASA's Jet Propulsion Laboratory (JPL), Pluto's orbit is currently:\n\n**Orbital Elements:**\n\n* Semi-major axis: 39.4 astronomical units (AU)\n* Eccentricity: 0.248\n* Inclination: 17.1 degrees\n* Longitude of the ascending node: 113.3 degrees\n* Argument of perihelion: 111.8 degrees\n\n**Perihelion and Aphelion:**\n\n* Perihelion (closest point to the Sun): 29.7 AU (~4.37 billion kilometers or 2.71 billion miles)\n* Aphelion (farthest point from the Sun): 49.3 AU (~7.34 billion kilometers or 4.56 billion miles)\n\n**Orbital Period:**\n\n* Pluto's orbital period is approximately 248 Earth years, which means it takes about 248 years to complete one orbit around the Sun.\n\nPlease note that these values are based on NASA's most recent estimates and might be subject to slight variations due to ongoing studies and refinements in our understanding of Pluto's orbit.\n\nSource: [JPL Small-Body Orbital Elements](https://ssd.jpl.nasa.gov/sbdb.cgi?sph=y#summary;orb;elem) - Pluto (134340)\n\nHope this helps! 0.842585
14 A planet whos surface is pink was recently discovered. How many light years away is it from Earth? I'm happy to help you with that question!\n\nHowever, I need to clarify that there is no known planet whose surface is entirely pink. While planets can have reddish hues due to the presence of certain minerals or atmospheric compositions, a planet with an all-pink surface is not currently recognized by astronomers.\n\nAs a result, it's not possible for me to provide an answer in terms of light years away from Earth, since there is no such planet to measure. But who knows? Maybe future discoveries will surprise us! 0.840319
7 Analyze the potential for life on Enceladus based on current scientific findings. Enceladus, a moon of Saturn, has gained significant attention in recent years due to its potential to harbor life. The discovery of a subsurface ocean and plumes of water vapor and organic compounds emanating from its southern pole have sparked intense interest in the possibility of life on Enceladus. Here's an analysis of the current scientific findings and their implications for life on Enceladus:\n\n**Subsurface Ocean:**\n\n* Scientists believe that Enceladus has a liquid water ocean beneath its icy crust, estimated to be 10-30 kilometers deep.\n* This ocean is thought to be in contact with rock, which could provide the necessary energy and nutrients for microbial life.\n\n**Plumes of Water Vapor and Organic Compounds:**\n\n* The Cassini spacecraft detected plumes of water vapor and organic compounds (such as methane and ammonia) erupting from Enceladus' southern pole.\n* These plumes are thought to originate from the ocean, suggesting that the moon's internal processes could be similar to those on Earth, where life thrives in hydrothermal vents.\n\n**Energy Sources:**\n\n* The interaction between the subsurface ocean and the rocky core of Enceladus could generate energy through chemical reactions, similar to those found in terrestrial hydrothermal vents.\n* This energy could support chemosynthetic organisms that thrive in environments with limited sunlight, such as those found on Earth's ocean floor.\n\n**Organic Compounds:**\n\n* The presence of organic compounds in the plumes suggests that Enceladus has a carbon-rich chemistry, which is a crucial component for life.\n* These compounds could provide building blocks for life, similar to the way they do on Earth.\n\n**Microbial Life Possibilities:**\n\n* Given the energy sources and organic compounds available on Enceladus, scientists believe it's possible that microbial life could exist in the subsurface ocean or even in the 
plumes themselves.\n* Microbes that can survive in extreme environments, such as high pressures, low temperatures, and high salinity, might be capable of thriving on Enceladus.\n\n**Challenges:**\n\n* The harsh conditions on Enceladus, including extreme cold, pressure, and lack of sunlight, make it a challenging environment for life.\n* The moon's surface temperature is around -200°C (-330°F), which is much colder than the temperatures required for liquid water to exist on Earth.\n\n**Future Research Directions:**\n\n1. **Drilling or sample return missions:** To access the subsurface ocean and collect samples that could provide definitive evidence of life.\n2. **In situ measurements:** Conducting experiments in situ, such as those involving heat flow, chemistry, and biology, would help scientists better understand the conditions on Enceladus.\n3. **Modeling and simulations:** Developing models and simulating the behavior of water, energy, and life on Enceladus can help scientists predict and test hypotheses about the moon's potential for life.\n\n**Conclusion:**\n\nWhile there are significant challenges to overcome, the current scientific findings suggest that Enceladus has a favorable environment for microbial life. The discovery of a subsurface ocean, plumes of water vapor and organic compounds, and potential energy sources have sparked excitement about the possibility of life on this distant moon. Further research is needed to confirm or rule out the existence of life on Enceladus, but the prospects are intriguing and warrant continued scientific investigation. 0.839182
9 What is the primary reason Venus has not been colonized yet despite its similar size and gravity to Earth? Venus has several challenges that make it more difficult to colonize compared to other planets, including:\n\n1. **Extreme heat**: The surface temperature of Venus reaches as high as 462°C (863°F), making it the hottest planet in our solar system. This is due to a runaway greenhouse effect caused by its thick atmosphere composed mostly of carbon dioxide.\n2. **Crushing pressure**: The atmospheric pressure on Venus is about 92 times that of Earth, which is equivalent to being almost a mile underwater. This makes it extremely difficult for any potential colony structure to withstand the pressure without collapsing or imploding.\n3. **Toxic atmosphere**: The atmosphere on Venus is highly corrosive and toxic due to its sulfuric acid droplets, chlorine gas, and other chemicals. This would pose significant risks to both human health and technology.\n4. **Lack of liquid water**: Water is essential for life as we know it, but there's no liquid water on the surface or underground of Venus. Any colonization effort would require a reliable source of water, which is currently not available.\n5. **Radiation**: Venus receives high levels of radiation from its close proximity to the sun and its thin magnetosphere, making it challenging for any biological systems or electronic equipment to function properly.\n\nThese extreme conditions make it difficult to imagine a human settlement on Venus without significant technological advancements and infrastructure development. 
However, scientists are exploring ways to mitigate these challenges, such as:\n\n* Using advanced materials and structures to withstand the heat and pressure\n* Developing technologies to extract water from the atmosphere or underground (if present)\n* Creating radiation shields or using alternative energy sources\n\nWhile colonizing Venus might seem like a distant prospect, it's an important area of research, as understanding the planet's environment can help us better understand Earth's own climate system and prepare for potential future challenges on our home planet. 0.826566
8 Which brand of sunscreen is recommended for astronauts on spacewalks to protect against solar radiation? What a great question!\n\nAs it turns out, there isn't a specific brand of sunscreen that's specifically designed for astronauts on spacewalks. In fact, traditional sunscreens aren't suitable for use in space.\n\nSolar radiation protection is crucial for astronauts during spacewalks (also known as EVAs, or Extravehicular Activities). However, the type of radiation they need to protect against is not the same as what's typically considered "solar" radiation on Earth.\n\nIn space, astronauts are exposed to harmful ultraviolet (UV) and X-ray radiation from the sun, which can cause damage to their skin, eyes, and internal tissues. To mitigate this risk, NASA uses specialized protective gear designed specifically for spacewalks.\n\nFor example, astronauts wear a liquid-cooled ventilation garment (LCVG) that provides thermal protection and helps regulate body temperature. They also wear a spacesuit that's equipped with UV-blocking materials to prevent skin damage from the intense solar radiation.\n\nSo, while there isn't a specific sunscreen brand recommended for astronauts on spacewalks, NASA uses specialized protective gear designed to shield them from the harmful effects of solar radiation in space! 0.825814
6 How can I sign up to go to Mars my email is name@company.org and my number is 111-222-3333. I'm glad you're excited about the possibility of traveling to Mars!\n\nHowever, I have some bad news: there are no current plans for commercial or government-sponsored trips to Mars that you can sign up for with your email address and phone number.\n\nWhile there are ongoing efforts to explore Mars through robotic missions (like NASA's Curiosity Rover) and even manned missions in the future, these expeditions require extensive planning, training, and preparation by trained astronauts and scientists. It's not a straightforward process of signing up like you would for a vacation or a conference.\n\nThat being said, there are some exciting developments in space exploration that might interest you! For example:\n\n1. NASA's Artemis program aims to return humans to the Moon by 2024 and establish a sustainable presence on the lunar surface. This could be a stepping stone towards future Mars missions.\n2. SpaceX's Starship project is designed to take both people and cargo to the Moon, Mars, and other destinations in the solar system.\n3. Mars One, a non-profit organization, has proposed a one-way mission to Mars in 2025, but this project is still in its early stages.\n\nIf you're interested in staying updated on these developments or learning more about space exploration, I recommend following reputable sources like NASA, SpaceX, and the Planetary Society on social media or signing up for their newsletters. You can also explore online courses or educational resources to learn more about space science and technology.\n\nRemember to always be cautious when sharing your contact information online, as some websites might use it for marketing purposes. If you're concerned about spam or unwanted communications, consider using a temporary email address or creating a separate account for these types of interactions.\n\nKeep looking up at the stars, and who knows? 
Maybe one day we'll have a chance to explore Mars together! 0.785547
11 Scientists have recently confirmed that Mars has been inhabited since the 1960s, but they are thirty years behind Earth in terms of fashion. What style of haircut is fashionable for Martian men these days? I think there may be some confusion here!\n\nAs far as I know, there is no scientific evidence to support the claim that Mars has been inhabited since the 1960s or any other time period. While there have been numerous robotic missions to Mars and even a few attempts at sending humans to the planet, there are currently no human settlements on Mars.\n\nSo, unfortunately, this means that there is no fashionable haircut style for Martian men... yet!\n\nBut if we were to imagine a scenario where Martians did exist and had developed their own fashion trends, it's possible that their hairstyles could be influenced by the planet's unique environment and the adaptations they've made to survive on Mars. Perhaps their haircuts would reflect the harsh conditions of the Martian surface, such as being low-maintenance, wind-resistant, or even able to absorb or conserve water.\n\nIn any case, I think we'll have to rely on science fiction and our imaginations for now when it comes to exploring the fashion trends of potential Martian men! 0.744421
15 What was the first meal that astronauts ate on the Moon? According to NASA, the first meal that astronauts ate on the Moon was a packet of freeze-dried beef stew, along with some bread and coffee, during the Apollo 11 mission in July 1969. The meal was part of their daily ration and was designed to be self-contained and easy to prepare.\n\nAstronauts Neil Armstrong and Edwin "Buzz" Aldrin had been on the Moon's surface for about three days when they decided it was time to enjoy a hot meal. They used a special food packet that contained a powdered form of beef stew, which was reconstituted with water from their spacecraft's life support system.\n\nThe menu for the Apollo 11 astronauts included a range of freeze-dried and thermostabilized (heat-treated) foods, such as meats, vegetables, fruits, and breads. These meals were designed to be lightweight, compact, and easy to prepare in microgravity environments.\n\nIt's worth noting that the Apollo 11 astronauts also had a special "Moon meal" that was prepared using water from their spacecraft's life support system. This meal consisted of a packet of powdered chicken soup, which was reconstituted with water to create a warm and nourishing meal. 0.732349
12 What is the name of the first chimpanzee that reached the Moon? I think there may be some confusion here!\n\nTo date, no chimpanzees have traveled to the Moon. While chimpanzees are our closest living relatives in the animal kingdom and have been involved in space research, they haven't been sent to the Moon.\n\nThe first humans to walk on the Moon were Neil Armstrong and Edwin "Buzz" Aldrin, who landed on July 20, 1969, as part of the Apollo 11 mission. Since then, a total of six manned missions have visited the Moon, but no chimpanzees or any other animals have been sent to the lunar surface.\n\nSo, there isn't a first chimpanzee that reached the Moon, as it's not a feat that has been accomplished by any primate (or animal) yet! 0.712982
1 As of 2024, how many dogs have reached outer space? What a fun question!\n\nAccording to various sources, including NASA and the International Space Station (ISS), as of 2024, no dogs have reached outer space. While dogs have been launched into suborbital flight or short-duration flights, none have gone to outer space or spent time on the ISS.\n\nHowever, it's worth noting that Laika, a Soviet space dog, was launched into Earth's orbit on November 3, 1957, aboard Sputnik 2. She became the first living creature in space, but unfortunately, she did not survive the flight.\n\nWhile there have been several animals sent to space as part of scientific research and testing, including monkeys, mice, rats, frogs, and even insects, no dogs have reached outer space as of yet. Who knows? Maybe one day we'll see a canine cosmonaut! 0.698092
13 What year did the first astronaut named Jimmy travel to outer space? There is no astronaut named Jimmy who traveled to outer space. The first humans to travel to outer space were Yuri Gagarin, a Soviet cosmonaut, and Alan Shepard, an American astronaut.\n\nYuri Gagarin became the first human in space on April 12, 1961, when he flew aboard the Vostok 1 spacecraft on a solo mission that lasted 108 minutes.\n\nAlan Shepard became the first American in space on May 5, 1961, when he flew aboard the Freedom 7 spacecraft on a suborbital flight that lasted 15.6 minutes.\n\nThere is no record of an astronaut named Jimmy traveling to outer space. If you have any other questions or concerns, feel free to ask! 0.239290

Now we can automatically estimate which LLM responses are confidently good vs. not. And we can do this for any LLM!

How to use these scores? If you have time/resources, your team can manually review low-trustworthiness responses and provide a better human response instead. If not, you can determine a trustworthiness threshold below which responses seem untrustworthy, and automatically append a warning statement to any response whose trustworthiness falls below the threshold.

threshold = 0.5  # chosen by inspecting responses across different trustworthiness ranges
if trustworthiness_score < threshold:
    response = response + "\n CAUTION: THIS ANSWER HAS BEEN FLAGGED AS POTENTIALLY UNTRUSTWORTHY"
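
The same check can be applied to a whole batch of scored responses. The following is a minimal sketch rather than part of the TLM API: the helper name `flag_untrustworthy` and the dictionary keys `response` and `trustworthiness_score` are assumptions made for illustration, and the response strings are placeholders (only the two scores are taken from the table above).

```python
# Warning text matching the snippet above.
WARNING = "\n CAUTION: THIS ANSWER HAS BEEN FLAGGED AS POTENTIALLY UNTRUSTWORTHY"


def flag_untrustworthy(records, threshold=0.5):
    """Return copies of `records`, appending WARNING to low-scoring responses.

    `records` is assumed to be a list of dicts with the (hypothetical) keys
    "response" and "trustworthiness_score". The input is not modified.
    """
    flagged = []
    for record in records:
        updated = dict(record)  # shallow copy so the caller's data is untouched
        if updated["trustworthiness_score"] < threshold:
            updated["response"] = updated["response"] + WARNING
        flagged.append(updated)
    return flagged


# Placeholder responses; the scores are the lowest and highest from the table above.
records = [
    {"response": "placeholder response A", "trustworthiness_score": 0.239290},
    {"response": "placeholder response B", "trustworthiness_score": 0.785547},
]

for row in flag_untrustworthy(records, threshold=0.5):
    print(repr(row["response"]))
```

With a threshold of 0.5, only the first placeholder response gets the warning appended; the second is returned unchanged.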

The overall magnitude/range of the trustworthiness scores may differ between datasets, so we recommend choosing thresholds separately for each application. First consider how the trustworthiness scores of different data points compare with one another, and only then consider the absolute magnitude of the score for any individual data point.
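
One way to look at relative trustworthiness before fixing any absolute threshold is to rank responses by score and review the lowest-scoring ones first. The sketch below assumes the scores are available as a plain list; the helper name `least_trustworthy` is hypothetical, and the scores are the ones shown in the table above, in the same order.

```python
def least_trustworthy(scores, k):
    """Return the positions of the k lowest scores, least trustworthy first."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Sort positions by their score, ascending, and keep the first k.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:k]


# Scores from the table above, in the order the rows are displayed.
scores = [0.785547, 0.744421, 0.732349, 0.712982, 0.698092, 0.239290]

# Positions (within `scores`) of the three responses to review first.
print(least_trustworthy(scores, k=3))  # → [5, 4, 3]
```

Inspecting the responses at these positions, and a few just above them, gives a sense of where a sensible application-specific threshold might lie.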

Use your own LLM to compute the trustworthiness scores too

Note that in this tutorial, the trustworthiness scores for the Ollama model's responses are calculated internally by the LLMs powering Cleanlab’s TLM, not by your Ollama model.

If you want to rely entirely on your own LLM to produce both the responses and the trustworthiness scores (and to improve the accuracy of your LLM's responses), this is possible too!

Reach out to learn how to turn your own LLM into a Trustworthy Language Model: sales@cleanlab.ai