
In 1976, the British statistician George Box wrote that “all models are wrong, but some are useful.” Nowhere is this more apparent than in machine learning, where models are not just wrong but often unreliable despite their immense usefulness. Look toward other disciplines such as civil engineering or traditional software engineering, and you find engineers working in reliability domains of 99.9999999%, where the likelihood of their systems failing is smaller than the chance of a large meteorite hitting Earth tomorrow.

Machine learning models and systems, on the other hand, are often about as reliable as humans, which is to say not very reliable at all. A machine learning engineer is lucky to get a model with 99% accuracy, which is equivalent to a system that fails or hallucinates 1 out of every 100 times. Imagine a world in which 1 out of every 100 bridges collapses or 1 out of every 100 transactions at your bank is lost to the ether. Part of the challenge of being a machine learning engineer is figuring out how to integrate these useful but unreliable systems into a society built on reliable infrastructure. In this blog, I will go over the top lessons I learned dealing with unreliable models as a machine learning engineer.

1. Machine learning requires human learning

Since machine learning and AI systems can and will fail, users must cultivate a mindset of skepticism and verification when using these tools. A clear example of what happens when this mindset is missing is the ChatGPT lawyer who infamously cited court cases that were hallucinated by the popular large language model. The ChatGPT lawyer and the other horror stories that followed the release of ChatGPT highlight the dangers that unreliable models can bring to society, namely the spread of misinformation and the propagation of erroneous decisions that can fester over time. The only way for us as a society to become immune to such dangers is to (1) accept that machine learning models can and will fail and (2) train a mindset of skepticism and verification to mitigate and stop the propagation of mistakes. Thus, as machine learning and AI tools become increasingly ubiquitous in our lives, we must also train ourselves to adapt to the failures these tools produce.
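To make that mindset concrete, here is a minimal Python sketch of a “verify before trusting” step for model-cited sources. The TRUSTED_SOURCE lookup and the case names are purely illustrative assumptions; a real system would query an authoritative database (court records, PubMed, internal docs) instead.

# A "verify before trusting" filter for model-generated citations. The lookup
# below is a stand-in for a query against an authoritative source.

TRUSTED_SOURCE = {
    "Smith v. Jones, 123 F.3d 456": {"court": "9th Cir.", "year": 1997},
}

def verify_citations(citations):
    """Split model output into citations we could confirm and ones we could not."""
    verified, unverified = [], []
    for citation in citations:
        (verified if citation in TRUSTED_SOURCE else unverified).append(citation)
    return verified, unverified

llm_citations = ["Smith v. Jones, 123 F.3d 456", "Made-up v. Case, 1 U.S. 1"]
ok, suspect = verify_citations(llm_citations)
print("confirmed:", ok)
print("possible hallucinations:", suspect)

Nothing about the filter is clever; the point is that model output never flows downstream without passing through a check a human (or a trusted source) controls.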

2. Humans can make models more reliable and vice versa

Both AI and humans are unreliable systems. Part of the job of a machine learning engineer is to ensure that the unreliability of humans and the unreliability of AI don’t compound into a system that is even less reliable. For example, if a model is trained or updated on data labelled by biased individuals, then the model will propagate and spread those biases in its own decision-making process. To combat these issues, machine learning engineers must design systems where humans and AI work symbiotically to reduce each other’s unreliability and biases. This includes having diversity and representation in the data-labelling process (especially when it comes to medical studies) and writing code “seams” where humans can intervene in AI systems and correct wrong or biased decisions, as in the sketch below.
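Here is one way such a seam might look in Python. The confidence threshold and the reviewer hook are illustrative assumptions, not a fixed recipe: the model’s answer is only accepted automatically when it is confident, and everything else is routed to a person who can confirm or override it.

# Sketch of a human-intervention "seam": low-confidence predictions are
# deferred to a human reviewer instead of being acted on automatically.

from dataclasses import dataclass

REVIEW_THRESHOLD = 0.90  # tune per application; higher means more human review

@dataclass
class Decision:
    label: str
    source: str  # "model" or "human"

def decide(label, confidence, ask_human):
    """Accept the model's answer only when confident; otherwise defer to a human."""
    if confidence >= REVIEW_THRESHOLD:
        return Decision(label=label, source="model")
    # The seam: a person sees the model's suggestion and can correct it.
    return Decision(label=ask_human(label), source="human")

# Usage with a stand-in reviewer who overturns the model's suggestion.
print(decide("approve", 0.72, ask_human=lambda suggested: "deny"))
# Decision(label='deny', source='human')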

We can also extend this idea to the human side of the equation. One major achievement of AI systems is in radiology, where models have been shown to outperform humans at detecting cancerous tumors. The best-performing systems, however, are not the ones that replace radiologists but the ones that work alongside radiologists to improve their performance. In this setup, the AI system can surface tumors the radiologist missed, and the radiologist can correct the AI system when it makes a mistake. This symbiotic relationship between AI and humans is the future of AI systems and the simplest way to make them reliable enough to be used in society.
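A small sketch of this “second reader” idea, with hypothetical field names and an illustrative threshold: the model never overrides the radiologist, it only queues up the studies where the two disagree so a human takes another look.

# Sketch of an "AI as second reader" workflow: flag studies the radiologist
# read as negative but the model scores as suspicious.

SECOND_LOOK_THRESHOLD = 0.80

def second_look_queue(studies):
    """Return study IDs where the human read and the model score disagree."""
    return [
        s["study_id"]
        for s in studies
        if s["radiologist_finding"] == "negative"
        and s["model_tumor_score"] >= SECOND_LOOK_THRESHOLD
    ]

studies = [
    {"study_id": "A-101", "radiologist_finding": "negative", "model_tumor_score": 0.91},
    {"study_id": "A-102", "radiologist_finding": "negative", "model_tumor_score": 0.12},
    {"study_id": "A-103", "radiologist_finding": "positive", "model_tumor_score": 0.95},
]
print(second_look_queue(studies))  # ['A-101']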

3. Continual learning and adaptation are key

As machine learning engineers, we know that failures are inherent in any AI system, but we also want our models to fail in consistent, predictable ways over time. For example, a large language model trained once and not adapted for years may make more mistakes and surface more bias as society coins new terms and gains new perspectives. Think about how different the world was in the early 2010s compared to now, and it becomes clear that AI models cannot be static systems that we build once and use for years without significant adaptation. AI must be fluid, because it often models and reflects the most dynamic pieces of our world, be it language, environment, art, or culture. Continual learning and adaptation are key to ensuring that AI systems remain reliable and useful into the future.
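One lightweight way to notice when the world has moved past your model is a drift check that gates a retraining job. The sketch below compares the vocabulary in recent traffic against the training-time vocabulary; the whitespace tokenizer and the 5% threshold are illustrative assumptions, and a production system would likely track richer statistics.

# Sketch of a drift check: flag retraining when too many incoming tokens
# were never seen in the training data.

DRIFT_THRESHOLD = 0.05

def unseen_token_rate(training_texts, recent_texts):
    """Fraction of tokens in recent traffic that never appeared in training."""
    train_vocab = {tok for text in training_texts for tok in text.lower().split()}
    recent = [tok for text in recent_texts for tok in text.lower().split()]
    if not recent:
        return 0.0
    return sum(tok not in train_vocab for tok in recent) / len(recent)

rate = unseen_token_rate(
    training_texts=["the model approves the loan", "the claim was denied"],
    recent_texts=["the chatbot ghosted my claim and gave me no warning"],
)
if rate > DRIFT_THRESHOLD:
    print(f"{rate:.0%} unseen tokens -- consider retraining or adaptation")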