When deploying machine learning models in the real world, especially in domains like healthcare, robotics, or natural language processing, the stakes are high. It’s not enough to train a model, evaluate its accuracy, and call it a day. How confident the model is, how reliable its predictions are, and how to act on those predictions are critical questions that are often overlooked. This talk takes you beyond conventional metrics and into the world of uncertainty quantification and probability calibration, with conformal prediction as a rigorous, practical tool for both.
We’ll start the presentation by exploring the fundamental need for uncertainty in AI systems: why it matters, how it’s quantified, and how it can be used to make informed decisions. From there, we’ll introduce conformal prediction, a mathematically rigorous yet practical framework that provides guarantees on prediction reliability while remaining model-agnostic. Core concepts such as probability calibration and uncertainty quantification will be highlighted as key parts of the modelling process, establishing their importance in the domain.
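To make this concrete ahead of the talk, here is a minimal sketch of split conformal prediction for classification, written in plain NumPy; the function and variable names are illustrative, not taken from any particular library. Given a held-out calibration set, it builds prediction sets that, for exchangeable data, contain the true label with probability at least 1 − alpha, whatever the underlying model:

```python
import numpy as np

def split_conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal prediction sets for a K-class classifier.

    cal_probs:  (n, K) predicted probabilities on a held-out calibration set
    cal_labels: (n,)   true labels for that set
    test_probs: (m, K) predicted probabilities for new inputs
    Returns an (m, K) boolean mask; True means the class is in the set.
    """
    n = len(cal_labels)
    # Nonconformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile level ceil((n+1)(1-alpha)) / n.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    qhat = np.quantile(scores, q_level, method="higher")
    # Include every class whose score falls at or below the threshold.
    return (1.0 - test_probs) <= qhat
```

The only model-specific input is the predicted probabilities, which is exactly what makes the framework model-agnostic.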
The session will also feature real-world examples and use cases such as:
Healthcare: Predict disease likelihood with quantifiable confidence, such as identifying risks in MRI scans.
Robotics: Navigate dynamic environments safely using calibrated vision-based models.
Natural Language Processing: Improve outputs of large language models with uncertainty-aware predictions.
Finally, we’ll showcase the TorchCP toolbox, a GPU-accelerated library for integrating conformal prediction into deep learning pipelines, an area that attracts plenty of hype yet too often overlooks tools like these. Through a live demonstration, you’ll see how to implement these methods step by step, empowering you to build trustworthy AI systems that go beyond accuracy.
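As a preview of what the demonstration covers, the sketch below shows roughly how a conformal step bolts onto an already-trained PyTorch classifier. It is written in plain PyTorch rather than the TorchCP API itself, and every name in it is illustrative:

```python
import math
import torch

@torch.no_grad()
def calibrate_threshold(model, cal_loader, alpha=0.1):
    """Compute the split-conformal threshold on a held-out calibration set."""
    model.eval()
    scores = []
    for x, y in cal_loader:
        probs = model(x).softmax(dim=-1)
        idx = torch.arange(len(y), device=probs.device)
        # Nonconformity score: 1 - probability assigned to the true class.
        scores.append(1.0 - probs[idx, y.to(probs.device)])
    scores = torch.cat(scores)
    n = scores.numel()
    # Finite-sample-corrected quantile level ceil((n+1)(1-alpha)) / n.
    q_level = min(math.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return torch.quantile(scores, q_level, interpolation="higher")

@torch.no_grad()
def predict_sets(model, x, qhat):
    """Per input, include every class whose score is within the threshold."""
    probs = model(x).softmax(dim=-1)
    return (1.0 - probs) <= qhat  # boolean mask over classes
```

TorchCP packages this kind of workflow behind a GPU-accelerated API, which is what the live demonstration walks through.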
Attendees will leave with:
A solid understanding of uncertainty quantification, probability calibration and their importance.
Practical knowledge of conformal prediction and how to implement it.
A new perspective on AI reliability and decision-making in critical domains.
Whether you're an ML researcher, data scientist, or practitioner deploying AI models in critical environments, this session will equip you with the right tools and philosophy to create AI systems that are not only accurate but also reliable and robust.