Interactive end-to-end root-cause analysis with explainable AI in a Python Shiny App

Simone Lederer, Julius Möller

Wednesday 11:45 in Platinum3

Problem Statement: Data scientists' daily work is dominated by a repetitive, time-consuming cycle of exploratory data analysis, preprocessing, model training, and feature identification. Time spent on these repetitive tasks detracts from critical work, and key insights into the data are ultimately missed. We enable data scientists to focus on what matters.

Solution: We streamline the data analysis process so that users can explore a dataset efficiently and uncover critical insights without spending time on coding. Users can seamlessly conduct data preprocessing, interactive exploratory analysis, and on-demand model training, evaluation, and interpretation, reducing the time to understand a dataset to under an hour.

Demonstrator: Our pure Python application features a reactive dashboard. Users can upload and manipulate data, create interactive visualizations, and perform on-demand model training and interpretation, while tracking results in MLflow. We demonstrate how to quickly deliver insights and identify root causes.

Architecture/Technical Implementation: Our application is built entirely in Python, using the Shiny framework for the reactive dashboard. The backend uses Plotly, scikit-learn, CatBoost, SHAP, and MLflow. We highlight the core functionality and development choices, with emphasis on data preprocessing, model training, evaluation, and explainable AI features.
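
To illustrate the train-then-explain step behind the root-cause workflow, here is a minimal sketch. It is not the application's code. To keep it dependency-light it substitutes scikit-learn's GradientBoostingClassifier for CatBoost and permutation importance for SHAP values; the synthetic data, the feature names (pressure, temperature, humidity), and the helper rank_root_causes are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def rank_root_causes(X, y, feature_names, seed=0):
    """Train a classifier and rank features by permutation importance.

    Hypothetical helper: a stand-in for the CatBoost + SHAP step,
    not the implementation used in the talk.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    model = GradientBoostingClassifier(random_state=seed)
    model.fit(X_train, y_train)

    # Shuffle one feature at a time on held-out data and measure
    # how much the accuracy drops; larger drop = more influential.
    result = permutation_importance(
        model, X_test, y_test, n_repeats=10, random_state=seed
    )
    order = np.argsort(result.importances_mean)[::-1]
    ranking = [
        (feature_names[i], float(result.importances_mean[i])) for i in order
    ]
    return ranking, model.score(X_test, y_test)


# Synthetic data (assumption): only "temperature" drives the defect label.
rng = np.random.default_rng(0)
feature_names = ["pressure", "temperature", "humidity"]
X = rng.normal(size=(600, 3))
y = (X[:, 1] > 0.5).astype(int)

ranking, accuracy = rank_root_causes(X, y, feature_names)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
print(f"held-out accuracy: {accuracy:.3f}")
```

In a dashboard, a function like this would typically be called on demand from a reactive handler, with the ranking passed on to a plot and the metrics logged to an experiment tracker.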

Simone Lederer

Trained as a mathematician, I quickly delved into the world of machine learning and computational statistics to learn more about cancer dynamics in molecular biology and patient data. I currently work as a Machine Learning Engineer in the domains of Med-Tech, optics, and semiconductors at Carl Zeiss AG.

Julius Möller