Among popular open-source MLOps tools, Kubeflow stands out as a Kubernetes-native platform designed to support the entire ML lifecycle, from data preprocessing to model training, deployment, and retraining. Its modular structure enables the integration of a wide range of tools, making it a highly versatile framework for building scalable and reproducible ML workflows. Despite this, most existing resources focus on individual components rather than demonstrating how these can be orchestrated into a seamless, end-to-end pipeline.
In this talk, we present a practical case study that highlights the potential of Kubeflow in a real-world application. Specifically, we showcase how an automated ML pipeline for anomaly detection in International Space Station (ISS) telemetry data can be built and deployed using Kubeflow and other open-source MLOps tools. The dataset, originating from the Columbus module of the ISS, introduces unique challenges due to its complexity and high-dimensional nature, providing an excellent testbed for MLOps workflows.
Our workflow is built entirely in Python, leveraging Kubeflow’s Python SDK to orchestrate every stage of the pipeline. This eliminates the need for manual interaction with Kubernetes or container configurations, making the process accessible to ML engineers and data scientists without extensive DevOps expertise.
To ensure reproducibility and accessibility, the entire pipeline, including configurations and code, is publicly available in our GitHub repository here. Attendees will be able to replicate the workflow, adapt it to their own use cases, or extend it with additional features.
This session is designed for data scientists, ML engineers, and Python enthusiasts who want to simplify the development of scalable ML pipelines. Whether you're new to Kubernetes or looking to streamline your MLOps workflows, this talk will provide actionable insights and tools to help you succeed.