Reinforcement Learning Without a PhD: A Python Developer’s Journey

Jochen Luithardt

Wednesday 14:30 in Palladium

Reinforcement Learning (RL) has made headlines for beating humans at Go and StarCraft, and it’s already being used by companies like Google, Amazon, and Lyft to optimize real-world systems. But outside of big tech and research labs, RL is still rarely applied. Why? Because even though RL is powerful, it's also complex, resource-intensive, and hard to implement without the right tools.

In this talk, we explore what it really takes to bring RL into production—without a PhD, a research team, or unlimited infrastructure. I’ll share the story of how we applied RL to a real-world business problem: optimizing digital campaign management in a fast-changing environment. We faced all the classic challenges—limited data, no simulator, and no out-of-the-box tools that actually worked for our use case.

We’ll look at how we built a training environment from historical data, dealt with uncertainty using ensemble models, and iterated through a long cycle of trial, error, and learning. That experience eventually led us to create pi_optimal, an open-source toolkit designed to make RL more accessible to Python developers and data scientists.
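
To make the idea of a training environment built from historical data more concrete, here is a minimal sketch. It is not the pi_optimal API and not the system described in the talk. It assumes logged (state, action, next state, reward) records, swaps in bootstrapped linear models as a simple stand-in for the ensemble, and uses disagreement between ensemble members as a rough uncertainty signal. The names `fit_ensemble`, `LoggedDataEnv`, and `penalty` are hypothetical, and the data is synthetic.

```python
import numpy as np


def fit_ensemble(states, actions, next_states, rewards, n_models=5, seed=0):
    """Fit bootstrapped linear models: (state, action) -> (next_state, reward)."""
    rng = np.random.default_rng(seed)
    # Inputs: state, action, and a constant bias column.
    X = np.hstack([states, actions, np.ones((len(states), 1))])
    # Targets: next state followed by the reward.
    Y = np.hstack([next_states, rewards.reshape(-1, 1)])
    members = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap resample
        W, *_ = np.linalg.lstsq(X[idx], Y[idx], rcond=None)
        members.append(W)
    return np.stack(members)  # shape: (n_models, n_inputs, n_outputs)


class LoggedDataEnv:
    """Hypothetical environment that steps through a learned ensemble model."""

    def __init__(self, weights, start_states, penalty=1.0, seed=0):
        self.weights = weights
        self.start_states = start_states
        self.penalty = penalty  # assumed weight on the uncertainty penalty
        self.rng = np.random.default_rng(seed)
        self.state = None

    def reset(self):
        # Start each episode from a state that was actually observed.
        i = self.rng.integers(0, len(self.start_states))
        self.state = self.start_states[i].copy()
        return self.state

    def step(self, action):
        x = np.concatenate([self.state, np.atleast_1d(action), [1.0]])
        preds = np.einsum("i,mio->mo", x, self.weights)  # one row per member
        mean = preds.mean(axis=0)
        # Disagreement between members serves as the uncertainty estimate.
        uncertainty = float(preds.std(axis=0).mean())
        self.state = mean[:-1]
        # Penalize the reward where the ensemble is unsure.
        reward = float(mean[-1] - self.penalty * uncertainty)
        return self.state, reward, uncertainty


# Synthetic "historical" data, for illustration only.
rng = np.random.default_rng(42)
states = rng.normal(size=(200, 2))
actions = rng.uniform(0.0, 1.0, size=(200, 1))
next_states = 0.9 * states + 0.1 * actions + rng.normal(scale=0.05, size=(200, 2))
rewards = states[:, 0] + actions[:, 0] + rng.normal(scale=0.05, size=200)

weights = fit_ensemble(states, actions, next_states, rewards)
env = LoggedDataEnv(weights, states)

env.reset()
next_state, reward, uncertainty = env.step([0.5])

# Far outside the logged data, the ensemble members disagree more.
env.state = np.array([50.0, 50.0])
_, _, far_uncertainty = env.step([0.5])
```

An agent trained in such an environment is discouraged from exploiting regions the historical data never covered, because the penalty lowers the reward wherever the models disagree. Real systems would typically use more expressive models than linear regression.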

You’ll walk away with a clear understanding of:

  • Why RL is powerful, but rarely applied in practice
  • What makes real-world RL so challenging
  • How we got a working RL system off the ground without a PhD in RL
  • How pi_optimal helps lower the barrier to entry
  • How you can get started with RL, either through theory or hands-on practice

Whether you're RL-curious or looking to apply it in your own projects, this talk offers practical insights and a live demo to help you take your first steps.

Jochen Luithardt

I'm the Co-Founder of pi_optimal, where we're working to democratize reinforcement learning and make it usable for real-world decision-making. My passion lies in building AI systems that don't just work in theory, but actually solve meaningful problems in practice.

Before that, I was Lead Data Scientist at Stellwerk3 GmbH, where I led the development of a model-based reinforcement learning project for campaign control. I also had the chance to represent the company at Cyber Valley Incubator events and build a strong, collaborative data team.

My academic journey brought me to the Max Planck Institute for Intelligent Systems, where I focused on challenges in autonomous learning — from sparse rewards in model-free RL to structured world models and graph networks in model-based approaches. Earlier on, I also worked in digital advertising technology at Gruner + Jahr, developing deep learning models for ad click prediction.

Across all these experiences, one thing has stayed the same: I love taking complex machine learning concepts and turning them into impactful, real-world applications.