The Foundation Model Revolution for Tabular Data

Noah Hollmann, Frank Hutter

Friday 15:35 in Titanium3

TabPFN shows how foundation model concepts can advance tabular data analysis in Python. First published as research at ICLR 2023, it has seen strong community adoption, with 1,200+ GitHub stars and 100,000+ downloads. Our upcoming January 2025 release introduces major improvements in speed, scale, and capabilities that we're excited to preview at PyCon.

Detailed Outline:

  1. Context & Evolution (5 min)
  • The challenge of applying deep learning to tabular data
  • Learning from the foundation model revolution in text and vision
  • Key improvements from V0 to V1 based on community feedback
  • Real-world examples where TabPFN shines (and where it doesn't)
  2. Technical Insights (8 min)
  • How we adapted transformers for tabular data
  • Making in-context learning work for structured data
  • Performance characteristics and resource requirements
  • Understanding current limitations and constraints
  3. Live Coding & Integration (12 min)
  • Getting started with TabPFN in 3 lines of code (see the sketch after this outline)
  • Handling real-world data challenges:
    • Missing values and mixed data types
    • Built-in uncertainty estimation
    • Working with similar tasks efficiently
  • Integration with pandas, scikit-learn and the Python ecosystem
  4. Practical Applications (5 min)
  • When to choose TabPFN vs traditional methods
  • Resource requirements and scalability limits
  • What's next for TabPFN
  • Q&A
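
As a preview of the live-coding segment, here is a minimal sketch of the kind of workflow the demo walks through. It assumes the tabpfn package exposes a scikit-learn-compatible TabPFNClassifier with the usual fit/predict/predict_proba methods; exact constructor arguments and supported data sizes may differ between releases, and the dataset used here stands in for whatever tables the demo uses.

```python
# Minimal sketch of a TabPFN workflow (assumes `pip install tabpfn`);
# constructor arguments may differ between releases.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from tabpfn import TabPFNClassifier

# Load a small tabular dataset; TabPFN targets modest-sized tables.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Fitting" stores the training data as context for in-context prediction;
# there is no gradient-based training step here.
clf = TabPFNClassifier()
clf.fit(X_train, y_train)

# Point predictions plus class probabilities for uncertainty estimation.
y_pred = clf.predict(X_test)
y_proba = clf.predict_proba(X_test)

print("accuracy:", accuracy_score(y_test, y_pred))
print("max class probability, first test row:", np.max(y_proba[0]))
```

The integration part of the session builds on the same interface, covering how the classifier fits alongside pandas DataFrames and scikit-learn pipelines in an existing workflow.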

Key Takeaways:

  • Practical understanding of TabPFN's capabilities and limitations
  • Hands-on experience integrating with Python data science workflows
  • Best practices for working with foundation models on tabular data
  • Insight into emerging approaches for structured data analysis

Noah Hollmann

Frank Hutter

Frank is a Hector-Endowed Fellow and PI at the ELLIS Institute Tübingen and has been a full professor of Machine Learning at the University of Freiburg (Germany) since 2016. Previously, he was an Emmy Noether Research Group Lead at the University of Freiburg from 2013, after completing a PhD (2004-2009) and postdoc (2009-2013) at the University of British Columbia (UBC) in Canada. He received the 2010 CAIAC doctoral dissertation award for the best thesis in AI in Canada, as well as several best paper awards and prizes in international ML competitions. He is a Fellow of ELLIS and EurAI, Director of the ELLIS unit Freiburg, and the recipient of 3 ERC grants. Frank is best known for his research on automated machine learning (AutoML), including neural architecture search, efficient hyperparameter optimization, and meta-learning. He co-authored the first book on AutoML and co-created the prominent AutoML tools Auto-WEKA, Auto-sklearn, and Auto-PyTorch, won the first two AutoML challenges with his team, is co-teaching the first MOOC on AutoML, co-organized 15 AutoML-related workshops at ICML, NeurIPS and ICLR, and founded the AutoML conference as general chair in 2022. In recent years, his focus has been on the intersection of foundation models and AutoML, prominently including the first foundation model for tabular data, TabPFN.