The Foundation Model Revolution for Tabular Data

Noah Hollmann, Frank Hutter

Friday 10:55 in Platinum3

TabPFN shows how foundation model concepts can advance tabular data analysis in Python. Published in Nature in January 2025, it has found strong community adoption, with more than 3,000 GitHub stars and over 1,000,000 downloads.
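To give a flavour of the workflow discussed in the talk, below is a minimal sketch of TabPFN's scikit-learn-style interface using the open-source tabpfn package; the dataset, split, and metric are illustrative assumptions rather than material from the talk itself.

# Minimal sketch (assumptions: the `tabpfn` package is installed and exposes
# the scikit-learn-style TabPFNClassifier; dataset and metric are illustrative
# choices, not taken from the talk).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from tabpfn import TabPFNClassifier

# Small tabular classification task (TabPFN targets small-to-medium tables).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# No task-specific training loop: the pretrained model performs in-context
# learning on the provided training rows at prediction time.
clf = TabPFNClassifier()
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))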

Detailed Outline:

  1. Motivation
  • Why tabular data: examples of tabular prediction tasks and time series forecasting
  • Why foundation models for tabular data
  • Learning from the foundation model revolution in text and vision
  2. Technical Insights
  • How we adapted transformers for tabular data
  • Making in-context learning work for structured data
  • Performance characteristics and resource requirements
  • How to apply TabPFN to time series
  3. Practical Applications
  • When to choose TabPFN vs. traditional methods
  • Resource requirements and scalability limits
  • What's next for TabPFN
  4. Colab Demo
  • Q&A

Key Takeaways:

  • Practical understanding of TabPFN's capabilities and limitations
  • Hands-on experience integrating TabPFN with Python data science workflows
  • Best practices for working with foundation models on tabular data
  • Insight into emerging approaches for structured data analysis

Noah Hollmann

Frank Hutter

Frank is a Hector-Endowed Fellow and PI at the ELLIS Institute Tübingen and has been a full professor for Machine Learning at the University of Freiburg (Germany) since 2016. Previously, he was an Emmy Noether Research Group Lead at the University of Freiburg, starting in 2013. Before that, he did a PhD (2004-2009) and postdoc (2009-2013) at the University of British Columbia (UBC) in Canada. He received the 2010 CAIAC doctoral dissertation award for the best thesis in AI in Canada, as well as several best paper awards and prizes in international ML competitions. He is a Fellow of ELLIS and EurAI, Director of the ELLIS unit Freiburg, and the recipient of 3 ERC grants. Frank is best known for his research on automated machine learning (AutoML), including neural architecture search, efficient hyperparameter optimization, and meta-learning. He co-authored the first book on AutoML and the prominent AutoML tools Auto-WEKA, Auto-sklearn and Auto-PyTorch, won the first two AutoML challenges with his team, co-teaches the first MOOC on AutoML, co-organized 15 AutoML-related workshops at ICML, NeurIPS and ICLR, and founded the AutoML conference as general chair in 2022. In recent years, his focus has been on the intersection of foundation models and AutoML, prominently including the first foundation model for tabular data, TabPFN.