3 Ways to Speed up Your Regression Modeling in Python

Alexander Fischer

Friday 14:55 in Titanium3

This talk presents three ways to make regression estimation in Python run faster.

First, we cover sparse solvers and show how to run regressions on sparse matrices with the scikit-learn and fastreg libraries.
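To illustrate the idea of a sparse solver, here is a minimal sketch on simulated data. It is an assumption-laden example, not the scikit-learn or fastreg API: it calls SciPy's `lsqr` directly, and all variable names and the data-generating process are made up for illustration. The group dummies are stored as a sparse matrix, so the mostly-zero design matrix is never materialized densely.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

# Simulated data (hypothetical): one continuous regressor plus 500 group effects
rng = np.random.default_rng(0)
n, n_groups = 10_000, 500
group = rng.integers(0, n_groups, size=n)
x = rng.normal(size=n)
alpha = rng.normal(size=n_groups)
y = 2.0 * x + alpha[group] + rng.normal(scale=0.1, size=n)

# Sparse design matrix: one column for x, one dummy column per group.
# Each row of the dummy block holds a single non-zero entry.
dummies = sp.csr_matrix((np.ones(n), (np.arange(n), group)), shape=(n, n_groups))
X = sp.hstack([sp.csr_matrix(x[:, None]), dummies], format="csr")

# Iterative least-squares solver that only touches the non-zero entries
beta_hat = lsqr(X, y, atol=1e-12, btol=1e-12)[0]
slope_sparse = beta_hat[0]  # coefficient on x; the true value in this simulation is 2.0
```

The dense equivalent of `X` would hold about five million floats, almost all zero; the sparse version stores roughly two non-zero entries per row.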

We then lay out the Frisch-Waugh-Lovell theorem and the alternating projections algorithm and show how to speed up the algorithm on the CPU (via numba) and on the GPU (via JAX), as implemented in the pyfixest library.
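The two ideas above can be sketched in plain NumPy. This is an illustrative sketch under stated assumptions, not pyfixest's code: the function name `demean`, the tolerance, and the simulated two-way fixed-effects data are all hypothetical. Alternating projections repeatedly subtracts group means for each fixed effect until the variable stops changing; by the Frisch-Waugh-Lovell theorem, regressing the demeaned outcome on the demeaned regressor yields the same slope as the full dummy-variable regression.

```python
import numpy as np

def demean(v, fe1, fe2, tol=1e-10, max_iter=1000):
    """Alternating projections: sweep out fe1 group means, then fe2 group means,
    and repeat until the largest change in one full sweep falls below tol."""
    v = v.astype(float).copy()
    n1, n2 = fe1.max() + 1, fe2.max() + 1
    c1 = np.bincount(fe1, minlength=n1)  # group sizes for the first fixed effect
    c2 = np.bincount(fe2, minlength=n2)  # group sizes for the second fixed effect
    for _ in range(max_iter):
        old = v.copy()
        v -= (np.bincount(fe1, weights=v, minlength=n1) / c1)[fe1]
        v -= (np.bincount(fe2, weights=v, minlength=n2) / c2)[fe2]
        if np.max(np.abs(v - old)) < tol:
            break
    return v

# Simulated data (hypothetical): 50 units, 20 periods, true slope 1.5
rng = np.random.default_rng(1)
n, n1, n2 = 5_000, 50, 20
fe1 = rng.integers(0, n1, size=n)
fe2 = rng.integers(0, n2, size=n)
a, b = rng.normal(size=n1), rng.normal(size=n2)
x = 0.5 * a[fe1] + rng.normal(size=n)  # regressor correlated with the fixed effects
y = 1.5 * x + a[fe1] + b[fe2] + rng.normal(scale=0.5, size=n)

# FWL: regress demeaned y on demeaned x
x_t, y_t = demean(x, fe1, fe2), demean(y, fe1, fe2)
beta_fwl = (x_t @ y_t) / (x_t @ x_t)

# Reference: explicit dummy-variable regression (dense, only feasible for small problems)
D = np.column_stack([x, np.eye(n1)[fe1], np.eye(n2)[fe2]])
beta_dummies = np.linalg.lstsq(D, y, rcond=None)[0][0]
```

The loop body consists of simple array sweeps, which is the kind of code that compilers such as numba or JAX can accelerate.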

Finally, we demonstrate how to drastically speed up regression estimation by first preprocessing the data in DuckDB and then fitting the regression via weighted least squares in memory.
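The compress-then-fit idea can be sketched as follows. This is a hedged illustration, not the talk's implementation: to stay self-contained it performs the grouping step in NumPy as a stand-in for a DuckDB `GROUP BY`, and the column names and simulated data are assumptions. When the regressors take few distinct values, the data can be collapsed to one row per regressor combination (cell count and mean outcome); weighted least squares on those rows reproduces the OLS point estimates from the full data.

```python
import numpy as np

# Simulated data (hypothetical): 100,000 rows but only 2 x 5 = 10 distinct regressor cells
rng = np.random.default_rng(2)
n = 100_000
treat = rng.integers(0, 2, size=n)
x = rng.integers(0, 5, size=n)
y = 1.0 + 0.5 * treat + 0.2 * x + rng.normal(size=n)

# Reference: OLS on the full data
X = np.column_stack([np.ones(n), treat, x])
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Compression step, standing in for a query along the lines of
#   SELECT treat, x, COUNT(*) AS n, AVG(y) AS y_mean FROM data GROUP BY treat, x
keys, inverse = np.unique(np.column_stack([treat, x]), axis=0, return_inverse=True)
inverse = inverse.ravel()                       # flat index across NumPy versions
counts = np.bincount(inverse)                   # rows per cell
y_mean = np.bincount(inverse, weights=y) / counts

# Weighted least squares on the compressed rows: scale each row by sqrt(weight)
Xc = np.column_stack([np.ones(len(keys)), keys])
w = np.sqrt(counts)
beta_compressed = np.linalg.lstsq(Xc * w[:, None], y_mean * w, rcond=None)[0]
```

Here the in-memory fit works on 10 rows instead of 100,000. The sketch covers point estimates only; standard errors would additionally require the within-cell sum of squared outcomes.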

Alexander Fischer

Economist and Data Scientist. I spend most of my week working on online auctions at Trivago. In the evenings and on weekends, I work on open source packages for regression modeling and inference in R and Python.