Demos and prototypes for generative-AI projects can be put together quickly: an API key from the preferred model provider, some source code from an online tutorial, and a few small adjustments suffice. Thanks to Streamlit and the like, even beginners can build impressive results that real users can try out within a few hours.
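To give a sense of how little such a demo needs, here is a minimal sketch, assuming an OpenAI-compatible provider, an API key in the OPENAI_API_KEY environment variable, and a placeholder model name; none of these details come from the abstract itself.

```python
# Minimal Streamlit chat demo (sketch, not the speakers' code).
# Assumes an OpenAI-compatible provider and OPENAI_API_KEY in the environment.
import streamlit as st
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

st.title("GenAI demo")

# Keep the conversation in the session state so it survives Streamlit reruns.
if "messages" not in st.session_state:
    st.session_state.messages = []

# Replay the history on every rerun.
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Handle new user input and call the model.
if prompt := st.chat_input("Ask something"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=st.session_state.messages,
    )
    answer = response.choices[0].message.content
    st.session_state.messages.append({"role": "assistant", "content": answer})
    with st.chat_message("assistant"):
        st.markdown(answer)
```

Saved as app.py and started with `streamlit run app.py`, this is roughly the tutorial-sized script the abstract has in mind.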
But what happens when users actually like the solution? When demos and prototypes need to be expanded and connected to other systems? What if the number of users continues to rise?
It is quite impressive how far you can bend Streamlit to do things it was probably never meant to do. But at a certain point, you pay for the hacks and workarounds with unreliability and frustrating debugging.
The speakers reached this point repeatedly across various projects and postponed the necessary architecture discussion for too long. The path was therefore longer and more painful than it needed to be, but in the end, thanks to the wide range of open-source (Python) projects, a flexible and stable system emerged. Our current tech stack includes Qdrant, PostgreSQL, LiteLLM, and FastAPI, as well as Open WebUI and, of course, Streamlit.
Thanks to this modularization, we now have a stable system that we can easily run locally but also deploy in an enterprise environment. At the same time, we have retained a great deal of flexibility.
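To make the modularization concrete, the following sketch shows one plausible way the named components could fit together: FastAPI as the API layer, Qdrant for retrieval, and LiteLLM as a provider-agnostic model gateway. The collection name, model identifiers, and the /chat route are illustrative assumptions, not the speakers' actual implementation.

```python
# Sketch of a modular backend: FastAPI as the API layer, Qdrant for retrieval,
# LiteLLM as the provider-agnostic model gateway. All names (collection,
# models, route) are illustrative assumptions.
import litellm
from fastapi import FastAPI
from pydantic import BaseModel
from qdrant_client import QdrantClient

app = FastAPI()
qdrant = QdrantClient(url="http://localhost:6333")


class ChatRequest(BaseModel):
    question: str


@app.post("/chat")
def chat(request: ChatRequest) -> dict:
    # Embed the question; LiteLLM forwards the call to the configured provider.
    embedding = litellm.embedding(
        model="text-embedding-3-small",  # placeholder embedding model
        input=[request.question],
    ).data[0]["embedding"]

    # Retrieve the most similar documents from Qdrant.
    hits = qdrant.search(
        collection_name="documents",  # placeholder collection name
        query_vector=embedding,
        limit=3,
    )
    context = "\n".join(str((hit.payload or {}).get("text", "")) for hit in hits)

    # Answer with the retrieved context; switching providers only changes the model string.
    response = litellm.completion(
        model="gpt-4o-mini",  # placeholder chat model
        messages=[
            {"role": "system", "content": f"Use this context:\n{context}"},
            {"role": "user", "content": request.question},
        ],
    )
    return {"answer": response.choices[0].message.content}
```

Because the vector store, the model gateway, and the API layer each sit behind their own interface, the same code can point at a local Qdrant container during development or at an enterprise deployment purely through configuration.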
In our talk, we report on the trials and tribulations along the way: the challenges that drove the decisions for the various components, which problems we were able to solve, and which new ones arose.
The talk is aimed primarily at those who are taking their first steps with generative AI or have already developed their first demonstrators or prototypes.
Structure:
(1) GenAI applications in Streamlit are cool
(2) The challenges on the way from prototype to production deployment
(3) Ramming our heads through walls
(4) The path to a flexible yet stable stack
(5) What still plagues us