From Queries to Confidence: Ensuring SQL Reliability with Python

Anna Varzina

Friday 10:15 in Titanium3

SQL is an essential part of data-driven applications, powering everything from simple queries to complex data transformations. However, ensuring the accuracy and reliability of SQL code is often challenging, particularly when dealing with intricate logic or large-scale datasets. Deploying SQL changes to production adds further complexity, as each change requires careful validation to avoid breaking the query logic.

Fortunately, integrating a Python testing framework such as pytest into SQL workflows provides a streamlined solution to these challenges. Such an approach enables clean, efficient, and automated testing of SQL code and database logic. With it, we can validate query results, enforce schema consistency, and simulate complex data scenarios, all while reducing manual effort and improving test coverage.

This talk will address:

  • configuring lightweight database fixtures
  • verifying SQL query results and testing scripts seamlessly
  • data mocking
  • schema validation
  • testing non-deterministic queries
  • handling large datasets
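
To give a flavour of the first few topics, here is a minimal sketch of a lightweight database fixture with mocked rows, a query-result check, and a schema check. It is not taken from the talk: it assumes pytest together with an in-memory SQLite database from Python's standard library, and the bookings table, the revenue query, and the helper make_db are hypothetical names invented for illustration.

```python
import sqlite3

import pytest

# Hypothetical schema and query, invented for this sketch.
SCHEMA = """
CREATE TABLE bookings (
    id INTEGER PRIMARY KEY,
    hotel TEXT NOT NULL,
    price REAL NOT NULL
);
"""

REVENUE_QUERY = """
SELECT hotel, SUM(price) AS revenue
FROM bookings
GROUP BY hotel
ORDER BY hotel;
"""


def make_db():
    """Create an in-memory SQLite database seeded with a few mock rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO bookings (hotel, price) VALUES (?, ?)",
        [("Alpha", 100.0), ("Alpha", 50.0), ("Beta", 80.0)],
    )
    conn.commit()
    return conn


@pytest.fixture
def db():
    """Give each test a fresh database and close it afterwards."""
    conn = make_db()
    yield conn
    conn.close()


def test_revenue_per_hotel(db):
    # ORDER BY in the query keeps the row order deterministic.
    rows = db.execute(REVENUE_QUERY).fetchall()
    assert rows == [("Alpha", 150.0), ("Beta", 80.0)]


def test_bookings_schema(db):
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    info = db.execute("PRAGMA table_info(bookings)").fetchall()
    columns = [(row[1], row[2]) for row in info]
    assert columns == [("id", "INTEGER"), ("hotel", "TEXT"), ("price", "REAL")]
```

Saved as a test module, this would be collected by running pytest in the usual way. Because the fixture rebuilds the database for every test, no state leaks between tests; SQLite stands in here only for convenience, and a production database with a different SQL dialect may need a different fixture.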

Attendees will gain insights into improving SQL code quality, identifying issues early in the development process, and ensuring the reliability of data-driven products. This presentation is particularly beneficial for Data Scientists, Engineers, and Analysts seeking to enhance the efficiency and precision of their testing practices.

Anna Varzina

Anna Varzina is a Data Science Engineer at Lighthouse, where she has been developing data-driven solutions for the hospitality industry since 2021. She specialises in working with large datasets and performing complex data transformations using Python and SQL to extract meaningful insights. This is Anna's first time speaking at PyCon/PyData, and she is excited to share her experiences in overcoming the challenges of building reliable and scalable data workflows.