How Mindbox replaced PySpark with YAML-based pipelines using dlt, dbt, and Trino

By Kiril Kazlou

3d ago · 11 min read · en · Insight

Summary

Data engineer Kiril Kazlou describes how Mindbox replaced its PySpark-based data pipelines with a stack built on dlt, dbt, and Trino, configured through four YAML files. With the new stack, analysts with no Python experience can build and deploy a data pipeline in a single day. Previously, every pipeline required a developer and took about three weeks to ship. The article covers the technical architecture, the reasoning behind each tool choice, and the organizational impact of letting non-engineers own data workflows.

Key quotes

It took us three weeks to ship a single data pipeline. Today, an analyst with zero Python experience does it in a day.
You can't really work with PySpark without Python experience. Every new pipeline required a developer. And that meant waiting — sometimes for weeks.
We replaced Python pipelines with dlt, dbt, and Trino — and cut delivery time from weeks to one day.
Snippet from the RSS feed
How we replaced Python pipelines with dlt, dbt, and Trino — and cut delivery time from weeks to one day.