How We Built a Scalable Analytics Architecture with Airflow and dbt
A key function of Updater’s data platform is transforming raw data (think raw application logs) into data structures designed for analytics (think Looker, Domo, etc.). This lets 1) our Data team spend more time finding insights and driving business impact, and 2) internal stakeholders analyze that same data without needing a technical skillset.
Two of our data teammates, John Lynch and Flavien Bessede, recently contributed a brilliant two-part article to the Astronomer blog on exactly how we make that magic happen with Airflow and dbt.
You can read our team’s full solution in two parts:
Part One – Intro to the challenge and a better way to have Airflow manage and schedule dbt runs at scale
Part Two – Creative ways to take the authoring experience into a production-ready setup
Airflow and dbt are the primary tools we use to build and automate data transformations, and as our platform has scaled, so too has the number of transformations. We eventually reached a point of complexity where Airflow and dbt were no longer working together harmoniously. Instead, the two created tension within our Data Platform architecture, and adding and maintaining transformations became onerous for our team.
In response, John and Flavien designed and built an architecture that lets dbt and Airflow work together instead of against each other. Friction between the two is a common concern in the data industry, but to our knowledge, it has never been resolved as completely or as advantageously as it is in John and Flavien’s solution. We’ve been calling them trendsetters ever since.
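The linked articles describe the actual design. As a rough illustration of the general idea only (not John and Flavien’s implementation), the sketch below shows one way a scheduler could learn the order in which dbt models need to run. It assumes a hand-written, hypothetical stand-in for the "nodes" section of dbt’s manifest.json, where each node lists its parents under depends_on.nodes. The model names, the analytics project name, and the helper functions are all made up for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stand-in for the "nodes" section of a dbt manifest.json.
# Real manifests carry many more fields; only the ones used here are shown.
MANIFEST = {
    "nodes": {
        "model.analytics.stg_events": {
            "resource_type": "model",
            "depends_on": {"nodes": ["source.analytics.raw.events"]},
        },
        "model.analytics.stg_users": {
            "resource_type": "model",
            "depends_on": {"nodes": []},
        },
        "model.analytics.fct_moves": {
            "resource_type": "model",
            "depends_on": {
                "nodes": [
                    "model.analytics.stg_events",
                    "model.analytics.stg_users",
                ]
            },
        },
    }
}


def model_dependencies(manifest):
    """Map each model to the set of other models it depends on."""
    nodes = manifest["nodes"]
    models = {key for key, node in nodes.items() if node["resource_type"] == "model"}
    return {
        model: {dep for dep in nodes[model]["depends_on"]["nodes"] if dep in models}
        for model in models
    }


def run_order(manifest):
    """Return the models in an order where parents always come before children."""
    return list(TopologicalSorter(model_dependencies(manifest)).static_order())


def dbt_command(node_id):
    """Build a per-model dbt invocation (assumes a dbt version with --select)."""
    model_name = node_id.split(".")[-1]
    return f"dbt run --select {model_name}"


ORDER = run_order(MANIFEST)
COMMANDS = [dbt_command(node_id) for node_id in ORDER]

if __name__ == "__main__":
    for command in COMMANDS:
        print(command)
```

In an orchestrator such as Airflow, each of these commands could become its own task, with task dependencies mirroring the model dependencies, so a failed model can be retried without rerunning everything upstream of it.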
Once live, this solution provided advantages for both our Data team and our company:
Benefits for the Data team:
- It’s faster and easier to create new transformations
- It requires less effort to maintain existing transformations
- We have more control over the interplay between different transformations
- Our team is more efficient
- We don’t constantly struggle against our software #winning
Benefits for the entire Updater team:
- The Data team delivers more reporting and self-serve analytics (because we can create more transformations)
- The Data team has more time available to focus on business outcomes, insights, and data education
We’re proud to have these two problem solvers on our team. If the interplay between Airflow and dbt is of interest, check out Parts One and Two or reach out to us to brainstorm together.
Or, simply join our team and build our next big win – view our current openings.