The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI

Astronomer · airflow.apache.org
Welcome to The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI, the podcast where we keep you up to date with insights and ideas propelling the Airflow community forward. Join us each week as we explore the current state, future and potential of Airflow with leading thinkers in the community, and discover how best to leverage this workflow management system to meet the ever-evolving needs of data engineering and AI ecosystems. Podcast Webpage: https://www.astronomer.io/podcast/

Episodes

Transforming Customer Education in Data Engineering at Astronomer with Marc Lamberti

Understanding the complexities of Apache Airflow can be daunting for newcomers and seasoned data engineers alike. But with the right guidance, mastering the tool becomes an achievable milestone. In this episode, Marc Lamberti, Head of Customer Education at Astronomer, joins us to share his journey from Udemy instructor to driving education at Astronomer, and how he's helping over 100,000 learners demystify Airflow. Key Takeaways: (02:36) Early exposure to Airflow while addressing inefficiencies in d...

Jun 26, 2025 · 22 min · Season 1, Ep. 46

Embracing Data Mesh and SQL Sensors for Scalable Workflows at lastminute.com with Alberto Crespi

The flexibility of Airflow plays a pivotal role in enabling decentralized data architectures and empowering cross-functional teams. In this episode, we speak with Alberto Crespi, Data Architect at lastminute.com, who shares how his team scales Airflow across 12 teams while supporting both vertical and horizontal structures under a data mesh approach. Key Takeaways: (02:17) Defining responsibilities within data architecture teams. (04:15) Consolidating multiple orchestrators into a single solut...

Jun 20, 2025 · 30 min · Season 1, Ep. 45

The AI-Ready Pipeline: Reimagining Airflow at Veyer® Logistics with Anu Pabla

Innovation in orchestration is redefining how engineers approach both traditional ETL pipelines and emerging AI workloads. Understanding how to harness Airflow’s flexibility and observability is essential for teams navigating today’s evolving data landscape. In this episode, Anu Pabla, Principal Engineer at The ODP Corporation, joins us to discuss her journey from legacy orchestration patterns to AI-native pipelines and why she sees Airflow as the future of AI workload orchestration. Key Takea...

Jun 12, 2025 · 23 min · Season 1, Ep. 44

Streamlining AI and ML Operations at IBM with BJ Adesoji and Ryan Yackel

The orchestration layer is foundational to building robust AI- and ML-powered data pipelines, especially in complex hybrid enterprise environments. IBM’s partnership with Astronomer reflects a strategic alignment to simplify and scale Airflow-based workflows across industries. In this episode, we’re joined by IBM’s Senior Product Manager, BJ Adesoji, and GTM PM and Growth Leader, Ryan Yackel. We discuss how IBM customers are using Airflow in production, the challenges they face at scale and w...

Jun 05, 2025 · 25 min · Season 1, Ep. 43

Inside the Custom Framework for Managing Airflow Code at Wix with Gil Reich

Efficient orchestration and maintainability are crucial for data engineering at scale. Gil Reich, Data Developer for Data Science at Wix, shares how his team reduced code duplication, standardized pipelines, and improved Airflow task orchestration using a Python-based framework built within the data science team. In this episode, Gil explains how this internal framework simplifies DAG creation, improves documentation accuracy, and enables consistent task generation for machine learning pipelin...

May 29, 2025 · 31 min · Season 1, Ep. 42

Modernizing Legacy Data Systems With Airflow at Procter & Gamble with Adonis Castillo Cordero

Legacy architecture and AI workloads pose unique challenges at scale, especially in a global enterprise with complex data systems. In this episode, we explore strategies to proactively monitor and optimize pipelines while minimizing downstream failures. Adonis Castillo Cordero, Senior Automation Manager at Procter & Gamble, joins us to share actionable best practices for dependency mapping, anomaly detection and architecture simplification using Apache Airflow. Key Takeaways: (03:13) Integ...

May 22, 2025 · 22 min · Season 1, Ep. 41

Building an End-to-End Data Observability System at Netflix with Joseph Machado

Building reliable data pipelines starts with maintaining strong data quality standards and creating efficient systems for auditing, publishing and monitoring. In this episode, we explore the real-world patterns and best practices for ensuring data pipelines stay accurate, scalable and trustworthy. Joseph Machado, Senior Data Engineer at Netflix, joins us to share practical insights gleaned from supporting Netflix’s Ads business as well as over a decade of experience in the data engineering spa...

May 15, 2025 · 39 min · Season 1, Ep. 40

Why Developer Experience Shapes Data Pipeline Standards at Next Insurance with Snir Israeli

Creating consistency across data pipelines is critical for scaling engineering teams and ensuring long-term maintainability. In this episode, Snir Israeli, Senior Data Engineer at Next Insurance, shares how enforcing coding standards and investing in developer experience transformed their approach to data engineering. He explains how implementing automated code checks, clear documentation practices and a scoring system helped drive alignment across teams, improve collaboration and reduce techn...

May 08, 2025 · 30 min · Season 1, Ep. 39

Data Quality and Observability at Tekmetric with Ipsa Trivedi

Airflow’s adaptability is driving Tekmetric’s ability to unify complex data workflows, deliver accurate insights and support both internal operations and customer-facing services, all within a rapidly growing startup environment. In this episode, Ipsa Trivedi, Lead Data Engineer at Tekmetric, shares how her team is standardizing pipelines while supporting unique customer needs. She explains how Airflow enables end-to-end data services, simplifies orchestration across varied sources and suppor...

May 01, 2025 · 23 min · Season 1, Ep. 38

Introducing Apache Airflow® 3 with Vikram Koka and Jed Cunningham

The Airflow 3.0 release marks a significant leap forward in modern data orchestration, introducing architectural upgrades that improve scalability, flexibility and long-term maintainability. In this episode, we welcome Vikram Koka, Chief Strategy Officer at Astronomer, and Jed Cunningham, Principal Software Engineer at Astronomer, to discuss the architectural foundations, new features and future implications of this milestone release. They unpack the rationale behind DAG versioning and task ...

Apr 24, 2025 · 27 min · Season 1, Ep. 37

Airflow in Action: Powering Instacart's Complex Ecosystem

The evolution of data orchestration at Instacart highlights the journey from fragmented systems to robust, standardized infrastructure. This transformation has enabled scalability, reliability and democratization of tools for diverse user personas. In this episode, we’re joined by Anant Agarwal, Software Engineer at Instacart, who shares insights into Instacart's Airflow journey, from its early adoption in 2019 to the present-day centralized cluster approach. Anant discusses the challenges of ...

Apr 17, 2025 · 25 min · Season 1, Ep. 36

From ETL to Airflow: Transforming Data Engineering at Deloitte Digital with Raviteja Tholupunoori

Data orchestration at scale presents unique challenges, especially when aiming for flexibility and efficiency across cloud environments. Choosing the right tools and frameworks can make all the difference. In this episode, Raviteja Tholupunoori, Senior Engineer at Deloitte Digital, joins us to explore how Airflow enhances orchestration, scalability and cost efficiency in enterprise data workflows. Key Takeaways: (01:45) Early challenges in data orchestration before implementing Airflow. (02:42)...

Apr 10, 2025 · 28 min · Season 1, Ep. 35

A Deep Dive Into the 2025 State of Airflow Survey Results with Tamara Fingerlin of Astronomer

The 2025 State of Airflow report sheds light on how global users are adopting, evolving and innovating with Apache Airflow. With over 5,000 responses from 116 countries, the survey reveals critical insights into Airflow’s role in business operations, new use cases and what’s ahead for the community. In this episode, Tamara Fingerlin, Developer Advocate at Astronomer, walks us through her process of analyzing survey data, key trends from the report and what to expect from Airflow 3.0. Key Takea...

Apr 03, 2025 · 23 min · Season 1, Ep. 34

Airflow’s Role in the Rise of DataOps with Andy Byron

The orchestration layer is evolving into a critical component of the modern data stack. Understanding its role in DataOps is key to optimizing workflows, improving reliability and reducing complexity. In this episode, Andy Byron, CEO at Astronomer, discusses the rapid growth of Apache Airflow, the increasing importance of orchestration and how Astronomer is shaping the future of DataOps. Key Takeaways: (01:54) Orchestration is central to modern data workflows. (03:16) Airflow 3.0 will enhance ...

Mar 27, 2025 · 26 min · Season 1, Ep. 33

The Software Risk That Affects Everyone and How To Address It with Michael Winser and Jarek Potiuk

The security of open-source software is a growing concern, especially as dependencies and regulations become more complex, making it essential to understand how to manage software supply chains effectively. In this episode, we sit down with Michael Winser, Co-Founder at Alpha-Omega and Security Strategy Ambassador at Eclipse Foundation, and Jarek Potiuk, Member of the Security Committee at the Apache Software Foundation, to discuss the challenges of securing Airflow’s dependencies, the evolv...

Mar 20, 2025 · 28 min · Season 1, Ep. 32

Building Scalable ML Infrastructure at Outerbounds with Savin Goyal

Machine learning is changing fast, and companies need better tools to handle AI workloads. The right infrastructure helps data scientists focus on solving problems instead of managing complex systems. In this episode, we talk with Savin Goyal, Co-Founder and CTO at Outerbounds, about building ML infrastructure, how orchestration makes workflows easier and how Metaflow and Airflow work together to simplify data science. Key Takeaways: (02:02) Savin spent years building AI and ML infrastructure,...

Mar 13, 2025 · 37 min · Season 1, Ep. 31

Customizing Airflow for Complex Data Environments at Stripe with Nick Bilozerov and Sharadh Krishnamurthy

Keeping data pipelines reliable at scale requires more than just the right tools: it demands constant innovation. In this episode, Nick Bilozerov, Senior Data Engineer at Stripe, and Sharadh Krishnamurthy, Engineering Manager at Stripe, discuss how Stripe customizes Airflow for its needs, the evolution of its data orchestration framework and the transition to Airflow 2. They also share insights on scaling data workflows while maintaining performance, reliability and developer experience. Key...

Mar 06, 2025 · 28 min · Season 1, Ep. 30

Harnessing Airflow for Data-Driven Policy Research at CSET with Jennifer Melot

Turning complex datasets into meaningful analysis requires robust data infrastructure and seamless orchestration. In this episode, we’re joined by Jennifer Melot, Technical Lead at the Center for Security and Emerging Technology (CSET) at Georgetown University, to explore how Airflow powers data-driven insights in technology policy research. Jennifer shares how her team automates workflows to support analysts in navigating complex datasets. Key Takeaways: (02:04) CSET provides data-driven analy...

Feb 27, 2025 · 18 min · Season 1, Ep. 29

Leveraging Airflow To Build Scalable and Reliable Data Platforms at 99acres.com with Samyak Jain

Data orchestration is evolving rapidly, with dynamic workflows becoming the cornerstone of modern data engineering. In this episode, we are joined by Samyak Jain, Senior Software Engineer - Big Data at 99acres.com. Samyak shares insights from his journey with Apache Airflow, exploring how his team built a self-service platform that enables non-technical teams to launch data pipelines and marketing campaigns seamlessly. Key Takeaways: (02:02) Starting a career in data engineering by troubleshoo...

Feb 20, 2025 · 25 min · Season 1, Ep. 28

Hybrid Testing Solutions for Autonomous Driving at Bosch with Jens Scheffler and Christian Schilling

Testing autonomous vehicles demands precision, scalability and powerful orchestration tools. Enter Apache Airflow, a key component of Bosch’s cutting-edge testing framework. In this episode, we sit down with Jens Scheffler, Test Execution Cluster Technical Architect, and Christian Schilling, Product Owner Open Loop Testing Automated Driving, both at Bosch, to explore how Bosch harnesses Airflow to streamline complex testing scenarios. They share insights on scaling workflows, integrating hyb...

Feb 13, 2025 · 34 min · Season 1, Ep. 27

Overcoming Airflow Scaling Challenges at Monzo Bank with Jonathan Rainer

Scaling a data orchestration platform to manage thousands of tasks daily demands innovative solutions and strategic problem-solving. In this episode, we explore the complexities of scaling Airflow and the challenges of orchestrating thousands of tasks in dynamic data environments. Jonathan Rainer, Former Platform Engineer at Monzo Bank, joins us to share his journey optimizing data pipelines, overcoming UI limitations and ensuring DAG consistency in high-stakes scenarios. Key Takeaways: (03:11...

Feb 07, 2025 · 44 min · Season 1, Ep. 26

Orchestrating Analytics and AI Workflows at Telia with Arjun Anandkumar

The future of data engineering lies in seamless orchestration and automation. In this episode, Arjun Anandkumar, Data Engineer at Telia, shares how his team uses Airflow to drive analytics and AI workflows. He highlights the challenges of scaling data platforms and how adopting best practices can simplify complex processes for teams across the organization. Arjun also discusses the transformative role of tools like Cosmos and Terraform in enhancing efficiency and collaboration. Key Takeaways: ...

Jan 30, 2025 · 26 min · Season 1, Ep. 25

The Role of Airflow in Finance Transformation at Etraveli Group with Mihir Samant

Transforming bottlenecked finance processes into streamlined, automated systems requires the right tools and a forward-thinking approach. In this episode, Mihir Samant, Senior Data Analyst at Etraveli Group, joins us to share how his team leverages Airflow to revolutionize finance automation. With extensive experience in data workflows and a passion for open-source tools, Mihir provides valuable insights into building efficient, scalable systems. We explore the transformative power of Airflow ...

Jan 23, 2025 · 21 min · Season 1, Ep. 24

Inside Ford’s Data Transformation: Advanced Orchestration Strategies with Vasantha Kosuri-Marshall

Data engineering is entering a new era, where orchestration and automation are redefining how large-scale projects operate. This episode features Vasantha Kosuri-Marshall, Data and ML Ops Engineer at Ford Motor Company. Vasantha shares her expertise in managing complex data pipelines. She takes us through Ford's transition to cloud platforms, the adoption of Airflow and the intricate challenges of orchestrating data in a diverse environment. Key Takeaways: (03:10) Vasantha’s transition to the ...

Jan 16, 2025 · 39 min · Season 1, Ep. 23

Powering Finance With Advanced Data Solutions at Ramp with Ryan Delgado

Data is the backbone of every modern business, but unlocking its full potential requires the right tools and strategies. In this episode, Ryan Delgado, Director of Engineering at Ramp, joins us to explore how innovative data platforms can transform business operations and fuel growth. He shares insights on integrating Apache Airflow, optimizing data workflows and leveraging analytics to enhance customer experiences. Key Takeaways: (01:52) Data is the lifeblood of Ramp, touching every vertical ...

Jan 10, 2025 · 25 min · Season 1, Ep. 22

Exploring the Power of Airflow 3 at Astronomer with Amogh Desai

What does it take to go from fixing a broken link to becoming a committer for one of the world’s leading open-source projects? Amogh Desai, Senior Software Engineer at Astronomer, takes us through his journey with Apache Airflow. From small contributions to building meaningful connections in the open-source community, Amogh’s story provides actionable insights for anyone on the cusp of their open-source journey. Key Takeaways: (02:09) Building data engineering platforms at Cloudera with Kubern...

Dec 20, 2024 · 30 min · Season 1, Ep. 21

Using Airflow To Power Machine Learning Pipelines at Optimove with Vasyl Vasyuta

Data orchestration and machine learning are shaping how organizations handle massive datasets and drive customer-focused strategies. Tools like Apache Airflow are central to this transformation. In this episode, Vasyl Vasyuta, R&D Team Leader at Optimove, joins us to discuss how his team leverages Airflow to optimize data processing, orchestrate machine learning models and create personalized customer experiences. Key Takeaways: (01:59) Optimove tailors marketing notifications with persona...

Dec 12, 2024 · 24 min · Season 1, Ep. 20

Maximizing Business Impact Through Data at GlossGenius with Katie Bauer

Bridging the gap between data teams and business priorities is essential for maximizing impact and building value-driven workflows. Katie Bauer, Senior Director of Data at GlossGenius, joins us to share her principles for creating effective, aligned data teams. In this episode, Katie draws from her experience at GlossGenius, Reddit and Twitter to highlight the common pitfalls data teams face and how to overcome them. She offers practical strategies for aligning team efforts with organizational...

Dec 05, 2024 · 26 min · Season 1, Ep. 19

Optimizing Large-Scale Deployments at LinkedIn with Rahul Gade

Scaling deployments for a billion users demands innovation, precision and resilience. In this episode, we dive into how LinkedIn optimizes its continuous deployment process using Apache Airflow. Rahul Gade, Staff Software Engineer at LinkedIn, shares his insights on building scalable systems and democratizing deployments for over 10,000 engineers. Rahul discusses the challenges of managing large-scale deployments across 6,000 services and how his team leverages Airflow to enhance efficiency, r...

Dec 02, 2024 · 28 min · Season 1, Ep. 18

How Uber Manages 1 Million Daily Tasks Using Airflow, with Shobhit Shah and Sumit Maheshwari

When data orchestration reaches Uber’s scale, innovation becomes a necessity, not a luxury. In this episode, we discuss the innovations behind Uber’s unique Airflow setup. With our guests Shobhit Shah and Sumit Maheshwari, both Staff Software Engineers at Uber, we explore how their team manages one of the largest data workflow systems in the world. Shobhit and Sumit walk us through the evolution of Uber’s Airflow implementation, detailing the custom solutions that support 200,000 daily pipelin...

Nov 14, 2024 · 29 min · Season 1, Ep. 17