Data Engineering Weekly

Data Engineering Weekly · www.dataengineeringweekly.com
Data Engineering Weekly is a podcast that reflects on the popular data engineering newsletter at www.dataengineeringweekly.com.

Episodes

Data Engineering Weekly: Reflecting on 2023 and Looking Ahead to 2024

Welcome to another insightful edition of Data Engineering Weekly. As we approach the end of 2023, it's an opportune time to reflect on the key trends and developments that have shaped the field of data engineering this year. In this article, we'll summarize the crucial points from a recent podcast featuring Ananth and Aswin, two prominent voices in the data engineering community. Understanding the Maturity Model in Data Engineering: A significant part of our discussion revolved around th...

Dec 25, 2023 · 38 min

DEW #133: How to Implement Write-Audit-Publish (WAP), Vector Database - Concepts and examples & Data Warehouse Testing Strategies for Better Data Quality

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives. On DEW #133, we selected the following article. lakeFS: How to Implement Write-Audit-Publish (WAP). I wrote extensively about the WAP pattern in my latest article, An Engineering Guide to Data Quality - A Data Contract Perspective. Super excited to see a complete guide on implementing the WAP pattern in Iceb...

Jul 05, 2023 · 23 min

DEW #132: The New Generative AI Infra Stack, Databricks cost management at Coinbase, Exploring an Entity Resolution Framework Across Various Use Cases & What's the hype behind DuckDB?

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives. On DEW #132, we selected the following article. Cowboy Ventures: The New Generative AI Infra Stack. Generative AI has taken the tech industry by storm. In Q1 2023, a whopping $1.7B was invested into gen AI startups. Cowboy Ventures unbundles the various categories of the Generative AI infra stack here. https://med...

Jul 05, 2023 · 35 min

DEW #131: dbt model contract, Instacart ads modularization in LakeHouse Architecture, Jira to automate Glue tables, Server-Side Tracking

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives. On DEW #131, we selected the following article. Ramon Marrero: DBT Model Contracts - Importance and Pitfalls. dbt introduced model contracts with the 1.5 release. There were a few critiques of the dbt model contract implementation, such as The False Promise of dbt Contracts. I found the argument made in the false promise of...

Jun 09, 2023 · 28 min

DEW #129: DoorDash's Generative AI, Europe data salary, Data Validation with Great Expectations, Expedia's Event Sourcing

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives. On DEW #129, we selected the following article. DoorDash identifies Five big areas for using Generative AI. Generative AI has taken the industry by storm, and every company is trying to determine what it means to them. DoorDash writes about its discovery of Generative AI and its application to boost its busin...

May 27, 2023 · 32 min

DEW #124: State of Analytics Engineering, ChatGPT, LLM & the Future of Data Consulting, Unified Streaming & Batch Pipeline, and Kafka Schema Management

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives. On DEW #124 [https://www.dataengineeringweekly.com/p/data-engineering-weekly-124], we selected the following article. dbt: State of Analytics Engineering. dbt publishes the state of analytical [data???🤔] engineering. If you follow Data Engineering Weekly, we actively talk about data contracts & how dat...

Apr 29, 2023 · 37 min · Season 1, Ep. 9

DEW #123: Generative AI at BuzzFeed, Building OnCall Culture & Dimensional Modeling at WhatNot

Welcome to another episode of Data Engineering Weekly Radio. Ananth and Aswin discussed a blog from BuzzFeed that shares lessons learned from building products powered by generative AI. The blog highlights how generative AI can be integrated into a company's work culture and workflow to enhance creativity rather than replace jobs. BuzzFeed provided their employees with intuitive access to APIs and integrated the technology into Slack for better collaboration. Some of the lessons learned from...

Apr 22, 2023 · 33 min · Season 1, Ep. 8

DEW #122: dbt Reimagined, Change Data Capture @ Brex, on Data Products and how to describe them

dbt Reimagined by Pedram Navid: https://pedram.substack.com/p/dbt-reimagined The challenge with Jinja templating, I found, is twofold. One, it is evaluated at runtime, so you have to build it and then run some simulations to understand whether you did it correctly or not. Two, Jinja templates add cognitive load: developers have to know how the Jinja template will work and how the SQL will work, and it becomes a bit difficult to read and understand. In this conversation with Aswin, ...

Apr 13, 2023 · 41 min · Season 1, Ep. 7

What Happened at Data Council 2023?

Hey folks, have you heard about the Data Council conference in Austin? The three-day event was jam-packed with exciting discussions and innovative ideas on data engineering and infrastructure, data science and algorithms, MLOps, generative AI, streaming infrastructure, analytics, and data culture and community. "People are so nice in the data community. Meeting them and brainstorming with many ideas and various thought processes is amazing. It was an amazing experience; the conference is mo...

Apr 06, 2023 · 36 min · Season 1, Ep. 6

Analysis on MAD [Machine Learning, Artificial Intelligence & Data] Landscape

In this episode of Data Engineering Weekly Radio, we delve into modern data stacks under pressure and the potential consolidation of the data industry. We refer to a four-part article series that explores the data infrastructure landscape and the Software as a Service (SaaS) products available in data engineering, machine learning, and artificial intelligence. We discussed that the siloed nature of many data products has led to industry consolidation, ultimately benefiting customers. Throughout ...

Mar 31, 2023 · 49 min · Season 1, Ep. 5

DEW #121: Data Product @ Oda, Reflection Talking with Data Leaders & Great Migration To Snowflake

Subscribe to www.dataengineeringweekly.com. From Data Engineering Weekly Edition #121, we took the following articles. Oda: Data as a product at Oda. Oda writes an exciting blog about “Data as a Product,” describing why we must treat data as a product, dashboard as a product, and the ownership model for data products. https://medium.com/oda-product-tech/data-as-a-product-at-oda-fda97695e820 The blog highlights six key principles of the value creation of data. Domain knowledge + discipline expertise...

Mar 22, 2023 · 43 min · Season 1, Ep. 4

DEW #120: The Case for Data Contracts, Action-Position data quality assessment framework & Stop emphasizing the Data Catalog

Please read Data Engineering Weekly Edition #120. Topic 1: Colin Campbell: The Case for Data Contracts - Preventative data quality rather than reactive data quality. In this episode, we focus on the importance of data contracts in preventing data quality issues. We discuss an article by Colin Campbell highlighting the need for a data catalog and the market scope for data contract solutions. We also touch on the idea that data creation will be a decentralized process and the role of tools lik...

Mar 12, 2023 · 36 min · Season 1, Ep. 3

DEW #119: Netflix's Scaling Media Machine Learning at Netflix, Open Table Formats Square Off in Lakehouse Data Smackdown & Building a semantic layer in Preset (Superset) with dbt

We are super excited to be back discussing Data Engineering Weekly newsletter articles every week. We will take 2 or 3 articles from each week's Data Engineering Weekly edition and analyze them in depth. On Data Engineering Weekly edition #119, we are taking three articles. #1 Netflix's article about Scaling Media Machine Learning at Netflix: https://netflixtechblog.com/scaling-media-machine-learning-at-netflix-f19b400243 #2 Alex Woodie's article about Open Table Formats Square...

Mar 06, 2023 · 23 min · Season 1, Ep. 2

Data Engineering Weekly #75

I am sharing my thoughts on the 75th edition of the Data Engineering Weekly newsletter. You can read the edition here: https://www.dataengineeringweekly.com/p/data-engineering-weekly-75 The featured articles this week are: 📚 Dagster: Bundling Vs UnBundling the Data Platform 📚 Prefect: Logs, the Prefect Way 📚 Pinterest: Spinner - Pinterest’s Workflow Platform 📚 Apache Arrow: Introducing Apache Arrow Flight SQL - Accelerating Database Access 📚 Kevin Kho: Introducing Fugue — Reducing PySpar...

Feb 21, 2022 · 16 min · Season 1, Ep. 1