Welcome to another insightful edition of Data Engineering Weekly. As we approach the end of 2023, it's an opportune time to reflect on the key trends and developments that have shaped the field of data engineering this year. In this article, we'll summarize the crucial points from a recent podcast featuring Ananth and Aswin, two prominent voices in the data engineering community.

Understanding the Maturity Model in Data Engineering

A significant part of our discussion revolved around th...
Dec 25, 2023•38 min•Transcript available on Metacast

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives. On DEW #133, we selected the following article:

lakeFS: How to Implement Write-Audit-Publish (WAP)

I wrote extensively about the WAP pattern in my latest article, An Engineering Guide to Data Quality - A Data Contract Perspective. Super excited to see a complete guide on implementing the WAP pattern in Iceb...
Jul 05, 2023•23 min•Transcript available on Metacast

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives. On DEW #132, we selected the following article:

Cowboy Ventures: The New Generative AI Infra Stack

Generative AI has taken the tech industry by storm. In Q1 2023, a whopping $1.7B was invested into gen AI startups. Cowboy Ventures unbundles the various categories of the Generative AI infra stack here. https://med...
Jul 05, 2023•35 min•Transcript available on Metacast

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives. On DEW #131, we selected the following article:

Ramon Marrero: DBT Model Contracts - Importance and Pitfalls

dbt introduced model contracts with the 1.5 release. There were a few critiques of the dbt model implementation, such as The False Promise of dbt Contracts. I found the argument made in the false promise of...
Jun 09, 2023•28 min•Transcript available on Metacast

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives. On DEW #129, we selected the following article:

DoorDash identifies Five big areas for using Generative AI

Generative AI has taken the industry by storm, and every company is trying to determine what it means to them. DoorDash writes about its discovery of Generative AI and its application to boost its busin...
May 27, 2023•32 min•Transcript available on Metacast

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives. On DEW #124 [ https://www.dataengineeringweekly.com/p/data-engineering-weekly-124 ], we selected the following article:

dbt: State of Analytics Engineering

dbt publishes the state of analytical [data???🤔] engineering. If you follow Data Engineering Weekly, we actively talk about data contracts & how dat...
Apr 29, 2023•37 min•Ep 9•Transcript available on Metacast

Welcome to another episode of Data Engineering Weekly Radio. Ananth and Aswin discussed a blog from BuzzFeed that shares lessons learned from building products powered by generative AI. The blog highlights how generative AI can be integrated into a company's work culture and workflow to enhance creativity rather than replace jobs. BuzzFeed provided their employees with intuitive access to APIs and integrated the technology into Slack for better collaboration. Some of the lessons learned from...
Apr 22, 2023•33 min•Ep 8•Transcript available on Metacast

DBT Reimagined by Pedram Navid
https://pedram.substack.com/p/dbt-reimagined

The challenge with Jinja templating is twofold. One, it is resolved at runtime, so you have to build it and then run some simulations to understand whether you did it correctly or not. Two, Jinja templates add cognitive load: developers have to know how the Jinja template will work and how the SQL will work, and it becomes a bit difficult to read and understand.

In this conversation with Aswin, ...
Apr 13, 2023•41 min•Ep 7•Transcript available on Metacast

Hey folks, have you heard about the Data Council conference in Austin? The three-day event was jam-packed with exciting discussions and innovative ideas on data engineering and infrastructure, data science and algorithms, MLOps, generative AI, streaming infrastructure, analytics, and data culture and community.

"People are so nice in the data community. Meeting them and brainstorming with many ideas and various thought processes is amazing. It was an amazing experience. The conference is mo...
Apr 06, 2023•36 min•Ep 6•Transcript available on Metacast

In this episode of Data Engineering Weekly Radio, we delve into modern data stacks under pressure and the potential consolidation of the data industry. We refer to a four-part article series that explores the data infrastructure landscape and the Software as a Service (SaaS) products available in data engineering, machine learning, and artificial intelligence. We discussed that the siloed nature of many data products has led to industry consolidation, ultimately benefiting customers. Throughout ...
Mar 31, 2023•49 min•Ep 5•Transcript available on Metacast

Subscribe to www.dataengineeringweekly.com

From Data Engineering Weekly Edition #121, we took the following articles:

Oda: Data as a product at Oda

Oda writes an exciting blog about “Data as a Product,” describing why we must treat data as a product, dashboard as a product, and the ownership model for data products.
https://medium.com/oda-product-tech/data-as-a-product-at-oda-fda97695e820

The blog highlights six key principles of the value creation of data. Domain knowledge + discipline expertise...
Mar 22, 2023•43 min•Ep 4•Transcript available on Metacast

Please read Data Engineering Weekly Edition #120

Topic 1: Colin Campbell: The Case for Data Contracts - Preventative data quality rather than reactive data quality

In this episode, we focus on the importance of data contracts in preventing data quality issues. We discuss an article by Colin Campbell highlighting the need for a data catalog and the market scope for data contract solutions. We also touch on the idea that data creation will be a decentralized process and the role of tools lik...
Mar 12, 2023•36 min•Ep 3•Transcript available on Metacast

We are super excited to be back to discussing Data Engineering Weekly Newsletter articles every week. We will take 2 or 3 articles from each week's Data Engineering Weekly edition and go through an in-depth analysis. On Data Engineering Weekly edition #119, we are taking three articles.

#1 Netflix's article about Scaling Media Machine Learning at Netflix
https://netflixtechblog.com/scaling-media-machine-learning-at-netflix-f19b400243

#2 Alex Woodie's article about Open Table Formats Square...
Mar 06, 2023•23 min•Ep 2•Transcript available on Metacast

I am sharing my thoughts around the 75th edition of the Data Engineering Weekly newsletter. You can read the edition here: https://www.dataengineeringweekly.com/p/data-engineering-weekly-75

The featured articles this week are:
📚 Dagster: Bundling Vs UnBundling the Data Platform
📚 Prefect: Logs, the Prefect Way
📚 Pinterest: Spinner - Pinterest’s Workflow Platform
📚 Apache Arrow: Introducing Apache Arrow Flight SQL - Accelerating Database Access
📚 Kevin Kho: Introducing Fugue — Reducing PySpar...
Feb 21, 2022•16 min•Ep 1•Transcript available on Metacast