The Data Stack Show

RudderStack | datastackshow.com
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

Episodes

69: What is the Modern Data Stack?

Highlights from this week’s conversation include: Panel introductions and backgrounds (2:55) What the modern data stack means to each of our panelists (5:04) Defining the fundamental components of a modern data stack (17:22) How the modern stack drives insights and actions for businesses (28:03) Getting to a uniform definition to the modern stack (33:45) Managing the modernization of a large scale data stack (39:09) How testing works in the dbt context (48:44) The relationship between the data w...

Jan 05, 2022 | 1 hr 4 min

The PRQL: Should Data Trust Drive the Evolution of Your Data Stack?

In this PRQL, Eric and Kostas preview their upcoming show where they discuss the modern data stack with some of the top experts in the industry. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.

Dec 31, 2021 | 5 min

68: Season Three Recap: Holiday Edition with Eric Dodds and Kostas Pardalis

In this episode, Eric and Kostas look back over the great topics and guests from season three of the Data Stack Show. The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of ...

Dec 29, 2021 | 25 min

67: Now is the Time to Think About Data Quality with Manu Bansal of Lightup Data

Highlights from this week’s conversation include: Manu’s career background and describing Lightup (2:31) Why traditional tools don’t work for modern data problems (6:04) How a data lake differs from a data warehouse (11:35) Defining data quality (14:07) The business impact of solving and applying data quality (31:36) Constructing a healthy financial view on the impact of data (41:09) How to work with unstructured data in a meaningful way (47:44)

Dec 22, 2021 | 56 min

66: How Data Infrastructure Has Evolved and Managing High Performing Data Teams with Srivatsan Sridharan

Highlights from this week’s conversation include: Starting his career on the first-ever data team at Yelp (2:00) How to approach the adoption of new technology (7:04) When to use stream processing vs. batching (11:35) What is a pipeline and why is it core to a data engineer? (14:07) Where a new data scientist should begin their career (19:14) The key factors impacting a new technology decision (27:09) Managing team emotions in decision making (34:25) The unique challenge of Fintech vs other cons...

Dec 15, 2021 | 51 min

65: Operationalizing Data from the Warehouse With Aayush Jain of Cliff.ai

Highlights from this week’s conversation include: Aayush’s career background (4:13) How his biological sciences academic training impacts his work (8:04) How do we allow dashboards to get messy? (9:35) Building cultural or technical solutions to effective dashboards (15:19) Using data dashboards to make material business improvements (23:19) What is business observability? (32:23) Building a platform for operations teams (43:15) How important community is to the cliff.ai business proposition (41...

Dec 08, 2021 | 56 min

The PRQL: Why is the Data Engineer's Role Expanding?

In this show PRQL, Eric and Kostas talk about the evolution of the role of a data engineer and preview the conversation with Aayush Jain.

Dec 03, 2021 | 10 min

64: Data Stack Composability and Commoditization with Michel Tricot of Airbyte

Highlights from this week’s conversation include: Announcement: Data Stack Live! (1:00) Michel’s career background (4:13) Solving the technical and process challenges of moving data (7:04) Lessons learned from managing data at LiveRamp (9:35) How to build a modern data stack (16:19) Triggers to signal when more data infrastructure is needed (23:19) Why Airbyte is an open-source product (30:23) Airbyte’s role in providing support to open-source problems (38:15) How important DPT is for the Airby...

Dec 01, 2021 | 56 min

The PRQL: The Beauty of Commoditization

For this week's PRQL, Eric and Kostas preview their upcoming episode with Michel Tricot.

Nov 26, 2021 | 7 min

63: The ETL - ELT Flip With Ciaran Dynes of Matillion

On this week’s episode of The Data Stack Show, Eric and Kostas have a conversation with Ciaran Dynes, the Chief Product Officer at Matillion, a powerful and easy-to-use, completely cloud-capable ETL/ELT solution.

Nov 24, 2021 | 56 min

The PRQL: What Part of the Data Stack Will Be Commoditized Next?

On this week's PRQL, Kostas and Eric preview their upcoming conversation with Ciaran Dynes of Matillion.

Nov 19, 2021 | 7 min

62: The Internet of Everything with Rob Rastovich of ThingLogix

Highlights from this week’s conversation include: Rob’s career began as an early adopter in internet marketing and then he got the bug for machine-to-machine IoT (2:47) Making assumptions about mass scale (8:44) Pervasiveness of IoT in the market (11:47) Initial reactions to technological advances that we take for granted today (17:28) What makes IoT unique (23:56) Killing the SQL server (29:11) What really separates a smart device from a dumb device that can send data to the cloud (33:13) 5G, L...

Nov 17, 2021 | 52 min

The PRQL: Are you afraid of IOT?

In this PRQL, Eric and Kostas preview their upcoming conversation with Rob Rastovich of ThingLogix, Inc.

Nov 12, 2021 | 7 min

61: What is Data Design? With Kevin Gervais of Touchless

Highlights from this week’s conversation include: Kevin’s interaction with data at an early age (2:35) Working with telecom data (5:08) Analyzing emojis in customer sentiment (8:44) Infrastructure needed for diverse data (12:22) Building better interfaces and looking out for human error (24:17) Dealing with differences in identities in different layers of the stack (41:21)

Nov 10, 2021 | 55 min

The PRQL: Will we ever get rid of the CSV?

Nov 08, 2021 | 10 min

Data Debrief: The Highs and Lows of Open Source Projects

Eric and Kostas break down further topics from episode 60 about stream processing and open source projects.

Nov 05, 2021 | 6 min

60: Architecting a Boring Stream Processing Tool With Ashley Jeffs of Benthos

Highlights from this week’s conversation include: A brief overview of Ashley’s background (2:47) Benthos’ creation and the problems it was meant to address (4:01) Use cases for Benthos (18:25) Key features of Benthos that make it stand out (22:23) Adding windowing to Benthos for fun (29:23) The highs and lows of maintaining an open source project for five years (32:17) The architecture of Benthos (36:23) The importance of ordering in streaming processing (42:15) Gaining traction with an open sou...

Nov 03, 2021 | 1 hr 7 min

59: Making ETL Optional with Justin Borgman of Starburst Data

Highlights from this week’s conversation include: Starburst Data is Justin’s second startup (2:42) Starburst focuses on doing data warehousing analytics without the need for the data warehouse (4:14) Multi-cloud solutions among merger and acquisition use cases (8:32) Ways the stack is increasing in complexity (12:25) Comparing essential components of a data stack from 2010 to now (15:01) The future of ETL (27:36) The best maturity stage for an organization to implement Starburst (31:27) Starburs...

Oct 27, 2021 | 58 min

Data Debrief: Will Enterprise Build The Future of Data Tooling?

On this week's Data Debrief, Eric and Kostas dig more into the topic of data tooling.

Oct 22, 2021 | 8 min

58: Data Federation is No Longer The "F" Word with Scott Gnau of InterSystems

Highlights from this week’s conversation include: Solving problems with data has been a long-time passion of Scott’s (2:52) Day-to-day use of data at InterSystems (6:25) The technical aspects involved in constructing a data fabric (17:52) Companies at a variety of maturity levels can adopt a data fabric (26:49) A paradigm shift in the marketplace (28:39) Comparing and contrasting data fabric and data mesh (30:49) Sharing data across the business and not having it siloed in different departments ...

Oct 20, 2021 | 50 min

57: Improving Data Quality Using Data Product SLAs with Egor Gryaznov of Bigeye

Highlights from this week’s conversation include: Egor’s software engineering background and history with Uber (2:19) Experimentation platforms and analytics definitions (7:49) Bigeye’s function and use cases (9:40) Managing the relationship between the data engineer maintaining the pipelines and the downstream teams providing the context (18:49) Pinpointing problems in data compared to problems in software (21:55) Defining data quality at Bigeye (24:13) Machine learning models as a data product...

Oct 13, 2021 | 56 min

56: Stream Processing and Observability with Jeff Chao of Stripe

Highlights from this week’s conversation include: Jeff’s history with stream processing (2:52) Working with Mantis to address the impact of Netflix downtime (4:20) Defining observability as operational insight (6:58) Time series data and the value of data today (18:52) Data integration’s shift from batch to streaming (29:34) The current state of change data capture (32:20) How an engineer thinks of the end-user (56:21)

Oct 06, 2021 | 1 hr 4 min

55: Tables vs. Streams and Defining Real-Time with Pete Goddard of Deephaven Data Labs

Highlights from this week’s conversation include: Pete’s background in data engineering and capital market trading (2:10) Comparison of the tooling from 2012 when Deephaven started with that of today (10:30) Taking a closer look at defining real-time data (19:47) Getting non-technical people, clients, and developers all on the same platform (36:11) Deephaven’s incremental update model (40:25) Kafka, timely data flow, and Deephaven (44:22) Use cases for Deephaven (51:52) Going to GitHub to try ou...

Sep 29, 2021 | 1 hr 7 min

54: The Center of the Modern Data Stack with Neil Rahilly of Mixpanel

Highlights from this week’s conversation include: Neil’s programming hobby turned into a career and how he cold-contacted Mixpanel for a job (2:28) Lessons learned from nine years at Mixpanel (5:05) Defining product analytics (8:06) How Mixpanel has evolved into the product it is today (10:56) The importance of Mixpanel’s real-time analysis (19:52) Looking at Arb, Mixpanel’s own arbitrary segmentation database (23:44) The business impact that the rise of the cloud data warehouse had on Mixpanel ...

Sep 22, 2021 | 1 hr 9 min

53: What Religion, a Cult, and a Tech Product Have in Common, with Bart Farrell of DoKC

Highlights from this week’s conversation include: Bart’s journey from southern California, to New York, to Egypt, to London, to Spain (3:31) Exposure to different communities and finding shared language and experience (10:21) Looking back at early online communities and how they furthered your learning journey (27:50) How the level of niche-ness impacts a community (44:06) The cautionary tale of WeWork (57:28) Surefire community killers (1:03:44) Open source communities in tech and the passion t...

Sep 15, 2021 | 1 hr 20 min