The Data Stack Show

Rudderstack · datastackshow.com
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

Episodes

192: Business Logic As Code: A New LLM-Powered Operating System for Business Automation with Binny Gill of Kognitos

Highlights from this week’s conversation include: The history of computer science and AI inflection point (1:23) Binny's early programming experiences and the constraints of technology (2:14) Getting interested in computer programming (5:02) The experiment that impacted the starting of Kognitos (8:23) Challenges in traditional computer science (16:04) Reimagining programming and debugging through natural language (19:08) The operating system for applications (20:19) Changing the paradigm of prog...

Jun 05, 2024 · 48 min

The PRQL: From Programming Tic Tac Toe to Building an Operating System for Natural Language Programs With Binny Gill of Kognitos

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com. ...

Jun 03, 2024 · 3 min

191: From Amazon to Consulting: Time Series Forecasting and How to Communicate Data Analytics Insights with David McCandless of McCandless Consulting

Highlights from this week’s conversation include: David's Background and Journey in Data (0:30) Transition to Time Series Forecasting (2:03) Working on Time Series Forecasting at Amazon (2:55) Challenges and Experience in Time Series Forecasting (4:32) Transitioning to a New Role at Amazon (5:52) Tools and Methods for Time Series Forecasting (8:17) Forecasting Impact and Accuracy (15:30) Explaining Variance and Lessons Learned (18:58) Understanding Downstream Consumers and Empathy for Business L...

May 29, 2024 · 49 min

The PRQL: Practical Applications for Time Series Forecasting with David McCandless of McCandless Consulting

May 28, 2024 · 3 min

190: Aligning Data Teams and Data Tools With Business Needs Featuring Ben Rogojan, the Seattle Data Guy

Highlights from this week’s conversation include: Ben’s background and journey in data (0:18) Relating data to business outcomes (2:33) Facebook's approach to data-driven business outcomes (4:43) Subjectivity and data-driven business outcomes (8:43) Infrastructure and data collection at Facebook (12:04) The importance of first-party data and the death of third-party cookies (16:27) Facebook's Data and Attribution Challenges (20:08) Facebook's Infrastructure and Tooling (23:41) Differences in Dat...

May 22, 2024 · 52 min

The PRQL: Data Success From Mid-market to Enterprise with Ben Rogojan

May 20, 2024 · 3 min

189: Customer Data Modeling, The Data Warehouse, Reverse ETL, and Data Activation with Ryan McCrary of RudderStack

Highlights from this week’s conversation include: Ryan's Background and Roles in Data (0:05) Data Activation and Dashboard Staleness (1:27) Profiles and Data Activation (2:54) Customer-Facing Experience and Product Management (3:40) Profiles Product Overview (5:10) Use Cases for Profiles (6:44) Challenges with Data Projects (9:19) Entity Management and Account Views (15:33) Handling Entities and Duplicates (17:55) Challenges in Entity Management (22:18) Product Management and Data Solutions (26:...

May 16, 2024 · 1 hr 4 min

The PRQL: How to Get Business Teams Closer to Customer Data (The Right Way)

May 13, 2024 · 3 min

188: How To Invest in Data Infrastructure and Data Projects That Create Business Value with Matthew Kelliher-Gibson of Rudderstack

Highlights from this week’s conversation include: Matt KG’s Background in Data (0:35) Challenges in purchasing data tools (1:28) Early experiences in data analysis (9:51) Matt’s Transition to a subprime auto loan company (13:19) Transition to RudderStack and software purchase decisions (17:36) Tech Problems: People and Process (22:02) Challenges in Purchasing Data Tools (22:55) Budget Constraints and Purchasing Decisions (24:46) Challenges with Platform Documentation (26:55) Metrics and Cost Effi...

May 08, 2024 · 56 min

The PRQL: Navigating the Procurement Process for Data Infrastructure Tooling With Matthew Kelliher-Gibson of Rudderstack

May 06, 2024 · 3 min

187: Startup Lessons and Torch Passing with Kostas Pardalis

Highlights from this week’s conversation include: Kostas Passes the Baton as Co-Host of the Podcast (0:24) Reflecting on the Podcast (2:56) New Co-Host John Wessel and His Background in Data (4:34) Kostas' Journey in Data (10:55) RudderStack's Explosive Growth (21:28) The Podcast's Inception and Marketing Activities (24:19) Evolution of the podcast (27:22) Memorable guests and experiences (28:29) Connecting with industry leaders and key innovators in the space (33:05) Kostas' new venture (36:26) ...

May 01, 2024 · 46 min

The PRQL: Why Is Kostas a Guest on His Own Podcast?

Apr 29, 2024 · 4 min

186: Data Fusion and The Future Of Specialized Databases with Andrew Lamb of InfluxData

Highlights from this week’s conversation include: The Evolution of Data Systems (0:47) The Role of Open Source Software (2:39) Challenges of Time Series Data (6:38) Architecting InfluxDB (9:34) High Cardinality Concepts (11:36) Trade-Offs in Time Series Databases (15:35) High Cardinality Data (18:24) Evolution to InfluxDB 3.0 (21:06) Modern Data Stack (23:04) Evolution of Database Systems (29:48) InfluxDB Re-Architecture (33:14) Building an Analytic System with Data Fusion (37:33) Challenges of ...

Apr 24, 2024 · 58 min

The PRQL: Open Source and the Evolution of Data Systems with Andrew Lamb of InfluxData

Apr 22, 2024 · 4 min

Data Council Week: A Decade of Supporting the Data Community with Pete Soderling

Highlights from this week’s conversation include: Pete’s background and the origin story of Data Council (1:04) Reflecting on 10 years of Data Council (2:07) Impact of the pandemic on conferences (5:25) Rebuilding after the pandemic (7:42) Evolution of Data Council (10:33) Balancing content and sponsorship (16:17) Selecting speakers and content at Data Council (19:39) Highlights from the conference this year (21:58) Realization of AI Future (22:45) Embracing AI at Data Council (23:31) Announceme...

Apr 18, 2024 · 36 min

Data Council Week: AI Isn’t Just Hype - How To Successfully Apply LLMs Today with Tristan Zajonc of Continual

Highlights from this week’s conversation include: Tristan's Background and Journey into Data (1:14) Evolution of Machine Learning and AI (3:13) Impact of Generative AI (6:33) MLOps and Challenges in Early Data Science (8:48) Success and Applications of AI Today (11:34) Continual AI Copilot Platform (18:04) Challenges in building remarkable AI assistants (19:58) Reliability and accuracy in AI responses (25:31) Regulation and adoption of AI assistants (31:30) Future of AI assistants and Continual ...

Apr 17, 2024 · 36 min

Data Council Week: How To Do Self-Service Data Analytics and Business Intelligence Right with Ryan Dolley of GoodData

Highlights from this week’s conversation include: Ryan’s background in data (0:58) Transition from Performing Arts to Data (2:23) Understanding End Users in Data Projects (6:08) Learning from Failures in Data Projects (8:07) The self-service era (19:50) Struggles of self-service (21:23) The disillusion with dashboards (26:23) GoodData's approach (30:06) Merging wisdom with modern approach (31:50) User experience with GoodData (34:05) Defining metrics and AI (36:35) Connecting with Ryan and GoodD...

Apr 15, 2024 · 42 min

185: The Evolution of Data Processing, Data Formats, and Data Sharing with Ryan Blue of Tabular

Highlights from this week’s conversation include: The Evolution of Data Processing (2:36) Ryan’s Background and Journey in Data (4:52) Challenges in Transitioning to S3 (8:47) Impact of Latency on Query Performance (11:43) Challenges with Table Representation (15:26) Designing a New Metadata Format (21:36) Integration with Existing Tools and Open Source Project (24:07) Initial Features of Iceberg (26:11) Challenges of Manual Partitioning (31:49) Designing the Iceberg Table Format (37:31) Trade-o...

Apr 10, 2024 · 1 hr 30 min

The PRQL: The Two Parallel Tracks of Development In Data Processing with Ryan Blue of Tabular

Apr 08, 2024 · 5 min

184: Kafka Streams and Operationalizing Event Driven Applications with Apurva Mehta of Responsive

Highlights from this week’s conversation include: Apurva’s background in streaming technology (0:48) Developer experience and Kafka streams (2:47) Motivation to bootstrap a startup (4:09) Meeting the Confluent founders and early work at Confluent (6:59) Projects at Confluent and transition to engineering management (10:34) Overview of Responsive and event-driven applications (12:55) Defining event-driven applications (15:33) Importance of latency and state in event-driven applications (18:54) Lo...

Apr 03, 2024 · 58 min

The PRQL: Event-Driven Applications: Where Low Latency Meets High Impact with Apurva Mehta of Responsive

Apr 01, 2024 · 4 min

183: Why Modern Data Quality Must Move Beyond Traditional Data Management Practices with Chad Sanderson of Gable.ai

Highlights from this week’s conversation include: Chad’s background and journey in data (0:46) Importance of Data Supply Chain (2:19) Challenges with Modern Data Stack (3:28) Comparing Data Supply Chain to Real-world Supply Chains (4:49) Overview of Gable.ai (8:05) Rethinking Data Catalogs (11:42) New Ideas for Managing Data (15:16) Data Discovery and Governance Challenges (18:51) Static Code Analysis and AI Impact on Data (24:55) Creating Contracts and Defining Data Lineage (27:31) Data Quality...

Mar 27, 2024 · 1 hr 3 min

The PRQL: The Data Supply Chain with Chad Sanderson of Gable.ai

Mar 25, 2024 · 8 min

182: Building a Dynamic Data Infrastructure at Enterprise Scale Featuring Kevin Liu of Stripe

Highlights from this week’s conversation include: Kevin’s background and work at Stripe (0:31) Evolution of Data Infrastructure at Stripe (2:18) Kevin's Interest in Data (5:29) Software Engineer or Data Engineer? (8:27) Speech Recognition Work at Amazon (11:06) Efficiency and Cost Management (15:50) Metadata and Query Analysis (18:38) Surprising Discoveries in Metadata Analysis (21:43) Optimizing Cost and Value (23:55) Product Sizing Stripe Data (26:39) Popular Tool for Data Interaction (30:08) ...

Mar 20, 2024 · 1 hr 1 min

The PRQL: Exploring the Intersection of Software Engineering and Data Management with Kevin Liu of Stripe

Mar 18, 2024 · 6 min

181: OLAP Engines and the Next Generation of Business Intelligence with Mike Driscoll of Rill Data

Highlights from this week’s conversation include: Michael’s background and journey in data (0:33) The origin story of Druid (2:39) Experiences and growth in Data (8:08) Druid's evolution (21:46) Druid's architectural decisions (26:32) The user experience (30:06) The developer experience (35:14) The evolution of BI tools (40:55) Data architecture and integration (47:53) AI's impact on BI (52:26) What would Mike be doing if he didn’t work in data? (56:27) Final thoughts and takeaways (57:02) The D...

Mar 13, 2024 · 1 hr

The PRQL: Making the Data Stack Serverless in the Cloud with Mike Driscoll of Rill Data

Mar 11, 2024 · 6 min

180: Data Observability and AI for Data Operations Featuring Kunal Agarwal of Unravel Data

Highlights from this week’s conversation include: The evolution of data operations (1:13) Unravel's role in simplifying data operations (2:17) Kunal’s journey from fashion to enterprise data management (5:23) The Unravel platform and its components (10:08) Challenges in data operations at scale (16:34) Users of Unravel within an organization (22:32) Calculating ROI on data products (25:55) Understanding the cost of data operations (27:01) Measuring productivity and reliability (30:59) Diversity...

Mar 06, 2024 · 53 min

179: Time Series Data Management and Data Modeling with Tony Wang of Stanford University

Highlights from this week’s conversation include: Tony's background and research focus (3:35) Challenges in academia and industry (6:15) Ph.D. student's routine (10:47) Academic paper review process (15:26) Aha moments in research (20:05) Academic lab structure (23:09) The decision to move from hardware to data research (24:43) Research focus on time series data management (27:40) Data modeling in time series and OLAP systems (32:01) Issues and potential solutions for Parquet format (37:32) Role...

Feb 28, 2024 · 51 min