The Data Stack Show

RudderStack · datastackshow.com
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

Episodes

96: How To Collect and Leverage Data From the Physical World with Prateek Joshi of Plutoshift

Highlights from this week’s conversation include: Prateek’s background and career journey (2:10) The lack of advanced data tools for the physical world (4:55) Dealing with data from the physical world (10:53) Stocks in the physical world (14:20) What it takes to execute this kind of project (19:05) Challenges around this infrastructure (25:56) ML tools that are useful in this environment (31:55) Physical instrumentation and environmental interaction (36:43) Current adoption of physical instrumen...

Jul 20, 2022 · 55 min

The PRQL: Collecting Data in the Physical World

Eric and Kostas preview their upcoming conversation with Prateek Joshi. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.

Jul 15, 2022 · 3 min

95: How the Metrics Layer Bridges the Gap Between Data & Business with Nick Handel of Transform

Highlights from this week’s conversation include: Nick’s background and career journey (2:40) What Transform does (5:53) Metrics layer vs. metrics store (8:04) Signals vs. metrics (13:24) The user of a metric layer (14:34) Using Transform within an organization (17:05) How to fuse two sources into a metric (23:54) Currently supported databases (28:46) Community engagement (31:33) Optimizing for queries, metrics, and use cases (35:33) Technology and the human factor (40:49) Managing metrics amids...

Jul 13, 2022 · 58 min

The PRQL: Data Marts Aren’t Just for the Enterprise

Eric and Kostas preview their upcoming conversation with Nick Handel from Transform.

Jul 08, 2022 · 4 min

94: Notebooks Aren’t Just for Data Scientists With Barry McCardel of Hex Technologies

Highlights from this week’s conversation include: Barry’s background and Hex (3:05) Reconciling two sides of data (9:16) Collaboration at Hex (15:10) What it takes to build something like Hex (20:02) Defining “commitment engineering” (26:01) How to begin working with Hex (30:56) Hex customers and uniqueness (40:31) The future in a world of data acquisition (45:30) Crossover between analytics and ML (51:33) Advice for data engineers (57:19) The Data Stack Show is a weekly podcast powered by Rudder...

Jul 06, 2022 · 1 hr 3 min

93: There Is No Data Observability Without Lineage with Kevin Hu of Metaplane

Highlights from this week’s conversation include: Kevin’s background and career journey (1:54) Metaplane and the problem that it solves (6:47) The silence of data problems (9:53) Data physics work that requires more (13:35) Trusting data when bugs are present (19:12) Building a navigable experience (22:36) Developing anomaly detection (30:06) What Metaplane provides today (35:05) Metaplane’s plans for the future (37:45) Comparing BigQuery, Snowflake, and Redshift (40:56) Why data goes bad (48:15...

Jun 29, 2022 · 1 hr 5 min

The PRQL: What Are the Similarities Between VCs and Tilapia?

Eric and Kostas preview their upcoming conversation with Kevin Hu of Metaplane.

Jun 24, 2022 · 3 min

92: Building a Decentralized Storage System for Media File Collaboration with Tejas Chopra

Highlights from this week’s conversation include: Tejas’ background and career journey (2:49, 43:04) Digital collaboration with Netflix Drive (7:57) A formal version control component (23:44) Centralized store vs. local affairs (31:05) The different skill sets a data engineer needs (37:38) How to get into data engineering (40:57) New technologies coming into day-to-day work (44:39) The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to dat...

Jun 22, 2022 · 56 min

The PRQL: What is Netflix Cloud?

Eric and Kostas preview their upcoming conversation with Tejas Chopra of Netflix.

Jun 17, 2022 · 3 min

91: The Future of Streaming Data with Stripe, Deephaven, Materialize, and Benthos

Highlights from this week’s conversation include: How we should think about batch versus streaming (6:02) Defining “streaming ETL” (9:34) A brief history of streaming processing platforms (22:07) The birth and evolution of Benthos (28:41) What led Jeff to build a new tool (34:29) Why you shouldn’t share all the data (37:23) Making streaming technologies approachable to engineers (42:09) Breaking out of traditional terminology (52:58)

Jun 15, 2022 · 1 hr

The PRQL: Can Streaming Simplify Your Data Flows?

Eric and Kostas preview their upcoming livestream panel talking about all things streaming. Don't miss next week's episode with experts from Stripe, Deephaven, Materialize and Benthos.

Jun 10, 2022 · 3 min

90: The Modern Data Stack Has a Join Problem with Ahmed Elsamadisi of Narrator AI

Highlights from this week’s conversation include: Ahmed’s background and career journey (2:27) Why the modern data stack “sucks” (4:53) The limitations of progress (9:13) Showing data with only 11 columns (11:55) Managing one table that rules them all (19:02) Viewing the world as timestamped activities (32:40) When this model becomes harder to use (35:15) The two parts you need in a company (44:41) Those who use Narrator (48:32)

Jun 08, 2022 · 57 min

The PRQL: Can One Table Rule Them All?

Eric and Kostas preview their upcoming episode with Ahmed Elsamadisi of Narrator AI.

Jun 03, 2022 · 5 min

89: Solving Microservice Orchestration Issues at Netflix with Viren Baraiya of Orkes

Highlights from this week’s conversation include: Viren’s background and career journey (2:23) Engineering challenges in Netflix transitions (6:05) How Conductor changed the process (9:30) Building a lot more microservices (16:04) Open sourcing Conductor (17:38) Defining “orchestration” (22:05) Using an orchestrator written in Java (31:04) Building a cloud service around microservices (34:59) Differentiating product experiences (37:17) Orchestration platforms in new environments (42:15) Advice f...

Jun 01, 2022 · 52 min

The PRQL: What are the Different Flavors of Orchestration?

Eric and Kostas preview their upcoming conversation with Viren Baraiya of Orkes.

May 27, 2022 · 4 min

88: What Is Data Observability? With Tristan Spaulding of Acceldata

Highlights from this week’s conversation include: Tristan’s background and career journey (2:43) Updating old technology (11:40) Defining “data observability” (18:44) The primary user of a data observability tool (29:56) Handling an incident (33:01) Why multipliers for data observability (37:06) Early symptoms of a data drift (43:12) Tuning in the context of data engineering (50:11) What keeps Tristan working with data (55:12)

May 25, 2022 · 1 hr 2 min

The PRQL: Does Data Exist if We Do Not Observe It?

Eric and Kostas preview their upcoming conversation with Tristan Spaulding of Acceldata.

May 20, 2022 · 4 min

87: Why Is Now the Golden Age of Data Analytics? With Cindi Howson of ThoughtSpot

Highlights from this week’s conversation include: Cindi’s career journey (2:36) Major shifts in analytics (6:34) Where we are in formation of the modern analytics cloud (9:07) The process of moving into the cloud (11:01) How to accelerate the digital transformation (17:29) Common patterns amongst company cultures (19:42) Data regulations affecting change (22:34) ThoughtSpot customer base (24:06) The need to know SQL (27:42) Power users leveraging the AI Insights (31:24) Who should audit technolo...

May 18, 2022 · 58 min

The PRQL: Can You Trust AI Enabled Analytics?

Eric and Kostas preview their upcoming conversation with Cindi Howson of ThoughtSpot and host of The Data Chief Podcast.

May 13, 2022 · 3 min

86: Solving the Data Quality Problem with Bigeye, Great Expectations, Metaplane, and Lightup.ai

Highlights from this week’s conversation include: Guest introductions (1:02) Defining data quality (4:08) Forgetting to apply software best practices (8:33) Differentiating observability and data quality (17:53) Who should care about quality in the organization (26:55) Why this is still a valid conversation (35:44) The jurisdiction of various components (45:39)

May 11, 2022 · 59 min

85: You Can Stop Doing Data Fire Drills with Barr Moses of Monte Carlo

Highlights from this week’s conversation include: Barr’s background and career journey (2:12) Trust: a technical or human problem? (9:47) Behind the name “Monte Carlo” (15:41) Defining data accuracy and reliability (17:36) How much can be done with standardization (22:27) How to avoid frustration when generating data about data (25:49) Defining “resolution” (28:59) Understanding the concept of SLAs (33:25) Building a company for a category that doesn’t exist yet (37:40) What it looks like to use...

May 04, 2022 · 52 min

Data Council Week (Ep 5): A Primer on Spatial Data With Gabriel Hidalgo of Carto

Highlights from this week’s conversation include: How Gabriel got into data (1:54) What Carto is (5:28) Location data vs spatial data (6:37) Time data vs space data (7:50) System supports for spatial data (9:50) Explaining “spatial functions” (14:19) Who uses Carto and why (15:52) What’s coming for Carto (19:15) What Gabriel does at Carto (22:22) The coolest things Carto’s done (23:52)

Apr 29, 2022 · 29 min

Data Council Week (Ep 4): The Data Council Origin Story With Pete Soderling

Highlights from this week’s conversation include: Pete’s start in data and Data Council (2:01) Learning more from failure (6:42) Shaping terminology and definitions (9:30) What investors look for in data technology (12:43) Working as a data engineer (16:32) Data Council takeaways (18:16)

Apr 28, 2022 · 22 min

Data Council Week (Ep 3): Product Analytics the Right Way With James Greenhill of PostHog

Highlights from this week’s conversation include: How James got started in data (2:42) What makes PostHog different (10:43) Why we need product analytics (13:40) Capturing and collecting data (15:17) Dealing with drift on a platform like PostHog (19:45) Starting from the metrics versus events (22:50)

Apr 27, 2022 · 29 min

Data Council Week (Ep 2): Testing and Observability Are Two Sides of the Same Coin With Ben Castleton of Great Expectations

Highlights from this week’s conversation include: Ben’s background and career journey (2:13) The birth of Great Expectations (5:02) Defining software engineering (9:38) Adopting open source products (13:04) Working in data versus healthcare (18:01) What's next for Great Expectations (20:29)

Apr 26, 2022 · 26 min

Data Council Week (Ep 1): Discussing Firebolt’s Engine With Benjamin Hopp

Highlights from this week’s conversation include: Ben’s career journey (2:55) What makes Firebolt different (3:58) Firebolt’s data product family (7:37) Table engines and Firebolt (10:57) Ben’s favorite part of ClickHouse (12:52) The experience of building an optimizer (15:19) Where Firebolt fits into architecture (17:27) Working in the data space: to love and dislike (19:51) Coming soon in the near future (24:35)

Apr 25, 2022 · 28 min

The PRQL: A Data Council Austin Quintuple

Eric and Kostas preview an upcoming mini-series for next week featuring conversations with experts at Data Council Austin.

Apr 22, 2022 · 6 min