The Data Stack Show

RudderStack · datastackshow.com
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

Episodes

91: The Future of Streaming Data with Stripe, Deephaven, Materialize, and Benthos

Highlights from this week’s conversation include: How we should think about batch versus streaming (6:02) Defining “streaming ETL” (9:34) A brief history of streaming processing platforms (22:07) The birth and evolution of Benthos (28:41) What led Jeff to build a new tool (34:29) Why you shouldn’t share all the data (37:23) Making streaming technologies approachable to engineers (42:09) Breaking out of traditional terminology (52:58) The Data Stack Show is a weekly podcast powered by RudderStack...

Jun 15, 2022 · 1 hr

The PRQL: Can Streaming Simplify Your Data Flows?

Eric and Kostas preview their upcoming livestream panel on all things streaming. Don't miss next week's episode with experts from Stripe, Deephaven, Materialize, and Benthos.

Jun 10, 2022 · 3 min

90: The Modern Data Stack Has a Join Problem with Ahmed Elsamadisi of Narrator AI

Highlights from this week’s conversation include: Ahmed’s background and career journey (2:27) Why the modern data stack “sucks” (4:53) The limitations of progress (9:13) Showing data with only 11 columns (11:55) Managing one table that rules them all (19:02) Viewing the world as timestamped activities (32:40) When this model becomes harder to use (35:15) The two parts you need in a company (44:41) Those who use Narrator (48:32) The Data Stack Show is a weekly podcast powered by RudderStack, the...

Jun 08, 2022 · 57 min

89: Solving Microservice Orchestration Issues at Netflix with Viren Baraiya of Orkes

Highlights from this week’s conversation include: Viren’s background and career journey (2:23) Engineering challenges in Netflix transitions (6:05) How Conductor changed the process (9:30) Building a lot more microservices (16:04) Open sourcing Conductor (17:38) Defining “orchestration” (22:05) Using an orchestrator written in Java (31:04) Building a cloud service around microservices (34:59) Differentiating product experiences (37:17) Orchestration platforms in new environments (42:15) Advice f...

Jun 01, 2022 · 52 min

88: What Is Data Observability? With Tristan Spaulding of Acceldata

Highlights from this week’s conversation include: Tristan’s background and career journey (2:43) Updating old technology (11:40) Defining “data observability” (18:44) The primary user of a data observability tool (29:56) Handling an incident (33:01) Why multipliers for data observability (37:06) Early symptoms of a data drift (43:12) Tuning in the context of data engineering (50:11) What keeps Tristan working with data (55:12) The Data Stack Show is a weekly podcast powered by RudderStack, the C...

May 25, 2022 · 1 hr 2 min

87: Why Is Now the Golden Age of Data Analytics? With Cindi Howson of ThoughtSpot

Highlights from this week’s conversation include: Cindi’s career journey (2:36) Major shifts in analytics (6:34) Where we are in formation of the modern analytics cloud (9:07) The process of moving into the cloud (11:01) How to accelerate the digital transformation (17:29) Common patterns amongst company cultures (19:42) Data regulations affecting change (22:34) ThoughtSpot customer base (24:06) The need to know SQL (27:42) Power users leveraging the AI Insights (31:24) Who should audit technolo...

May 18, 2022 · 58 min

86: Solving the Data Quality Problem with Bigeye, Great Expectations, Metaplane, and Lightup.ai

Highlights from this week’s conversation include: Guest introductions (1:02) Defining data quality (4:08) Forgetting to apply software best practices (8:33) Differentiating observability and data quality (17:53) Who should care about quality in the organization (26:55) Why this is still a valid conversation (35:44) The jurisdiction of various components (45:39) The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts...

May 11, 2022 · 59 min

85: You Can Stop Doing Data Fire Drills with Barr Moses of Monte Carlo

Highlights from this week’s conversation include: Barr’s background and career journey (2:12) Trust: a technical or human problem? (9:47) Behind the name “Monte Carlo” (15:41) Defining data accuracy and reliability (17:36) How much can be done with standardization (22:27) How to avoid frustration when generating data about data (25:49) Defining “resolution” (28:59) Understanding the concept of SLAs (33:25) Building a company for a category that doesn’t exist yet (37:40) What it looks like to use...

May 04, 2022 · 52 min

Data Council Week (Ep 5): A Primer on Spatial Data With Gabriel Hidalgo of Carto

Highlights from this week’s conversation include: How Gabriel got into data (1:54) What Carto is (5:28) Location data vs spatial data (6:37) Time data vs space data (7:50) System supports for spatial data (9:50) Explaining “spatial functions” (14:19) Who uses Carto and why (15:52) What’s coming for Carto (19:15) What Gabriel does at Carto (22:22) The coolest things Carto’s done (23:52) The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to...

Apr 29, 2022 · 29 min

Data Council Week (Ep 4): The Data Council Origin Story With Pete Soderling

Highlights from this week’s conversation include: Pete’s start in data and Data Council (2:01) Learning more from failure (6:42) Shaping terminology and definitions (9:30) What investors look for in data technology (12:43) Working as a data engineer (16:32) Data Council takeaways (18:16) The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintainin...

Apr 28, 2022 · 22 min

Data Council Week (Ep 3): Product Analytics the Right Way With James Greenhill of PostHog

Highlights from this week’s conversation include: How James got started in data (2:42) What makes PostHog different (10:43) Why we need product analytics (13:40) Capturing and collecting data (15:17) Dealing with drift on a platform like PostHog (19:45) Starting from the metrics versus events (22:50) The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building a...

Apr 27, 2022 · 29 min

Data Council Week (Ep 2): Testing and Observability Are Two Sides of the Same Coin With Ben Castleton of Great Expectations

Highlights from this week’s conversation include: Ben’s background and career journey (2:13) The birth of Great Expectations (5:02) Defining software engineering (9:38) Adopting open source products (13:04) Working in data versus healthcare (18:01) What's next for Great Expectations (20:29) The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintai...

Apr 26, 2022 · 26 min

Data Council Week (Ep 1): Discussing Firebolt’s Engine With Benjamin Hopp

Highlights from this week’s conversation include: Ben’s career journey (2:55) What makes Firebolt different (3:58) Firebolt’s data product family (7:37) Table engines and Firebolt (10:57) Ben’s favorite part of ClickHouse (12:52) The experience of building an optimizer (15:19) Where Firebolt fits into architecture (17:27) Working in the data space: to love and dislike (19:51) Coming soon in the near future (24:35) The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for develo...

Apr 25, 2022 · 28 min

84: Why Are Analytics Still So Hard? With Kaycee Lai of Promethium

Highlights from this week’s conversation include: Kaycee’s background and career journey (2:34) Why analytics are hard (7:28) Defining “data management” (11:47) Defining “data virtualization” (15:57) The relationship between data virtualization and ETL (18:34) Where a company should invest first (21:40) Building without a Frankenstein stack (25:19) How Promethium solves data stack issues (27:53) Giving context to data (35:14) Cataloging: background, at Promethium, future (39:29) Who uses data ca...

Apr 20, 2022 · 56 min

83: Closing the Gap Between Business Analytics and Operational Analytics With Max Beauchemin of Preset

Highlights from this week’s conversation include: Max’s career journey and role today (2:56) Hitting the limits of traditional BI (11:06) The most influential technology (14:34) Merging with BI and visualization (17:35) Two thoughts on real-time (21:02) Defining BI (24:53) How many have actually achieved self-serve BI (29:54) How preset.io fits in the BI architecture of today (32:36) How to use preset.io to expose analytics (35:23) The analytics process to power something like embedded (42:45) O...

Apr 13, 2022 · 57 min

82: Databases: The Fun Never Stops with Robert Hodges of Altinity

Highlights from this week’s conversation include: Robert’s background and career journey (2:21) How studying languages influences database work (5:13) Why Robert has been working with databases for 40+ years (7:50) Explaining the ClickHouse database (10:43) How ClickHouse is able to focus on latency (13:39) The use cases behind ClickHouse (19:19) How ClickHouse is different than other databases (25:47) Why old problems are just now getting addressed (29:04) How ClickHouse works with others again...

Apr 06, 2022 · 1 hr 2 min

81: Digging into Data Ops with Prukalpa Sankar of Atlan

Highlights from this week’s conversation include: Prukalpa’s background and career journey (3:16) Applying a data-driven mindset to poverty (7:21) What Atlan does (11:53) The makeup of a realistically functioning data team (15:25) How to create a company’s first data team (18:13) Defining “agile data” (22:01) The necessity of data ops (26:36) The minimum data stack needed (29:16) Data team size (31:58) Where to start when you need to make adjustments (34:51) Collaborate with different parts of t...

Mar 30, 2022 · 56 min

80: Is Reverse-ETL Just Another Data Pipeline? With Census, Hightouch, & Workato

Highlights from this week’s conversation include: Panel introductions (2:23) What is driving the trend behind Reverse ETL? (5:24) The obstacles to building an internal Reverse ETL tool at scale (15:34) How to decide system management vs. user flexibility (20:14) Why previous products failed in creating this category (29:12) Increased demand and democratization of datastack skills via SaaS (42:03) Broader applications for Reverse ETL (47:29) Limitations of Reverse ETL (55:05) How user technical a...

Mar 23, 2022 · 1 hr 16 min