The Data Stack Show

RudderStack · datastackshow.com
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

Episodes

Data Council Week (Ep 2) - The Convergence of MLOps and DataOps With Team Featureform

Highlights from this week’s conversation include: Introducing the team from Featureform (0:31) In the work vs. leading the work (3:01) Difference between MLOps and DataOps (7:06) The MLOps cycle (10:12) What is Featureform and what makes it different? (13:30) Is there another layer needed in feature stores? (18:46) Getting in touch with Featureform (23:55) The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, an...

Apr 24, 2023 · 26 min

Data Council Week (Ep 1) - The Evolution of Stream Processing With Eric Sammer of Decodable

Highlights from this week’s conversation include: Eric’s journey to becoming CEO of Decodable (0:20) Does real time matter? (2:12) Differences in stream processing systems (7:57) Processing in motion (13:04) Why haven’t there been more open source projects around CDC? (20:34) The Decodable experience and future focuses for the company (24:31) Stream processing and data lakes (32:54) Data flow processing technologies of today (39:01) The Data Stack Show is a weekly podcast powered by RudderStac...

Apr 23, 2023 · 43 min

135: Database Knob Tuning and AI with Andy Pavlo and Dana Van Aken of OtterTune

Highlights from this week’s conversation include: Origins of OtterTune (4:43) The problem of knob tuning (6:25) Roles of machine learning (9:32) OtterTune’s development and industry recognition (12:03) The challenges of database tuning and the role of human expertise (16:15) Tuning in production (20:23) Observability and Data Collection (23:37) Data Security and Privacy (29:59) Optimizing on-prem vs. cloud workloads (35:52) Performance benchmarks (40:20) Future opportunities OtterTune is focusin...

Apr 19, 2023 · 1 hr 16 min

134: Unpacking the AI Revolution and the Technology Behind A Feature-First Future with H.O. Maycotte of FeatureBase

Highlights from this week’s conversation include: The journey of H.O. into data and becoming the CEO of FeatureBase (2:37) Characteristics of the super evolution in technology (6:36) ChatGPT as the missionary of AI (9:45) The tension between authenticity and technology (13:12) What is FeatureBase? (17:53) Comparing FeatureBase to feature stores (25:58) Workload capacities and possibilities in FeatureBase (33:20) The importance of developer experience on a platform (38:23) Exciting developments f...

Apr 12, 2023 · 59 min

133: Building the Data Warehouse for Everything Else with Sammy Sidhu of Eventual

Highlights from this week’s conversation include: Sammy’s background in data and tooling (2:46) Going from multipurpose engineering to a CTO position (5:14) Changes in technology and deep learning models (7:31) The state of self-driving and adoption (13:49) What is Eventual and what are they solving in the space? (20:54) What are Daft and data frames, and how do they work? (28:11) Building a query optimizer (33:42) Sammy’s take on what is going on in data and future possibilities (45:18) Eventual’s f...

Apr 05, 2023 · 58 min

132: Data Quality and Data Contracts with Chad Sanderson of Data Quality Camp

Highlights from this week’s conversation include: Chad’s background in data (2:10) Breaking down data quality (4:02) Semantic and logical layers of data (10:04) What are data contracts and how do they work? (17:41) Implicit contracts at companies (24:01) Where do data contracts fit in data infrastructure? (28:14) The value of data contracts to the producer and consumer (31:18) Tools needed in effective data contracts (46:13) The importance of community in data quality (50:53) Getting connected t...

Mar 29, 2023 · 1 hr 7 min

131: How Data Teams Interact With Marketing Tools with Jason Davis of Simon Data

Highlights from this week’s conversation include: Defining CDPs (2:28) The data team's role in marketing (7:41) Leveraging commonalities across businesses (12:49) Building a CDP with customer data (18:05) Challenges in identity modeling (23:00) CDP lifecycle and one-to-one data (30:06) Segmentation and optimization (33:23) Real-time data in the cloud (40:37) The future of AI and machine learning (43:02) Final thoughts and takeaways (46:42) The Data Stack Show is a weekly podcast powered by Rudde...

Mar 22, 2023 · 48 min

130: From Business Intelligence to Product Analytics and Beyond with Vijay Ganesan of NetSpring.io

Highlights from this week’s conversation include: Vijay’s background in data (2:09) The journey of founding ThoughtSpot and its impact in the world of BI (2:49) The maturation of BI (6:34) What is NetSpring.io? (8:21) Bridging the gap of BI and product analytics (14:41) Why data warehouses and not time-series databases? (19:58) The difficulty of using SQL in product analytics (28:35) Challenges in pricing models for product analytics and tooling (35:41) Combining analytics and attribution (42:00...

Mar 15, 2023 · 58 min

129: Databases, Data Warehouses, and Timeseries Data with David Kohn of Timescale

Highlights from this week’s conversation include: David’s background and journey to Timescale (2:12) What are time series databases? (14:13) How Timescale would have impacted David’s trajectory early in his career (17:51) Innovation in PostgreSQL (21:02) Why does Timescale build their timeseries databases differently? (27:08) The challenges of building a new database on top of old software (32:22) Writing outside of SQL and Timescale’s secret sauce (37:47) The importance of the developer expe...

Mar 08, 2023 · 1 hr 9 min

The PRQL: Time-Series Data 101

In this bonus episode, Eric and Kostas preview their upcoming conversation with David Kohn of Timescale.

Mar 06, 2023 · 4 min

128: The Possibilities Are Endless for Synthetic Data with Alex Watson of Gretel.ai

Highlights from this week’s conversation include: Alex’s background working for NSA and starting a company (1:51) The Gretel.ai journey (9:30) Defining synthetic data (13:26) The evolution of AI in deep learning data and language learning (16:28) The properties of synthetic data (21:31) Boundaries between synthetic data and prediction models (25:52) The developer experience in Gretel.ai (36:44) Stewardship and expansion of deep learning models in the future (45:36) Final thoughts and takeaways (...

Mar 01, 2023 · 57 min

127: The Anatomy of a Data Lakehouse with Alex Merced of Dremio

Highlights from this week’s conversation include: Alex’s background in the data space (2:41) Comics and pop culture blending with finance training (5:20) What is a data lakehouse? (7:36) What is Dremio solving for users? (11:21) Essential components of a data lakehouse (16:35) Difference between on-prem and cloud experiences (33:53) What does it mean to be a developer advocate? (41:31) Final thoughts and takeaways (49:02) The Data Stack Show is a weekly podcast powered by RudderStack, the C...

Feb 22, 2023 · 53 min

126: Crossing the Product Analytics Chasm with Spenser Skates of Amplitude Analytics

Highlights from this week’s conversation include: Spenser’s journey to Co-Founding Amplitude (3:02) Looking back over the last decade of success at Amplitude (8:31) Going from Engineer to Sales (14:41) Comparing product analytics and general analytics (20:11) How cloud data warehousing has impacted analytics (31:38) Providing an out-of-the-box experience for consumers (41:12) Final thoughts and takeaways (54:27) The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for develope...

Feb 15, 2023 · 58 min

125: Authorization Is A Data Problem with Jeff Chao of Abbey Labs

Highlights from this week’s conversation include: Jeff’s background at Netflix and Stripe leading him to Abbey Labs (2:22) What Abbey is solving in the space (5:16) Tackling permissions in an organization (7:30) Opportunities to improve the availability of data (10:14) The challenge of tackling a new problem area at a new company (14:59) What are the most common challenges in the identity and security space? (18:43) Importance of identity and the ability to track it in data (22:46) Connecting all ...

Feb 08, 2023 · 55 min

124: Pragmatism About Data Stacks with Pedram Navid of West Marin Data

Highlights from this week’s conversation include: Pedram’s journey into the world of data (4:05) What should the data stack at an early-stage startup look like? (9:53) New ideas surrounding access control for data (24:45) What can data teams learn about complexity from software engineering? (30:55) Scaling up instead of scaling out in processing data (37:40) Why DuckDB is making so much noise in the market (41:06) Final thoughts and takeaways (53:25) The Data Stack Show is a weekly podcast powere...

Feb 01, 2023 · 57 min

123: What Is a Universal Database? Featuring Stavros Papadopoulos of TileDB, Inc.

Highlights from this week’s conversation include: Stavros’ journey into data and founding TileDB (3:12) What problem was TileDB going to solve? (12:05) Defining database systems (21:35) What part of database architecture is TileDB? (31:58) Storage engine solutions (42:37) What does the API look like in using TileDB? (50:40) What makes genomics unique in working with data (55:28) Final thoughts and takeaways (1:06:46) The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for dev...

Jan 25, 2023 · 1 hr 11 min

122: Why Accounting Needs Its Own Database with Joran Greef of Tiger Beetle

Highlights from this week’s conversation include: Joran’s background leading him from accounting to coding (3:10) What is Tiger Beetle? (5:53) Double-entry accounting and why it is important for a database (12:28) The need for low latency and high throughput (26:27) Why financial database software needs a laser focus (29:01) What are people using to implement a double-entry system? (36:09) Safety in financial software and addressing storage faults (40:26) Final thoughts and takeaways (55:52) The...

Jan 18, 2023 · 1 hr