The Data Stack Show

RudderStack · datastackshow.com
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

Episodes

121: Materialize Origins: Breaking Down Data Flow Layers with Arjun Narayan and Frank McSherry

Highlights from this week’s conversation include: Defining data flow (2:31) Are there limitations in timely data flow operation and/or building operators? (8:20) Areas of incremental computation that are having an impact today (17:10) Building a library vs building a product (24:06) Combining delight and empathy into a focus (27:52) Final thoughts and takeaways (32:42) The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, ...

Jan 11, 2023 · 37 min

120: Materialize Origins: A Timely Dataflow Story with Arjun Narayan and Frank McSherry

Highlights from this week’s conversation include: What is Materialize? (2:43) Frank and Arjun’s journey in data and what led them to the idea of Materialize (6:22) The good and the bad of research in academia vs starting a company (25:20) The MVP for databases (33:49) Materialize’s end-to-end benefit for the user experience (43:03) Interchanging Materialize in warehouse and cloud data usage (48:25) The trade-offs within Materialize (1:00:02) Final takeaways and previewing part two of the convers...

Jan 04, 2023 · 1 hr 14 min

119: The Data Stack Show Wrapped: 2022

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com ....

Dec 28, 2022 · 12 min

118: Bringing Powerful Business Intelligence to Mobile with Zack Hendlin of Zing Data

Highlights from this week’s conversation include: Zack’s extensive background in the world of data and the genesis of Zing (3:02) Working on relevance, feeds, and ads at Facebook & LinkedIn (9:20) Exploring BI and queries on mobile devices (16:48) Reliance of input quality in data (23:28) Delivering a mobile-first experience in BI (30:11) Limitations of visualization on mobile devices (34:00) How BI tools interact with one another in Zing (45:21) The future of user-experience in consuming data (...

Dec 21, 2022 · 1 hr 3 min

117: DX for Data Tooling with Taylor Murphy of Meltano

Highlights from this week’s conversation include: Taylor’s journey into data (3:09) What’s been going on at Meltano recently? (7:28) Addressing basic problems in data even with advancements in technology (12:23) What makes Meltano unique in the space (16:53) Why the CLI experience is important (25:37) Quality vs quantity in supporting connectors (35:51) What does data ops look like for Meltano (46:44) Takeaways and closing thoughts (52:56)

Dec 14, 2022 · 56 min

116: Data Democratization & Self Service with Aron Clymer of Data Clymer

Highlights from this week’s conversation include: Aron’s background in the world of data (2:18) Recent clients and major projects (3:30) Helping to spearhead data-driven growth at Salesforce (6:50) Stories about Marc Benioff, co-founder of Salesforce (16:12) Biggest learnings as a consultant in the data strategy space (17:58) The need for data democratization (23:33) Advice for Aron’s younger self in consulting (28:45) Current trends in data democratization and self service (35:01) Aron’s favor...

Dec 07, 2022 · 54 min

115: What Is Production Grade Data? Featuring Ashwin Kamath of Spectre

Highlights from this week’s conversation include: Ashwin’s background in the data space (2:43) The unique nature of working with data in finance (7:32) Technological challenges of working in the finance data space (13:55) The third-party data factor and judging if it is reliable enough (17:07) What made Ashwin decide to go out and build his own company? (31:47) Defining data decay and data storing and why both are important (37:52) Advice on the importance of data quality (42:10) Final takeaways...

Nov 30, 2022 · 55 min

114: Solving Data Infrastructure Problems at Startups and Enterprises with Max Werner of Obsessive Analytics Consulting

Highlights from this week’s conversation include: Max’s career journey (2:54) Going from a small startup to a big enterprise (11:15) Dynamics of a switchboard operator (17:09) Common threads through different companies (20:53) When data is not the answer (26:57) The evolution of CDP (29:38) Data sources to include in a CDP (35:16) Working with event data (37:19) Max’s take on other tools (41:18) The cutting edge in data (43:09) Building your data company in an evolving environment (49:28) Find M...

Nov 23, 2022 · 59 min

The PRQL: The Data Switchboard

In this bonus episode, Eric and Kostas preview their upcoming conversation with Max Werner of Obsessive Analytics Consulting.

Nov 21, 2022 · 3 min

113: What Is Streaming Graph? Featuring Ryan Wright of thatDot

Highlights from this week’s conversation include: Ryan’s background and career journey (2:49) Quine and where it came from (4:36) Graph databases 101 (7:17) Use cases for graph databases (13:44) Purposes for graphs (22:27) How to use Quine (31:49) Quine’s performance and scale (43:06) Educating users about a new product (49:13) The team that would optimize Quine (52:23) When graph will gain popularity (56:15) Quine: https://quine.io/

Nov 16, 2022 · 1 hr 5 min

The PRQL: Graph as a Utility

In this bonus episode, Eric and Kostas preview their upcoming conversation with Ryan Wright of thatDot.

Nov 14, 2022 · 3 min

112: Python Native Stream Processing with Zander Matheson of bytewax

Highlights from this week’s conversation include: Zander’s background and career journey (2:32) Introducing bytewax (5:16) The difference between systems (10:57) Bytewax’s most common use cases (16:15) How bytewax integrates with other systems (20:25) The technology that makes up bytewax (24:31) Comparing bytewax to other systems (34:17) What’s next for bytewax (36:31) Try it out: bytewax.io

Nov 09, 2022 · 50 min

111: What if Your Code Just Ran in the Cloud for You? Featuring Erik Bernhardsson of Modal Labs

Highlights from this week’s conversation include: Erik’s background and career journey (2:51) Managing scale in a rapidly changing environment (6:35) The people side of hypergrowth (12:36) Coding competitions (17:50) Introducing Modal Labs (19:02) How Erik got into building Modal (21:45) The employee experience at Modal (28:09) How a data engineering team would use Modal (31:21) What it takes to build a platform like Modal (36:27) What makes Modal different (42:49) Evolution coming for the data ...

Nov 02, 2022 · 58 min

110: How Can Data Discovery Help You Understand Your Data? Featuring Shinji Kim of Select Star

Highlights from this week’s conversation include: Shinji’s background and career journey (3:35) Defining “data discovery” (6:03) The best conditions to use Select Star (8:45) Where Select Star fits on the data spectrum (13:38) Why Select Star is needed (17:35) How Select Star uses metadata (21:02) Exposing data queries (27:04) Composing queries into metadata (33:27) Automating BI tools (37:28) Limits to data governance (41:39) Maintaining economies of scale (48:56)

Oct 26, 2022 · 58 min

109: How Does Headless Business Intelligence Work? Featuring Artyom Keydunov and Pavel Tiunov of Cube Dev

Highlights from this week’s conversation include: The context of Headless BI (3:31) What Cube Dev does (9:24) How Headless BI works with other tools (13:03) An analysis of LookML (18:04) User interaction with Cube Dev (23:40) Who manages data artifacts (25:22) Taking care of the developer experience (30:37) Levels of performance (30:37) Artyom and Pavel’s background and career journey (35:47) Why you should use Cube Dev (43:38) Roles within a data organization (48:55) How Cube Dev impacts visual...

Oct 19, 2022 · 1 hr 1 min

108: You Can’t Separate Data Reliability From Workflow with Gleb Mezhanskiy of Datafold

Highlights from this week’s conversation include: Gleb’s background and career journey (2:51) The adoption problems (10:53) How Datafold solves these problems (18:08) The vision for Datafold (26:27) Incorporating Datafold as a data engineer (38:53) The importance of the data engineer (42:12) Something to keep in mind when designing data tools (46:46) Implementing new technology into your company (53:18)

Oct 12, 2022 · 1 hr