The Data Stack Show

Rudderstack · datastackshow.com
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

Episodes

156: Simple, Performant, Cost-effective Data Streaming with Alex Gallego of Redpanda Data

Highlights from this week’s conversation include: Alex’s background in the data space and the creation of Redpanda (4:23) The cost and complexity of streaming (11:07) The evolution of storage with Kafka (12:04) The distinction between streaming technologies (15:10) Simplicity as a Core Design Principle (27:03) Cost Efficiency in a Cloud Native Era (30:44) Removing complexity with Redpanda (34:21) Migrations and compatibility with Redpanda (40:35) The Future of Redpanda (43:44) The Story Behind R...

Sep 20, 2023 · 55 min

155: Bringing Innovation to Enterprise Resource Planning with Emilie Schario of Turbine

Highlights from this week’s conversation include: Emilie’s background and journey in data (3:42) The problem of three-way match (8:56) Operational workflows and how data stacks solve them (13:16) Turbine’s solution as a lightweight ERP (14:05) Workflows and analytics (14:59) Consolidating information into helpful application (27:41) Challenges in operational workflows (32:19) Friction and hurdles in ERP usage (39:28) A solution for purchase order management (40:47) Turbine’s focus and limitation...

Sep 13, 2023 · 1 hr 1 min

154: Making Cross-Company Data Exchange Easy with Pardis Noorzad of General Folders

Highlights from this week’s conversation include: Pardis’ background and journey in data (3:24) AI before the hype (8:37) Founding General Folders (12:36) Data collaboration challenges (15:31) Examples of data sharing (17:40) Data transfer in various industries (22:16) Defining the transfer problem (28:30) The demand for scalable solutions (32:06) Data transfer and model exposition (41:02) Data governance and API (43:23) Final thoughts and takeaways (56:48) The Data Stack Show is a weekly podcas...

Sep 06, 2023 · 1 hr 3 min

153: The Future of Data Science Notebooks with Jakub Jurových of Deepnote

Highlights from this week’s conversation include: Jakub’s journey into data and working with notebooks (2:43) Overview of Deepnote and its features (7:22) Notebook 1.0 and 2.0 (14:04) Notebook 3.0 and its potential impact (15:46) The need for collaboration across organizations (17:16) Real-time, asynchronous, and organizational collaboration (28:02) Challenges to collaboration (32:03) Notebooks as a universal computational medium (36:14) The rise of exploratory programming (41:40) The power of n...

Aug 30, 2023 · 1 hr

152: Three Steps To Enhance Product Analytics with Ken Fine of Heap

Highlights from this week’s conversation include: Ken’s background and journey to Heap (2:32) Heap’s problem-solving approach (8:19) Auto-capture and its significance in the marketplace (13:03) Providing qualitative context: sessions and surveys (16:23) Collection and storage of data (25:42) Challenges of real-time data collection (26:40) The true gap in the market today (37:39) Consolidation and aggregation of data solutions (41:58) Simplifying the data stack (47:32) A different approach in eng...

Aug 23, 2023 · 1 hr 7 min

151: How To Unlock the Data Warehouse for Marketing with Chris Sell of GrowthLoop

Highlights from this week’s conversation include: The need for reverse ETL in marketing (2:24) Closing the gap between engineering, data, and marketing teams (8:37) The analytics persona’s opportunity (11:53) Interface layer (13:06) Approach to messy warehouse data (15:57) The need for a complicated infrastructure (28:43) Challenges in data integration for marketers (29:26) The evolution of the analytics stack (31:53) Orchestration of the data warehouse (38:39) The role of marketing tools (40:35...

Aug 16, 2023 · 53 min

150: How Salespeople Use Data, Salesforce vs. Snowflake, and How LLMs Are Transforming Sales with Brendan Short of Groundswell

Highlights from this week’s conversation include: Brendan’s background and journey to Groundswell (2:25) The impact of generative AI on sales reps and product building (5:38) Lead sourcing challenges (12:22) Salesforce as a data model (14:30) The need for guardrails in building applications around sales (24:37) The question of interfaces in the layers of Salesforce (26:11) A UI solution for sales and marketing (30:45) The future of logic and machine learning models (37:11) The battle for data ow...

Aug 09, 2023 · 1 hr 12 min

149: Turning Tables Into APIs for Real-time Data Apps, Featuring Matteo Pelati and Vivek Gudapuri of Dozer

Highlights from this week’s conversation include: Building Dozer: Simplifying Data Sources into APIs (1:13) Bridging Data Engineering with Application Engineering (4:19) Turning Data Sources into APIs (7:46) The cost of caching (12:59) Challenges with legacy systems (14:30) Real-time data integration (19:31) YAML and SQL experience (25:37) Behind the scenes of Dozer (29:18) Heavy Workloads and Low Latency (42:00) Use Cases of Dozer (45:51) Reliability and storing data from different connectors (...

Aug 02, 2023 · 1 hr 4 min

148: Exploring the Intersection of DAGs, ML Code, and Complex Code Bases: An Elegant Solution Unveiled with Stefan Krawczyk of DAGWorks

Highlights from this week’s conversation include: Stefan’s background in data (2:39) What is DAGWorks? (3:55) How building point solutions influenced Stefan’s journey (5:03) Solving the tooling problems of self-service at an organization (11:44) Creating Hamilton (15:53) How Hamilton works with definitions and time-series data (19:34) What makes Hamilton an ML-oriented framework? (23:39) Navigating the differences between ML teams and other data teams (26:27) Understanding the fundamentals of Ha...

Jul 26, 2023 · 57 min

The PRQL: A Methodology for Better DAGs with Stefan Krawczyk of DAGWorks

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com ....

Jul 24, 2023 · 4 min

Shop Talk: Snowflake Summit Recap

Jul 21, 2023 · 20 min

147: Where Data and Infrastructure Converge Featuring Lars Kamp of Resoto

Highlights from this week’s conversation include: Lars’ work on Resoto in helping to cut cloud costs for organizations (2:02) The trend of large resources to micro resources (5:59) What are some of the typical resource drains in data infrastructure (8:56) Managing cost on the backend with scale and experimentation (12:51) Solutions for resource management problems (17:38) How Resoto is solving pain points in resource management (26:17) Navigating the complexities of data infrastructure (29:01) Re...

Jul 19, 2023 · 58 min

146: What Is a Customer Data Platform? Featuring Soumyadeb Mitra of Rudderstack

Highlights from this week’s conversation include: Soumyadeb’s background and journey in data (5:49) Defining customer data (8:10) The complexity of customer data collection (10:04) What is a CDP and how it is properly deployed (17:12) Bridging the gap of data collection and useful analytics for marketing (21:46) How Rudderstack translates data and the new profile feature (25:30) The foundations of data in building a 360 degree customer profile (30:30) Solutions for the intersection between engin...

Jul 12, 2023 · 52 min

The PRQL: Synthetic Data and Self Driving Cars with Omar Maher of Parallel Domain

Jul 03, 2023 · 6 min

144: Explaining Features, Embeddings, and the Difference Between ML and AI with Simba Khadder of Featureform

Highlights from this week’s conversation include: Simba’s background in the data space (3:05) Subscription intelligence (6:41) ML and Distributed Systems (9:09) The Brutal Subscription Industry (12:31) Serendipity in Recommender Systems (16:31) Subscription as a Strategy (20:47) Customizing Content for Subscribers (22:19) Creating User Embeddings (25:53) Building Featureform (28:01) Embedding Projections (32:47) Spaces and similarity (35:53) User embeddings and transformer models (38:22) Vector ...

Jun 28, 2023 · 1 hr 12 min

Shop Talk: Accountability and Opportunity for AI

Jun 23, 2023 · 20 min

143: Collaborative Data Analytics on the Data Warehouse, featuring Rob Woollen & Stipo Josipovic of Sigma

Highlights from this week’s conversation include: Stipo and Rob’s background in data (2:43) What is Sigma? (7:46) Takeaways from building analytics products in-house (9:16) Sigma’s approach to datastore interface (11:32) Why analytics and BI are still not a solved problem (15:50) Combining SQL and spreadsheets for useful interface (23:17) The evolution of BI to today (29:40) Overcoming the challenges of collaboration in working with data (33:17) Creating operational coding that humans can unders...

Jun 21, 2023 · 1 hr 15 min