Highlights from this week’s conversation include: Chad’s background and journey in data (0:46) Importance of Data Supply Chain (2:19) Challenges with Modern Data Stack (3:28) Comparing Data Supply Chain to Real-world Supply Chains (4:49) Overview of Gable.ai (8:05) Rethinking Data Catalogs (11:42) New Ideas for Managing Data (15:16) Data Discovery and Governance Challenges (18:51) Static Code Analysis and AI Impact on Data (24:55) Creating Contracts and Defining Data Lineage (27:31) Data Quality...
Mar 27, 2024•1 hr 3 min•Transcript available on Metacast The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com ....
Mar 25, 2024•8 min•Transcript available on Metacast Highlights from this week’s conversation include: Kevin’s background and work at Stripe (0:31) Evolution of Data Infrastructure at Stripe (2:18) Kevin's Interest in Data (5:29) Software Engineer or Data Engineer? (8:27) Speech Recognition Work at Amazon (11:06) Efficiency and Cost Management (15:50) Metadata and Query Analysis (18:38) Surprising Discoveries in Metadata Analysis (21:43) Optimizing Cost and Value (23:55) Product Sizing Stripe Data (26:39) Popular Tool for Data Interaction (30:08) ...
Mar 20, 2024•1 hr 1 min•Transcript available on Metacast The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com ....
Mar 18, 2024•6 min•Transcript available on Metacast Highlights from this week’s conversation include: Michael’s background and journey in data (0:33) The origin story of Druid (2:39) Experiences and growth in Data (8:08) Druid's evolution (21:46) Druid's architectural decisions (26:32) The user experience (30:06) The developer experience (35:14) The evolution of BI tools (40:55) Data architecture and integration (47:53) AI's impact on BI (52:26) What would Mike be doing if he didn’t work in data? (56:27) Final thoughts and takeaways (57:02) The D...
Mar 13, 2024•1 hr•Transcript available on Metacast The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com ....
Mar 11, 2024•6 min•Transcript available on Metacast Highlights from this week’s conversation include: The evolution of data operations (1:13) Unravel's role in simplifying data operations (2:17) Kunal’s journey from fashion to enterprise data management (5:23) The Unravel platform and its components (10:08) Challenges in data operations at scale (16:34) Users of Unravel within an organization (22:32) Calculating ROI on data products (25:55) Understanding the cost of data operations (27:01) Measuring productivity and reliability (30:59) Diversity...
Mar 06, 2024•53 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming conversation with Kunal Agarwal of Unravel Data.
Mar 04, 2024•5 min•Transcript available on Metacast Highlights from this week’s conversation include: Tony's background and research focus (3:35) Challenges in academia and industry (6:15) Ph.D. student's routine (10:47) Academic paper review process (15:26) Aha moments in research (20:05) Academic lab structure (23:09) The decision to move from hardware to data research (24:43) Research focus on time series data management (27:40) Data modeling in time series and OLAP systems (32:01) Issues and potential solutions for parquet format (37:32) Role...
Feb 28, 2024•51 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming conversation with Tony Wang of Stanford University.
Feb 26, 2024•3 min•Transcript available on Metacast Highlights from this week’s conversation include: Peter's background and journey in data (0:26) Introduction to PLG (4:18) Starting in data at Heroku (6:05) Building the data stack at Heroku (8:13) Data stack requirements for early-stage companies (12:00) Differentiating PLG companies from open source companies (19:26) Venture capital and open source as a lever for growth (22:56) Initial data modeling and analysis (25:38) Operationalizing Data (29:16) Sales and Marketing Operationalization (31:5...
Feb 21, 2024•57 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming conversation with Peter Chapman, a GTM consultant.
Feb 19, 2024•6 min•Transcript available on Metacast Highlights from this week’s conversation include: The overview of refuel (0:33) The evolution of AI and LLMs (3:51) Types of LLM models (12:31) Implementing LLM use cases and cost considerations (15:52) User experience and fine-tuning LLM models (21:49) Categorizing search queries (22:44) Creating internal benchmark framework (29:50) Benchmarking and evaluation (35:35) Using refuel for documentation (44:18) The challenges of analytics (46:45) Using customer support ticket data (48:17) The tag...
Feb 14, 2024•1 hr 7 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming conversation with Rishabh Bhargava of refuel.
Feb 12, 2024•4 min•Transcript available on Metacast Highlights from this week’s conversation include: Viren’s background in data (0:39) Evolution of Orchestration (1:52) AI Orchestration (3:00) Understanding Conductor and orkes (6:26) Event-Driven Orchestration (8:10) Viren’s Transition to Founder (12:27) Non-Technical Aspects of Being a Founder (15:50) Democratizing AI for Developers (18:16) The evolution of microservices orchestration (21:56) Challenges in appealing to the 99% developer group (24:32) Value of orchestration for developers (30:31...
Feb 07, 2024•53 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming conversation with Viren Baraiya of orkes.io.
Feb 05, 2024•4 min•Transcript available on Metacast Highlights from this week’s conversation include: Introduction of the panel (0:05) Defining composable data stack (5:22) Components of a composable data stack (7:49) Challenges and incentives for composable components (10:37) Specialization and modularity in data workloads (13:05) Organic evolution of composable systems (17:50) Efficiency and common layers in data management systems (22:09) The IR and Data Computation (23:00) Components of the Storage Layer (26:16) Decoupling Language and Execut...
Jan 31, 2024•1 hr 19 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming discussion with a panel of experts as Wes McKinney (Co-Founder, Voltron), Pedro Pedreira (Software Engineer, Meta), Chris Riccomini (Seed Investor, various startups), and Ryan Blue (Co-Founder and CEO, Tabular) join the show.
Jan 29, 2024•5 min•Transcript available on Metacast Highlights from this week’s conversation include: Artyom’s background in the data space (0:32) The growth and changes at Cube (5:58) Pain points of managing metrics definitions across different tools (9:39) Trade-offs between coupled and decoupled semantic layers (12:12) Making a case for implementing a semantic layer (14:17) The evolution of semantic layers (23:28) Challenges in designing a decoupled semantic layer (24:16) Different approaches to solving the interface problem (26:58) Implementi...
Jan 24, 2024•58 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming conversation with Artyom Keydunov of Cube Dev.
Jan 22, 2024•3 min•Transcript available on Metacast Highlights from this week’s conversation include: No Code Analytics (1:22) Analytics as a Team Sport (2:31) The workflow of someone without Alteryx (11:27) Alteryx's ability to handle diverse data sources (14:32) The balance between ease of use and complexity (23:06) Enabling casual end users with a no code interface (24:19) Taking analytics to the data (31:47) The boundaries between data engineers and end users (33:44) The importance of collaboration in analytics (34:12) The potential of every ...
Jan 17, 2024•47 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming conversation with Jay Henderson of Alteryx.
Jan 15, 2024•4 min•Transcript available on Metacast Highlights from this week’s conversation include: Matt’s background and journey with Fermyon (2:32) WebAssembly and enhanced security models (3:43) The IOT Startup and Google Acquisition (10:49) Google's Early Containers (11:50) Scaling and anticipating requests (20:22) Introduction to WebAssembly and its importance (23:32) The Benefits of WebAssembly (30:57) Comparison of Virtual Machines, Containers, and Micro VMs (33:12) The Importance of Fast Startup Times in WebAssembly (37:39) Metaphysics ...
Jan 10, 2024•56 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming conversation with Matt Butcher of Fermyon Technologies.
Jan 08, 2024•5 min•Transcript available on Metacast Highlights from this week’s conversation include: The role of an orchestrator in the lifecycle of data (1:34) Relevance of orchestration in data pipelines (2:45) Changes around data ops and MLOps (3:37) Data Cleaning (11:42) Overview of Dagster (13:50) Assets vs Tasks in Data Pipeline (19:15) Building a Data Pipeline with Dagster (25:40) Difference between Data Asset and Materialized Dataset (28:28) Defining Lineage and Data Assets in Dagster (29:32) The boundaries of software and organizatio...
Jan 03, 2024•56 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming conversation with Sandy Ryza of Dagster.
Jan 02, 2024•4 min•Transcript available on Metacast Highlights from this week’s conversation include: The evolution of the data scientist role (1:03) Common problems in different companies (2:05) Measuring and curating content on Reddit (4:29) The challenges of working with unstructured content at Reddit and Twitter (11:03) Lessons learned from Reddit and applying them at Twitter (13:17) Data challenges and customer behavior analysis at GlossGenius (20:16) How the data scientist's role has changed over time (25:10) The essence of the data scie...
Dec 27, 2023•54 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming conversation with Katie Bauer of GlossGenius.
Dec 26, 2023•3 min•Transcript available on Metacast Highlights from this week’s conversation include: The Evolution of Databases and Data Systems (2:33) Abstracting Data for Business Users (4:31) Building a Database for Google-like Search (7:58) The Big Data Explosion (11:10) Selling Myspace as First Customer (13:14) Starting ActionIQ (16:57) The customer-centric organization (22:46) Transitioning to customer data focus (23:53) Understanding business users' needs (28:30) Supporting Arbitrary Queries and Data Models (34:42) Unique Technical Perspe...
Dec 20, 2023•1 hr 6 min•Transcript available on Metacast In this bonus episode, Eric and Kostas preview their upcoming conversation with Tasso Argyros of ActionIQ.
Dec 18, 2023•6 min•Transcript available on Metacast