The Analytics Engineering Podcast

dbt Labs, Inc. · roundup.getdbt.com
Tristan Handy has been curating the Analytics Engineering Roundup newsletter since 2015, pulling together the internet's best data science & analytics articles. Tristan and co-host Julia Schottenstein now bring the Roundup to real life, hosting biweekly conversations with data practitioners inventing the future of analytics engineering. You can view full episode summaries and read back issues of the Roundup newsletter at https://roundup.getdbt.com. The podcast is sponsored by dbt Labs, makers of the data transformation framework dbt. To reach our team, drop a note to podcast@dbtlabs.com.

Episodes

The Hard Problems™️ of Data Observability w/ Kevin Hu of Metaplane

As a PhD candidate at MIT, Kevin (and friends) published Sherlock, a data type detection engine (a surprisingly bedeviling problem) for data cleaning + data discovery. Now as co-founder and CEO of Metaplane, a data observability startup, Kevin applies these same automated data discovery methods to help data teams keep their data healthy. In this conversation with Tristan & Julia, Kevin wins the coveted award for "most crystal-clear explanations of complex technical concepts through physics a...

Apr 08, 2022 · 43 min

The Bundling vs Unbundling Debate w/ Tristan, Benn Stancil and David Jayatillake

A debate has erupted on data Twitter and data Substack - should the modern data stack remain unbundled, or should it consolidate? In this conversation, Benn Stancil (Mode), David Jayatillake (Avora) and our host Tristan Handy try to make some sense of this debate, and play with various future scenarios for the modern data stack. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is ...

Mar 25, 2022 · 43 min

One Database to Rule All Workloads? With Jon "Natty" Natkins of dbt Labs

Will the dream of a mythical database to handle all workloads (transactional + analytical) ever become a reality, or does it violate the laws of physics? This question sparked a hearty debate internally at dbt Labs, and Jon "Natty" Natkins joins Julia here to continue the conversation. Natty knows databases, and this episode will take you on a historical romp through the rise and fall of Hadoop, the transition to cloud data warehouses, and what's waiting for us next in database-land. For full sh...

Mar 11, 2022 · 36 min

Ashley Sherwood (AE @ Hubspot): Permissionless Innovation for Data Teams

Ashley is a Principal Analytics Engineer at Hubspot, and has helped lead their implementation of dbt. Ashley makes unique connections in her writing and work. On her Substack, "syntax error at or near ❤️," Ashley might be found comparing growing companies to butterflies, or going deep on how to accommodate sensitive people in the workplace. In this conversation with Tristan & Julia, Ashley dives into the nuts and bolts of her trajectory pushing data innovation forward at Hubspot. For full sh...

Feb 25, 2022 · 46 min

Tristan in the Hot Seat

In this very special episode, we'll be turning the spotlight on co-host Tristan Handy, the CEO & Co-founder of dbt Labs. In this AMA with Julia, you'll get to know more about Tristan as a human, as a writer, and as the CEO of dbt Labs helping to push the analytics engineering practice forward. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

Dec 17, 2021 · 39 min · Season 1, Ep. 20

[COALESCE] Down With "Data Science" w/ Emilie Schario of Amplify Partners

Your company has one definition for revenue across the organization, one definition of the customer, and one definition of sign-up. For people whose jobs are so defined by ensuring we're aligned, we can't seem to standardize on one definition for the Data Scientist. In this talk, Emilie Schario (Data Strategist-in-Residence at Amplify Partners and longtime dbt community member) proposes we lobby against the title Data Scientist, instead choosing some variation of the Core Four Data Roles: Data A...

Dec 10, 2021 · 46 min · Season 1, Ep. 19

[COALESCE] Peeking Into the Future of Data Analytics w/ Julia

How is the data landscape evolving, what trends should you pay attention to and which should you ignore? In this panel, Julia Schottenstein (our fearless co-host and dbt Labs product manager) catches up with Sarah Catanzaro, Jennifer Li and Astasia Myers to dive into the trends playing out in our work. Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com. The Analytics Engineering Podcast is brought to you by dbt Labs....

Dec 09, 2021 · 45 min · Season 1, Ep. 18

[COALESCE] The Modern Data Experience w/ Benn Stancil of Mode

In this talk, former podcast guest Benn Stancil walks through what he believes the next evolution of the modern data stack should look like - and more importantly, how those who use it should experience it. Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com. The Analytics Engineering Podcast is brought to you by dbt Labs.

Dec 09, 2021 · 30 min · Season 1, Ep. 17

[COALESCE] Data Analytics In A Snowflake World ft. Christian Kleinerman

Where does Snowflake go from here? What meta trends and technologies play into that vision? How does that impact the world of data analytics? Christian and Tristan have no shortage of opinions or ideas. This is your chance to hear some of them, live and unfiltered. Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com. The Analytics Engineering Podcast is brought to you by dbt Labs.

Dec 09, 2021 · 24 min · Season 1, Ep. 16

[COALESCE] You Don't Need Another Database W/ Reynold Xin of Databricks and Drew Banin of dbt Labs

Reynold Xin is a technical co-founder and Chief Architect at Databricks. He's also a co-creator and the top contributor to the Apache Spark project. In this casual conversation with Drew Banin, co-founder and Chief Product Officer at dbt Labs, the two will be discussing the data infrastructure trends they find most interesting. Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com. The Analytics Engineering Podcast is brought to you by dbt L...

Dec 07, 2021 · 30 min · Season 1, Ep. 15

[COALESCE] How big is this wave? Ft. Martin Casado of a16z

The modern data stack is the third generation of data analysis products to come to prominence since the '90s. The prior waves (data warehouse appliances and then Hadoop) were both big steps forward but ultimately failed to live up to their initial promise. Is the modern data stack just another iteration in a long string of "trendy technologies" in data, waves that crash upon the shore but ultimately recede? Or is it somehow more permanent? Register to catch the rest of Coalesce, the Analytics Eng...

Dec 07, 2021 · 45 min · Season 1, Ep. 14

[COALESCE] Scaling Knowledge > Scaling Bodies: Why dbt Labs is making the bet on a data literate organization (ft. Erica Louie of dbt Labs!)

What is it like to build a data team for a company in the data space? This talk is centered around how dbt Labs is building their data team. We will cover how our team is structured, how we operate and interact with the greater organization, and how we set expectations and responsibilities that are helping us become a self-service organization. Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com. The Analytics Engineering Podcast is brough...

Dec 07, 2021 · 26 min · Season 1, Ep. 13

DeVaris Brown: Bringing Streaming Data to Analysts

As a product leader at companies like Heroku and Zendesk, DeVaris specialized in building infrastructure-grade products. Currently, as the CEO of Meroxa, he enables teams to build real-time data infrastructure with the same ease as we now take for granted in batch. In this romp of an episode, Tristan, Julia and DeVaris flow from his experience in tech mentorship, into the nuts and bolts of Change Data Capture (CDC), and how streaming data infrastructure can help data teams provide better end use...

Dec 02, 2021 · 50 min · Season 1, Ep. 12

David Jayatillake: Should Great Data People Become Managers or Not?

David is Sr. Director of Data at Lyst, and as leader of their analytics + data science teams he has followed the evolution of data roles closely over the past decade. David spends a lot of time thinking about career progression + data team structure, and in this conversation with Tristan + Julia they dive into the classic individual contributor vs manager conundrum, migrating between warehouses, and reactive vs proactive data workflows. For full show notes and to read 6+ years of back issues of ...

Nov 18, 2021 · 41 min · Season 1, Ep. 11

Julien Le Dem: Why Data Lineage Matters

Julien has a unique history of building open frameworks that make data platforms interoperable. He's contributed in various ways to Apache Arrow, Apache Iceberg, Apache Parquet, and Marquez, and is currently leading OpenLineage, an open framework for data lineage collection and analysis. In this episode, Tristan & Julia dive into how open source projects grow to become standards, and why data lineage in particular is in need of an open standard. They also cover some of the compelling us...

Nov 04, 2021 · 49 min · Season 1, Ep. 10

Benn Stancil: Friday Night (Data) Fights

Benn is Chief Analytics Officer and a Co-founder at Mode Analytics, but you may know him from his Substack newsletter (benn.substack.com), where each Friday he dives into a semi-controversial topic (recent examples: "Is BI Dead?" and "BI is Dead"). In this episode, Benn, Tristan & Julia finally hash out some of these debates IRL: what *is* the modern data stack, why is the metrics layer important, and what's the point of all of this? For full show notes and to read 6+ years of back issues ...

Oct 21, 2021 · 49 min · Season 1, Ep. 9

Seth Rosen: On Becoming a Full-stack Data Analyst

Seth Rosen has broken data Twitter many times, and in his early-fatherhood sleep deprivation developed a wonderful Twitter persona as the battle-tested data analyst. IRL, though, Seth is a serious data practitioner, and as Founder at the data consultancy HashPath he has helped dozens of companies get into the modern data stack + build public-facing data apps. Now, as the founder of TopCoat, he's empowering analysts to build + publish those same public-facing data apps. In this episode, Tristan, Julia...

Oct 07, 2021 · 39 min · Season 1, Ep. 8

Brittany Bennett: Training the Next Generation of 'Data for Good' Practitioners @ Sunrise Movement

Brittany Bennett is Data Director at Sunrise Movement, the youth climate movement that numbers tens of thousands of members throughout every US state. Given how quickly our industry moves, developing junior data talent is hard, but Brittany's team at Sunrise makes it look easy. And that's no accident—because Sunrise hires for mission alignment rather than technical background, they dedicate significant resources to training + mentorship. In this conversation, Tristan, Julia & Brittany dive d...

Sep 23, 2021 · 39 min · Season 1, Ep. 7

Caitlin Colgrove (CTO @ Hex): Notebooks for the Rest of Us

Caitlin Colgrove is Co-founder & CTO at Hex, a data workspace that allows teams to collaborate in both SQL and Python to publish interactive data apps. In this conversation, Tristan, Julia and Caitlin dive into the possibilities that real-time collaborative notebooks unlock for data teams — what if our collaboration style looked more like Google Docs than a Git workflow? For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.get...

Sep 09, 2021 · 40 min · Season 1, Ep. 6

Erik Bernhardsson: The Missing Tool in the Data Team's Toolbox

Erik Bernhardsson spent six years at Spotify, where he contributed to the first version of the music recommendation system. After a stint as CTO at Better.com, he's now working on building new infrastructure tooling for data teams. In this wide-ranging conversation with Tristan & Julia, Erik dives into the nuts and bolts of Spotify's recommendation algorithm, (paradoxically) why you should rarely need to use ML, and the fundamental infrastructure challenges that drag down the productivity of...

Aug 26, 2021 · 42 min · Season 1, Ep. 5

Meet Co-Host Julia Schottenstein

In this episode, we're going to do something a little different, and turn the spotlight on co-host Julia Schottenstein. In this conversation with Tristan, you'll get to know Julia a bit—from her early childhood ambitions of becoming a "computer tycoon" (adorable!), to working in venture at NEA and now as a Product Manager at dbt Labs. They also dive into Julia's opinions on key trends shaping the future of the data industry (the phrase oligopoly makes an appearance). For full show notes and to r...

Aug 12, 2021 · 32 min · Season 1, Ep. 4

Brian Amadio: The Practice of Experimentation @ Stitch Fix

Brian Amadio is a Data Platform Engineer at Stitch Fix, where experimentation underpins everything they do across merchandising, planning, forecasting, operations and more. In this conversation with Tristan, Julia, and Brian, you'll get into the weeds of executing multi-armed bandit experiments and learn how you can perform experiments even with limited data. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Ana...

Jul 29, 2021 · 39 min · Season 1, Ep. 3

Venkat Venkataramani: The Future is Real-time

Step with Venkat into a world where data is always fresh, queries run in 1ms, and analytics engineers build web-scale, real-time data apps. As Engineering Director at Facebook, Venkat helped build the RocksDB real-time database that powered growth to 5 billion queries per second(!)—and now with his colleagues at Rockset, he's bringing that real-time database infrastructure to the rest of us. In this conversation, Tristan, Julia and Venkat explore the fundamental technological advances that are e...

Jul 15, 2021 · 45 min · Season 1, Ep. 2

Robert Chang: Building the Minerva Metrics Store @ Airbnb

Robert Chang is a product manager for the data platform at Airbnb, where he helped build and roll out Minerva, Airbnb's internal metrics store. They use Minerva to track over 12,000(!) metrics and 4,000(!) dimensions with consistency across the organization. In this conversation with Tristan and Julia, Robert dives into why they built it, what it took to get it done—and crucially, what you should do if your company doesn't have the resources to build your own internal metrics store. For full sho...

Jul 01, 2021 · 38 min · Season 1, Ep. 1
Hosted on Libsyn