AI Engineering Podcast

Tobias Macey•www.aiengineeringpodcast.com

This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, apply AI to your work, and the considerations involved in building or customizing new models. It covers everything that you need to know to deliver real impact and value with machine learning and artificial intelligence.

Last refreshed: February 25th, 2026 at 12:34 AM

Episodes

Kubernetes, Compliance, and Control: The Operational Backbone of AI Sovereignty

Summary In this episode of the AI Engineering Podcast, Steven Watt, leader of the Office of the CTO at Red Hat, discusses practical paths to achieving AI sovereignty for organizations. He shares his two-decade experience in AI, highlighting how governments are building GPU platforms and protected data hubs to maintain control over AI workloads. Steve emphasizes why self-managed infrastructure is becoming a strategic necessity as companies outgrow cloud costs and require tighter control over mode...

Feb 25, 2026•1 hr 1 min•Ep. 78

From Blind Spots to Observability: Operationalizing LLM Apps with OpenLit

Summary In this episode of the AI Engineering Podcast, Aman Agarwal, creator of OpenLit, discusses the operational foundations required to run LLM-powered applications in production. He highlights common early blind spots teams face, including opaque model behavior, runaway token costs, and brittle prompt management, emphasizing that strong observability and cost tracking must be established before an MVP ships. Aman explains how OpenLit leverages OpenTelemetry for vendor-neutral tracing across ...

Feb 15, 2026•51 min•Ep. 77

Taming Voice Complexity with Dynamic Ensembles at Modulate

Summary In this episode of the AI Engineering Podcast, Carter Huffman, co-founder and CTO of Modulate, discusses the engineering behind low-latency, high-accuracy Voice AI. He explains why voice is a uniquely challenging modality due to its rich non-textual signals like tone, emotion, and context, and how simple speech-to-text-to-speech pipelines can't capture the necessary nuance. Carter introduces Modulate's Ensemble Listening Model (ELM) architecture, which uses dynamic routing and cost-based...

Feb 08, 2026•59 min•Ep. 76

GPU Clouds, Aggregators, and the New Economics of AI Compute

Summary In this episode I sit down with Hugo Shi, co-founder and CTO of Saturn Cloud, to map the strategic realities of sourcing and operating GPUs across clouds. Hugo breaks down today’s provider landscape—from hyperscalers to full-service GPU clouds, bare metal/concierge providers, and emerging GPU aggregators—and how to choose among them based on security posture, managed services, and cost. We explore practical layers of capability (compute, orchestration with Kubernetes/Slurm, storage, netw...

Jan 27, 2026•46 min•Ep. 75

The Future of Dev Experience: Spotify’s Playbook for Organization‑Scale AI

Summary In this episode of the AI Engineering Podcast Niklas Gustavsson, Chief Architect at Spotify, talks about scaling AI across engineering and product. He explores how Spotify's highly distributed architecture was built to support rapid adoption of coding agents like Copilot, Cursor, and Claude Code, enabled by standardization and Backstage. The conversation covers the tension between bottom-up experimentation and platform standardization, and how Spotify is moving toward monorepos and flee...

Jan 20, 2026•56 min•Ep. 74

Generative AI Meets Accessibility: Benchmarks, Breakthroughs, and Blind Spots with Joe Devon

Summary In this episode Joe Devon, co-founder of Global Accessibility Awareness Day (GAAD), talks about how generative AI can both help and harm digital accessibility — and what it will take to tilt the balance toward inclusion. Joe shares his personal motivation for the work, real-world stakes for disabled users across web, mobile, and developer tooling, and compelling stories that illustrate why accessible design is a human-rights issue as much as a compliance checkbox. He digs into AI’s curre...

Jan 05, 2026•56 min•Ep. 73

Beyond the Chatbot: Practical Frameworks for Agentic Capabilities in SaaS

Summary In this episode product and engineering leader Preeti Shukla explores how and when to add agentic capabilities to SaaS platforms. She digs into the operational realities that AI agents must meet inside multi-tenant software: latency, cost control, data privacy, tenant isolation, RBAC, and auditability. Preeti outlines practical frameworks for selecting models and providers, when to self-host, and how to route capabilities across frontier and cheaper models. She discusses graduated autono...

Dec 29, 2025•54 min•Ep. 72

MCP as the API for AI‑Native Systems: Security, Orchestration, and Scale

Summary In this episode Craig McLuckie, co-creator of Kubernetes and founder/CEO of Stacklok, talks about how to improve security and reliability for AI agents using curated, optimized deployments of the Model Context Protocol (MCP). Craig explains why MCP is emerging as the API layer for AI‑native applications, how to balance short‑term productivity with long‑term platform thinking, and why great tools plus frontier models still drive the best outcomes. He digs into common adoption pitfalls (to...

Dec 16, 2025•1 hr 8 min•Ep. 71

Context as Code, DevX as Leverage: Accelerating Software with Multi‑Agent Workflows

Summary In this episode Max Beauchemin explores how multiplayer, multi‑agent engineering is reshaping individual and team velocity for building data and AI systems. Max shares his journey from Airflow and Superset to going all‑in on AI coding agents, describing a pragmatic “AI‑first reflex” for nearly every task and the emerging role of humans as orchestrators of agents. He digs into shifting bottlenecks — code review, QA, async coordination — and how better DevX/AIX, just‑in‑time context via to...

Nov 24, 2025•1 hr•Ep. 70

Inside the Black Box: Neuron-Level Control and Safer LLMs

Summary In this episode of the AI Engineering Podcast Vinay Kumar, founder and CEO of Arya.ai and head of Lexsi Labs, talks about practical strategies for understanding and steering AI systems. He discusses the differences between interpretability and explainability, and why post-hoc methods can be misleading. Vinay shares his approach to tracing relevance through deep networks and LLMs using DL Backtrace, and how interpretability is evolving from an audit tool into a lever for alignment, enabli...

Nov 16, 2025•1 hr 1 min•Ep. 69

Building the Internet of Agents: Identity, Observability, and Open Protocols

Summary In this episode Guillaume de Saint Marc, VP of Engineering at Cisco Outshift, talks about the complexities and opportunities of scaling multi‑agent systems. Guillaume explains why specialized agents collaborating as a team inspire trust in enterprise settings, and contrasts rigid, “lift-and-shift” agentic workflows with fully self-forming systems. We explore the emerging Internet of Agents, the need for open, interoperable protocols (A2A for peer collaboration and MCP for tool calling), ...

Nov 10, 2025•1 hr 7 min•Ep. 68

Agents, IDEs, and the Blast Radius: Practical AI for Software Engineers

Summary In this episode of the AI Engineering Podcast Will Vincent, Python developer advocate at JetBrains (PyCharm), talks about how AI utilities are revolutionizing software engineering beyond basic code completion. He discusses the shift from "vibe coding" to "vibe engineering," where engineers collaborate with AI agents through clear guidelines, iterative specs, and tight guardrails. Will shares practical techniques for getting real value from these tools, including loading the whole codebas...

Nov 02, 2025•59 min•Ep. 67

From MRI to World Models: How AI Is Changing What We See

Summary In this episode of the AI Engineering Podcast Daniel Sodickson, Chief of Innovation in Radiology at NYU Grossman School of Medicine, talks about harnessing AI systems to truly understand images and revolutionize science and healthcare. Dan shares his journey from linear reconstruction to early deep learning for accelerated MRI, highlighting the importance of domain expertise when adapting models to specialized modalities. He explores "upstream" AI that changes what and how we measure, us...

Oct 27, 2025•49 min•Ep. 66

Specs, Tests, and Self‑Verification: The Playbook for Agentic Engineering Teams

Summary In this episode Andrew Filev, CEO and founder of ZenCoder, takes a deep dive into the system design, workflows, and organizational changes behind building agentic coding systems. He traces the evolution from autocomplete to truly agentic models, discusses why context engineering and verification are the real unlocks for reliability, and outlines a pragmatic path from “vibe coding” to AI‑first engineering. Andrew shares ZenCoder’s internal playbook: PRD and tech spec co‑creation with AI, ...

Oct 19, 2025•1 hr 6 min•Ep. 65

From Probabilistic to Trustworthy: Building Orion, an Agentic Analytics Platform

Summary In this episode of the AI Engineering Podcast Lucas Thelosen and Drew Gillson talk about Orion, their agentic analytics platform that delivers proactive, push-based insights to business users through asynchronous thinking with rich organizational context. Lucas and Drew share their approach to building trustworthy analysis by grounding in semantic layers, fact tables, and quality-assurance loops, as well as their focus on accuracy through parallel test-time compute and evolving from prob...

Oct 11, 2025•1 hr 12 min•Ep. 64

Building Production-Ready AI Agents with Pydantic AI

Summary In this episode of the AI Engineering Podcast Samuel Colvin, creator of Pydantic and founder of Pydantic Inc, talks about Pydantic AI - a type-safe framework for building structured AI agents in Python. Samuel explains why he built Pydantic AI to bring FastAPI-like ergonomics and production-grade engineering to agents, focusing on strong typing, minimal abstractions, and reliability, observability, and stability. He explores the evolving agent ecosystem, patterns for single vs. many agen...

Oct 07, 2025•51 min•Ep. 63

From GPUs to Workloads: Flex AI’s Blueprint for Fast, Cost‑Efficient AI

Summary In this episode of the AI Engineering Podcast Brijesh Tripathi, CEO of Flex AI, talks about revolutionizing AI engineering by removing DevOps burdens through "workload as a service". Brijesh shares his expertise from leading AI/HPC architecture at Intel and deploying supercomputers like Aurora, highlighting how access friction and idle infrastructure slow progress. He discusses Flex AI's innovative approach to simplifying heterogeneous compute, standardizing on consistent Kubernetes laye...

Sep 28, 2025•55 min•Ep. 62

Right-Sizing AI: Small Language Models for Real-World Production

Summary In this episode of the AI Engineering Podcast Steven Huels, Vice President of AI Engineering & Product Strategy at Red Hat, talks about the practical applications of small language models (SLMs) for production workloads. He discusses how SLMs offer a pragmatic choice due to their ability to fit on single enterprise GPUs and provide model selection trade-offs. The conversation covers self-hosting vs using API providers, organizational capabilities needed for running production-grade L...

Sep 20, 2025•51 min•Ep. 61

AI Agents and Identity Management

Summary In this episode of the AI Engineering Podcast Julianna Lamb, co-founder and CTO of Stytch, talks about the complexities of managing identity and authentication in agentic workflows. She explores the evolving landscape of identity management in the context of machine learning and AI, highlighting the importance of flexible compute environments and seamless data exchange. The conversation covers implications of AI agents on identity management, including granular permissions, OAuth protoco...

Sep 13, 2025•54 min•Ep. 60

Revolutionizing Production Systems: The Resolve AI Approach

Summary In this episode of the AI Engineering Podcast, CEO of Resolve AI Spiros Xanthos shares his insights on building agentic capabilities for operational systems. He discusses the limitations of traditional observability tools and the need for AI agents that can reason through complex systems to provide actionable insights and solutions. The conversation highlights the architecture of Resolve AI, which integrates with existing tools to build a comprehensive understanding of production environ...

Sep 04, 2025•51 min•Ep. 59

Designing Scalable AI Systems with FastMCP: Challenges and Innovations

Summary In this episode of the AI Engineering Podcast Jeremiah Lowin, founder and CEO of Prefect Technologies, talks about the FastMCP framework and the design of MCP servers. Jeremiah explains the evolution of FastMCP, from its initial creation as a simpler alternative to the MCP SDK to its current role in facilitating the deployment of AI tools. The discussion covers the complexities of designing MCP servers, the importance of context engineering, and the potential pitfalls of overwhelming AI ...

Aug 26, 2025•1 hr 14 min•Ep. 58

Proactive Monitoring in Heavy Industry: The Role of AI and Human Curiosity

Summary In this episode of the AI Engineering Podcast Dr. Tara Javidi, CTO of KavAI, talks about developing AI systems for proactive monitoring in heavy industry. Dr. Javidi shares her background in mathematics and information theory, influenced by Claude Shannon's work, and discusses her approach to curiosity-driven AI that mimics human curiosity to improve data collection and predictive analytics. She explains how KavAI's platform uses generative AI models to enhance industrial monitoring by a...

Aug 23, 2025•41 min•Ep. 57

Navigating the AI Landscape: Challenges and Innovations in Retail

Summary In this episode of the AI Engineering Podcast machine learning engineer Shashank Kapadia explores the transformative role of generative AI in retail. Shashank shares his journey from an engineering background to becoming a key player in ML, highlighting the excitement of understanding human behavior at scale through AI. He discusses the challenges and opportunities presented by generative AI in retail, where it complements traditional ML by enhancing explainability and personalization, p...

Aug 07, 2025•52 min•Ep. 56

The Anti-CRM CRM: How Spiro Uses AI to Transform Sales

Summary In this episode of the AI Engineering podcast Adam Honig, founder of Spiro AI, talks about using AI to automate CRM systems, particularly in the manufacturing sector. Adam shares his journey from running a consulting company focused on Salesforce to founding Spiro, and discusses the challenges of traditional CRM systems where data entry is often neglected. He explains how Spiro addresses this issue by automating data collection from emails, phone calls, and other communications, providing a ri...

Jul 21, 2025•47 min•Ep. 55

Unlocking AI Potential with AMD's ROCm Stack

Summary In this episode of the AI Engineering podcast Anush Elangovan, VP of AI software at AMD, discusses the strategic integration of software and hardware at AMD. He emphasizes the open-source nature of their software, fostering innovation and collaboration in the AI ecosystem, and highlights AMD's performance and capability advantages over competitors like NVIDIA. Anush addresses challenges and opportunities in AI development, including quantization, model efficiency, and future deployment a...

Jun 23, 2025•42 min•Ep. 54

Applying AI To The Construction Industry At Buildots

Summary In this episode of the Machine Learning Podcast Ori Silberberg, VP of Engineering at Buildots, talks about transforming the construction industry with AI. Ori shares how Buildots uses computer vision and AI to optimize construction projects by providing real-time feedback, reducing delays, and improving efficiency. Learn about the complexities of digitizing the construction industry, the technical architecture of Buildots, and how its AI-driven solutions create a digital twin of construct...

Jun 14, 2025•49 min•Ep. 53

The Future of AI Systems: Open Models and Infrastructure Challenges

Summary In this episode of the AI Engineering Podcast Jamie De Guerre, founding SVP of product at Together.ai, explores the role of open models in the AI economy. As a veteran of the AI industry, including his time leading product marketing for AI and machine learning at Apple, Jamie shares insights on the challenges and opportunities of operating open models at speed and scale. He delves into the importance of open source in AI, the evolution of the open model ecosystem, and how Together.ai's A...

Jun 01, 2025•51 min•Ep. 52

The Rise of Agentic AI: Transforming Business Operations

Summary In this episode of the AI Engineering Podcast, host Tobias Macey sits down with Ben Wilde, Head of Innovation at Georgian, to explore the transformative impact of agentic AI on business operations and the SaaS industry. From his early days working with vintage AI systems to his current focus on product strategy and innovation in AI, Ben shares his expertise on what he calls the "continuum" of agentic AI - from simple function calls to complex autonomous systems. Join them as they discuss...

May 21, 2025•1 hr 2 min•Ep. 51

Protecting AI Systems: Understanding Vulnerabilities and Attack Surfaces

Summary In this episode of the AI Engineering Podcast Kasimir Schulz, Director of Security Research at HiddenLayer, talks about the complexities and security challenges in AI and machine learning models. Kasimir explains the concept of shadow genes and shadow logic, which involve identifying common subgraphs within neural networks to understand model ancestry and potential vulnerabilities, and emphasizes the importance of understanding the attack surface in AI integrations, scanning models for s...

May 03, 2025•52 min•Ep. 50

Understanding The Operational And Organizational Challenges Of Agentic AI

Summary In this episode of the AI Engineering podcast Julian LaNeve, CTO of Astronomer, talks about transitioning from simple LLM applications to more complex agentic AI systems. Julian shares insights into the challenges and considerations of this evolution, emphasizing the importance of starting with simpler applications to build operational knowledge and intuition. He discusses the parallels between microservices and agentic AI, highlighting the need for careful orchestration and observabilit...

Apr 21, 2025•1 hr 12 min•Ep. 49
