Latent Space: The AI Engineer Podcast

Latent.Space•www.latent.space

The podcast by and for AI Engineers! In 2025, over 10 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. Striving to give you both the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space

www.latent.space

Last refreshed: July 1st, 2026 at 4:33 PM

Follow on

Apple Podcasts

Spotify

RSS

Podcasts are better in Metacast mobile app

Don't just listen to podcasts. Learn from them with transcripts, summaries, and chapters for every episode. Skim, search, and bookmark insights. Learn more

Episodes

NVIDIA's AI Engineers: Agent Inference at Planetary Scale and "Speed of Light" — Nader Khalil (Brev), Kyle Kranen (Dynamo)

Join Kyle, Nader, Vibhu, and swyx live at NVIDIA GTC next week ! Now that AIE Europe tix are ~sold out, our attention turns to Miami and World’s Fair ! The definitive AI Accelerator chip company has more than 10xed this AI Summer: And is now a $4.4 trillion megacorp… that is somehow still moving like a startup. We are blessed to have a unique relationship with our first ever NVIDIA guests: Kyle Kranen who gave a great inference keynote at the first World’s Fair and is one of the leading architec...

Mar 10, 2026•1 hr 24 min

Cursor's Third Era: Cloud Agents

All speakers are announced at AIE EU , schedule coming soon. Join us there or in Miami with the renowned organizers of React Miami! Singapore CFP also open! We’ve called this out a few times over in AINews , but the overwhelming consensus in the Valley is that “ the IDE is Dead ”. In November it was just a gut feeling, but now we actually have data : even at the canonical “VSCode Fork” company, people are officially using more agents than tab autocomplete (the first wave of AI coding): Cursor ha...

Mar 06, 2026•1 hr 7 min

Every Agent Needs a Box — Aaron Levie, Box

The reception to our recent post on Code Reviews has been strong . Catch up! Amid a maelstrom of discussion on whether or not AI is killing SaaS , one of the top publicly listed SaaS companies in the world has just reported record revenues, clearing well over $1.1B in ARR for the first time with a 28% margin . As we comment on the pod, Aaron Levie is the rare public company CEO equally at home in both worlds of Silicon Valley and Wall Street/Main Street, by day helping 70% of the Fortune 500 wit...

Mar 05, 2026•1 hr 17 min

METR’s Joel Becker on exponential Time Horizon Evals, Threat Models, and the Limits of AI Productivity

This is a free preview of a paid episode. To hear more, visit www.latent.space AIE Europe CFP and AIE World’s Fair paper submissions for CAIS peer review are due TODAY - do not delay! Last call ever. We’re excited to welcome METR for their first LS Pod, hopefully the first of many: METR are keepers of currently the single most infamous chart in AI : But every Latent Space reader should be sophisticated enough to know that the details matter and that hype and hyperbole go hand in hand in AI socia...

Feb 27, 2026•56 min

[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead) | Nathan Lambert & Sebastian Raschka

Swyx joined SAIL ! Thank you SAIL Media , Prof. Tom Yeh , 8Lee , Hamid Bagheri , c9n , and many others for tuning into SAIL Live #6 with Nathan Lambert and Sebastian Raschka, PhD . Sharing here for the LS paid subscribers. We covered: This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe...

Feb 26, 2026•52 min

🔬Searching the Space of All Possible Materials — Prof. Max Welling, CuspAI

Editor’s note: CuspAI raised a $100m Series A in September and is rumored to have reached a unicorn valuation . They have all-star advisors from Geoff Hinton to Yann Lecun and team of deep domain experts to tackle this next frontier in AI applications. In this episode, Max Welling traces the thread connecting quantum gravity, equivariant neural networks, diffusion models, and climate-focused materials discovery (yes, there is one!!!). We begin with a provocative framing: experiments as computati...

Feb 25, 2026•34 min

Claude Code for Finance + The Global Memory Shortage: Doug O'Laughlin, SemiAnalysis

This is a free preview of a paid episode. To hear more, visit www.latent.space First speakers for AIE Europe and AIEi Miami have been announced. If you’re in Asia/Aus, come by Singapore and Melbourne . AI Engineering is going global! One year ago today , Anthropic launched Claude Code , to not much fanfare : The word of mouth was incredibly strong however, and so we were glad to be one of the first podcasts to invite Boris and Cat on in early May: As we discussed on the pod, all CC usage was API...

Feb 24, 2026•2 hr 4 min

⚡️The End of SWE-Bench Verified — Mia Glaese & Olivia Watkins, OpenAI Frontier Evals & Human Data

Olivia Watkins (Frontier Evals team) and Mia Glaese (VP of Research at OpenAI, leading the Codex, human data, and alignment teams) discuss a new blog post ( https://openai.com/index/why-we-no-longer-evaluate-swe-bench-verified/ ) arguing that SWE-Bench Verified—long treated as a key “North Star” coding benchmark—has become saturated and highly contaminated, making it less useful for measuring real coding progress. SWE-Bench Verified originated as a major OpenAI-led cleanup of the original Prince...

Feb 23, 2026•26 min

Bitter Lessons in Venture vs Growth: Anthropic vs OpenAI, Noam Shazeer, World Labs, Thinking Machines, Cursor, ASIC Economics — Martin Casado & Sarah Wang of a16z

Tickets for AIEi Miami and AIE Europe are live, with first wave speakers announced ! From pioneering software-defined networking to backing many of the most aggressive AI model companies of this cycle, Martin Casado and Sarah Wang sit at the center of the capital, compute, and talent arms race reshaping the tech industry. As partners at a16z investing across infrastructure and growth, they’ve watched venture and growth blur, model labs turn dollars into capability at unprecedented speed, and sta...

Feb 19, 2026•55 min

Owning the AI Pareto Frontier — Jeff Dean

From rewriting Google’s search stack in the early 2000s to reviving sparse trillion-parameter models and co-designing TPUs with frontier ML research , Jeff Dean has quietly shaped nearly every layer of the modern AI stack. As Chief AI Scientist at Google and a driving force behind Gemini , Jeff has lived through multiple scaling revolutions from CPUs and sharded indices to multimodal models that reason across text, video, and code. Jeff joins us to unpack what it really means to “own the Pareto ...

Feb 12, 2026•1 hr 24 min

🔬Beyond AlphaFold: How Boltz is Open-Sourcing the Future of Drug Discovery

This podcast features Gabriele Corso and Jeremy Wohlwend , co-founders of Boltz and authors of the Boltz Manifesto , discussing the rapid evolution of structural biology models from AlphaFold to their own open-source suite, Boltz-1 and Boltz-2 . The central thesis is that while single-chain protein structure prediction is largely “solved” through evolutionary hints, the next frontier lies in modeling complex interactions (protein-ligand, protein-protein) and generative protein design , which Bol...

Feb 12, 2026•1 hr 21 min

The First Mechanistic Interpretability Frontier Lab — Myra Deng & Mark Bissell of Goodfire AI

From Palantir and Two Sigma to building Goodfire into the poster-child for actionable mechanistic interpretability, Mark Bissell (Member of Technical Staff) and Myra Deng (Head of Product) are trying to turn “peeking inside the model” into a repeatable production workflow by shipping APIs, landing real enterprise deployments, and now scaling the bet with a recent $150M Series B funding round at a $1.25B valuation . In this episode, we go far beyond the usual “SAEs are cool” take. We talk about G...

Feb 06, 2026•1 hr 8 min

🔬 Automating Science: World Models, Scientific Taste, Agent Loops — Andrew White

Editor’s note : Welcome to our new AI for Science pod, with your new hosts RJ and Brandon! See the writeup on Latent.Space (https://Latent.Space) for more details on why we’re launching 2 new pods this year. RJ Honicky is a co-founder and CTO at MiraOmics (https://miraomics.bio/) , building AI models and services for single cell, spatial transcriptomics and pathology slide analysis. Brandon Anderson builds AI systems for RNA drug discovery at Atomic AI ( https://atomic.ai ). Anything said on thi...

Jan 28, 2026•1 hr 14 min

Captaining IMO Gold, Deep Think, On-Policy RL, Feeling the AGI in Singapore — Yi Tay

From shipping Gemini Deep Think and IMO Gold to launching the Reasoning and AGI team in Singapore , Yi Tay has spent the last 18 months living through the full arc of Google DeepMind’s pivot from architecture research to RL-driven reasoning—watching his team go from a dozen researchers to 300+, training models that solve International Math Olympiad problems in a live competition, and building the infrastructure to scale deep thinking across every domain, and driving Gemini to the top of the lead...

Jan 23, 2026•1 hr 32 min

Brex’s AI Hail Mary — With CTO James Reggio

From building internal AI labs to becoming CTO of Brex, James Reggio has helped lead one of the most disciplined AI transformations inside a real financial institution where compliance, auditability, and customer trust actually matter. We sat down with Reggio to unpack Brex’s three-pillar AI strategy (corporate, operational, and product AI) [ https://www.brex.com/journal/brex-ai-native-operations ], how SOP-driven agents beat overengineered RL in ops, why Brex lets employees “build their own AI ...

Jan 17, 2026•1 hr 13 min

Artificial Analysis: Independent LLM Evals as a Service — with George Cameron and Micah-Hill Smith

Happy New Year! You may have noticed that in 2025 we had moved toward YouTube as our primary podcasting platform. As we’ll explain in the next State of Latent Space post, we’ll be doubling down on Substack again and improving the experience for the over 100,000 of you who look out for our emails and website updates! We first mentioned Artificial Analysis in 2024, when it was still a side project in a Sydney basement. They then were one of the few Nat Friedman and Daniel Gross’ AIGrant companies ...

Jan 08, 2026•1 hr 18 min

[State of Evals] LMArena's $1.7B Vision — Anastasios Angelopoulos, LMArena

We are reupping this episode after LMArena announced their fresh Series A ( https://www.theinformation.com/articles/ai-evaluation-startup-lmarena-valued-1-7-billion-new-funding-round?rc=luxwz4 ), raising $150m at a $1.7B valuation, with $30M annualized consumption revenue (aka $2.5m MRR) after their September evals product launch. —- From building LMArena in a Berkeley basement to raising $100M and becoming the de facto leaderboard for frontier AI , Anastasios Angelopoulos returns to Latent Spac...

Jan 06, 2026•24 min

[NeurIPS Best Paper] 1000 Layer Networks for Self-Supervised RL — Kevin Wang et al, Princeton

From undergraduate research seminars at Princeton to winning Best Paper award at NeurIPS 2025 , Kevin Wang, Ishaan Javali, Michał Bortkiewicz, Tomasz Trzcinski, Benjamin Eysenbach defied conventional wisdom by scaling reinforcement learning networks to 1,000 layers deep —unlocking performance gains that the RL community thought impossible. We caught up with the team live at NeurIPS to dig into the story behind RL1000 : why deep networks have worked in language and vision but failed in RL for ove...

Jan 02, 2026•28 min

[State of Code Evals] After SWE-bench, Code Clash & SOTA Coding Benchmarks recap — John Yang

From creating SWE-bench in a Princeton basement to shipping CodeClash , SWE-bench Multimodal , and SWE-bench Multilingual , John Yang has spent the last year and a half watching his benchmark become the de facto standard for evaluating AI coding agents—trusted by Cognition (Devin), OpenAI, Anthropic, and every major lab racing to solve software engineering at scale. We caught up with John live at NeurIPS 2025 to dig into the state of code evals heading into 2026: why SWE-bench went from ignored ...

Dec 31, 2025•18 min

[State of Post-Training] From GPT-4.1 to 5.1: RLVR, Agent & Token Efficiency — Josh McGrath, OpenAI

From pre-training data curation to shipping GPT-4o , o1 , o3 , and now GPT-5 thinking and the shopping model , Josh McGrath has lived through the full arc of OpenAI’s post-training evolution—from the PPO vs DPO debates of 2023 to today’s RLVR era, where the real innovation isn’t optimization methods but data quality, signal trust, and token efficiency . We sat down with Josh at NeurIPS 2025 to dig into the state of post-training heading into 2026: why RLHF and RLVR are both just policy gradient ...

Dec 31, 2025•28 min

[State of RL/Reasoning] IMO/IOI Gold, OpenAI o3/GPT-5, and Cursor Composer — Ashvin Nair, Cursor

From Berkeley robotics and OpenAI’s 2017 Dota-era internship to shipping RL breakthroughs on GPT-4o, o1, and o3, and now leading model development at Cursor , Ashvin Nair has done it all. We caught up with Ashvin at NeurIPS 2025 to dig into the inside story of OpenAI’s reasoning team (spoiler: it went from a dozen people to 300+), why IOI Gold felt reachable in 2022 but somehow didn’t change the world when o1 actually achieved it, how RL doesn’t generalize beyond the training distribution (and w...

Dec 30, 2025•45 min

[State of AI Startups] Memory/Learning, RL Envs & DBT-Fivetran — Sarah Catanzaro, Amplify

From investing through the modern data stack era (DBT, Fivetran, and the analytics explosion) to now investing at the frontier of AI infrastructure and applications at Amplify Partners , Sarah Catanzaro has spent years at the intersection of data, compute, and intelligence—watching categories emerge, merge, and occasionally disappoint. We caught up with Sarah live at NeurIPS 2025 to dig into the state of AI startups heading into 2026: why $100M+ seed rounds with no near-term roadmap are now the ...

Dec 30, 2025•29 min

One Year of MCP — with David Soria Parra and AAIF leads from OpenAI, Goose, Linux Foundation

One year ago, Anthropic launched the Model Context Protocol (MCP) —a simple, open standard to connect AI applications to the data and tools they need. Today, MCP has exploded from a local-only experiment into the de facto protocol for agentic systems, adopted by OpenAI, Microsoft, Google, Block, and hundreds of enterprises building internal agents at scale. And now, MCP is joining the newly formed Agentic AI Foundation (AAIF) under the Linux Foundation, alongside Block’s Goose coding agent, with...

Dec 27, 2025•1 hr 39 min

Steve Yegge's Vibe Coding Manifesto: Why Claude Code Isn't It & What Comes After the IDE

Note: Steve and Gene’s talk on Vibe Coding and the post IDE world was one of the top talks of AIE CODE: From building legendary platforms at Google and Amazon to authoring one of the most influential essays on AI-powered development ( Revenge of the Junior Developer , quoted by Dario Amodei himself), Steve Yegge has spent decades at the frontier of software engineering—and now he’s leading the charge into what he calls the “factory farming” era of code. After stints at SourceGraph and building B...

Dec 26, 2025•37 min

⚡️GPT5-Codex-Max: Training Agents with Personality, Tools & Trust — Brian Fioca + Bill Chen, OpenAI

From the frontlines of OpenAI’s Codex and GPT-5 training teams, Bryan and Bill are building the future of AI-powered coding—where agents don’t just autocomplete, they architect, refactor, and ship entire features while you sleep. We caught up with them at AI Engineer Conference right after the launch of Codex Max , OpenAI’s newest long-running coding agent designed to work for 24+ hours straight, manage its own context, and spawn sub-agents to parallelize work across your entire codebase. We sat...

Dec 26, 2025•28 min

SAM 3: The Eyes for AI — Nikhila & Pengchuan (Meta Superintelligence), ft. Joseph Nelson (Roboflow)

As with all demo-heavy and especially vision AI podcasts, we encourage watching along on our YouTube (and tossing us an upvote/subscribe if you like!) From SAM 1’s 11-million-image data engine to SAM 2’s memory-based video tracking, MSL’s Segment Anything project has redefined what’s possible in computer vision. Now SAM 3 takes the next leap: concept segmentation —prompting with natural language like “yellow school bus” or “tablecloth” to detect, segment, and track every instance across images a...

Dec 18, 2025•1 hr 15 min

⚡️Jailbreaking AGI: Pliny the Liberator & John V on Red Teaming, BT6, and the Future of AI Security

Note: this is Pliny and John’s first major podcast. Voices have been changed for opsec. From jailbreaking every frontier model and turning down Anthropic’s Constitutional AI challenge to leading BT6 , a 28-operator white-hat hacker collective obsessed with radical transparency and open-source AI security, Pliny the Liberator and John V are redefining what AI red-teaming looks like when you refuse to lobotomize models in the name of “safety.” Pliny built his reputation crafting universal jailbrea...

Dec 16, 2025•41 min

AI to AE's: Grit, Glean, and Kleiner Perkins' next Enterprise AI hit — Joubin Mirzadegan, Roadrunner

Glean started as a Kleiner Perkins incubation and is now a $7B, $200m ARR Enterprise AI leader. Now KP has tapped its own podcaster to lead it’s next big swing. From building go-to-market the hard way in startups (and scaling Palo Alto Networks’ public cloud business) to joining Kleiner Perkins to help technical founders turn product edge into repeatable revenue, Joubin Mirzadegan has spent the last decade obsessing over one thing: distribution and how ideas actually spread, sell, and compound. ...

Dec 12, 2025•1 hr 10 min

The Future of Email: Superhuman CTO on Your Inbox As the Real AI Agent (Not ChatGPT) — Loïc Houssier

From applied cryptography and offensive security in France’s defense industry to optimizing nuclear submarine workflows, then selling his e-signature startup to Docusign ( https://www.docusign.com/company/news-center/opentrust-joins-docusign-global-trust-network and now running AI as CTO of Superhuman Mail (Superhuman, recently acquired by Grammarly https://techcrunch.com/2025/07/01/grammarly-acquires-ai-email-client-superhuman/ ), Loïc Houssier has lived the full arc from deep infra and complia...

Dec 11, 2025•1 hr 11 min

World Models & General Intuition: Khosla's largest bet since LLMs & OpenAI

From building Medal into a 12M-user game clipping platform with 3.8B highlight moments to turning down a reported $500M offer from OpenAI ( https://www.theinformation.com/articles/openai-offered-pay-500-million-startup-videogame-data ) and raising a $134M seed from Khosla ( https://techcrunch.com/2025/10/16/general-intuition-lands-134m-seed-to-teach-agents-spatial-reasoning-using-video-game-clips/ ) to spin out General Intuition , Pim is betting that world models trained on peak human gameplay a...

Dec 06, 2025•1 hr 4 min

← Prev Next →

For the best experience, listen in Metacast app for iOS or Android