Latent Space: The AI Engineer Podcast

Latent.Space•www.latent.space

The podcast by and for AI Engineers! In 2025, over 10 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and share exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space

Last refreshed: June 24th, 2026 at 8:16 PM

Episodes

Language Agents: From Reasoning to Acting

OpenAI DevDay is almost here! Per tradition, we are hosting a DevDay pregame event for everyone coming to town! Join us with demos and gossip! Also sign up for related events across San Francisco: the AI DevTools Night, the xAI open house, the Replicate art show, the DevDay Watch Party (for non-attendees), and Hack Night with OpenAI at Cloudflare. For everyone else, join the Latent Space Discord for our online watch party and find fellow AI Engineers in your city. OpenAI’s recent o1 release (an...

Sep 27, 2024•1 hr 30 min

The Ultimate Guide to Prompting

Noah Hein from Latent Space University is finally launching with a free lightning course this Sunday for those new to AI Engineering. Tell a friend! Did you know there are >1,600 papers on arXiv just about prompting? Between shots, trees, chains, self-criticism, planning strategies, and all sorts of other weird names, it’s hard to keep up. Luckily for us, Sander Schulhoff and team read them all and put together The Prompt Report as the ultimate prompt engineering reference, which we’ll break...

Sep 20, 2024•1 hr 9 min

From API to AGI: Structured Outputs, OpenAI API platform and O1 Q&A — with Michelle Pokrass & OpenAI Devrel + Strawberry team

Congrats to Damien on successfully running AI Engineer London! See our community page and the Latent Space Discord for all upcoming events. This podcast came together in a far more convoluted way than usual, but happens to result in a tight 2 hours covering the ENTIRE OpenAI product suite across ChatGPT-latest, GPT-4o and the new o1 models, and how they are delivered to AI Engineers in the API via the new Structured Output mode, Assistants API, client SDKs, upcoming Voice Mode API, Finetuning/...

Sep 13, 2024•2 hr 4 min

Efficiency is Coming: 3000x Faster, Cheaper, Better AI Inference from Hardware Improvements, Quantization, and Synthetic Data Distillation

AI Engineering is expanding! Join the first 🇬🇧 AI Engineer London meetup in Sept and get in touch for sponsoring the second 🗽 AI Engineer Summit in NYC this Dec! The commoditization of intelligence takes on a few dimensions: * Time to Open Model Equivalent: 15 months between GPT-4 and Llama 3.1 405B * 10-100x CHEAPER/year: from $30/mtok for Claude 3 Opus to $3/mtok for L3-405B, and a 400x reduction in the frontier OpenAI model from 2022-2024. Notably, for personal use cases, both Gemini Fla...

Sep 03, 2024•1 hr 5 min

Why you should write your own LLM benchmarks — with Nicholas Carlini, Google DeepMind

Today's guest, Nicholas Carlini, a research scientist at DeepMind, argues that we should be focusing more on what AI can do for us individually, rather than trying to have an answer for everyone. "How I Use AI" - A Pragmatic Approach Carlini's blog post "How I Use AI" went viral for good reason. Instead of giving a personal opinion about AI's potential, he simply laid out how he, as a security researcher, uses AI tools in his daily work. He divided it into 12 sections: * To make applications * As...

Aug 29, 2024•1 hr 10 min

Is finetuning GPT4o worth it? — with Alistair Pullen, Cosine (Genie)

Betteridge's law says no: with seemingly infinite flavors of RAG, and >2 million token context + prompt caching from Anthropic/DeepMind/DeepSeek, it's reasonable to believe that "in context learning is all you need". But then there’s Cosine Genie, the first to make a huge bet using OpenAI’s new GPT4o fine-tuning for code at the largest scale it has ever been used externally, resulting in what is now the #1 coding agent in the world according to SWE-Bench Full, Lite, and Verified: SWE-Bench ha...

Aug 22, 2024•1 hr 5 min

AI Magic: Shipping 1000s of successful products with no managers and a team of 12 — Jeremy Howard of Answer.ai

Disclaimer: We recorded this episode ~1.5 months ago, timing for the FastHTML release. It then got bottlenecked by Llama3.1, Winds of AI Winter, and SAM2 episodes, so we’re a little late. Since then FastHTML was released, swyx is building an app in it for AINews, and Anthropic has also released their prompt caching API. Remember when Dylan Patel of SemiAnalysis coined the GPU Rich vs GPU Poor war? (If not, see our pod with him.) The idea was that if you’re GPU poor you shouldn’t waste you...

Aug 16, 2024•59 min

Segment Anything 2: Demo-first Model Development

Because of the nature of SAM, this is more video heavy than usual. See our YouTube! Because vision is first among equals in multimodality, and yet SOTA vision language models are closed, we’ve always had an interest in learning what’s next in vision. Our first viral episode was Segment Anything 1, and we have since covered LLaVA, IDEFICS, Adept, and Reka. But just like with Llama 3, FAIR holds a special place in our hearts as the New Kings of Open Source AI. The list of sequels better tha...

Aug 07, 2024•1 hr 4 min

The Winds of AI Winter (Q2 Four Wars Recap) + ChatGPT Voice Mode Preview

Thank you for 1m downloads of the podcast and 2m readers of the Substack! 🎉 This is the audio discussion following The Winds of AI Winter essay that also serves as a recap of Q2 2024 in AI viewed through the lens of our Four Wars framework. Enjoy! Full Video Discussion Full show notes are here. Timestamps * [00:00:00] Intro Song by Suno.ai * [00:02:01] Swyx and Alessio in Singapore * [00:05:49] GPU Rich vs Poors: Frontier Labs * [00:06:35] GPU Rich Frontier Models: Claude 3.5 * [00:10:37] GPU...

Aug 02, 2024•1 hr 55 min

Llama 2, 3 & 4: Synthetic Data, RLHF, Agents on the path to Open Source AGI

If you see this in time, join our emergency LLM paper club on the Llama 3 paper! For everyone else, join our special AI in Action club on the Latent Space Discord for a special feature with the Cursor cofounders on Composer, their newest coding agent! Today, Meta is officially releasing the largest and most capable open model to date, Llama3-405B, a dense transformer trained on 15T tokens that beats GPT-4 on all major benchmarks: The 8B and 70B models from the April Llama 3 release have also re...

Jul 23, 2024•1 hr 5 min

Benchmarks 201: Why Leaderboards > Arenas >> LLM-as-Judge

The first AI Engineer World’s Fair talks from OpenAI and Cognition are up! In our Benchmarks 101 episode back in April 2023 we covered the history of AI benchmarks, their shortcomings, and our hopes for better ones. Fast forward 1.5 years, the pace of model development has far exceeded the speed at which benchmarks are updated. Frontier labs are still using MMLU and HumanEval for model marketing, even though most models are reaching their natural plateau at a ~90% success rate (any higher and th...

Jul 12, 2024•58 min

The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka

Livestreams for the AI Engineer World’s Fair (Multimodality ft. the new GPT-4o demo, GPUs and Inference (ft. Cognition/Devin), CodeGen, Open Models tracks) are now live! Subscribe to @aidotEngineer to get notifications of the other workshops and tracks! It’s easy to get desensitized to new models topping leaderboards every other week. However, the top of the LMsys leaderboard has typically been the exclusive domain of very large, very very well funded model labs like OpenAI, Anthropic, Goog...

Jul 05, 2024•1 hr 45 min

State of the Art: Training >70B LLMs on 10,000 H100 clusters

It’s return guest season here at Latent Space! We last talked to Kanjun in October and Jonathan in May (and December post Databricks acquisition): Imbue and Databricks are back for a rare treat: a double-header interview talking about DBRX from Databricks and Imbue 70B, a new internal LLM that “outperforms GPT-4o” zero-shot on a range of reasoning and coding-related benchmarks and datasets, while using 7x less data than Llama 3 70B. While Imbue, being an agents company rather than a model prov...

Jun 25, 2024•1 hr 22 min

[High Agency] AI Engineer World's Fair Preview

The World’s Fair is officially sold out! Thanks for all the support and stay tuned for recaps of all the great goings on in this very special celebration of the AI Engineer! Longtime listeners will remember the fan favorite Raza Habib, CEO of HumanLoop, on the pod: Well, he’s caught the podcasting bug and is now flipping the tables on swyx! Subscribe to High Agency wherever the finest Artificial Intelligence podcasts are sold. High Agency Pod Description In this episode, I chatted with Shawn Wang...

Jun 25, 2024•50 min

How To Hire AI Engineers — with James Brady & Adam Wiggins of Elicit

Editor’s note: One of the top reasons we have hundreds of companies and thousands of AI Engineers joining the World’s Fair next week is, apart from discussing technology and being present for the big launches planned, to hire and be hired! Listeners loved our previous Elicit episode and were so glad to welcome 2 more members of Elicit back for a guest post (and bonus podcast) on how they think through hiring. Don’t miss their AI engineer job description and template, which you can use to create...

Jun 21, 2024•1 hr 4 min

How AI is eating Finance — with Mike Conover of Brightwave

In April 2023 we released an episode named “Mapping the future of *truly* open source models” to talk about Dolly, the first open, commercial LLM. Mike was leading the OSS models team at Databricks at the time. Today, Mike is back on the podcast to give us the “one year later” update on the evolution of large language models and how he’s been using them to build Brightwave, an AI research assistant for investment professionals. Today they are announcing a $6M seed round (led by Alessio and ...

Jun 11, 2024•55 min

ICLR 2024 — Best Papers & Talks (Benchmarks, Reasoning & Agents) — ft. Graham Neubig, Aman Sanger, Moritz Hardt)

Our second wave of speakers for AI Engineer World’s Fair were announced! The conference sold out of Platinum/Gold/Silver sponsors and Early Bird tickets! See our Microsoft episode for more info and buy now with code LATENTSPACE. This episode is straightforwardly a part 2 to our ICLR 2024 Part 1 episode, so without further ado, we’ll just get right on with it! Timestamps [00:03:43] Section A: Code Edits and Sandboxes, OpenDevin, and Academia vs Industry — ft. Graham Neubig and Aman Sanger * [0...

Jun 10, 2024•4 hr 29 min

How to train a Million Context LLM — with Mark Huang of Gradient.ai

<150 Early Bird tickets left for the AI Engineer World’s Fair in SF! Prices go up soon. Note that there are 4 tracks per day and dozens of workshops/expo sessions; the livestream will air <30% of the content this time. Basically you should really come if you don’t want to miss out on the most stacked speaker list/AI expo floor of 2024. Apply for free/discounted Diversity Program and Scholarship tickets here. We hope to make this the definitive technical conference for ALL AI engineers. Exa...

May 30, 2024•58 min

ICLR 2024 — Best Papers & Talks (ImageGen, Vision, Transformers, State Space Models) ft. Durk Kingma, Christian Szegedy, Ilya Sutskever

Speakers for AI Engineer World’s Fair have been announced! See our Microsoft episode for more info and buy now with code LATENTSPACE. We’ve been studying the best ML research conferences so we can make the best AI industry conf! Note that this year there are 4 main tracks per day and dozens of workshops/expo sessions; the free livestream will air much less than half of the content this time. Apply for free/discounted Diversity Program and Scholarship tickets here. We hope to make this the defi...

May 27, 2024•3 hr 38 min

Emulating Humans with NSFW Chatbots - with Jesse Silver

Disclaimer: today’s episode touches on NSFW topics. There’s no graphic content or explicit language, but we wouldn’t recommend blasting this in work environments. Product website: https://usewhisper.me/ For over 20 years it’s been an open secret that porn drives many new consumer technology innovations, from VHS and Pay-per-view to VR and the Internet. It’s been no different in AI - many of the most elite Stable Diffusion and Llama enjoyers and merging/prompting/PEFT techniques were born in the...

May 16, 2024•54 min

WebSim, WorldSim, and The Summer of Simulative AI — with Joscha Bach of Liquid AI, Karan Malhotra of Nous Research, Rob Haisfield of WebSim.ai

We are 200 people over our 300-person venue capacity for AI UX 2024, but you can subscribe to our YouTube for the video recaps. Our next event, and largest EVER, is the AI Engineer World’s Fair. See you there! Parental advisory: Adult language used in the first 10 mins of this podcast. Any accounting of Generative AI that ends with RAG as its “final form” is seriously lacking in imagination and missing out on its full potential. While AI generation is very good for “spicy autocomplete” and “r...

Apr 27, 2024•54 min

High Agency Pydantic > VC Backed Frameworks — with Jason Liu of Instructor

We are reuniting for the 2nd AI UX demo day in SF on Apr 28. Sign up to demo here! And don’t forget tickets for the AI Engineer World’s Fair, for early birds who join before keynote announcements! About a year ago there was a lot of buzz around prompt engineering techniques to force structured output. Our friend Simon Willison tweeted a bunch of tips and tricks, but the most iconic one is Riley Goodside making it a matter of life or death: Guardrails (friend of the pod and AI Engineer speak...

Apr 19, 2024•52 min

Supervise the Process of AI Research — with Jungwon Byun and Andreas Stuhlmüller of Elicit

Maggie, Linus, Geoffrey, and the LS crew are reuniting for our second annual AI UX demo day in SF on Apr 28. Sign up to demo here! And don’t forget tickets for the AI Engineer World’s Fair, for early birds who join before keynote announcements! It’s become fashionable for many AI startups to project themselves as “the next Google” - while the search engine is so 2000s, both Perplexity and Exa referred to themselves as a “research engine” or “answer engine” in our NeurIPS pod. However thes...

Apr 11, 2024•56 min

Latent Space Chats: NLW (Four Wars, GPT5), Josh Albrecht/Ali Rohde (TNAI), Dylan Patel/Semianalysis (Groq), Milind Naphade (Nvidia GTC), Personal AI (ft. Harrison Chase — LangFriend/LangMem)

Our next 2 big events are AI UX and the World’s Fair. Join and apply to speak/sponsor! Due to timing issues we didn’t have an interview episode to share with you this week, but not to worry, we have more than enough “weekend special” content in the backlog for you to get your Latent Space fix, whether you like thinking about the big picture, or learning more about the pod behind the scenes, or talking Groq and GPUs, or AI Leadership, or Personal AI. Enjoy! AI Breakdown The indefatigable NLW had...

Apr 06, 2024•2 hr 45 min

Presenting the AI Engineer World's Fair — with Sam Schillace, Deputy CTO of Microsoft

TL;DR: You can now buy tickets, apply to speak, or join the expo for the biggest AI Engineer event of 2024. We’re gathering *everyone* you want to meet - see you this June. In last year’s The Rise of the AI Engineer we put our money where our mouth was and announced the AI Engineer Summit, which fortunately went well: With ~500 live attendees and ~500k views online, the first iteration of the AI Engineer industry affair seemed to be well received. Competing in an expensive city with 3 o...

Mar 29, 2024•43 min

Why Google failed to make GPT-3 + why Multimodal Agents are the path to AGI — with David Luan of Adept

Our next SF event is AI UX 2024 - let’s see the new frontier for UX since last year! Last call: we are recording a preview of the AI Engineer World’s Fair with swyx and Ben Dunphy, send any questions about Speaker CFPs and Sponsor Guides you have! Alessio is now hiring engineers for a new startup he is incubating at Decibel: Ideal candidate is an “ex-technical co-founder type”. Reach out to him for more! David Luan has been at the center of the modern AI revolution: he was the ~30th hire at Ope...

Mar 22, 2024•42 min

Making Transformers Sing - with Mikey Shulman of Suno

Giving computers a voice has always been at the center of sci-fi movies; “I’m sorry Dave, I’m afraid I can’t do that” wouldn’t hit as hard if it just appeared on screen as a terminal output, after all. The first electronic speech synthesizer, the Voder, was built at Bell Labs 85 years ago (1939!), and it’s…. something: We will not cover the history of Text To Speech (TTS), but the evolution of the underlying architecture has generally been Formant Synthesis → Concatenative Synthesis → Neural Net...

Mar 14, 2024•53 min

Top 5 Research Trends + OpenAI Sora, Google Gemini, Groq Math (Jan-Feb 2024 Audio Recap) + Latent Space Anniversary with Lindy.ai, RWKV, Pixee, Julius.ai, Listener Q&A!

We will be recording a preview of the AI Engineer World’s Fair soon with swyx and Ben Dunphy, send any questions about Speaker CFPs and Sponsor Guides you have! Alessio is now hiring engineers for a new startup he is incubating at Decibel: Ideal candidate is an ex-technical co-founder type (can MVP products end to end, comfortable with ambiguous prod requirements, etc). Reach out to him for more! Thanks for all the love on the Four Wars episode! We’re excited to develop this new “swyx & Ale...

Mar 09, 2024•1 hr 49 min

Open Source AI is AI we can Trust — with Soumith Chintala of Meta AI

Speaker CFPs and Sponsor Guides are now available for AIE World’s Fair; join us on June 25-27 for the biggest AI Engineer conference of 2024! Soumith Chintala needs no introduction in the ML world: his insights are incredibly accessible across Twitter, LinkedIn, podcasts, and conference talks (in this pod we’ll assume you’ll have caught up on the History of PyTorch pod from last year and cover different topics). He’s well known as the creator of PyTorch, but he's more broadly the Engineeri...

Mar 06, 2024•1 hr 20 min

A Brief History of the Open Source AI Hacker - with Ben Firshman of Replicate

This Friday we’re doing a special crossover event in SF with Dylan Patel of SemiAnalysis (previous guest!), and we will do a live podcast on site. RSVP here. Also join us on June 25-27 for the biggest AI Engineer conference of the year! Replicate is one of the most popular AI inference providers, reporting over 2 million users as of their $40m Series B with a16z. But how did they get there? The Definitive Replicate Story (warts and all) Their overnight success took 5 years of building, and ...

Feb 28, 2024•1 hr 10 min
