Latent Space: The AI Engineer Podcast

swyx + Alessio · www.latent.space
The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and run exclusive interviews with OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space

Episodes

Windsurf: The Enterprise AI IDE - with Varun and Anshul of Codeium AI

Our second podcast guest ever, in March 2023, was Varun Mohan, CEO of Codeium; at the time, they had around 10,000 users and vowed to keep their autocomplete free forever. Today, over a million developers use their products, they still have their free tier, and they recently launched Windsurf, an AI IDE. Chapters * 00:00:00: Introductions & Catchup * 00:03:52: Why they created Windsurf * 00:05:52: Limitations of VS Code * 00:10:12: Evaluation methods for Cascade and Windsurf * 00:16:15: ...

Dec 13, 2024 · 1 hr 7 min

Generative Video WorldSim, Diffusion, Vision, Reinforcement Learning and Robotics — ICML 2024 Part 1

Regular tickets are now sold out for Latent Space LIVE! at NeurIPS! We have just announced our last speaker and newest track, friend of the pod Nathan Lambert, who will be recapping 2024 in Reasoning Models like o1! We opened up a handful of late bird tickets for those who are deciding now; use code DISCORDGANG if you need it. See you in Vancouver! We’ve been sitting on our ICML recordings for a while (from today’s first-ever SOLO guest cohost, Brittany Walker), and in light of Sora Turbo’s l...

Dec 10, 2024 · 7 hr 8 min

Bolt.new, Flow Engineering for Code Agents, and >$8m ARR in 2 months as a Claude Wrapper

The full schedule for Latent Space LIVE! at NeurIPS has been announced, featuring Best of 2024 overview talks for the AI Startup Landscape, Computer Vision, Open Models, Transformers Killers, Synthetic Data, Agents, and Scaling, and speakers from Sarah Guo of Conviction, Roboflow, AI2/Meta, Recursal/Together, HuggingFace, OpenHands and SemiAnalysis. Join us for the IRL event/Livestream! Alessio will also be holding a meetup at AWS Re:Invent in Las Vegas this Wednesday. See our new Events page f...

Dec 02, 2024 · 1 hr 39 min

The new Claude 3.5 Sonnet, Computer Use, and Building SOTA Agents — with Erik Schluntz, Anthropic

We have announced our first speaker, friend of the show Dylan Patel, and topic slates for Latent Space LIVE! at NeurIPS. Sign up for IRL/Livestream and to debate! We are still taking questions for our next big recap episode! Submit questions and messages on Speakpipe here for a chance to appear on the show! The vibe shift we observed in July (in favor of Claude 3.5 Sonnet, first introduced in June) has been remarkably long-lived and persistent, surviving multiple subsequent updates of 4o, o1...

Nov 28, 2024 · 1 hr 11 min

Why Compound AI + Open Source will beat Closed AI

We have a full slate of upcoming events: AI Engineer London, AWS Re:Invent in Las Vegas, and now Latent Space LIVE! at NeurIPS in Vancouver and online. Sign up to join and speak! We are still taking questions for our next big recap episode! Submit questions and messages on Speakpipe here for a chance to appear on the show! We try to stay close to the inference providers as part of our coverage, as our podcasts with Together AI and Replicate will attest. However, one of the most notable pull quo...

Nov 25, 2024 · 58 min

Agents @ Work: Lindy.ai

Alessio will be at AWS re:Invent next week and hosting a casual coffee meetup on Wednesday, RSVP here! And subscribe to our calendar for our Singapore, NeurIPS, and all upcoming meetups! We are still taking questions for our next big recap episode! Submit questions and messages on Speakpipe here for a chance to appear on the show! If you've been following the AI agents space, you have heard of Lindy AI; while founder Flo Crivello is hesitant to call it "blowing up," when folks like Andrew Wilkin...

Nov 15, 2024 · 1 hr 10 min

Agents @ Work: Dust.tt

We are recording our next big recap episode and taking questions! Submit questions and messages on Speakpipe here for a chance to appear on the show! Also subscribe to our calendar for our Singapore, NeurIPS, and all upcoming meetups! In our first ever episode with Logan Kilpatrick we called out the two hottest LLM frameworks at the time: LangChain and Dust. We’ve had Harrison from LangChain on twice (as a guest and as a co-host), and we’ve now finally come full circle as Stanislas from Dust j...

Nov 11, 2024 · 1 hr

In the Arena: How LMSys changed LLM Benchmarking Forever

Apologies for lower audio quality; we lost recordings and had to use backup tracks. Our guests today are Anastasios Angelopoulos and Wei-Lin Chiang, leads of Chatbot Arena, fka LMSYS, the crowdsourced AI evaluation platform developed by the LMSys student club at Berkeley, which became the de facto standard for comparing language models. For many folks, Arena Elo is cited more often than MMLU scores, and they have attracted >1,000,000 people to cast votes since its launch, leading top model traine...

Nov 01, 2024 · 41 min

How NotebookLM Was Made

If you’ve listened to the podcast for a while, you might have heard our ElevenLabs-powered AI co-host Charlie a few times. Text-to-speech has made amazing progress in the last 18 months, with OpenAI’s Advanced Voice Mode (aka “Her”) as a sneak peek of the future of AI interactions (see our “Building AGI in Real Time” recap). Still, we had yet to see a real killer app for AI voice (not counting music). Today’s guests, Raiza Martin and Usama Bin Shafqat, are the lead PM and AI engineer behind the...

Oct 25, 2024 · 1 hr 14 min

Building the AI Engineer Nation — with Josephine Teo, Minister of Digital Development and Information, Singapore

Singapore's GovTech is hosting an AI CTF challenge with ~$15,000 in prizes, starting October 26th, open to both local and virtual hackers. It will be hosted on Dreadnode's Crucible platform; sign up here! It is common to say if you want to work in AI, you should come to San Francisco. Not everyone can. Not everyone should. If you can only do meaningful AI work in one city, then AI has failed to generalize meaningfully. As non-Americans working in the US, we know what it’s like to see AI progres...

Oct 19, 2024 · 57 min

Building the Silicon Brain - with Drew Houston of Dropbox

CEOs of publicly traded companies are often in the news talking about their new AI initiatives, but few of them have built anything with it. Drew Houston from Dropbox is different; he has spent over 400 hours coding with LLMs in the last year and is now refocusing his 2,500+ employees around this new way of working, 17 years after founding the company. Timestamps 00:00 Introductions 00:43 Drew's AI journey 04:14 Revalidating expectations of AI 08:23 Simulation in self-driving vs. knowledge work ...

Oct 18, 2024 · 1 hr 12 min

Production AI Engineering starts with Evals — with Ankur Goyal of Braintrust

We are in 🗽 NYC this Monday! Join the AI Eng NYC meetup, bring demos and vibes! It is a bit of a meme that the first thing developer tooling founders think to build in AI is all the non-AI operational stuff outside the AI. There are well over 60 funded LLM Ops startups all hoping to solve the new observability, cost tracking, security, and reliability problems that come with putting LLMs in production, not to mention new LLM-oriented products from incumbent, established ops/o11y players l...

Oct 11, 2024 · 1 hr 57 min

Building AGI in Real Time (OpenAI Dev Day 2024)

We all have fond memories of the first Dev Day in 2023, and the blip that followed soon after. As Ben Thompson has noted, this year’s DevDay took a quieter, more intimate tone. No Satya, no livestream, (slightly fewer people?). Instead of putting ChatGPT announcements in DevDay as in 2023, o1 was announced 2 weeks prior, and DevDay 2024 was reserved purely for developer-facing API announcements, primarily the Realtime API, Vision Finetuning, Prompt Caching, and Model Distillation. However the...

Oct 03, 2024 · 2 hr 9 min

Language Agents: From Reasoning to Acting

OpenAI DevDay is almost here! Per tradition, we are hosting a DevDay pregame event for everyone coming to town! Join us with demos and gossip! Also sign up for related events across San Francisco: the AI DevTools Night, the xAI open house, the Replicate art show, the DevDay Watch Party (for non-attendees), Hack Night with OpenAI at Cloudflare. For everyone else, join the Latent Space Discord for our online watch party and find fellow AI Engineers in your city. OpenAI’s recent o1 release (an...

Sep 27, 2024 · 1 hr 30 min

The Ultimate Guide to Prompting

Noah Hein from Latent Space University is finally launching with a free lightning course this Sunday for those new to AI Engineering. Tell a friend! Did you know there are >1,600 papers on arXiv just about prompting? Between shots, trees, chains, self-criticism, planning strategies, and all sorts of other weird names, it’s hard to keep up. Luckily for us, Sander Schulhoff and team read them all and put together The Prompt Report as the ultimate prompt engineering reference, which we’ll break do...

Sep 20, 2024 · 1 hr 9 min

From API to AGI: Structured Outputs, OpenAI API platform and O1 Q&A — with Michelle Pokrass & OpenAI Devrel + Strawberry team

Congrats to Damien on successfully running AI Engineer London! See our community page and the Latent Space Discord for all upcoming events. This podcast came together in a far more convoluted way than usual, but happens to result in a tight 2 hours covering the ENTIRE OpenAI product suite across ChatGPT-latest, GPT-4o and the new o1 models, and how they are delivered to AI Engineers in the API via the new Structured Output mode, Assistants API, client SDKs, upcoming Voice Mode API, Finetuning/...

Sep 13, 2024 · 2 hr 4 min

Efficiency is Coming: 3000x Faster, Cheaper, Better AI Inference from Hardware Improvements, Quantization, and Synthetic Data Distillation

AI Engineering is expanding! Join the first 🇬🇧 AI Engineer London meetup in Sept and get in touch for sponsoring the second 🗽 AI Engineer Summit in NYC this Dec! The commoditization of intelligence takes on a few dimensions: * Time to Open Model Equivalent: 15 months between GPT-4 and Llama 3.1 405B * 10-100x CHEAPER/year: from $30/mtok for Claude 3 Opus to $3/mtok for L3-405B, and a 400x reduction in the frontier OpenAI model from 2022-2024. Notably, for personal use cases, both Gemini Fla...

Sep 03, 2024 · 1 hr 5 min

Why you should write your own LLM benchmarks — with Nicholas Carlini, Google DeepMind

Today's guest, Nicholas Carlini, a research scientist at DeepMind, argues that we should be focusing more on what AI can do for us individually, rather than trying to have an answer for everyone. "How I Use AI" - A Pragmatic Approach Carlini's blog post "How I Use AI" went viral for good reason. Instead of giving a personal opinion about AI's potential, he simply laid out how he, as a security researcher, uses AI tools in his daily work. He divided it into 12 sections: * To make applications * As...

Aug 29, 2024 · 1 hr 10 min

Is finetuning GPT4o worth it? — with Alistair Pullen, Cosine (Genie)

Betteridge's law says no: with seemingly infinite flavors of RAG, and >2 million token context + prompt caching from Anthropic/DeepMind/DeepSeek, it's reasonable to believe that "in context learning is all you need". But then there’s Cosine Genie, the first to make a huge bet using OpenAI’s new GPT4o fine-tuning for code at the largest scale it has ever been used externally, resulting in what is now the #1 coding agent in the world according to SWE-Bench Full, Lite, and Verified: SWE-Bench has b...

Aug 22, 2024 · 1 hr 5 min

AI Magic: Shipping 1000s of successful products with no managers and a team of 12 — Jeremy Howard of Answer.ai

Disclaimer: We recorded this episode ~1.5 months ago, timed for the FastHTML release. It then got bottlenecked by Llama3.1, Winds of AI Winter, and SAM2 episodes, so we’re a little late. Since then FastHTML was released, swyx is building an app in it for AINews, and Anthropic has also released their prompt caching API. Remember when Dylan Patel of SemiAnalysis coined the GPU Rich vs GPU Poor war? (If not, see our pod with him.) The idea was that if you’re GPU poor you shouldn’t waste you...

Aug 16, 2024 · 59 min

Segment Anything 2: Demo-first Model Development

Because of the nature of SAM, this is more video heavy than usual. See our YouTube! Because vision is first among equals in multimodality, and yet SOTA vision language models are closed, we’ve always had an interest in learning what’s next in vision. Our first viral episode was Segment Anything 1, and we have since covered LLaVA, IDEFICS, Adept, and Reka. But just like with Llama 3, FAIR holds a special place in our hearts as the New Kings of Open Source AI. The list of sequels better tha...

Aug 07, 2024 · 1 hr 4 min

The Winds of AI Winter (Q2 Four Wars Recap) + ChatGPT Voice Mode Preview

Thank you for 1m downloads of the podcast and 2m readers of the Substack! 🎉 This is the audio discussion following The Winds of AI Winter essay that also serves as a recap of Q2 2024 in AI viewed through the lens of our Four Wars framework. Enjoy! Full Video Discussion Full show notes are here. Timestamps * [00:00:00] Intro Song by Suno.ai * [00:02:01] Swyx and Alessio in Singapore * [00:05:49] GPU Rich vs Poors: Frontier Labs * [00:06:35] GPU Rich Frontier Models: Claude 3.5 * [00:10:37] GPU...

Aug 02, 2024 · 1 hr 55 min

Llama 2, 3 & 4: Synthetic Data, RLHF, Agents on the path to Open Source AGI

If you see this in time, join our emergency LLM paper club on the Llama 3 paper! For everyone else, join our special AI in Action club on the Latent Space Discord for a special feature with the Cursor cofounders on Composer, their newest coding agent! Today, Meta is officially releasing the largest and most capable open model to date, Llama3-405B, a dense transformer trained on 15T tokens that beats GPT-4 on all major benchmarks: The 8B and 70B models from the April Llama 3 release have also re...

Jul 23, 2024 · 1 hr 5 min

Benchmarks 201: Why Leaderboards > Arenas >> LLM-as-Judge

The first AI Engineer World’s Fair talks from OpenAI and Cognition are up! In our Benchmarks 101 episode back in April 2023 we covered the history of AI benchmarks, their shortcomings, and our hopes for better ones. Fast forward 1.5 years, the pace of model development has far exceeded the speed at which benchmarks are updated. Frontier labs are still using MMLU and HumanEval for model marketing, even though most models are reaching their natural plateau at a ~90% success rate (any higher and th...

Jul 12, 2024 · 58 min

The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka

Livestreams for the AI Engineer World’s Fair (Multimodality ft. the new GPT-4o demo, GPUs and Inference (ft. Cognition/Devin), CodeGen, Open Models tracks) are now live! Subscribe to @aidotEngineer to get notifications of the other workshops and tracks! It’s easy to get de-sensitized to new models topping leaderboards every other week. However, the top of the LMsys leaderboard has typically been the exclusive domain of very large, very very well funded model labs like OpenAI, Anthropic, Goog...

Jul 05, 2024 · 1 hr 45 min

State of the Art: Training >70B LLMs on 10,000 H100 clusters

It’s return guest season here at Latent Space! We last talked to Kanjun in October and Jonathan in May (and December post Databricks acquisition): Imbue and Databricks are back for a rare treat: a double-header interview talking about DBRX from Databricks and Imbue 70B, a new internal LLM that “outperforms GPT-4o” zero-shot on a range of reasoning and coding-related benchmarks and datasets, while using 7x less data than Llama 3 70B. While Imbue, being an agents company rather than a model prov...

Jun 25, 2024 · 1 hr 22 min

[High Agency] AI Engineer World's Fair Preview

The World’s Fair is officially sold out! Thanks for all the support and stay tuned for recaps of all the great goings on in this very special celebration of the AI Engineer! Longtime listeners will remember the fan favorite Raza Habib, CEO of HumanLoop, on the pod: Well, he’s caught the podcasting bug and is now flipping the tables on swyx! Subscribe to High Agency wherever the finest Artificial Intelligence podcasts are sold. High Agency Pod Description In this episode, I chatted with Shawn Wang...

Jun 25, 2024 · 50 min

How To Hire AI Engineers — with James Brady & Adam Wiggins of Elicit

Editor’s note: One of the top reasons we have hundreds of companies and thousands of AI Engineers joining the World’s Fair next week is, apart from discussing technology and being present for the big launches planned, to hire and be hired! Listeners loved our previous Elicit episode and we’re so glad to welcome 2 more members of Elicit back for a guest post (and bonus podcast) on how they think through hiring. Don’t miss their AI engineer job description, and template which you can use to create...

Jun 21, 2024 · 1 hr 4 min

How AI is eating Finance — with Mike Conover of Brightwave

In April 2023 we released an episode named “Mapping the future of *truly* open source models” to talk about Dolly, the first open, commercial LLM. Mike was leading the OSS models team at Databricks at the time. Today, Mike is back on the podcast to give us the “one year later” update on the evolution of large language models and how he’s been using them to build Brightwave, an AI research assistant for investment professionals. Today they are announcing a $6M seed round (led by Alessio and ...

Jun 11, 2024 · 55 min

ICLR 2024 — Best Papers & Talks (Benchmarks, Reasoning & Agents) — ft. Graham Neubig, Aman Sanger, Moritz Hardt

Our second wave of speakers for AI Engineer World’s Fair was announced! The conference sold out of Platinum/Gold/Silver sponsors and Early Bird tickets! See our Microsoft episode for more info and buy now with code LATENTSPACE. This episode is straightforwardly a part 2 to our ICLR 2024 Part 1 episode, so without further ado, we’ll just get right on with it! Timestamps [00:03:43] Section A: Code Edits and Sandboxes, OpenDevin, and Academia vs Industry — ft. Graham Neubig and Aman Sanger * [0...

Jun 10, 2024 · 4 hr 29 min