Latent Space: The AI Engineer Podcast

swyx + Alessio · www.latent.space
The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing to the first introduction to the tech you'll be using in the next 3 months! We break news and run exclusive interviews with OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space

Episodes

Powering your Copilot for Data – with Artem Keydunov of Cube.dev

The first workshops and talks from the AI Engineer Summit are now up! Join the >20k viewers on YouTube, find clips on Twitter (we’re also clipping @latentspacepod), and chat with us on Discord! Text-to-SQL was one of the first applications of NLP. Thoughtspot offered “Ask your data questions” as their core differentiation compared to traditional dashboarding tools. In a way, they provide a much friendlier interface with your own structured (aka “tabular”, as in “SQL tables”) data, the same w...

Oct 26, 2023 · 39 min

The End of Finetuning — with Jeremy Howard of Fast.ai

Thanks to the over 17,000 people who have joined the first AI Engineer Summit! A full recap is coming. Last call to fill out the State of AI Engineering survey! See our Community page for upcoming meetups in SF, Paris and NYC. This episode had good interest on Twitter and was discussed on the Vanishing Gradients podcast. Fast.ai’s “Practical Deep Learning” courses have been watched by over 6,000,000 people, and the fastai library has over 25,000 stars on GitHub. Jeremy Howard, one of the creators...

Oct 19, 2023 · 1 hr 9 min

Why AI Agents Don't Work (yet) - with Kanjun Qiu of Imbue

Thanks to the over 11,000 people who joined us for the first AI Engineer Summit! A full recap is coming, but you can 1) catch up on the fun and videos on Twitter and YouTube, 2) help us reach 1000 people for the first comprehensive State of AI Engineering survey and 3) submit projects for the new AI Engineer Foundation. See our Community page for upcoming meetups in SF, Paris, NYC, and Singapore. This episode had good interest on Twitter. Last month, Imbue was crowned as AI’s newest unicorn ...

Oct 14, 2023 · 1 hr 5 min

[AIE Summit Preview #2] The AI Horcrux — Swyx on Cognitive Revolution

This is a special double weekend crosspost of AI podcasts, helping attendees prepare for the AI Engineer Summit next week. After our first friendly feedswap with the Cognitive Revolution pod, swyx was invited for a full episode to go over the state of AI Engineering and to preview the AI Engineer Summit Schedule, where we share many former CogRev guests as speakers. For those seeking to understand how two top AI podcasts think about major top of mind AI Engineering topics, this should be the p...

Oct 08, 2023 · 1 hr 30 min

[AIE Summit Preview #1] Swyx on Software 3.0 and the Rise of the AI Engineer

This is a special double weekend crosspost of AI podcasts, helping attendees prepare for the AI Engineer Summit next week. Swyx gave a keynote on the Software 3.0 Landscape recently (referenced in our recent Humanloop episode) and was invited to go deeper in podcast format, and to preview the AI Engineer Summit Schedule. For those seeking to ramp up on the current state of thinking on AI Engineering, this should be the perfect place to start, alongside our upcoming Latent Space University cour...

Oct 07, 2023 · 39 min

RAG Is A Hack - with Jerry Liu from LlamaIndex

Want to help define the AI Engineer stack? >800 folks have weighed in on the top tools, communities and builders for the first State of AI Engineering survey, which we will present for the first time at next week’s AI Engineer Summit. Join us online! This post had robust discussion on HN and Twitter. In October 2022, Robust Intelligence hosted an internal hackathon to play around with LLMs which led to the creation of two of the most important AI Engineering tools: LangChain 🦜⛓️ (our inter...

Oct 05, 2023 · 1 hr 8 min

Building the Foundation Model Ops Platform — with Raza Habib of Humanloop

Want to help define the AI Engineer stack? >500 folks have weighed in on the top tools, communities and builders for the first State of AI Engineering survey! Please fill it out (and help us reach 1000!) The AI Engineer Summit schedule is now live! We are running two Summits and judging two Hackathons this Oct. As usual, see our Discord and community page for all events. A rite of passage for every AI Engineer is shipping a quick and easy demo, and then having to cobble together a bunch of solut...

Sep 29, 2023 · 1 hr 21 min

Heralds of the AI Content Flippening — with Youssef Rizk of Wondercraft.ai

Want to help define the AI Engineer stack? Have opinions on the top tools, communities and builders? We’re collaborating with friends at Amplify to launch the first State of AI Engineering survey! Please fill it out (and tell your friends)! In March, we started off our GPT4 coverage framing one of this year’s key forks in the road as the “Year of Multimodal vs Multimodel AI”. 6 months in, neither has panned out yet. The vast majority of LLM usage still defaults to chatbots built atop OpenAI (...

Sep 20, 2023 · 53 min

Doing it the Hard Way: Making the AI engine and language 🔥 of the future — with Chris Lattner of Modular

Want to help define the AI Engineer stack? Have opinions on the top tools, communities and builders? We’re collaborating with friends at Amplify to launch the first State of AI Engineering survey! Please fill it out (and tell your friends)! If AI is so important, why is its software so bad? This was the motivating question for Chris Lattner as he reconnected with his product counterpart on Tensorflow, Tim Davis, and started working on a modular solution to the problem of sprawling, monolithic, ...

Sep 14, 2023 · 1 hr 29 min

The Point of LangChain — with Harrison Chase of LangChain

As alluded to on the pod, LangChain has just launched LangChain Hub: “the go-to place for developers to discover new use cases and polished prompts.” It’s available to everyone with a LangSmith account, no invite code necessary. Check it out! In 2023, LangChain has speedrun the race from 2:00 to 4:00 to 7:00 Silicon Valley Time. From the back to back $10m Benchmark seed and (rumored) $20-25m Sequoia Series A in April, to back to back critiques of “LangChain is Pointless” and “The Problem w...

Sep 06, 2023 · 1 hr 1 min

RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious

The AI Engineer Summit Expo has been announced, presented by AutoGPT (and future guest Toran Bruce-Richards!) Stay tuned for more updates on the Summit livestream and Latent Space University. This post was on HN for 10 hours. What comes after the Transformer? This is one of the Top 10 Open Challenges in LLM Research that has been the talk of the AI community this month. Jon Frankle (friend of the show!) has an ongoing bet with Sasha Rush on whether Attention is All You Need, and the most ...

Aug 30, 2023 · 1 hr 12 min

Cursor.so: The AI-first Code Editor — with Aman Sanger of Anysphere

Thanks to the almost 30k people who tuned in to the last episode! Your podcast cohosts have been busy shipping: * Alessio open sourced smol-podcaster, which makes the show notes here! * swyx launched GodMode. Maybe someday the Cursor of browsers? * We’re also helping organize a Llama Finetuning Hackameetup this Saturday in anticipation of the CodeLlama release. Lastly, more speakers were announced at AI Engineer Summit! 👀 ~46% of code typed through VS Code is written by Copilot. How do we g...

Aug 22, 2023 · 59 min

The Mathematics of Training LLMs — with Quentin Anthony of Eleuther AI

Invites are going out for AI Engineer Summit! In the meantime, we have just announced our first Actually Open AI event with Brev.dev and Langchain, Aug 26 in our SF HQ (we’ll record talks for those remote). See you soon (and join the Discord)! Special thanks to @nearcyan for helping us arrange this with the Eleuther team. This post was on the HN frontpage for 15 hours. As startups and even VCs hoard GPUs to attract talent, the one thing more valuable than GPUs is knowing how to use them (aka, m...

Aug 16, 2023 · 51 min

LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML

We have just announced our first set of speakers at AI Engineer Summit! Sign up for the livestream or email [email protected] if you’d like to support. We are facing a massive GPU crunch. As both startups and VCs hoard Nvidia GPUs like countries count nuclear stockpiles, tweets about GPU shortages have become increasingly common. But what if we could run LLMs with AMD cards, or without a GPU at all? There’s just one weird trick: compilation. And there’s one person uniquely qualified to do ...

Aug 10, 2023 · 52 min

[AI Breakdown] Summer AI Technical Roundup: a Latent Space x AI Breakdown crossover pod!

Our 3rd podcast feed swap with other AI pod friends! Check out Cognitive Revolution and Practical AI as well. NLW is the best daily AI YouTube/podcaster with the AI Breakdown. His summaries and content curation are spot on and always find the interesting angle that will keep you thinking. Subscribe to the AI Breakdown wherever fine podcasts are sold! https://pod.link/1680633614 You can also watch on YouTube. Timestamps courtesy of summarize.tech. The hosts discuss the launch of Code Interpreter ...

Aug 04, 2023 · 59 min

FlashAttention 2: making Transformers 800% faster w/o approximation - with Tri Dao of Together AI

FlashAttention was first published by Tri Dao in May 2022 and it had a deep impact on the large language model space. Most open models you’ve heard of (RedPajama, MPT, LLaMA, Falcon, etc) all leverage it for faster inference. Tri came on the podcast to chat about FlashAttention, the newly released FlashAttention-2, the research process at Hazy Lab, and more. This is the first episode of our “Papers Explained” series, which will cover some of the foundational research in this space. Our Discor...

Jul 26, 2023 · 55 min

Llama 2: The New Open LLM SOTA (ft. Nathan Lambert, Matt Bornstein, Anton Troynikov, Russell Kaplan, Whole Mars Catalog et al.)

As first discussed on our May Emergency pod and leaked 4 days ago, Llama (renamed from LLaMA) was upgraded to Llama 2 (pretraining on 2 trillion tokens with 2x the context length - bigger than any dataset discussed in Datasets 101, and adding ~$20m of RLHF/preference annotation) and released for commercial use on 18 July. It immediately displaced Falcon-40B as the leading open LLM and was immediately converted/quantized to GGML and other formats. Llama 2 seems to outperform all other open sou...

Jul 19, 2023 · 1 hr 20 min

AI Fundamentals: Datasets 101

In April, we released our first AI Fundamentals episode: Benchmarks 101. We covered the history of benchmarks, why they exist, how they are structured, and how they influence the development of artificial intelligence. Today we are (finally!) releasing Datasets 101! We’re really enjoying doing this series despite the work it takes - please let us know what else you want us to cover! Stop me if you’ve heard this before: “GPT3 was trained on the entire Internet”. Blatantly, demonstrably untrue :...

Jul 17, 2023 · 1 hr 1 min

Code Interpreter == GPT 4.5 (w/ Simon Willison, Alex Volkov, Aravind Srinivas, Alex Graveley, et al.)

Code Interpreter is GA! As we do with breaking news, we convened an emergency pod and >17,000 people tuned in, by far our biggest ever. This is a 2-for-1 post - a longform essay with our trademark executive summary and core insights - and a podcast capturing day-after reactions. Don’t miss either of them! Essay and transcript: https://latent.space/p/code-interpreter Podcast Timestamps [00:00:00] Intro - Simon and Alex [00:07:40] Code Interpreter for Edge Cases [00:08:59] Code Interpreter's ...

Jul 10, 2023 · 2 hr 4 min

[Practical AI] AI Trends: a Latent Space x Practical AI crossover pod!

Part 2 of our podcast feed swap weekend! Check out Cognitive Revolution as well. "Data" Dan Whitenack has been co-host of the Practical AI podcast for the past 5 years, covering the full journey of the modern AI wave post Transformers. He joined us in studio to talk about their origin story and highlight key learnings from past episodes, riff on the AI trends we are all seeing as AI practitioner-podcasters, and his passion for low-resource-everything! Subscribe on the Changelog, RSS, Apple Podcast...

Jul 02, 2023 · 1 hr

[Cognitive Revolution] The Tiny Model Revolution with Ronen Eldan and Yuanzhi Li of Microsoft Research

Thanks to the over 1m people that have checked out the Rise of the AI Engineer. It’s a long July 4 weekend in the US, and we’re celebrating with a podcast feed swap! We’ve been big fans of Nathan Labenz and Erik Torenberg’s work at the Cognitive Revolution podcast for a while, which started around the same time as we did and has done an incredible job of hosting discussions with top researchers and thinkers in the field, with a wide range of topics across computer vision (a special focus thanks...

Jul 01, 2023 · 2 hr 5 min

Commoditizing the Petaflop — with George Hotz of the tiny corp

We are now launching our dedicated new YouTube and Twitter! Any help in amplifying our podcast would be greatly appreciated, and of course, tell your friends! Notable follow-on discussions collected on Twitter, Reddit, Reddit, Reddit, HN, and HN. Please don’t obsess too much over the GPT4 discussion as it is mostly rumor; we spent much more time on tinybox/tinygrad on which George is the foremost authority! We are excited to share the world’s first interview with George Hotz on the tiny co...

Jun 20, 2023 · 1 hr 13 min

Emergency Pod: OpenAI's new Functions API, 75% Price Drop, 4x Context Length (w/ Alex Volkov, Simon Willison, Riley Goodside, Joshua Lochner, Stefania Druga, Eric Elliott, Mayo Oshin et al)

Full Transcript and show notes: https://www.latent.space/p/function-agents?sd=pf Timestamps: [00:00:00] Intro [00:01:47] Recapping June 2023 Updates [00:06:24] Known Issues with Long Context [00:08:00] New Functions API [00:10:45] Riley Goodside [00:12:28] Simon Willison [00:14:30] Eric Elliott [00:16:05] Functions API and Agents [00:18:25] Functions API vs Google Vertex JSON [00:21:32] From English back to Code [00:26:14] Embedding Price Drop and Pinecone Perspective [00:30:39] Xenova and Huggi...

Jun 14, 2023 · 1 hr 28 min

From RLHF to RLHB: The Case for Learning from Human Behavior - with Jeffrey Wang and Joe Reeve of Amplitude

Welcome to the almost 3k latent space explorers that joined us last month! We’re holding our first SF listener meetup with Practical AI next Monday; join us if you want to meet past guests and put faces to voices! All events are in /community. Who among you regularly click the ubiquitous 👍/👎 buttons in ChatGPT/Bard/etc? Anyone? I don’t see any hands up. OpenAI has told us how important reinforcement learning from human feedback (RLHF) is to creating the magic that is ChatGPT, but we know fro...

Jun 08, 2023 · 49 min

Building the AI × UX Scenius — with Linus Lee of Notion AI

Read: https://www.latent.space/p/ai-interfaces-and-notion Show Notes * Linus on Twitter * Linus’ personal blog * Notion * Notion AI * Notion Projects * AI UX Meetup Recap Timestamps * [00:03:30] Starting the AI / UX community * [00:10:01] Most knowledge work is not text generation * [00:16:21] Finding the right constraints and interface for AI * [00:19:06] Linus' journey to working at Notion * [00:23:29] The importance of notations and interfaces * [00:26:07] Setting interface defaults and stand...

Jun 01, 2023 · 1 hr 10 min

Debugging the Internet with AI agents – with Itamar Friedman of Codium AI and AutoGPT

We are hosting the AI World’s Fair in San Francisco on June 8th! You can RSVP here. Come meet fellow builders, see amazing AI tech showcases at different booths around the venue, all mixed with elements of traditional fairs: live music, drinks, games, and food! We are also at Amplitude’s AI x Product Hackathon and are hosting our first joint Latent Space + Practical AI Podcast Listener Meetup next month! We are honored by the rave reviews for our last episode with MosaicML! They are also welcom...

May 25, 2023 · 1 hr 3 min

MPT-7B and The Beginning of Context=Infinity — with Jonathan Frankle and Abhinav Venigalla of MosaicML

We are excited to be the first podcast in the world to release an in-depth interview on the new SOTA in commercially licensed open source models - MosaicML MPT-7B! The Latent Space crew will be at the NYC Lux AI Summit next week, and have two meetups in June. As usual, all events are on the Community page! We are also inviting beta testers for the upcoming AI for Engineers course. See you soon! One of GPT3’s biggest limitations is context length - you can only send it up to 4000 tokens (3k word...

May 20, 2023 · 1 hr 7 min

Guaranteed quality and structure in LLM outputs - with Shreya Rajpal of Guardrails AI

Tomorrow, 5/16, we’re hosting Latent Space Liftoff Day in San Francisco. We have some amazing demos from founders at 5:30pm, and we’ll have an open co-working starting at 2pm. Spaces are limited, so please RSVP here! One of the biggest criticisms of large language models is their inability to tightly follow requirements without extensive prompt engineering. You might have seen examples of ChatGPT playing a game of chess and making many invalid moves, or adding new pieces to the board. Guardrail...

May 16, 2023 · 1 hr 2 min

The AI Founder Gene: Being Early, Building Fast, and Believing in Greatness — with Sharif Shameem of Lexica

Thanks to the over 42,000 latent space explorers who checked out our Replit episode! We are hosting/attending a couple more events in SF and NYC this month. See you if in town! Lexica.art was introduced to the world 24 hours after the release of Stable Diffusion as a search engine for prompts, gaining instant product-market fit as a world discovering generative AI also found they needed to learn prompting by example. Lexica is now 8 months old, serving 5B image searches/day, and just shipped V3...

May 08, 2023 · 51 min

No Moat: Closed AI gets its Open Source wakeup call — ft. Simon Willison

It’s now almost 6 months since Google declared Code Red, and the results (Jeff Dean’s recap of 2022 achievements and a mass exodus of the top research talent that contributed to it in January, Bard’s rushed launch in Feb, a slick video showing Google Workspace AI features and confusing doubly linked blogposts about PaLM API in March, and merging Google Brain and DeepMind in April) have not been inspiring. Google’s internal panic is on full display now with the surfacing of a well written memo...

May 05, 2023 · 44 min