We have just announced our first set of speakers at AI Engineer Summit! Sign up for the livestream or email sponsors@ai.engineer if you’d like to support. We are facing a massive GPU crunch. As both startups and VCs hoard Nvidia GPUs like countries count nuclear stockpiles, tweets about GPU shortages have become increasingly common. But what if we could run LLMs with AMD cards, or without a GPU at all? There’s just one weird trick: compilation. And there’s one person uniquely qualified to do ...
Aug 10, 2023•52 min
Our 3rd podcast feed swap with other AI pod friends! Check out Cognitive Revolution and Practical AI as well. NLW is the best daily AI YouTube/podcaster with the AI Breakdown. His summaries and content curation are spot on, and he always finds the interesting angle that will keep you thinking. Subscribe to the AI Breakdown wherever fine podcasts are sold! https://pod.link/1680633614 You can also watch on YouTube. Timestamps courtesy of summarize.tech. The hosts discuss the launch of Code Interpreter ...
Aug 04, 2023•59 min
FlashAttention was first published by Tri Dao in May 2022 and it had a deep impact in the large language models space. Most open models you’ve heard of (RedPajama, MPT, LLaMA, Falcon, etc.) leverage it for faster inference. Tri came on the podcast to chat about FlashAttention, the newly released FlashAttention-2, the research process at Hazy Lab, and more. This is the first episode of our “Papers Explained” series, which will cover some of the foundational research in this space. Our Discor...
Jul 26, 2023•55 min
As first discussed on our May Emergency pod and leaked 4 days ago, Llama (renamed from LLaMA) was upgraded to Llama 2 (pretraining on 2 trillion tokens with 2x the context length - bigger than any dataset discussed in Datasets 101, and adding ~$20m of RLHF/preference annotation) and released for commercial use on 18 July. It immediately displaced Falcon-40B as the leading open LLM and was immediately converted/quantized to GGML and other formats. Llama 2 seems to outperform all other open sou...
Jul 19, 2023•1 hr 20 min
In April, we released our first AI Fundamentals episode: Benchmarks 101. We covered the history of benchmarks, why they exist, how they are structured, and how they influence the development of artificial intelligence. Today we are (finally!) releasing Datasets 101! We’re really enjoying doing this series despite the work it takes - please let us know what else you want us to cover! Stop me if you’ve heard this before: “GPT3 was trained on the entire Internet”. Blatantly, demonstrably untrue:...
Jul 17, 2023•1 hr 1 min
Code Interpreter is GA! As we do with breaking news, we convened an emergency pod and >17,000 people tuned in, by far our biggest ever. This is a 2-for-1 post - a longform essay with our trademark executive summary and core insights - and a podcast capturing day-after reactions. Don’t miss either of them! Essay and transcript: https://latent.space/p/code-interpreter Podcast Timestamps [00:00:00] Intro - Simon and Alex [00:07:40] Code Interpreter for Edge Cases [00:08:59] Code Interpreter...
Jul 10, 2023•2 hr 4 min
Part 2 of our podcast feed swap weekend! Check out Cognitive Revolution as well. "Data" Dan Whitenack has been co-host of the Practical AI podcast for the past 5 years, covering the full journey of the modern AI wave post Transformers. He joined us in studio to talk about their origin story and highlight key learnings from past episodes, riff on the AI trends we are all seeing as AI practitioner-podcasters, and his passion for low-resource-everything! Subscribe on the Changelog, RSS, Apple Podcast...
Jul 02, 2023•1 hr
Thanks to the over 1m people that have checked out the Rise of the AI Engineer. It’s a long July 4 weekend in the US, and we’re celebrating with a podcast feed swap! We’ve been big fans of Nathan Labenz and Erik Torenberg’s work at the Cognitive Revolution podcast for a while, which started around the same time as we did and has done an incredible job of hosting discussions with top researchers and thinkers in the field, with a wide range of topics across computer vision (a special focus thanks...
Jul 01, 2023•2 hr 5 min
We are now launching our dedicated new YouTube and Twitter! Any help in amplifying our podcast would be greatly appreciated, and of course, tell your friends! Notable follow-on discussions collected on Twitter, Reddit, Reddit, Reddit, HN, and HN. Please don’t obsess too much over the GPT4 discussion as it is mostly rumor; we spent much more time on tinybox/tinygrad on which George is the foremost authority! We are excited to share the world’s first interview with George Hotz on the tiny co...
Jun 20, 2023•1 hr 13 min
Full Transcript and show notes: https://www.latent.space/p/function-agents?sd=pf Timestamps: [00:00:00] Intro [00:01:47] Recapping June 2023 Updates [00:06:24] Known Issues with Long Context [00:08:00] New Functions API [00:10:45] Riley Goodside [00:12:28] Simon Willison [00:14:30] Eric Elliott [00:16:05] Functions API and Agents [00:18:25] Functions API vs Google Vertex JSON [00:21:32] From English back to Code [00:26:14] Embedding Price Drop and Pinecone Perspective [00:30:39] Xenova and Huggi...
Jun 14, 2023•1 hr 28 min
Welcome to the almost 3k latent space explorers that joined us last month! We’re holding our first SF listener meetup with Practical AI next Monday; join us if you want to meet past guests and put faces to voices! All events are in /community. Who among you regularly click the ubiquitous 👍/👎 buttons in ChatGPT/Bard/etc? Anyone? I don’t see any hands up. OpenAI has told us how important reinforcement learning from human feedback (RLHF) is to creating the magic that is ChatGPT, but we know fro...
Jun 08, 2023•49 min
Read: https://www.latent.space/p/ai-interfaces-and-notion Show Notes * Linus on Twitter * Linus’ personal blog * Notion * Notion AI * Notion Projects * AI UX Meetup Recap Timestamps * [00:03:30] Starting the AI / UX community * [00:10:01] Most knowledge work is not text generation * [00:16:21] Finding the right constraints and interface for AI * [00:19:06] Linus' journey to working at Notion * [00:23:29] The importance of notations and interfaces * [00:26:07] Setting interface defaults and stand...
Jun 01, 2023•1 hr 10 min
We are hosting the AI World’s Fair in San Francisco on June 8th! You can RSVP here. Come meet fellow builders, see amazing AI tech showcases at different booths around the venue, all mixed with elements of traditional fairs: live music, drinks, games, and food! We are also at Amplitude’s AI x Product Hackathon and are hosting our first joint Latent Space + Practical AI Podcast Listener Meetup next month! We are honored by the rave reviews for our last episode with MosaicML! They are also welcom...
May 25, 2023•1 hr 3 min
We are excited to be the first podcast in the world to release an in-depth interview on the new SOTA in commercially licensed open source models - MosaicML MPT-7B! The Latent Space crew will be at the NYC Lux AI Summit next week, and have two meetups in June. As usual, all events are on the Community page! We are also inviting beta testers for the upcoming AI for Engineers course. See you soon! One of GPT3’s biggest limitations is context length - you can only send it up to 4000 tokens (3k word...
May 20, 2023•1 hr 7 min
Tomorrow, 5/16, we’re hosting Latent Space Liftoff Day in San Francisco. We have some amazing demos from founders at 5:30pm, and we’ll have an open co-working starting at 2pm. Spaces are limited, so please RSVP here! One of the biggest criticisms of large language models is their inability to tightly follow requirements without extensive prompt engineering. You might have seen examples of ChatGPT playing a game of chess and making many invalid moves, or adding new pieces to the board. Guardrail...
May 16, 2023•1 hr 2 min
Thanks to the over 42,000 latent space explorers who checked out our Replit episode! We are hosting/attending a couple more events in SF and NYC this month. See you if in town! Lexica.art was introduced to the world 24 hours after the release of Stable Diffusion as a search engine for prompts, gaining instant product-market fit as a world discovering generative AI also found they needed to learn prompting by example. Lexica is now 8 months old, serving 5B image searches/day, and just shipped V3...
May 08, 2023•51 min
It’s now almost 6 months since Google declared Code Red, and the results — Jeff Dean’s recap of 2022 achievements and a mass exodus of the top research talent that contributed to it in January, Bard’s rushed launch in Feb, a slick video showing Google Workspace AI features and confusing doubly linked blogposts about PaLM API in March, and merging Google Brain and DeepMind in April — have not been inspiring. Google’s internal panic is on full display now with the surfacing of a well-written memo...
May 05, 2023•44 min
Latent Space is popping off! Welcome to the over 8500 latent space explorers who have joined us. Join us this month at various events in SF and NYC, or start your own! This post spent 22 hours at the top of Hacker News. As announced during their Developer Day celebrating their $100m fundraise following their Google partnership, Replit is now open sourcing its own state of the art code LLM: replit-code-v1-3b (model card, HF Space), which beats OpenAI’s Codex model on the industry standard H...
May 03, 2023•1 hr 10 min
The race is on for the first fully GPT3/4-equivalent, truly open source Foundation Model! LLaMA’s release proved that a great model could be released and run on consumer-grade hardware (see llama.cpp), but its research license prohibits businesses from running it and all its variants (Alpaca, Vicuna, Koala, etc) for their own use at work. So there is great interest and desire for *truly* open source LLMs that are feasible for commercial use (with far better customization, finetuning, and priva...
Apr 29, 2023•1 hr 16 min
The most recent YCombinator W23 batch graduated 59 companies building with Generative AI for everything from sales, support, engineering, data, and more. Many of these B2B startups will be seeking to establish an AI foothold in the enterprise. As they look to recent success, they will find Glean, started in 2019 by a group of ex-Googlers to finally solve AI-enabled enterprise search. In 2022 Sequoia led their Series C at a $1b valuation and Glean have just refreshed their website touting new log...
Apr 22, 2023•1 hr 4 min
2023 is the year of Multimodal AI, and Latent Space is going multimodal too! * This podcast comes with a video demo at the 1hr mark and it’s a good excuse to launch our YouTube - please subscribe! * We are also holding two events in San Francisco — the first AI | UX meetup next week (already full; we’ll send a recap here on the newsletter) and Latent Space Liftoff Day on May 4th (signup here; but get in touch if you have a high profile launch you’d like to make). * We also joined the Chroma/O...
Apr 13, 2023•1 hr 20 min
We’re trying a new format, inspired by Acquired.fm! No guests, no news, just highly prepared, in-depth conversation on one topic that will level up your understanding. We aren’t experts, we are learning in public. Please let us know what we got wrong and what you think of this new format! When you ask someone to break down the basic ingredients of a Large Language Model, you’ll often hear a few things: You need lots of data. You need lots of compute. You need models with billions of parameters....
Apr 07, 2023•51 min
We are excited to feature our first academic on the pod! I first came across Shreya when her tweetstorm of MLOps principles went viral. Shreya’s holistic approach to production grade machine learning has taken her from Stanford to Facebook and Google Brain, being the first ML Engineer at Viaduct, and now a PhD in Databases (trust us, it’s relevant) at UC Berkeley with the new EPIC Data Lab. If you know Berkeley’s history in turning cutting edge research into gamechanging startups, you should be ...
Mar 29, 2023•42 min
This blogpost has been updated since original release to add more links and references. The ChatGPT Plugins announcement today could be viewed as the launch of ChatGPT’s “App Store”, a moment as significant as when Apple opened its App Store for the iPhone in 2008 or when Facebook let developers loose on its Open Graph in 2010. With a dozen lines of simple JSON and a mostly-English prompt to help ChatGPT understand what the plugin does, developers will be able to add extensions to ChatGPT to get...
Mar 24, 2023•1 hr 36 min
If Text is the Universal Interface, then Text to SQL is perhaps the killer B2B business usecase for Generative AI. You may have seen incredible demos from Perplexity AI, OSS Insights, and CensusGPT where the barrier of learning SQL and schemas goes away and you can intuitively converse with your data in natural language. But in the multi-billion dollar data engineering industry, Seek.ai has emerged as the forerunner in building a conversational engine and knowledge base that truly democratize...
Mar 10, 2023•38 min
OpenAI just rollicked the AI world yet again yesterday — while releasing the long-awaited ChatGPT API, they also priced it at $2 per million tokens generated, which is 90% cheaper than the text-davinci-003 pricing of the “GPT3.5” family. Their blogpost on how they did it is vague: “Through a series of system-wide optimizations, we’ve achieved 90% cost reduction for ChatGPT since December; we’re now passing through those savings to API users.” We were fortunate enough to record Episode 2 of our pod...
Mar 02, 2023•51 min
We’re so glad to launch our first podcast episode with Logan Kilpatrick! This also happens to be his first public interview since joining OpenAI as their first Developer Advocate. Thanks Logan! Recorded in-person at the beautiful StudioPod studios in San Francisco. Full transcript is below the fold. Timestamps * 00:29: Logan’s path to OpenAI * 07:06: On ChatGPT and GPT3 API * 16:16: On Prompt Engineering * 20:30: Usecases and LLM-Native Products * 25:38: Risks and benefits of building on OpenAI...
Feb 23, 2023•52 min