
Interconnects

Nathan Lambert
Audio essays about the latest developments in AI and interviews with leading scientists in the field. Breaking the hype, understanding what's under the hood, and telling stories.

www.interconnects.ai

Episodes

Elicitation, the simplest way to understand post-training

Full post: https://www.interconnects.ai/p/elicitation-theory-of-post-training For most of the models we've received from OpenAI, Anthropic, and Google in the last 18 months, you'll hear a lot of "Most of the improvements were in the post-training phase." The most recent example was Anthropic’s CEO Dario Amodei explaining Claude 3.7 on the Hard Fork Podcast: We are not too far away from releasing a model that's a bigger base model. Most of the improvements in 3.6/3.7 are in the post-train...

Mar 10, 2025 · 8 min

Where inference-time scaling pushes the market for AI companies

Link: https://www.interconnects.ai/p/where-inference-time-scaling-pushes There’s a lot of noise about the current costs of AI models served to free users, mostly claims that it’s unsustainable, which rings hollow to those with the historical perspective that the costs of technology always plummet. GPT-4.5’s odd release of a “giant” model without a clear niche only amplified these criticisms. With inference-time compute becoming a new default mode, can we still have free AI products? Are we just in t...

Mar 05, 2025 · 14 min

GPT-4.5: "Not a frontier model"?

More: https://www.interconnects.ai/p/gpt-45-not-a-frontier-model As GPT-4.5 was being released, the first material the public got access to was OpenAI’s system card for the model, which details some capability evaluations and mostly safety estimates. Before the live stream and official blog post, we knew things were going to be weird because of this line: “GPT-4.5 is not a frontier model.” The updated system card in the launch blog post does not have this line. Here’s the original system card if you need...

Feb 28, 2025 · 10 min

Character training: Understanding and crafting a language model's personality

https://www.interconnects.ai/p/character-training The vast majority of evaluations used to measure progress on post-training at frontier laboratories are internal evaluations, rather than the ones you hear about all the time like MATH or GPQA. Those well-known industry-wide evaluations are certainly important for ballparking behavior, but for every public evaluation, a frontier laboratory is likely to have 10+ fine-grained internal evaluations. The internal evaluations these ...

Feb 26, 2025 · 12 min

Claude 3.7 thonks and what's next for inference-time scaling

On Monday, February 24th, 2025, Anthropic announced their latest model, Claude 3.7 Sonnet, which is their first model explicitly trained to use more inference-time tokens to improve performance. This is another reinforcement learning (RL) trained model (as mentioned in the system card). With this model, they also released Claude Code as a limited research preview, a “command line tool for agentic coding.” Continuous improvements in models are enabling new modalities and domains addressable b...

Feb 24, 2025 · 10 min

Grok 3 and an accelerating AI roadmap

Full post: https://www.interconnects.ai/p/grok-3-and-an-accelerating-ai-roadmap xAI launched their latest flagship model, Grok 3, last night via a live stream on X, which is a new take on the launch process, but it largely felt familiar. Grok 3 is a state-of-the-art model on some important benchmarks. The core point is that it is state-of-the-art relative to available models, and we know better models are out there. Only some of them have been announced, some of them have been teased, and others lie...

Feb 18, 2025 · 12 min

An unexpected RL Renaissance

The era we are living through in language modeling research is one characterized by complete faith that reasoning and new reinforcement learning (RL) training methods will work. This is well-founded. A day cannot go by without a new reasoning model, RL training result, or dataset distilled from DeepSeek R1. The difference, compared to the last time RL was at the forefront of the AI world, when reinforcement learning from human feedback (RLHF) was needed to create Chat...

Feb 13, 2025 · 40 min

Deep Research, information vs. insight, and the nature of science

Article: https://www.interconnects.ai/p/deep-research-information-vs-insight-in-science (sorry about some more audible breaths in this -- I'm going to work on it!) We at Ai2 released a local LM iPhone app for our OLMoE model (1B active, 7B total params), with greatly improved scores! Let us know what you think, or read more here. OpenAI’s Deep Research has largely been accepted as a super valuable tool for knowledge workers and analysts across the economy, but its real engine of economic progr...

Feb 12, 2025 · 14 min

Making the U.S. the home for open-source AI

As many of you know, this weekend I appeared on the Lex Fridman Podcast with my friend Dylan Patel of SemiAnalysis to cover DeepSeek and its implications for the AI ecosystem. I recommend you check it out. This post was tricky to pull together. I decided to share it anyway, given the timeliness of the topic and other more exciting things I have to get to. The minor, thematic contradictions on motivations, costs, and trajectories are exactly indicative of why analysis and productionization of open...

Feb 05, 2025 · 16 min

Why reasoning models will generalize

This post is early to accommodate some last-minute travel on my end! The new models trained to express extended chain of thought are going to generalize outside of their breakthrough domains of code and math. The “reasoning” process of language models that we use today is chain-of-thought reasoning. We ask the model to work step by step because it helps it manage complexity, especially in domains where the answer requires precision across multiple specific tokens. The domains where chain of thou...

Jan 28, 2025 · 12 min

Interviewing OLMo 2 leads: Open secrets of training language models

We're here to share the story of building our Open Language Models (OLMos) and what we improved to build the OLMo 2 7B/13B models that are competitive with the Llama 3.1 8B model. This is all about building an effective, small language modeling team that can share all it learns with the scientific community. Dirk, Luca, and Kyle are some of the people I learn the most from, and they have more knowledge (and entertainment) to share than we have time for. Some questions were pulled from Twitter, but please c...

Jan 22, 2025 · 1 hr 13 min

DeepSeek R1's recipe to replicate o1 and the future of reasoning LMs

Full post for links, images, etc.: https://www.interconnects.ai/p/deepseek-r1-recipe-for-o1 I have a few shows to share with you this week:

* On The Retort a week or two ago, we discussed the nature of AI and whether it is a science (in the Kuhnian sense)
* I appeared on Dean W. Ball and Timothy B. Lee’s new podcast AI Summer to discuss “thinking models” and the border between post-training and reasoning methods. Listen here.
* Finally, a talk I gave at NeurIPS on how I think about post-training fo...

Jan 21, 2025 · 20 min

Let me use my local LMs on Meta Ray-Bans

Full post for images, etc.: https://www.interconnects.ai/p/to-meta-ray-ban-local-ai With the Rabbit r1, the Humane pin, the Friend thing, the Sam Altman rumors, Meta Ray-Bans, and everything in between, it is obvious that we are going to get new devices in the near future driven by advancements in AI. Trying some of those that are already public makes this obvious from a functional perspective rather than a marketing perspective. Even though many of these devices will have a shelf life drastical...

Jan 15, 2025 · 10 min

(Voiceover) DeepSeek V3 and the actual cost of training frontier AI models

Original post: https://www.interconnects.ai/p/deepseek-v3-and-the-actual-cost-of

Chapters:
00:00 Opening
03:15 DeepSeek’s learning efficiency
06:49 DeepSeek’s compute transparency and reality

Figures:
Fig 1: Benchmark Results
Fig 2: ChatBotArena Results
Fig 3: Compute Usage Table

Jan 09, 2025 · 17 min

The state of post-training in 2025

Slides for this post-training talk and slides for the full tutorial on language modeling (with a bit less post-training content and no recording yet). Here are some timestamps for the video:

00:00 Introduction
10:00 Prompts & Skill Selection
14:19 Instruction Finetuning
21:45 Preference Finetuning
36:17 Reinforcement Finetuning
45:28 Open Questions
52:02 Wrap Up

Psssst… we just recently released our technical report for OLMo 2 — 2 OLMo 2 Furious, check it out for tons of training details and ti...

Jan 08, 2025 · 54 min

Quick recap on the state of reasoning

In 2025 we need to disambiguate three intertwined topics: post-training, reasoning, and inference-time compute. Post-training is going to quickly become muddied with the new Reasoning Language Models (RLMs — is that a good name?), given that the loss functions we studied via advancements in post-training are now being leveraged at a large scale to create new types of models. I would not call the reinforcement learning training done for OpenAI’s o1 series of models post-training. Training o1 is l...

Jan 02, 2025 · 16 min

(Voiceover) OpenAI's o3: The grand finale of AI in 2024

Original post: https://www.interconnects.ai/p/openais-o3-the-2024-finale-of-ai

Chapters:
00:00 Introduction
02:51 o3 overview
05:57 Solving the Abstraction and Reasoning Corpus (ARC)
10:41 o3’s architecture, cost, and training (hint: still no tree search)
16:36 2024: RL returns

Figures:
Fig 1, Frontier Math results
Fig 2, Coding results
Fig 3, ARC AGI results
Fig 4, ARC AGI result details
Fig 5, ARC AGI example 1
Fig 6, ARC AGI example in text
Fig 7, ARC AGI example “easy”

Dec 20, 2024 · 18 min

(Voiceover) The AI agent spectrum

Original post: https://www.interconnects.ai/p/the-ai-agent-spectrum

Chapters:
00:00 Introduction
03:24 Agent cartography
08:02 Questions for the near future

Figures:
Fig 1, multiple feedbacks diagram

Dec 18, 2024 · 11 min

(Voiceover) OpenAI's Reinforcement Finetuning and RL for the masses

Original post: https://www.interconnects.ai/p/openais-reinforcement-finetuning

Chapters:
00:00 Introduction
04:19 The impact of reinforcement finetuning’s existence
07:29 Hypotheses on reinforcement finetuning’s implementation

Figures:
Fig. 1, Yann’s Cake
Fig. 2, Grader config
Fig. 3, RLVR learning curves

Dec 11, 2024 · 13 min

Interviewing Finbarr Timbers on the "We are So Back" Era of Reinforcement Learning

Finbarr Timbers is an AI researcher who writes Artificial Fintelligence — one of the technical AI blogs I’ve been recommending for a long time — and has a variety of experiences at top AI labs, including DeepMind and Midjourney. The goal of this interview was to do a few things:

* Revisit what reinforcement learning (RL) actually is, its origins, and its motivations.
* Contextualize the major breakthroughs of deep RL in the last decade, from DQN for Atari to AlphaZero to ChatGPT. How could we ha...

Dec 05, 2024 · 1 hr 9 min

(Voiceover) OpenAI's o1 using "search" was a PSYOP

Original post: https://www.interconnects.ai/p/openais-o1-using-search-was-a-psyop

Figures:
Figure 0: OpenAI’s seminal test-time compute plot
Figure 1: Setup for bucketed evals
Figure 2: Evals with correctness labels
Figure 3: Grouped evals
Figure 4: Hypothetical inference scaling law

Dec 04, 2024 · 12 min

(Voiceover) OLMo 2 and building effective teams for training language models

Full post: https://www.interconnects.ai/p/olmo-2-and-building-language-model-training
OLMo 2 demo: https://playground.allenai.org/
OLMo 2 artifacts: https://huggingface.co/collections/allenai/olmo-2-674117b93ab84e98afc72edc

Chapters:
00:00 Building AI Teams
06:35 OLMo 2

Figures:
Fig 1, pretrain plot: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/olmo2/pretrain.webp
Fig 2, pretrain table: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main...

Nov 26, 2024 · 10 min

(Voiceover) Tülu 3: The next era in open post-training

Original post: https://www.interconnects.ai/p/tulu-3

Chapters:
00:00 History
05:44 Technical details sneak peek

Figures:
Fig 1, results: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/tulu3-img/results.webp
Fig 2, overview: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/tulu3-img/overview.webp
Fig 3, preferences: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/tulu3-img/preferences.webp
Fig 4, RLVR: http...

Nov 21, 2024 · 8 min

(Voiceover) Scaling realities

Original post: https://www.interconnects.ai/p/scaling-realities

Nov 14, 2024 · 4 min

Interviewing Tim Dettmers on open-source AI: Agents, scaling, quantization and what's next

Tim Dettmers does not need an introduction for most people building open-source AI. If you are part of that minority, you’re in for a treat. Tim is the lead developer behind most of the open-source tools for quantization: QLoRA, bitsandbytes, 4- and 8-bit inference, and plenty more. He recently finished his Ph.D. at the University of Washington, is now a researcher at the Allen Institute for AI, and is starting as a professor at Carnegie Mellon University in fall of 2025. Tim is a joy to talk ...

Nov 07, 2024 · 1 hr 16 min

Interviewing Andrew Carr of Cartwheel on the State of Generative AI

Andrew Carr is co-founder and chief scientist at Cartwheel, where he is building text-to-motion AI models and products for gaming, film, and other creative endeavors. We discuss how to keep generative AI fun and expansive — niche powerful use cases, AI poetry, AI devices like Meta Ray-Bans, generalization to new domains like robotics, and building successful AI research cultures. Andrew is one of my well-read friends on the directions AI is going, so it is great to bring him in for an official c...

Oct 31, 2024 · 54 min

(Voiceover) Claude's agentic future and the current state of the frontier models

How Claude's computer use works. Where OpenAI, Anthropic, and Google each have a lead over the others.

Original post: https://www.interconnects.ai/p/claudes-agency

Chapters:
00:00 Claude's agentic future and the current state of the frontier models
04:43 The state of the frontier models
04:49 1. Anthropic has the best model we are accustomed to using
05:27 Google has the best small & cheap model for building automation and basic AI engineering
08:07 OpenAI has the best model for reasoning, but we don...

Oct 23, 2024 · 11 min