⚡️GPT 4.1: The New OpenAI Workhorse
Apr 15, 2025•42 min
Episode description
We’ll keep this brief because we’re on a tight turnaround: GPT 4.1, previously known as the Quasar and Optimus models, is now live as the natural update for 4o/4o-mini (and the research preview of GPT 4.5). Though it is a general purpose model family, the headline features are:
Coding abilities (o1-level on SWE-bench and SWE-Lancer, but only okay on Aider)
Instruction Following (with a very notable prompting guide)
Long Context up to 1M tokens (with new MRCR and Graphwalks benchmarks)
Vision (simply o1 level)
Cheaper Pricing (cheaper than 4o, greatly improved prompt caching savings)
We caught up with returning guest Michelle Pokrass and Josh McGrath to get more detail on each!
Chapters
00:00:00 Introduction and Guest Welcome
00:00:57 GPT 4.1 Launch Overview
00:01:54 Developer Feedback and Model Names
00:02:53 Model Naming and Starry Themes
00:03:49 Confusion Over GPT 4.1 vs 4.5
00:04:47 Distillation and Model Improvements
00:05:45 Omnimodel Architecture and Future Plans
00:06:43 Core Capabilities of GPT 4.1
00:07:40 Training Techniques and Long Context
00:08:37 Challenges in Long Context Reasoning
00:09:34 Context Utilization in Models
00:10:31 Graph Walks and Model Evaluation
00:11:31 Real Life Applications of Graph Tasks
00:12:30 Multi-Hop Reasoning Benchmarks
00:13:30 Agentic Workflows and Backtracking
00:14:28 Graph Traversals for Agent Planning
00:15:24 Context Usage in API and Memory Systems
00:16:21 Model Performance in Long Context Tasks
00:17:17 Instruction Following and Real World Data
00:18:12 Challenges in Grading Instructions
00:19:09 Instruction Following Techniques
00:20:09 Prompting Techniques and Model Responses
00:21:05 Agentic Workflows and Model Persistence
00:22:01 Balancing Persistence and User Control
00:22:56 Evaluations on Model Edits and Persistence
00:23:55 XML vs JSON in Prompting
00:24:50 Instruction Placement in Context
00:25:49 Optimizing for Prompt Caching
00:26:49 Chain of Thought and Reasoning Models
00:27:46 Choosing the Right Model for Your Task
00:28:46 Coding Capabilities of GPT 4.1
00:29:41 Model Performance in Coding Tasks
00:30:39 Understanding Coding Model Differences
00:31:36 Using Smaller Models for Coding
00:32:33 Future of Coding in OpenAI
00:33:28 Internal Use and Success Stories
00:34:26 Vision and Multi-Modal Capabilities
00:35:25 Screen vs Embodied Vision
00:36:22 Vision Benchmarks and Model Improvements
00:37:19 Model Deprecation and GPU Usage
00:38:13 Fine-Tuning and Preference Steering
00:39:12 Upcoming Reasoning Models
00:40:10 Creative Writing and Model Humor
00:41:07 Feedback and Developer Community
00:42:03 Pricing and Blended Model Costs
00:44:02 Conclusion and Wrap-Up