From Model Weights to Agent Workflows: Charting the New Frontier of Optimization in Large Language Models

Best AI papers explained

Aug 12, 2025•17 min

Episode description

We discuss a significant shift in artificial intelligence: from optimizing single, monolithic **Large Language Models (LLMs)** to optimizing complex, multi-component **LLM agents**. Previously, optimization focused on tuning model **weights ($\theta$)** with methods like **Reinforcement Learning from Human Feedback (RLHF)**, which rest on a clear mathematical objective, the **KL-regularized expected reward**. The emerging paradigm of agent optimization instead tunes an entire **workflow program ($\Pi$)**, which includes textual prompts, tool usage, and control flow logic. The result is a **non-differentiable**, **combinatorial optimization space** that lacks a comparably clear mathematical objective. The text then analyzes two prominent frameworks that attempt to bring structure to this new problem: **DSPy**, which treats it as a **program search problem**, and **LLM-AutoDiff**, which introduces a **"calculus of prompts"** with **"textual gradients"**, although the latter still relies on semantic, rather than strictly mathematical, objectives.
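
For reference, the KL-regularized expected reward mentioned above is commonly written as follows. This is the standard textbook form rather than a formula quoted from the episode; the symbols $r$ (reward model), $\beta$ (regularization strength), $\pi_{\mathrm{ref}}$ (frozen reference policy), and $\mathcal{D}$ (prompt distribution) are assumptions introduced here for illustration.

$$
\max_{\theta}\;\; \mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi_{\theta}(\cdot \mid x)}\big[\, r(x, y) \,\big] \;-\; \beta\, \mathbb{E}_{x \sim \mathcal{D}}\Big[\mathrm{KL}\big(\pi_{\theta}(\cdot \mid x) \,\big\|\, \pi_{\mathrm{ref}}(\cdot \mid x)\big)\Big]
$$

Every term here is a differentiable function of $\theta$ or can be estimated with policy-gradient methods. A workflow program $\Pi$ has no such structure: its "parameters" are discrete strings, tool choices, and branches, so it is typically scored only by a task-level or semantic metric and must be searched rather than differentiated.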
