ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization

Daily Paper Cast

Feb 08, 2025•21 min•Ep. 501

Episode description

🤗 Upvotes: 12 | cs.CL

Authors:
Yinjie Wang, Ling Yang, Guohao Li, Mengdi Wang, Bryon Aragam

Title:
ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization

Arxiv:
http://arxiv.org/abs/2502.04306v1

Abstract:
Recent research has leveraged large language model multi-agent systems for complex problem-solving while trying to reduce the manual effort required to build them, driving the development of automated agent workflow optimization methods. However, existing methods remain inflexible due to representational limitations, a lack of adaptability, and poor scalability when relying on discrete optimization techniques. We address these challenges with ScoreFlow, a simple yet high-performance framework that leverages efficient gradient-based optimization in a continuous space. ScoreFlow incorporates Score-DPO, a novel variant of the direct preference optimization method that accounts for quantitative feedback. Across six benchmarks spanning question answering, coding, and mathematical reasoning, ScoreFlow achieves an 8.2% improvement over existing baselines. Moreover, it empowers smaller models to outperform larger ones with lower inference costs. Project: https://github.com/Gen-Verse/ScoreFlow
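
The abstract does not spell out the Score-DPO objective, so the following is only a rough sketch for intuition. It shows the standard direct preference optimization (DPO) loss for a single preferred/rejected pair, plus one hypothetical way that quantitative feedback could enter: scaling the loss by the gap between the two evaluation scores. The function names, the `beta` default, and the score-gap weighting are assumptions made here for illustration; they are not taken from the paper or the ScoreFlow repository.

```python
import math


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair.

    logp_* are log-probabilities under the policy being trained,
    ref_logp_* are log-probabilities under the frozen reference model.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)) computed as softplus(-margin) for numerical stability
    return math.log1p(math.exp(-abs(margin))) + max(-margin, 0.0)


def score_weighted_dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l,
                            score_w, score_l, beta=0.1):
    """Hypothetical score-aware variant (an assumption, not the paper's formula).

    Pairs whose evaluation scores differ more contribute more to the loss.
    """
    weight = score_w - score_l  # assumed: score_w >= score_l, both in [0, 1]
    return weight * dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta)
```

In this sketch, a pair with nearly equal scores contributes almost nothing, while a pair with a large score gap pushes the policy more strongly toward the higher-scoring workflow. The paper's actual formulation may differ.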
