Memento: Fine-tuning LLM Agents without Fine-tuning LLMs

Sep 01, 2025, 19 min

Episode description

The research introduces Memento, an approach to adaptive Large Language Model (LLM) agents that learn continually without fine-tuning the base LLM's parameters. The method uses a memory-based online reinforcement learning framework, formalized as a Memory-augmented Markov Decision Process (M-MDP): past experiences are stored in an episodic memory, and a neural case-selection policy is continually updated. Memento combines a planner-executor architecture with a comprehensive suite of tools and reports state-of-the-art performance on several benchmarks, including GAIA, DeepResearcher, and SimpleQA. Ablation studies indicate that both parametric and non-parametric case-based reasoning (CBR) are crucial for significant performance gains and for effective generalization to out-of-distribution tasks.
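
To make the idea of an episodic case memory more concrete, here is a minimal sketch of non-parametric case retrieval. It is not the paper's implementation: the names `Case`, `CaseBank`, `write`, `read`, and `build_prompt` are hypothetical, and a bag-of-words cosine similarity stands in for whatever retriever or learned case-selection policy the authors actually use. The point is only that the agent can improve by writing outcomes to memory and reading similar past cases into the planner's prompt, while the LLM's weights stay frozen.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Case:
    """One stored experience: the task, the plan that was tried, and its outcome."""
    task: str
    plan: str
    reward: float  # assumed convention: 1.0 = success, 0.0 = failure


def _bow(text):
    """Bag-of-words vector; a stand-in for a real text encoder (assumption)."""
    return Counter(text.lower().split())


def _cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class CaseBank:
    """Episodic memory that grows online; no model parameters are updated."""
    cases: list = field(default_factory=list)

    def write(self, task, plan, reward):
        # Store the finished episode so later tasks can reuse it.
        self.cases.append(Case(task, plan, reward))

    def read(self, task, k=2):
        # Non-parametric selection: return the k most similar past tasks.
        query = _bow(task)
        ranked = sorted(
            self.cases,
            key=lambda c: _cosine(query, _bow(c.task)),
            reverse=True,
        )
        return ranked[:k]


def build_prompt(task, retrieved):
    """Format retrieved cases as context for a (frozen) planner LLM."""
    lines = ["Past cases:"]
    for c in retrieved:
        outcome = "succeeded" if c.reward > 0 else "failed"
        lines.append(f"- Task: {c.task} | Plan: {c.plan} | Outcome: {outcome}")
    lines.append(f"New task: {task}")
    return "\n".join(lines)


bank = CaseBank()
bank.write("find the population of Paris", "search web then extract number", 1.0)
bank.write("compute the integral of x squared", "run a symbolic math tool", 1.0)
bank.write("find the population of Berlin", "answer from memory", 0.0)

new_task = "find the population of Rome"
top = bank.read(new_task, k=2)
prompt = build_prompt(new_task, top)
```

In this sketch both successful and failed cases are retrieved, so the planner can imitate the former and avoid the latter. A parametric variant would replace the fixed similarity ranking in `read` with a learned scoring function that is updated from rewards.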