Temporal difference flow

Oct 06, 2025 · 15 min

Episode description

This paper introduces temporal difference flows, a family of generative models designed to overcome the compounding-error limitation of traditional world models in reinforcement learning, especially for long-horizon predictive modeling. Methods such as TD²-CFM and TD²-DD exploit the temporal difference structure of the successor measure, the distribution that a Geometric Horizon Model (GHM) learns, to obtain provable convergence and lower-variance gradient estimates, which yields stable and more accurate predictions over extended time horizons. The paper extends the theory of flow matching and diffusion models to this setting and reports empirical evaluations showing improved prediction accuracy, value function estimation, and Generalized Policy Improvement (GPI) across a range of robotics and maze tasks.
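
To make the "temporal difference structure" of the successor measure concrete, here is a minimal tabular sketch in Python. It is not the paper's method: the paper's models are neural flow-matching and diffusion models over continuous states, whereas this toy example assumes a small finite state space and a known transition matrix under a fixed policy. The function names and the random transition matrix are illustrative assumptions. It shows that repeatedly applying the one-step bootstrapped update M ← (1 − γ)P + γPM recovers the discounted distribution over future states without rolling a model forward step by step.

```python
import numpy as np


def successor_measure_closed_form(P, gamma):
    """Discounted future-state distribution (1 - gamma) * sum_k gamma^k P^(k+1).

    P is an (n, n) row-stochastic transition matrix under a fixed policy
    (an assumption of this toy example). The geometric series has the
    closed form (1 - gamma) * P @ inv(I - gamma * P).
    """
    n = P.shape[0]
    return (1.0 - gamma) * P @ np.linalg.inv(np.eye(n) - gamma * P)


def successor_measure_td(P, gamma, iters=2000):
    """Fixed-point iteration on the temporal difference (Bellman) equation.

    M = (1 - gamma) * P + gamma * P @ M
    Each update bootstraps from the current estimate at the next state,
    which is the structure the episode description refers to.
    """
    n = P.shape[0]
    M = np.full((n, n), 1.0 / n)  # arbitrary initial guess, rows sum to 1
    for _ in range(iters):
        M = (1.0 - gamma) * P + gamma * P @ M
    return M


# Hypothetical 5-state chain with random transitions, for illustration only.
rng = np.random.default_rng(0)
P_example = rng.random((5, 5))
P_example /= P_example.sum(axis=1, keepdims=True)  # normalize rows

M_td = successor_measure_td(P_example, gamma=0.9)
M_exact = successor_measure_closed_form(P_example, gamma=0.9)
print("max abs difference:", np.abs(M_td - M_exact).max())
```

Each row of the result is a probability distribution over future states, weighted geometrically by γ. The update is a γ-contraction, so the iteration converges from any starting guess. Temporal difference flows apply the same bootstrapping idea when the distribution has to be represented by a learned generative model instead of a table.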
