Past-Token Prediction for Long-Context Robot Policies

May 20, 2025 · 16 min

Episode description

This episode covers Past-Token Prediction (PTP), an auxiliary technique designed to improve long-context diffusion policies for robots that learn tasks through imitation. The core idea is to explicitly train the policy to predict past actions alongside future ones, which addresses the failure of modern diffusion policies to capture strong temporal dependencies. A multi-stage training strategy separates visual encoder training from long-context policy training and uses cached embeddings to make training more efficient. At inference time, PTP also serves as a self-verification mechanism: among candidate action sequences, the policy keeps the one that best matches the actions already executed. Experiments show that the method significantly improves performance and training speed on a range of simulated and real-world tasks, especially those that require memory of past events.
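The self-verification step can be illustrated with a small sketch. This is not the authors' implementation; it only assumes that each candidate is a sequence whose first steps are the policy's predictions of past actions and whose remaining steps are proposed future actions, and that "best match" means the lowest squared error against the executed actions. The function names, the candidate layout, and the error metric are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Action = Sequence[float]        # one action vector
Trajectory = Sequence[Action]   # actions ordered in time


def past_mismatch(predicted_past: Trajectory, executed_past: Trajectory) -> float:
    """Sum of squared differences between predicted and executed past actions."""
    if len(predicted_past) != len(executed_past):
        raise ValueError("past segments must have the same length")
    return sum(
        (p - e) ** 2
        for pred, done in zip(predicted_past, executed_past)
        for p, e in zip(pred, done)
    )


def select_by_past_consistency(
    candidates: Sequence[Trajectory], executed_past: Trajectory
) -> Tuple[int, List[Action]]:
    """Return the index and future actions of the most past-consistent candidate.

    Assumed layout (hypothetical): each candidate holds len(executed_past)
    predicted past actions followed by its proposed future actions.
    """
    n_past = len(executed_past)
    scores = [past_mismatch(c[:n_past], executed_past) for c in candidates]
    best = min(range(len(candidates)), key=scores.__getitem__)
    return best, list(candidates[best][n_past:])


# Toy example: two executed 2-D actions and two sampled candidates.
executed = [[0.0, 0.0], [1.0, 0.0]]
candidates = [
    [[0.5, 0.5], [1.5, 0.5], [2.0, 1.0]],  # past predictions drift from what was executed
    [[0.0, 0.1], [1.0, 0.0], [2.0, 0.0]],  # past predictions nearly reproduce it
]
best_index, future_actions = select_by_past_consistency(candidates, executed)
```

In this toy case the second candidate is selected because its predicted past stays closest to the executed actions, so only its future portion would be passed on for execution. A real policy would sample the candidates from the diffusion model and might score them with a different metric.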