Past-Token Prediction for Long-Context Robot Policies

Best AI papers explained

May 20, 2025 • 16 min

Episode description

This research presents Past-Token Prediction (PTP), an auxiliary technique for improving long-context diffusion policies that robots learn through imitation. The core idea is to train the policy explicitly to predict past actions along with future ones, which addresses the failure of modern diffusion policies to capture strong temporal dependencies. The authors introduce a multi-stage training strategy that separates visual encoder training from long-context policy training and uses cached embeddings to improve efficiency. PTP also serves as a self-verification mechanism during inference: the policy selects the candidate actions that best match previously executed actions. Experiments show that the method significantly boosts performance and training speed on a range of simulated and real-world tasks, especially those that require memory of past events.
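
The self-verification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_by_past_consistency`, the array layout (past actions first, then future actions), and the use of mean squared error as the matching score are all assumptions made for illustration.

```python
import numpy as np


def select_by_past_consistency(candidates, executed_past, n_past):
    """Pick the candidate whose predicted past best matches executed actions.

    Hypothetical layout (an assumption, not taken from the paper):
      candidates:    (K, n_past + n_future, action_dim) sampled sequences
      executed_past: (n_past, action_dim) actions the robot already executed
    """
    candidates = np.asarray(candidates, dtype=float)
    executed_past = np.asarray(executed_past, dtype=float)

    # The first n_past steps of each candidate are its past-action predictions.
    predicted_past = candidates[:, :n_past, :]

    # Score each candidate by mean squared error against the executed actions
    # (MSE is an assumed choice of matching score).
    errors = np.mean((predicted_past - executed_past[None]) ** 2, axis=(1, 2))

    # Keep the most consistent candidate and return only its future actions.
    best = int(np.argmin(errors))
    return best, candidates[best, n_past:, :], errors


# Usage: three candidates, two past steps, two future steps, 2-D actions.
executed = np.array([[0.0, 0.0], [1.0, 1.0]])
samples = np.zeros((3, 4, 2))
samples[0, :2] = executed + 0.5   # inconsistent with what was executed
samples[1, :2] = executed         # reproduces the executed actions exactly
samples[2, :2] = executed - 1.0   # most inconsistent
best, future_actions, errors = select_by_past_consistency(samples, executed, n_past=2)
```

In this toy setup the second candidate reproduces the executed actions exactly, so it is the one selected, and only its future actions would be passed on for execution.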
