Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Feb 11, 2025•23 min•Ep. 514

Episode description

🤗 Upvotes: 30 | cs.LG, cs.CL

Authors:
Jonas Geiping, Sean McLeish, Neel Jain, John Kirchenbauer, Siddharth Singh, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Tom Goldstein

Title:
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Arxiv:
http://arxiv.org/abs/2502.05171v1

Abstract:
We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling to arbitrary depth at test-time. This stands in contrast to mainstream reasoning models that scale up compute by producing more tokens. Unlike approaches based on chain-of-thought, our approach does not require any specialized training data, can work with small context windows, and can capture types of reasoning that are not easily represented in words. We scale a proof-of-concept model to 3.5 billion parameters and 800 billion tokens. We show that the resulting model can improve its performance on reasoning benchmarks, sometimes dramatically, up to a computation load equivalent to 50 billion parameters.
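
The core idea in the abstract, iterating one recurrent block so that depth becomes a test-time setting, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the class name `ToyRecurrentDepthModel`, the `prelude` / `recurrent_block` / `coda` split, the tanh update, the zero initial state, and the tiny random weights are all hypothetical. The only point is that the same parameters are reused at every iteration, so extra compute comes from more loop steps rather than more tokens.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.3):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyRecurrentDepthModel:
    """Hypothetical toy model: embed once, iterate a shared block, decode once."""

    def __init__(self, d_in=4, d_latent=8, d_out=3, seed=0):
        rng = random.Random(seed)
        self.d_latent = d_latent
        self.prelude = make_matrix(d_latent, d_in, rng)       # input -> latent space
        self.w_state = make_matrix(d_latent, d_latent, rng)   # shared across iterations
        self.w_input = make_matrix(d_latent, d_latent, rng)   # shared across iterations
        self.coda = make_matrix(d_out, d_latent, rng)         # latent space -> output

    def recurrent_block(self, state, embedded):
        """One latent update; the embedded input is re-injected at every step."""
        mixed = [
            a + b
            for a, b in zip(matvec(self.w_state, state), matvec(self.w_input, embedded))
        ]
        return [math.tanh(v) for v in mixed]

    def forward(self, x, num_iterations):
        """Unroll the same block num_iterations times, then decode the final state."""
        embedded = matvec(self.prelude, x)
        state = [0.0] * self.d_latent  # assumed zero start, chosen for determinism
        for _ in range(num_iterations):
            state = self.recurrent_block(state, embedded)
        return matvec(self.coda, state)


model = ToyRecurrentDepthModel()
x = [1.0, -0.5, 0.25, 2.0]

# Same weights, different amounts of test-time compute.
shallow = model.forward(x, num_iterations=1)
deep = model.forward(x, num_iterations=32)
print("1 iteration:  ", shallow)
print("32 iterations:", deep)
```

In this sketch `num_iterations` is chosen by the caller at inference time, which mirrors the abstract's "unrolling to arbitrary depth"; whether more iterations actually help depends on training, which the toy does not attempt to model.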
