Continuous Autoregressive Language Models

Best AI papers explained

Nov 08, 2025 • 16 min

Episode description

This paper introduces **Continuous Autoregressive Language Models (CALM)**, a paradigm designed to overcome the efficiency limits of conventional token-by-token generation in Large Language Models (LLMs). CALM uses a robust **autoencoder** to compress a chunk of $K$ discrete tokens into a single continuous vector from which the tokens can be reconstructed with high fidelity, so the model predicts one vector per step and the number of sequential generation steps falls by a factor of $K$. Because the model no longer outputs an explicit probability distribution over a finite vocabulary, this shift requires a **likelihood-free framework**: an **energy loss** for generative modeling and a new evaluation metric, **BrierLM**, which offers a reliable alternative to perplexity for implicit models. The paper also details a provably exact but computationally expensive **likelihood-free temperature sampling algorithm**, along with a highly efficient batch approximation that achieves an accuracy-diversity trade-off comparable to that of traditional LLMs. The empirical results indicate that increasing the **semantic bandwidth** $K$ provides a new axis for improving the performance-compute balance in language modeling.
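
To make the likelihood-free idea concrete, here is a minimal Python sketch. It is not code from the paper or the episode, and all names (`toy_sampler`, `generation_steps`, `brier_estimate`, `sample_at_inverse_temperature`) are hypothetical. It assumes only that the model can be sampled from, never queried for probabilities. The Brier-style estimate uses the standard identity that, for two independent samples $x_1, x_2$, the quantity $\mathbb{1}[x_1 = y] + \mathbb{1}[x_2 = y] - \mathbb{1}[x_1 = x_2]$ has expectation $2P(y) - \sum_x P(x)^2$. The temperature sampler covers only the special case $T = 1/n$ for integer $n$, where accepting a draw only when $n$ independent samples agree yields a distribution proportional to $P(x)^n$; the paper's general algorithm and its batch approximation may differ in detail.

```python
import math
import random


def generation_steps(num_tokens: int, k: int) -> int:
    """Sequential steps needed when each step emits a chunk of k tokens."""
    return math.ceil(num_tokens / k)


def toy_sampler(rng: random.Random) -> str:
    """Hypothetical stand-in for an implicit model: we can sample, not score."""
    return rng.choices(["a", "b", "c"], weights=[0.6, 0.3, 0.1])[0]


def brier_estimate(sample, target, num_pairs: int, rng: random.Random) -> float:
    """Sample-only estimate of 2*P(target) - sum_x P(x)^2 (higher is better)."""
    total = 0
    for _ in range(num_pairs):
        x1, x2 = sample(rng), sample(rng)
        # Booleans act as 0/1: reward hits on the target, penalize collisions.
        total += (x1 == target) + (x2 == target) - (x1 == x2)
    return total / num_pairs


def sample_at_inverse_temperature(sample, n: int, rng: random.Random,
                                  max_tries: int = 100_000):
    """Exact sample from P(x)^n / Z (temperature T = 1/n) by rejection.

    Draw n independent samples and accept only if they all agree.
    The cost grows quickly with n, which is why this is expensive.
    """
    for _ in range(max_tries):
        first = sample(rng)
        if all(sample(rng) == first for _ in range(n - 1)):
            return first
    raise RuntimeError("no agreement within max_tries; distribution too flat")


if __name__ == "__main__":
    rng = random.Random(0)
    print("steps for 2048 tokens at K=4:", generation_steps(2048, 4))
    print("Brier-style estimate for 'a':",
          brier_estimate(toy_sampler, "a", 20_000, rng))
    draws = [sample_at_inverse_temperature(toy_sampler, 2, rng)
             for _ in range(5_000)]
    print("share of 'a' at T=1/2:", draws.count("a") / len(draws))
```

For the toy distribution, the estimate should approach $2(0.6) - (0.36 + 0.09 + 0.01) = 0.74$, and sampling at $T = 1/2$ should sharpen the share of `a` from $0.6$ toward $0.36 / 0.46 \approx 0.78$, trading diversity for accuracy.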
