Training Large Language Models to Reason in Continuous Latent Space

Deep Papers

Jan 14, 2025•25 min

Episode description

LLMs have typically been restricted to reasoning in the "language space," where chain-of-thought (CoT) is used to solve complex reasoning problems. But a new paper argues that language space may not always be the best medium for reasoning. In this paper read, we cover an exciting new technique from a team at Meta called Chain of Continuous Thought, also known as "Coconut." The paper, "Training Large Language Models to Reason in a Continuous Latent Space," explores the potential of allowing LLMs to reason in an unrestricted latent space instead of being constrained by natural language tokens.

Read a full breakdown of Coconut on our blog, or join us live for the next paper reading.

Learn more about AI observability and evaluation, join the Arize AI Slack community, or get the latest on LinkedIn and X.
