Physics in Next-token Prediction

Nov 05, 2024•19 min•Ep. 12

Episode description

🤗 Paper Upvotes: 7 | cs.LG, cs.AI

Authors:
Hongjun An, Yiliang Song, Xuelong Li

Title:
Physics in Next-token Prediction

Arxiv:
http://arxiv.org/abs/2411.00660v1

Abstract:
We discovered the underlying physics in Next-token Prediction (NTP). We identified the law of information conservation within NTP and proposed the First Law of Information Capacity (IC-1), demonstrating that the essence of intelligence emergence in auto-regressive models is fundamentally a process of information transfer. We also introduced Landauer's Principle into NTP, formulating the Second Law of Information Capacity (IC-2), which establishes the relationship between auto-regressive model training and energy consumption. Additionally, we presented several corollaries, which hold practical significance for production practices. Finally, we validated the compatibility and complementarity of our findings with existing theories.
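For listeners unfamiliar with Landauer's Principle, which the abstract cites as the basis for IC-2: it sets a lower bound of k_B · T · ln 2 joules on the energy needed to erase one bit of information at temperature T. The sketch below computes only that textbook bound. It is not the paper's IC-2 formula, which the abstract does not state, and the function name and the 300 K default temperature are assumptions chosen for illustration.

```python
import math

# Boltzmann constant in J/K (exact value in the 2019 SI definition).
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_limit_joules(bits_erased: float, temperature_kelvin: float = 300.0) -> float:
    """Minimum energy (in joules) to erase `bits_erased` bits at the given temperature.

    Implements the standard Landauer bound E = N * k_B * T * ln(2).
    The function name and default temperature are illustrative assumptions.
    """
    if bits_erased < 0:
        raise ValueError("bits_erased must be non-negative")
    if temperature_kelvin <= 0:
        raise ValueError("temperature_kelvin must be positive")
    return bits_erased * BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    # Energy floor for erasing one bit, and one gigabyte, at roughly room temperature.
    print(landauer_limit_joules(1))
    print(landauer_limit_joules(8e9))
```

The bound scales linearly with both the number of bits and the temperature, so any real hardware figure sits many orders of magnitude above it.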
