LLM In-Context Learning as Kernel Regression

Best AI papers explained

May 23, 2025 • 13 min

Episode description

This paper investigates the mechanism behind in-context learning (ICL) in large language models (LLMs). The authors present a theoretical analysis suggesting that ICL can be understood as kernel regression: the model uses the input-output examples in the prompt to make predictions on new inputs. Through analysis of attention patterns and experiments across different tasks, the study provides evidence that LLMs allocate significant attention to the demonstration samples, particularly their labels, and that internal key and value vectors store the information this process relies on. The kernel regression framework explains phenomena such as the importance of example similarity and output format, but the paper acknowledges that certain aspects of ICL, such as sensitivity to sample order, remain unexplained by the framework.
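To make the kernel regression reading concrete, here is a minimal sketch, not the paper's implementation. It assumes a Nadaraya-Watson style estimator in which each demonstration's label is weighted by a kernel score between the demonstration input and the query. The exponential dot-product kernel, the `temperature` parameter, the toy feature vectors, and the function names are all illustrative assumptions; the episode description does not specify the kernel the authors use.

```python
import math


def kernel(u, v, temperature=1.0):
    """Exponential dot-product kernel (an assumption, chosen because it
    resembles an unnormalized attention score)."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.exp(dot / temperature)


def kernel_regression_predict(demos, query, num_labels, temperature=1.0):
    """Nadaraya-Watson estimate of the label distribution for `query`.

    demos: list of (feature_vector, label_index) pairs standing in for the
           input-output examples placed in the prompt.
    Returns a list of `num_labels` weights that sums to 1.
    """
    weights = [kernel(x, query, temperature) for x, _ in demos]
    total = sum(weights)
    probs = [0.0] * num_labels
    for w, (_, y) in zip(weights, demos):
        # Each demonstration votes for its own label, scaled by similarity.
        probs[y] += w / total
    return probs


# Toy demonstrations: two examples of label 0, one of label 1.
demos = [([1.0, 0.0], 0), ([0.9, 0.1], 0), ([0.0, 1.0], 1)]
query = [0.1, 0.9]  # closest to the label-1 demonstration

probs = kernel_regression_predict(demos, query, num_labels=2, temperature=0.1)
predicted_label = max(range(len(probs)), key=lambda k: probs[k])
print(predicted_label, probs)
```

In this sketch the single demonstration most similar to the query dominates the prediction even though it is outnumbered, which mirrors the description's point that example similarity matters. The estimator is a plain weighted sum, so it is also invariant to the order of `demos`, consistent with the observation that sensitivity to sample order is not captured by this framework.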
