LLM In-Context Learning as Kernel Regression

May 23, 2025 · 13 min

Episode description

This academic paper investigates the mechanism behind in-context learning (ICL) in large language models (LLMs). The authors propose a theoretical analysis suggesting that ICL can be understood as kernel regression: the model uses the input-output examples in the prompt to make predictions on new inputs. Through analysis of attention patterns and experiments across different tasks, the study provides evidence that LLMs allocate significant attention to the demonstration samples, particularly their labels, and that internal key and value vectors store the information needed for this process. While the kernel regression framework explains phenomena such as the importance of example similarity and output format, the paper acknowledges that certain aspects of ICL, such as sensitivity to sample order, remain unexplained by this framework.
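To make the kernel regression reading concrete, here is a minimal sketch of a Nadaraya-Watson estimator over prompt demonstrations. It is an illustration under assumptions, not the paper's construction: the function name `kernel_regression_predict`, the toy 2-D "embeddings", the one-hot labels, and the exponential dot-product kernel with a `temperature` parameter are all hypothetical choices made here. The kernel was picked because it has the same form as softmax attention weights.

```python
import math

def kernel_regression_predict(demo_inputs, demo_labels, query, temperature=1.0):
    """Nadaraya-Watson estimate: a kernel-weighted average of demonstration labels.

    demo_inputs: list of feature vectors (assumed embeddings of the demo inputs)
    demo_labels: list of one-hot label vectors, one per demonstration
    query:       feature vector of the new input
    """
    # Similarity of the query to each demonstration (dot product, scaled).
    scores = [sum(q * x for q, x in zip(query, xs)) / temperature
              for xs in demo_inputs]
    # Exponential kernel, normalized so the weights sum to 1.
    # Subtracting the max score keeps exp() numerically stable.
    max_score = max(scores)
    weights = [math.exp(s - max_score) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Prediction is the weighted average of the demonstration labels.
    num_classes = len(demo_labels[0])
    prediction = [sum(w * y[c] for w, y in zip(weights, demo_labels))
                  for c in range(num_classes)]
    return prediction, weights

# Toy data (hypothetical): two demos of class 0, one demo of class 1.
demo_inputs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
demo_labels = [[1, 0], [1, 0], [0, 1]]
query = [1.0, 0.0]

prediction, weights = kernel_regression_predict(demo_inputs, demo_labels, query)
print("weights over demonstrations:", weights)
print("predicted label distribution:", prediction)
```

In this sketch, demonstrations most similar to the query receive the largest weights, so their labels dominate the prediction, which mirrors the description's point about example similarity. Because the estimate is a weighted sum, it does not depend on the order of the demonstrations, which is consistent with the description's note that order sensitivity falls outside the framework.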