Ep 255: Does this research explain how LLMs work?

Jan 14, 2026 | 1 hr 23 min | Ep. 256

Episode description

I take a look at these three papers, collectively titled "The Bayesian Attention Trilogy":

1. https://www.arxiv.org/abs/2512.22471
2. https://arxiv.org/abs/2512.23752
3. https://arxiv.org/abs/2512.22473

I also draw on some other material, in particular an interview with one of the authors, Vishal Misra: https://www.engineering.columbia.edu/faculty-staff/directory/vishal-misra

For those familiar with my output on this, you can probably skip to about halfway through, at 42:40. Before that point is a lot of background on Induction, Bayesianism, Critical Rationalism and so on that people may have heard from me before in different contexts, although for what it's worth these are new ways of expressing those ideas.

At the end I react to a video found here: https://www.youtube.com/watch?v=uRuY0ozEm3Q
