Mechanistic Interpretability for AI Alignment | Callum McDougall, Joseph Bloom | EAGxBerlin 2023

Nov 11, 2023 | 51 min

Episode description

This talk covers the fundamentals of mechanistic interpretability: what it is, why it might be impactful for alignment, and how you can get involved. It is most useful for people who are new to AI safety or who have limited knowledge of mechanistic interpretability. The speakers hope you gain insight into what interpretability research can look like and whether it is a good fit for you. They also discuss some of the other work being done in technical alignment research.


