[HUMAN VOICE] "Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training" by evhub et al

Jan 20, 2024 · 9 min

Episode description

This is a linkpost for https://arxiv.org/abs/2401.05566

Support ongoing human narrations of LessWrong's curated posts:
www.patreon.com/LWCurated

Source:
https://www.lesswrong.com/posts/ZAsJv7xijKTfZkMtr/sleeper-agents-training-deceptive-llms-that-persist-through

Narrated for LessWrong by Perrin Walker.

Share feedback on this narration.

[Curated Post]
[125+ Karma Post]
