[HUMAN VOICE] "A case for AI alignment being difficult" by jessicata

Jan 02, 2024 | 29 min

Episode description

This is a linkpost for https://unstableontology.com/2023/12/31/a-case-for-ai-alignment-being-difficult/

Support ongoing human narrations of LessWrong's curated posts:
www.patreon.com/LWCurated

This is an attempt to distill a model of AGI alignment that I have gained primarily from thinkers such as Eliezer Yudkowsky (and to a lesser extent Paul Christiano), but explained in my own terms rather than attempting to hew close to these thinkers. I think I would be pretty good at passing an ideological Turing test for Eliezer Yudkowsky on AGI alignment difficulty (but not AGI timelines), though that is not what I'm doing in this post. It's more like finding a branch in the possibility space, as I see it, that is close enough to Yudkowsky's model that it's possible to talk in the same language.

Even if the problem turns out to not be very difficult, it's helpful to have a model of why one might think it is difficult, so as to identify weaknesses in the case and thereby find AI designs that avoid the main difficulties. Progress on problems can be made by a combination of finding possible paths and finding impossibility results or difficulty arguments.

Most of what I say should not be taken as a statement on AGI timelines. Some problems that make alignment difficult, such as ontology identification, also make creating capable AGI difficult to some extent.

Source:
https://www.lesswrong.com/posts/wnkGXcAq4DCgY8HqA/a-case-for-ai-alignment-being-difficult

Narrated for LessWrong by Perrin Walker.

Share feedback on this narration.

[Curated Post]
