AI Safety: Constitutional AI vs Human Feedback

Jun 17, 2024 · 17 min · Season 1 · Ep. 27

Episode description

With great power comes great responsibility. How do leading AI companies implement safety and ethics as language models scale? OpenAI combines its Model Spec with RLHF (Reinforcement Learning from Human Feedback), while Anthropic uses Constitutional AI. This solo episode on AI alignment compares the two technical approaches to maximizing usefulness while minimizing harm.

REFERENCES

OpenAI Model Spec

https://cdn.openai.com/spec/model-spec-2024-05-08.html#overview

Anthropic Constitutional AI

https://www.anthropic.com/news/claudes-constitution



To stay in touch, sign up for our newsletter at https://www.superprompt.fm
