This is section 2.2.3 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379 Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-pow...
Nov 16, 2023•17 min
This is section 2.2.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379 Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-pow...
Nov 16, 2023•21 min
This is section 2.2.1 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379 Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-pow...
Nov 16, 2023•12 min
This is section 2.1 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379 Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power...
Nov 16, 2023•9 min
This is section 1.5 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379 Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power...
Nov 16, 2023•7 min
This is sections 1.3-1.4 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379 Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-...
Nov 16, 2023•31 min
This is section 1.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379 Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power...
Nov 16, 2023•11 min
This is section 1.1 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379 Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power...
Nov 16, 2023•18 min
This is the full audio for my report "Scheming AIs: Will AIs fake alignment during training in order to get power?" (I’m also posting audio for individual sections of the report on this podcast, but the ordering was getting messed up on various podcast apps, and I think some people might want one big audio file regardless, so here it is. I’m going to be posting the individual sections one by one, in the right order, over the coming days.) Full text of the report here: https://arxiv.org/abs/2311...
Nov 15, 2023•6 hr 13 min
This is a recording of the introductory section of my report "Scheming AIs: Will AIs fake alignment during training in order to get power?". This section includes a summary of the full report. The summary covers most of the main points and technical terminology, and I'm hoping that it will provide much of the context necessary to understand individual sections of the report on their own. (Note: the text of the report itself may not be public by the time this episode goes live.)
Nov 14, 2023•57 min
"It was, she said, a great discovery, albeit my real life."
Oct 15, 2023•21 min
Contra some meta-ethical views, you can't forever aim to approximate the self you would become in idealized conditions. You have to actively create yourself, often in the here and now. Originally published in 2021. Text version here: https://joecarlsmith.com/2021/06/21/on-the-limits-of-idealized-values
May 12, 2023•1 hr
How worried about AI risk will we feel in the future, when we can see advanced machine intelligence up close? We should worry accordingly now. Text version here: https://joecarlsmith.com/2023/05/08/predictable-updating-about-ai-risk
May 08, 2023•1 hr 3 min
A shorter version of my report on existential risk from power-seeking AI. Forthcoming in an essay collection from Oxford University Press. Text version here: https://jc.gatspress.com/pdf/existential_risk_and_powerseeking_ai.pdf
Mar 19, 2023•55 min
Is everything holy? Can reality, in itself, be worthy of reverence? Text version here: https://joecarlsmith.com/2021/04/19/problems-of-evil
Mar 05, 2023•36 min
On looking out of your own eyes. Text version at joecarlsmith.com.
Feb 17, 2023•52 min
Who needs a system if you're free? Text version at https://joecarlsmith.com/2023/02/16/why-should-ethical-anti-realists-do-ethics
Feb 16, 2023•53 min
Audio version of my report on existential risk from power-seeking AI. Text here: https://arxiv.org/pdf/2206.13353.pdf. Narration by Type III audio.
Jan 25, 2023•3 hr 21 min
Nearby is the country they call life. Text version at: https://joecarlsmith.com/2022/12/23/on-sincerity
Dec 23, 2022•1 hr 35 min
Can the epistemology of consciousness save moral realism and redeem experience machines? No.
Dec 01, 2022•1 hr 2 min
If you find a button that gives you a hundred dollars if a certain controversial meta-ethical view is true, but you and your family get burned alive if that view is false, should you press the button? No. Text version here. Edited for Joe Carlsmith by TYPE III AUDIO.
Oct 09, 2022•43 min
Infinities puncture the dream of a simple, bullet-biting utilitarianism. But they're everyone's problem. Text version here. Edited for Joe Carlsmith by TYPE III AUDIO.
Oct 05, 2022•1 hr 25 min
Life in the future could be profoundly good. I think this is an extremely important fact, and one that often goes under-estimated. Text version here. Edited for Joe Carlsmith by TYPE III AUDIO.
Oct 05, 2022•29 min
Making happy people is good. Just ask the golden rule. Text version here. Edited for Joe Carlsmith by TYPE III AUDIO.
Oct 05, 2022•23 min
I find imagining future people looking back on present-day longtermism (the view that positively influencing the long-term future should be a key moral priority) a helpful intuition pump, especially re: a certain kind of “holy sh**” reaction to existential risk, and to the possible size and quality of the future at stake. Text version here. Edited for Joe Carlsmith by TYPE III AUDIO ....
Oct 05, 2022•25 min
Sometimes, you can “control” events you have no causal interaction with (for example, if you're a deterministic software twin). Text version here. Edited for Joe Carlsmith by TYPE III AUDIO.
Oct 05, 2022•1 hr 17 min
If you kill something, look it in the eyes as you do. Text version here. Edited for Joe Carlsmith by TYPE III AUDIO.
Oct 05, 2022•15 min
How can "non-attachment" be compatible with care? We need to distinguish between caring and clinging. Text version here. Edited for Joe Carlsmith by TYPE III AUDIO.
Oct 05, 2022•18 min
You can't keep any of it. The only thing to do is to give it away on purpose. Text version here. Edited for Joe Carlsmith by TYPE III AUDIO.
Oct 05, 2022•13 min