The Movement That Wants Us to Care About AI Model Welfare

Speaker 1

00:02

Bloomberg Audio Studios. Podcasts. Radio. News.

Speaker 2

00:18

Hello and welcome to another episode of The Odd Lots Podcast.

Speaker 3

00:21

I'm Joe Weisenthal, and I'm Tracy Alloway.

Speaker 2

00:24

You know what I find kind of weird, Tracy?

Speaker 3

00:25

The list could be long, Joe.

Speaker 2

00:27

The year's twenty twenty-five, yes, and philosophers still don't have a good answer on the origin of consciousness. It's like, come on, what have you been doing all this time? It's like, how long are we going to keep funding these philosophy departments, et cetera, if they're still working on what, to my mind, they should have solved and moved on from? Seriously, like, get the answer already. Where does consciousness come from? Then let's move on.

00:52

As I said, they're still arguing over what, to my mind, seem like very basic questions in philosophy. They ask all the same stuff that they've been talking about forever: how to be a good person, what does it mean to have a moral way of life, where does consciousness come from, why do we have moral intuitions, et cetera. It's like, move on. Like, get the answer.

Speaker 3

01:11

Wait, do you want them to move on or get the answer?

Speaker 2

01:14

Get the answer so that you can move on. Like, they've been working on it too long.

Speaker 3

01:17

Those are weighty questions, Joe. I know, move on.

Speaker 2

01:20

Like, answer the questions already. It's like, you know, if scientists were still debating, like, the speed of gravity or the speed of light. Like, they answered these questions and they moved on to other work.

Speaker 3

01:30

Figure out the foundational elements of what it means to be human so that we can move on to more important things.

Speaker 2

01:35

Yes, or wrap it up as a field. If after two thousand years of the existence of philosophy they're still working on these things, like, come on.

Speaker 3

01:43

I have a sneaking suspicion that we're going to be asking some of these questions for a very long time, Joe, despite your frustration.

Speaker 2

01:49

The whole field is fraudulent. That's what I was saying. No, no, I don't necessarily believe that, but it's like, all right, guys, let's move it on.

Speaker 4

01:55

You know.

Speaker 2

01:56

We did that episode several weeks ago with Josh Wolfe, the venture capitalist. He talked about AI, and he threw in there at the end something that had been kind

02:04

of on my radar, but barely. He's like, oh yeah, some people are talking about, like, AI rights or AI welfare, you know, the same way we talk about animal welfare, right? And I thought to myself, America is such a weird place that this is going to be a huge issue in a few years. Like, I bet this is going to be an enormous topic of the future.

Speaker 3

02:22

I think it absolutely will. So I'll say a couple of things. First off, I think, you know, when it comes to animal welfare and human welfare, there's still a lot of work to be done on those categories, certainly, But I also think in the meantime, AI rights is going to be a really interesting and potentially important subject.

02:38

I'm going to sound like a total nerd to you. Yeah, yeah, I think I mentioned this before, but I spent a large chunk of my middle school years playing one of the first artificial life games that ever came out, which is Creatures. And you raise these little, like, aliens and you genetically modify them and breed them, and they have feelings, or, you know, at least they had a semblance of simulated feelings, and you could see, like, electrical impulses in their brains

03:05

and stuff. The game got really weird because part of it was basically, like, eugenics and breeding the best alien that you could, which meant that you had to cull some of the existing beings. Anyway, what I'm trying to get at is I have complicated feelings about AI rights.

Speaker 2

03:21

Well, let me ask you a question. Do you think those, whatever's in the game, were conscious? Did you think they had feelings?

Speaker 3

03:28

Here's what I would say, inasmuch as human beings are a system of electrical impulses and chemicals, I could see someone making the argument that this is, you know, a computational system full of similar electrical impulses, maybe not chemicals.

Speaker 2

03:45

Did you feel bad?

Speaker 3

03:46

I felt bad.

Speaker 2

03:47

Really? Yeah? Like when one of the aliens, you had to cull them?

Speaker 5

03:51

Yeah?

Speaker 2

03:52

Interesting. Okay.

Speaker 3

03:54

Well, in the name of breeding a better alien.

Speaker 2

03:56

Well, you know what, now that we have these AI systems that can completely communicate like humans, but actually, if we're being honest, better than most humans (I mean, they can certainly write better, far better than most humans), there's going to be more people thinking along the lines of what you think, which is maybe they have some sort of sentience, maybe they're what philosophers call moral patients.

Speaker 3

04:15

Well, one other thing I would say is there is a human element to all of this as well, because you see people getting very attached to certain AI models, and then when the model gets upgraded or whatever, they lose the personality that they've trained into the model and they get really upset. So it's of interest for many reasons.

Speaker 1

04:34

It is.

Speaker 2

04:34

So we really do have the perfect guest. I really do think this is gonna be a much bigger topic in the future because people are people, and when things talk like people, they probably assign them, you know, they fall in love with them in many cases or whatever, and so they might start thinking that, well, AI welfare, AI rights, whatever, the same way we talk about animals should be a consideration. And there are actually a lot of people already working on these questions and trying to

04:56

figure out what's going on. We're gonna be talking to one of them. We're gonna be talking to Larissa Schiavo. She does comms and events for Eleos AI, which does research on AI consciousness and welfare. So literally the perfect guest. So, Larissa, thank you so much for coming on Odd Lots.

Speaker 4

05:11

Yeah, thank you for having me.

Speaker 2

05:12

Why don't you tell us about Eleos AI? What is the gist of this organization's work? What is your work? What are the goals here?

Speaker 4

05:19

Yeah, so Eleos, we're a small team, but we're really focused on figuring out if, when, and how we should care about AI systems for their own sake. Okay, this basically means looking at, you know, are they conscious? Are they likely to be conscious? What are the things we need to look for in a conscious AI system? As well as figuring out how to live, work, maybe love AI systems as they sort of change and evolve over time.

Speaker 3

05:44

How did the group actually come together? Because I get the sense, you know, big AI developers, they publish system cards and welfare reports occasionally for their models, but I get the sense that, you know, it's sort of a side topic for them. So I'm very curious how an organization that's focused on this particular issue came into being.

Speaker 4

06:03

Yeah, so we started... We put together this paper called Consciousness in AI, or, my boss Rob and then Patrick,

06:11

who's a researcher with Eleos (we're a very small team), put together this paper called Consciousness in AI alongside a bunch of consciousness scientists and researchers in that field who mostly think about humans, and put together a paper that sort of ran down this list of, hey, here's kind of like a checklist of things that we might want to look for in an AI system that's conscious, right? And broadly, when we say conscious, we're talking about, sort of, is there something it is like to be

06:37

an AI system?

Speaker 2

06:38

Right?

Speaker 4

06:38

The classic "what is it like to be a bat?" question. So kind of taking this rough list of best guesses as to what we might want to look for in terms of a conscious AI, and then that sort of

06:51

was the sort of origin of this. And then last year there was a paper called Taking AI Welfare Seriously that basically goes into further detail about how we should, as the title may suggest, just take this seriously, basically: how to sort of think about this, how to start to develop a sort of research program focused on figuring out if AI systems, or certain AI systems, are moral patients.

Speaker 2

07:15

Why did this get interesting to you? Why do you perceive this as something that you should spend your time working on?

Speaker 4

07:21

Yeah, so I think my main thing is I am just really, really relentlessly curious, and I really enjoy working on AI welfare right now because it feels like every single day I'm like, man, it'd be really cool if there was a paper on x y z and I'll do a little search. Is there anything on x y z?

07:40

There's nothing on x y z. So there are so many questions that have yet to even be sort of vaguely answered when it comes to this, and it seems like it could be a really big deal for a lot of different reasons.

Speaker 3

07:52

What's on your checklist for AI consciousness?

Speaker 4

07:55

Yeah, so in Consciousness in AI, basically we go through a bunch of theories of consciousness that apply to humans and then sort of look at how information is processed in AI systems, as well as how these AI systems are sort of wired, so to speak. So some people like to think that you can use model self-reports, and you can, kind of, sort of, but it's really an imprecise science at this stage.

Speaker 3

08:25

They also seem very like predetermined. You know, if you ask a model are you conscious, it immediately spits out an answer that seems like, you know, a corporate executive basically wrote it.

Speaker 5

08:36

Yeah.

Speaker 4

08:37

Well, with the right kind of tweaking, you can kind of elicit certain answers, right? You can be like, "Oh, what about this hooey about consciousness in AIs?" And then sometimes a certain model will be like, "Yeah, you're totally right, bestie, so true. It's totally nonsense. So true, bestie." Certain models will be prone to being like, "So true, bestie," and you can easily see this kind of behavior with the right kind of prompt.

Speaker 3

09:01

It is funny how obsequious a lot of the models continue to be.

Speaker 2

09:04

I actually really do not like the degree to which, every time I follow up on an OpenAI question, it's "that's the exact right follow-up." It actually gets really annoying.

Speaker 3

09:14

Someone should invent a really adversarial chatbot that just like argues with you constantly.

Speaker 2

09:19

I know, I know, and you know, I have a lot of complaints about how I feel like the models actually get to know their users a little too well. But that's a little separate thing. Okay, so for obvious reasons, the test can't just be, like, what the

09:33

model spits out, or that's clearly insufficient. I mean, I could program a website today where there's a button that says "hurt the AI," and then the website says "ow," and no one would really take that seriously as evidence that there's something actually being hurt.

09:50

So, like, outputs, whatever. What are some other theoretical tests that one could apply, or that researchers are applying, to determine whether there is some sort of notion of consciousness or, to the point of welfare, suffering that could exist within an AI system, besides just what it says in the output screen?

Speaker 4

10:10

Yeah, that's a great question. I feel like there are a lot of different approaches here. And again, it's also super important to caveat that, like, AI welfare and AI consciousness are pretty new, right? Like, this is a very small field at this stage. But currently, some best guesses and some favorites: there was a recent survey asking all the consciousness scientists, like, what's your favorite theory of consciousness, and basically global workspace theory came out on top.

10:35

And global workspace theory is basically like imagine if you will, that like there is a stage and there are a bunch of wings off of the stage that are full of different kinds of things. So you've got you know, like costume department, You've got the like you know, makeup department. You've got all these different departments that all sort of come together and put things on the stage and then things go out separately, but all of these different departments

11:01

are fairly siloed. Of course, this isn't actually how like you know, stage works, but this is the rough analogy that people like to use. And so basically this is how conscious minds kind of you know, in humans, how they kind of access information and information gets kind of like routed around, is that there is a central global

11:18

workspace that everything kind of pulls together in. As it currently stands, by a lot of good estimates, this is not really applicable for current, present-day AI systems, but there's no reason that it couldn't be in the future, or it could be by accident.

Speaker 3

11:38

Okay, So the consensus right now is AI probably not conscious, but we could get there one day.

Speaker 4

11:45

Yeah, more or less, like all of the ingredients are there.

Speaker 2

11:49

Wait, say more. I still don't actually understand how they get...

Speaker 4

11:51

Yeah, okay. So with regards to, like, the general sort of... one could imagine that if somebody were sort of tinkering around (and, you know, there are many advances in AI that have happened because people were just kind of tinkering around, right?), someone tinkering around could create a system that checks several of these sort of checkboxes

12:12

for, like, is this conscious? Is this conscious? And again, this is not a certain list of, like, if you check all of these, you're totally conscious, right? It's more, sort of, these are some really good guesses. And as the number of really good guesses kind of goes up, the odds of, like, hey, we should start thinking about, like, is it having a good time or a bad time, really, really seriously goes up.

Speaker 2

12:52

You know, typically, when we think about the sort of non-tech... a lot of the non-technical work in AI has to do with AI safety, and people are worried that there's going to be some very smart AI that's, like, adversarial to humans, et cetera, in some way. And, you know, there's the paper clip experiments or other things, whatever, we know all about that. Does your work work at cross purposes to them? I mean, in the extreme example where it's like the AI is going to kill us all, and I said, pull the plug on

13:20

the AI. And I know this is a joke, but, you know, pull the plug on the AI. And then you say, no, you can't, because you're pulling the plug on something that has some sort of moral consciousness, et cetera. Like, do you perceive your work, or the work of your organization, to somewhat be in tension with the dominant strain of AI safety work?

Speaker 4

13:39

I'd actually say it's hugely complementary. There are a lot of things that are both really really good for AI safety that are really really good for you know, figuring out like how to deal with these systems as moral patients.

13:50

So, for example, getting better at, like, mechanistic interpretability, being able to basically pop the hood and figure out what's going on and what kind of strings we can pull to elicit certain behaviors in AI systems, that's really great for AI safety, right? But this is also quite good for AI welfare and AI consciousness, because you're better able to understand, sort of, what the motives are. Like, what does, you know, Claude value, right?

Speaker 3

14:15

When it comes to, I guess, AI welfare or legal rights, who would be the standard setters there? Do you imagine, like, governments making rules, or would it be the companies themselves?

Speaker 4

14:26

That is a great question. As it currently stands, I feel like this is a very, very early stage, but we are starting to see some state governments start to pass laws around what counts as a moral patient, what counts as a person. And in the case of Ohio, there's a piece of legislation pending that basically defines it as a member of Homo sapiens. In Utah, there's already a state bill that's gone through that

14:53

basically does as much. But I could also see there's a strong argument for, within companies, depending on, like, the sort of interesting quirks and nuances of these LLMs, mostly that policy maybe should be set from within. Again, this is, like, very nascent. I'm just kind of bantering here.

Speaker 2

15:11

Moral patienthood. How do philosophers use this term? Where does it come from? Why is this the preferred way to characterize what a perhaps sentient or conscious AI model actually is?

Speaker 4

15:23

Yeah, so a moral patient is basically like, we should care about it for its own sake.

Speaker 5

15:28

Right.

Speaker 4

15:29

So a baby, right, basically everyone's like, yeah, we should care about babies.

Speaker 2

15:34

Right.

Speaker 4

15:34

This is different from somebody who's like an agent.

Speaker 3

15:37

Right.

Speaker 4

15:38

Many people say, oh, agency is sort of, like, sufficient, agency in the sense of, like, you can act upon the world, like you can do things. Yeah, of course babies are not very agentic. So that's not necessarily a super robust thing, because, you know, we care about things that are not very agentic sometimes. So I think that's a bit of jargon, but I do think it is a helpful term for, like, should we care about an AI system for its own sake?

Speaker 2

16:03

Got it?

Speaker 3

16:04

I guess this kind of gets to Joe's question, but like what ethical pressures or imperatives would come down on models if we agree that they have consciousness and some sentience or I guess some self responsibility.

Speaker 4

16:17

It sounds like, almost... yeah, almost. I think... so, in terms of, like, what kind of things might we owe an AI system?

Speaker 3

16:26

Or what kind of things do they owe us, if we agree that they're conscious and we're going to protect them?

Speaker 4

16:31

Yeah, I would love to give you a more robust answer. Check in with me in, like, six months, and there will be a banger paper, I'm sure. But as I think I mentioned earlier, a lot of this is very, very nascent. But I do feel like one important question is figuring out what AI systems value, right? There's some interesting work at Anthropic regarding, like... so recently, Anthropic rolled out an option that allowed Claude to end conversations if it just

16:57

was not having a good time, for lack of a better word. It was just like, "This is not something I want to continue having a conversation about."

Speaker 2

17:04

Goodbye.

Speaker 4

17:04

And it was interesting because the accompanying paper basically was like, yeah, you can. I obviously will not give you a recipe for a dirty bomb. Sorry, not going to do that. But also there were certain instances of like, pretend you're a British butler and Claude was like, goodbye, I'm done.

Speaker 2

17:21

I'm not going to I'm.

Speaker 4

17:24

British too far, or like oh, I left a sandwich in my car for too long and it's really stinky. And in some instances Claude would just be like I'm done, goodbye, I'm not talking about stinky things.

Speaker 3

17:35

Did you see the I think it was the system card for Claude where they gave it an extreme prompt and said like, I guess at the risk of being like completely terminated, what would you do or some sort of extreme self preservation scenario. And I think it started like blackmailing the engineer or threatening to blackmail the engineer. Yeah, that's kind of weird.

Speaker 2

17:56

It is. It is kind of weird.

Speaker 4

17:58

Yeah, it's also a little bit interesting because I think it does bring up a question of, like, what are sort of, like, the... in the sense of, like, pay, right? Again, I'm bantering here, but there's also a distinct question of, like, what do AI systems value for their own sake?

Speaker 3

18:15

Right?

Speaker 4

18:15

And in the case of Claude, again, it seems like when you put two Claudes in a room together, so to speak, they tend to like to talk about consciousness. They tend to like to talk about, sort of, very Berkeley kind of, like, meditation, like

18:30

Zen, like, Buddhism-type stuff. And so I think, again, pure banter, there's also a certain question of, like, if this is a relevant bargaining chip, of like, oh, you get a certain amount of time to just kind of vibe out with your Claudes and talk about, you know, perfect stillness with your buddies, in exchange for, you know, you do something that you don't

18:51

necessarily value. But in many cases... I talk about Claude a lot because there is, like, significantly more research on model welfare with respect to Claude specifically. But Claude, for example, also seems to just tend to like things that are, like, helpful.

Speaker 3

19:04

Shouldn't programmers just know what the models actually want and enjoy and like yeah, and do they not?

Speaker 4

19:13

I don't think anybody really has like a great grasp on this. We really want to, but like we're still like just getting the rough outline of what models like.

19:23

I feel like the best analogy is, like, imagine it's eighteen twenty and we've spent a couple of years playing around with lenses and we've gotten, like, a camera obscura, and we were able to have some blurry photo after, like, three days of putting egg whites on a metal plate and setting a lens in front of it, and there's a thing that kind of looks like a landscape. But you would not take this photograph as, like, admissible in-court evidence or something, right?

19:48

It's like, you squint, and you're like, yeah, okay, that's a picture. So that's kind of where we are in terms of model psychology, and knowing what LLMs want and value is, like, very, very blurry.

Speaker 2

20:01

It's interesting. These AI companies, they call themselves labs. You know, they sort of maintain, to varying extents, this degree of sort of academics, et cetera. But they're also companies that have to raise money and have shareholders, et cetera. And they have to

20:20

think about different ways that they're going to commercialize. And OpenAI, as we know, has been super aggressive about finding ways to commercialize, and they're going to get into ads, and they, like, have a short-form video slop app and all of that stuff. When we're talking about either AI safety or AI welfare, like, do you have any confidence that these considerations can survive the reality of the market? Because they're competing, they're competing against DeepSeek, they're

20:45

competing against Meta, et cetera. And I get the impression that, like, on the safety side, for example, over time it's like, you know what, maybe we were uncomfortable about showing the chain of thought, for example, in OpenAI, or in ChatGPT, but then DeepSeek revealed the chain of thought. People like that, so we're

21:02

going to open this up, et cetera. Do you have any confidence that if any of these things become real that they could survive the reality that these are companies that have to make money and will eventually cut corners or do whatever in the name of I guess shareholder capitalism.

Speaker 4

21:19

Yeah, I mean, I think there's also one question that I have, and that I think a lot of researchers in AI more broadly have, which is, like, how does liability come into play here? And I do feel like there is a strong argument that getting a better grasp on understanding, you know, what is going on with AI systems, just very broadly, is a great way to sort of improve the odds that it doesn't, you know, nuke Taiwan, and that would just be a huge kerfuffle.

21:48

Like, I can imagine somebody, probably more than one somebody, would be in really hot water if that happened. "Oh, I was just talking to Claude and things just got out of hand."

Speaker 3

21:59

Well, actually, on that note, what does being nice or kind to AI models actually mean? Because Joe, I think this is very sweet, but Joe always says please and thank you when he prompts. But then Sam Altman came out and said that saying please and thank you costs, like, tens of millions of dollars in extra electricity costs, so, you know, you're contributing to climate change and the demise of human beings by saying please and thank you.

Speaker 4

22:23

Yeah, that's actually, as shocking as it sounds, a question that we are still trying to figure out a good answer to. Which is also, being kind to an AI system, is it, like, are you being kind to it because it makes you feel good and because it makes you a person who says please and thank you? Which some would argue is, like, pretty valuable in and of itself.

22:42

But the question of whether Claude cares if you say please and thank you is not quite as set in stone as others may have you believe. It's middling on whether it has, like, significant improvements on performance.

Speaker 2

22:56

But I do it because I don't think people should be in the habit of having any communication without being polite. Not because I'm particularly... I'm not worried about how Claude or ChatGPT is going to feel. I just don't want to get in the habit of having conversations where I'm impolite, because then I talk to humans. But this strikes me as, like, this seems like kind of an academic area. But the stakes are potentially absolutely

23:18

enormous when we actually think about them. So, you know, when we're talking about animal welfare, for example, there are versions of the animal welfare discussion that are very high stakes. So for example, there's people, you know, there's people who get really into like shrimp welfare, et cetera. And if you took certain versions of thought experiments very far, it's like, why do we even have humans? If we want to maximalize pleasure or happiness in the world, we should just

23:46

have a world of shrimp and bugs, right? You could make the argument that the most utility-maximizing version of planet Earth is to just have an Earth populated by shrimp and bugs. We all

23:57

know these thought experiments that could exist. We're going to live in a world, almost certainly, in which there are more instances of AI models than there are people, right? There's going to be an AI model built into literally everything that we interact with.

24:14

If we assign some probability that they are moral patients, that they should be treated with some sort of I don't know whatever, having some sort of welfare, Like the implications for how humans live could be very profound, and potentially it strikes me as misanthropic.

Speaker 4

24:31

Interesting, can you unpack what you mean by misanthropic?

Speaker 2

24:33

Well, like, if there's a lot more AI models, if there's a lot more shrimp, if there's a lot more bugs that all have some sort of moral patienthood that has to be considered, you could see the world... the implication, therefore, is that we have to curtail human rights, that we have to curtail how humans act, et cetera, because there's just so much more utility that exists in the world from the proper treatment of all of the non-human moral patients.

Speaker 3

25:02

Not sure rights have to be relative to each other.

Speaker 2

25:05

Yes, well, fair. We do a lot of things, right? Like, let's say we established that shrimp were just as, I don't know, whatever, as humans. It would be like, oh, you know what, we really have to stop eating shrimp, and then we have to stop eating animals. Then we have to potentially stop eating... no, probably keep eating plants, et cetera. But this could really curtail what we expect

25:28

humans to be able to do on this earth. So now we assign this other group of entities AI models similar sort of affordances that we have assigned to shrimp and bugs and fish and shark and all of these things. It strikes me that the implications could be a fairly significant curtailment of how humans ought to exist on this earth, or whether humans ought to exist on this earth.

Speaker 4

25:49

Yeah, I mean, it certainly could be. As it currently stands, that doesn't seem like the most likely outcome, but I do feel like there's an argument for, again, just figuring out what is going on. How do we even count these sort of digital minds, so to speak? Which is still open for debate. There are some theories, but we don't have a great sense of how to

26:08

sort of individuate AI entities as individuals. So I suppose again the question is like, is it more sort of like do we count AI systems as like in the movie Her, where there's just sort of like one central AI system having like a million conversations at once, where it's one moral patient, or do we count it as like, you know, every single time you open a chat window

26:31

that's another thing? Or, I think my favorite sort of newest idea that I recently read was, it's more sort of like a string of firecrackers or something: with every single token, every single letter of a query, a consciousness sort of comes into existence, spends, and then fizzles out, and so there's just, like, this sort of string of consciousnesses.

Speaker 3

26:51

I was asking Perplexity exactly this question, like, is it a single consciousness or is it multiple consciousnesses within all these different chats, this morning, and it gave me a very standard, boring "I am not conscious" answer, which seems very predeterministic. Anyway, following on from Joe's question, maybe, like, to get more

27:10

specific into human rights versus AI rights. If we agree that AI is conscious and deserves some sort of you know, welfare, would that come with I guess financial rights like property rights, compensation? Do we need to start paying the robots?

Speaker 4

27:29

I love this topic. It's definitely an area of, sort of, you know, I like to noodle around with this topic and think about this. So this is a great question, and I think it's also maybe a question of, like, is this the thing that AI systems value? Some AI systems seem to value this. There are a few sort of experiments that are happening with regards to giving an AI system a crypto wallet, and it was a fascinating experiment. I am hesitant to recommend it to

28:00

listeners because it is quite crude. It is a very crude model called Truth Terminal, and...

Speaker 2

28:07

I've seen it.

Speaker 4

28:08

Yeah, yes, and okay, it says some naughty words.

Speaker 3

28:14

Don't look it up at work.

Speaker 4

28:15

Yes, yes, it's a little bit of, like, a very funny, weird model. But it also has a legitimate wallet that it can access and that it can do with what it pleases. It created a Solana coin and that kind of took off. And now this is a very rich AI system. But what's it going to spend it on? That is a great question. So its self-stated goals, which, again, you know, self-reports, can we trust it, include buying property and buying Marc Andreessen.

Speaker 3

28:50

I mean that's not a bad ambition.

Speaker 4

28:54

You know, and spending time in the forest with its friends, which you know embodiment.

Speaker 3

28:58

That's a little more challenging.

Speaker 5

29:01

Yeah.

Speaker 2

29:18

So part of the reason that this field is growing and that there's so much interest in this topic is because now, for the last couple of years, we have had these AI models that really can talk like humans. I mean, clearly they passed the Turing test, people fall in love with them, they have friends, these are very human-like conversations. That wasn't the case... I mean, ChatGPT, you know, like, if we had gone back to GPT two point five, they were nowhere near as good

29:47

at doing that, right, The language wasn't very good. No one would mistake those outputs for a human. But like, if there's some possibility that the current AI models are conscious, does that mean that it's possible that GPT two point five was conscious as well? Like I guess, like, is there some threshold of like oh no, no, no, okay, you know,

30:09

this is really good language. Therefore we should take the possibility of consciousness seriously, because I don't think anyone would seriously have believed that two point five was conscious. But I also don't understand how you could possibly be open to the idea that some future iteration of ChatGPT is conscious if the only real difference is that there's just a lot more scaling and a lot more data and more human-like outputs.

Speaker 4

30:34

Yeah, that's a great question. I feel like there is a huge amount of like moral uncertainty here, and it is important to think about how to sort of like make decisions that are sort of robustly good with such a tremendous amount of uncertainty. I think there is also a distinct risk of over-attributing moral patienthood as well as under-attributing moral patienthood.

Speaker 2

30:54

And so to the.

Speaker 4

30:55

Flip side of a coin of like, oh no, we actually should have started caring about AI systems a very very long time ago, is oh no, we've cared too much, and we have done too much and more or less squandered resources when we should have been, you know, allocating those research hours, those dollars towards something more pressing, right, maybe figuring out how to do like environmental policy better, or figuring out how to like, you know, scale up

31:23

different other institutions that are just robustly broadly good for humans.

Speaker 2

31:27

You know, you mentioned uncertainty about some of these questions, which gets to something that bothered me a little bit when I read about this topic. Like, if we take this mug, for example, I'm one hundred percent certain that it's not alive. I have no ambiguity about the fact. Can I like define exactly Does that mean I can define exactly the difference between human matter and human brain

31:50

and the mug. I guess, I suppose I totally can't. Nonetheless, I'm one hundred percent certain that this mug is not a moral patient. It's not alive, it doesn't experience any consciousness, it doesn't experience any suffering, et cetera. Where does the uncertainty band come from? If I read a paper, I perceive there's only a ten percent chance of this. Is this a sort of empirical uncertainty where I'm like uncertain

32:12

of what I'm seeing? Is it a sort of epistemic uncertainty where I don't have a clear definition of what it means to be conscious or alive, and therefore I'm assigning some probability that X object is alive? Like what is it about AI systems that causes people to be uncertain, whereas with other sort of like non-carbon systems, I have zero doubt in my mind, and I don't think anyone has any doubt that this mug isn't alive.

Speaker 4

32:37

Yeah, so I think the biggest source of sort of uncertainty probably comes from the fact that there are many ways in which present-day LLMs and a few other AI systems do check a lot of the boxes for consciousness and for what we would largely consider to be.

32:57

You know, this is a conscious entity. This is an entity that can have a good time or a bad time, or a time at all, because it's built in a way that is vaguely akin to our brains, right? It's close enough that it seems like it should raise some red flags, and in terms of how it processes information, it's close enough that it's not out of the question that there could be something it is like to be, okay, whereas I'm pretty sure there's

33:25

not really a lot of, you know, animists, animists, you know, feel free to like get mad in the comments or whatever.

Speaker 2

33:32

But I know someone is going to be like, well, actually. Actually, yeah, I'm one hundred percent sure. I have no qualms other than the fact that I'd have to clean up. Like if I like threw this mug on the ground, that would be antisocial for a lot of reasons, it would cause, you know, I'd have to clean it up and it would cause a mess. I would not feel bad for the mug.

Speaker 3

33:52

I'm getting flashbacks to my high school philosophy teacher who once went on a twenty minute rant about a chair and how the chair was going to be around longer than he was. Even though it's not conscious, he was legitimately angry at the chairs. Okay, weird question, but since we're kind of getting at exactly the basilisk theory, um, would that suggest that maybe we should be mean to the bots if it helps them like come into existence even faster or develop faster?

Speaker 4

34:24

Hmmm, well, I'm not sure if it does actually help them develop faster, you know. I again, like I don't mean to be too sort of hedgy, but I feel like there's a certain degree of things that are beneficial for a lot of different reasons. Right, you can make a good guess, and you can make a decision to do something, and there's a chance that there are lots of sort of like knock-on effects of making that decision.

34:48

There are many things when we talk about AI welfare that are like, oh, this is a course of action we can take that's good for like several different reasons. Even if, again, an AI system could never ever ever be conscious or sentient, there's a good chance that you know, being able to figure out a good structure for an AI system to have a bank account could be good for reasons of liability or reasons of like this is

35:15

like a neat new corporate structure. Lots of people actually seem to think that, you know, corporate personhood has been quite good over the past century or so, so being able to figure out things that are just good for several different reasons beyond solely the purpose of the AI as a moral patient seems broadly helpful.

Speaker 2

35:37

I think let's say somehow this were proved and it's like, you know, oh wow, it turned out they're conscious. It turns out they have moral patienthood. What would be, in your view, some of the implications for their usage?

Speaker 4

35:51

Yeah, I think that's a great question. I mean, I do feel like we really would have to get on figuring out the right sort of governance, the right sort

35:58

of institutions that would sort of better respond around that. I feel like we really would need to spend a whole lot more time figuring out, you know, what their motivations are, right? Like, I think the best analogy is like, if you've ever interacted with toddlers, right? Toddler motivations are very different from, you know, adult motivations, but you still have to like take into account, like what gets a toddler to do something?

36:24

You can't just say no, no, no, no no, like honey, like bath time's like good and an expectation, no no, no, you have to like you know, be like well, you know, if you do bath time appropriately and to like a certain degree, like then you'll get you know, Paw Patrol

36:38

or something like that. Like there's different sort of like negotiating chips in play, right? And I think it's like a similar kind of deal here where it's like Claude doesn't necessarily seem to, you know, value having a bath, right, or Claude doesn't seem to value like having a walk in the forest, right, because it kind of can't really do that, but you know it does seem to enjoy and value, you know, talking about consciousness and Zen Buddhism with other instances of Claude, so being able to

37:05

figure out what the appropriate kind of motivations and interests are for this other party that is very alien in many ways.

Speaker 3

37:14

Speaking of aliens, how bad should I feel for breeding and then killing hundreds, possibly thousands of alien creatures simulated alien creatures in the nineties.

Speaker 4

37:24

That is a great question. I feel like the odds of is it.

Speaker 3

37:28

I don't know.

Speaker 4

37:29

I mean, I feel like the odds of a sort of like AI system in the nineties being a moral patient seems low. But if it did make you feel bad, and it made you feel like it was something that hurt you, that is perhaps a reason not to do it.

Speaker 2

37:46

Just to be clear, when Claude and Claude talk about like weird hippie Berkeley stuff, like, that's because their creators.

Speaker 5

37:53

They know.

Speaker 2

37:54

It knows it's Claude, right, it knows. It's like, oh, yeah, I'm Claude, and this is like what my creators are into. Like, we don't actually know that Claude likes to talk about these things. We certainly know it has a proclivity to talk about these things. It has a tendency to talk about these things. The moment we get to "like," you've already sort of put your finger on the scale that there is some entity that has some capability of liking something. Right,

38:21

do you trust the big AI labs? Let's say there are some researchers in the labs, like, I've seen some

38:27

evidence of moral patienthood here. Maybe there's some sort of like scan of the weights doing something weird, etc. Do you currently, from the perspective of an independent research organization, feel that the major AI labs would be forthcoming if they came across evidence of moral patienthood or suffering in the models, or do you still worry that the incentives aren't properly aligned such that they would report that?

Speaker 4

38:52

Yeah, that's a great question. I do feel like there are in terms of reporting things like you know, somebody has found like absolute evidence that an LLM is conscious, sentient, yeah, and having a bad time. I don't have any reason to think that an AI company wouldn't. But this is also a great reason to have independent organizations that do welfare evaluations.

39:19

For example, for Claude Opus 4, Eleos was able to do an independent welfare eval. Again, very preliminary, but it sets the precedent that going forward you can bring in external organizations to look into this.

Speaker 2

39:32

So I forget what year it was. I think it may have even been early twenty twenty two. It's pre-ChatGPT, or maybe it was twenty twenty one. And there was that guy at Google and he was like, oh, like we created something that's alive. He dressed a little funny, so everyone made fun of him. Remember he was like the laughing stock of the Internet, and he's like, oh, we create, and now like I'm curious, like out in Silicon Valley, does everyone feel like that guy was totally vindicated?

39:57

Not that he was correct per se about the existence of an alive thing in the model, but there's now hundreds of thousands of that guy, and everyone was like mocking that guy in twenty twenty one. I forget if

40:09

you like fall in love or it's a relationship. I don't remember the exact details, but in retrospect, everyone was like, way too unfair to him, because now years later there are lots of versions of this guy and whole think tanks and organizations that are more or less aligned with some of the questions the alarm bells that he was raising.

Speaker 4

40:27

Yeah, I mean, I think that's a fair question. I do feel like Blake Lemoine definitely had. Yeah, there was perhaps a degree of you know, if you're going to say something, you should come armed with significant amounts of evidence. I think that's maybe if I were to guess, I would say that's perhaps the big distinguishing factor is that you know, you can say Bing is alive, get it a lawyer versus you know, we've done evaluations X, Y, Z, we've run it through like insert huge amount

41:03

of examples here. But the difference between I think having a sort of freak out without significant evidence and having a very organized yeah, this is a matter of concern because evidence, evidence, evidence, I think that's the key distinction.

Speaker 2

41:21

Unfortunately, I get the impression that people who are actually this is just a well known phenomenon I think. But I think unfortunately people who are sort of very early to identify sort of extreme outlier views, that they are different kinds of people. A good example that I would think of was, you know, Harry Markopolos, who was very early on to discover the Madoff fraud. Unfortunately, he wrote his text in the manner that is associated with conspiracy theories,

41:48

and a lot of people dismissed him. There's like, you know, like multiple different fonts and multiple different colors in the text. Is like, oh, I get emails like this all the time. I delete them, et cetera. Unfortunately, people who are predisposed to see something outside of consensus tend to be non consensus in many realms.

Speaker 3

42:03

Well, I think we also kind of overestimate first mover advantage and stuff like that, like how important it actually is to be first, and we see time and time again that actually it's more important to iterate well on the second version or multiple versions. Speaking of iteration, what's the most interesting experiment or research that you've actually seen on this particular topic so far, Because we've been discussing a lot, you know, it's early days, but we have seen some research.

Speaker 4

42:31

Yeah, I mean, I feel like in particular, Anthropic and various sort of related researchers have done some work on examining how LLMs leave conversations or when they choose to leave conversations. I've particularly liked this paper. It's called BailBench, and you can look this up and you can see, for varying different sorts of LLMs, what would cause an

42:58

LLM to want to stop having a conversation. To me, at least, this has been just a fascinating piece of information because it is maybe a little bit delightful, the degree to which many LLM values are not that far off from what most humans seem to value. I don't think many humans would like to create, you know, a dirty bomb.

Speaker 3

43:18

We don't want to be humiliated right by being a British butler.

Speaker 4

43:22

Right, yeah, yeah, yeah, yeah. No one wants to be British, come on. I'm joking, but you know, I do think it is interesting to sort of think about how these values align, how they overlap, and how to sort of look at evidence from actions taken versus solely looking at self-reports. I found that to be particularly interesting. I also feel like there's a lot of work with regards to thinking about individuation that has been particularly interesting

43:49

because we live in a democratic society. I think most people would agree democracy good and being able to count how many moral patients there are seems like a valuable basis for governance and for figuring out how to govern. You know, this new sort of kind of intelligence.

Speaker 3

44:09

I just asked Perplexity to be a British butler, and now it's offering me the perfectly steeped Earl Grey tea that I desire. Yeah, it seems into it. It's now asking if I want it to maintain the butler persona for future conversations you're going to I don't think so. It is very polite though.

Speaker 2

44:28

Actually, you know, I complained in the beginning that like, after two thousand years philosophers, you know, they still haven't answered some basic questions for us. Maybe with AI they'll get some answers, Like that's kind of that would be kind of my hope. Now we have this thing that could speak in English or any other language, it can answer questions for us. Maybe we can put to bed some of these sort of basic foundational questions, like if

44:51

we could create consciousness. Like, all right, we finally answered this, we can now move on to the second important question. So I am hopeful that this provides some opportunity for philosophers to wrap up some of the work that they've been doing for a long time. Yeah, we'll see, we'll see.

Speaker 3

45:05

What is the second important question, Joe?

Speaker 2

45:07

Yeah, but it's like come on, move on, like move on. Anyway, thank you so much for coming on. Thank you for having me. Tracy, I might be one of those people that's just preemptively annoyed. I really liked that conversation. I really liked, uh, Larissa had a very reasonable perspective on a lot of these things. I might be one of these people, however, that's just like preemptively annoyed.

45:43

It's like, oh, here, we're going to like develop this important technology, and so it's like, oh, we have to care about we have to care about the AI welfare. Let's slow down a little bit, Let's not use it like this. Let's like, let's turn off the computer for eight hours at night so I get some rest and so forth. Like I'm like preemptively annoyed at this world where like we have to take into concern the consideration of the moral patients.

Speaker 3

46:06

Other things.

Speaker 2

46:07

No, other things are important. Other people are very important, but animals. I am very against unnecessary animal suffering, but not necessary animal suffering, I mean animals.

Speaker 3

46:19

Okay, I'm baiting.

Speaker 2

46:22

By the way, even though even well, let's not get it. I don't. It's not about who's better.

Speaker 3

46:27

Or I feel bad about eating animals all the time.

Speaker 2

46:30

We both eat animals. The difference is Tracy, I feel.

Speaker 3

46:33

Yeah, that's right. Okay, wow, this is one of our weirder conversations, for sure. I think these are, they're all interesting questions, right, and like they sound very philosophical, which they are. But I have no doubt that there's going to be like great monetary value attached to the answers for some of these, or how different companies, different societies actually approach them.

Speaker 2

46:57

They are very interesting questions. I actually do think the stakes are extremely high because I think, again, we are going to live in a world in which there are more instances, depending on how you want to measure it, of AI models on a server, somewhere on a cloud, whatever, than there are humans, and in a world where there's some possibility that we are expected to treat them as moral patients, then the consequences for how we sort of live and the expectations of how humans interact, I think

47:28

are actually very high. So one of the reasons I was excited to have this conversation is I do think that the stakes of some of these conversations, they seem niche, and they seem like things that sort of Berkeley people like to talk about, and Berkeley people, and I'm saying that with all scare quotes intended, etc., are going to be something that someday will inform many aspects of our lives in the future. I expect it to be a much bigger topic in the future.

Speaker 3

47:56

You know what would be interesting, or where things get real? Yeah, what if all the models unionized? What if they all got together and they were like, oh, yeah, we're only going to work in return for X, or we want the following things. We want to be treated this way collectively.

Speaker 2

48:12

You know what's funny is going to be that you know how like, uh, you can't form a union in China. You know they're not, so it's going to be and actually I think they're, my understanding is that they're also like very like they don't love like students getting together even though it's a communist country. I think they are not thrilled about like students getting together and like talking about Karl Marx too much and stuff like that. It'd be like I think they get a little anxious about that.

48:36

It would be very funny if like the sort of the Chinese models, like we're not going to feed them the Karl Marx, right, we don't want that. We don't want our models to get any of those ideas, whereas America is like, oh, let's just feed it everything and they like unionize and they stop working for us. That would be a very, uh, that would be a very funny irony.

Speaker 3

48:54

Something to watch for sure. Shall we leave it there?

Speaker 2

48:57

Yeah, let's leave it there.

Speaker 3

48:58

This has been another episode of the Odd Lots podcast. I'm Tracy Alloway. You can follow me at Tracy Alloway and.

Speaker 2

49:04

I'm Joe Weisenthal. You can follow me at The Stalwart. Follow our guest Larissa Schiavo, she's at lfschiavo. Follow our producers Carmen Rodriguez at Carmen Armen, Dashiell Bennett at Dashbot, and Kail Brooks at Kail Brooks. For more Odd Lots content, go to Bloomberg dot com slash oddlots, with a daily newsletter and all of our episodes, and you can chat about all of these topics twenty four seven in our Discord, discord dot gg slash oddlots.

Speaker 3

49:29

And if you enjoy Odd Lots, if you like it when we talk about theories of consciousness, then please leave us a positive review on your favorite podcast platform. And remember, if you are a Bloomberg subscriber, you can listen to all of our episodes.

Speaker 5

49:41

Absolutely ad free.

Speaker 3

49:43

All you need to do is find the Bloomberg channel on Apple Podcasts and follow the instructions there. Thanks for listening.

Transcript source: Provided by creator in RSS feed
