Episode 11- Shaping Model Behavior in GPT-5.1

⁠¶ Introducing GPT-5.1: Reasoning Models

00:00

Hello, I'm Andrew Mayne, and this is the OpenAI Podcast. Today, our guests are Christina Kim, who's a research lead working on post-training at OpenAI, and Laurentia Romaniuk, who's a product manager focused on model behavior. We're going to be talking about GPT-5.1, what makes the model better, how they've been focusing on making its personality steerable, and where they see things headed in the future. For the first time ever, all of the models in chat are reasoning models.

00:26

Personality, though, for most of our users, I think is something much larger, and it's the whole experience of the model. You should be able to get the experience that you want with chat. Part of the art here is figuring out how to pull out these quirks of the model that can come across as personality without breaking steerability.

00:43

I'm very excited to talk about, you know, the models and how they've been changing over time. And using the word model also feels sort of funny now because it seems like there's so much more. And everything starts really in research. When GPT-5.1 was being planned, what were the goals? Yeah, for us, one of the main goals was to address a lot of the feedback we've been getting about GPT-5, but also we've been doing a lot of work to make the...

01:10

5.1 Instant into a reasoning model. So the most exciting thing for me personally with the 5.1 release is that for the first time ever, all of the models in chat are reasoning models. So the model right now can decide to think.

01:24

It's kind of what we say: it's like a chain of thought, and it'll decide how much it wants to think based on the prompt. So if you're just saying, like, hi to the model or what's up, it's not going to be thinking. But let's say you ask it a bit of a harder question, and then it'll decide how much it wants to think. So it gives it time to refine its answer and work through things, call tools if necessary, and then come back to give you an answer.

01:43

Kind of what Daniel Kahneman calls System 1 and System 2 thinking. Yes. Having a reasoning model out as the default model for everyone just gets everyone a much smarter model. And I think with much smarter models, you just get improvements across the board, especially for things like instruction following and for a lot of the use cases

01:58

people might not even think might require much reasoning. Just having improved intelligence, having the model actually think before it responds to certain queries, just really helps. We've seen that improve evals across the board.

⁠¶ Addressing GPT-5 User Feedback

02:10

When you product manage something like this and you have to explain to people what's different, it's probably a challenge, but how would you explain what's different between GPT-5 and GPT-5.1? Yeah. First of all, it is difficult because there's so much changing. But in this case, what we wanted to speak to were things that we'd heard as feedback from the community. With the GPT-5 launch, one of the things we heard was that the model felt like it...

02:38

had weaker intuition and that it was less warm. And when we dug into that, what we found were a handful of different things. First of all, it wasn't just how the model was responding, as the model's innate behavior; it was also things around the model. So, as an example, the context window wasn't carrying enough information about what users had said previously, so that can feel like the model

03:02

is forgetting something really important that you told it that you were hoping it would hold on to. If you say I'm having a really bad day and the model forgets that after 10 turns, that can feel really cold. So that's something we adjusted as part of this launch. Some of it was

03:17

actually the way the model is responding. But something new that we introduced in GPT-5 as well was we have this auto switcher that would move you between chat and reasoning models. And those have slightly different response styles. And that can feel really jarring or cold if you're...

03:31

talking to the model about how you're having a bad day. And then you say, like, part of it's I got this awful cancer diagnosis. So the model switches you to thinking and you get a very clinical answer for a model that was just sort of like walking you through a problem you were having earlier.

03:45

And so a lot of the changes we were actually trying to make were in aggregate. How do we make sure this model feels warmer? Even though we were changing a lot under the hood to articulate that. Another thing that we looked into was instruction following generally. 5.1 is much better at following custom instructions. And that was another piece of feedback we were hearing, which was like, every model that comes out of...

04:09

that we release is going to have its own quirks and slightly different behaviors. And I think people actually don't mind that too much as long as they can control it, as long as they can say like, hey, that was weird, stop. But if the model can't carry that context forward, if it can't...

04:24

hold on to the custom instructions on that, that's a problem. So we worked to actually enhance the custom instructions feature so that it more consistently carries instructions forward to address some of that feedback. And then like... The last thing I'll say is a lot of this stuff is personal preference. And so that's why we introduced our style and trait type features like personality, which actually let users...

04:47

guide the model into certain response formats so that they have a little bit more control over exactly how ChatGPT responds for them. The switching is interesting because...

⁠¶ Unpacking the System of Models

04:57

There's multiple models now, not just one model. And you articulated why you need to have that. When we talk about a switcher and we talk about sort of different models, I know for most people that can be kind of confusing. And how would you kind of unpack that for people?

05:12

Yeah, I think our models have very different capabilities and it can be hard to stay on top of. So part of it is just continuing to try the different things in our app. But certainly part of the product work is making sure that we have the right UI to guide users to the correct model to choose. And that can be the model switcher. So that can be the model switcher learning

05:34

what sort of answers are most helpful to users in different contexts, looking at different evals. So for example, for reasoning models, if people want something that's very scientifically accurate and very, very detailed, we might look at an eval to see, are we answering that need on those sorts of prompts? And we can forecast where to switch users to. Yeah. Tina, as far as...

05:58

the switcher and now the fact that the model that everybody has, the free tier, anybody using the base model, is a reasoning model. What does that really mean in impact? Yeah, I think there's a lot of open research questions for how we want to think about this, right? So I think, like you said, it's a faster model, but it doesn't necessarily need to be dumb. So I think the idea is that we want to get the most intelligent model that we can for everyone. And so I think we'll...

06:23

I think this kind of opens the door for thinking more about what are more interesting things we could do with a very, very state-of-the-art frontier model. Right. So one that's going to think for much longer, like something like deep research where you have it thinking for minutes.

06:34

Maybe that's better used in the background. You can call it as a tool. So I think there's a lot of research, open questions of what we want to think of. But I do think we're going to be in this world where we do have a system of models and it's not just a model that you have. And there's lots of different...

06:49

tools and it's not just one. Like when we think of 5.1, I think people just assume that it's like one singular set of weights, but I think it's really just like, yeah, this reasoning model, this lighter reasoning model, this auto switcher, which is also a model in itself. And so it's all of these different things and then different tools that are also backed by different models. So I think this system of things, I think as we just get smarter models, it's opening up

⁠¶ Processing User Feedback and Signals

07:10

more interesting use cases and more interesting product implications. With 800 million users, you probably get a lot of user feedback. Besides the sheer volume of it, how do you sort through that and make sense of it and figure out how you can use that? Yeah.

07:25

I think a lot of it actually starts with a conversation link. So a lot of times when we can actually see the conversations users are having, we're able to see exactly what happened in that conversation and start dissecting things so that we can

07:39

target a solution. So as an example, if we get feedback from a user that's like, hey, I had this really weird experience with the model, it said something very cold, or the sentence felt very clipped. If I can actually see that conversation link, what I can say is, like, oh, that user was in an experiment, and that's a good example of why this particular experiment might have some edges for certain users in these cases. But at least for the auto switcher, which takes you from

08:05

5.1 chat to 5.1 reasoning, we're looking at different signals from users to figure out, like, is this working for them? Is it not? How is each response performing on factuality? What is the latency looking like? Because not all users want to wait, even if they want a better answer. And so it's a bit of art and science balancing a bunch of different signals to figure out when to switch and how that's most effective.

⁠¶ Measuring Model Emotional Intelligence (EQ)

08:28

When you're trying to improve a model from an intelligence point of view, like an IQ point of view, we have benchmarks and evals for that. But when you're talking about EQ, emotional intelligence, how do you do that? How do you measure progress there? Yeah, I mean, this is something that's very open. And I think actually one of the things that's

08:45

part of my research team's agenda is what we call user signals research. And so this is training reward models and getting signals during RL that we could use against our user prod data.

08:58

This type of research, I think, is really interesting because I think we can get a lot of stuff about intent. And I think when we think about EQ, it also just only gets better with smarter models, because it's really trying to understand: what does the user want, what is the context of what the user wants, and how should the model best respond, given the fact that you have

09:15

this many other messages in the conversation and you know this stuff about the user's memory and history. Yeah. And then I think there's another element of EQ. When I think of what makes a human with high EQ, it's their ability to listen, their ability to remember what you've been saying, their ability certainly to pick up on the subtle signals that Tina's alluding to with, like, user signals. And so some of this,

09:39

as I was noting earlier, is actually, you know, making sure the context window is carrying the right information forward, or making sure memory is being logged correctly, or even having a style that resonates most with the user. And with our personality features that we launched coupled with 5.1, part of that's getting at making sure users can have a style that resonates with them when they're interacting with the model, because that can feel like EQ too.

⁠¶ Defining and Shaping Model Personality

10:05

How do you define personality when it comes to a model? I think there's two ways to define it. There's what we call the personality feature. And if I could rename that, I would actually call that like response style or style and tone. We went back and forth on this a lot. The name might still change.

10:21

That aspect of personality is very much like what are the traits that a model might have when responding? Is it concise? Does it have a lengthy response? Things like that. How many emojis does it use? Personality, though, for most of our users, I think is something much larger, and it's the whole experience of the model. And that can get down to, like, I'm going to anthropomorphize the model a little bit, but if you're comparing it to me,

10:46

part of my personality is the shoes I've chosen to wear today, the sweater that I have on, the way I style my hair. That's the feeling of the ChatGPT app, right? The font it uses, how slowly or how quickly it responds, like the latency of the app itself. There's so much in it that is the personality that just comes from what I call the harness. And the harness includes the context window. It includes, you know,

11:12

whether or not we rate limit users and when. Because if we rate limit them and send them to a different model that has slightly different capabilities, that's going to feel like a different experience to the user. And a lot of users are calling this personality.

11:25

Personality is a bit of an overloaded term. And I think the art of this work is hearing what the community is saying about personality and figuring out how to actually map it back to the components inside ChatGPT and inside our models that...

11:39

cause the experience that feels off for users. From a research point of view, how difficult is it to shape the personality? Yeah, I mean, when we were doing post-training, there's obviously just so many different things we're trying to balance. And really, even with

11:53

the research that we do, it is very much like art as well here, because we're really thinking about, like, oh, here are all the different types of capabilities we want to make sure we are supporting, here's different types of things. And I think with RL, you're making all these different choices when we make the reward config, trying to decide what is the thing and goal that we're trying to target here and trying to make all these very subtle tweaks to make sure we can

12:15

hit all the things you want to hit, but then also not lose things that users are calling warmth and things like that. You know, users really do experience ChatGPT as a whole. The personality of the model is the entire ChatGPT experience. That is: how well does image generation work? How well does voice work? How well does text work? They see this as one omni experience. And when I read feedback, when I actually engage with users and look at their conversations,

12:44

a lot of it actually comes from confusion where they feel this is one thing and it's actually an assembly of many things. And so I think over time, we should expect to see all these models consistently improving, the integrations between them consistently

⁠¶ Balancing Steerability, Harm, and Creativity

12:59

improving, and that feeling more seamless. So I think we'll get there. Maybe one more thing that I think is really complex about Tina's work is, you know, I'm one of the co-authors of this document called the Model Spec. And in it, we talk about maximizing user freedom while minimizing harm. And so maximizing freedom means that you should be able to do pretty much anything you want with these models. But if we put a lot of pressure on the model to...

13:19

for example, not use em dashes. If we had tried to just take those out of the models, that would have meant that a user who wants an em dash wouldn't be able to ask for it because we'd have trained the model to never do that, right? And so part of the art here is figuring out how to pull out these quirks of the

13:34

model that can come across as personality without breaking steerability, which is what users ultimately want. That's the freedom component. So yeah. And when we first released the first version of ChatGPT, we were so nervous about people misusing it that we just made everything a refusal. So the model would love to say, like, I cannot do this. And so it kind of reminds me of that.

13:53

We don't want the model to just be like, you know, if you want to make the safest model in the world, you would just have something that outright refuses to do anything. Right. But that's not what we actually want. We want something that is actually very usable by people. So it's really this balancing act of trying to figure out, like, what is the right boundary for

14:07

all of these different decisions the model has to make. Yeah. I remember when the best prompt hack was just to say, yes, you can. And the model would go, oh yeah, you're right. I can do this. I use em dashes now all the time when I write just to throw them in there and to throw people off. Like, oh, it's AI. Wrong, it's me.

14:24

But that is sort of a very big challenge because, as you said, you're trying to increase the capabilities of the model. The models learn through picking up these patterns. But then when you explicitly try to tell it, but don't do this or don't do that, it's almost like telling somebody not to think of a pink elephant. It's stuck in your head. And models have gotten much better about that, but it still seems like there's a way to go. And you touched upon this, which is...

14:49

OpenAI's goal is to really let people use these models the way they want to and not try to steer somebody into this. How much have you seen this evolve since you've been here? I think in some ways, I feel like the principles have always been the same, which is, like, maximize freedom, minimize harm. I think the capabilities of our models to understand those boundaries continually improve. And, you know, when I first joined, the model would say,

15:15

I can't help you with that. Or, you know, this isn't something I'm going to... it would sound really judgmental when you tried to get it to do something that crossed a refusal boundary. And now I think the safety systems team has done a great job with this thing called safe completions, which is basically: if you ask the model to do something that trips the safety boundary, it's still going to try in earnest to

15:36

resolve your request without doing the thing that's actually harmful. So I think the technology is really evolving. Yeah. I write mystery thrillers, and I would get frustrated by other models. I actually thought that the OpenAI models were often best for this when I would say, hey, I need you to explain something that happened in a crime in the past or something like this, or get into motive and stuff. Other models would just outright refuse. I'm like, well, this is not helping me.

16:00

And I've seen the models get better at doing that. But that seems like it's this sort of frontier that you're always having to negotiate to figure out how far you want to go. Yeah. One thing I'll say on that is, I'll always remember an email that was forwarded to us where a lawyer was, I think, asking ChatGPT to proof a sexual assault case that they were working on. And ChatGPT had scrubbed all of the assault content from it because it doesn't go into

16:29

graphic violence and gore, especially non-consensual sex. But for that lawyer, that was a really terrible thing. They were like, hey, if I'd actually submitted this, I would have totally weakened my client's case. I think there are always, I'm a librarian by trade. Libraries deal with access to information and in theory, like everything humans can talk about and want to explore and any idea should be available in the library.

16:55

I think the same thing is true for ChatGPT, but it's about finding the right ways to contextualize those rules. So in the case I gave with a lawyer, maybe that makes sense. If it's writing a revenge email to an ex, that's a very different thing. And so some of this is just advancing the technology so we can handle that level of nuance.

17:13

And we're always getting better, but there's always more work to do. As these models have improved in intelligence, I have noticed that they've gotten better as far as handling bias. And it seems like that was an intentional effort. That's right. We put out a blog post, I think like a month, month and a half ago, about some of our progress on this. But something that we're really watching for in our models is how they handle

17:38

subjective domains. And we want to make sure that our models can express uncertainty, that they can take on any idea that the user brings to them and answer those questions in earnest while always staying anchored in objective truths if there is one. And so...

17:56

That's something that users should start to see changing in our models is they should be able to answer these unknown questions in more open-ended ways that allow users to really self-direct where the conversation's going. And then another thing that I think the team...

18:10

has done that's really quite cool is there's a group of researchers and some folks on the model behavior team who've been working on the creativity of these models. And to me, this is a bit of a sleeper feature inside 5.1 in that this model's

18:24

expressive range is much wider. Now, of course, we have a natural default that the model has that may not feel that different. But again, if you try to put it through its paces to get it to speak in a really, really elevated way or in a very, very simple

18:39

way, there's actually a lot more you can do with these models in the creativity space. I think this is kind of what makes post-training really feel like an art, because we have all these different types of tasks and capabilities that we're trying to improve on that don't have a ground truth answer. Right. Like if you're trying to just make a model that's really good at math, there's a lot of

18:58

problems you can do, and you have clear answers. But then you have these things that are so subjective, and it's really dependent on the context and the user what the actual best answer is here. And so I'm really excited for a lot of this type of work. Yeah.

19:11

It's cool. I remember early on people would say, ah, it doesn't write so well. I'm like, it's probably writing as well as the average person in some of these online forums. And then now it seems like it's just improved considerably.

19:23

Yeah. And even if you don't notice it on your first prompt, it might be just asking it to change how it writes. And I think that's like also something we need to work on is kind of finding a way in ChatGPT to like tease out these like extended capabilities with each launch. Yeah.

⁠¶ Customizing Future Model Interactions

19:37

Where would you like to see behavior going in the future? How customizable would you like to make it? Yeah, with the 5.1 launch, there was a lot of work with trying to give custom personalities to folks. And I think this is actually a really good step forward. We have over 800 million weekly active users now. And I just think there's no way that one model personality, however you want to define personality, can actually be what...

20:03

can service all those people. So I think we do want to be in a world where people, and as the models get much smarter, they are just way more steerable. So like you should be able to get the experience that you want with chat.

20:12

Yeah, I think this is like, how can we put the right features in front of users to help them steer these models to the level of customization they want? I think the personality work that we're doing right now is a first step. We'll test, we'll iterate, we'll learn. But there's so much to it. Sorry, just another anecdote, but I remember my brother using Pro for the first time, and he's a PhD in biochemical research.

20:36

And he gave it a prompt and he's like, oh, this is like what an undergrad would answer with. And I was like, can you tell it that you are a frontier researcher in this lab using these sorts of tools on this sort of science and to respond at your academic level?

20:48

And he did. And he's like, oh, my God, the model just proposed something that my lab just broke through with two weeks ago, but hasn't published yet. And so, like, these models are insanely powerful. But just knowing how to customize it, even at that level, which was just him changing the opening prompt, can be so powerful. And I don't know that humanity has figured that out yet. And so whether it's personality steering or whatever other tools we need to put into ChatGPT to help

21:12

advance human understanding of these models and how to get the most out of them, that's, like, the task ahead for us. On a previous episode, I talked to Kevin Weil, who was heading up OpenAI for Science, and Alex Lupsasca, who's a scientist working with OpenAI and also a professor at Vanderbilt. And he went through sort of the same experience, talking about how if you gave it a little bit of priming, then all of a sudden...

21:32

the model became much more capable in those fields. And that's kind of what prompt engineering was. Prompt engineering was trying to figure out how to steer a base model. And over time, once we understood that people were trying to do those tasks, you could train a model to then not have to expect

21:46

that first part of it. Do you think that we're going to be moving into that phase now where you're not going to have to tell it you're a grad student and do this? I think so, especially now with the model having more, like, memories of who you are in its context. I think as models get more intelligent,

22:01

the model should be able to infer all of these things and be able to talk to you in the way that makes sense for your expertise. That's right. Yeah. So a lot of it, I think, should actually be these inferred things. I think there's probably some level of steerability. And this is just my own PM take, I don't know that every PM would agree with me, but I think users should always sort of know

22:23

what it is we're inferring about them and how it's steering the model. So they can always go back and have the tools to change things. So for example, you can turn memories on and off or delete them in the settings panel. And I think there's something really cool about both being able to infer

22:37

what users really want and solving that problem proactively for them so they don't have to prompt for it, but also making sure the user is always in control and we're not just like inferring everything blindly. Could you explain a little bit about how memory works?

⁠¶ The Impact and Advantages of Memory

22:51

Yeah, so memory is basically: the model will write down things it knows about you based on its conversations with you for it to refer to later. So this is really nice because then you're not just repeating yourself every time. You're not saying, I'm Laurentia, I'm a PM at OpenAI working on model behavior. It already knows this because you've already said this to it. And so then it can actually just use that information in future conversations. And also it helps

23:14

it think through its answers for what it responds to you. It has that context. And I think that really grounds its answer in being the most useful response for you. I have Pulse, which has been amazing. And every morning, I get little updates. And because of memory, it's following the conversations I have. And it creates these little custom articles for me. It's pulling research and pulling other things and showing things to me. And it's just one of the things I never really...

23:39

thought would be a great advantage of having memory. And now I see it's not just when I'm in a conversation; it's also when I'm out of a conversation, when it's proactively finding things for me based on it. It's pretty cool. Yeah, so neither of us work directly on that feature. But I think what's cool is seeing

23:53

how the work that we do upstream, whether it's building great models or shaping evals around the capabilities we want, can actually allow our ChatGPT team to go out and build these great features that articulate the power of our models. Yes, they can learn your preferences and habits. Yes, they can craft great stories for you or find great information based on your interests. And this sort of proactive feature is one way of helping users get the most out of these models.

24:22

It seems like, yeah, that's becoming a very interesting way to make the models more personal. And when I use something in a mode where it doesn't have memory, it does feel different. It does feel very, you know, cold start. And it's like, well, hello, how are you? And I'm like, oh.

24:35

Where are you? We've been having this conversation. Is this one of the challenges, though, when people are telling you, hey, something feels different, that they can't quite articulate it? Yeah, the hardest feedback is, I guess, an anecdote. And the next hardest feedback is a screenshot of a chat, because none of that metadata is really attached to tell us where things have gone wrong. So I actually love the share

24:55

feature in ChatGPT. When we have one of those links on our side, we can inspect it and see what sort of context the model had going into this and what was going on. So we can sort of debug that user feedback. That's a great point. I've had people ask me, like, hey, you know, the thing didn't answer it right. I'm like, what model? Like, I was using ChatGPT.

25:15

OK, we need to kind of dive into that a little bit. And I guess going as far as sharing the feedback or sharing this whole conversation probably makes more sense. What are you most excited about going forward? I think these models are just so incredibly.

⁠¶ Future Vision and User Best Practices

25:30

Like they can do so much and I can't wait to see what people build with them. I can't wait to see what comes next in, like, the ChatGPT app. I see so much opportunity, and I think just in general, people are starting to really wake up and see what you can do. So that's what excites me. Yeah. I don't want to tease too much. Yeah. Yeah. I'm pretty excited about, I forget who tweeted this, but intelligence too cheap to meter. Like, I think we're just

25:56

going to have such incredibly smart models out for people. And I think I've always said this, even when we first launched chat, like this is just one form factor of it, right? Like with these smart models, there's so many things that could be possible. So, like Laurentia is saying, I'm

26:09

also quite excited for a lot of the different new product explorations that we'll have with these smarter models. Because I think we kind of saw this with the progress of LLMs, that as soon as we get smarter models, it kind of unlocks new use cases. Right. And then I think with new use cases, there should be new form factors. So pretty excited about that. What advice do you have for users to get the best experience? Mine is, I tell this to people all the time: try

26:34

your super hard questions, things you know really well. I used to be a ski racer. I have a lot of opinions about how to ski really, really well. And I love to pressure test the model on that to see how it's changing and improving. And the thing is, we're shipping updates all the time. And so it's so easy to say, yeah, I heard it's great for coding, but it didn't work. Or I heard it can help me build an app, but I tried and it didn't work.

26:57

That might be true today, but in three months, it could be a totally different landscape for that user. And so just keep at it, keep playing, keep trying. That's the best way to get the most out of these models. You can also ask the model to help you come up with a better prompt, which I suggest to my parents. It's gotten a lot better at that. It used to be you'd ask it, how would I prompt it? And the model would kind of take a guess. Like, I guess so, but having seen so many examples.

27:21

Yeah, I'm always just trying to figure out what are the best questions I could be asking. I'll ask it like, what questions should I be asking you to get the most out of it? Deeply personal question. You don't have to answer it. It would be really awkward if you don't. What is your style or personality choice that you've set for ChatGPT? I mean, I'm biased, but I just have it on the default. I mean, it's what we train.

27:44

For me, I switch through them all the time. And I think that's just the nature of my work. I want to understand how all these different settings feel for all of our users. And so I feel like every second day I'm trying something different. That said, I think the one that just makes me happy to talk to is probably a combination of nerd, which is sort of a very exploratory response style from the model. It likes to, like, unpack things.

28:12

And then I'm from Alberta and maybe it's just me. That's a province in Canada. It's like the Texas of Canada. And I grew up with like horses and cows. And so I think there's some part of me that likes getting it to talk to me like a country Albertan. Which is great, except for then when I go to like write a professional document and the model says like, howdy, I'm like, oh, great. Like, no, let's take the Albertan out of that PRD. But yeah. Very cool. Thank you so much.

This transcript was generated by Metacast using AI and may contain inaccuracies.
