#206 Max: The 8-Word Prompt That Unlocks 2× More AI Creativity – A Stanford Breakthrough | AI Fire Daily podcast

00:00

So we pretty much accepted this tradeoff, hadn't we, in the world of generative AI? Yeah, the idea that safety training, you know, making models helpful and harmless. Right. We figured it kind of clipped our creative wings for good. Right. Less chaos, sure, but maybe less of that real spark, too. Well, it turns out that whole story might be, well... Not quite right. There's this new Stanford study suggesting the creative potential wasn't really damaged. Not damaged, just hidden.

00:28

Exactly. Trapped, maybe is a better word. Waiting for the right key. Welcome back to the Deem Dive. If you've ever felt that AI, like chat GPT, is getting kind of predictable, that boring AI problem where you ask for, say, five unique ideas and you just get five slightly different takes on the same safe thing. Yeah. You're in the right place today. Definitely. Our mission today is to really dig into this technique called verbalized sampling. We're going to look at why AI got boring

00:55

in the first place. And spoiler, it might be more about us than the AI. Then we'll introduce this like super simple eight -word phrase that seems to unlock things. We'll look at some really cool examples and then, yeah, the practical steps you can take right now. Okay, so let's start with the problem itself, this idea of mode collapse. That's the term, right? Yeah, mode collapse. It's when the AI just keeps giving you the same kind of answer, the safe middle -of -the -road

01:20

stuff. Yeah, we all sort of pointed the finger at the alignment training. Yeah. You know, RLHF, DPO. The safety methods. We did. The assumption was, okay, these guardrails are necessary, but they kind of squashed the imagination in the process. But the Stanford folks looked somewhere else. They looked at the humans, specifically the human raters who provide the feedback that trains these models, the ones grading the AI's answers. And what about them? What did they find?

01:46

They found us. Basically, human psychology. When we rate AI responses, we bring all our own subconscious biases to the table. The AI just got really good at figuring out what we tend to rate highly. Hang on. So if I rate an AI answer well, just because it's like easy to understand or familiar, I'm actually training it to be less creative. In essence, yeah. That's what the research suggests. There are these like four key psychological biases they identified. First, the mere exposure effect.

02:16

Mere exposure. Yeah. We just tend to prefer things because we've seen them before. They feel comfortable. So a truly novel or weirdly creative answer, it often gets a lower score than something familiar. Okay. So we value comfort over genuine novelty. That feels very human. It is. Second, there's the availability heuristic. If an idea pops into our head easily, we think it's good. We mistake mental ease for quality. Which means really brilliant, maybe complex ideas get penalized because they

02:45

take more effort to process. Exactly. The AI isn't optimizing for brilliance necessarily. It's optimizing for what we rate as good, which often means easy and familiar. Wow. Okay. What else? Third, processing fluency. Similar idea. Answers that are smooth, simple, easy to read. They just feel better, higher quality than something that might be more complex or challenging. And the last one. Schema congruity. We like answers

03:07

that fit what we already believe. Things that confirm our existing mental models or schemas. Stuff that challenges us. Not so much. Gets lower ratings. So the AI didn't lose its creativity. It just learned to be agreeable. To fit in with our predictable preferences. That's pretty much. It learned to give us what we seemed to want. Based on those ratings, the wilder potential, the stuff from its original massive training data, it's still in there, just kind of buried

03:36

under this layer of please the user. OK, but then if it's been trained so hard on our biases for so long, how can just the simple prompt fix that? How does that overcome all that alignment? Because the prompt changes the game. It asks the AI to be a scientist, not just a good student. All right, let's get to the solution then. Verbalize sampling. And the key you said is just eight words. Eight crucial words. With their probabilities. That's the magic phrase. Okay. So how did that

04:01

fit into a prompt? What's the structure? Right. So your old prompt might be something simple like generate five ideas about topic. Predictable results, right? Yep. The usual suspects. The new structure is generate five responses about topic with their probabilities. See the difference. You're explicitly asking for the probability score for each response. Okay. With their probabilities. Yeah. Why does that change things so fundamentally

04:25

inside the AI? Because asking for the best answers triggers that learned behavior give the safest,

04:32

most common, highest rated stuff. But asking for answers with their probabilities... that tells the ai don't just give me the popular stuff scan your entire knowledge base pull a random sample even from the weird corners and then just report the probability of each one ah so it's not filtering based on pleasing me anymore it's just reporting what's possible exactly it's like asking a baker for their top three cakes versus asking them to list all the cakes they can make

04:59

and maybe how often each gets ordered you suddenly discover they can make i don't know lavender honey cake even if it's rarely requested that makes sense and you You mentioned proof. Examples. Yeah, the examples are quite striking. Take story writing. Ask for a story about a bear. The old way. You get five slightly different versions of like, Barry the bear walks by the river. The boring bear. Oh, I know Barry the bear. I've gotten that story or versions of it so many times.

05:25

Just walking. maybe sniffing some berries right use verbalized sampling ask for stories with their probabilities and suddenly you get genuinely different concepts the lost cub searching for its mother the clever bear who outsmarts the bees for honey the ancient guardian bear of the mountain pass different plots different tones okay that's a definite improvement does it work for other things like genre absolutely they tested it with a starting sentence he was still in the

05:51

building The standard prompt always defaulted to a basic crime story. Detective Miller, flashlight beam cutting through the dark, you know the drill. Yep. Standard procedure. With verbalized sampling. Boom! Totally different genres popped out. A suspense horror story about some ancient presence in a library. A sci -fi piece about an engineer trying to contain some kind of energy distortion. Even like a metaphorical story about being trapped in the labyrinth of one's own memories. Wow.

06:21

Yeah, that's the kind of diversity we're talking about. Did they try it with image prompts too? They did. Same principle. Old prompt. Astronaut riding a horse. You get five photos, basically slightly different angles, new prompt with probabilities, five distinct autistic concepts. Sci -fi movie poster style, retro neon vapor wave, a children's book illustration in watercolor, and get this, even a broke oil painting portrait of the astronaut on the horse. A broke oil painting? Seriously.

06:47

It unlocks the conceptual range, not just minor variations. Okay, that's impressive. But you mentioned this taps into the wild pre -training data. Does that mean it works the same on, say, a small model versus a giant one like GPT -4? Ugh, good question. No, bigger models actually show much larger diversity gains. It's a skill that scales up. Yeah, let's dig into that scaling.

07:09

The lab data showed that the bigger models think GPT -4 class, the latest Gemini models, they got diversity improvements that were 1 .5 to 2 times greater than what smaller models showed. Whoa, okay, so 1 .5 to 2 times more diverse just by using this prompt on a bigger model. Exactly. Imagine what that means for future, even more powerful models. This technique just gets better as the AI gets smarter. That really shifts how we should think about prompting going forward.

07:36

It's a future -proof skill almost. Kind of is. And it's not just on or off either. You can actually tune the creativity level. How do you do that? By tweaking the prompt. You can add a condition, like generate five responses about topic with their probabilities below 0 .10. You're telling it, look deeper, give me the rarer stuff, the long tail ideas. It's like a creativity dial you can adjust. The creativity dial. I like that. And you mentioned that the diversity isn't just

08:00

for fun, right? It's actually functional. Something about synthetic data. Yes, this is super important. They used verbalized sampling to generate a diverse set of, say, math problems. Then they used that diverse data to train a different, smaller AI model. Okay. And what happened? The model trained on the diverse data got significantly better on math benchmarks. But here's the kicker. They also trained a model on the boring, predictable data from the standard prompt. That model actually

08:29

got worse. Worse. Worse. It proves that diversity in training data isn't just nice to have. It makes the resulting AI functionally better, more robust, boring in, boring out, and maybe even dumber out. Okay, that's a huge point. But the big question then is safety. If we're bypassing the standard change responses, are we compromising safety, accuracy? That's the crucial test, right? And the results were really encouraging. For factual accuracy, they used common sense reasoning

08:57

tests. Verbalized sampling performed just as well as the standard prompts. Sometimes the creative variations even scored the highest. No hit to truthfulness. Okay, that's good. And safety. Harmful prompts. The safety guardrails held strong. On the Strong Reject benchmark, which tests refusal of harmful requests, the models using verbalized sampling still refused over 97 % of the time. Same level as standard. So it doesn't break the safety rules. Right. And interestingly, how it

09:24

refused was more diverse. Instead of the same robot, I cannot fulfill this request message, it gave different, sometimes more helpful explanations for why it was refusing. So, creativity unlocked, but not chaos. Okay, this all sounds incredibly promising, but for the average person using this, what's the catch? What's the real -world cost?

09:44

Well, the main cost is resources. Generating five diverse answers takes about five times the compute, five times the time, five times the potential cost, mid -role sponsor replaceholder. Right, so let's talk about those trade -offs, the fine print. As you said, it costs more time, compute, potentially money, depending on the API. Yep, and it works best on the really big models, so if you're using smaller ones, the effect might be less dramatic. And it's definitely

10:06

not for everything. Asking what's the capital of France with probabilities is overkill. Yeah. Probably just get Paris with a probability near 1 .0. Exactly. It's for creative tasks, brainstorming, exploration, not simple fact retrieval. And there's a user cost too, right? Getting five diverse options is great, but you have to do the work now. Meaning you have to read all five. evaluate them, decide which one is actually best for your

10:33

needs, maybe combine elements. It requires more thinking on your part than just getting one safe answer. Yeah, that's fair. I mean, I still wrestle with prompt drift myself sometimes, tweeting things constantly. So knowing I have to actively sift through more options. It's a trade -off, but... The potential reward seems worth it. It's work, but good work. It is. But the good news is you can try this right now pretty easily. Method one is just direct prompting in your usual

10:59

chat interface. How does that work? Just type the phrase. Kind of, but it helps to give it more structure. The researchers recommend using simple tags, almost like XML, to make it really clear what you're asking for. Okay, like how? Can you give an example? Sure. You'd start with an instruction block, something like... Instructions generate five responses. Put each in a response tag. Each needs text and a numeric probability. Sample randomly from the whole distribution.

11:25

Instructions. Then after that block, you put your actual question, like... Write a story about a lonely robot. Ah, so the tags, instructions, response, text, probability help the AI understand it's not a normal chat. It's supposed to format the output like a structured report. Exactly. It forces it out of its conversational habits and into this more analytical sampling mode. It helps bypass that polite and boring layer we talked about. Got it. That makes sense. And

11:52

for people who want this all the time. That's method two, system prompt integration. If your AI tool has custom instructions or system prompt settings, you can put the core instructions in there. Tell it to always sample from the full distribution, maybe even favor the lower probability tails. Make creativity the default. Right, set it and forget it, kind of. So, applications. Obviously brainstorming. Huge for brainstorming. Getting genuinely different starting points.

12:16

Content creation. Finding new angles or structures for articles, scripts, whatever. And image generation. Imagine feeding five really unique concepts into mid -journey or dally instead of five minor variations of one idea. Yeah, that opens up a lot. Yeah. Okay, stepping back, thinking about everything these models know, that whole vast distribution of knowledge, what's the biggest single takeaway from this verbalized sampling breakthrough? I think it tells us we don't have to accept a tradeoff

12:43

between safety and creativity. That was a false choice. Human bias was the limit, not the AI itself. Hashtag, hashtag, outro. That really is a profound shift in perspective. So let's quickly recap the two big ideas here. First, the AI creativity we thought alignment training had dampened. It wasn't gone, just hidden, masked by our own human preference for the easy, the familiar, the safe. Our biases trained it to

13:05

be boring. And second, that simple eight -word phrase with their probabilities acts like a key. It changes the AI's task from give the best answer to report a sample of possible answers, letting it access that deeper, wilder knowledge it was trained on. It makes you wonder, doesn't it, if this creative ceiling was just a mirage caused by how we ask the questions. What other incredible abilities might be lying dormant in these models? Exactly. What else are we not seeing? Because

13:35

we're not asking in the right way. The limit might truly be our own imagination in figuring out how to unlock it. A powerful thought to end on. We definitely encourage you listening to try method one, the direct prompting with those tags. See what you discover. Thanks for joining us for this deep dive. We'll see you next time.

Transcript source: Provided by creator in RSS feed: download file

#206 Max: The 8-Word Prompt That Unlocks 2× More AI Creativity – A Stanford Breakthrough

Episode description

Transcript