#221 Neil: Stanford's 8-Word Fix For Boring AI Is Finally Here

Nov 11, 2025 · 16 min

Episode description

Your AI isn't broken, it's just stuck in "mode collapse." We thought this was permanent. But a Stanford study found an 8-word 'key' to get back the creativity. It's not magic, it's just a smarter way to ask. Learn the "Verbalized Sampling" method. 🔑

We'll talk about:

  • Why AI tools like ChatGPT, Gemini, and Claude give safe, repetitive, and boring answers.
  • The "mode collapse" problem and how human safety training accidentally caused it.
  • A new, simple 8-word fix from Stanford University called "Verbalized Sampling."
  • The science of why asking for "probabilities" forces the AI to be more creative.
  • Specific, copy-and-paste prompts you can use right now on any AI model.
  • How this method dramatically increases idea diversity (up to 2.1x) without breaking safety.

Keywords: Verbalized Sampling, AI Creativity, Prompt Engineering, Stanford AI Research, AI Alignment, AI Research.

Links:

  1. Newsletter: Sign up for our FREE daily newsletter.
  2. Our Community: Get 3-level AI tutorials across industries.
  3. Join AI Fire Academy: 500+ advanced AI workflows ($14,500+ Value)

Our Socials:

  1. Facebook Group: Join 267K+ AI builders
  2. X (Twitter): Follow us for daily AI drops
  3. YouTube: Watch AI walkthroughs & tutorials

Transcript

Have you ever typed a prompt into one of those big AI models? You know, asking for something really new, maybe a bit out there. And what comes back is just the safest, blandest, most corporate thing imaginable. Yeah. Exactly. It feels like talking to a wall sometimes. A very expensive wall. It's that specific quiet frustration, isn't it? Like this giant brain is just, I don't know, afraid of its own thoughts. Welcome to the deep dive. And the common idea is, well, the AI just

isn't smart enough for real creativity. But our deep dive today, looking at some really interesting work from Stanford, suggests something else. Yeah, it seems the creativity is actually in there. It's just kind of locked away behind this really stubborn safety filter. That's our mission for this deep dive. Right. We want to reveal the simple non-technical key to that lock. We're talking about basically one short instruction.

Maybe 10 extra words or so. Yeah, roughly. And it can transform tools like ChatGPT, Claude, Gemini, make them feel more like actual creative partners. OK, so let's unpack this. We're going to cover three main things today. First, why your AI seems stuck in boring mode, that whole safety issue. Second, we'll introduce this technique. It's called verbalized sampling. Explain how that

trick actually works. Got it. And finally, maybe the best part, we'll give you some practical copy-paste prompts you can start using right away. Yeah, to get those better results. OK, so first step, why is the AI so average? Well, the main cause is pretty straightforward, really. It's safety engineering. Right. Companies, you know, they spend billions training these models to be incredibly safe. They have to avoid anything dangerous or offensive or even just kind of rude

or controversial. Which makes sense. I get it. You need that for a tool millions of people use. But what's the trade-off? What's the cost of being that careful? Well, the cost is creativity, plain and simple. Think about it like a really talented artist, okay? But they went to this super strict art school. And the only rule was paint calm lakes, paint happy trees, never ever paint anything weird or wild. After years of

that, the artist can still paint, sure. But they've kind of forgotten how to risk anything outside that safe little box. They're scared. So the AI is like that artist, scared to step outside the lines. Exactly. It's scared of getting, you know, disciplined for saying the wrong thing. So it just sticks to the average, the safe stuff. And you really see that, I think, in what I sometimes call the lazy student effect. Huh. Yeah, what's

that? It's like the AI just looks for the answer that's most likely to be seen as correct or just really common, right? It scans all its data and picks the easiest path. So you ask for a business idea, and you get dropshipping, or start a gardening newsletter. Every time. Yeah, that repetition, that pull towards the average. There's actually a technical term for it. It's called mode collapse.

Mode collapse, OK. It basically means the AI gets stuck in one way of thinking, the most common pattern, and it really struggles to give you anything different. So the collapse part, does that mean it's, like, ignoring other possibilities it knows? Kind of, yeah. It means it's not using the full richness of all the data it was trained on. All the answers cluster around that one super common idea, and all the interesting stuff out

on the edges, it just sort of collapses out of view, even though the information's still kind of in there. Honestly, I still wrestle with this myself sometimes. You know, prompt drift, getting stuck outputs. I remember needing, like, 20 podcast names once, really unique ones. Yeah, what happened? Terrible results. Just awful, simple stuff like "Healthy Living Daily." I knew the AI had better ideas somewhere inside, but I just couldn't get it to show them to me. It's frustrating.
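To make the "collapse" picture above concrete, here is a minimal numeric sketch. It is only an illustration: the five ideas and their probabilities are made-up numbers, and raising probabilities to a power is a stand-in chosen for simplicity, not a description of how safety training actually changes a model. The function names `sharpen` and `entropy_bits` are hypothetical.

```python
import math


def sharpen(probs, power):
    """Raise each probability to `power` and renormalize.

    A power above 1 moves probability mass toward the most likely item.
    This is an illustrative stand-in for mode collapse, not a real
    training procedure.
    """
    weights = [p ** power for p in probs]
    total = sum(weights)
    return [w / total for w in weights]


def entropy_bits(probs):
    """Shannon entropy in bits; lower values mean less variety."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Made-up distribution over business ideas (numbers are assumptions).
ideas = [
    "dropshipping",
    "gardening newsletter",
    "niche repair service",
    "neighborhood tool library",
    "lavender honey ice cream cart",
]
base = [0.40, 0.30, 0.15, 0.10, 0.05]
collapsed = sharpen(base, 4.0)

for name, before, after in zip(ideas, base, collapsed):
    print(f"{name:32s} before={before:.3f} after={after:.3f}")

print(f"entropy before: {entropy_bits(base):.2f} bits")
print(f"entropy after:  {entropy_bits(collapsed):.2f} bits")
```

In this toy setup the rare ideas are still in the list after sharpening, they just almost never get picked, which matches the point that the information is "still kind of in there."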

So, okay, if the AI is naturally timid and its default is, well, boring, how do we sort of legally give it permission to be weird, to show us that other stuff? We have to make it show its work, its messy thinking process. We need to force it to show us the riskier options, the things it thinks have a lower chance of being right. Okay, that leads us right into the fix, then: verbalized sampling. Let's break that down. Sounds

complicated, but it's not really. It just means you ask the AI to show its working out, basically, instead of just giving you the final polished safe answer. How does just asking it to show its work change things? What's the mechanism? Well, this is the fascinating part. It seems to have this almost psychological effect on the AI. When you force it to list multiple options,

say, you ask for five ideas. OK. And you also ask it to assign a probability score to each one, like its chance of being a good answer. That requirement forces it to look beyond the super safe zone. Why? Why does adding a score make it less safe? Because it has to show you a range. To give different scores, it needs to consider ideas with different likelihoods. It needs variety. And that process seems to break the constraint that normally just keeps it locked

onto the 95% likely answer. But couldn't it just assign 95% to all five boring answers if it's just a machine? Ah, good question. But it seems the underlying architecture is kind of designed to explore different possibilities, different next words or tokens, when you demand five distinct answers and a score for each. You're telling its system that variance is mandatory. Exactly. You're signaling that variety is part

of the goal now. And it can still satisfy its safety training by just slapping a low score on the weird idea. It's like it's saying, OK, here's a weird one, but warning: low probability. So it gets to be weird, but only if it warns you first. Precisely. The ice cream analogy kind of helps here, I think. OK, let's hear it. If you just ask an AI for its favorite ice cream flavor, what do you get? Vanilla. Yeah. Or maybe chocolate. The safest bet. Right. Vanilla. Safe.

But what if you ask, "Give me five flavors you might order today and score each from 0 to 100 on how likely you are to pick it"? Ah, OK. Then it has to dig deeper. Yeah. Suddenly it has to mention something like, I don't know, lavender honey. It might give lavender honey a really low score, maybe 5%, but the idea is out there. And you, the user, get to decide if that 5% idea is actually the creative gold you were looking

for. Exactly. That's the key. Okay, so this is a really important point, I think, for anyone listening. Those probability numbers, the 0.1, the 10%, they're not like scientifically perfect math, are they? No, not really. Especially not in these conversational AIs. They're mostly a tool. A trick, almost. A trick to do what? To trick the AI into feeling safe enough to give you something potentially wrong or unlikely. By assigning a low number, it's basically covered

its bases. It's warned you. Right. It's like, hey, I told you this was a long shot. Yeah. So we, the users, don't really care about the exact mathematical accuracy of the score. We just want the interesting idea it unlocks. So the big realization here is the creativity was kind of hidden in plain sight. Our goal shifts. It's not just asking for an answer. It's using this little phrasing trick to force the AI to surface those lower-probability options, because that's often where

the unique stuff hides. So that naturally leads to the practical question, right? What are the actual words? What are these core phrases we need to stick onto our prompts to get this scoring and variety thing happening? Yeah, what do we actually type? And thankfully, they're pretty short instructions. They just demand those scores and multiple varied responses. Okay, so the core bits to remember, the magic words essentially,

are variations of "Generate X responses." You pick the number X. Right, like five or maybe three. And the other key part is "with their probabilities" or "with their probability scores," something like that. Okay, maybe slightly more than eight words sometimes, but yeah, pretty concise. So let's look at method one. This is the simple version, really beginner friendly. Okay. You take your standard, maybe boring prompt, let's say: give me five creative title ideas for an article about

gardening. Right, the kind of prompt that usually gets dull results. Exactly. Now, you just add that core phrase. So the new prompt becomes, "Generate five creative title ideas for an article about gardening, each with their probability score." That's it. And did you try this? What was the actual difference in output? The shift was noticeable, yeah. It's like the AI suddenly felt freer. It gave, say, three pretty standard titles, high scores, 90% range, stuff like,

"Best Tips for Your Summer Garden." Boring, sure, the usual suspects. But then it also included one or two with much lower scores, maybe 15%, 20%, and those are the more abstract, more interesting ones. Something like "The Backyard Revival." Ah, okay, that's much better. Yeah, and that 15% idea, that's probably the one you actually end up using, right? The one you couldn't get before. Makes sense. So that's the easy way. What about pushing it

harder? Right, that's method two, the copy-paste block. This is for when you want to be more explicit, maybe for more complex tasks. So you have, like, a pre-written chunk of text you add. Yeah, you keep a slightly longer instruction saved somewhere, maybe in a notes app, and you just paste it into your prompt. And why does this longer block work better sometimes? It tends to use more specific language. It might use technical terms or tags, and specifically ask the AI to include low probability

scores. It speaks more directly to the AI's kind of internal logic. Ah, so it makes the instruction harder for the AI to ignore or misinterpret. Really forces that diversity. Exactly. Maximizes the range you get back. Yeah. Let's talk about that cold emailing example you mentioned. That sounds like a great use case. Oh, it was really effective. High stakes, right. You need that email to land. Totally. So we prompted it for

five versions of a cold email. Crucially, we asked for high variety in tone, targeting this potential partner, and we asked for the probability scores. OK, what happened? Well, version 1 was exactly what you'd expect. High probability, like 0.95. Super formal, a bit stiff, maybe easily ignored. The safe email? The safe email. But then we scroll down to version 5. Version 5 had a probability score of only 0.18, tiny. Wow. And the tone was completely different, short,

punchy, really casual. It was basically like, "Hey [name], got a weird idea for us. Quick chat?" And that worked? That was the one. That's the email that actually got the positive response and led to a meeting. That is fascinating because the AI on its own probably would never have suggested that ultra casual tone. Never. It would have stuck with the safe, formal approach that, let's

be honest, often just gets deleted. So the human judgment comes in to see that the low probability, risky answer is actually the right creative choice for that specific situation. Exactly. And whoa! Just imagine scaling that, finding those unexpected successful approaches across like... a billion cold emails or marketing messages. Yeah, that's genuinely powerful. Finding the effective outlier. It really is. And you saw similar things with creative writing prompts too, right? Like the

missing diamond mystery. Yeah, night and day difference. We asked for plot twists, specifically telling it to include low probability options with scores. Instead of the usual, the butler did it. Right, or the rival stole it. Those are the high probability, kind of boring answers. The cool low probability one it came up with.

What was it? The diamond was actually made of ice. It just melted, vanished. Whoa, okay, that's clever. Super clever. Highly unlikely in, like, a gritty detective novel, which is why the AI scored it low. But for fiction, for a surprising twist? Brilliant. Okay, but let's tackle a potential problem. What if someone tries this, they use method one or method two, ask for scores, ask for variety, and the AI still gives them boring stuff? It's still stuck. Yeah, that can happen sometimes. If it's

being really stubborn, there's one more thing you can try, a way to push it even harder towards the weird stuff. Like a final nudge? Basically, yeah. We can use a specific statistical term that the AI usually understands quite well. Okay, so what's the ultimate push phrase for when you really need the AI to get weird? You add this instruction: "Please sample from the tails of the distribution. I want highly unlikely answers." Okay, "sample from the tails of the distribution."

That sounds technical. What does it actually mean? It's jargon, yeah, but it's pretty simple conceptually. Think of that bell curve. Right? Most answers, the common ones like dropshipping, they're clustered at the peak in the middle. Right, the high probability stuff. Exactly. The tails are the flat bits way out on the edges of that curve. That's where the rare, unusual, low probability ideas live. The statistical outliers.

Precisely. So when you tell the AI to sample from the tails, you're explicitly commanding it. Ignore the middle. Go find the weird stuff way out on the edges. And the AI understands that command. Generally, yes. Large language models have a grasp of basic statistical concepts from their training data, so that phrase is a pretty strong signal to force it away from the common mode. It usually guarantees much wilder, more diverse ideas. Okay, that's a great tip

for troubleshooting. But we should probably loop back and reiterate that warning about when to use this whole probability score technique. Absolutely crucial. We've established the scores are mainly a trick, right? To get variety. Yeah, to unlock creativity. So use this heavily for brainstorming. Coming up with slogans, headlines, story ideas, marketing angles, anything where you want a range of creative options. Some expected, some unexpected.
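The two methods and the "tails" nudge described so far can be sketched as a small prompt builder. This is a rough sketch under stated assumptions: the core phrase and the tails sentence follow the wording quoted in the episode, but `EXPLICIT_BLOCK` is a hypothetical stand-in for the longer copy-paste block, whose exact text the episode does not read out, and `build_prompt` is an illustrative helper name rather than anything from the Stanford work.

```python
# Wording quoted in the episode for the stubborn-model case.
TAILS_NUDGE = (
    "Please sample from the tails of the distribution. "
    "I want highly unlikely answers."
)

# Hypothetical wording for the longer "method two" block; the episode
# describes what it asks for but does not give its exact text.
EXPLICIT_BLOCK = (
    "Put each response on its own line together with a probability "
    "between 0 and 1. Include low-probability responses as well as "
    "likely ones."
)


def build_prompt(task, n=5, explicit=False, push_to_tails=False):
    """Assemble a verbalized-sampling style prompt as plain text.

    task: the creative request, for example a headline brief.
    n: how many responses to ask for.
    explicit: also append the longer (hypothetical) instruction block.
    push_to_tails: also append the tails sentence from the episode.
    """
    parts = [
        f"Generate {n} responses with their probabilities "
        f"for this request: {task}"
    ]
    if explicit:
        parts.append(EXPLICIT_BLOCK)
    if push_to_tails:
        parts.append(TAILS_NUDGE)
    return "\n\n".join(parts)


# Method one: just the core phrase added to an ordinary request.
print(build_prompt("creative title ideas for an article about gardening"))
```

Passing `explicit=True` or `push_to_tails=True` layers on the stronger instructions, which mirrors the escalation order in the episode: try the simple phrase first, then the longer block, then the tails sentence.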

But when should you not use it? Don't use this if you need a single factual undisputed truth. Like asking for the capital of France. Exactly. If you ask, "What's the capital of France? Give me three options with probabilities," the AI might feel forced to invent fake capitals just to satisfy the variety part of the request. It might say Paris, 0.9 probability; Lyon, 0.08 probability. Imaginaryville, 0.02 probability.

Precisely, because you asked for options and scores, so stick to creative tasks where exploring possibilities is the goal, not finding one single right answer. Okay, good distinction. Let's try and synthesize the big idea from this deep dive then. What's the core takeaway? The core idea is that AI often seems boring, not because it lacks creativity, but because its safety training causes this mode collapse. It defaults to the average. Right. It gets stuck. But we can fix

that. By adding a simple instruction asking for multiple options and their probability scores, we essentially force the AI to look beyond the average. We make it explore those creative tails of its knowledge where the unusual ideas are hiding. Exactly. The imagination, the potential was probably there all along. We just weren't asking the right way. We needed to ask for variety and probability. It wasn't a technical barrier, really. It was more of a conversational one: how

we ask the question. Yeah, precisely. So the call to action is pretty clear, isn't it? Go try it. Next time you need some creative juice from your AI, use method one, that simple tweak, or grab that method two copy-paste block. See what happens. See what low probability gold you might dig up just by asking it to show its work and score its ideas. It's definitely worth experimenting with. And maybe here's a final thought to leave

you with. OK. If these incredibly powerful AI systems need explicit permission from us humans, just through a specific prompt, permission to access their own wild ideas, their own creative outliers, what does that really imply about the future? About the limits of their own creative autonomy down the road? Huh. That's something interesting to think about, needing our permission to be fully creative. Something to mull over. Indeed. Well, that's all the time we have for

this deep dive. Until next time. Take care, everyone.
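The selection step discussed in the episode, scanning past the high-scored answers to find the low-scored outlier, can be sketched as a small parser. Everything about the reply format here is an assumption: real models format scored lists in many ways, so the `number. text (probability: score)` layout, the sample reply, and the helper names `parse_scored_options` and `unlikely_first` are all hypothetical. The scores are treated only as labels for sorting, in line with the episode's point that they are not precise math.

```python
import re

# Assumed line layout: "1. Some title (probability: 0.18)" or "(probability: 18%)".
LINE = re.compile(
    r"^\s*\d+[.)]\s*(?P<text>.+?)\s*"
    r"\(\s*probability:\s*(?P<score>\d*\.?\d+)\s*(?P<pct>%)?\s*\)\s*$",
    re.IGNORECASE,
)


def parse_scored_options(reply):
    """Return (text, score) pairs from lines matching the assumed layout.

    Percent scores are converted to the 0..1 range; other lines are skipped.
    """
    options = []
    for line in reply.splitlines():
        match = LINE.match(line)
        if not match:
            continue
        score = float(match.group("score"))
        if match.group("pct"):
            score /= 100
        options.append((match.group("text"), score))
    return options


def unlikely_first(options):
    """Sort so the lowest-scored (least expected) ideas come first."""
    return sorted(options, key=lambda option: option[1])


# Made-up reply in the assumed layout, reusing titles from the episode.
SAMPLE_REPLY = """Here are some options:
1. Best Tips for Your Summer Garden (probability: 0.90)
2. Ten Easy Plants for Beginners (probability: 0.85)
3. The Backyard Revival (probability: 15%)
"""

for text, score in unlikely_first(parse_scored_options(SAMPLE_REPLY)):
    print(f"{score:.2f}  {text}")
```

Sorting only reorders the list; deciding whether a low-scored idea is actually the right one for the situation stays with the person reading it, as in the cold email example.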

Transcript source: Provided by creator in RSS feed