#218 Max: The AI UGC Factory – How to Generate Unlimited Product Ads Without Code | AI Fire Daily podcast

00:00

Okay, think about the usual hassle of hiring creators. Yeah. You know, all the contracts, the back and forth, weeks of waiting. Right, revisions. And you're spending maybe hundreds, even thousands, for just one 30-second ad. An ad that might totally bomb, by the way. Exactly, a huge upfront gamble before you even see if the idea connects. Now, compare that whole headache to getting an almost unlimited supply of professional-looking, kind of viral-style UGC ads. Generated

00:31

automatically by AI. Yeah. Ready to go for maybe 15 cents each. That difference is just, it's huge. It really flips the script on how marketing creative gets made. So today we're going to dig into the blueprint for building exactly that system. Right. We're unpacking this automated AI thing. Let's call it a UGC ad generator. And it uses no-code tools like n8n and Google Sheets, all hooked up to some pretty advanced AI models. So the plan is, first we'll look at what you

00:57

put in and what you get out. Super simple inputs, surprisingly powerful outputs. Then we'll get into the interesting part, comparing the three different AI workflows, like three teams competing to be the best engine for this. And finally, we'll talk about how you go from just building one ad to scaling this whole thing up into, like, a 24/7 content factory. Sounds good. Let's jump in. So the old way, getting good UGC. It's tough.

01:22

Right. And slow. Oh, yeah. Costs can be anywhere from like 50 bucks to maybe $500 for a single video. And just coordinating everything, emails, shipping products, waiting for approvals, that can easily eat up days, sometimes weeks. Plus, there's always that risk hanging over you. You pour in the time, the money, and crickets, the ad just doesn't perform. It's a real logistical nightmare for something that often has a pretty short shelf

01:47

life anyway. Okay, but the beauty of this AI approach is how little input it actually needs. It's kind of amazing. Yeah, you basically kick off the whole process just by filling out one single row in a spreadsheet, like Google Sheets. Simple data in, finished content out. We found there are basically five key pieces of info you need. Right. Number one, the URL for the product photo. Then who's your target audience? Yeah. Your ICP. Yep. What product features do you want

02:14

to talk about? Uh-huh. And where should the video look like it's taking place? The setting, a kitchen, a car. And the last one, which is important for testing, is choosing which AI model setup you want to use for that specific ad. And what comes out the other end? It's pretty slick.
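The five inputs just listed map naturally onto one spreadsheet row. A minimal Python sketch of that row, assuming hypothetical field names and workflow labels (the episode does not give the actual column headers), might look like this:

```python
from dataclasses import dataclass

# Hypothetical labels for the three model setups; the real sheet values are not stated.
WORKFLOWS = {"nano_banana_veo", "sora_2", "veo_only"}


@dataclass
class AdRequest:
    """One spreadsheet row: the five inputs plus a status column."""
    product_photo_url: str
    target_audience: str      # the ICP
    product_features: str
    video_setting: str        # e.g. "kitchen" or "car"
    workflow: str             # which AI model setup to use for this ad
    status: str = "ready"

    def problems(self) -> list[str]:
        """Return a list of human-readable issues; empty means the row is usable."""
        issues = []
        if not self.product_photo_url.startswith(("http://", "https://")):
            issues.append("product_photo_url must be a URL")
        for name in ("target_audience", "product_features", "video_setting"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is empty")
        if self.workflow not in WORKFLOWS:
            issues.append(f"unknown workflow: {self.workflow}")
        return issues


row = AdRequest(
    product_photo_url="https://example.com/gummies.jpg",  # placeholder URL
    target_audience="busy commuters",
    product_features="easy to take on the go",
    video_setting="car",
    workflow="nano_banana_veo",
)
```

Checking `row.problems()` before marking a row as ready is one cheap way to avoid paying for a generation that was going to fail on a missing input.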

02:29

You get a fully automated, ready-to-use, maybe eight-to-ten-second UGC-style video ad. What, professional? Yeah, surprisingly so. It includes a realistic-looking human presenter, AI-generated dialogue that sounds pretty natural, and critically, it's ready to post immediately. Already formatted for TikTok, Reels, you know, vertical video. You just get a video link, drop it straight into your ad campaign. The real power here, though, it seems, is the scale and the speed of testing. Exactly.

02:58

That's the game changer. You can test like 50 different creative angles at the same time, focus on different features, different settings. And see what works almost instantly instead of waiting weeks per test. Right. It slashes that time-to-content. That's the edge. So boiling it down, what's the main advantage of this scale compared to just hiring one person? It's automatically testing tons of different creative ideas at scale, all at once. Okay, got it. Let's get into those

03:22

three workflows then. This is where you see the different ways to build this engine. Yeah, and we should probably define a couple of terms first. We keep saying n8n. Right, so n8n, think of it like the supervisor on a factory floor. It's a no-code tool that connects all the different steps and APIs together in the right sequence. It tells everything what to do and when. And the other piece is fal.ai. Ah, yes, fal.ai. That's basically a service that bundles up a bunch of

03:48

different cutting-edge AI models. So instead of juggling multiple accounts and APIs, fal.ai gives us one place to access models like Nano Banana and Veo, which we used here. Makes things simpler. Okay, so Workflow 1, this is the one you recommend, the pro version. Yeah, this is the one we landed on as the most reliable. It's a two-step process, Nano Banana plus Veo 3.1. Step 1 uses Nano Banana.

04:10

What's that? It's an AI image model. Its job is to take your product photo and your prompt and create a new, really realistic image of a person actually holding or using your product correctly. Okay, so it generates the person with the product. Then step two. Step two uses Veo 3.1, which is an AI video model. It takes that image Nano Banana just made and animates it, adds the talking, the subtle movements. And the

04:37

big advantage here is? Accuracy, mainly. The product usually looks right, held correctly. And really importantly, this two-step thing avoids that horrible static thumbnail problem. Because the first frame isn't just the product photo. It's the AI-generated person already moving. Exactly. The video starts with action, which is way better for grabbing attention on social feeds. What's the downside? Well, it's

05:00

two steps, so it takes a little longer. And it costs a bit more, came out to around 32 cents per video in our test. Okay, workflow two then, the speed demon. Yeah, the one everyone wants to work: Sora 2 only, just straight image-to-video. And it's fast and cheap. Super fast and the cheapest. We clocked it at about 15 cents for a 10-second video. But there's always a catch. Big catch here. Sora 2, at least right

05:23

now, has pretty tight content rules. It often flags and blocks realistic AI-generated human faces. Oof. That's a non-starter for believable UGC ads. Pretty much. Plus, you still get that static product photo as the first frame, that bad thumbnail issue again. All right. And workflow three, the middle ground. Kinda. This one uses Veo 3.1 only, so direct image-to-video like Sora 2. It's reasonably fast, about 30 cents for an 8-second clip. And does it have the face restriction

05:54

problem? Nope, no face restrictions, which is good. So what's wrong with this one? Oh, this one had a major flaw. A deal breaker, honestly. Product alteration. Meaning? It kept changing the product. We were using this example of a glass jar of gummy supplements. Veo 3.1 kept turning the jar into a flexible bag in the video. Seriously, it just swapped the packaging? Yep. Consistently. We wasted a bunch of runs trying

06:19

to fix it. We even nicknamed it the gummy thief internally because it kept stealing the jar. Wow. Why would it do that? Just misinterpret the image? Our best guess is it over-indexes on context. Like if the prompt talks about grabbing something quickly on the way out, the AI thinks a quick grab must be a flexible bag and ignores the fact that the input image was clearly a rigid jar. That's not good for brand consistency.
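The three workflows compared above can be summarized as a small lookup table. This is only a sketch: the keys and step names are illustrative labels chosen here, not fal.ai model identifiers, and the issue notes paraphrase what the hosts report from their own tests.

```python
# Illustrative labels only; none of these strings are real fal.ai model IDs.
WORKFLOW_STEPS = {
    "nano_banana_veo": ["nano_banana_image", "veo_3_1_video"],  # Workflow 1
    "sora_2": ["sora_2_video"],                                  # Workflow 2
    "veo_only": ["veo_3_1_video"],                               # Workflow 3
}

# Issues reported in the episode for each setup.
KNOWN_ISSUES = {
    "nano_banana_veo": ["two steps, so slower and a bit more expensive"],
    "sora_2": ["often blocks realistic human faces", "static first frame"],
    "veo_only": ["may alter the product, e.g. jar rendered as a bag"],
}


def plan(workflow: str) -> list[str]:
    """Return the ordered generation steps for the chosen workflow label."""
    try:
        return WORKFLOW_STEPS[workflow]
    except KeyError:
        raise ValueError(f"unknown workflow: {workflow}") from None


steps = plan("nano_banana_veo")
```

Keeping the choice as data rather than branching logic is what makes it easy to route each spreadsheet row down a different path and compare results.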

06:43

Not good at all. It completely undermines the point if the product isn't shown accurately. Okay, so given that risk with Veo 3.1 alone, why is that more complex two-step process in Workflow 1 necessary? To make sure the product looks right. And crucially, to get that dynamic first frame with action. Right. Reliability wins out. Okay, so Workflow 1 it is. Let's peek behind the curtain now at how this actually

07:06

works in n8n. Sure. So prerequisites: you need n8n set up, a Google Sheet ready, your fal.ai account, and an OpenAI API key for the brains. And the workflow starts how? It kicks off with an n8n trigger. It's basically just watching that Google Sheet, looking for any new row you mark as ready. Finds a ready row, then

07:24

what? It hits a Switch node. That's just like a traffic controller. It looks at which AI model you chose in the spreadsheet for that row and sends the job down the right path. For our winning workflow, it sends it to the Nano Banana path first. Okay, so node one in that path is the image prompt agent. What's that doing? This uses an AI model like GPT-4o as an agent. We give it a really detailed system prompt. Think of the system prompt like the AI's job description and rulebook. You're

07:53

telling it exactly how to behave. Exactly. We tell it, your job is to write a prompt for an image generation AI. Make the image hyper-realistic. Think lifelike skin, tiny imperfections, maybe a selfie angle. And critically, make sure the product in the image looks exactly like the one in the photo URL we gave you. So it crafts the instructions for Nano Banana, then Nano Banana starts making the image. But that takes time. Right. AI generation isn't instant. So that

08:19

brings us to the polling loop. This is super important. Because you can't just wait indefinitely. Nope. The workflow uses a Wait node to pause for a bit. Then an IF node to check the status from fal.ai. Is the image done yet? If not, it loops back, waits again, checks again. Keeps knocking on the door until fal.ai says completed. Precisely. That loop stops the whole system from timing out or breaking while it waits. Okay, image is done. Now, this next part is really

08:45

interesting. Node 7, analyze generated image. You use OpenAI Vision here. Yeah, this is maybe the cleverest bit. You take the image that Nano Banana just created, and you feed it back into another AI, GPT-4o, with vision capabilities. Hold on, you use AI number two to look at what AI number one just made? Why? Seems redundant. It's like quality control. It solves the problem of AI hallucination. Sometimes the first AI might slightly mess up or maybe the image isn't quite what you

09:15

prompted. Ah, so the vision AI describes what's actually in the image. Exactly. It looks at the picture and says, OK, I see a woman with brown hair sitting in a blue car holding a white jar. It confirms the visual reality. And that description is then used for the next step. Yes. That description becomes a key input for Node 8, the video prompt agent. Now, the AI writing the video script knows for sure it needs to write dialogue for a woman in a blue car holding a white jar, not a red

09:45

truck or a green bag. That's smart. It anchors the video script to the actual image that was generated, ensuring consistency. Totally. Prevents weird disconnects between the visuals and the dialogue. I can imagine getting these prompts right, especially chained together like this, must be tricky. You mentioned prompt drift. I still wrestle with prompt drift myself when managing API calls, getting the JSON clean and consistent. Oh yeah, it's a constant thing. Prompt drift.
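The hand-off described above, where the vision model's description of the generated image anchors the video prompt, could be sketched as a plain function. This is an assumption-laden illustration: the function name, the wording of each prompt line, and the explicit reminder lines are made up here and are not the hosts' actual system prompt.

```python
def build_video_prompt(
    image_description: str,
    audience: str,
    features: str,
    seconds: int = 8,
) -> str:
    """Assemble the text handed to the video prompt agent (hypothetical wording)."""
    lines = [
        "Write a video prompt for an image-to-video model.",
        # Anchor the script to what the vision model actually saw.
        f"The starting frame shows: {image_description.strip()}",
        f"Target audience: {audience.strip()}",
        f"Product features to mention: {features.strip()}",
        f"Include about {seconds} seconds of spontaneous, natural dialogue.",
        # Restating the rules late in the chain is one way to limit prompt drift.
        "Reminder: keep the look ultra-realistic, not stylized.",
        "Reminder: the product must match the starting frame exactly.",
    ]
    return "\n".join(lines)


prompt = build_video_prompt(
    "a woman with brown hair sitting in a blue car holding a white jar",
    "busy commuters",
    "easy to take on the go",
)
```

The point of the design is that the description comes from the generated image itself, so the dialogue cannot drift toward a scene that was requested but never rendered.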

10:11

It's like playing telephone with the AI. You give it instructions, but by the third or fourth step in a chain, the AI might kind of start interpreting things a bit loosely. Forgets the original strict rules. Yeah, you asked for ultra-realistic, but maybe it starts leaning a bit more stylized down the line if you're not careful with how you pass context. It requires careful prompt engineering and sometimes explicit reminders

10:32

in later prompts. Makes sense. Okay, so Node 8, the video prompt agent, uses the audience info, product features, and that verified image description. To generate the final video prompt for Veo 3.1. This includes writing the eight seconds of dialogue the person should say, making it sound spontaneous and natural, matching the scene. Got it. And the last few steps? Nodes 9 through 12 basically send the final prompt to Veo 3.1 to generate the video, run another

10:59

polling loop to wait for that to finish. The waiting. Yep, more waiting. And then the final step, update the Google Sheet, mark the status as finished, and paste in the URL of the final video. Boom, ad generated. So to recap that complex part, what's the absolutely essential function of analyzing the generated image mid-workflow? It guarantees the video script matches the visual reality of the generated image. Consistency. Okay, let's

11:24

talk results. The brass tacks. Cost. You said the winning workflow, Nano Banana plus Veo 3.1, landed around $0.18 an ad. Yeah, about $0.18. And remember, Sora 2 was cheaper at $0.10. Veo 3.1 only was $0.15. But the comparison isn't really between $0.10, $0.15, and $0.18, is it? It's between 18 cents and, what was it, $50 to $500. Exactly. That's the Moneyball moment, right? We're talking orders of magnitude cheaper

11:49

than traditional methods. Whoa. Okay, just thinking about that, testing 50 different creative ideas for less than $10, compared to maybe thousands for just one human creator test. That's the democratization aspect. Small teams, even solo founders, can suddenly test creative at a scale that was previously only possible for huge agencies. That's a massive advantage for anyone who jumps on this early. But cost isn't everything. What about the quality? Do the 18-cent ads actually look good? That's

12:20

the crucial question. Because, you know, saving 8 cents per ad sounds great. But if the cheaper ads don't convert because they look bad or have issues... Then it's a false economy. Especially if you're running thousands of these. Right. And this is where reliability becomes the deciding factor. Workflow One, the Nano Banana plus Veo 3.1 combo, was the clear winner on quality and reliability. Why specifically? Best natural look, consistent product accuracy, no gummy thief incidents,

12:46

and that vital action-first frame. You pay a few cents more, but you get an ad that's much more likely to actually work on social platforms. So even though Sora 2, Workflow 2, is cheapest... Yeah, the face blocking and the static first frame really hurt its potential for genuine-looking UGC. It's maybe useful for some things, but not ideal. And Veo 3.1 only, Workflow 3? Dead on arrival because of the product alteration risk. Turning a jar into a bag? You just can't have

13:15

that. It makes the cost savings totally irrelevant. So the real benefit of spending that extra, what, 3 to 8 cents on the winning workflow boils down to? Quality and reliability simply outweigh tiny cost savings, especially avoiding critical errors like product changes. Okay, so you've built your generator. It's making great ads one by one using Workflow One. How do you scale this up? Go from a little workshop to a full-blown factory. It actually starts pretty simply, right? With batch

13:40

processing. Yeah, you just tweak that initial Google Sheet trigger node in n8n. By default, it's set to only grab the first row it finds marked ready. You just untick that box? Basically, yeah. Remove that limit. Now, n8n will grab all the rows marked ready. So you could line up, say, 20 different ad ideas in your sheet, different angles, features, audiences. Hit ready on all of them, and the workflow will just chew through them one after another, maybe overnight while

14:05

you sleep. That's the factory mode unlocked. But just making more isn't enough. You want to make better ads too. Right. Optimization. This goes back to those system prompts we talked about. You can create different versions tailored to specific needs. Like if you have a luxury product, you tweak the prompt to ask for a premium aesthetic, maybe soft, elegant lighting. Or for a fitness gadget, you'd write prompts demanding energetic movement, dynamic angles, maybe even visible

14:33

sweat for realism. You can bake the brand tone right into the generation instructions. And you should also test different messages, not just visuals, right? Absolutely. Use four rows for the same product image and setting, but in row one, focus the script on convenience. Row two, results. Row three, value. Row four, maybe social proof. Then you run them all and see which message actually connects with people. Exactly. Let the real-world data tell you what resonates. Which

15:01

brings us to the really advanced move. Closing the loop. This is where it gets really powerful, integrating performance data back in. Yeah. You add a webhook node to your workflow. This node listens for data coming back from your ad platforms, like Facebook Ads or TikTok Ads Manager. Pulling in actual results, views, clicks, conversions. You configure the ad platform to send that data

15:24

to the webhook. Then you have n8n write that performance data back into new columns in your original Google Sheet, right next to the ad it belongs to. Okay, so now your spreadsheet shows not just the ad, but how well it did. And here's the final piece. You add another AI agent. Its job? Read the sheet, analyze the performance data, and figure out what's working best. And then it automatically creates new ready rows

15:49

based on the winners. If the convenience angle ads got way better click-through rates, this analysis agent automatically queues up 10 more variations on the convenience theme. Wow. So the system starts teaching itself and improving automatically based on real results. It becomes a self-optimizing content engine, a true factory that not only produces but also iterates and improves based on live market feedback. Okay, so what's the ultimate goal, the big win from

16:14

integrating all that factory mode stuff? Creating a self-improving loop that automatically tests, learns, and iterates using real performance data. Let's just zoom out one last time and grasp the scale here. Ten traditional UGC ads. You're looking at, what, $500 minimum, maybe up to $5,000? And weeks of work, coordination, back and forth. Right, versus 10 AI-generated ads using this winning workflow, costing maybe $1.80 total. Yeah, maybe $1.80, $2 max, and generated in

16:46

minutes, ready to deploy almost instantly. It's not just cheaper. It's a completely different economic model for creating marketing assets. The advantage clearly goes to whoever adopts this kind of automation and learns to iterate quickly, like we said, testing 50 ideas for the cost of maybe one old-school ad. And what about future-proofing? Are we going to have to redo this whole thing when Sora 4 or Veo 5 comes out? That's another beautiful part of using tools

17:10

like n8n and fal.ai. The core logic, the workflow structure, stays the same. So when a better, faster, cheaper AI model drops? You literally just go into your n8n workflow, find the node that calls the AI model, and update the model name in the settings. Maybe tweak the prompt slightly if needed. And your entire factory instantly upgrades to the next generation of AI content. Exactly. The system itself is designed to be adaptable. So for you listening, we really encourage you

17:38

to start exploring these ideas. Autonomous workflows, clever prompt engineering. The barriers to creating high-quality, scalable content are rapidly disappearing. Really, the main constraint now is just the quality of the ideas you feed into that initial spreadsheet. So go build your engine.
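One piece of that engine worth sketching is the polling loop from the build walkthrough: wait, check the job status, and repeat until the service reports completion. The version below is a generic Python sketch, not the n8n node configuration. The status string `"completed"`, the default wait time, and the `max_checks` cap are assumptions added here; the status check is passed in as a function so no real fal.ai call is implied.

```python
import time
from typing import Callable


def wait_for_completion(
    check_status: Callable[[], str],
    wait_seconds: float = 10.0,
    max_checks: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Wait, then check, until the job reports "completed".

    Returns the number of checks used. Raises TimeoutError after
    max_checks attempts so the loop cannot run forever.
    """
    for attempt in range(1, max_checks + 1):
        sleep(wait_seconds)                 # the "wait node"
        if check_status() == "completed":   # the status check
            return attempt
    raise TimeoutError(f"job not completed after {max_checks} checks")


# Usage with a stand-in status source instead of a real API call.
fake_statuses = iter(["in_progress", "in_progress", "completed"])
checks_used = wait_for_completion(
    lambda: next(fake_statuses),
    wait_seconds=0,
    sleep=lambda _seconds: None,  # skip real sleeping in the demo
)
```

The cap on attempts is a safeguard chosen for this sketch; the episode only describes looping until the job is done.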

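The closing cost comparison is simple arithmetic, sketched here with the figures quoted in the results segment: about 18 cents per AI ad versus $50 to $500 per traditional video. Amounts are kept in whole cents to avoid floating-point rounding; the function names are made up for illustration.

```python
AI_COST_CENTS_PER_AD = 18                 # winning workflow, as quoted in the results segment
TRADITIONAL_DOLLARS_PER_AD = (50, 500)    # low and high estimates quoted in the episode


def ai_batch_cents(ad_count: int) -> int:
    """Total AI generation cost in cents for ad_count ads."""
    return ad_count * AI_COST_CENTS_PER_AD


def traditional_batch_dollars(ad_count: int) -> tuple[int, int]:
    """Low and high total cost in dollars for ad_count traditional UGC ads."""
    low, high = TRADITIONAL_DOLLARS_PER_AD
    return ad_count * low, ad_count * high


def as_dollars(cents: int) -> str:
    """Format whole cents as a dollar string."""
    return f"${cents // 100}.{cents % 100:02d}"


ten_ads = as_dollars(ai_batch_cents(10))
fifty_ads = as_dollars(ai_batch_cents(50))
ten_traditional = traditional_batch_dollars(10)
```

With these inputs, ten AI ads come to $1.80 and fifty to $9.00, against $500 to $5,000 for ten traditional ads, which matches the totals the hosts cite.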
Transcript source: Provided by creator in RSS feed