#337 Neil: 10 Free Chinese AI Tools Crushing Expensive American Software Now

00:00

You know that feeling, right? That friction. Oh, yeah. The creative buzzkill. You find this amazing tutorial for a new workflow. You get that little spark, and you think, I can actually make this. This is it. You click the link, and boom, paywall. Subscription fatigue. It's $40 a month just to experiment. Exactly. And it just prices out curiosity. It's a huge bottleneck. It kills the whole tinkering phase. You can't just play around if you have to pay up front.

00:25

Precisely. It puts innovation behind this gate. But the whole premise of the sources we've pulled together today is that we might be looking in the wrong place. While the West is so focused on, you know, three or four big companies with their subscriptions, there's this parallel ecosystem rising in the East. A massive one. We're looking at a stack of incredibly high-powered AI tools from China, from giants like Alibaba, Tencent, ByteDance. And they are, for the most part, completely

00:56

free. And these aren't lite versions or some three-day trial. These are the heavy hitters. But there's this huge disconnect, right? Western users just aren't touching them. Because of the language barrier. Right. The user interface is in Mandarin. And for most people, that's an immediate close tab. But the sources we have argue that this barrier is thinner than it looks. Way thinner. And that's what we're going to walk

01:18

through. First, how to break that barrier. And then we're going to build a whole creative toolkit: 3D modeling, realistic video, coding agents, even local hosting. All using tools that cost absolutely nothing. Zero. We're building a free toolkit. So let's start with the visual side. I think the jump from 2D to 3D is probably the steepest learning curve in all of digital art. Oh, for sure. If I want a 3D asset, I'm thinking, OK, I have to learn Blender. I have to deal with

01:46

vertices and meshes. I've tried to learn Blender three times. I quit every single time. It's a six-month commitment, easy. Right. So the sources point to a tool called Hunyuan 3D. What's the breakthrough here? What's the big idea? The breakthrough is image to 3D. That's the whole concept. Hunyuan lets you just bypass that entire modeling process. You skip it. Completely. You upload a single photo, a sneaker, a little toy car, a chair. And the AI just infers the spatial geometry.

02:16

It predicts what the back looks like from the front and spits out a fully rotatable, printable 3D mesh. And this is where we hit that first barrier we were talking about. You land on the Hunyuan site and it's all in Chinese characters. And this is the translation trick that the sources really emphasize. It sounds almost too simple, but it's the key to everything. You don't need to learn Mandarin. You just right-click on the white part of the browser page, literally

02:42

anywhere on the background. Just on the white space. Yep. And hit Translate to English. So it's a browser-level thing. Exactly. And suddenly the whole interface is in English. The buttons make sense. But, and this is a really crucial detail for Hunyuan, the default settings are not what you want. The sources point out a specific slider you have to change. The detail slider. Right. It's set to medium by default, probably to save on their servers. You have to slide that

03:07

thing all the way to high. And what does that do? It might double the generation time, so maybe two minutes instead of one. But the difference in quality, in the polygon count, is huge. If you want to 3D print it or use it in a game, that slider is everything. So this tool is turning a flat image into a spatial object in just a few minutes. What does that really mean for a creator? It means spatial design is democratized. You don't need to understand geometry to make

03:33

3D art anymore. You just need a picture. That's fascinating. Okay, let's move from objects to people. This is the other big challenge. AI image generators like Midjourney and DALL-E, they've been around, but they have this persistent problem. The plastic skin problem. Yeah, everything looks way too smooth, too perfect. The uncanny valley. Our brains are just hardwired to spot fake humans. We're looking for pores, for asymmetry, for the way light hits skin. If that's missing, we reject

03:59

it, even if it's super high-res. So the sources point to a tool called APOB, and specifically their Ultra S4K model as a solution. How is it doing things differently? APOB is optimizing for imperfection. It's kind of ironic, you know. To make it look real, you have to make it look a little worse. Right. It adds texture, those micro wrinkles, uneven lighting. It generates what the sources are calling authentic influencers. There was a great prompt example in the source

04:25

material for this one. It was a university student studying in a cozy library wearing a yellow sweater, soft rain on the window behind her. It's such a mood piece. And what APOB gets right isn't just the person, it's the whole atmosphere. It actually understands how soft light diffuses through rain on a window. But the real power here is you can change one detail without wrecking the whole image. You can swap the yellow sweater for a red one and the face stays perfectly consistent.

04:54

That brings up a good question, then. Why does texture matter more than, say, just raw resolution? Because our brains are wired to spot fake smoothness. Texture is how we read reality. A high-res photo of plastic still looks like plastic. APOB just nails the human texture. Okay, so we've got 3D objects. We have realistic people. What if I just want to edit a photo I already took? I'm not starting from scratch, I just want to change the background. The sources mention a tool called

05:22

Wan. Yeah, Wan is really interesting. It calls itself an infinite canvas. So think Photoshop, but instead of using brushes and layers, you just edit reality by typing. So you'd highlight a window and just type, replace with a tropical beach. Exactly. But this is where that free part can get a little tricky. Most of these tools use a credit system, right? You get five free generations and then you have to pay. But the sources found a specific hack for Wan. The free mode

05:48

hack. This is critical. You have to dig into the settings and there's a little toggle switch that says "generate with credits." You need to turn that off. That feels backwards. Why would they let you just turn off their monetization? It just moves you to a slower server queue. So instead of your image generating in like 10 seconds, it might take 40, but it costs you nothing. So the trade-off is just 30 seconds of patience for unlimited creativity. Exactly. Time is the

06:15

only currency here. If you're not in a huge rush, you can try out a hundred different ideas for free. It's perfect for just experimenting. I love that. Okay, let's move into video. This is where AI usually falls apart. We've all seen those videos where a person's face just morphs three times in five seconds. Yeah, it's like a fever dream. It just breaks the immersion instantly. The technical term is object permanence, isn't

06:39

it? Right. Does the AI remember what the person looked like in frame one when it gets to frame 20? Yeah. And usually the answer is a hard no. The tool that's apparently solving this is one people might actually know, but maybe not for this feature. It's CapCut. Right. Specifically the web version and the instant AI video feature. CapCut is owned by ByteDance, and they have like arguably the best video data set on the planet. So they've cracked consistency. They really have.

07:04

The source material tells this little story about a dog named Max who gets lost in a city and befriends a cat. A classic. Totally. But in a normal AI video, Max would start as a golden retriever and end up as a Labrador. With CapCut, Max stays Max. Same collar, same spots, every single clip. That is massive for storytelling. But what if one of the clips is just bad? Do you have to redo the whole thing? Nope. And that's the other key feature. You can regenerate just a

07:32

single clip. So if scene three looks weird, you just fix that one scene without touching the rest of the movie. So how does having that character permanence really change AI storytelling? It turns a bunch of random clips into an actual narrative. You can build a character arc because the audience finally recognizes the character scene to scene. OK, but sometimes you don't need a whole movie. You just want an image that moves a little. A cinemagraph. Yeah, just something

07:56

to stop the scroll on social media. The tool for this is Veer. And the best part about Veer is how accessible it is. You don't even have to make an account. That's so rare now. It's incredible. You can take that image of the student in the library we made with APOB, upload it to Veer, and just tell it: make the rain fall and make the girl breathe. And it isolates those parts and animates them. It finds motion vectors,

08:16

yeah. It's super subtle. You can even add little camera moves, like a slow zoom in for a cinematic feel. The whole thing takes like 20 seconds. Is this really about making a video or is it about something else entirely? It's about capturing attention. It's designed to arrest your eye on a busy social feed. A static image is easy to ignore. Something that moves even a little grabs you. Speaking of grabbing you, let's talk

08:41

about speed. Nothing kills creativity faster than waiting 20 minutes for a render that turns out to be wrong. The latency of creativity. It's a real flow killer. And that's where Qwen comes in. It's from Alibaba, and it is built for pure speed. How fast are we talking? Usually under two minutes. It works like ChatGPT, but instead of replying with text, it replies with a video. So you just type "drone flying over a cyberpunk city with neon lights and rain," and it just makes

09:07

it. It just spits it out. And because it's so fast, you can iterate. If the video is too dark, you don't feel like you wasted your time. You just type, okay, make the neon lights brighter and maybe more pink, and boom, a new version appears. Does that speed actually improve the quality of the final art, do you think? I really believe it does. It allows for such rapid iteration and learning. You learn how to prompt better because the feedback is instant. You can fail

09:33

50 times in an hour. Let's pivot a bit. We've been very focused on the arts. Let's talk business. The sources make this interesting distinction between a chatbot and an agent. What's the difference? It's simple. A chatbot talks to you. An agent does work for you. And the tool here is called MiniMax. It's designed to be like a digital employee. OK, give me an example. How is that different from just asking ChatGPT for business advice? So let's say you're planning a new coffee shop.

09:59

You ask MiniMax for help. It doesn't just give you a wall of text. It actually generates downloadable files for you. Wait, it makes the files? It makes the actual files. It'll create a Word document for your menu. It'll create a separate text file with your marketing plan. You just download them and use them. So we're really moving from just conversation to actual execution. Precisely.

10:20

It's the shift from advice to deliverables. It's the difference between hiring a consultant and hiring an intern who actually does the work. And if you're starting that business, you might need a simple app. Normally that means hiring someone or learning to code yourself, but the next tool, GLM 4.7, claims you can do it with just English. It's pretty mind-blowing, especially for non-technical people. You use its full stack or code mode and you just describe the app you

10:49

want. Like, make me a tip calculator. Exactly that. Create a tip calculator. I need a box for the bill amount and a slider for the percentage. GLM writes the code, it runs the code, and then a little window just pops up on your screen with the working app. You can actually use it right there. You can drag the slider, you can type in the bill amount, and if you think the numbers are too small, you just say, make the numbers bigger. You're literally debugging with your

11:11

voice. So does this mean the barrier to software engineering, at least for simple tools, is now just English? For these kinds of tools, yeah. Language is the new syntax. If you can clearly articulate what you want, you can basically build it. OK, we've covered visuals, video, a business plan, an app, but we're missing a huge piece, audio. The silent film era is over for sure, but the copyright strike era is very much alive. You use a famous song on YouTube, and you get

11:39

demonetized instantly. So the solution here is a tool called Hailuo Audio. Right. And Hailuo solves this in two ways. First, it generates music. You ask for an upbeat ukulele song for a travel vlog, and it composes a totally unique track for you. No copyright issues because it didn't exist five minutes ago. Exactly. And the second part is voice cloning. You just need about 30 seconds of your own voice to train it. What's the practical use case for that? (Laughs.) Well,

12:06

besides ego, it's mostly for efficiency. Say you record a whole video, but you stumble on just one word. The word. Instead of setting up your mic and lights and doing a whole new take, you just type that one sentence into Hailuo, generate the audio in your own voice, and patch it right in. Is this basically the end of needing the perfect take? Absolutely. It turns voice acting into a text editing process. You can fix audio like you'd fix a typo. It's incredible. Okay,

12:35

now, there's a whole group of creators out there who have great ideas, but are, you know, camera shy. They just don't want their face all over the internet. And that's a huge group of people. The tool for them is TikTok Symphony, which is, of course, from ByteDance. It uses what they call digital avatars. These are the AI-generated people who will read your script for you. Yes. But because it's ByteDance, the tech is just terrifyingly good, especially the

12:58

lip sync. It's built for vertical video on phone screens, so the mouth movements are shockingly natural. You just type the script, pick an avatar, and it performs the video for you. So are we separating the creator from the performance itself? I think we are. It lets your personality and your ideas shine through without the anxiety of actually being on camera. It removes a huge barrier for a lot of people. We have covered so many cloud-based tools, but that raises a

13:25

question. What happens if the internet goes down? Or what if one of these free tools suddenly decides to become not free? That is the ultimate vulnerability of the cloud, right? You don't own any of it. And that's why the final tool we're looking at is Pinokio. And Pinokio is not a website. No, it's an application. It's like a browser and an installer for your own local computer. Think of it as an app store for AI models that run on your hard drive. So you download the actual

13:51

AI model to your machine. You download it, you install it, and it runs completely offline. You can run versions of these image and video generators using your own computer's power. What's the main benefit of doing that? Well, privacy is a big one. No one sees what you're making. But also, zero limits. No more credit systems, no server queues, no corporate censorship. And the best part, no one can ever turn it off. Is this the ultimate safety net for a digital creator? It's

14:17

digital sovereignty. You actually own the means of production again. (Sponsor.) Okay, let's just take a breath here. We have unpacked a massive amount of information. We went from 3D modeling with Hunyuan to business agents with MiniMax, all the way to running your own local servers with Pinokio. It's an overwhelming list. It really is. And I think that's the real danger here. You look at these 11 tools and you think, I have to learn all of this right now. And please

14:45

do not do that. Do not do that. The advice from the sources is spot on here. Just pick one. Just one. If you're an artist, go play with Wan and that free mode hack. If you're more business minded, give MiniMax a try. Spend 20 minutes on it. Make something fun. Don't try to change your entire workflow in one afternoon. Exactly. The goal isn't to become an expert overnight. The goal is just to break the seal, to realize that these tools are just sitting there waiting

15:09

for you to try them. And really, the only thing standing between you and this entire ecosystem is a bit of curiosity, and maybe a right-click to hit translate. That's it. That's the whole key to the kingdom. So that's the challenge for this week. Step out of your comfortable, familiar app bubble. Go try one of these tools. See what you can build when there's no paywall there to stop you. The frontier is wide open. Go explore

15:34

it. Thanks for listening to this deep dive. We'll catch you on the next one.

Transcript source: Provided by creator in RSS feed