🎙️ EP 185: OpenAI Just Hit Google Where It Hurts (And DeepSeek Changed LLMs Forever)

00:00

I want you to picture an extremely sophisticated mind, like a super genius. Okay. They're trying to solve these really complex logic problems, but first they have to run a full detailed calculation just to remember what the capital of France is. Every single time. Every single time. It's an absurd waste of energy. It's totally counterintuitive, isn't it? But that strange inefficiency, that repetitive brute-force thinking just to recall simple facts, that's been a foundational limit

00:27

for large language models since day one. They just lack a reliable built-in memory. Okay, so let's unpack this using the sources you shared. For years, the story of AI has been all about scaling up, right? More data, more parameters. More processing power. Exactly. We've been focused on just increasing the raw brain size. But today we're seeing this fascinating shift. The sources are pointing toward the focus moving from that inefficient, constant processing to extremely

00:55

efficient architectural remembering. Right. Absolutely. And our mission today is to cut through the noise and give you those vital nuggets from this evolving landscape. This deep dive is all about specialization, personalization, and efficiency. So here's the roadmap for you. First, we're going to look at how OpenAI is finally seriously challenging Google Translate. And they're doing it by injecting emotional and tonal context into it. Then we'll

01:20

do a quick synthesis of the market. We're talking predicted mega IPOs, major corporate shifts, and a critical revaluation of data. And finally, we'll dive deep into that breakthrough memory architecture we just mentioned, DeepSeek's Engram system, which really changes the game by giving LLMs a true functional memory bank. Let's start with translation. For what, over a decade, Google Translate has just been the default. Oh, yeah. It's functional. But everyone knows the output

01:47

often sounds, well, sterile. Yeah. Mechanical. And now OpenAI has launched ChatGPT Translate, and it is a direct challenge with a major twist. And that twist is? They aren't just trying to provide a basic language conversion. They're targeting the professional need for nuance. Right. The core innovation is that this translator lets you customize the output tone right from the interface. That flexibility is the real aha moment

02:12

here, isn't it? Especially for anyone who needs to bridge cultural gaps or handle sensitive communication. It's tone-aware translation. It completely changes the standard for quality. So could you give us an example? Sure. You can input a complex sentence, say a response to a client complaint, and tell the AI, translate this into German, but make it sound highly formal, like a university professor

02:36

writing a business contract. Wow. Or you can flip that completely and ask it to translate something into Spanish and make it sound, you know, casual, like a text message between friends. This just highlights how mechanical the old systems are. Google Translate is basically operating on this massive statistical map. Yeah, it finds the most likely linguistic equivalent. But it was never built to understand the emotional or professional intent behind the words. And

02:58

that's the competitive barrier. The sources suggest this new service, which is likely powered by GPT-4 Turbo, handles over 50 languages. And it works right on mobile browsers. Right, with text or voice. For big tech, this is all about targeting that next layer of value that goes beyond just basic utility. When you think about the user base, I mean, business users, students drafting papers, international creators. Yeah. They all need translations that don't sound like

03:29

they came from a machine. A contract that sounds robotic isn't just awkward. No, it could actually hurt a deal. Exactly. It saves you that extra step of translating, then manually rewriting the whole thing to sound natural. It delivers context and quality at the same time. But wait, if the user is just typing in a prompt like "make it sound casual," isn't that essentially just a packaged prompt engineering trick? Why is this a major step forward for translation models?
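
[Editor's sketch] The "packaged prompt" reading raised in that question could look something like the snippet below. This is only an illustration under stated assumptions: the function name `build_translation_prompt` and its parameters are hypothetical, and nothing here reflects how ChatGPT Translate is actually built. It simply shows how a target language and a requested tone can be folded into one instruction, using the two examples from the conversation.

```python
def build_translation_prompt(text: str, target_language: str, tone: str) -> str:
    """Compose a plain-text instruction asking for a tone-controlled translation.

    Hypothetical helper for illustration only; not part of any real product API.
    """
    return (
        f"Translate the following text into {target_language}. "
        f"Make it sound {tone}. "
        "Return only the translation.\n\n"
        f"Text: {text}"
    )


# The two tone requests described in the episode.
formal_prompt = build_translation_prompt(
    "We are sorry about the delay with your order.",
    "German",
    "highly formal, like a university professor writing a business contract",
)
casual_prompt = build_translation_prompt(
    "Are you free for dinner tonight?",
    "Spanish",
    "casual, like a text message between friends",
)

print(formal_prompt)
print(casual_prompt)
```

The resulting string would be sent to whatever language model you use; the argument made in the episode is that a tone-aware translator goes beyond this kind of wrapper.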

03:53

That's a really good question. It's more than just a trick. The model itself is trained to recognize this huge spectrum of linguistic features, things like register, formality markers, cultural norms. Things that are usually invisible to a simple statistical engine. Precisely. It's embedding that contextual awareness into the translation process itself. So the translated language structure reflects the requested tone, not just the word

04:17

choice. So how will this focus on tone fundamentally change professional communication across different languages? It ensures translated content conveys human intent, not just literal word equivalence. Okay, let's pivot to the broader AI ecosystem. Let's do it. We'll look at shifts in productivity and personalization that affect you daily, and then at the massive capital that's driving this entire sector. On the personalization side, we're seeing users find really clever ways to integrate

04:45

AI into their workflows. The sources highlighted a trending thread on Reddit where a user curated 15 of what they called golden prompts over six months. And they claim this saved them 10 to 15 hours every single week. Prompt engineering, or prompt hacking, as some people are calling it, is definitely becoming a high-value skill. It really is. But the providers are also doing the hard work to bake that personalization right into the product itself. And that's where Google's

05:11

Personal Intelligence in Gemini comes in. Right. This is integrating the model directly with a user's personal ecosystem: your Gmail, your photos, your YouTube data, to offer these hyper-customized answers. So instead of asking a general question, you can ask something like, based on my Gmail, summarize the three key action items from the Smith meeting last week. And then find the flight receipt from my photos folder. Exactly. Their pitch is basically, we're saving you a headache

05:38

by making the AI an extension of your digital memory. And that focus on seamless integration is also happening so fast inside corporations. Microsoft, for instance, just shut down its massive internal employee library. I saw that, including subscriptions they've used for decades. And they replaced that traditional resource with an AI-powered skilling hub. Just think about the message that sends. Institutional knowledge is no longer

06:04

static documents in a folder somewhere. No. It's a dynamic conversational resource managed by an AI. That's a fundamental change. Now, moving to the money side. The legal drama between Altman and Musk, that's still unfolding. New court documents just dropped. But the market valuation story is what's truly staggering. I mean, a New York Times report suggests 2026 is shaping up to be the year of the... What do they call it? The

06:29

$2 trillion mega IPO. That's wild. We're talking about OpenAI, Anthropic and SpaceX potentially hitting the public markets almost at the same time. That would just redefine global tech investment. And investors are still betting big on specialized sectors. The AI video startup Higgsfield just hit a $1.3 billion valuation after a new funding round. Which shows massive confidence in video generation as a real market, you know, despite

06:55

the ongoing debates about quality. But the most fascinating shift, the one that speaks directly to the long-term cost of all this, involves Wikipedia. This is a huge signal that the old scraping era is ending. A hundred percent. Wikipedia has now signed actual payment deals with major tech companies: Microsoft, Meta, even Perplexity, giving them access to its content for AI training. So big tech, which for years just ingested vast amounts of content for free, is now paying a

07:23

premium for quality, verified source data. This changes the entire economics of training models. Whoa. I mean, imagine scaling your company to hundreds of millions, even a billion queries without ever paying for your core source knowledge. That was the internet model. And it seems that model is finally broken. Yeah, it suggests the market has realized that freely scraped, noisy data is just far less valuable than curated, clean data. Quality is now a scarce and valuable

07:54

commodity. So what does big tech paying Wikipedia tell us about the fundamental long-term value of quality training data? Quality training data is transitioning from a free resource into a critical, expensive commodity. OK, so let's get back to that technical challenge we introduced at the top. The inefficiency problem. Right. Why do these massive models waste so much compute power just remembering how to spell basic words

08:16

or recall simple facts? Traditionally, the race has been all about increasing the context window, you know, making the workspace bigger. We focused on how much data the model can see at one time. But you're right. The problem isn't the size of the room. It's the filing system inside the model. Precisely. They are constantly recalculating facts because they lack an indexed, built-in memory structure that separates that rote knowledge from complex reasoning. This is where DeepSeek

08:42

comes in. Enter DeepSeek. They introduced a memory lookup module for LLMs, and they named it Engram. So Engram is essentially a real memory system. A conditional one, yeah. It lets the model know when to think deeply and when to just retrieve a fact. How does that actually work? Well, think of it this way. The model's deep neural network is like a master sculptor. It takes time and compute to create a new statue.

09:06

Okay. But if you just want to reproduce the same simple flower over and over, you don't re-sculpt it every time. You use a mold, a stamp. Engram uses something called N-gram embeddings. And for our listeners, how would you define N-gram embeddings in simple terms? N-gram embeddings are basically compressed, prepackaged digital representations of common phrases or facts. They're like pre-made, organized stamps you can grab instantly. So instead of rebuilding the answer

09:33

through complex processing, it just performs a single, lightning-fast, one-step lookup for that fact. Hang on a minute. If they're just prepackaged, compressed phrases, stamps, as you put it, isn't that just a really optimized caching layer? Why is DeepSeek calling this revolutionary? Because it fundamentally changes how the model uses its depth. It's not just caching external data. It's offloading the internal memory function that was previously eating up the network's reasoning

10:01

capacity. I see. That instant retrieval frees up the core model's depth for actual complicated tasks. The math, the code generation, the truly novel logic. It stops simple recall from clogging up the intellectual arteries of the neural network. That's a good way to put it. It separates the dictionary from the logic center. And DeepSeek's research showed something critical here. What

10:22

was that? They found that general model performance actually dips significantly when you push models to a massive scale if memory isn't handled correctly. But by offloading that rote memory work to the Engram system, performance rebounds. A lot. And this whole tension between using deep processing for complex logic versus a quick lookup for simple facts, that balance is known as sparsity allocation, right? How you intelligently distribute computational

10:49

effort. Exactly. It's one of the biggest efficiency and cost problems in AI today. On a human level, you know, when we try to juggle too much information in our heads, we get context collapse. I mean, I still wrestle with prompt drift myself when I overload a context window on a challenging task. That's relatable. If you can offload the simple stuff, you can focus on the deeper thought. So the question now is whether the major players like OpenAI and Google will actually commit to

11:13

this memory-first architecture. Will they adopt Engram or something like it? Or will they just keep doubling down on sheer scale, hoping massive compute brute-forces the memory problem? If you can remember simple facts efficiently, you dramatically reduce the effort needed for that constant brute-force thinking. It changes the entire economic model for future development. So will this new memory architecture fundamentally change how all models are developed

11:41

going forward? If AI can remember efficiently, less effort is needed for continuous, expensive brute-force processing. So what does this all mean for you, the learner? We've covered an AI landscape that is rapidly moving away from these generalized, one-size-fits-all tools. And towards specialized, tone-aware services that prioritize human nuance. Yeah. And we watched the market finally recognize and actually start paying for

12:06

the true value of quality data. The overarching trend here is clearly specialization, personalization, and a relentless push for efficiency. AI is getting smarter, but not just by getting bigger, like, you know, stacking Lego blocks of data. Right. It's getting smarter by optimizing its architecture and using its resources more intelligently, which is exactly what we saw with DeepSeek's Engram.

12:27

If architectural breakthroughs like Engram become the standard and they solve this massive cost problem with fact retrieval and memory, the next great limit might not be data access or raw compute. It might be economic. How so? Well, if efficiency makes the cost of running these LLMs just collapse, does AI stop being a specialized tool for giant corporations and become a true utility accessible to every single person on Earth? That's something for you to

12:54

consider as these systems evolve. And I'd encourage you to look into the concept of sparsity allocation, how the AI allocates its effort, or even just test out that new tone-aware translation feature yourself. You'll really see the difference between mechanical output and contextual understanding. Thanks for diving deep into your sources with us. We'll catch you on the next deep dive.
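
[Editor's sketch] The "stamp versus sculptor" idea from the memory segment can be illustrated with a toy example. This is not DeepSeek's implementation, and every name below (`NGramMemory`, `represent`, `slow_compute`) is hypothetical. The sketch assumes the simplest possible reading of the episode: short token sequences map to pre-stored vectors that can be fetched in one dictionary lookup, and anything not found falls back to an expensive computation. A real system would presumably blend looked-up memory with the network's own processing rather than choosing one path outright.

```python
from typing import Callable, Dict, List, Optional, Tuple

NGram = Tuple[str, ...]
Vector = List[float]


class NGramMemory:
    """Toy lookup table mapping short token sequences to pre-stored vectors."""

    def __init__(self, max_n: int = 3) -> None:
        self.max_n = max_n
        self.table: Dict[NGram, Vector] = {}

    def store(self, tokens: List[str], vector: Vector) -> None:
        """Save a pre-made 'stamp' for a phrase of at most max_n tokens."""
        if not 1 <= len(tokens) <= self.max_n:
            raise ValueError("phrase length must be between 1 and max_n")
        self.table[tuple(tokens)] = vector

    def lookup(self, context: List[str]) -> Optional[Vector]:
        """Return the vector for the longest stored suffix of the context, if any."""
        for n in range(min(self.max_n, len(context)), 0, -1):
            hit = self.table.get(tuple(context[-n:]))
            if hit is not None:
                return hit
        return None


def represent(
    context: List[str],
    memory: NGramMemory,
    compute: Callable[[List[str]], Vector],
) -> Tuple[Vector, str]:
    """Use the one-step lookup when a stored phrase matches; otherwise compute."""
    hit = memory.lookup(context)
    if hit is not None:
        return hit, "lookup"
    return compute(context), "compute"


def slow_compute(context: List[str]) -> Vector:
    # Stand-in for the expensive "sculptor" path (the deep network).
    return [float(len(token)) for token in context]


memory = NGramMemory(max_n=3)
memory.store(["capital", "of", "France"], [1.0, 0.0, 0.0])

fact_vec, fact_path = represent(["the", "capital", "of", "France"], memory, slow_compute)
novel_vec, novel_path = represent(["prove", "this", "theorem"], memory, slow_compute)

print(fact_path)   # lookup
print(novel_path)  # compute
```

The familiar phrase is served from the table in one step, while the unfamiliar one takes the slow path, which is the division of labor the hosts describe.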

Transcript source: Provided by creator in RSS feed
