🎙️ EP 63: The Excel Bot That Beat McKinsey Analysts (and It’s Just Getting Started)

00:00

Imagine for a second an AI, one that finishes those really complex Excel tasks, but in seconds. Well, our sources suggest it's not just theory anymore. It's reportedly here. And actually beating human experts at it. That's the kicker. We're digging into AI agents today. Yeah. From those spreadsheet wizards to this tiny AI that seems to kind of think like a brain. Get ready for some big shifts. Welcome back to the Deep Dive,

00:26

everyone. Today, we are really unpacking your latest sources, focusing hard on the cutting edge of AI agents. Okay, let's get into it. Yeah, let's do it. Our mission, like always, is navigating this really fast-moving AI landscape. We try to boil down what matters most for you. Think of it as your shortcut, you know, understanding where AI is heading and how it's changing work, like right now. We'll kick off with an AI agent that's making some serious waves in finance.

00:51

Then we'll kind of zoom out, look at how AI is shaping other areas, video creation, corporate strategy, that sort of thing. And finally, we've got this really surprising new AI architecture. Honestly, it could change everything we thought we knew about building intelligent systems. It's pretty wild. Okay. Let's jump right into segment one. Our sources are calling it an Excel revolution. Big words. Yeah. And it centers on this new spreadsheet native AI agent. It's called Shortcut. Right.

01:20

And this isn't just like a fancy macro. No, not at all. It's way beyond that. This agent, it actually builds entire financial models, fills out data. And get this: it one-shots tasks, tasks usually given to junior analysts. Exactly. There's this demo where it auto-populated a whole discounted cash flow model, a DCF. Which is... Pretty complex stuff, yeah. Totally. And it did it using raw data, just pulled straight from SEC filings, no human input. Wow. So it handles the data gathering,

01:49

the formatting, the formulas, everything. The whole workflow. Data, formatting, formulas, checking, all in one go. It's kind of like watching a whole team work, but super fast, like a bot. Okay, that's impressive. What about performance? How does it stack up? Well, the sources detailed these benchmarks. They put Shortcut up against actual junior analysts. From like... top firms. Yeah. Private equity, banking, consulting, product roles, the big leagues. OK. And here's the kicker.
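
For readers who have not built one, the core arithmetic of the discounted cash flow model mentioned above can be sketched in a few lines. This is a minimal illustration, not how Shortcut works: the function name, the Gordon-growth terminal value, and every number are assumptions chosen for the example.

```python
def discounted_cash_flow(cash_flows, discount_rate, terminal_growth):
    """Present value of projected cash flows plus a Gordon-growth terminal value.

    cash_flows: projected free cash flows for years 1..N (hypothetical inputs).
    discount_rate: annual discount rate as a decimal, e.g. 0.10.
    terminal_growth: assumed perpetual growth rate after year N.
    """
    if discount_rate <= terminal_growth:
        raise ValueError("discount rate must exceed terminal growth")
    # Discount each projected cash flow back to today.
    pv_projection = sum(
        cf / (1 + discount_rate) ** year
        for year, cf in enumerate(cash_flows, start=1)
    )
    # Value everything after year N as a growing perpetuity, then discount it.
    terminal_value = cash_flows[-1] * (1 + terminal_growth) / (discount_rate - terminal_growth)
    pv_terminal = terminal_value / (1 + discount_rate) ** len(cash_flows)
    return pv_projection + pv_terminal


# Made-up figures: five years of cash flows, 10% discount rate, 2% terminal growth.
value = discounted_cash_flow([100, 110, 120, 130, 140], 0.10, 0.02)
print(round(value, 2))
```

As a sanity check on the formula, a flat 100 per year forever at a 10% discount rate and zero growth is worth 1,000 today, and the function returns that for `discounted_cash_flow([100], 0.10, 0.0)`.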

02:18

Shortcut reportedly won 89.1% of the time. Eighty-nine percent, scored by managers from those firms. That's what the sources say. It's pretty dominant. OK, well. Any other comparisons? It also apparently beat the ChatGPT agent in 90% of head-to-head tests. 90%. Okay. Got to add the usual caveats, right? Early benchmarks, maybe fuzzy criteria for winning. Sure. Absolutely. Need to see more verification. But the buzz, it's definitely there. Oh, yeah.

02:46

Finance influencers calling it a ChatGPT moment for Excel. You can practically hear the jaws dropping in finance departments. It really makes you think about that warning from Dario Amodei, you know, about... potentially half of junior office jobs just vanishing because of AI. Because this isn't just automating one cell, it's full task automation. Right. It's the whole process, getting data, building the model, writing analysis, even exporting it. It's a huge shift, like machines

03:14

and factories. But now it's bots, one-shotting spreadsheets. Makes you really consider the skills we'll need going forward. So with Shortcut showing this kind of power, what does this really mean right now for those detailed, repetitive financial analysis tasks? Is it a total rewrite? Oh, absolutely. It signals a clear shift. Yeah. That routine spreadsheet grunt work. It's automatable now, plain and simple. Okay, so Shortcut's shaking up Excel, but that's just one piece, right? Exactly.

03:44

The sources show a much bigger picture. AI's capabilities are expanding everywhere. The industry's shifting fast. So moving beyond spreadsheets, what else is happening? Well... For one, we're seeing more complex systems. Take Claude, right? They've added this sub-agents feature. Sub-agents. What's that? It lets you build basically teams of agents. They can tackle multiple tasks at the same time working in parallel. So you give it a project and it kind of assembles its own

04:07

digital team. Pretty much. Imagine that for complex workflows. And then there's this thing called JSON prompting. Sources are calling it the most underrated AI skill of 2025. Yeah, JSON prompting.
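
To make JSON prompting concrete, here is a minimal sketch of what a structured prompt and a reply check might look like. The field names (`task`, `constraints`, `output_format`) are hypothetical, not a standard, and no particular model API is assumed; the code only builds the prompt text and validates a reply string.

```python
import json

# Hypothetical structured prompt; the keys are illustrative, not a standard schema.
prompt_spec = {
    "task": "summarize",
    "input": "Q2 revenue rose 12% year over year while operating costs fell 3%.",
    "constraints": {"max_sentences": 2, "tone": "neutral"},
    "output_format": {"summary": "string", "key_figures": "list of strings"},
}


def build_prompt(spec: dict) -> str:
    """Serialize the spec so the model receives explicit, machine-readable instructions."""
    return (
        "Follow this JSON specification exactly and reply with JSON only:\n"
        + json.dumps(spec, indent=2)
    )


def parse_reply(reply: str, required_keys: set) -> dict:
    """Parse a model reply and confirm it contains every requested key."""
    data = json.loads(reply)  # raises json.JSONDecodeError on malformed JSON
    missing = required_keys - set(data)
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    return data


print(build_prompt(prompt_spec))
```

The structure mainly makes replies easy to check mechanically: `parse_reply` rejects anything that is not valid JSON or that omits a requested key, but it cannot confirm that the content itself is accurate.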

04:20

It sounds technical, but it's basically a way to give AI really precise instructions so it knows exactly what you need. Exactly. Structured instructions lead to reliable answers and, crucially, no hallucinations. Turns models like GPT, Gemini, Claude into consistent, reliable agents. Super important for professional use. Okay, makes sense. What else? We're also seeing AI just blend into everyday tools more. Google Search's AI Mode? Yeah, it can now see homework. You can upload a

04:48

PDF or just use your camera. And ask AI Mode for answers based on it. Yep. Makes research or figuring stuff out way faster. And ChatGPT has that new study mode. Right. Like a personalized tutor in your pocket. Adaptive lessons, the works. Feels like a real study partner. Interesting. What about the creative side? Runway just launched Aleph. It's a new video model. In-context video generation. Multitask visual stuff. So better AI video tools. Letting creators do wilder things,

05:17

basically. More control, more flexibility for imaginative content. And then there was that story about Meta. Oh, yeah. The poaching attempt. Wild story. They reportedly tried to hire away like half the staff from Thinking Machines Lab. A top research group. And Zuckerberg himself was apparently messaging people. Offers over a billion dollars mentioned. Yeah. Not one person left. Wow. Tells you a lot about what top AI talent actually wants, doesn't it? Yeah. It's

05:45

not just the money. Meanwhile, you've got the National Science Foundation, the NSF, putting serious cash into AI research. $100 million. That's significant. And they're partnering with big names, Simons Foundation, NIST, Capital One, Intel. Yeah, it's a big deal. It's not just about apps. It's foundational stuff. Materials science, new tech, pushing boundaries in the physical world with AI's help. Okay, let's connect that back. If we look at that Meta poaching attempt

06:09

failing. What does that tell us about the AI talent market beyond just the insane money? It really says top AI talent seeks more than just cash. They want meaningful work, innovative places, real frontier stuff. All right, let's shift gears slightly. How about a quick round of some new AI tools and maybe some industry quick hits, stuff that's changing things day to day? Yeah, let's do it. Rapid fire. First up. Kasi AI. Kasi AI. Turns long videos into short viral clips.

06:38

It finds the good bits automatically. Think AI content strategist for creators. Big time saver. Potential reach booster. Okay. Useful. Next. Cache Scene. Removes backgrounds. Enhances photos instantly. Super common task. Now super fast. For designers. Or just, you know, touching up photos. Handy. And Immersity for mobile. Turn your flat 2D images into immersive 3D videos. Kind of cool for making more dynamic content right on your phone. Neat.

07:05

What about layout.dev? This one's for developers, designers maybe. Turn simple ideas like a sketch or just text into working prototypes really fast. So speeds up that early ideation phase. Massively. Could really accelerate getting projects off the ground. Okay, those are some tools. Any quick industry news bites? Yeah, a few. Z.ai released a model cheaper than DeepSeek. Yeah. And it's free to download. Shows things are getting more accessible, maybe more decentralized. Interesting. What else?

07:30

Meta's going to let job candidates use AI during coding tests. Really? Yeah. Kind of acknowledging it's just part of the workflow now. Pretty pragmatic. Makes sense. Google. They revealed an internal site, AI Savvy Google, for their non-tech employees. Just trying to get everyone up to speed on AI. Smart internal education. Good idea. Anthropic. They published research on using automated agents to audit other AI models. AI auditing AI. Yeah.

07:58

Crucial for safety, reliability, making sure these things behave, especially as they get more autonomous. Definitely important. And finally, Spotify. Just hinting at a more conversational voice AI interface down the line. You know, chat with your AI DJ about playlists based on your mood. Kind of like talking to a friend. So all these different tools and updates. Yeah. How do these smaller things impact us, you know, directly? They basically bring powerful AI into

08:24

everyday stuff, often behind the scenes. Makes digital life smoother, more intuitive. Okay. Fascinating stuff. Before we get to maybe the most mind-bending part, this new AI architecture, let's just take a quick moment for our sponsor. Sponsor. And we are back. Okay, that revolutionary architecture we teased. Seriously, this could shake things up. It challenges some basic ideas

08:47

about how we build AI. Right. The sources talk about something called the Hierarchical Reasoning Model, HRM, from a group called Sapient Intelligence. And the key thing, it's tiny, but it uses this brain-like architecture. That's the buzz. Brain-like? How so? It's got two main parts working together. A planner module. That's the slow thinking part. Strategizing, like planning chess moves. Okay, thinking ahead. And then a worker module.

09:13

That's the fast-acting part. Executing, like how your brain instantly recognizes a face, you know, quick processing. So they work together in a hierarchy. Exactly. They plan and solve in one go, one single forward pass. It's different from how many big language models work. And you said it's tiny. Unbelievably tiny. Only 27 million parameters. 27 million. How does that compare? GPT-1, the original. 117 million parameters. HRM is less than a quarter of the size. Wow.
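
The slow planner and fast worker idea can be pictured as two nested loops running at different speeds. The sketch below is only an analogy under stated assumptions: it is not Sapient Intelligence's architecture, it contains no neural network, and the function name, the halfway subgoal rule, and the step sizes are all invented for illustration.

```python
def hierarchical_solve(start, target, plan_steps=4, work_steps=3):
    """Toy two-timescale loop: a slow planner sets subgoals, a fast worker chases them.

    Hypothetical illustration only; the numeric update rules are made up.
    """
    state = start
    trace = []
    for _ in range(plan_steps):
        # Slow module: runs once per outer step and picks a subgoal
        # halfway between the current state and the final target.
        subgoal = state + (target - state) / 2
        for _ in range(work_steps):
            # Fast module: runs several times per plan, each time closing
            # half of the remaining gap to the current subgoal.
            state += 0.5 * (subgoal - state)
            trace.append(state)
    return state, trace


final_state, trace = hierarchical_solve(start=0.0, target=8.0)
print(len(trace), round(final_state, 3))
```

The point of the nesting is that the worker updates many times for every planner update, so the plan changes slowly while execution moves quickly, all inside a single call.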

09:40

Okay. Tiny size. But what about performance? This is where it gets crazy. On the ARC-AGI benchmark, think of it like an AI IQ test for fluid intelligence. Yeah. HRM scored 40.3%. Claude 3.7, 21.2%. Another model, o3-mini-high, got 34.5%. Wait. So this tiny model. It outperformed much bigger ones on this reasoning test. Significantly outperformed them. Okay. Are there specific examples? Yeah. And they really highlight the difference. It solved 55% of Sudoku

10:12

extreme puzzles. And other models? Claude and OpenAI scored 0% on those. Zero. Okay. Any others? It found the optimal path to nearly 75% of these really complex 30 by 30 mazes. And the others? Also 0%. It's not just a little better. It's solving problems the big guys just can't. Yeah, I still wrestle with prompt drift myself sometimes, getting the AI to stay on track. So the idea of a tiny model having this kind of consistent step-by-step reasoning, it's really fascinating.

10:36

It makes you wonder if we've been too focused on just scaling up. Whoa, okay, that's... That's genuinely remarkable. A tiny AI outperforming models vastly larger, solving problems they can't even touch. It really does challenge the whole bigger is better idea in AI scaling. That's kind of mind blowing. Exactly. It's early days, sure, and it's not general purpose like GPT yet, but it suggests advanced reasoning might not require trillions of tokens or massive compute. So this

11:04

raises a huge question then. Could this... brain-like hierarchical approach fundamentally change how we build AI in the future? Are we looking at a different path entirely? It definitely points towards efficiency and maybe more novel ways of thinking emerging rather than just relying on sheer scale. Could unlock AI for new areas, maybe inspire totally different kinds of intelligence. Okay, let's try and pull this all together. What's

11:26

the big picture here? We've seen AI agents evolve incredibly quickly from tools to full task automation, spreadsheets, complex reasoning. Yeah, what strikes me is just the speed and the variety of innovation. It's not just about making AI bigger anymore. It's smarter, more specialized, way more efficient sometimes, tackling really sophisticated problems with amazing precision. So connecting it all

11:51

up, the signal seems clear. AI is rapidly changing how we work, how we solve problems, not just in huge systems, but through these focused, intelligent agents in our everyday tools and tasks. Absolutely. It's becoming embedded. So this whole deep dive leaves us with a final thought, maybe something for you, the listener, to chew on. As these AI agents get better and more integrated, what new roles, what new opportunities actually open up for humans? Right. How does our own creativity,

12:19

our problem solving? How do we adapt and evolve with these powerful new partners? It's not just about replacement. It's about partnership, right? Shifting maybe from doing the task to defining the problem or overseeing the AI, moving from rote work to more creative ideation. Could be. We really encourage you to keep exploring this, keep asking questions, keep thinking about how all this tech is shaping our collective future and your own place in it. Thank you for joining

12:43

us on this deep dive. Until next time, keep that curiosity alive. Outro music.

Transcript source: Provided by creator in RSS feed
