Apps vs Models: Who Wins AI?

Nov 14, 2025, 28 min

Summary

This episode delves into the critical debate of whether application-layer AI companies can thrive amidst the rapid shifts of the model layer, discussing arguments for and against their long-term viability. It explores the importance of deep vertical integration, unique UX, and proprietary data in creating defensible moats. The discussion is framed by recent AI news, including a $2.3 billion funding round for AI coding startup Cursor and its strategic move toward proprietary models.

Episode description

Today’s episode examines the core debate shaping the AI industry: whether application-layer companies can survive the pace and instability of the model layer. The discussion covers the arguments that apps can’t outrun rapid model shifts, the counter-case for deep vertical products, and what Cursor’s momentum reveals about where durable value might emerge. The episode also includes a fast headlines sweep on agentic cyber-espionage, major infrastructure investments, breakthrough agents, and the latest updates to GPT-5.1.

Brought to you by:

KPMG – Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts

Rovo - Unleash the potential of your team with AI-powered Search, Chat and Agents - https://rovo.com/

AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief

Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months

Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/

The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.

The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614

Interested in sponsoring the show? sponsors@aidailybrief.ai

Transcript

Intro and AI News Headlines

Today on the AI Daily Brief, a massive fundraising round for Cursor and what it says about app layer companies versus the model layer. Before then, the headlines. Welcome to the agentic hacker age. All right, friends. Quick note before we dive in. First of all, thank you to today's sponsors, KPMG, Blitzy, Robots & Pencils, and Rovo. To get an ad-free version of the show, go to patreon.com slash AI Daily Brief, or you can sign up at Apple Podcasts.

To learn about sponsoring the show or anything else, including job opportunities, speaking gigs, et cetera, visit us at aidailybrief.ai. And of course, while you're there, check out the AI ROI benchmarking study. At this rate, we are going to put together one of the biggest collections of information about actual ROI for actual use cases. If you want to get the full version of the study, come and share which use cases are driving the most value for you. You can get all of that at roisurvey.ai.

Now with that, let's get into some very interesting conversations to close out our week. Welcome back to the AI Daily Brief Headlines Edition, all the daily AI news you need in around five minutes.

Anthropic's AI Cyber Espionage Alert

We kick off today with a story that could very easily be a main episode. Anthropic says they've thwarted the first reported use case of AI-enabled, or really agentic, cyber espionage. In mid-September, Anthropic detected suspicious activity that was later determined to be a, quote, highly sophisticated espionage campaign. The company said that they have high confidence that the threat actor was a Chinese state-sponsored hacking group.

The unprecedented part was that the group didn't just use AI for planning. Claude's agentic capabilities were used to carry out the attack. The hackers reportedly used Claude Code to automate an infiltration of 30 global targets, with a small number of successes.

The targets were organizations like large tech companies, financial institutions, chemical manufacturers, and government agencies. Anthropic monitored this activity across 10 days, banned accounts as they were identified, and coordinated with authorities as appropriate.

They said that Claude Code was able to perform 80-90% of the attack, with human intervention only required during a handful of key decision points. This allowed the attack to be carried out at a speed that would have been impossible for human hackers.

Claude's guardrails were circumvented by breaking the attack into smaller tasks, which each seemed innocent but added up to a massive system breach. In their postmortem, Anthropic wrote, this campaign has substantial implications for cybersecurity in the age of AI agents, systems that can be run autonomously for long periods of time, and that complete complex tasks largely independent of human intervention.

Agents are valuable for everyday work and productivity, but in the wrong hands, they can substantially increase the viability of large-scale cyber attacks.

Anthropic believes this issue will grow as AI models become more capable, so they're expanding their detection capabilities. They wrote, With the correct setup, threat actors can now use agentic AI systems for extended periods to do the work of entire teams of experienced hackers.

Less experienced and resourced groups can now potentially perform large-scale attacks of this nature. They further noted that this is an escalation of the vibe hacking findings they reported over the summer, as those incidents still had, quote, humans very much still in the loop directing the operations.

Anthropic's Infrastructure & Funding News

I'm sure this is a topic that we will be hearing a lot more about in the months to come. But one other story from Anthropic, in a very different dimension of their work: they are joining the infrastructure build-out, announcing a $50 billion commitment for U.S. data centers.

Up until now, Anthropic has been a renter of compute, getting most of their access through partnerships with Google and Amazon. On the financial side, this hasn't been a big problem, allowing Anthropic to functionally spend equity instead of cash on their largest expense during their early growth phase.

But it has come with trade-offs. At certain points, Anthropic has been required to use in-house chips from Amazon and Google when they might have preferred to be using NVIDIA's GPUs. They've also been repeatedly bottlenecked by compute, leading to severe rate limits that hampered customer retention at times.

With this year's rapid growth, Anthropic has stepped up to another echelon, and consequently they're looking to own some of their own infrastructure. The announcement discussed several sites to be built across the U.S., including in Texas and New York. UK-based data center developer Fluidstack will partner on the project, with the expectation that the data centers will start coming online next year.

Anthropic spoke about the project in terms of the administration's AI goals, saying it was about, quote, maintaining American AI leadership by strengthening domestic technology infrastructure. CEO Dario Amodei said in a statement, ...

Now, speaking of $50 billion, that is also the reported valuation from an upcoming fundraising round for Mira Murati's Thinking Machines Lab. According to Bloomberg's sources, the deal terms haven't been finalized, and some sources said the round could close at $55 or even $60 billion.

For those keeping track at home, that would be a very quick 4x from TML's $12 billion valuation from their fundraising round in July. The new valuation would catapult TML to become one of the most valuable private companies ever, less than a year from launch. For some quick comparisons, Stripe's most recent mark in secondary markets is around $106 billion, Databricks recently raised at $100 billion, and Canva reportedly marked up to $42 billion during a tender offer to employees in August.
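For anyone who wants to check the size of that jump, here is a minimal sketch of the arithmetic in Python. It uses only the figures quoted in this segment; the variable names are illustrative, and the three scenarios are simply the $50, $55, and $60 billion numbers mentioned above rather than confirmed deal terms.

```python
# Figures as quoted in the episode, in billions of US dollars.
previous_valuation = 12.0  # July round, per the episode
reported_scenarios = {"reported": 50.0, "possible_mid": 55.0, "possible_high": 60.0}

# Step-up multiple for each scenario relative to the July valuation.
multiples = {
    name: valuation / previous_valuation
    for name, valuation in reported_scenarios.items()
}

for name, multiple in multiples.items():
    print(f"{name}: {multiple:.2f}x the July valuation")
```

On these numbers the step-up ranges from a little over 4x to 5x, depending on where the round actually closes.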

Now, it is true that TML is no longer a pre-product company with the release of their reinforcement learning platform Tinker last month, but they are still pre-revenue and haven't really established a clear business model or even a firm product niche.

Sources said that Tinker is being used by several university research groups as well as some paying enterprise customers. But this valuation certainly isn't going to be based on revenue forecasts or anything like that. As with earlier rounds, it's a bet on talent, with TML boasting a stacked roster of some of the best AI researchers drawn from OpenAI, DeepMind, and other labs.

Really, the only comp that truly makes sense is Ilya Sutskever's Safe Superintelligence, which is also a pre-product bet on talent. SSI established a $32 billion valuation in April.

Google & DeepMind AI Product Updates

Moving over into product land, Google has added Deep Research to NotebookLM. Now, NotebookLM has already proven to be one of the most interesting and popular tools in AI, but until now, the way to get the best results was pretty manual.

Google says the addition of Deep Research will allow users to automate the process of putting together source documents, allowing NotebookLM to function more like an AI research assistant. Their example video showed a user simply typing in latest breakthroughs in quantum physics and setting the agent to work.

Come back a few minutes later, and NotebookLM has an entire dossier ready to read or transform into a podcast or video slide deck. Speaking of video slide decks, NotebookLM has also introduced the ability to prompt custom styles for video overviews. They showed a variety of different styles, like 8-bit pixelated art, pop art, and turn-of-the-century art nouveau.

And these are firmly in that category of app updates which aren't about some underlying model improvement, but about making a product simply more aligned with what its users need from it.

Still, that wasn't Google's biggest launch of the day. DeepMind has released an agent called SIMA 2 as a research preview. SIMA, which stands for Scalable Instructable Multiworld Agent, was described by DeepMind CEO Demis Hassabis as a general agent that can understand and reason about complex instructions and complete tasks in simulated game worlds, even ones it has never seen before. He continued, incredible to see how it can just learn from self-play, a crucial step towards AGI.

Now, the first version of SIMA was released in March of 2024 and was fairly primitive. It learned to follow simple instructions like turn left, climb the ladder, or open the map across a wide range of video games. It had a total of 600 different instructions it knew how to follow.

The most interesting part about that result was that the agent could take what it learned from training conducted in one game and apply it to a game it had never seen before. Over DeepMind's total eval set, SIMA 1 had just a 31% success rate, and the rate plummeted to just a couple of percentage points on games it hadn't seen before. SIMA 2 has demonstrated a dramatic improvement in task completion. It has a 65% success rate across the eval set, which is starting to get pretty close to the human level of 76%.

On games the agent hadn't seen before, it achieved around a 13% success rate. The ability to generalize across different environments is one of the reasons many researchers are looking to world models as one of the keys to AGI.

DeepMind even tested how SIMA 2 would perform in entirely novel games that were generated on the fly by their Genie 3 world simulation model. SIMA 2 is able to orient itself, understand instructions, and take meaningful actions towards a goal, despite never having seen the environment before.

GPT-5.1 Updates and Features

Super interesting, and firmly in this theme of alternative paths to AGI that we'll be increasingly spending time on. Lastly, a couple quick follow-up notes on GPT-5.1. It is now available via the API, and OpenAI has also published a prompting guide to help developers migrate their use cases. The guidance actually reveals a lot about the design decisions made for this model update.

For example, OpenAI suggested 5.1 has a tendency to be too verbose in providing an answer. They suggested it's worthwhile giving specific instructions about how much detail you want the outputs to contain. The guide also noted that the model is much more steerable than previous iterations, so developers can dial in very specific behaviors when it comes to agents.

I'm continuing to have great early experiences with GPT-5.1, and I'm excited to see what you guys think of it. For now though, that is going to do it for the headlines. Next up, the main episode.

What if AI wasn't just a buzzword, but a business imperative? On You Can With AI, we take you inside the boardrooms and strategy sessions of the world's most forward-thinking enterprises. Hosted by me, Nathaniel Whittemore, and powered by KPMG, this seven-part series delivers real-world insights from leaders who are scaling AI with purpose, from aligning culture and leadership to building trust, data readiness, and deploying AI agents.

Whether you're a C-suite executive, strategist, or innovator, this podcast is your front row seat to the future of enterprise AI. So go check it out at www.kpmg.us slash AI podcasts, or search You Can With AI on Spotify, Apple Podcasts, or wherever you get your podcasts.

This episode is brought to you by Blitzy, the enterprise autonomous software development platform with infinite code context.

Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale codebases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements.

The Blitzy platform provides a plan, then generates and precompiles code for each task. Blitzy delivers 80% plus of the development work autonomously, while providing a guide for the final 20% of human development work required to complete the sprint.

Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native SDLC into their org. Visit blitzy.com and press get a demo to learn how Blitzy transforms your SDLC from AI-assisted to AI-native.

AI isn't a one-off project. It's a partnership that has to evolve as the technology does. Robots & Pencils works side-by-side with clients to bring practical AI into every phase: automation, personalization, decision support, and optimization. They prove what works through applied experimentation and build systems that amplify human potential.

As an AWS-certified partner with global delivery centers, Robots & Pencils combines reach with high-touch service. Where others hand off, they stay engaged. Because partnership isn't a project plan, it's a commitment. As AI advances, so will their solutions. That's long-term value. Progress starts with the right partner. Start with Robots & Pencils at robotsandpencils.com slash AI Daily Brief.

Meet Rovo, your AI-powered teammate. Rovo unleashes the potential of your team with AI-powered search, chat, and agents, or build your own agent with Studio. Rovo is powered by your organization's knowledge and lives on Atlassian's trusted and secure platform, so it's always working in the context of your work. Connect Rovo to your favorite SaaS app so no knowledge gets left behind.

Rovo runs on the Teamwork Graph, Atlassian's intelligence layer that unifies data across all of your apps and delivers personalized AI insights from day one. Rovo is already built into Jira, Confluence, and Jira Service Management Standard, Premium, and Enterprise subscriptions. Know the feeling when AI turns from tool to teammate? If you Rovo, you know. Discover Rovo, your new AI teammate powered by Atlassian. Get started at R-O-V, as in victory, O.com.

Apps Versus Models: The Core Debate

Welcome back to the AI Daily Brief. One of the big news items to end out this week was that AI coding startup Cursor just raised a fresh $2.3 billion at a $29.3 billion valuation. Now, that sort of rarefied air is a valuation that so far has been exclusively for the model companies. And so what's interesting to me about it is not just to explore the fundraising in isolation, but as a representative example of how people are thinking about the battle between the application layer and the model layer.

You might have seen this tweet floating around this week. It comes from investor and entrepreneur Yishan and got 20 million views this week, for what is ultimately sort of an inside baseball type of conversation. This is the foundation of our entire conversation, so let's read what he has to say and then break it down a little bit. Yishan writes,

by rapid expansion of the foundational model providers. App functionality will be added to the foundational model's offerings because the big players aren't slow incumbents. It is wrong to apply the analogy of fast startup slow incumbent here. They're just big. There is almost no time to build a company and scale it. Wong continues.

There are two ways AI application startup founders can make money. One, make a flash in the pan app that generates a ton of cash and bank the cash. My estimate is that you have about 12 to 18 months of cash flow generation. Or two, make a good enough app that you get acquired by one of the big players for sufficient equity. The situation is highly unstable. We don't know if it's going to crash or go to the moon, but both scenarios make it very unlikely that any AI application startup will independently become a generational supercompany.

The best odds are finding an application niche in a highly specialized field with extremely unique and specific data barriers, ideally ones related to real atoms, hardware, or world-related data, and not software and finance.

So the key elements of the argument here are, one, that foundation model providers will eat the app layer, basically that we have to throw out our old heuristics around slow incumbents versus fast startups, because the incumbents here are driving disruption at extreme speed.

The second point, however, which he gets into in a follow-up post on his own thread, is that the foundation is too unstable to build lasting app businesses. So Yishan continues in a subsequent post, The entire novelty of this thesis is that unlike in the past, specific elements of the AI industry are likely to make it so that application companies cannot outrun the wave of obsolescence, which will rush along far, far more quickly than prior technology waves.

The foundational technology has not stabilized in any way whatsoever, and applications require a sufficiently stable foundation for some extended period of time in order to create value and then a system for monetizing that value.

The wholesale rate of change in the nature of the foundation is the reason why I think almost all application startups will not survive to achieve any significant scale, not because the current large players are special. So this is the nuance that it would be easy to lose in this conversation.

What he's really talking about is a speed of change argument. He's effectively arguing that app startups will get overtaken by sea changes before they can become real businesses, and that it's not that the big labs are quote unquote better in any specific way, but that only they have enough internal stability and resources to survive the chaos that they themselves are creating. He concludes in his second post,

And by mature, I mean having all the boring stuff like sales relationships and brand recognition. Yes, your engineers can make the change, but human hiring cycles and team solidification and market relations are incompressible. E.g., if you hire 100 people a month, your organization will implode.

Thus, application companies never quite make it to a full business threshold before the sea change happens out from under them. When I say the incumbents will take the application space, I mean that they're the only ones who can provide enough internal stability and resources to survive the sea changes they themselves will be driving.

Not that they're going to provide a superior product. They're just the ones who won't starve. So like I said, this had 20 million views and generated a huge amount of conversation both on the post and even in other channels like LinkedIn.

Arguments for Vertical Apps & Moats

So let's talk first about the people who thought that Yishan was wrong in some fundamental way. Many of these themes can be sort of bundled into the idea that vertical apps, workflows, or UX still matter hugely. David Roberts writes, I think you're underestimating how much unique UX, context engineering, integrations, human in the loop, and embedded workflows need to exist for any vertical business application to actually get from 70% decent to 100% outcomes with AI.

Vertical applications are going to be enormous and they will not be eaten by the foundational model providers. Now, implicit in David's argument is that the stuff that it takes to make a vertical application work, specifically for business and B2B, is so immense and complex that it's just not in the incentive of the foundation model companies to do that.

And certainly this is a point that I resonate with, seeing how much last mile integration work it takes for a very powerful AI tool to be actually useful inside the context of a business. Now, Yishan actually responded to this one saying, your reasoning here supports my thesis rather than undermining it. What I think he means is that there's going to be so much change so fast that the app layer companies aren't going to be able to survive long enough to do that sort of complex last mile work that David is talking about, ultimately leaving it only to the foundation model companies, even if they don't prioritize it in the short term.

Aaron Levie from Box, who's one of the most thoughtful thinkers when it comes to enterprise AI, says, the counter dynamic to the AI model doing everything is that, at least in the enterprise, bridging the AI model's capabilities to the customer's environment still requires a tremendous amount of long tail work.

The gap between an AI agent working for 90 or 95% of the solution and 100% is usually about 10x more work than most realize. So here you see Aaron reinforcing many of the themes from David's post.

He continues, getting access to the enterprise data, connecting to the enterprise workflows, delivering the change management that employees need to adopt the technology, handling the regulatory and compliance requirements of that industry, and so on, all require some degree of highly dedicated focus in a domain.

Others argue that Yishan might be underestimating the new types of moats that could be formed. Investor Natasha Malpani writes, I'd say the opposite. The real white space is at the application layer. Everyone wants to sell shovels, but the gold is in how people actually use them. The infra race is a knife fight between hyperscalers: OpenAI, Google, Anthropic, Meta, Amazon. They'll undercut each other on price, latency, context window, and token cost until margins collapse.

Developer tooling looks safer, but it's crowding fast, and every improvement gets absorbed upstream by the foundation models or downstream by open source forks. Meanwhile, applications are where behavioral moats form. Data isn't the only barrier. Habits are. Users don't live in APIs or eval dashboards. They live in experiences. Context, workflow, brand, and trust compound fast. Distribution and feedback loops create data advantages that scale locally even when models converge globally.

You win if you own feedback surface to capture every edit, action, and intent, build domain depth, and embed in daily workflows. Collect proprietary exhaust, behavior and telemetry that the model providers will never see. Some infra will break through: security, evals, low-latency edge, compliance.

But the broader white space is still at the application layer, where people, agents, and systems actually interact. Go deep enough that a foundation model can't care, and sticky enough that users won't leave even when it can. Now again, I really want to double-click on this foundation model can't care piece.

Proprietary Data and Feedback Loops

... but still customized to the customer work, which is why, outside of the foundation model companies, the firms that have done the best from the AI boom are the big systems integrators and consulting firms. The fact that the foundation model companies have to compete on other vectors creates a window of opportunity for a different category of company to swoop in and do the work that it takes to actually bring these solutions to market in practice.

Now, the other point from Natasha that I want to really double-click on is this idea of proprietary exhaust. For those of you who don't live in Silicon Valley jargon, that paragraph might have seemed really dense. Let's read it again. You win if you own feedback surface to capture every edit, action, and intent, build domain depth, embed in daily workflows, collect proprietary exhaust, i.e. behavior and telemetry that the model providers will never see.

Exhaust is the data that comes out of the usage of a product. And many of the folks that are most excited about the application layer when it comes to AI have a thesis that when it comes to improving model performance, this type of behavioral exhaust is the real gold, because it's the only thing that's not commoditized to everyone else.

In other words, the foundation model companies all have access to the exact same training data, more or less, some version of the same training data. But a company that gets enough usage can create a feedback loop where they actually see how people are interacting with the models, and that data stream can be used to refine how the model, and also the experience that the model lives in, works. This is going to be particularly relevant to our example of Cursor, which we'll come to in a moment.

Still, even with all of these arguments for why Yishan's thesis might be wrong, or at least limited, there's a big overlap in the Venn diagram between these two camps, who I think would both acknowledge that many AI apps are just flimsy wrappers and that the real winners are likely to be the deep autonomous systems. Jacques Reynolds writes, most new AI apps aren't defensible. They're just UI wrappers on top of someone else's model.

The moat disappears the moment OpenAI or Anthropic ships the same feature natively. The real upside isn't in building another AI app, in my opinion. I think it's in implementing AI inside existing business workflows, where data, context, and customer relationships create real barriers.

Chong Call builds the thesis out even further. He writes, the issue isn't that foundational models will kill application startups. It's that most AI applications today aren't really applications. They're shallow automations built to impress investors on a six-month time frame. He basically makes a comparison to early SaaS and says today the same story is repeating with AI agents. Duct tape workflows, zero defensibility, no reliability at scale. But the core question hasn't changed.

Who's building a system that delivers real value repeatedly, reliably, and autonomously? So the implication of this is that if you are building an application, you have to build it deep. You have to be hands-on. You have to be in a position to actually capture that behavioral exhaust data. Now Fall writes, I think even if a new application starts on this constantly evolving base, it can endure if it embeds itself in existing workflows, writes to proprietary systems of record, builds proprietary data, and learns from usage and/or captures distribution before incumbents bundle the feature.

More importantly, AI wrappers that continue to swiftly ship features that solve users' needs, even as competition arrives, are difficult to compete with even for the foundation models. And so again, I think that you're starting to see the through line here, one that acknowledges the incredible speed at which things are changing and the new challenges that creates for the app layer, as well as the innovation capability of the big foundation model companies, but still sees this core path for some number of extremely high-performing application layer companies.

And indeed, a lot of the responses were about what it takes to be one of these actually successful application layer companies. Sarah Catanzaro writes,

that the labs are not currently focused on, thereby accumulating more technical defensibility. At times, their objectives may even diverge. We already see this in creative industries, where post-training alignment impedes the ability of models to produce diverse outputs.

It will be hard to survive since the app companies will also need to define compelling workflows and user experiences, but with the right team and support, some, but not all, will make it. a16z's Anish Acharya writes about a few approaches that he thinks advantage app layer startups.

The first are categories that benefit from being multi-model, basically where the experience for the end customer is better if they can access models from different providers. Then there are cornered resources, those locked proprietary datasets, and ecosystems that, quote, imply a ton of feature surface area. He gives the example of Granola. Sure, you can replicate Granola's recorder, but is OpenAI really going to build the entire ecosystem of productivity apps implied by it?

Now, regardless of what we all think about this, the reality is that money is still pouring in. The Information, for example, recently published a piece called Investors Chase Neo Labs to Outflank OpenAI and Anthropic.

They point out that over the last month, those investors have made or discussed $2.5 billion of investments into just five startups. The Information writes,

Cursor's Success: App Layer Transition

They say major developers like OpenAI and Anthropic may have overlooked. And that brings us to the Cursor part of the story. Now, Cursor is, of course, one of the big breakout leaders of the last year. When the story of 2025 is written, AI coding will be at the very top of the narratives, and one need look no further than the valuation jumps of Cursor to see just how big a deal at least investors are treating that whole theme as.

The company has raised $2.3 billion in a new round that values them at $29.3 billion. That is close to triple their $9.9 billion valuation from their Series C in June, and a 12x compared to their valuation from the beginning of the year.
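As a quick sanity check on those multiples, here is a minimal sketch in Python. It uses only the numbers stated in the episode; the start-of-year figure is not given directly, so the value computed here is just what the stated 12x would imply, and the variable names are illustrative.

```python
# Figures as stated in the episode, in billions of US dollars.
new_valuation = 29.3
series_c_valuation = 9.9            # June Series C, per the episode
stated_multiple_since_january = 12  # "a 12x", per the episode

# How many times larger the new valuation is than the Series C valuation.
series_c_multiple = new_valuation / series_c_valuation

# Start-of-year valuation implied by the stated 12x (an inference, not a reported number).
implied_january_valuation = new_valuation / stated_multiple_since_january

print(f"Multiple over Series C: {series_c_multiple:.2f}x")
print(f"Implied start-of-year valuation: ${implied_january_valuation:.2f}B")
```

The first figure comes out just under 3x, which matches the "close to triple" framing.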

In addition to the funding, Cursor also announced that they've reached a billion dollars in ARR and that they now produce more code than any other coding agent. Yuchen Jin did the research and commented, Cursor is almost certainly the fastest company in history to reach a billion dollars in ARR, achieving this milestone in a little over two years.

He added, and let's see if you can spot the connection to our broader theme today, people said Cursor would go to zero because it's just a wrapper. AI products won't be monopolized by model labs, in my opinion.

One, products win by delivering real user value. Model capability alone isn't enough. Two, once they hit product market fit, companies can train their own models, often based on open source models combined with their own unique data and RL environments. Cursor's Composer 1 is an example. Now Composer, which is Cursor's proprietary model, seems central to their business strategy moving forward. They said that they intend to use this fresh capital to invest further in developing Composer.

The Wall Street Journal framed this raise, in fact, as being a test case to see if app layer startups can transition away from relying on the foundation model companies. They noted that both OpenAI and Anthropic are now directly competing with Cursor.

When asked about this, Cursor CEO and co-founder Michael Truell gave a diplomatic response, stating, We're excited to be one of the first examples of a large company built on their platforms. All of the AI labs are important partners to us. But clearly, Composer, their unique model, is top of mind. Truell said,

It does take significant resources, both specialized talent and also GPUs, to do something at Composer's scale. This funding lets us do it in a big way. Cursor also showed just how much the model environment is changing. Back in April, the most popular models on Cursor were Claude 3.7 Sonnet, Gemini 2.5 Pro, Claude 3.5 Sonnet, and then in 4th and 5th place, GPT-4.1 and GPT-4o. The fastest growing in April were o3, o4-mini, and DeepSeek V3.1.

Today, the most popular models are in the first place, Sonnet 4.5, in the second place, Composer 1, and then after that, GPT-5, GPT-5 Codex, and Sonnet 4. The fastest growing, however, is Composer 1. All of which brings us to an interesting point about where this Venn diagram between the app layer and the model layer overlaps, which is at some point, do the handful of app layer companies that can break through and reach the scale to survive just become model companies themselves?

The Unstable Future of AI Apps

That certainly seems to be part of the direction here with Cursor, and I think it will be an interesting thing to watch. Anyways, it's a fascinating discussion, and I think if you take away anything, it just shows that right now things are changing so fast that even the people whose entire job it is to watch and understand and allocate against these movements don't really have any idea what's happening.

We are all just students, with the very fast-spinning world our teacher. For now, that's going to do it for today's AI Daily Brief. Appreciate you listening or watching as always. Until next time, peace.

This transcript was generated by Metacast using AI and may contain inaccuracies.