Support for the show comes from Zelle. Scammers are nothing new; believe it or not, they even existed before the internet. But as technology keeps advancing, the tools and techniques at the scammers' disposal are ever-changing, and they're getting much savvier at separating people from their money, so it's important to stay vigilant. Always remember to only send money to people you know and trust, and be sure to educate yourself on how to spot a scam so you'll recognize the signs.
Learn more at zellepay.com/safety. Starting your credit card search with NerdWallet? Smart. Using their tools to finally find the card that's right for you? Even smarter. With NerdWallet you can filter for the features you care about, access the latest deals, and add your top cards to a comparison table to make smarter decisions, all powered by the Nerds' expert reviews of over 400 credit cards.
Head over to nerdwallet.com/learnmore to find smarter credit cards, savings accounts, mortgage rates, and more. NerdWallet. Finance smarter. NerdWallet Compare, Incorporated. NMLS ID 1617539. Hello and welcome to Decoder. I'm Nilay Patel, editor-in-chief of The Verge. Decoder is my show about big ideas and other problems. Today I'm talking with Mike Krieger, the new chief product officer at Anthropic, one of the hottest AI companies in the entire industry.
Anthropic was started in 2021 by former OpenAI executives and researchers who wanted to build a more safety-minded AI company, which I have to point out is a real theme among people who leave OpenAI. Something to think about. Anthropic's main product right now is Claude, which is the name of both its industry-leading AI model and a chatbot that competes with ChatGPT. Like other major AI companies, Anthropic has billions in funding from some of the biggest names in tech, primarily Amazon.
But at the same time, Anthropic does have a distinct and intense safety culture. The company is notable for employing some people who legitimately worry that AI might destroy mankind. And I wanted to ask Mike how that tension plays out in product design. On top of that, Mike has a pretty fascinating history. If you're a longtime tech fan, you likely know him as the co-founder of Instagram.
A company he started with Kevin Systrom before selling it to Facebook, now Meta, for a billion dollars back in 2012. That was an eye-popping amount of money back then, and the deal turned Mike into founder royalty basically overnight. Mike left Meta in 2018, and a few years later he started to dabble in AI, but not quite the type of AI we talk about all the time on Decoder.
Instead, Mike and Kevin launched Artifact, an AI-powered newsreader that did some very interesting things with recommendation algorithms and aggregation. I was a big fan of Artifact, but ultimately it didn't take off like anyone wanted, and Mike and Kevin shut it down earlier this year. They sold the underlying tech to Yahoo. We talk a lot about decisions here on Decoder, so I wanted to know more about the decision to shut Artifact down, and then the decision to sell it to Yahoo.
And then, of course, I wanted to know why Mike decided to join Anthropic and work in AI, an industry with a lot of investment but very few consumer products to justify it. Really, what is all of this for? What products does Mike see in the future that make all of the turmoil around AI worth it? How is he thinking about building them?
I've always enjoyed talking product with Mike, and this conversation is no different, even if I'm still not really sure anyone's described what the future is going to look like. Okay, Anthropic chief product officer Mike Krieger. Here we go. Mike Krieger, you are the new chief product officer at Anthropic. Welcome to Decoder. Thank you so much. It's great to be here. Good to see you.
Yeah, I'm excited to talk to you about products. The last time I talked to you, I was trying to convince you to come to the Code Conference. I didn't actually get to interview you at Code. I was trying to convince you to come, and I was like, I just want to talk about products with someone, as opposed to regulation. I warned the audience: we're definitely going to talk a little bit about AI regulation.
It's going to happen. It seems like part of the puzzle. But you're building actual products, and I have a lot of questions about what those products could be, what the products are now, and where they're going. But I want to start sort of at the beginning of your Anthropic story, which is also the end of your Artifact story. So people know you started Instagram, you were at Meta for a while, you left Meta, and then you and Kevin Systrom started Artifact, which was a really fun newsreader.
It had some really interesting ideas about how to surface the web and have comments and all that. And then you decided to shut it down. I think of the show as a show for builders, and we don't often talk about shutting things down. Walk me through that, because it's as important as starting things up sometimes. Yeah, it really is. And the feedback we got post-shutdown for Artifact was some mixture of sadness, but also kudos for calling it when you saw it.
And I think there's value to also having a moment where you say, you know, we've seen enough here. For us, it was a product I still love and miss. In fact, I'll run into people and expect them to say, "I love Instagram," and instead they're always like, "I loved Artifact. I really miss it."
So it clearly had a resonance with a too-small but very passionate group of folks. But we had been working on it for about three years in the full run of it, and the product had been out for about a year. We were looking at the metrics, looking at growth, looking at what we had done. And we kind of had a moment where we said, are there ideas or product directions that would feel dumb not to have tried before calling it?
And we had a list of those; that was kind of mid last year. We basically took the rest of the year to work through those, got to the end of it, and said, yeah, those moved the needle a little bit, but not enough to convince us that this was really on track to be something that the team, and we collectively, were going to spend a lot of time on in the coming years. And that was the right moment to say, all right, let's pause and step back.
Is this the right time to shut it down? The answer was yes. Actually, if you haven't seen it, Yahoo basically bought it, took all the code, and redid Yahoo News as Artifact, or the other way around. And it's very funny. You'll have a little bit of a bizarro world moment the first time: this is almost exactly like Artifact, a little bit more purple.
Some different sources, but yeah, it was definitely the right decision. And, you know, it's a good decision when you step back and the thing you regret is that it didn't work out, not that you had to make that decision, or that you made that exact decision at the time that you did. There are two things I want to ask about Artifact. I definitely want to ask about what it's like to sell something to Yahoo in 2024, which is unusual. It's not a thing that's been happening a lot.
And the first is that Artifact was very much designed to surface webpages. It was predicated on a very rich web. And if there's one thing I'm worried about in the age of AI, it's that the web is getting less rich. Right? More and more things are moving to closed platforms; more and more creators, when they want to start something new, end up on a YouTube or a TikTok. I don't know if there are dedicated Threads creators yet, but they're coming.
And it seemed like that product was chasing a dream that might be under pressure from AI specifically, but also just the rise of creator platforms more broadly. Was that a real problem, or is that just an assessment from the outside? I would agree with the assessment, maybe with different root causes. I think what we saw is that some sites were able to balance a mix of subscription, tasteful ads, and good content.
I would put The Verge at the top of that list, and I'm not just saying that because I'm talking to you. I mean it legitimately: every time we linked to a Verge story from Artifact, somebody clicked through, and it was like, this is a good experience. It feels like things are in balance. At the extremes, though, like local news, a lot of those websites, for economic reasons, have become sort of: you arrive,
there's a sign-in with Google before you even read a single thing, a pop-up to sign up for the newsletter, all before you've even consumed any content. I think that's probably a longer-run economic question of supporting local news, probably more so than AI; at least that trend seems like it's been happening for quite a bit.
The creator piece is also really interesting, where, if you look at where breaking news, or at least emerging stories, are happening, they're often happening on X: a post that went viral. And what we would often get on Artifact is the summary roundup of the reactions to the thing that happened yesterday, which, if you're relying on that, means you're a little bit out of the loop already.
And so when I look at where things are happening and where the conversation is happening, at least for the cultural core piece of that conversation, it's often not happening anymore on media properties. It is starting somewhere else and then getting aggregated elsewhere. And I think that just has an implication for a site or a product like Artifact, and how much you're ever going to feel like this is breaking news.
So for a time we moved more toward, let's be more interest-based, which, funny enough, is where Instagram has gone as well: very interest-based, less breaking news. But can you have a product that is just that? I think that was the struggle. You said media properties. I mean, some media properties have apps, some are expressed only as newsletters. But I think what I'm asking about is the web. This is just me doing therapy about the web.
What I'm worried about is the web, right? The creators aren't on the web. We're not making websites, and Artifact was predicated on there being a rich web. Search products in general are sort of predicated on there being a rich and searchable web that will deliver good answers, to some extent. AI products require there to be a new web, because that's where we're training all our models. Did you see that, that this promise of the web is kind of under pressure?
If all the new stuff is breaking on a closed platform you can't search, like a TikTok or an X or something else, or you can't index, or surfacing old tweets is not really a great user experience, then actually building products on the web might be getting more constrained, and not a good idea anymore. Yeah. Even newsletters are a great example, where some of the best stuff that I read, you know, sometimes there's an equivalent Substack site that you could go look at for some of the newsletters.
Some are purely an email. We even set up an email account that would just ingest newsletters to try to surface them, or at least links from them. And, you know, the designed experience is not there. I'd say the thing I've noticed on the open web in general, as a longtime fan of the web, somebody that was very online before being online was a thing people did, back in Brazil as a preteen:
in a lot of ways, the incentives that have been set up around, well, you know, a recipe won't rank highly if it's just the recipe, so let's tell the story about the life events leading up to that recipe.
Those trends, I feel like, have been happening for a while, and they've already led to a place where the end consumer might be a user, but everything is being intermediated, be it through a search engine and optimized for that findability, or optimized for what's going to get shared a bunch or what's going to get the most attention. Newsletters and podcasts are two ways that have probably most successfully broken through that.
I think that's been an interesting direction. But in general, I feel like there's been probably a decade-long risk for the open web in general, in terms of what is the actual intermediation happening between "I am trying to tell a story, or I'm trying to talk to somebody" and somebody receiving that story, and all the roadblocks along the way.
That's just gotten more and more painful, and it's no surprise, then, that, hey, I can actually just open my email and get the content in a way that feels better in some respects, although also not great in a bunch of other ways. That's how I've watched it, and I would not call it a healthy place right now. Yeah. The way that we talk about that thesis on Decoder most often is that people build media products for the distribution.
And so podcasts famously have open distribution: it's an RSS feed. Well, it's like an RSS feed, but there's Spotify's ad server in the middle. I'm sorry to everybody who gets whatever ads we put in here. But at its core, it's still an RSS product. Newsletters, still at their core, are an IMAP product, right, an open mail protocol product.
The web is search distribution, so we've optimized it to that one thing. The reason I'm asking this, and I want to come back to this theme a few times, is that it felt like Artifact was trying to build a new kind of distribution, but the product it was trying to distribute was webpages, which were already over-optimized for something else.
I think that's a really interesting assessment. It's actually funny watching the Yahoo version of it, because they've done the content deals to get the more slimmed-down pages.
They have fewer content sources, but the experience of tapping on each individual story, I think, is a lot better, because those have now been formatted for a distribution that is, I guess, linked to some paid acquisition. That's different from what we were doing, which was: here's the open web, we'll give it to you, warts and all, and link directly to you. But I think your assessment feels right.
Okay, I want to come back to that theme, and I really wanted to start with Artifact in that way, because it feels like you had an experience in one version of the internet that is maybe under pressure. The other thing I want to ask about Artifact: you and Kevin, your co-founder, both once told me that you had big ideas, scale ideas, for Artifact, and you wouldn't tell me what they were.
So, it's over now: what was it? For us, it was, I mean, two things that I remain sad we didn't get to see through. One was the idea of good recommender systems underlying multiple product verticals, news stories being one of them. Kevin believed, or I had the belief, that if a system understands you well through how you're interacting with news stories, how you're interacting with content, that's going to serve you well elsewhere.
Is there another vertical that could be interesting? Is it around shopping? Around local discovery? Around people discovery? All these different places. Because for all the promise of, let's say, machine learning, and I realize that's a shifting definition throughout the years,
let's call it, for the purpose of our conversation, recommender-system machine learning: for all its promise, my day-to-day is actually not filled with too many good instances of that product. So the big company idea was: can we bring Instagram-type product thinking to recommender systems, and combine those two things in a way that creates new experiences that aren't beholden to your existing friend-and-follow graph, with news being an interesting place to start?
You highlighted some real problems with the content, but the appealing part was that we weren't trying to solve the two-sided marketplace all at once. It turns out half that marketplace was already search-pilled and had its own problems, but at least there was the other side as well.
The other piece, even within news, was really thinking about how you eventually open this up. I think Substack is pursuing this from a very different direction, but open it up so creators can actually be writing content and understanding distribution natively on the platform.
Every platform eventually wants to get to this as well. When you watch the closest analogs in China, like Toutiao, they started very much with crawling the web, then had these eventual publisher deals, and now it is, I would guess, 80 to 90 percent first-party content.
There are economic reasons why that's nice, and some people make their living writing articles about local news stories on Toutiao, including, I think, a close family member of one of our engineers. But the other side of it is that the content can just be so much more optimized for what you're doing. Actually, at Code,
I met an entrepreneur who's creating a sort of novel media experience that was very much: if Stories met news met mobile, what would it be for most news stories? And I think for something like that to succeed, it also needs distribution that has that as the native distribution type. So: recommendation systems for everything, and then a primarily recommendation-based, first-party content writing platform. Those are the two ideas that are, like, one day, for somebody.
Last Artifact question: you shut it down, and then there was a wave of interest, and I think publicly one of you said, oh, there's a wave of interest, we might flip it. And then it was Yahoo. Tell me about that process.
There were a few things that we wanted to align. I think we'd worked in that space for long enough that, whatever we did, we sort of wanted to tie a bow around it and move on to whatever was next. That was one piece. And the other piece was that I wanted to see the ideas live on in some way.
So there were a lot of conversations around, well, what would it become under different acquirers? And the Yahoo one was really interesting. I'll admit to being pretty unaware of what they were doing; I was still using Yahoo Finance and my fantasy football league, but beyond that I was not familiar with what they were up to. They were like, no, we want to take it, and we think in two months we can relaunch it as Yahoo News. And I was thinking, that sounds pretty crazy; that's a very short timeline for a codebase they could be so unfamiliar with. And they had access to us.
We were basically helping them out almost full time, but still, that's a lot, and they basically pulled it off. I think it was 10 weeks instead of eight. But there is a newfound energy there, like, all right, what are the properties we want to build back up again? Let's do it. So I fully admit coming in with a bit of a bias, like, I don't know what's left at Yahoo, what's going to happen here. And then their tech teams bit into it with an open mouth; it's kind of a gross metaphor, but they went all in.
And they got it shipped. I'll routinely text Justin, who was our Android lead; he's at Anthropic now, and actually came here before I did. I'll find little details now, like, oh, they kept this: I spent a lot of time on this 3D spinning animation for when you got to a new reading level, way too much time on this beautiful reflection, specular highlighting thing.
I probably misprioritized that week, but they kept it, and now it goes "Yahoo!" when you do it. I was like, that's pretty on-brand. It was a really fast transition experience, but, you know, it gets to live on, and I think it will probably have a very different future than what we were envisioning. But some of the core ideas are there, like, hey, what would it mean to actually try to create a personalized news system that was really decoupled from any kind of existing follow graph,
or from what you were seeing already on something like Facebook. Was the decision that Yahoo would deploy this to the most people at scale? Was it that they were offering the most money? How do you choose?
There was an optimization function. I would say the three variables were: the deal was attractive, or attractive enough; our personal commitments post-transition were pretty light, which I liked; and they had reach. Yahoo News has something like 100 million monthly users still. So it was reach, low commitment but enough that we felt like it could be successful, and they were in the right place, at least on the bid side.
This sounds like the dream: you walk away with that much money. Okay, yeah, makes sense. I was just wondering if that was it, or whether it was, it wasn't as much money, but they had the biggest platform and you wanted that, because Yahoo is deceptively still huge.
Yeah, deceptively still huge, I think under new leadership and with a lot of excitement there. And for me, it was not like a huge exit; I would not call it a super successful outcome. But the fact that that chapter closed in a nice way, and we could move on without wondering if we should have done something different when we closed it, meant I slept much better at night in Q1 of this year because of it.
We need to take a quick break. We'll be right back. Support for Decoder comes from Mint Mobile. If you're looking to reduce your monthly expenses, one of the first things you can save on is probably sitting in your pocket right now, maybe even playing this podcast. It's your phone.
A lot of us pay way too much for wireless service, which is why Mint Mobile is doing things differently. When you switch to Mint Mobile, you can get three months of premium wireless service for just 15 bucks a month. All of Mint's phone plans come with high-speed 5G data and unlimited talk and text, plus you can keep your same phone and phone number on your new Mint plan. To get this new customer offer and your new three-month premium wireless plan for just 15 bucks a month, go to mintmobile.com/decoder. That's mintmobile.com/decoder.
You can cut your wireless bill to 15 bucks a month at mintmobile.com/decoder. $45 upfront payment required, equivalent to $15 per month. New customers on first three-month plan only. Speeds slower above 40 gigabytes on unlimited plan. Additional taxes, fees, and restrictions apply. See Mint Mobile for details. Support for Decoder comes from Vanta. Customers aren't going to support your business if they don't trust you.
It doesn't matter if you're starting or scaling your company's security program: demonstrating top-notch security practices and establishing trust is more important than ever. Vanta can help you do both. Vanta automates compliance for SOC 2, ISO 27001, and more, saving you time and money while helping you build customer trust.
Plus, you can streamline security reviews by automating questionnaires and demonstrating your security posture with a customer-facing trust center, all powered by Vanta AI. Over 7,000 global companies like Atlassian, Flo Health, and Quora use Vanta to manage risk and prove security in real time. Get $1,000 off Vanta when you go to vanta.com/decoder. That's vanta.com/decoder for $1,000 off.
Support for Decoder comes from LinkedIn. When you're hiring for your small business, you want to find quality professionals that are right for the role. That's why you might want to check out LinkedIn Jobs. LinkedIn Jobs has the tools to help you find the right professionals for your team, faster and for free. LinkedIn isn't just a job board; LinkedIn helps you hire professionals you can't find anywhere else,
even those who aren't actively searching for a new job but might be open to the perfect role. According to the company, in a given month, over 70 percent of LinkedIn users don't visit other leading job sites. So if you're not looking on LinkedIn, you're looking in the wrong place. You're wearing so many hats already, and it takes time and resources to find the right person to hire. But according to LinkedIn, 86 percent of small businesses get a qualified candidate within 24 hours.
Hire professionals like a professional on LinkedIn. Post your job for free at linkedin.com/partner. That's linkedin.com/partner to post your job for free. Terms and conditions apply. We're back with Anthropic chief product officer Mike Krieger. All right, so that's that chapter. The next chapter is you show up as the chief product officer at
Anthropic. What was that conversation like? Because in terms of big commitments and hairy problems, "are we going to destroy the world," it's all right there. Maybe it's a lot more work. How do you make the decision to go to Anthropic? The top-level decision was what to do next at all. I admit to having a bit of an identity crisis at the beginning of the year.
I only really know how to start companies. Actually, more specifically, I probably know how to start companies with Kevin; we make a very good company together. And I was looking at what the aspects of that are that I like: I like knowing the team from day one, I like having a lot of autonomy, I like having partners that I really trust, I like working on big problems with a lot of open space.
And at the same time, I was like, I do not want to start another company right now. I'd just been through the wringer on that for three years; it was an okay outcome, but not the outcome we wanted. So there's the knot: I want to work on interesting problems at scale at a company that I started, but I don't want to start a company. What do I do next? I definitely knew I did not want to just invest. Not that investing is a "just" thing, but it's just different. I'm a builder at heart, as you know. And so I was like, this is going to be really hard; maybe I need to take some time and then start a company.
And then I got introduced to the Anthropic folks via the head of design here, who is somebody I actually built my very first iPhone app with in college, so we go back a long time.
I started talking to them, and I realized, you know, the research team here is incredible, but the product efforts were so nascent. I'm not going to kid myself that I'm coming in as a co-founder; the company's been around for a couple of years, there are already company values and ways things work, and they call themselves "ants." Maybe I would have advocated for a different employee nickname, but it's fine; that ship has sailed.
But I felt like there was a lot of product greenfield here and a lot of things to be done and built. So there was that combination: it was the closest combination I could have imagined to the team I would have wanted to have built had I been starting a company, with so much to do that I wake up every day both excited and
a bit daunted by how much there is to do, and with momentum and scale already, so I could feel like I was going to hit the ground running on something that had a bit of tailwinds, where a lot of Artifact felt like headwinds somewhat outside of our control. That was the combination. So the first was
the big decision of what to do next, and then the second one was, all right, is Anthropic the right place for it? It was the sort of thing where, after every single conversation I had with them, I'd come away thinking, I think this could be it. I wasn't thinking about joining a company that's already running like crazy, but I wanted to be closer to the core AI tech. I wanted to be working on interesting problems.
I wanted to be building, but I wanted to feel as close to a co-founder kind of situation as I could. And Daniela, who is the president here, maybe she was trying to sell me, but she said, you feel like the eighth co-founder that we never had, the product co-founder. Which is amazing: they have seven co-founders, and none of them is the product co-founder. But whatever it was, it sold me, and it was like, all right, I'm going to jump back in.
I'm excited for the inevitable Beatles documentaries about how you're the fifth Beatle, and then we can argue about that forever. The Pete Best of it? I hope not. You're the Ringo that comes in here. In 2024, with our audience as young as it is, that might be a deep cut, but I encourage everybody to go search for Pete Best and how much of an argument that is. Let me ask you just two big-picture questions about working in AI generally.
You started Instagram, you're deep with creatives, you built a platform for creatives, you care about design, obviously. With that community, AI is a moral dilemma. People are upset about it. I'm sure they will be upset that I even talked to you. We had the CEO of Adobe on to talk about Firefly, and that got some of the most upset emails we've ever gotten. How did you evaluate that?
When I go work on this technology that is built on training on all this stuff on the internet, and people have really, really hot emotions about that, and there's a lot... we've got to talk about lawsuits. There are lawsuits; there are copyright lawsuits. How are you thinking about that?
I have some of these conversations: one of my good friends is a musician down in LA, and he comes up to the Bay Area when he's on tour, and we'll have hour-long deep conversations over pupusas about what AI is in music, how these things connect, and where these things go. And I think the interesting question is: what parts of the creative process, or which pieces of creative output, are most affected right now? Then you can play that forward and see how it's going to change.
I think that question is a big part of why I ended up at Anthropic, if I was going to be in AI at all. A couple of things: obviously the written word is really important, and there's so much that happens in text. I definitely do not want to make this sound like text is less creative than other media. But the fact that they've chosen,
we've chosen, to really focus on text and image understanding, and to keep it to text out, text that is supposed to be tailored to you rather than reproducing something that's already out there, I think reduces some of that space significantly,
where you're not also trying to produce Hollywood-type videos or high-fidelity images or sounds and music. Some of that is a research focus, some of that's a product focus. The space of thorny questions is still there, but it's a bit more limited in those domains, or outside of those domains, and more purely on text and code and those kinds of expressions. So that was a strong contributor to me wanting to be here versus other spots.
There's so much controversy about where the training data comes from. Where does Anthropic's training data for Claude come from? Is it scraping the web, like everybody else? Scraping the web. We respect robots.txt. We have a few other data sources that we license and work with folks on separately, but let's say the majority of it is web crawl, done in a web-crawl-respectful way. We were respecting robots.txt before everyone realized that you had to start respecting robots.txt.
We were respecting robots.txt beforehand, and then in the cases where, for whatever reason, it wasn't getting picked up correctly, we've since corrected that as well.
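(A quick illustration for readers: "respecting robots.txt" means checking a site's published crawl rules before fetching a page. Here is a minimal sketch of that behavior using only Python's standard library. The bot name and URLs are hypothetical, and this is not Anthropic's actual crawler.)

```python
# Minimal sketch of a robots.txt-respecting fetch. "ExampleBot" and
# the example.com URLs are hypothetical stand-ins.
from urllib import robotparser, request

rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()  # fetch and parse the site's robots.txt rules

url = "https://example.com/some/article"
if rp.can_fetch("ExampleBot", url):
    # Only fetch pages the site allows for this user agent.
    req = request.Request(url, headers={"User-Agent": "ExampleBot"})
    with request.urlopen(req) as resp:
        html = resp.read()
else:
    print("Disallowed by robots.txt; skipping", url)
```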
What about YouTube, Instagram? Are you training on those? Yeah, no. When I think about the players in this space, there are times where I'm like, oh, it must be nice to be inside Meta. I don't actually know if they train on Instagram content, or if they're talking about that, but there's a lot of good stuff in there, and same with YouTube. A close friend of mine is at YouTube. That's the repository of the collective knowledge of how to fix any dishwasher in the world, and people ask for that kind of stuff. So we'll see, over time, what those end up looking like. You don't have a spare key to the Meta data center? The Instagram servers?
I gave that back on the way out. When you think about that general dynamic, there are a lot of creatives out there who perceive that AI is a risk to their jobs, or perceive that there's been a big taking. I'll just ask about the lawsuit; there's a lawsuit against Anthropic from a bunch of authors who say that Claude is illegally trained against their books. Do you think there's a product answer to this?
This is going to lead into my second question, but I'll just ask broadly: do you think you can make a product so good that people overcome these objections? Because that is kind of the vague argument I hear from the industry, right?
Right now, we're seeing a bunch of chatbots, and you can make a chatbot fire off a bunch of copyrighted information, but there's going to come a turn when that goes away, because the product will be so good and so useful that people will think it has been worth it. And I don't see that yet. I think that's a lot of the heart of the copyright lawsuits, beyond just the legal pieces: the tools are not so useful that anyone can see that the trade is worth it.
Do you think there's going to be a product where it is obvious that the trade is worth it? I think it's very use-case dependent. The kind of question that we drove our Instagram team insane with is we would always ask them, well, what problem are you solving? A general text-box interface that can answer any question is a technology and the beginnings of a product; it's not a precise problem that you are solving.
And I think grounding yourself in that maybe helps you get to that answer. For example, I use Claude all the time for code assistance. That is solving a direct problem: I'm trying to ramp up on product management here, get our products underway, and work on a bunch of different things, and to the extent that I have any time to be in pure build mode, I want to be really efficient with it.
That's a very directly connected problem, and a total game changer for me as a builder; it allows me to focus on different pieces as well. I was talking to somebody right before this call, too: they are now using Claude a bunch to soften up, or otherwise change, their long missives on Slack before they send them. And so this human-editor kind of piece solves their immediate problem.
Maybe they need to tone it down and, you know, chill out a little bit before sending a Slack message. But again, grounding it in use cases is what I'm trying to really focus our products on here.
If you try to boil the ocean, I think you end up really adjacent to these kinds of ethical questions that you raise, which is: if you're an anything box, then everything is potentially either under threat, or adjacent, or problematic. I think there's real value to saying, all right, what are the things we want to be known to be good for?
And I would say that today the product actually does serve some of those well enough that I'm happy it exists, and I think folks are in general. And then over time, if you look at things like writing assistance more broadly, for novel-length writing, I think the jury is still out on that. My wife was doing kind of a prototype version of that, and I've talked to other folks who have, too.
And, you know, our models are quite good, but they're not great at keeping track of characters over book-length pieces, or at reproducing particular things. So, yeah, I would ground it in: what can we be good at now? And then, as we move into new use cases, navigate those carefully, in terms of who is actually using it and whether we're providing value to the right folks in that exchange.
Let me ground that question in a more specific example, both to ask you a more specific question and also to calm the people who are already drafting me angry emails. TikTok exists. TikTok is maybe the purest garden of innovative copyright infringement that the world has ever created. I've watched entire movies on TikTok, just because people have found ways to bypass its content filters.
I do not perceive the same outrage at TikTok for copyright infringement as I do with AI. Maybe someone's really mad. I've watched entire 1980s episodes of This Old House on TikTok accounts that are literally labeled "best of old This Old House." I don't think Bob Vila is getting royalties for that. But it seems to be fine, because TikTok as a whole has so much utility, and people perceive even the utility of watching old 1980s episodes of This Old House.
And there's something about that dynamic, between "this platform is going to be loaded full of other people's work" and "we're going to get value out of it," that seems to be rooted in the fact that at least I'm looking at the actual work. I'm not looking at some 15th derivative of This Old House as expressed by an AI chatbot; I'm actually just looking at a 1980s version of This Old House. Do you think that AI chatbots can ever get to a place where it feels like that?
Where I'm actually just looking at the work, or I'm providing my attention or time or money to the actual person who made the underlying work, as opposed to: we trained it on the open internet, and now we're charging 20 bucks, and 15 steps back, that person gets nothing.
Yeah. To ground it in the TikTok example as well, I think there's also an aspect where, if you imagine the future of TikTok, probably most people say, well, maybe they'll add more features and I'll use it even more. I don't even know what the average time spent is; it definitely keeps creeping ever higher. That's the end of the economy. Yeah, exactly. Full time on TikTok. Build AGI, create universal prosperity, so we can spend all our time on TikTok.
That would not be my preferred future outcome, but I guess you could construct that if you wanted to. The future feels, I would argue, a bit more knowable in the TikTok use case. And in the AI use case, it's a bit more like, well, where does this accelerate to? Where does this eventually complement me? Where does it supersede me?
And I would posit that a lot of the AI-related anxiety can be tied to the fact that three or four years ago, this technology was radically different. Three or four years ago, TikTok existed, and it was already kind of on that trajectory. And even if it hadn't existed, you could have imagined it from where YouTube and Instagram were; if they had an interesting baby with Vine, it might have created TikTok.
So I think it is partially because the platform is so entertaining; that's a piece of it. The connection to real people is an interesting one, and I'd love to spend more time on that, because I think it's an interesting piece of the AI ecosystem. And the last piece is just the knowability of where it goes. Those are probably the three things that ground it more.
Anthropic, when it started, was probably the original "we're all quitting OpenAI to build a safer AI company" company. Now there are a lot of them. My friend Casey makes a joke that every week someone quits to start yet another safer AI company. Matt Levine has a great line about it: it's like a universal sorting function that's going to sort everyone onto one side or the other. Is that expressed in the company? Obviously, Instagram had big moderation policies.
You thought about it a lot. It is not perfect as a platform or a company, but it's certainly at the core of the platform. Is that at the core of Anthropic in the same way, that there are things you will not do? Yeah, deeply. And I saw it in week two. I'm a ship-oriented person. Even with Instagram in the early days, it was like, let's not get bogged down building 50 features; let's build two things well and get them out as soon as possible.
And some of those decisions, to ship a week earlier and not have every feature, I think were actually existential to the company. So I feel that in my bones. Week two I was here, our research team put out a paper on the interpretability of our models, and kind of buried in the paper was this idea that they had found a feature inside one of the models that, if amplified, would make Claude believe it was the Golden Gate Bridge.
Not just kind of believe it, like you prompted it with "hey, you're the Golden Gate Bridge," but deeply, in the way that my five-year-old will make everything about turtles: it made everything about the Golden Gate Bridge. How are you today? "I'm feeling great. I'm feeling international orange, and I'm feeling the fog of, you know, San Francisco."
And somebody in our Slack was like, hey, should we build and release Golden Gate Claude? It was almost an offhand comment, and a few of us were like, absolutely, yes. Two reasons: one, this is actually quite fun. But two, getting people to actually have some firsthand contact with what a model that has had some of its parameters tuned is like, that was valuable.
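(For readers curious what "amplifying a feature" means mechanically: interpretability work like this finds a direction in the model's internal activations that corresponds to a concept, then pushes activations along that direction during inference. Below is a toy sketch of that steering idea. The array shapes and the feature vector are hypothetical stand-ins, not Anthropic's code; per their published work, the real version clamps a learned sparse-autoencoder feature to a high value.)

```python
# Illustrative sketch of feature amplification / activation steering.
# All names and shapes are hypothetical, for intuition only.
import numpy as np

def amplify_feature(activations: np.ndarray,
                    feature_direction: np.ndarray,
                    strength: float) -> np.ndarray:
    """Push a layer's activations along a learned feature direction."""
    # Normalize so `strength` directly controls how hard we push.
    unit = feature_direction / np.linalg.norm(feature_direction)
    return activations + strength * unit

# Toy usage: a batch of 4 residual-stream vectors of width 8.
acts = np.random.randn(4, 8)
golden_gate = np.random.randn(8)  # stand-in for a learned "Golden Gate" feature
steered = amplify_feature(acts, golden_gate, strength=10.0)
```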
From that Slack message to having Golden Gate Claude out on the website was, I think, basically 24 hours. In that time we had to do some product engineering and some model work, but we also ran through a whole battery of safety evals. And that was an interesting piece: you can move quickly, and not every time can you do only a 24-hour safety eval. There are lengthier ones for new models; this one was a derivative, so it was easier.
But the fact that it wasn't even a question, like, wait, should we run safety evals? No, absolutely: that's what we do before we launch models. We make sure it's safe from the things that we know about, and let's also model out what some novel harms might be. The bridge is unfortunately associated with suicides; let's make sure that the model doesn't guide people in that direction, and if it does, let's put in the right safeguards.
So that's kind of a trivial example, because it's an Easter egg we shipped for basically two days and then wound down. But the safety culture was very much at its core there. Even as we prepare model launches, again, I have the urgency: let's get it out, I want to see people use it. And then you actually map out the timeline.
From the point where the model is ready to the point where it's released, there are things that we are going to want to do to make sure we're in line with our responsible scaling policy. And what I appreciate about the product and research teams here is that it's not seen as, oh, that's standing in our way. It's, yeah, that's why this company exists. I don't know if I should share this, but I'll share it anyway.
At our second all-hands since I joined, somebody who's been here from very early stood up and said, if we succeeded at our mission but the company failed, I would see this as a good outcome. And I don't think you would hear that elsewhere. You definitely would not hear that at Instagram.
Not because we were bad people, but it was just not in the culture. If we had succeeded in helping people see the world in a more beautiful, visual way, but the company failed, I would be super bummed, and I think a lot of people here would be very bummed too. But that ethos is quite unique. I think this brings me to the Decoder questions. Anthropic is what's called a public benefit corporation. You've got a trust underlying it. You are the first head of product.
You've described the product and research teams as being different, and there's a safety culture. How does that all work? How is Anthropic structured? Broadly, we have our research teams. We have the team that sits most closely between research and product, which is a team thinking about inference and model delivery and everything it takes to actually serve these models, because that ends up being the most complex part in a lot of cases. And then we have product.
I would say if you just sliced off the product team, it would look similar to product teams at most tech companies, with a couple of tweaks. One is we have a labs team, and the purpose of that team is to basically get embedded as early in the research process as possible, with designers and engineers, to start prototyping at the source rather than waiting until the research is done. I can go into why I think that's a good idea.
That's a team that got spun up right after I joined. And then the other team we have is our research PM team, because ultimately we're delivering the models using these different services, and the models have capabilities: what they can see in terms of multimodality, what types of text they understand, even what languages they need to be good at. Having end-user feedback tie all the way back to research ends up being very important.
And it prevents research from ever becoming an ivory tower, like, we built this model, but is it actually useful? We say we're good at code; are we really? How are the startups that are using it for code giving us feedback? "It's good at these Python use cases; it's not good at this autonomous thing." Great, that's feedback that's going to channel right back. So those are the two distinct pieces.
But within product, and I'll click down because I know you get really interested on Decoder in team structures: we have apps, which is Claude.ai and Claude for Work; we have developers, which is the API; and then we have our kooky labs team. And that's all the product side. The research side is the side that works on the actual models? Yeah, that's the side that works on the actual models.
And that's everything from researching model architectures and figuring out how these models scale, to a strong red-teaming, safety, and alignment team as well. That's another component that is deeply in research, and I think some of the best researchers end up gravitating toward it, as they see it as the most important thing they could work on. How big is Anthropic? How many people? We're north of 700 at last count.
And what's the split between that research function and the product function? Product is still really small; product is just north of 100. The rest is everything in between. We have sales as well, but it's research, the fine-tuning, product research, inference, and then the safety and scaling pieces. So we are, and I described us this way within a month of joining, like those crabs that have one super big claw.
We're really big on research, and product is this very small claw still. The other metaphor I've been using is, you know, a teenager: some of your limbs have grown faster than others, and some are still catching up. The crazy bet is, I would love for us to not have to then double the product team. I'd love for us instead to find ways of using Claude to make us more effective at everything we do on product, so that we don't have to double.
Because every team struggles with this, so this is not a novel observation, but I look back at Instagram. When I left, we were 500 engineers. Were we more productive than at 250? Almost certainly not. Were we more productive than at 125 to 250? Marginally. You know, I had a really depressing interview once when I was trying to hire a VP of engineering. I asked, how do you think about developer efficiency and team growth?
And he said, well, if every single person I hire is at least net contributing something, then I'm succeeding, even if it's at a sub-one-to-one ratio versus before. I was like, that's depressing. And I think it creates all this other swirl around team culture dilution, et cetera. So that's something I'm personally passionate about.
How do we take what we know about how these models work and actually make it so the team can stay smaller and more tight-knit? Yeah. Tony Fadell, who did the iPod, has been on Decoder before, but when we were starting The Verge, he basically told me, and I forget the actual numbers he used, but something like: you're going to go from 15 or 20 people to 50 or 100, and then nothing will ever be the same. And I've thought about that every day since.
Because we're always right in the middle of that range, and I'm like, when is the tipping point? Where does moderation live in this structure? You mentioned safety on the model side, but you're out in the market building products. You've got what sounds like a very horny Golden Gate Bridge people can talk to. Were you running trust and safety there? Sorry, every one of my conversations has one joke about how horny the models are. Where does moderation live, right?
At Instagram, there's the big centralized Meta trust and safety function. At YouTube, it's in the product org under Neal Mohan. Where does it live for you? I would probably put it in three places. One is in the actual model training and fine-tuning, where part of what we do on the reinforcement learning side is define a constitution for how we think Claude should behave in the world, and that gets baked into the model itself.
Early on, before you hit the system prompt, before people are interacting with it, that's getting encoded into the model: how should it behave, what should it be willing to answer and chime in on, and where should it not? And that's very linked to the responsible scaling piece. The next is in the actual system prompt.
In the spirit of transparency, we actually just started publishing our system prompts. People would always figure out clever ways to try to reverse them anyway, and we were like, that's going to happen; why don't we just treat it like a changelog and be transparent? So as of, I think, last week, you can go online and actually see what we've changed.
That's another place where there's additional guidance we give to the model around how it should act. Of course, ideally it gets baked in earlier; people can always find ways to try to get around it, but we're fairly good at preventing jailbreaks. And then the last piece is where our trust and safety team sits, and the trust and safety team is the last line.
At Instagram, we called it at one point "trust and safety" and at another point "well-being," but it's that same kind of last-mile remediation piece. And I would bucket that work into two pieces. One is: what are people doing with Claude and publishing out to the world? With Artifacts, it was the first product we had with any amount of social at all, where you create an artifact, share it, and actually put it on the web.
And that's a very common problem with any kind of shared content. I lived shared content for almost 10 years at Instagram, and here I was like, wait, do people have usernames? How do things get reported?
We ended up delaying that launch for a bit because we had to make sure we had the right T&S, trust and safety, pieces around moderation, reporting, queues for taking things down, limited distribution, and figuring out what it means for the people on Teams plans
versus individuals. It was one of those things where I got very excited, like, let's ship this, sharing artifacts, and then a bit later, okay, now we can ship it; we had to actually sort these things out first. So that's on the content moderation side. And then on the response side, we also have additional pieces that sit there, around preventing the model from reproducing copyrighted content.
That's something we want to prevent in the completions as well, along with other harms that go against the way we think the model should behave, which should ideally be caught even earlier, but if they aren't, they can get caught at that last mile. I was talking to our head of trust and safety about this last week.
He calls it the Swiss cheese method: no one layer will catch everything, but ideally enough layers stacked will catch a lot of it before it reaches the end. You know, I'm very worried about AI-generated fakery across the internet. This morning I was looking at a Denver Post article about a fake news story about a murder; people were calling the Denver Post to find out why they hadn't reported on it, which is, in its own way, the correct outcome, right?
They heard a fake story, and they called a trusted source. At the same time, the Denver Post had to go run down this fake murder true-crime story because an AI had just generated it and put it on YouTube. It all seems very dangerous to me. There's the death of the photograph; we talk about it all the time. Are we going to believe what we see anymore? Where do you sit on that?
Anthropic is obviously very safety-minded, but you are still generating content that can go haywire in all kinds of ways. Yeah. I would maybe split what's internal to Anthropic from what I've just seen out in the world.
The Grok image generation stuff that came out, like, two weeks ago was fascinating, because, and I think maybe they've introduced some guardrails since launch, it felt like almost a total free-for-all. Do you want to see Kamala Harris with a machine gun? It was crazy stuff.
I go back and forth between believing that actually having examples like that in the wild is helpful, almost inoculating, for what you take for granted as a photograph or not, or even a video or not; I don't think we're far from that as well. And maybe the answer is calling the Denver Post or a trusted source, or maybe it's creating some hierarchy of trust that we can go after. There are no easy answers there.
But that's, I would say, an industry thing, or, to not be so grandiose, a society-wide thing that I think we're going to reckon with on the image and video pieces. And then on text, I think what changes with AI is the mass production. So one thing that we look at is any type of coordinated effort. We looked at this as well at Instagram.
At the individual level, it might be hard to catch the one person commenting in a Facebook group trying to start some stuff; that's probably indistinguishable from a human. But we really look for networks of coordinated activity, and we've been doing the same on the Anthropic side. This is going to happen more on the API side, if it happens, rather than on Claude.ai.
I think there's just more effective official ways of doing things scaled. But when we see spikes in activity, that's when we can go in and say, all right, like, what is this end up looking at? Let's go learn more about this particular API customer, you know, we need to have a conversation with them. What are they actually doing for what is the use case?
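As a rough illustration of the spikes-in-activity signal he describes, here is one simple way to flag an API customer whose hourly request volume jumps far above its recent baseline. The threshold, the data, and the shape of a real trust-and-safety system are all assumptions.

```python
# Toy spike detector for per-customer API volume. A real trust-and-safety
# pipeline would be far richer; this just shows the shape of the signal.
from statistics import mean, stdev

def is_spike(hourly_counts: list[int], threshold_sigmas: float = 4.0) -> bool:
    """Flag if the latest hour is far above the customer's recent baseline."""
    baseline, latest = hourly_counts[:-1], hourly_counts[-1]
    mu, sigma = mean(baseline), stdev(baseline)
    return latest > mu + threshold_sigmas * max(sigma, 1.0)

# A customer that quietly sends ~100 requests/hour, then suddenly 5,000:
history = [98, 103, 95, 110, 101, 99, 104, 5000]
if is_spike(history):
    print("Review this customer: what is the use case behind this burst?")
```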
I think it's important to be clear as a company about what you consider bugs versus features. It would be an awful outcome if Anthropic models were being used for any kind of coordinated fake news or election interference. So we've got the trust and safety teams actively working on that, and to the extent we find anything, the response will be a combination of additional model parameters plus trust and safety shutting it down if we have to.
We need to take another quick break. We'll be right back. Support for this podcast comes from Huntress. If you're a small business owner, the threat of hackers isn't just a threat, it can affect your livelihood. Small businesses are easy targets for hackers and Huntress wants to give businesses the tools to help. Huntress is where fully managed cybersecurity meets human expertise. They offer a revolutionary approach to manage security that isn't all about tech.
It's about real people providing real defense. When threats arise or issues occur, their team of seasoned cyber experts is ready 24 hours a day, 365 days a year for support. They provide real-time protection for endpoints, identities, and employees, all from a single dashboard. Their cutting-edge solutions are backed by experts who monitor, investigate, and respond to threats with unmatched precision. Now you can bring enterprise-level expertise without needing a massive IT department.
Huntress can empower your business as they have done for over 125,000 other businesses. Let them handle the hackers so you can focus on what you do best. Visit huntress.com slash decoder to start a free trial or learn more. When it comes to business, you know this podcast has you covered. But who do you turn to when you need smart financial decisions? If your answer is our sponsor, Nerdwallet, then you're absolutely right. And if it's not, allow us to change your mind.
Not only have the nerds over at Nerdwallet spent thousands of hours researching and reviewing over 1,300 financial products, but they have the tools you need to make smarter decisions. Looking for a credit card? Go beyond the basic comparisons. At Nerdwallet, you can filter for the features that matter to you and read in-depth reviews. Ready to choose a high yield savings account? Get access to exclusive deals and compare rates, bonuses, and more.
House hunting? View today's top mortgage rates for your home sweet home. Make the nerds your go-to resource for smart financial decisions. Head to nerdwallet.com forward slash learn more. Nerdwallet, finance smarter. NerdWallet Compare, Incorporated. NMLS 1617539. This episode is brought to you by Shopify. Forget the frustration of picking commerce platforms when you switch your business to Shopify, the global commerce platform that supercharges your selling, wherever you sell.
With Shopify, you'll harness the same intuitive features, trusted apps, and powerful analytics used by the world's leading brands. Sign up today for your $1 per month trial period at Shopify.com slash tech, all lowercase. That's Shopify.com slash tech. We're back with Anthropic chief product officer Mike Krieger to discuss where he thinks generative AI is going next, and whether it's dangerous.
With apologies to my friends at Hard Fork, Casey and Kevin, they ask everybody what their p(doom) is, so I'm going to ask you that. But that question is rooted in AGI: what are the chances we think it'll become self-aware and kill us all? Let me ask you a variation first: what if all of this just hastens our own information apocalypse and we end up taking ourselves out? Do we need the AGI to kill us all, or are we headed toward information apocalypse first?
Yeah. On the information piece, living in a society with this amount of information, even without AI, has already been a journey over the last 10 years. Just take primarily textual social media; some of this happens on Instagram as well, but it's easier to disseminate when it's just a piece of text you can copy. But I think it comes and goes.
We go through waves of, oh man, how are we ever going to get to truth? And then good truth-tellers emerge, and people flock to them. Some of them are traditional sources of authority, and some are just people who have become trusted. We could have a separate conversation on verification and validation of identity, which I think is an interesting one as well. But I'm an optimistic person at heart, if you can't tell.
Part of that optimism, on the information chaos or proliferation piece, is my belief in our ability to learn, adapt, and then put the right mechanisms in place. So I remain optimistic that we'll continue to figure it out on that front. The AI component, I think, increases the volume, and the thing you would have to believe is that it could also improve the parsing. I think it was a Neal Stephenson novel that came out a few years ago,
or maybe a William Gibson one; one of the two of them had the concept that in the future perhaps you'll have a social media editor of your own, deployed as a sort of gating function between all the stuff that's out there and what you end up consuming. There's some appeal in that to me: if there's a massive amount of content to consume, most of it probably isn't going to be useful to you. I even try to scale back my own information diet.
And to the extent that there are things that are interesting, I'd love the idea of: go read this thing in depth, this is worthwhile for you. Well, let me bring this back around. We started talking about recommendation algorithms, and now we're talking about classifiers and filters on social media to help you see stuff. You're on one side of it now, right? Claude makes the things, and you try not to make bad things.
The other companies, Google and Meta, are on both sides of the equation, right? They're racing forward with Gemini, they're racing forward with Llama, and then they have to make the filtering systems on the other side to keep the bad stuff out. It feels like those companies are at decided cross-purposes with themselves.
Right. An interesting question, and I don't know what he would currently say, is if you asked Adam Mosseri what percentage of Instagram content could, would, and should be AI-generated, or at least AI-assisted. Now, from your seat at Anthropic, knowing how the other side works, is there anything you're doing to make the filtering easier?
Is there anything you're doing to make it more semantic, more understandable what you're looking at, so that the systems that sort the content have an easier job of telling what's real and what's fake? Yeah. On the research side, and this is now outside my own area, there's active work on the techniques that could make AI content more detectable. Is it watermarking, is it probabilities, et cetera?
That's an open question, but also a very active area of research. Actually, I would break it down into three pieces. There's what we can do on the detection and watermarking side. On the model piece, there's having it express uncertainty a little better: I actually don't know about this,
I'm not willing to speculate, or I'm not willing to help you filter these things because I can't tell which of them are true. That's also an open and very interesting area of research. And then the last one is, if you're Meta or if you're Google, maybe the bull case is that if you're primarily surfacing content generated by models you yourself are building, there's probably a better closed loop you can have there.
I don't know if that's going to play out, or whether people will always just flock to whatever the most interesting image generation model is, create something, publish it, and blow it up. I'm not sure; I think the jury's still out on that one. But I would bet that with Instagram's built-in tools, 90-plus percent of photos that were filtered were filtered inside the app, because it's just the most convenient thing.
In that way, a closed ecosystem could be one route to at least having some verifiability of generated content. Instagram filters are kind of an interesting comparison here. Instagram started as photo sharing among Silicon Valley nerds, and it became Instagram, a dominant part of our culture. And the filters had real effects on people's self-image, real negative effects, particularly on teenage girls and how they felt about themselves.
There are some studies that say teenage boys are starting to have self-image and body-image issues at higher and higher rates because of what they perceive on Instagram. That's bad, right? And you weigh it against the general good of Instagram: many more people get to express themselves, and we built different kinds of communities. How are you thinking about those risks with Anthropic's products? A coach I was working with,
I would always push him: well, I want to start another company with as much impact as Instagram. And he said, first of all, there's no cosmic ledger where you'll know exactly what impact you had, and second of all, what's the equation weighing positive against negative? I think the right way to approach these questions is with humility, and then understanding things as they develop. But I'm excited and overall very optimistic about AI and its potential.
If I'm going to be actively working on it, I wanted to be somewhere where the risks and the mitigations were as important and as foundational to the founding story, maybe to bring it back to why I joined. That's how I balanced it for myself: you need to have that internal loop of, great, is this the right thing to launch? Should we launch this? Should we change it in some ways? Should we add some constraints? Should we explain its limitations in some ways?
I think it's essential that we grapple with those questions, or else you'll end up in the mode of, well, this is clearly just a force for good, let's blow it up and go all the way. I feel like that misses something. Having seen it at Instagram: you can build a commenting system, but you also need to build the bullying filter, which we built. I think that's a good thing. This is the second Decoder question: how do you make decisions? What's your framework?
Actually, maybe I'll go meta for a quick second. The culture here at Anthropic is extremely thoughtful and very document-writing oriented. If a decision needs to be made, there's usually a document behind it. There are pros and cons to that. It means that as I joined and wondered, why did we choose to do this, people would say, oh yeah, there's a doc for that. There's literally a doc for everything, which helped my ramp-up.
But sometimes I'd ask, why have we still not built this? Oh yeah, we wrote a doc about that two months ago. Well, did we do anything about it? So my whole decision-making approach is that I want us to get to truth faster. None of us individually knows what's right. And getting to truth could be: let's de-risk the technical side by building a technical prototype. If it's on the product side, let's get it into somebody's hands.
Figma mockups are great, but how's it going to move on the screen? Minimizing time to iteration and time to hypothesis testing is my fundamental decision-making philosophy. I've tried to instill more of that here on the product side. Again, it's a thoughtful, very deliberate culture, and I don't want to lose most of that. But I do want there to be more of this hypothesis testing and validation component.
And I think people feel it when they say, oh yeah, we had been debating this for a while, but we actually built it, and it turns out neither of us was right; there's a third direction that's more correct. At Instagram, we ran the gamut of strategy frameworks. The one that has consistently resonated the most with me is Playing to Win. I go back to it often, and I've instilled some of it here as well as we start thinking about: what's the winning aspiration?
What are we going after? And then more specifically, and we touched on this in our conversation today, where will we play? Because we're not the biggest team by size, we're not the biggest chat UI by usage, and we're not the biggest AI model by usage either. There are a lot of interesting players in the space, so we have to be thoughtful about where we play and where we invest. And then this morning, I had a meeting where the first 30 minutes were people being in pain because of a strategy choice.
The cliche is that strategy should be painful, and people forget the second part of that: you will then feel pain from the strategy. It creates trade-offs. At Instagram, we always talked about doing fewer things better; that was a foundational company value, and the most important part. Wait, what was the trade-off and what was the pain?
The trade-off, without getting too deep into the technical details, is basically about the next generation of models and which particular optimizations we're making. I can't share exactly what, but it will make one thing really good and another thing just okay, pretty good. And the thing that gets really good, I think, is a big bet, and it's going to be really exciting. And everybody's like, yeah. But.
Yeah, but. So I'm actually having us write a little mini-document that we can all sign. I know this sounds kind of cheesy, but it says: we are making this trade-off, this is the implication, this is how we'll know whether we're right or wrong, and here's how we're going to revisit this decision. And I want us all to at least sign it in Google Docs and say, this is our joint commitment. Or else you end up, the next week, with the but-what-about relitigating.
So it's not even disagree and commit. It's: feel the pain, understand it, and don't go blindly into it forever. I'm a big believer in this when it comes to hard decisions, even decisions that can feel like two-way doors. The problem with two-way doors is that you can keep walking back and forth between them. So you have to walk through the door and say, the earliest I'd be willing to go back the other way is, say, two months from now,
or with this particular piece of information. Hopefully that quiets even the internal critic of: it's a two-way door, I'm always going to want to go back. I think this brings me to a question I've been dying to ask this whole time. You're talking about next-generation models. You're new to Anthropic. You're building products on top of these models. I am not convinced that LLM technology can do all the things people are saying it will do.
My personal p(doom) is more like: I don't know how you get from here to there. I don't know how you get from LLMs to AGI. I see them being good at language; I don't see them being good at thinking. Do you think LLMs can do all the things people want them to do? For the current generation: yes in some areas, no in others.
Part of what maybe makes me an interesting product person here is that I really believe in our researchers, but my default belief is that everything takes longer, in life, in general, in research, and in engineering, than we think it will. I do this mental exercise with the team: if our research team got Rip Van Winkled and all fell asleep for five years, I still think we'd have five years of product roadmap.
We'd be bad at our jobs, terrible at our jobs, if we couldn't think of all the things that even our current models could do in terms of improving work, accelerating coding, making things easier, coordinating work, even intermediating disputes between people. That last one is a funny LLM use case we've even seen play out internally.
These two people have different beliefs; help us ask each other the right questions to get to that place, be a good sounding board. There's a lot embedded in the current models. But I'd agree with you that the big open question to me is basically longer-horizon tasks: what is the horizon of independence that you can, and are willing to, give the model?
The metaphor I've been using is that right now, LLM chat is very much back-and-forth, because you have to correct it, you have to iterate: no, that's not quite what I meant, I meant this.
A good litmus test for me is: when can I email Claude and generally expect that an hour later it won't just give me the answer it would have given me in chat, which would be a failure, but will have done more interesting things, gone and found things out, iterated on them, even self-critiqued, and then responded? For some domains, I don't think we're that far from that.
For some other domains, especially ones that involve longer-range planning or thinking or research, I think we're still far. But I use that as my capabilities measure, rather than parameter size or a particular eval. To me it's, again, what problem are you solving? Right now, I joke with our team, talking to Claude is like talking to a very intelligent amnesiac.
Every time you start a new conversation, it's: wait, who are you again? What am I here for? What did we work on before? Instead, can we carry continuity? Can it plan and execute on longer horizons? And can you start trusting it to take on more things? Because there are things I do every day, an hour spent on stuff I really wish I didn't have to do,
that are not a particularly leveraged use of my time, and I don't think Claude could quite do them right now without a lot of scaffolding. Here's maybe a more succinct way to put it: right now, the scaffolding needed to get it to execute more complex tasks doesn't always feel worth the trade-off, because you probably could have done the task yourself.
There's an XKCD comic about time spent automating something versus time you actually save by automating it. That trade-off sits at different points on the AI curve. And I think the bet is: can we shorten that time to value so that you can trust it to do more of those things?
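For a sense of what that scaffolding can look like, here is a skeletal plan, critique, and revise loop around a model call. It is a sketch only: `ask_model` is a placeholder to be wired to a real LLM client, and the stopping rule is invented.

```python
# Skeletal long-horizon scaffolding: draft, self-critique, revise, repeat.
# `ask_model` is a placeholder for any LLM call; the loop is the point.

def ask_model(prompt: str) -> str:
    raise NotImplementedError("wire this to your LLM client of choice")

def work_on(task: str, max_rounds: int = 3) -> str:
    draft = ask_model(f"Do this task:\n{task}")
    for _ in range(max_rounds):
        critique = ask_model(
            f"Task:\n{task}\n\nDraft:\n{draft}\n\n"
            "List concrete problems with this draft, or reply DONE."
        )
        if critique.strip() == "DONE":
            break  # the model is satisfied with its own work
        draft = ask_model(
            f"Task:\n{task}\n\nDraft:\n{draft}\n\n"
            f"Problems:\n{critique}\n\nRewrite the draft to fix them."
        )
    return draft
```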
Probably nobody gets excited to, you know, coalesce all the planning documents my product teams are working on into one document, write the meta-narrative, and circulate it to these three people. I don't want to do that today. I have to do it today, but I don't want to. Well, let me ask you in a more numeric way. I'm looking at some numbers here.
Anthropic has taken more than $7 billion of funding over the last year. You're one of the few people in the world who has ever built a product that delivered a return on that scale of investment. You can probably imagine some products that might return on that investment. Can the LLMs you have today build those products? That's an interesting way of asking it.
The way I think about it is that today's LLMs deliver value, but they also help our ability to go build the thing that delivers that value. So actually, let me ask you the threshold question: what are those products that can deliver that much value? Yeah. To me, right now Claude is an assistant; a helpful sidekick is how I've heard it put internally. The question is: at what point is it a coworker?
Because the amount of work that can happen with assistance, even in a growing economy, is very, very large. So I think a lot about Claude for Work. Claude for Work right now is almost a tool for thought: you can put in documents, you can sync things, you can have conversations, and people find value. Somebody built a small fission reactor or something; it was on Twitter. Claude didn't build it, but Claude was there as their tool for thought.
The point where it becomes an entity you actually trust to execute autonomous work within the company, that delivered product sounds like a fanciful idea, but I actually think delivering it is way less sexy than people think. It's permission management, it's identity, it's coordination, it's remediation of issues. It's all the stuff you actually do in training a good person to be good at their job.
To me, even within a particular discipline, some coding tasks, some tasks that involve coalescing information or research, getting to have the incremental person on your team, even if they're not net-plus-one productive but net 0.25, and maybe there are a few of them coordinated, I get very excited about the economic potential of that.
And is that all for 20 bucks a month, or the enterprise subscription product? I think the price point for that is much higher if you're delivering that kind of value. I was debating with somebody about what Snowflake and Databricks have shown, and Datadog is another one: usage-based billing is the new hotness. We had subscription billing; now you have usage-based billing.
The thing I would like to get us to, and it's hard to quantify today, although maybe we'll get there, is real value-based billing: what did you actually accomplish with this? A common complaint I hear from people who ping us is that they hit our rate limits and want more Claude. I saw somebody who said, well, I have two Claudes: I have two different browser windows. I'm like, God, we've got to do a better job here.
But the reason they're willing to do that, and they write in and say it, is: look, I'm working on a brief for a client who's paying me X amount of money. I would happily pay another $100 to finish the thing so I can deliver it on time and move on to the next one. That, to me, is an early sign of where we fit, where we can provide value well beyond a $20 subscription.
When I think about deployed Claudes, and this is early product thinking, but it's the kind of thing I get excited about, being able to think about what value you're delivering and really aligning on that over time creates full alignment of incentives in delivering that product. That's an area I think we can get to over time. So I'm going to bring this all the way back around.
We started by talking about distribution and whether things can get so tailored to their distribution that they don't work in other contexts. I look around and I see Google distributing Gemini on its phones. I look at Apple distributing Apple Intelligence on its phones; they've talked about maybe having some model interchangeability in there. Right now it's OpenAI, but maybe Gemini will be there, maybe Claude will be there. That feels like the big distribution.
They're just going to take it, and those are the experiences people will have unless they pay money to someone else. In the history of computing, the free thing that comes with your operating system tends to be very successful. And I don't think OpenAI is getting any money to be in Apple Intelligence; I think Apple just assumes some people will convert for 20 bucks, and they're Apple, and that's as good as the deal is going to get.
How are you thinking about this problem? How are you thinking about widening that distribution without optimizing for someone else's platform? Yeah, I love the question. I get asked this all the time, even internally: should we be pushing harder into an on-device experience? And I agree, it's going to be hard to supersede the built-in model provider there.
Even if our model might be better at particular use cases as a utility, I get more excited about: can we be better at being close to your work? Work products have a much better history of competing with the built-in thing. Pages comes with your Mac, and plenty of people do their work in Pages, I hear, I don't know.
But there's still real value in a Google Docs, or even a Notion, and others that can go deep on a particular take on that productivity piece. That's why I lean us more heavily into how people get things done. Some of that will be mobile, but maybe almost as a companion, delivering value that's independent of needing to be exactly integrated into the desktop.
As an independent company, trying to be that first call, that Siri, I've heard the pitch from startups, even before I joined here: we're going to do that, we're going to be so much better, and the new Action button means you can bring it up with one press. I'm like, no, the default really, really matters there. Instagram never tried to replace the camera.
We just tried to make a really good thing for what you could do once you'd decided you wanted to do something novel with that photo. And sure, people took photos in there, but by the end, when we left, it was something like 85 percent library, 15 percent camera. There's real value in the thing that requires just the one click. It was interesting, because every time WWDC came around, pre-Instagram, I loved watching those announcements.
I was like, what are they going to announce? And then it changed to, uh oh, what are they going to announce? You get to the point where you realize they're going to be really good at some things. Google's going to be great at some things. Apple's going to be great at some things.
You have to find the places where you can differentiate, whether in a cross-platform way, in a depth-of-experience way, in a novel take on how work gets done, or by being willing to do the kind of work some companies are less excited to do because, at the beginning, it doesn't seem super scalable. Are there scalable consumer products, $7 billion worth, that don't rely on being built into your phone?
I mean, I open up the App Store and ChatGPT is regularly number two. I don't know what their numbers look like, but I think it's pretty healthy right now. Long-term, I optimistically believe yes, because, and let's conflate mobile and consumer for a second, which is not a super fair conflation, but I'll go with it, so much of our lives still happens there
that whether it's LLM plus recommendations, or LLM plus shopping, or LLM plus even dating, I have to believe that at least a heavy app component can be part of a $7 billion-plus business, but not one where you're trying to effectively be Siri++. I think that's a hard place to be. Yeah. OpenAI's answer to this appears to be search. I feel like I need to disclose that, like every other media company, Vox Media has taken the money; I had nothing to do with that deal.
I'm just letting people know that we took the money too. It feels like their answer is search, right? If you can capture some percentage of Google's business, you've got a pretty good business. Satya Nadella told me when they launched the ChatGPT-powered Bing that even half a percent of Google is a huge boost to Bing. Would you build a search product like that? We've talked about recommendations a lot, and recommendations and search are right next to each other.
Yeah. It's not on my mind for any near-term thing. I'm very curious to see it. I haven't gotten access to it, probably for good reasons, although I know Kevin Weil pretty well; I should just call him and say, hey, let me into the beta. So I haven't gotten to play with it, but that space of Perplexity search, ChatGPT search, I forget how they actually branded it.
SearchGPT. SearchGPT, yeah. It ties back to the very beginning of our conversation: search engines in a world of summarization and citations, but probably fewer clicks. Where does that end up? How does it all tie together and connect? It's less core, I would say, to what we're trying to do. So it sounds like the focus right now is on work, right?
You've described a lot of work products you're thinking about, maybe not so much on the consumer side. I would say the danger in the enterprise is that it's bad if your enterprise software hallucinates. Broadly, it seems risky; it seems like those folks might be more inclined to sue you if you send some business haywire because the software hallucinated. Is this something you can solve?
I've had a lot of people tell me that LLMs are always hallucinating, that we're just controlling the hallucinations, and that I should stop asking people whether they can stop the hallucinating because the question doesn't make any sense. Is that how you're thinking about it? Can you control it so that you can build reliable enterprise products? I think we have a really good shot there. Of the two places this most recently came up, one was that LLMs will oftentimes try to do math.
Sometimes they actually are, especially given the architecture, impressively good at math, but not always, and especially not when it comes to higher-order things, or even things like counting letters and words. So one tweak we made recently was helping Claude, at least on claude.ai, recognize when it's in that situation and explain its shortcomings. Is it perfect?
No, but it significantly improved things, and it came directly from an enterprise customer that said: hey, I was trying to do some CSV parsing. I'd rather you give me the Python to go analyze the CSV than try to do it yourself, because I don't trust that you'll do it right.
So on the data analysis and code interpretation front, it's a combination of having the tools available. LLMs are very smart, but so are humans, and I still use calculators all the time. In fact, over time I feel like I get worse at mental math, and people come to rely on calculators even more. So there's a lot of value in: give it tools, teach it to use tools, which is a lot of what the research team focuses on.
And then really emphasize the moments of: yeah, I know you think you can do this. The joke I make about the CSV version is, sure, I can eyeball a column of numbers and give you my average, but it's probably not going to be exactly right, so I'd rather use the average function. So that's the data front.
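This delegate-the-math pattern is roughly what tool use looks like against the Anthropic Messages API: you describe a tool, the model can emit a `tool_use` block instead of eyeballing the numbers, and your code runs the deterministic computation. The `average` tool here is a made-up example, and parameter details may have changed since this conversation; check the current API docs.

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set

# A made-up "average" tool, so the model delegates arithmetic instead of
# eyeballing a column of numbers.
tools = [{
    "name": "average",
    "description": "Compute the exact arithmetic mean of a list of numbers.",
    "input_schema": {
        "type": "object",
        "properties": {
            "numbers": {"type": "array", "items": {"type": "number"}},
        },
        "required": ["numbers"],
    },
}]

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=512,
    tools=tools,
    messages=[{"role": "user", "content": "What's the average of 17, 42, and 96?"}],
)

# If the model chose the tool, run the deterministic computation ourselves.
for block in response.content:
    if block.type == "tool_use" and block.name == "average":
        nums = block.input["numbers"]
        print("Exact average:", sum(nums) / len(nums))
```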
On the citations front, the app that has done this best recently, and I have no affiliation with it other than that we listen to her parenting advice all the time, is from Dr. Becky, a parenting guru who has a new app out. I really like playing with chat apps, so I try to push them, and I pushed this one hard, trying to get it to hallucinate or talk about something it wasn't familiar with.
I'll have to go talk to the makers; they're on Twitter. They do a great job of: if it's not super confident that the information is in its retrieval window, it will just refuse to answer. It won't confabulate; it won't go there.
I think that's an answer as well: the combination of model intelligence plus data, plus the right prompting and retrieval so that it won't answer unless there's actually something grounded in the context window, helps tremendously on the hallucination front. Does it cure it? Probably not, but all of us make mistakes.
Hopefully they're predictably shaped mistakes, where you can say: oh, danger zone, we're talking outside our knowledge here. I even like the idea of having almost syntax highlighting for: this is grounded in my context, this is from my model knowledge, and this is out of distribution, danger, Will Robinson, I'm not exactly sure what I'm talking about. Maybe there's something there.
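A hedged sketch of that grounding pattern: retrieved passages are wrapped in tags, and the instructions tell the model to refuse whenever the context does not support an answer. The wording is illustrative, not the actual prompt of any app mentioned here.

```python
# Illustrative grounding prompt: the model may only answer from the supplied
# context, and must refuse otherwise. Retrieval itself is out of scope here.

GROUNDED_SYSTEM_PROMPT = """\
Answer ONLY using the reference material between <context> tags.
If the answer is not clearly supported by that material, reply exactly:
"I don't have enough information in my sources to answer that."
Never speculate beyond the provided context."""

def build_prompt(retrieved_passages: list[str], question: str) -> str:
    context = "\n\n".join(retrieved_passages)
    return f"<context>\n{context}\n</context>\n\nQuestion: {question}"

print(GROUNDED_SYSTEM_PROMPT)
print(build_prompt(["Toddlers need 11-14 hours of sleep per day."],
                   "How much sleep does a toddler need?"))
```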
This all just adds up to my feeling that prompt engineering, and teaching a model to behave itself, feels non-deterministic in a way. Like the future of computing is a misbehaving toddler and we just have to contain it, and then we'll be able to talk to our computers like real people and they'll be able to talk to us like real people. That just seems wild to me.
Even when you release the system prompts, I read them and think, this is how we're doing it? Apple's system prompt literally says, do not hallucinate. That's where we are: this is how we're doing it. Does that feel right to you? Does it feel like a stable foundation for the future of computing? It's a huge adjustment. As an engineer at heart, I like determinism in general.
We had an insane issue at Instagram that we eventually tracked down to using non-ECC RAM: literal cosmic rays were flipping bits. How do you even account for that? You want to be able to rely on your hardware. But here's the moment, maybe four weeks into this role, where I thought, okay, I can see both the perils and the potential. We were building a system in collaboration with a customer, and we talk about tool use, right?
Meaning what the model has access to. We had made two tools available to the model in this case: one was a to-do list app it could write to, and one was a reminder, a sort of short-term timer-type thing. And the to-do list system was down. The model went: oh man, I tried to use the to-do list and couldn't. You know what I'm going to do? I'm going to set a timer for when you meant to be reminded about this task.
And it set an absurd timer, a 48-hour timer. You would never do that on your phone; it would be ridiculous. But to me it showed that non-determinism also leads to creativity. And that creativity in the face of uncertainty is ultimately how I think we're going to be able to solve these higher-order, more interesting problems. That was the moment I thought: that's non-deterministic, but I love it.
It's non-deterministic, but I can put it in these odd situations and it will do its best to recover or act in the face of uncertainty. Whereas with any heuristic-based system, if I had written it, I would never have thought of that particular workaround. But it did, in a pretty creative way. So I can't say it sits totally easily with me, because I still like determinism and predictability in systems; we seek predictability where we can find it.
But I've also seen the value of how, within that constraint, with the right tools and the right infrastructure around it, it can be more robust to the necessary messiness of the real world. You're building up the product infrastructure, and you're obviously thinking a lot about the big products and how you might build them. What should people be looking for from Anthropic? What's the major point of product emphasis we should watch for?
Yeah. On the Claude side, between the time we're talking and when this airs, we're launching Claude for Enterprise. That's our push into really going deeper. On the surface it's a bunch of unexciting acronyms, like SSO and SCIM, plus data management and audit logs. But the importance of it is that you start getting to push into really deep use cases, and we're building the data integrations that make that useful as well. So there's that whole component.
And then on the API side, and we didn't talk as much about the API, although I think of it as as important a product as anything else we're working on, the big push is: how do we get lots of data into the models? Ultimately they're smart, but they're not that useful without good data, data that's tied to the use case. How do we get a lot of data in there, and make it really quick?
So we launched explicit prompt caching last week, which basically lets you take a very large data store, put it in the context window, and retrieve it 10 times faster than before. Look for those kinds of ways in which the models can be brought closer to people's actual interesting data. Again, it always ties back to Artifact: getting you personalized, useful answers in the moment, at speed and at low cost.
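At launch, the prompt caching he mentions looked roughly like this: a beta header plus a `cache_control` marker on the large, reused block so later calls can skip reprocessing it. Treat the header and field names as a snapshot of the 2024 beta rather than current API truth.

```python
import anthropic

client = anthropic.Anthropic()

# Mark a large, reused context block as cacheable so subsequent calls can
# skip reprocessing it. Header and field names are from the 2024 beta.
big_reference_text = open("reference.txt").read()  # hypothetical large doc

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
    system=[{
        "type": "text",
        "text": big_reference_text,
        "cache_control": {"type": "ephemeral"},  # cache this prefix
    }],
    messages=[{"role": "user", "content": "Summarize the key points."}],
)
print(response.content[0].text)
```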
That whole push, and I think about this a lot, is that good product design pushes extremes in some direction. This is the lots-of-data extreme, but push the latency extreme too and see what happens when you combine those two axes. That's a thing we'll continue pushing for the rest of the year. Well, Mike, this has been great. I could talk to you forever and ever about this stuff. Thank you so much for joining Decoder. It was great to be here.
I'd like to thank Mike Krieger for taking the time to join Decoder, and thank you for listening. I hope you enjoyed it. If you'd like to let us know what you thought about the show, or anything else you'd like us to cover, please drop us a line. You can email us at decoder@theverge.com; we really do read all the emails. Or you can hit me up on Threads, I'm @reckless1280. We also have a TikTok, it's @decoderpod. It's a lot of fun.
Don't forget to share it with your friends and subscribe wherever you get podcasts. Decoder is a production of The Verge and part of the Vox Media Podcast Network. Our producers are Kate Cox and Nick Statt. Our editor is Callie Wright. Our supervising producer is Liam James. The Decoder music is by Breakmaster Cylinder. We'll see you next time.