¶ Intro
🎵 Music
Hello and welcome to the Attention Mechanism.
Robert Young, joined as always by the one and only.
Hello there, Justin. Good to see you.
I was going to be doing this in my studio, but, uh, there's a thing. Our fiber line was cut by our landscapers. AT&T came out today; apparently they only screwed with the upload and not the download, and so I hacked into Darren Kitchen's house and now, uh, here I am doing the show.
You hacked in?
Yeah.
Climb through the doggy door.
Yeah.
Uh
I just dove in. We have a lot to talk about though.
Exactly. I know, I know. Uh, we have a lot to talk about on this episode. We remain in an export-controlled situation for Anthropic's top models, Fable and Mythos. Meanwhile, OpenAI is making a big show of their five point five Cyber, showing off some very impressive benchmarks and committing to a bunch of ways that they hope they can leverage this kind of stuff at scale for patching holes in the internet, as it does look like these chained agentic
hacking techniques are something that people are taking very seriously. And then on the horizon, possibly even this week, we see the debut of five point six and a brand new voice model. But let's begin.
¶ Anthropic
Again
with Anthropic. Mr. Mayne, where are we as we talk today, Monday, June twenty-second, on Anthropic's now over-a-week export-controlled main model?
A week ago, people were thinking that maybe they would have Fable back available by Monday. And we're a week later and that didn't happen. You know, as you recall, it was a week ago Friday when kind of the fit hit the shan, so to speak, and it got pulled and they had the export control letter. And then, you know, we're told the Anthropic team was called to DC Friday. They got there Sunday, and, you know, a lot of things happened.
You know, we ended up seeing Dario show up at a G six, G seven summit, along with Sam and some other AI people. Trump had said things are directionally positive about Anthropic. The challenge is this: it's a frontier, and I think metaphors got us into trouble. I think using metaphors like "this is like nuclear material" was really, really not a good idea. Dario's built an exceptional company. Crucial product, great team, but
When you tell the US government, who is in the middle of, by the way, bombing an aspiring nuclear power, that this is a new nuclear power, and then they're like, hey, we have concerns, and you're like, yo, bro, we've got it, don't worry, that's not a concern.
Particularly when, you know, as Trump had said, hey, one of their investors and competitors had said this, basically alluding to the fact that Amazon was the one that went to, you know, the government and said, hey, we have an issue. And let's unpack that a bit. We're not talking about Amazon, the people that deliver you, you know, your latest edition of whatever.
Um
Herbalife, or whatever, you know, supplement you're into. We're talking about the Amazon. These are
These are not the guys that greenlit The Boys, right? Yeah. This is a different Amazon.
These are the people that are responsible for keeping, I don't know, most of the internet up and running. And there was this, I saw, they go, ah, Amazon. I'm like, yeah, Amazon has entire teams devoted to cybersecurity. And apparently the story was
They had concerns. They went to Anthropic. Anthropic brushed them off, and they didn't feel like they were getting a satisfactory answer. So then they elevated that to the government to say, hey, listen, we have concerns. Other people within, not just administration people. Understand, this is not a partisan thing. Stakeholders in security were concerned about this. And then stories came out. You know, there was
Earlier, when Mythos came out, it turned out that more people had access to it than were supposed to, that basically unrestricted access was being given to other people. Maybe even the Chinese had access to it. Then it came out that there was a government whitelist of a hundred companies that Anthropic said would have access to it. Then it turned out there were fifty more companies that had access to it.
And the government's position is, we want to know, because there are things about companies that we know that you don't know because they're top secret. One of the companies that came out was, allegedly, a South Korean telecom company
that may have had connections to, or may have been, like, infiltrated by, whatever, but there were concerns because of the Chinese government. Again, maybe true, maybe not. I don't know. I'm not sitting there in front of the dashboard, the Palantir dashboard, you know, of the NSA, getting all the intel, and very few people I know do.
And most of the people commenting on Twitter definitely don't have that access. So we don't know. But we do know that created this issue of: you said this was dangerous and scary and could do these things, but then when we brought concerns to you, you didn't act like these were scary things and deal with the concerns. And that created an issue of trust. The way a lot of this stuff works, a lot of this happens before formal policy even gets made.
And that means you have people from your company in DC who are talking on a daily basis. I remember early on when I was at OpenAI, you know, I would go to the White House to go talk about stuff, you know, to partners and stuff, in a broad sense. Meanwhile, some of the security team people were talking to the security community and doing this back and forth. Anthropic has been, and Anthropic is supposed to be,
the experts at this, because they've been selling contracts to the CIA and the NSA, apparently, allegedly, you know, for years. So I would say there was not the kind of tight communication that we wanted to have happen.
Yeah. It is very interesting that the Trump administration was able to find common cause with the Iranian Revolutionary Guard before they were with Anthropic. Now, obviously on different timetables, but in all seriousness, I do think that right now what we have seen is
It's easy to understand where the Iranians come from. They have a very simple platform: death to America. God
Simple. Pretty simple. The Anthropic thing that, you know, really makes me scratch my head is that, you know, we just saw over the last forty-eight hours Mark Warner, who's a Democratic senator, a very senior Democratic senator with ties to the intelligence community, from Virginia,
come out and say that, look, the NSA was really worried about this. And there were questions as to exactly how he described their level of worry about what Mythos could do, but put that to the side for a second. That means that in this period of time leading up to the export control, you had Andy Jassy of Amazon,
the NSA, and, we have heard from behind the scenes, multiple different government agencies. These are not politicians. These are tech-centric experts that know this tech. They are not babes in the wood on this. They know what this can do and what this can't do. And then people that had not been, at least, adversarial to Anthropic, in Scott Bessent and Howard Lutnick, that eventually have this all come to a head.
And it all does just seem to be about trust. And that's what's sort of come to the fore here. And it does seem like the administration is going out of their way to try to explain it a little bit more, because they know that this is something that can be painted on a more polarizing line. But I don't think it was shocking that they went to the Washington Post with some of this, because they wanted the message to be: look,
The Post isn't gonna do us any favors with their coverage. At least they haven't up till now. If the Post is reporting this, then this is something that, you know, people should generally feel is essentially vetted. And it all just comes down to this: for all of their talk about security, security, security, the administration, the NSA, Amazon all felt that Anthropic wasn't taking security seriously.
And that's even more context: Scott Bessent and Howard Lutnick, head of Commerce, want a successful Anthropic. They want a very successful Anthropic because they know business users use it. It's a wildly successful product. It is a great lab. They put out a big, great model.
They want a successful OpenAI, they'd like a successful Google. They want to have multiple players. They do not want anybody to run away with this. This is very clear. JD Vance has even said as much. They don't want anybody to run away with it. And as much as it might play into some fantasy, the idea of letting one lab with really good relationships to the government run away with it,
a couple of things happen. One, administrations change. Two, you create a situation of stagnation, and it's not good. So they do not want to see, you know, a hobbled Anthropic. And I don't think that's in the interest of the American consumer. You know, I don't think that's in the interest of American AI. Okay. Um
And
I think that you have to sort of understand that it starts from that premise. You can come up with a scenario where you think that, like, ah, it's vindictive, it's just like this, whatever. I think there's a world where, you know, they may want to exercise more control over them. They want more say in that. And that's a great debate to have. How much do we want government dictating these terms versus not? How much should be, you know, decided through other processes, what have you?
But I think that every company right now in AI, every frontier lab, is going through their own sort of learning curve and figuring out how to deal with the changing landscape. And, you know, we can sit there and say, oh, we'll put this policy in place. And I'm not saying that none of that should be done, but I think things move so fast. And part of the thing that I've been saying for years is, some of these things people suggested,
like, I remember years and years ago at OpenAI, there were some people, some of the safety people, like, oh, we need to worry about this frontier model. I said, tell me your greatest fear that you think, once you get above here, becomes a reality. And let me tell you if I can figure out how to do it with a smaller model, limited capability, and be able to work a way around it. And we're seeing that now
with people building these harnesses, which is basically building an information system that goes around a model, that are able to do really complex stuff. And part of it being, people say, rightly so, that, like, hey, if we don't figure out how to get Fable out there for legitimate security researchers to use,
we're gonna have a Chinese model, we're gonna have something that's not aligned, that we don't have eyes into, do it. And so I think there's an incentive to figure out as quickly as possible how to make these things widely available. I also think that sometimes trying to write the policy can be really hard because it comes at you from so many different angles.
It appears that the solution, at least per the reports around this, is that what is going to need to happen is more standard lines of communication, so conflict does not have to be resolved in the way that it was resolved last Friday. Yeah. Not to go back to the Iranian situation, but it does sound a lot like what was announced in some of the Switzerland negotiations, where now there are deconfliction lanes, where people can
make phone calls to enough people that can mediate smaller things, so smaller things don't turn into bigger things. And it seems like that might be what's happening now with Anthropic. Because it does seem like, the further we get away from this, this was not an exercise in just government bigfooting an AI lab, even the AI lab that was begging to be bigfooted, but this really was, you know, just when
you know, there was a persistent line of problems that stakeholders could not get good answers to. The threat was, we will export control this, and it was export controlled. That's the way I understand it right now.
Yeah, and I wonder, you know, there's been a push, like, hey, this is why we need third-party groups to come in and monitor. I think we've had those. My criticism of those has been that often it's hard to know what they're incentivizing. And I've seen some people like, oh, you should use this group to monitor. I'm like, well, actually, they're decels who really would like to just shut everything down.
You know, not all are, but I've seen that. Or some other ones, like, yeah, they have very terrible practices, and anything they know gets leaked to everybody and everybody knows. We've seen an interesting situation here where you had Amazon, who is heavily invested in Anthropic, and also a partner with OpenAI, who has got an economic interest in it. Economically, this was not a good play
for Amazon. It just showed you the severity of the security issue. I think they thought this was a big deal, 'cause I think they're afraid that, like, this thing comes out and somebody finds another error in MongoDB and shuts down, you know, some bucket, you know, on the West Coast which is handling all the credentials, and they've got a problem.
I think that we might be finally trying to figure out who the competent players are, and who knows. I think within agencies, you know, a lot of times you get fights between agencies, where the FBI is like, we've got the cyber team, we want to do this, and you find out, like,
yeah, you guys, not to say this is true about them, but as a hypothetical, it's like, yeah, you guys have been out of the market for a decade, working for government and doing low-level stuff, and aren't skilled to do this, you know, or you have, you know, other areas. So I think we're starting to figure out where these strengths need to be.
Um, I think there's, you know, thinking about, like, how can we make sure that the government is able to hire really good people to work on these tools and work on this stuff. I think what's interesting is we're at a point now where you have some people coming out of AI labs who are kind of, quote, what you'd call post-money.
And doing rotations and stuff in the government can be more appealing, because these aren't people who all of a sudden want to go work for another AI company, but who really have a vested interest in our country. We've had a couple of people come out and work with, like, the Pentagon has a project
where they basically enlist, like, certain, you know, career people, tech people, whatever, to do that. Maybe that's what we're gonna need more of, to basically make sure that when you go talk to somebody who's got, you know, a White House lanyard or whatever, that
they're out of a lab and they know a lot. You know, there was the, you know, Commerce Department was gonna do an appointment with somebody straight out of Anthropic. And I think that caused issues, because you don't want to make it look like one lab is all of a sudden going to put a whole bunch of their people in there. Yeah. You know? But anyhow, I think it was a good wake-up call.
It certainly says to me that we have crossed the line of demarcation when it comes to cyber. You know, if anybody thought, you know, uh, Dario, the CEO of Anthropic, wrote a blog post with a metaphor. By the way, export control on metaphors. I'm done with metaphors. Like, can we please not use metaphors? There was
somebody who is very central to, you know, AI history who wrote a thing about, like, oh, just to explain to people how Anthropic is feeling: step one, there are monsters in the forest. Enough with the monsters. Dario was talking about the Ents from Lord of the Rings. Let's just talk about this. If this is serious stuff, let's just talk about this in serious terms that people can understand without reverting to
fantasy and science fiction metaphors. Let's please just, you know, put this in a slightly more serious idiom. Anyway, that being said, everybody is wide awake right now. The Ents are moving faster than they've ever moved before when it comes to cyber capabilities and cyber infrastructure, with the level of intelligence that these models have and the chained reasoning that they are able
to operate on. And I think that that is absolutely the case with Fable. I'm sorry, with Mythos, and Fable getting to Mythos level is part of what got us here. And then, we're gonna talk about five point five and five point six from OpenAI, but I think it's likely true with that as well. This is just the level that we're at right now, when we need to actively incentivize
retrenching some of our infrastructure, our digital infrastructure here, because the capabilities are real. They are very real now.
Yeah, when I read the original Dario analogy, and he's using Hobbits, I'm like, well, you know, some could say that you're Celebrimbor being manipulated by Sauron too. Let's think about that. And in the essay he said that the Hobbits have to sort of tell the Ents, but the Ents move too slow. And then, like, a week later, it turned out the Ents moved really fast to slap on an export restriction. So, uh, we
Everything's a metaphor; it's why metaphors are terrible. I think we call things what they are, and I think the metaphors got us into some serious trouble there, because we used a metaphor like nuclear powers. Um
You know, OpenAI announced, you know, showed GPT five point five Cyber, which is their new model for that, and they have a project called Project Daybreak, in which they're working with people to find a bunch of vulnerabilities in open source software. They've talked about, hey, yeah, we think that we have a good relationship going forward to continue to release models.
And people have asked, like, well, you know, will this stop everything? And the government said, like, no, this is unique to Anthropic, which, yeah, probably has to do in part with the dynamics of the relationship. Has Anthropic put in place a much clearer chain of command, so to speak, about these things? Because at Anthropic, kind of everything runs through Dario. That is, yeah, it's Dario decides. OpenAI, you know,
is obviously a very, you know, complicated company in many ways, but also there is a lot of high trust in other people at OpenAI. And I'd say what's sort of unique about OpenAI compared to, let's say, Meta is that Sam is very good at empowering certain people and saying that if this person says go or no-go, then that's what we do. He doesn't tend to,
you know, when it seems to be pretty well reasoned out, he follows through on that. So I'd say that OpenAI has also had a different approach towards how they handle model security. So we'll see what happens with 5.6, whatnot. But I'd recommend people check out what they did with GPT-5.5 Cyber, to look at that and to see some of the ways in which they look at how they can kind of mitigate these problems.
And there's some technological things too, which maybe we'll see happen in the next week or so, which we don't have direct knowledge of, but we have kind of indirect knowledge of, where you connect a few dots and you go, like, oh yeah, really well suited to be able to handle this.
Yeah. Uh will we get Fable by the time that we talk next on Monday? Will it be available?
Um
Yes or no?
🔇 Silence
Uh, here's the other question. When Fable came out, Amazon, or sorry, Anthropic was very clear that it was getting-to-know-you time for subscriptions. That was gonna end on the twenty-fifth. We are, as we record this, on the twenty-second. Let's say that Fable is made available on the twenty-fifth. It was supposed to go API-only after that.
Do you think Anthropic throws it right to API? Do they throw it to subscription people? Is that the right thing to do? Do you think that any of this affects their decision with that?
Here's the challenge. Prior to this controversy with Fable, the previous controversy was that Fable would actively give you wrong information if you were doing frontier machine learning research. If you were doing things there, it would change things in your code base, do things. One word, I'm not saying it, but others have said, was sabotage. Okay? Yeah. And that was one of their ways of defense, to say that, like,
And then they got a protest, like, within 24 hours, and they said, okay, we won't do that. We'll just give you a warning and revert you. What else is that involved in? If I'm using Fable, not Mythos, for frontier cybersecurity, and that looks like I'm doing pen testing,
am I going to get reliable answers? If I say, look at this entire code base, tell me what the errors are, if there's any vulnerabilities that, say, a hacker could exploit, would it go, ha ha, you're not fooling me, I'm not going to tell you? There's going to be some question about that. You know, is it, like,
I don't know how much that was baked into the product, but that had a lot of people kind of, like, people almost forgot about that. And that's gonna be the question: if I'm doing anything other than designing a beautiful front end for a React website,
can I rely on it? So I do think that they'd promised people access to it up until, like, this week or whatever. And of course they didn't get access to it. I do think they're still gonna do a reset. They've got more compute, they've been catching up,
you know, best they can with the compute space. I don't know if it goes into the API right away. I think they're gonna want a lot more data from users using it. I could be mistaken. There might be a limited rollout to API for certain customers that are relying on it, but I don't know. I think they've created a very complicated situation.
I would agree. I don't know many people that would say that they have not.
¶ GPT-5.6
Meanwhile, across the metaphorical street, OpenAI. You mentioned their cyber announcement. I think it was very purposefully done, to talk about exactly how much they are committed to leveraging their state-of-the-art cybersecurity model in every project that they are working on, and how seriously they are taking it.
But the big rumor, and this is something that I have heard, is that 5.6, what would be their flagship model, has been in testing out in the public for weeks now. It would stand to reason that this Thursday would be when it is launched. It is rumored to have three different model launches. One, 5.6 Pro. This would be the state of the art of the state of the art. I'd be very curious to see what the benchmarks are on it.
There will be 5.6, and then there will be their brand new voice model. This will be their first new voice model since 4o. And this would be essentially a 5.4-powered voice model. Yeah. Those, at least, are the rumors.
There was an announcement that came out last week, which was fascinating. Was it 5.5 Instant? They say it is on par with, like, 5.5 for health-related questions. Yeah. That's significant, because you have to understand that every lab deals with their own challenges. And OpenAI's challenge is that they've got a billion users worldwide, and they have, you know, 900 million plus on the free tier.
And people often ask health questions. And one of the reasons why they came up with the switcher, and then they came up with the auto, was that, you know, as people described it, if you go to a doctor and you find out the doctor's using the free version of ChatGPT to answer your questions, not a confidence builder.
But by making the five point five Instant model as good, they say, and by the way, OpenAI's models, if you look at actual health benchmarks, are, like, really category-leading, like, incredibly capable there. It's one of the things that gets measured less because, fortunately, the average person has far fewer health questions per day than the coder has coding questions. Yeah. But I think it's gonna be interesting as we try to make these
models, like the voice models, more capable, because you and I have had to explain to friends endlessly, they're like, ah, it's so dumb. What do you mean? Here, let me show you. And they show us a voice transcript. We're like, well, you're using the voice model.
And they don't understand. It's amazing. People don't stop to think: do you think the model that can tell you something in half a millisecond, or half a second, might not be as smart as the model that tells you it's gotta take 15 seconds to think about it?
And again, you know, if you stop, you go, oh yeah. So anyhow, that'll be exciting to see the development there. I also want to just highlight something from the GPT 5.5 Cyber announcement, which was on protecting critical infrastructure and syntax systems. This is from OpenAI's website: We're also collaborating closely with government institutions around the world to uplift their defensive cybersecurity capabilities and protect critical infrastructure.
We have been working closely with the United States government and relevant federal agencies as we prepare for increasingly cyber capable AI models.
I mean, I'm not sure. I'll put it this way. I would not be shocked if, at the G seven summit in France, where all the major AI companies were, after the Anthropic export control, Sam handed Howard Lutnick, Donald Trump, and anybody else in the administration walkie-talkies so they could get in touch with him globally whenever they want. Like, forget a phone call. You could just go brrrr.
'Cause I think they are very well incentivized to be as communicative as possible in this moment, not only for their own future, but also competitively.
Well, that's another thing, yeah, the communication styles between the respective CEOs and companies. Sam is incredibly high bandwidth. That was a thing that's amazed me: when you get to Sam's level, Sam's response rate is incredibly fast. And, you know, I can get responses from Sam, and I'm a nobody. And, um,
he is a person that, like, literally is always on his phone, always doing this, whatever. And so he is extremely quick to get back. And that was a story we heard. Like, that was a frustration from the government: hey, we tried to get hold of Dario, it took an hour and a half, or an hour, whatever.
And I think that people are like, that seems reasonable. Well, I can tell you that with Sam and a couple other CEOs I know, if the number's coming from Washington, D.C., you're not gonna wait an hour.
Someone's getting back to you with answers that matter to you. Yeah. Uh uh immediately.
And I'd say that was one of the things I think people didn't realize, like, why is that such a big deal? Because I don't think Andy Jassy and Tim Cook take an hour to get back when they get those kinds of calls.
Yeah. And that is an advantage there. Uh, 5.6. What are your expectations?
Um, I have been very impressed. I will tell you something I think OpenAI really has going for it, says the head host of the OpenAI Podcast, the producer of the OpenAI Podcast. I have noticed, between 5.4 and 5.5, that 5.5 often feels like a much better model, even though it's the same model, as Codex improves. If you're using Codex, yeah.
The memory system, things like this. Things I had earlier on, where I had to restart threads and kind of start over, I don't have to do anywhere near as often.
The work on compaction with Codex is just a totally silent game changer that even in, like, the, you know, AI coding world I don't think is getting as much attention as it should, because they solved that, and that used to be
just an absolute killer for the entire industry. And for folks who are not aware, that's, like, when the conversation gets too long, and, you know, the context window for what you're asking the model can only be so big, to summarize essentially what was said before is the compaction. That never used to be good. It used to suck. You used to have to restart and do stuff. And now it's just seamless. It just goes forever.
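[Editor's sketch] The compaction idea described above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not how Codex or any OpenAI product does it: `count_tokens` is a crude word count standing in for a real tokenizer, `naive_summarize` is a hypothetical placeholder where a real system would call a model, and the names `compact`, `budget`, and `keep_recent` are invented for illustration.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word (a real tokenizer differs).
    return len(text.split())


def naive_summarize(messages: List[str]) -> str:
    # Hypothetical placeholder: keep the first four words of each message.
    # A real system would ask a model to write the summary instead.
    return " | ".join(" ".join(m.split()[:4]) for m in messages)


def compact(
    history: List[str],
    budget: int,
    keep_recent: int = 2,
    summarize: Callable[[List[str]], str] = naive_summarize,
) -> List[str]:
    """Replace older messages with one summary once the history exceeds `budget` tokens."""
    total = sum(count_tokens(m) for m in history)
    split = len(history) - keep_recent
    if total <= budget or split <= 0:
        # Already within budget, or nothing old enough to summarize.
        return list(history)
    old, recent = history[:split], history[split:]
    return ["[summary] " + summarize(old)] + recent


history = [
    "user: tell me a long story about a dragon and a knight",
    "bot: once upon a time there was a dragon who lived far away",
    "user: what happened next",
    "bot: the knight arrived",
]
compacted = compact(history, budget=15)
```

The most recent turns are kept verbatim and everything older collapses into a single summary line, so the conversation can continue without restarting the thread. The summarizer is passed in as a parameter so a cheaper or faster model can be swapped in for that step.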
Fun fact, um, I'm gonna get out my own horn and play it. In June two thousand twenty, six years ago, when OpenAI showcased my app, AI Channels, this is right before I went to go work for them, one of the things I had, this was a very ChatGPT-like app. You could have conversations with it, but it used GPT-3. I built a compaction system in there, because I said, oh, I really want to have long conversations. I hated chatbots where you couldn't.
And the challenge at the time was I had to use, like, GPT-3 Davinci for the conversations. Then I was using a model called Curie to sort of create, like, summaries of the stuff. So you could talk forever.
Terribly slow. Terribly, terribly, terribly slow. And it's one of the reasons why, other people played with systems like that, it took a while for people to really implement it and figure out, were we better off with just, a lot of people were like, ah, just make context length longer. I'm like, well,
you say that, but then you're telling me just use more RAM. And I often tell people, like, I'm gonna tell you what this means from a technical point of view. You're saying use more RAM. And the goal is faster, less RAM, whatever. Anyhow, it's taken years to get to where these things are so fast and so good you don't notice it. And I remember how painful it was with my app, just waiting seconds, or 30 seconds, for a response.
Now, compaction was really an underexplored area, because people just threw it all into the bag of what we'll call RAG. I'm like, well, that's like just saying, like, library, like, data science, like, it's just data. I'll just use data. I was like, that's,
there's a huge thing there. And I would see companies come out with stuff that they would think through so poorly, so many different aspects. It's hard. So I think that they really do have compaction down. It's gotten so much better, they're so good at it. So the question is 5.6. What's going to be measurably different for me? I think speed is really the thing that I'm looking for now. Like, I think that for many of these models, I have to think harder to really stump it and do whatever.
Obviously, front-end design, still, like, it got better, but it got a certain kind of better. But I would love better front end, better speed. Speed is a big factor. We know that OpenAI has their deal with Cerebras, which has the super fast wafer-scale chips that are incredibly capable. And we've seen GPT Spark, which is really fast, not the best model. If they continue to make the Spark models faster, like, more capable.
I believe there was there was a rumor floating around that this week since we last talked. there was in the wild a five five spark. Which is five five is the current model that really kind of set Codex apart. Like Codex has a separate app and how much it's taken off. Like five. And you would be going on Cerebus chips at a speed that when when I try to impress for people who have not played around with this tech, this is
faster than I can snap my fingers. This is this is real time. Like you ask for a thing, it spits the thing back out nearly instantly. And uh If that's part of this, I mean that's that that's a huge thing. The big question is
You know.
Is Thursday a model announcement? If it's a model announcement, you'd guess it's five six pro, five six, and the new voice model. But looming around, and this is not a secret, people are talking about it, the employees are talking about it openly, is the super app: the idea of essentially what will be, for retail customers, a brand new ChatGPT that gives you some of the capabilities, that gives you all the capabilities of Codex now.
That these are going to merge into one product. And I wonder whether or not this is the week.
Yeah, I don't know that you do a model release the same week as you do your super app. Maybe you do, because... To go back in time, ChatGPT came out in November 2022 because we knew the interface for GPT-4 was going to be a chat interface, that we were moving to that chat line style interface. But we wanted to test that before we released GPT-4. And that was because trying to do two things, a radically new interface and a radically new model, seemed like that might be
You know, it's like, oh, we'll just throw this interface out there, see how it works. Turned out the interface was all the difference in the world. And I would say that is a lesson, by the way, that I think OpenAI has kind of realized. I don't know if other labs have realized this, because we literally saw, when we try to tell people, oh, you should switch to Codex, yeah, and you're like, well... It was like me making the argument, like, I built a prototype
playground for OpenAI years ago that let you switch between just the white canvas and a chat mode, right? Yeah. And then I'd show it and they'd be like, oh yeah, we're working on this. I'm like, oh cool. So that took them like nine months to release it. And I'm like, okay. Well
You know, mine was ready. But anyhow, it wasn't apparent even to me how big of a difference that would make for a lot of the hardcore users. You know, I could see the benefit of conversational. But I wasn't like, you know, Sam, you've got to do this eyebrow. I'm like, yeah, I think it's better. You know.
But it turns out it was the thing. It was the surface area that caught.
Even though that was built into my AI Channels app back in 2020 when I first released it, I just assumed that... I love the blank canvas approach to a lot of this stuff, and I don't feel like I need to talk to an AI. I just feel like I need to get it to do work.
Um, I would say that the benefit of OpenAI bringing in a lot of people who are really good at interfaces and studying interaction is that they're now looking at the models in a different way than just saying, you know, more, more, more compute.
I'm fascinated. Very excited. Let me ask you, from just like an inside-the-company perspective, in terms of launch strategies, beyond how it's going to be affected by or consumed by the outside world: is there just a benefit to releasing a model, being able to catch bugs, having all your top talent at the ready to, you know, fix things in a very quick and expeditious manner when it hits
the public, and then doing the super app when you can also have that team staged? Are these things just, like, too huge? Possibly a million things will go wrong when you start launching at scale, so you want to have the teams to do it.
I think it is easier to do it that way. And not to say that that's what they'll do. It's just that you run into...
But these are considerations that are being talked about.
Yeah, even though, like... you know, when I started there were 150 people; now there's seven thousand people. Right. Even though the engineering bench, all of that, has increased substantially, there's just a few people that it comes down to. You know, Jakub and Mark Chen and Sam and a couple other people are like, what do we do here? What do we do here? And no matter how much you plan for a smooth rollout or launch,
even if you have AI helping you, having, you know, 900 million people using your app or playing around with it is a different matter. And what happens with all the weird ways the API stuff gets... the edge cases, um, it would surprise me. Yeah.
Yeah.
Yeah. Well, it also depends, like, how complicated is the super app, too. Is it a Codex update and they've added... I don't know. I'd love to hear what you would love to have in it, by the way.
🔇 Silence
I think I would just... I don't know if the super app combining Codex and ChatGPT matters to me at all. The big question is how they fold in Atlas, because Atlas is still my daily driver for a web browser. And I really like having just a lightweight ChatGPT interface. My number one way to interact solely with ChatGPT is through Atlas. Um, I wonder about how that,
you know, is this just kind of like flipping the chalkboard? You're mostly web browsing, and then, you know, all of a sudden you can do really heavy-duty stuff. Because to me, I look at Codex, even in a world where it's wired a hundred percent for more mainstream users and is only getting you to a coding situation if
you're really, really asking for something that it's suggesting, hey, don't worry, let me code something for you, which Codex does a lot of, you know, just sort of in the background. But even in that situation, I want that more as, like, a central brain for my life. I don't need it to look at, you know, the things I like to pay attention to online. That's my big question: exactly how they fold
those two products together, Atlas and Codex. Codex and ChatGPT, I already understand it. I get it. It's ChatGPT; it can just do more.
Yeah, what I like about Atlas is I can open up multiple instances of it, multiple windows of it. And there's a lot of things where I'm gonna be like, am I gonna be hating it? Because, you know, Atlas users are in the minority. And it's a case again where, if you get it and you understand what it's really good at, you absolutely love it. If you get it and it fails at a couple things for you, you just don't come back to it.
And I find myself going to Codex more and more because the harness in Codex is better. I'm often now just going into Codex and controlling Atlas from Codex. Which, you'd be like, that's whack. It's the same model. Well, different harness, different ways that it goes about doing stuff. And, you know, we know that some of the Atlas team members left.
Um that you can do a browser inside Codex. It's not a complete browser. You know, you can't do things like microphone inputs, and there's a lot of other stuff. So
Yeah, yeah.
It's not the same.
Yeah, yeah, yeah. We'll see. We'll see. Um one last thing before we go.
Well I tell you the feature though that I want though before.
Oh please go ahead. Yeah.
I want, and I've been using Codex to handle email, I want a dedicated inbox, a dedicated inbox part of it.
Yeah.
I want a dedicated inbox. I want a message system. I mean, I just want a way that, either via an app or via the apps that I have, I just want a dedicated chatbot. I want a dedicated five five level intelligence in anything that I deem that I want it in. And I want it easy. I want it instantly.
Yeah, I would love an inbox. I would love, like, a direct message inbox that I can have friends in, chats like we have. We've seen inklings of this in ChatGPT. You can share a conversation with people.
But my concern is, yeah, it's buried in there. I don't use it because it's not like iMessage, where I press a button and the thing is there. That's one of my concerns with the super app: you might have a lot of great functionality in there, but if I'm three... The lesson we learned at ChatGPT was friction, friction, friction.
And there's great capabilities inside ChatGPT, but I'm three clicks away from getting to use it. And if I want to use it twenty times a day, you know, you're making me click sixty times to do a thing. That's not.
Right now, on the phone, the vast majority of what I am using the app for is conversations that I'm having within their health app, which is mostly, like, food logging. I'm trying to stick more to a diet. And then controlling projects that I have via Codex, because I know I can just have more functionality on there. All right. Last thing.
¶ Google Brain Drain
When the G seven summit happened, Donald Trump sits in the center of the dais there, they're talking about artificial intelligence, and at his two hands are two top CEOs. Not Dario Amodei, he was on the other side of the table. It was Sam Altman of OpenAI and Demis Hassabis of Gemini. Unfortunately, somewhere between France and Mountain View, Gemini seems to have lost a few of their key brains. One of them goes to Anthropic, one of them goes
to OpenAI. What do you make of the Gemini brain drain? Do you think that it affects where they're going? And, you know, do we look at Gemini differently today than we did a week ago?
You had Noam Shazeer, who is one of the original "Attention Is All You Need" authors, going to OpenAI, and they are extremely excited about this. And then John Jumper, aptly named, going to Anthropic, who was a Nobel Prize winner along with Demis. You know, here's what I am very excited about. Anthropic bringing a guy like him on... John Jumper is great for life sciences and biology. It is exciting that Anthropic at this point is going to be putting
billions, hundreds of billions, who knows how much, into this area eventually. That is exciting news for anybody who wants to live a long and healthy life, and I know that that's an area they look at for where else they can expand to, whatever. So I would say that's absolutely net positive for people. OpenAI has already been doing their initiatives there. It's brought in great people there. To have another player there is great.
Um, Noam was at Google, then he left and he created Character.AI. And then eventually that got sold back to Google, which became basically Gemini. And so he was, like, a Gemini lead. Him going to OpenAI is interesting because, you know, the rumors are that Demis and Sergey have a different view of the future.
Yes. And we talked about this before. Demis is very much into the idea that you need to, you know, beyond transformers, beyond these things, you need to be thinking about world models, really, really intelligent systems. From the guy that created the company that made AlphaGo, from the guy prior to the excitement around, you know, ChatGPT and GPT-3 and all that.
The only game in town was DeepMind. And that was the reason OpenAI came to exist, the fear that DeepMind was going to run away with everything. Yeah. So Demis is a generational thinker about AI, not just sort of where we are right now. Sergey, another generational thinker. You know, Sergey is a guy that understood the value of information, where things go, very much a quick study. One of the arguments I've heard for why
Google was able to make a good showing was that Sergey was in there making things move, whatever, as much as, you know, Sundar would like. And, you know, I'm sure he's happy to let Sundar take the credit for that, but it really kind of came down
The rumor was that there was only one person who can really cut the red tape and move things forward at the kind of speed that some of these other labs are going at, or at least comparable to the speed that OpenAI and Anthropic are running at, and it was Sergey. Sergey was the only person that could do it, the only person that could move it forward, but
he is a believer in the transformer model, which is really what we have seen OpenAI and Anthropic move forward on in developing. Demis says, look, transformers are nice. I'm not saying that it's bad, but there's an upper limit. There's a limitation to it. You need to be thinking bigger with these world models. And so far we have not seen the world model transfer on a, you know, practical layer in the way that we've seen the transformer technology.
Yeah, and I would run into companies from time to time that are like, hey, we've got a different approach, we've got this approach, which we think can speed things up or whatever. And the thing I kind of argue is, certain things are going to win, not because they're inherently better, but because they're here now and they're easier to work with.
And we know a ton about transformer models. We have hardware designed around them. We have a ton of people who know how to work with them. And there's some people, you know, the Yann LeCuns, who are saying, like, oh, they're gonna be limited. And he gives you an argument that's out of, like, 2019.
Because the way he describes them is like, oh, it won't, it's gonna have trouble with these things. Like, no, we're watching it solve these kinds of problems all day long. Is it the most efficient way? I bet not. I bet there's probably much more efficient architectures out there. Might be entirely different versions of that. What's cool about transformers is
I can build a tiny transformer model, like you do with just a GPT, and I can see really good signs of life, and scale it to GPT-2 and see even more signs of life. World models is a very different thing. You know, OpenAI chief scientist Jakub Pachocki is a guy that, part of his reputation was he was very good at figuring out capabilities increases based on the amount of compute you throw at stuff. So
In the Sergey-Demis battle, Sergey is kind of pragmatic. Like, he learned with Google data centers: I scaled up the data center and then I could do things like Gmail, do this sort of stuff. And then we were able to do this stuff. He understands that really, really well.
Demis is like, yeah, but what if there's this other paradigm? And that becomes the question of which gets you there sooner. You know, is it just saying, let's turn all this compute into a new form of a world model that's able to do all of these kinds of things? Or is it just keep scaling compute, keep throwing money at Gemini and keep going forward? Um, I'm an opinionated guy. I don't know the answer to that. I don't. I think Demis is a super smart guy. My...
If you ask me where I'd bet, I'd still keep betting on the transformer, because every nine, ten months we get these rapid speedups in it and whatnot. Um
Well, it looks like Transformer Tech might have created three trillion dollar companies. So it's like it seems like there's some magic in that old black hat they found.
Yeah, and they're like a thousand times cheaper and more capable today than they were six years ago. That's the other thing: if you go back to when GPT-3 launched,
and, like, yeah, it's good, but we're gonna need blank, blank. If I went back in a time machine and even told people at OpenAI, like, you know, a year and a half from now, it's gonna cost, you know, one tenth this and be a hundred times better, they'd be like, no, and I'd be like, yeah, all these incremental things. So when it comes to, you know, if you're Noam or you're John and you're inside
DeepMind, you're at Google. The problem is every lab is dealing... Even OpenAI had the greatest pitch in the world years ago to researchers: this is the amount of compute you're going to get. Look at this fabulous showroom filled with compute. They still can do that to researchers to an extent, because there may be like 7,000 people there, but the research team is still not huge. They didn't.
They didn't crazy scale it, but some of the people they've lost were people that left because they weren't going to get the compute they wanted anymore.
Yeah.
And Google has a real problem with that there. I know they've hired a bunch of people and I know some people that are thrilled there, but... nobody'll know who I'm talking about. So they're working on toy problems, they're working on toy amounts of compute. You know, there are people coming from the world of physics and stuff like this, and they're excited they're getting paid more money and they get a lot of compute. I'm like, yeah, you're...
It's not a serious project for them. It's the same problem Microsoft had. There's a reason Microsoft gave OpenAI their entire AI budget: it was spread so thin between so many different researchers and nobody knew who to bet on. Yeah. And right now, if you have a different novel idea, guess what? You're not gonna win between Demis and Sergey. No. You know, so
You're going to hope that you can sell one of the two of them on your idea being really good.
Yeah. And if they're heavily invested in something different or a different priority, it's a challenge. You know, and then we could even get into, like, what's going on over at Meta.
Well, Meta, it appears that, you know, the mood is dark. The mood is bleak at Meta. It appears that they've got some of their researchers doing data entry. You cannot refuse recruitment to the AI team. You can either quit or you can report for duty, so people are calling it the enlistment, that you were going out to the front. But whoo boy, man. From where they were a few years ago, where they were doing really, really well-regarded work
on open source models. Yeah, everybody knew that there was going to be a turn where they were going to utilize these kind of things, but uh boy, things have not been great.
Yeah. Um There's an expression in Hollywood that I don't think people in tech have realized. When it comes to talent, there's such a thing as above the line and below the line.
Yeah. Go on.
And below the line is crew. It's your electrician, it's your PAs, it's, you know, the craft services, administrative, your line producer, whatever, okay.
This would be... above the line and below the line is who shows up on the poster versus who shows up in the credits.
Yeah. Yeah. And above the line: actors, writers, directors, right? And the whole world, like... we just talked about the TV show The Studio, and you kind of saw this, where the head of the studio has to sort of eat a lot of crow and deal with the personalities of above-the-line talent, of actors and directors, because they are the ones that quality tends to follow. And in AI, it's very similar when it comes to research.
There are people that have spent their entire lives thinking about these things and have a lot of great ideas in their heads and whatnot. And they're researchers. You hear about these crazy salaries for researchers, but that's the difference between, you know,
A group of people put together a paper. It's called "Attention Is All You Need," and it doesn't do anything at Google. And then a guy like Alec Radford over at OpenAI, who was at OpenAI, not anymore, reads it and says, wow, this is a very, very, very interesting idea here. What if this thing could scale? Then, you know, that would be really cool. And so that's a thing where you look for some of these people
who are incredibly capable and just sort of figure out, like, you know, what does it mean when they go somewhere? So, anyhow, the point I'm trying to make is that Meta... I know there are super, super A-plus-plus-plus-plus engineers.
Yeah, bye.
When it comes to A-level engineers, there are thousands and thousands and thousands of them. And Meta was built upon the idea that you could, like, you know, Microsoft with developers: we just throw a bunch of developers at it. We just put a program manager on it and we scale it, whatever.
Meta, I think, was still in that mindset. They hired some key people, like, oh, we've got some key people here. We'll throw that there. It's not just that. You have to be able to work with these people. You know, you have to be able to work with above-the-line talent.
You know, you have to be able to nurture that, give them what they need and understand it, not just say, okay, go into a room, produce genius. And I'd say Zuckerberg is a super smart guy, and I will not bet against him, but that's a thing that, I'm hesitant to say, you know, that'll be challenging.
It seems, as this world evolves, that chemistry, talent selection, and high-level investment in a singular kind of direction, those are the things that are making the leaders in this space the leaders in this space right now. Like, that's what Anthropic has. That's what OpenAI has.
They've got big brains that know what other big brains are getting them to their vision. They are betting gigantic sums of money on these things coming to pass, and they are being rewarded for it. But that is a structure that's rare. You know, that doesn't normally happen. And I think it's to Sam's credit that Sam's not the one that's doing it. You know, Dario is that brain. He's leading that operation.
It's hard. I think it's not the same as some of these other engineering problems, where, like, talent is talent. You know that talent matters, but it's a little bit more interchangeable. I don't think that these frontier lab, you know, the guys, the main players, are really all that interchangeable except with each other.
Yeah, I think that, you know, Dario is a visionary. You know, Dario's got a very big vision. And if you agree with that vision the way he sees it, it's good. I think, to an extent, outside of research there, I've heard different experiences. But, uh
There's there's the meme. Uh I asked my friend at Anthropic how his day went. He looked kinda sad and said, I can't talk about it.
Yeah, I would say OpenAI... Sam had a background. Sam knew how to code. Sam could do all this stuff. I think Sam actually understands a lot of the stuff really, really well. Sam and Dario, I think, as far as CEOs go, understand this stuff better than just about anybody else. Dario has the more formal background in this, but I'd say Sam understands it really well.
But Sam is not... I'd say, we'll put Dario and Anthropic, you know, in one sort of area. The difference between Sam, Elon Musk, and Zuckerberg is Sam is okay not being... Sam does not want to be the smartest person in the room. Sam wants to make sure that he found the smartest people and put them in that room, and they know that they're the smartest people. I would say the trouble that Elon had with xAI was, Elon is incredibly smart.
But to bring a bunch of smart people in the room and to say, no, you're wrong, we need to do this, is hard in that environment. And we saw that. We saw he lost... There were something like eleven co-founders. They all left. They all
For xAI.
Yeah, for xAI. All left. Um, for Elon. And I think Elon is, like, the greatest engineer and businessman of our time. Like, hands down. Whatever personality quirks and everything else like this, and all the drama and my frustration with many of the things he's done, I have to say: incredible. But I also say that I knew, when people said, oh, he's gonna take over, you know, Twitter, I'm like, well, it's a software company.
He hasn't really run a software company since PayPal, and there he was organizing Peter Thiel, Max Levchin, a bunch of others. And then revenue-wise, the revenue for X, like, I'm on X all the time, but the revenue went down, all these other things went down, and there's a lot of features they don't support anymore. And then we heard us to do, yeah, I'm like, it's gonna be a challenge because
It's not the same as hardware. It's a different pace than that. And you're in an environment where, if you want to build cars, you can work at Tesla. If you want to build spaceships, SpaceX is still where it's at. There are other great companies out there, but you go there.
You want to work in AI, there's a bunch of different places to go work. And that was going to be the challenge he had to deal with. And same with Zuckerberg. Zuckerberg thought he would just write checks like he was building an NBA team, you know, uh, MLB, excuse me, MLB, Major League Baseball team, sorry. You know, he thought he was gonna write checks like that.
Yeah, he wanted to build a super team, but the problem is, I do think that chemistry matters. Mostly because I think vision matters. The more I come to understand this, it's just like, Dario's got a vision for how these things are gonna go. And it is a multi-year structured vision that he needs people to help execute. Uh,
Jakub, Sam, Greg, they've got a united vision as to where this thing is going to go. Now, where the applications are from that, there's disagreement. Obviously, they went through a big pivot a couple months ago as to what they were gonna keep going and what they were gonna shutter. But the scaling up of this tech and the idea that it's going to be as robust as it is, that's a singular vision.
They are betting billions and billions and billions of dollars in compute to execute that. If you don't have a united vision from your big brain, your CEO, and everybody that's executing below it, I think it's hard. And that is not the way that Google was built. That was not the way that Apple was built. That was not the way that
Meta or Facebook was built. You know, there are similar lessons, but it's not the same at the scale as it is here. This is kind of a new way to go about things. And it's a little bit hard to teach an old dog new tricks.
Yeah, and OpenAI has massive new compute. And what I understand, too, is that they've invested a lot of money, and now Anthropic has been catching up. They've been making deals with Micron and other people, but OpenAI is several years into this.
And they have a new supercomputer coming online. And sometimes you'll hear people talk about, like, oh, Elon Musk, he did Colossus. Throwing a hundred thousand GPUs into a system and using, you know, some gigabit Ethernet to connect it all ain't the same as knowing how to make a hundred thousand systems. Like, that was one of the things they found out with Colossus One: it was just
very inefficiently run. It's fine for doing inference and stuff like this and classes to sort of say mistakes. A lot of people were looking at that like, that's not how you do a bro. Um, and it's not just a matter of scale. And OpenAI has a
a very, very massive amount of compute coming online for doing new model training. And we're going to see this acceleration. And I think that's one of the things that we're seeing, people trying to meet demand and continue training. We're going to start to see the acceleration even more so.
For anybody thinking that it's going to get simple, it's not. You're going to get to a point... There are metrics that I saw internally basically describing the rate at which you would see, like, you know, how long it would take to do GPT-4-level models. And it would go from months to much faster than that. And extrapolate that to where you want. And we're going to be able to see new models, new capabilities that come out at an incredible pace. And
Where that would be that would be approaching and passing the line of recursive self improvement.
Yeah, that's the big term, by the way. RSI, recursive self-improvement, has been a new phrase that everybody hears, and basically it is: when the models get good enough, you just let them get a bunch of data and they continuously figure out how to improve themselves over and over and over again. You let the thing run in a data center and you come back a day later and it's entirely more capable. And that's not theoretical.
People are talking about that, like, before New Year's Eve. Like, that's when people are talking about RSI. This is not over the horizon. This is, like, here now. This is, like, it might be a midterm issue, in terms of it being close. Um
¶ Wrap-up
Yeah. Yeah, we gotta get out of here. Unfortunately we won't have time to talk about some of the political back and forth between OpenAI and Anthropic, a big, big fight. You know, I'd be very curious to see how New York Twelve votes. So if you are in Manhattan and that is your district, if you are represented by Jerry Nadler, I'm very curious to see how you decide to select
who is there. We can take a look at all those results. I very much wonder whether or not online is real life, you know, or whether or not those actual constituents care about some of the issues that have been made front and center there, but we'll find out. Yeah. In the meantime, Mr. Andrew Mayne, where can people find you?
Go to Andrew Mayne on X, X dot com. Andrew Mayne, M-A-Y-N-E. Check me out there, to find out, you know, completely unrelated news about stuff. Yeah.
Yeah, yeah, good. Give give the news.
Yeah, I mean, it's Hollywood. So it's Hollywood pace and Hollywood time. So I'm hesitant to even talk about it, but I'll say ABC Network announced a development of my novel, The Naturalist, to try to develop it into a TV series. Got an incredible team, some CSI veterans, people who worked with the TV show The Vikings,
the movie The Complete Unknown, a whole team working on it, trying to bring this to television somewhere near you soon. So that's exciting, but we'll see.
Well, what is the log line for The Naturalist?
The Naturalist was a book I wrote ten years ago about a computational biologist. Jesus. Because I got very excited about where AI was headed. And at the same time as I started to study AI, I decided to write a book about a guy who worked with it. It's crazy, if you told me ten years ago where I would be ten years later, you know, while the book was, you know... which has turned into a very successful series. It's an Amazon Charts bestseller. It was
Crazy sort of
🎵 Music
Here we go. Alright. Justin R. Young, everywhere you find Justin R. Young's. Until next time, friends, we will.
🎵 Music
Diamond Club hopes you have enjoyed this program. Dog and Pony Show Audio
