¶ Casey's Consumer Rack Blocker
So Low Level Learning is on the clock. He has things to do, uh, security to secure, and so therefore we need to get rocking. Now, before we get started with today's stand-up, I was reliably told that Casey has a blocker this week. Casey has some things he needs to talk about.
🎵 Music
Uh anyways, uh sorry.
What I would like to see.
Before we do that, I didn't even have a chance to interrupt.
Yeah.
That was way too much.
That's all right.
I was trying to be on the ball here.
Go ahead, Casey.
That was great, T. That was a perfect interruption to get us started. Uh, I do have a blocker, actually. I've been blocked on this for a while, and it is as follows. Especially because I'm someone who on this podcast is known for, uh, double entendres, I will state this blocker as: I really like racks. So a lot of people wanna put their computers in a tower.
Right.
They have, like, a tower for their computer. They put glass on the side and they've got all this weird RGB lighting or whatever.
Right.
RGB makes it faster, it's proven. There's some papers on that.
It's absolutely proven like if you get memory that has RGB lighting at the top of it, that memory will just return results faster, like when it is queried. Everyone knows that. Everyone knows that.
Yeah I read that at Tom Six Seven dot org.
And the photons are at very specific wavelengths. The R, the G, and the B are picked specifically to ensure the fastest travel of light. So I don't really like that, because I like computers and I want a lot of them around. And I don't run, like, virtual machines and stuff like that usually. If I need another thing for this, I'm like, that's a great excuse for me to build another
So I like to have racks. I like to have my computers in racks so I can just slot them out.
Virtual racks.
Not virtual racks. Actual physical racks. And honestly, I think a lot of people would actually like this better for their computer if they had the option of buying things that were reasonable to do it. Because it's like, look, you're doing streaming.
Right? You're starting to become more of a like pro streamer or something, or you're doing a lot of Twitch streams. You want to have one streaming computer and one gaming computer all of a sudden. Well, now what do you get like two tower machines sitting here? You got this one on a janky setup?
Exactly.
Well, if you just have a little rack, eighteen feet long. Yeah, I don't have this problem. I just have a rack, I slot things into it, I slot them out, it's totally fine. The power strip is a thing that slots into the rack. You want a NIC? That slots into the rack. It's very organized, everything's bundled, and they wheel around. You can just move them, right?
I do think that, you know, if you were maybe doing some really, really sort of unusual stuff, you could end up with some hot racks. I do think that's a legitimate problem, or solution. Um, but here's the thing, here's my blocker: nobody wants to sell racks into commercial. They just don't. Into consumer, I should say. They all want to sell just to, you know, the data center.
And this has two problems. One, the stuff is artificially expensive. So the average consumer is gonna look at a rack mounting case and go, like, I don't wanna buy that.
Just like a cake on the
Just like a cake on your wedding day with the little topper that has the bobblehead that looks like you
Did you find the error?
I don't even know what they look. What do they even look like? They're in the fine.
No.
In the phone? Yeah. Definitely in there. I got it, don't worry.
Prime?
Gotta find them and meet me at the stand-up. Roger.
🎵 Music
Get all the context you need to debug your problem because code breaks, so fix it faster with Sentry.
🎵 Music
But here's the bigger problem. Even if you've got the money, 'cause let's be honest, they went ahead and raised computer prices so much that only AI data centers can afford them now anyway. So why are we even talking about the unwashed masses and what they need for computers? Because everyone knows they don't need to be able to do that. True. True. Sam Altman, too dangerous. Sam Altman deserves the computers. None of us should have them.
But so let's say you do have that extra money because you're super wealthy. It doesn't really matter, because if what you want is, like, an attractive piece of, uh, if you want a nice-looking rack, Prime.
I'm lusty
Yeah. This does not exist, right? They only come in weird, like, industrial kinda, okay, it's ugly. You know, no one who didn't buy this thing is gonna want to look at it, right? They're not gonna wanna see this rack. That's the kind of rack you have access to. And also the specs, like the things that they make for rack-mounted cases, none of them are what a consumer would want. They don't make them to look nice.
They don't make them to have good acoustic qualities. They're loud. They use the frickin' honeycomb pattern cutout that whines like an SOB when the fans kick in. Because they know the data center doesn't care. They're just like, ah, it's loud in there anyway. Who cares how many decibels there are in the data center? Like, put your
To be fair, I don't think Sam Altman likes racks, so you know.
Fair enough. Good point. So maybe
Here first, baby. The stand though. Not getting acquired by OpenAI. Keep going.
That's why they need the mic
Damn.
Are ya?
Uh, so that's my blocker: I want somebody to start looking at consumer rack stuff. Like, good.
¶ Consumer Rack Market Innovation
All right, so who would the target market be? Just, like, let's play this out for just one minute. Who would the target market be?
Everyone. So here's my thing. I think
That's a huge TAM.
Hundred percent TAM, baby.
Everyone. Because here's the thing. Right now case people, they're falling all over each other to figure out how they can sell the same old thing. There's nothing new there. They've got nothing to sell. You want another, yet another thing with glass on the outside? It doesn't matter. RGB lights, we've seen it before.
Racks are a whole new space, right? They can innovate here so much, right? Razer can sell subscriptions to your own rack that you have to buy back for some reason. There's a whole new space.
Wait what?
This is just completely wide open.
Can you fill me in on that lore that I'm missing?
Oh sure, like, in general if you buy, like, a Razer mouse, you can usually not even install the drivers without creating a Razer
Oh yes, I
You gotta install the radio.
I had to use my email to sign into something to use my mouse and I was very frustrated never I'm on to a three dollar.
So, uh, I think you could create all sorts of really cool things where it's like, Hey, do you wanna organize your space better? Do you wanna have more, cooler computing gear that all works together and is made to kind of create this really cool-looking thing? You could even have them all stylized, like, okay, you know, the power supply, I mean, imagine companies like, you know, Razer or something like this.
The power supply looks like Razer and it slots in. The NIC looks like Razer and it slots in. The RGB lights are synced through the whole thing. You can put those LCD panels on the front. And, like, the whole thing could be frickin' amazing and a whole new category for people who are probably right now pretty desperate to figure out reasons why people are gonna buy this. But no one is looking at this. Look at this. It is the right way to mount computers, and music people know this.
Like, all music stuff has, like, crappy racks that can't mount computers, 'cause the components are all lighter, so they don't have, like, a back rail kind of a thing. Um, but there's, like, furniture that has rack mounting in it that's meant to look good in a studio. All of their stuff has cool knobs on the front and meters and stuff.
We're doing this and we're not do it. Hardware people, come talk to me. I will convince you that this is the right thing. I want to buy it. There you go.
Casey, I already know the marketing pitch. We have two angles. This will cover one hundred percent of software developers at the very least. Option number one is a campaign: "Be like Casey." That's obviously number one. That one works, people want to buy it. That works for a large demographic. Option number two: run your OpenClaw on this bad boy. We've now got 100% of the software developer market covered by these two campaigns.
You try to run AI? Well, slot in as much AI stuff as you want here, right? You can just put in GPU host after GPU host.
That's true.
Maybe you can tell your OpenClaw, like, Hey, I just pushed you into the sunny spot of the house. Yep. Like, maybe that makes you feel better, works a little harder for you.
Back in the basement. It puts the lotion in the bath
You're gonna have to have one rack that's slightly more affordable that's just for taking pictures on Twitter. I I do think that that you should have the Yeah. Good slide. The uh about to be an influencer.
Yes. Yep. Yep. The starter kit influencer edition where like it looks like the rack is filled out, but it's just cardboard on like eight out of the seven slots. Yep.
Yeah, like a padded rack is kinda what you're talking about. Yeah.
Yes, that's a great that's a really good the best rack available.
Lord.
B R S
¶ George Hotz's Zero Day Challenge
So what are we actually talking about today? 'Cause there's my block, that's
Short and sweet.
That's what it was.
We're it's game over, unfortunately. Okay. Podcast is over forever.
I would just like to say sorry that we did not make fun of Sam Altman earlier and now the podcast is shut down. Okay. Yeah.
Acquired.
Not shut down. Acquired. Acqui-
Acqui-hired to shut down, dude. That'd be sweet. Hey, yo. Acqui-hire us.
a whole new category.
Only a couple million.
Fire.
Uh, okay, so with that in mind, I believe today we're gonna be talking about this lovely piece of lore. I tried to guess where it'd be. I obviously guessed wrong. Okay, hold on. Sorry, I'm doing a new setup, and the new setup actually makes it kind of hard. There we go. So George Hotz, we've invited Low Level on to help us kind of work through this because
Honestly, I would just like to say George Hotz sounds like an anime villain in this post, and it's very exciting, and it makes me just want to high-five him so bad. But it says the following. What if I release a zero day a day until a big new model is released? Will this finally make OpenAI and Anthropic shut up about cybersecurity risk? Question mark.
Like these things are not that hard to find in most software. I heard something about costing twenty K in tokens. I'd do it for less if it wasn't for the bit some whiny bug bounty program. The reason there aren't zero days everywhere is 'cause nobody seriously looks, because hacking other people's shit with them is illegal and criminals are usually not very skilled, or they would choose a different line of work.
Want more zero days to be found? Make hacking legal. Until then, don't try to claim it's hard. It's just not incentivized.
I wanna say first off, I don't think criminals are dumb or unskilled. Please don't hack me. I just wanna get that out of the way. You guys are smart and handsome and you're my favorite people. I just want to make sure that that's clear, please. Anyways, Ed.
I do want to say one thing too that has nothing to do with the actual content of this, which Ed will take. And that's just, like, if I were George Hotz, I would never have been able to resist naming my, uh, X feed Hotz Takes. Because it's so, like, you know what I mean? Good on him for not going there, because I would absolutely, like, I would have prefaced that tweet before I typed it with, Here's another Hotz take for you. Hotz take for you, right? It would be so good.
It's so good.
Take it away.
Okay, hold on, hold on. There's one there's one more thing before we get started. There's just one more small thing I want to say. Let me just uh take this quick thing and I'm gonna put it up here. Low level response.
¶ Zero Day Incentive Debate
Holy fuck, this is the dumbest take I have ever read. I just wanted to make sure, just in case anyone was wondering.
Yeah.
Yeah.
Yeah, I mean, I do kind of feel that way. Um, so let me just preface this first of all. It was called the Cold War because the Cold War was cold. Oh, because Russia is cold. Um, it's a George Hotz reference, if you're an OG. Yeah, I mean, okay. So first of all, I have no problem with Geohot. This isn't like some weird drama fun thing. I want to kind of set the table straight with that. Um, yeah, I think the
The argument that Geohot is trying to make here is that the only reason more zero days are not found is because there's no incentive. Um, okay, well, I don't agree with that, first of all. There are plenty of bug bounty programs out there that will literally pay you to find vulnerabilities. Uh, and some of them pay very well. Like, for example, the Apple iPhone zero-click RCE bug bounty will pay you literally two to three million dollars
if you can find a zero-click RCE in the iPhone. And then even something lower, like on Microsoft, I think MSRC's payout for, like, Windows RCE is like two fifty K to five hundred K right now for a zero-click on Windows. So there is money to be made in the vulnerability research space, right?
I think all Geohot is trying to say here is something, something, something, uh, the Mythos press release was bad, right? It's a marketing campaign, whatever you want to say about it. Um, and so I understand why people are making that argument, right? Like, you know, I think it's very bad PR for a company that sells an exquisite tool to
hold on to the exquisite tool and then not give access to it and say only special people can have our tool, because it makes you look like an asshole. But I think regardless of your thoughts on the marketing, it is important to recognize the fact that, if you go, uh, Prime, can you go to CyberGym.com real quick and go to the graph that's on the homepage there?
I'm gone.
While he's doing that: the ability for an AI model, in both closed source and open source software, to find vulnerabilities by literally just giving it access to the code and saying, Hey, find me bugs in this code, go, is becoming better and better and better, to the point where, like, Mythos, I'm very close to some people that are actively using Mythos at work, and it is causing, like, issues based on how good that shit
Yeah, so CyberGym basically is a collection of bugs that exist in software, right? So, like, bugs in, uh, I think FFmpeg is one, bugs in curl is another. Um, and so what CyberGym does is it takes a model and with a set of prompts says, Hey, go and find bugs in this stuff, right? And the success rate is how many of the bugs that are known to exist
get found by the model in this. And you can see a pretty, no, not exponential, but straight-line curve going up to the Anthropic model that recently got previewed by some people,
that it's at an eighty-three percent success rate. Of the bugs that are known to exist in these code bases, it can find eighty-three percent of them. Again, we don't know the cost data in those. We don't know if the models are being, like, backfed the information so they're training themselves on previous CyberGym runs. We don't know any of that.
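For anyone following along in text, the success rate described here is just "known bugs the model found, divided by known bugs that exist." Below is a minimal sketch of that arithmetic in Python. It is not CyberGym's actual scoring code: the `BenchmarkRun` type, the `success_rate` function, the bug IDs, and the sample numbers are all made up for illustration, and it assumes a reported bug only counts when it matches a known one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRun:
    """One project in a hypothetical bug-finding benchmark."""
    project: str
    known_bugs: frozenset[str]   # bugs known to exist in the code base
    found_bugs: frozenset[str]   # bugs the model reported


def success_rate(runs: list[BenchmarkRun]) -> float:
    """Fraction of known bugs that the model found, across all projects."""
    total_known = sum(len(run.known_bugs) for run in runs)
    if total_known == 0:
        raise ValueError("no known bugs to score against")
    # Only reports that match a known bug count; extra reports are ignored.
    total_found = sum(len(run.known_bugs & run.found_bugs) for run in runs)
    return total_found / total_known


# Made-up sample data, not real benchmark results.
runs = [
    BenchmarkRun(
        project="ffmpeg",
        known_bugs=frozenset({"bug-1", "bug-2", "bug-3", "bug-4"}),
        found_bugs=frozenset({"bug-1", "bug-2", "bug-3", "bug-99"}),
    ),
    BenchmarkRun(
        project="curl",
        known_bugs=frozenset({"bug-5", "bug-6"}),
        found_bugs=frozenset({"bug-5", "bug-6"}),
    ),
]

rate = success_rate(runs)  # 5 of the 6 known bugs were found
print(f"success rate: {rate:.0%}")
```

Note what a number like this leaves out, which is the caveat raised above: it says nothing about how many tokens each find cost, or whether the model had already seen these bugs in training.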
Um, but there is this really weird issue happening where, like, any Joe Schmo without a ton of security research work or a ton of security knowledge can, with a couple hundred bucks, like, worst case, find bugs in software. And I think that is, like, an existential security threat to software as we
know it. So I'm kind of curious on your guys' take on that. What do you guys think about the Mythos situation? Because I know how I feel. I'm not sure if I actually asked Prime what he thinks about the Mythos thing.
¶ Mythos: Hype, Risks, and AI Control
Oh I have ideas and I have thoughts about it.
Ha ha ha.
Uh, so I guess the first thing is that there's, kinda, three problems here. First problem is: is Mythos really as good as they say? And obviously I have no internal information. I've just seen some graph
Uh, dirty data is, like, a huge, gigantic problem in all benchmarks. All benchmarks are being fed back into the models. It's really actually hard to tell, like, what does a twenty percent improvement on software engineering bench actually mean, especially given the fact that
now you could write zero lines of actual solution code and get a hundred percent on software engineering bench, it turns out. There's other benches that are also horribly inaccurate. There's a whole paper about why all the major benches are just completely fudgeable and made up. So it's very hard for me to understand from a bench perspective. Uh, second, I guess the middle ground would be, like, so if Claude Mythos is as good as it is
Then yes, that is going to inevitably cause problems because we're gonna go from not too capable to hyper capable in a moment. Thus everybody can go through and hack everything. And thus Dario will be able to get his ultimate goal, which is regulations.
And so that kinda worries me. Pull up the ladder really quickly and make sure that humans can't code, because human coding, that's dangerous right there. Uh, and so, you know, I think that that's true. There's the second one, which is: this is just another C compiler. Again from, uh, Anthropic, where they hype up this gigantic thing, like, Oh my gosh, it's written a C compiler, and then you go look at the details, and it's like, Well, it can't write a bootloader.
Because we could not seem to spend enough tokens to convince it to write it within thirty-two K. It could only write it within, like, sixty-seven K or whatever it was to be able to actually
Yeah, and also we tested it recursively or we we iteratively tested it off of like the thirty years of tests that the GNU C compiler already had.
We also gave it all the answers and then it figured out all the questions. It was crazy. It was like Je it played Jeopardy and it was really good at it. And so it's like there's this whole marketing buzz, which is
It's really hard to kind of cut through that. And then obviously the last one, which is they're just downright lying. I somehow doubt that they're they're downright lying. I think they're just overstating it. If they're downright lying, then you know, this is just gonna be business as usual. It'll just be yet another disappointing model release.
And that's that. And so for me, I'm on middle ground, which is: I think it's more hype than reality, but of course I haven't seen it, because I just don't know, because they won't let me see it. I'm too dangerous to have it.
Well if
There's a similar model that, um, ChatGPT, or OpenAI, just released. Like, it's like ChatGPT-5. Yeah, right, right. Um, but I think there is a comparable model that you can get access to just by uploading your driver's license, if you're into that. Um, you know, proving that you're a real person. So there's models to test out. But yeah, I don't know. It is concerning, 'cause we have kind of two forks we can go down.
There's one where everyone gets access to it, everyone can create zero days, and we kind of enter this really dangerous cyber no-man's-land. But the other side is, like, Anthropic keeps the access to themselves forever, and now only this list of, like, ten companies can find zero days. What does that do?
They move to the Cayman Islands and then they just take over every government by hacking all the software and Dario finally realizes his role as the bad guy. Like that would be that I mean super villain is right there if this is true.
That's true. Casey, what's your take? I saw you were gonna chat before.
¶ AI for Bug Pattern Matching
Uh I'm sorry the chat what was the chat? Was I really?
Mm-hmm.
Well uh I definitely could say something, but I think the thing I would say is probably not very interesting. And that is that I think I probably agree with both George and Ed at the same time here, which should be impossible'cause they're supposed to be disagreeing, but I I don't know, it kinda sounds similar to me. And the reason I say that
Secret third thing.
It's not really a secret third thing, it's just, like, let me offer a slightly different interpretation, which is to say, um, so I feel like machines are pretty good at pattern matching. Um, and so, like, I don't think it
Like, put aside whether Claude Mythos is good or not, because I realize that's hard to independently verify this time. But I think it's reasonable to expect that at some point, because we are spending at this point, like, trillions of dollars probably on doing computation for these things, at some point they should be able to pattern match bugs
reasonably well and at a very high rate. Meaning, as long as you're willing to pay for the compute time, we can scan lots of software for a lot longer than we were currently having humans do. I think that's a pretty reasonable thing to expect. Whether Claude Mythos has done it or not
shouldn't really be the question, because somebody can do this eventually if we keep spending this much money. It should get there. Among the things that AI could eventually do, that one doesn't sound that implausible. And so what I would say is, I think it's reasonable to expect that that either has or will. Two, I do think humans were doing this very well before. Individual humans, like, some of them.
They were finding things that probably Claude Mythos still could never find. I mean, things like Rowhammer attacks and things like that, that are just way out in kind of crazy land. Um, or attacks through, like, old legacy stuff like the APIC and things like that. So humans were actually very good at this task, but there weren't very many of them. And so what I would say is, moving to something like Claude Mythos, or whatever that thing happens to be that can do this,
is kind of like what George Hotz was saying. It's kind of like saying, hey everybody, from now on, if you just, like, hack people's bank accounts, you get the money. All the great humans at this in the world who are currently doing something else would now be incentivized to go do this thing.
And we would have found way more zero days. I mean, there are so many programmers who, if they had been raised in some kind of a society and a religion where stealing people's money was considered virtuous, we would have found so many more zero days right now than we have. And so, in a way, I think both people's points are actually totally valid. Like, yeah, we could have found way more zero days if we didn't heavily disincentivize
people from, like, making hundreds or billions of dollars off of hacking, which is what they could have, and we said, nah, you get fifty K, a hundred K, maybe if it's something crazy like an RCE, you can actually get a million. It's like, come on, guys, that's not equivalent to what they could already make working at a startup or something like that if they're that good.
Yeah,
So anyway. Exactly.
It's like you work at a startup at least to get some money.
Or or even just not even start up, just go to Google and you get that as stock or whatever, right? Or something like this. Uh so anyway, uh in general I would say um I see I I can see both I can see both points. I don't think I I don't really think they're in as much tension as it would sound, if that makes sense.
¶ Security: Talent, Scale, and Ego
I agree. Yeah, I thought Geohot was making more of an econ argument about it, like, we put a lot of costs on hacking already, so that's what's stopping it from happening. In the sense that, okay, now we're gonna have another way to do it. It also costs money, but then we still have the other cost of, like, you could go to jail for doing it. That's the social cost we impose on people.
I mean I just I just took him to be saying like it's not that impressive that it found zero days because if you gave me, you know if you gave me fifty great programmers who are all doing other stuff.
We could crank out so many zero days you wouldn't even believe it. And I kinda believe 'em, because, you know, you look around the world and there are some really good security teams out there, and they do crank out zero days pretty effing fast. And they don't even tell us about all of them, right? Yeah. Uh
Right.
Korea keeps on making money. Like they had the s obviously they're they're they're successful.
So anyway, I I I I'm not trying to say that either person is is one hundred percent right and somehow you can marry the two completely. I'm just saying there's I think there's some merit to both things, so I'm I'm actually I'm happy either way. I'm happy with either take.
So your point about, if you got a room of fifty good programmers together, they'd find zero days, is actually kind of the argument that the article "Vulnerability Research Is Cooked" makes on sockpuppet.org, which I referenced in a video, and I think Theo did too. Um, the paragraph that he calls out is basically the, oh, sorry.
No, no, no, no, no.
Spell it out Casey, spell it out in your head and it'll make sense the uh.
Oh the oh maybe
So
Software security a lot of the time can be chalked up to the fact that a lot of software just has not had elite attention, or what is it called? Um, advanced
I would say basic attention is missing from many software projects.
For sure.
But more complex platforms, right? So his assertion is that, like, software security has been a talent problem for so long, where it's not that there aren't people that know how to find bugs. AI isn't solving a unique problem.
The AI is solving the scalability problem, where you can train the AI to do a thing that Joe knows how to do, and now you have a hundred mediocre, but a hundred, Joes, right? Um, and that's an issue for kind of the econ of cybersecurity. And yeah, I wanna be very clear, I don't disagree with Geohot from the perspective of, like, more people equals more bugs, right? But, like, obvious
Like that that is the problem that we just don't have more smart people. That that has been the the entire industry's plight for a long time is that like there just aren't people who have not only security knowledge but knowledge of
And hypervisors and drivers and OSs. Like, you get these very niche skill sets, and when you divide them up into those skill sets over and over again, you're left with, like, ten or twenty people on planet Earth that know how to attack a certain technology. So AI, you know, if you know security, now you can talk to the AI, learn about hypervisors in a week, and then suddenly you can find bugs in ESXi, you know, Hyper-V, etc.
So yeah, I guess I agree. Like, the dumbest take thing was more, I was mad at Geohot's ego, because it basically came off as, like, fuck you, I'm so smart, I know all the zero days, I could do this myself in my sleep. And it's like, dude, no, you couldn't. Like, you're telling me you could drop a zero day every day in macOS until someone paid you? Like, no, you couldn't. Shut up. Um, but I hear what he's saying.
I want to But zero day.
I'll eat a sock on stream straight up. I will do it. I don't care. Um
Yeah. Ed will eat a sock on stream.
If you don't
A week of zero days.
Okay. All right. A week is a week is actually possible. I'm talking a month. Uh
Okay, I'm on
¶ AI Company Ethics: Compensation and Credibility
I would also add like uh just you know, because I'm I constantly harp on this point, but I wanna bring it up pretty much every time. It's just that This is also why AI company behavior like is a problem. Because this is generally a good thing. Meaning like we do actually want the ability for us to get a hundred percent coverage for security and we know that we can't get enough people to do it.
Right, like not in a white hat sense, right? Maybe maybe you could take uh George Hotz's suggestion seriously and just go like, make hacking legal and then we just have a crap ton more black hats and that eventually sorts it out. But I mean you know that would be white hat wouldn't necessarily be Yeah, that wouldn't be Yeah, that's exactly they're white hats now. Everyone's a white hat.
So I think in general this is a way AI could solve a problem usefully. If it actually can just spit out lists of pretty well curated potential bug places that we can go look, that's very helpful, right? And so the problem is, like, the only reason they were able to make that is lots and lots of extremely talented security researchers
who are getting literally zero dollars from Anthropic for this. And that is not acceptable. It's just not. Like, I'm sorry, but, you know, Ed should be getting a check for this. And everyone like him. That's just kind of how it is, because it's like, you used all of their expertise.
And all you're really doing is very slowly and cumbersomely and kind of clumsily eventually building a machine that can deploy the same analysis somewhat reliably, uh, based on all of their work. And like I just don't like it. I don't like the fact that they're not getting a check. And I'm never gonna like it. You could s you can talk to me all day long about how someday we're gonna live in a post scarcity society.
And Ed will be getting a UBI check or something like this. Or whatever it is, right? And hopefully I'll be getting one too, although I didn't do any security research, so I don't know. Maybe I won't be getting that check. I don't know how universal the U in universal basic income is. But I don't like this. They should be getting paid now, because Claude is, you know, getting huge. Like, everyone at Anthropic is getting paid.
Uh, so it's not like there isn't money being disbursed. Whether they're making or losing money or anything else you want to talk about, money is being disbursed to people. It's just not the people who did most of the work.
You gotta throw it.
Casey would Oh, I was just gonna ask Casey if he was gonna be happy about it though, if Anthropic spun out a consumer rack bit.
Yeah, now we're talking if if they were like AI racks, like we got racks we got racks for your AI service. Alright.
In your local area.
I liked it.
Yeah. Exactly. We will send send you some hot racks.
Uh, also, by the way, not only your whole argument with them taking and not properly attributing, or, you know, the people who put in all the work not benefiting from it, they're also making it so that I can't buy a GPU or RAM or CPUs now or anything. I have that.
You can't buy GPU RAM and also, I believe, Ed literally just said he doesn't have access to this frickin' model. So like a bunch of security researchers, I don't know exactly what subset, but like a bunch of security researchers, many of whom probably did some pretty cool stuff. They don't even get to use this thing. That's that's how ridiculously backwards it is. Like WTF, guys.
Yeah. Yeah, that's why it's called Mythos. Um Would argue that it is too dangerous for little old me to have access to it, right? Depending on, you know
Who knows what you'll do, man? Who knows what you would do?
I'll find that zero day and I'll hack into Dario's phone. No, I don't know man. I I understand where they're coming from, but at the same time I understand why it looks like a huge marketing ploy and I'm not sure which way.
Yeah.
That's true too.
That's a whole other angle I see.
I think that they'd have so much more credibility if they just quit uh effectively like giving us shaken baby syndrome constantly with their marketing. It's just like it's constantly going back and forth. Like every single couple of months you're getting hit with the new hey, we're all out of jobs here shortly. Hey, this thing is super dangerous. I mean, you gotta remember that Dario was at Chat GPT.
Or OpenAI, I like to call the company Chat GPT. He was at Chat GPT during the GPT-2 days, and the official language around GPT-2 seven years ago was that GPT-2 is too dangerous to release to the public. So like this is not the first time.
That we've been on this like roller coaster and I think that's one thing that's just largely hurting their credibility is you can only cry wolf so many times, even and then when a real wolf happens, like if this is a real wolf, everyone's like, Yeah, okay. Okay, C compiler boy, tell me all about it.
But they don't c they don't care, right? They don't care because they're the the baby that they're shaking is called an investor. That's that's who they have more money given. They have to shake the money out of the pockets, right? They don't they don't care what we think, right? Because we're not gonna write them the next hundred billion dollars that they need to like keep going. And they're kind of locked in this, you know
it's a bitter, winner-take-all kind of war for this like core technology part, right? And so they have to be the last AI company standing.
Because whoever is that company takes all the money and the other people kind of go to zero, right? Like unless unless there's some real differentiation soon where it's like, oh, the AIs bifurcate and like Claude is only for code and can't do anything else anymore and like ChatGPT is only for like, you know, uh the humanities or something like good luck raising money.
Mm. GPT-4o already was though.
Yeah, uh so maybe that's not true, but you know what I mean. If there's some kind of br really severe bifurcation, then maybe they could both survive. But you you know, they're in a winner take all battle right now, and so they gotta keep saying this. Every release has to be the one that's this is the one that it will take over the world. And if it doesn't quite, well, you know, it will be next to the world.
¶ AI Marketing: Legal and Regulatory Woes
Uh uh sorry, just one quick thing. Uh did you know that uh Red Bull in two thousand and seven was it? Two thousand eleven? No, two thousand thirteen maybe?
Red Bull was too dangerous to release.
No, Red Bull claimed that it gave you wings. Do you remember the days when it gave you wings? But it was sued successfully, I believe, for ten million dollars because it in fact did not give you wings. It was not superior to coffee. And so
I'm pretty sure in college I got a check for like two dollars and thirty cents for the case.
Yes, and so I I am curious.
I am
One bro. You should make a video about it.
Call me the lawyer, low level, okay? Listen.
Legal. That's kinda low. But I'm actually curious. If they keep saying that and then it doesn't happen, do they open themselves up to a false advertising class action lawsuit? Like, can you keep saying this and then not get like Red Bull made claims and then they got sued. Why not why not other people? Why can't other people get sued for that?
I think the problem with like with Red Bull is like the the case was so obvious, right? Like Red Bull does not give you wings. End of case. Like, okay, fine. Like any judge over the age of one
I would have liked to hear the defence for that one. Yes, it does. Your honor.
But the problem is
They had like these wings like strapped to their back and they're gonna be like I drank your Red Bull this morning and here are my wings we ship you
Yeah.
Um but the the problem with anything technological when it comes to the government or legislation or or you know judicial process is that like boomers and higher run the world right now when it comes to these levels of uh of making um like legal decisions. And you couldn't explain to anybody at that age, unfortunately, like right now, just people that are like running these processes.
What it even means to find a bug and then and then show them Mythos' claims and like and make a a sound legal argument that would like go well in court.
You're right, you're right. Because Kamala Harris did actually think computing was in the literal clouds and so It's my favorite clip of all time.
Josh put the clip in.
So you're now no longer are you necessarily keeping those private files in some file cabinet that's locked in the basement of the house. It's on your laptop and it's then therefore up here in this cloud. That exists above us, right?
She'll have the last laugh though when like uh SpaceX is launching uh AI data centers into space. Like Kamala's like that's what I was talking about. That's what I was talking about.
Yeah, it's cloud storage.
A great clip where she's talking about the cloud and she literally points above and goes like the cloud, it's like above us and stuff or something like that.
She should have known that it wasn't there because you don't see a series of tubes. There's a series of tubes.
Yeah. A series of tubes. I learned that recently.
That's true.
¶ Open Source Vulnerability to AI
Um okay I got a I got a question for you at like in this Yeah. So right now I get that there's there's basically like the argument Like okay, I'm a company, I release my thing, I run some models as like a preventative thing to look for zero days. The bad guys run models to try and look for zero days. We kinda fight it out and it's whatever. Right. So I think like everyone's saying like if the hackers can use it, I can
But the thing that makes me like a little bit more like I don't really know is like for the state of like a bunch of open source stuff. Like am I an open source maintainer. And I already can't convince a company to send me a hundred dollars a month to maintain this thing for them. Th there's no chance I'm getting them to well I'm definitely not Yeah. And decide that now it's safe, right?
But and like I can't get any companies to pay for that and sponsor it. But like if I'm uh uh you know uh if I'm the one little pin in the xkcd comic that's holding up from Nebraska, the the bad guys only need to do mine once. So I'm wondering like kind of how you see that as like the landscape affecting open source, things like that, because it seems very asymmetric in.
I mean I think it's asymmetric for that. The reason why you can make the argument that Anthropic is afraid is because you are the linchpin on the infrastructure of the internet and no one has funded you so far, you have had zero security audits or zero security work done on your stuff. And so like If you give access to these models.
If you really are the linchpin in the internet, you already aren't getting money money from Netflix, Google, whoever that's re using your software, and the black hats know that you're the linchpin keeping the internet up. They're going like they're gonna make use of that model to to do the exploitation. Um does that answer your question? I mean, like I think it's just yeah, the like the the amount of power that it gives to a single organization given the current like
state of open source software in particular is very dangerous. To be very clear, these models are also very good at doing closed source, right? Like my recommendation to anybody interested in this, by the way, is like go take a capture the flag problem from like CTFtime or crackmes.one or whatever.
And uh hook up Ghidra to Ghidra MCP and then use Claude Code on Ghidra MCP, it will reverse engineer and find a bug in that problem in a matter of minutes. Like Clopus four six is a better reverse engineer than me. And I've been doing this for like coming on fourteen years. Uh it's honestly terrifying to watch it work. So if you're even remotely interested in this, go give it a shot and you'll kind of see what I'm talking about. It's it's scary how
'Cause that so that part that's where I'm like, you know, whether it's mythos or not. I feel like right now a bunch of stuff you could just maybe it'll cost more tokens or it'll
Yeah.
You still could find
Yeah, and like the models also like you know any model does this obviously, but like the the current models are really bad about like false positives. Like I've done security research uh in my free time on like Chrome, ESXi, and some other like routers that I've like downloaded. Classic weekend activity.
critical finding, like buffer overflow in like the the RPC handler for this thing, and it's like, okay, all right dude, like write me an ASAN harness that tests that and you'll see very quickly Oh, sorry, just kidding, it's not actually there. Um and so the magic is like if Mythos is able to
Make fewer false positives, you increase the signal-to-noise ratio in this process, which is scary, right? Because it just means you need fewer people to triage the uh the reports and ultimately find real.
¶ The AI Model Withholding Dilemma
So I have another question with this Mythos thing, and and maybe I'm curious uh I'm curious about your security expertise. Isn't this whole withholding a model kind of like a doomed uh proposition to begin with? Meaning that if OpenAI has a similarly powerful Mythos model and they're competing for like a zero-sum game kind of like outcome of who is the best model. Doesn't it mean that when OpenAI has it, they will just release it?
Like and then aren't we just forced to go out because whoever kind of releases it gets the customers and then the that by having the customers you win and so then you just get out ahead. Like doesn't this kind of cause like a weird thing where yeah, we're like, Oh, we can't do this, you know, Dario's like saying we can't do it, but won't we just kind of fall right into it the moment There's two people that have it.
Yeah, I mean that's I I I'm not like shitting on capitalism. I'm just saying that's more of like a capitalism problem than it is like a security problem, right? But yeah, your your point is basically like if actor A says thing too dangerous but could make a lot of money.
Open source mode.
And actor B has same thing and wants to make money with slightly less ethics potentially. Yeah, actor B is gonna release it. Or yeah, exactly. Chinese model, Russian model, whatever. Um
Well I mean it that's literally what I mean Dario quit OpenAI. They keep rel they keep making models that can kill humanity. Right. Okay, so I'm starting a company where we make models that could kill humanity.
Nein.
Also Chinese models, right, come after OpenAI or uh Anthropic releases one. So I think that that might be a little bit difficult. They might be a little bit behind.
Has anyone seen Robocop?
Riverside chat. But yeah, I mean OpenAI literally has a model that they claim uh they haven't made any claims, I don't think, about like Mythos equivalence, right? Um but they're doing effectively the same thing where it's uh it's KYC, know your customer, so you have to like upload your ID and like talk about what work you do. And you get access to GPT-5.4 Cyber. Which I'm assuming is just a model that's trained better on bug patterns, right? Use-after-free, out-of-bounds read,
Now if it's actually better than Mythos, who knows, right? But you know, it's I think we're all just trending regardless of what Anthropic wants to do, I think we're trending toward
Every person on planet Earth with a couple bucks having access to models that are very good at bug hunting. Uh and the question is what does that mean for software, right? Does software get more secure? Does the world just get more scary for a long time and it never like resolves itself? Like what do we do with that information? And that's a
I'm interested to know how expensive. That's the other question. I mean this is obviously the question kind of that we've been talking about for a while on the pod and in life in general is what are what are token costs gonna look like? if uh OpenAI and Anthropic both get all of the customers that they would like to have.
¶ AI Benchmarks and Security Relevance
Uh because the cost won't be the same.
It's a
Yeah. Demand ten or a hundred or a thousand Xs. It won't be
I'm not super well read on this. Is it true that an inference currently is at a Like
I've heard I've heard both. Both. Okay. Some people are so confident I li I have been looking to try and find a
I'm the confident one by the way, he's he's referencing.
Okay.
I have strategy, BT, and I...
Yeah.
Yeah, yeah, right. I've heard I've heard though that some some people are saying they are running it at a loss or it's a bit complicated because I'm not sure. Pretty sure Anthropic's probably running some percentage of accounts on the two hundred dollar plan at a loss. Right.
Um
Yeah. But like is is API pricing at cost or below and then how do you factor it? So I didn't.
I w my my personal take is that Inference itself just looked at in the myopic view of just inference, it makes a lot of money. But you also then once you zoom out now you start saying hardware and all the incidental stuff around it, probably still makes money. But then when you zoom out to say like every time you release a model you defunct your previous model.
That is gonna have that has a very large burden. And they keep on not making money and needing to raise more money. So I have a sneaking suspicion that part of it is very hard to make money in the current. Uh all right.
Well OpenAI is like publicly uh like losing money, right? But is Anthropic also negative?
They just had another big raise as well, so I'm assuming I thought they just raised like six billion dollars or something. Could be wrong about that chat fact check me. I know OpenAI did a hundred twenty billion.
That's so much fun.
Yeah.
Cash. This is the one that I actually was really curious to see. This is the only benchmark that I was super curious to see if they're gonna uh do well. Anthropic Opus four six maxed cost approximately nine thousand dollars and got point five percent score. On Arc AGI. So this is like the the the super test.
And humans get into the high nineties. Uh AIs get like uh Gibity four high cost five thousand dollars and got point two percent. Gemini uh three one did point four percent for two point two thousand dollars. And so it's like this really difficult Uh it's a really difficult test for AIs to pass.
And so Mythos did not add itself to this one. So this is the reason why I largely think it's more hype marketing than it is anything, because to me this is like a really great indicator, at least into some sort of better model improvement. And so I didn't see it.
Uh let me can uh can I just give a counterpoint to that though? Once again with the huge disclaimer that I don't do any uh So this is just off the cuff. But ARC AGI, if I'm not mistaken, is a benchmark specifically to test how well AIs perform uh on learning completely arbitrary new things that don't exist anywhere. That's the only thing that it's...
Test for intelligence.
Exactly.
And so the only reason I would wanna point out that I don't think that test says very much about this particular security thing is security is not that. True. Nobody is claiming that Claude Mythos came out and discovered a whole new set of classes of security exploits that no one had ever come up with. What it's saying is that it went and found a bunch of the exact same kinds of zero days.
That someone like Ed would find if they went and spent a week on that piece of software, right? Like so they're not claiming that this thing is somehow more intelligent than the predecessor in that way. It's claiming that it's got better pattern matching
And like
stringing things together to create exploits, right? That process, which is well known. And so so I don't think Arc AGI necessarily tells us very much about whether it can do those things because those things are very well known tasks that security researchers know how to do and we kind of know the process that you do to do.
Yes, that's okay. I will I will I will concede that point most certainly that the security yeah at least known and obvious security vulnerabilities such as use-after-frees and and all the fun stuff, like the stuff that happened in FFmpeg with jumping ahead somewhere in in the buffer based on the
These things are very common kinds of bugs. They're they're not like unusual. The things that they've talked about are like very, very standard. And so that seems like a more plausible claim. Like, hey, we just were able to scale up the sort of security checking that a security researcher would do, it can do that thing and and find, you know, potential places.
¶ AI's Advanced Exploit Chaining Capabilities
A lot more plausible.
Yeah. The thing too for I feel like for the security side of it as opposed like as opposed to constructing a product or a new product or like building a feature where you have to get like In some ways, all the things right. For a security thing, I only need to find one of the things that are wrong. Yeah. Which is like that's a much
L like you can test a bunch of the scenarios like you're saying, Casey, that already exist. And I I only need one thing to be wrong in the program for then me to be able to take control of
Well and it's combinatorial, right? Like a lot of what security research is doing is like A it's pattern matching for these kinds of bugs. And then B going like, Okay, if I did this one followed by this one, would that produce an exploit? What if I did in the opposite order? What if I did this one and then this one and then that
Okay, what if I did this one right? And again, these are things computers are good at. Like that it's not you don't have to believe in some kind of a weird, like supernatural, like AGI achieved internally Sam Altman nonsense to believe that this is something a computer could do. It's it's much more plausible if anything than some of the other claims. So that's why I I would like say I'm I'm not that like when I saw this I wasn't like, that's gotta be false. I was like, okay.
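The "what if I did this one, then that one, what about the opposite order" search described here can be sketched as a brute-force walk over orderings. This is only a toy illustration: the primitive names, their preconditions, and the goal label below are all hypothetical and are not taken from any real tool or from the Anthropic report.

```python
from itertools import permutations

# Hypothetical primitives: name -> (facts it needs, facts it provides).
PRIMITIVES = {
    "info_leak": (set(), {"known_address"}),
    "arbitrary_write": ({"known_address"}, {"overwritten_pointer"}),
    "trigger_callback": ({"overwritten_pointer"}, {"code_execution"}),
}


def chain_works(order, goal="code_execution"):
    """Replay the primitives in the given order and report whether the goal is reached."""
    state = set()
    for name in order:
        needs, gives = PRIMITIVES[name]
        if not needs <= state:  # a precondition is missing, so this ordering fails
            return False
        state |= gives
    return goal in state


# Try every ordering and keep the ones that reach the goal.
working = [order for order in permutations(PRIMITIVES) if chain_works(order)]
print(working)
```

With three primitives there are only six orderings to try, but the count grows factorially with the number of primitives, which is why a tool that proposes likely orderings first is attractive.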
Yeah. Most of vuln research is like, you know, take a function that gives user input, like define your threat model, and then do source-to-sink analysis on some vulnerable function or failure to gate a function on like a language. And like does user data get a little bit more.
Bug confirmed. And like, yeah, that's literally just pattern matching that we've solved a lot of the times previously with like satisfiability solvers, right? Like angr and like Z3. Like take the graph of a function, turn it into a math problem. Can you solve the math problem? Cool, bug confirmed. Well now with AI, it's just like that process of doing source-to-sink on like tech.
It can do incredibly fast, right? It's very good. Now, obviously, because it's stochastic, it creates a lot of false positives. But if we can figure out a way to reduce the false positives or uh automate the the validation of those false positives, then yeah, it's it's crazy.
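To make the source-to-sink idea concrete, here is a minimal sketch in plain Python. It assumes the program has already been reduced to a data-flow graph, and every function name in it is made up for illustration. Real tools such as angr or Z3 reason about actual constraints on the data; this toy only checks whether a path from user input reaches a dangerous call without passing a check.

```python
from collections import deque


def tainted_paths(flows, sources, sinks, sanitizers=frozenset()):
    """Return every data-flow path from a source to a sink that skips all sanitizers."""
    found = []
    queue = deque([s] for s in sources)
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in sinks:
            found.append(path)
            continue
        for nxt in flows.get(node, ()):
            # Stop at sanitizers, and never revisit a node already on this path.
            if nxt in sanitizers or nxt in path:
                continue
            queue.append(path + [nxt])
    return found


# Hypothetical data-flow edges: user input reaches a copy with and without a length check.
FLOWS = {
    "read_request": ["parse_header", "validate_length"],
    "parse_header": ["copy_to_buffer"],
    "validate_length": ["copy_to_buffer"],
}

paths = tainted_paths(
    FLOWS,
    sources=["read_request"],
    sinks={"copy_to_buffer"},
    sanitizers={"validate_length"},
)
print(paths)
```

Each path it reports is only a candidate. As noted above, the expensive part is confirming that a candidate is a real bug and not a false positive.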
Have they thought about asking mythos?
Yeah, no, come on.
Can you just No mistakes, please. Um The thing that sets Mythos apart according to the Anthropic report is its ability to chain together primitives, right? So the scary part from like a cybercrime perspective is like you have uh gadget A that gives you an arbitrary read and gadget B that gives you an arbitrary write. Okay. Like those two separate things are like not super important if they're not used together well.
What Mythos was able to do is out of a hundred tests, I think it's like 83% of the time, find exploit primitives in a vulnerable code base and chain them together to get RCE. Right. That's the scary part, 'cause then that's true, like end to end exploit creation for a bad actor. And that's I think what scares Anthropic. Now I know there's an argument where like
¶ AI for Advanced Bug Detection
Firefox.
But I mean just apply that process to the sandbox and the same thing applies. You know, it's just I think we wanted to prove a point that it could
Well and also I mean again uh like as I've said many times, I can't stand AI companies, so I'm not trying to defend them or anything, but I'm just trying to point I'm just trying to point out how plausible this stuff is to me from a neutral observer standpoint.
Classic Casey defending AI companies.
Yeah, I I know, right?
I know.
If you think about it it's like look Security researchers who do not number that many were already cranking out zero days at a much too alarming rate.
For me, right? Like like you know, there's a hack every other day, right? It's like CVEs are piling up like there's no tomorrow. And yeah, not all of them are actually all that bad or whatever, but like It's not like security researchers were having trouble producing a fair number of of critical vulnerabilities, even with the limited resources that
So it's also not weird to think that like if you had more automation you would find a lot more of them. It doesn't like there's clearly just a lot of bugs, guys. Like there's a lot of freaking bugs.
And it just doesn't seem that unusual that if you have more sophisticated pattern matching, more sophisticated combinatorial checking, where the security researcher doesn't have to spend a lot of time setting up the tool 'cause it can just kinda ingest the code and it knows roughly what it means. Yeah, I mean their rate's gonna increase. If nothing else existing security teams' rates of finding exploits, it has to. I mean it just has to unless this thing is just a complete pile of
It's gone.
The other thing too we've been seeing from each like new generation of model is that they're getting at least from my experience and what I'm what I'm reading from people and everything. They're getting better at calling other tools. So like they call out to stuff more regularly. Right, yeah. And they can pay attention for longer.
Right, like like recompile this code and see you know, c make this exploit and run it against this thing or whatever, right? Like those are all things that if you automate them, a security researcher gets much faster at finding bugs because they're not having to set up the tooling themselves to like go work on Like whatever whatever those steps were, they don't have to do them anymore.
Right. So then if you're like, oh now it can run instead of like I have to prompt it at every stage for the next thing to do is I can give it ten rough things, say try a bunch of combinations of these and then it runs for twenty four hours. Yeah. You're just like a lot it's literally like in in my mind, some of it is like, Yeah, well we already know fuzzers exist. Like we use'em all the time and they're good. Right? It's like in some ways almost like
Yeah, it's like fuzzer it's like fuzzer squared, right? It's like a thing now that can like target the fuzzing at things specifically so that it fuzzes toward things that would be very hard for stochastic testing to catch.
Because when you have stochastic testing and you have to chain two things together, you're never gonna randomly pick the two things that would have to happen for them to work. Here is a thing that can like target that specifically and go like, Oh, I think combining these two things would be a problem. Let me fuzz that specific path.
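A rough back-of-the-envelope version of that point, under a deliberately simple assumption: the random tester picks each step uniformly and independently from a fixed set of operations, and exactly one ordered sequence triggers the bug. Real coverage-guided fuzzers do better than this, and the numbers below are made up for illustration.

```python
def expected_random_trials(num_operations: int, chain_length: int) -> int:
    """Expected uniform random attempts before one specific ordered chain is hit.

    Each attempt succeeds with probability 1 / num_operations**chain_length,
    so the expected number of attempts is the reciprocal of that.
    """
    return num_operations ** chain_length


# Hypothetical target with 1,000 distinct operations and a bug that needs two in order.
blind = expected_random_trials(1_000, 2)

# Same bug, but something has narrowed each step down to 10 plausible candidates.
targeted = expected_random_trials(10, 2)

print(blind, targeted, blind // targeted)
```

Under these assumptions the narrowed search needs about ten thousand times fewer attempts, and the gap widens with every extra step in the chain.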
That's where it gets crazy is like you just have the AI write the fuzzer. And then like if you can automate that process.
It's pretty amazing.
¶ Podcast Farewell and Promotion
Um I do have to go though. I have a meeting in three minutes, so I gotta I gotta rip. Um
Well hopefully you get mythos access. Congrats.
I mean, no, it's not going to happen.
Mm-hmm. I like you guys, but it looks like it's the end of our show, unfortunately.
So true.
Thank you everybody. I would just like to say that uh Casey and T J and obviously Temu Casey that just left, commonly known as Low Level Learning. Uh you guys you know you make the show magic and And now I'm just gonna go about being lonely again.
Oh pride. Ha ha.
I know how it's going. I thought I was gonna get booed, but I uh I just assumed something's gonna happen. All right. Um the real the good the good news is You can enjoy full episodes of the stand-up now on YouTube. If you go to the stand-up pod. Which I'm gonna try to rename hopefully at some point. We're trying to work some things out to get it a better name, but right now YouTube, am I right? Um
If you go to the website, if you go to our website, will it have links to these?
Yeah, it will. Okay. It will. And it'll have it spelled out. Uh we'll we'll make it more clear once we figure everything out. Over the next week, maybe by the time you're listening to this on YouTube. uh by the time we're listening to this on YouTube. Uh we're gonna upload all of the backlog to that channel as well. So we should have every episode on YouTube in one spot, very easy to see, etc. Obviously you always can, you know, RSS. Download the audio directly. Don't press the red button.
Of course.
What is that web address that people should go to from?
Stand up pod.
Go to the standuppod dot com. All the links will be there. All the episodes will be there. You want YouTube, you want Spotify, you want downloads, you want RSS, you got it. The standup pod dot com, Lady.
Yeah, yeah, check this out. I'm just gonna do something for the audience. Look at this. If you go here, you click Trash Made a Black Mirror app, you can go and you can listen to it right on the website. You can have all the nice information right here.
They don't even...
You can go in here.
Look at this.
You can play on Spotify. You can download and just have personally for you to do whatever you do. That's for you.
Now that we're and then I'll make it I'll make it so it links to the YouTube Now that we're gonna have a dedicated YouTube channel for that too. So for all of you out there. Yeah.
The AI companies claim that you're gonna get UBI, but we're actually giving you universal basic pod. You just get it.
U B P you.
UBP. I was gonna say well I don't know what I was gonna say. Uh we should just
We fit really just on this episode.
Stick a fork in it guys, it's done.
Alright.
Good seeing you everybody.
Yeah.
Whatever your name is T.
🎵 Music
