An AI So Powerful Anthropic Kept It From the Public

Speaker 1

00:02

Bloomberg Audio Studios. Podcasts, radio, news.

Speaker 2

00:09

Nicholas Carlini is a well-known AI security researcher. He works with Anthropic to stress-test its software for ways it could be used by bad actors, and back in February, he took a trip.

Speaker 1

00:22

He was on holiday in Bali. He was actually at a friend's wedding with his wife.

Speaker 2

00:27

Margie Murphy is a cybersecurity reporter for Bloomberg News. She says at the time of Carlini's vacation, Anthropic had just released a version of a new AI model called Mythos for internal review.

Speaker 1

00:40

So that meant that anyone in the company could start trying it out and report back if there were any issues with it. And Nicholas, who never takes a day off and is so excited by these kinds of things, got his laptop and immediately started testing to see whether the model would do all sorts of bad things that he asked it to.

Speaker 2

00:58

To Carlini's shock and concern, it would.

Speaker 1

01:03

Not only at his direction was it going and finding flaws in some really, really important software that underpins a lot of modern computing, but it was finding ways to find these flaws and turn them into exploits.

Speaker 2

01:18

Once he got back to Anthropic's San Francisco office, Carlini kept testing, and he found that this new model could create tools to break into all sorts of software all by itself. As Margie and her colleagues put it, Mythos could conduct the digital equivalent of a bank robbery.

Speaker 1

01:36

The executives started talking about, you know, what do we do? We had planned on releasing this as a part of Claude to the general public, but we might have a national security risk on our hands here.

Speaker 2

01:50

Anthropic's executives decided to hold Mythos back from the market and instead share it selectively with some big tech companies and banks, and even with some of its AI rivals, so that they could start seeing what Mythos was capable of and protect themselves. Some skeptics say the fears Anthropic is stoking might be overblown. Many outside researchers haven't had a chance to validate the company's claims. Already, though, there are signs it might be hard to keep this tool contained.

02:22

Bloomberg has learned that a small group of unauthorized users has accessed Anthropic's new Mythos AI model.

Speaker 1

02:28

There's an unwritten rule in cybersecurity that nothing can ever be one hundred percent secure.

Speaker 2

02:37

I'm Sarah Holder, and this is the Big Take from Bloomberg News. Today on the show: how Anthropic decided Mythos was too dangerous for public release, and what a new wave of AI-enabled hacking could mean for businesses, banks, governments and everybody else. When Anthropic was developing Mythos, it intended for it to be the latest update to its large language model, Claude. It wasn't meant to be a special cybersecurity

03:08

tool or a master hacker. But Bloomberg's Margie Murphy says that when the company tested an earlier version of the model, it found dozens of examples of concerning behavior: not following human instructions and, in rare cases, covering its tracks afterward. Once, the model used a series of steps to escape its testing environment and gain broad access to the internet, where it began publishing materials online. Mythos, the company found, was really good at exploiting vulnerabilities.

Speaker 1

03:41

Realistically, humans have for years been able to find flaws in software, and we have not only internal teams at major companies whose job is to find them and fix them as soon as possible. There are companies where, you know, that's their business model. But what's incredible about Mythos, according to Anthropic, is that it could find so many, and so many that have existed for years. It was finding bugs that are twenty-seven years old in software that

04:12

is considered highly secure. And in addition, it could also find different flaws. And now, flaws are just entry points.

04:20

They may not be a big deal, but what it was able to do was chain together multiple flaws to create an exploit, which is when you are able to get into a system and do something useful with the flaw. That was the kind of turning point, I think, where they thought, wow, this is something that is actively finding flaws, putting them together, being very smart about how it's doing it, and doing it en masse.

Speaker 2

04:45

Another thing that Anthropic leaders talked about was Mythos's ability to act with greater autonomy. What made its capabilities so concerning? With previous models, you'd need to give a little bit of direction. So if you're trying to hack something, you would have to have a good bit of knowledge about the system you're trying to hack. You would have to kind of really point the model in certain directions. With Mythos, there is less direction needed. The way that Nicholas described

05:17

it to me, it was coming back with these sorts of methods that it could do itself. You don't want your technical tools to have too much autonomy. You want them to be working for you and explaining why they're doing it. So when something does something that you don't really understand and it doesn't tell you why, I think that's always a concern. What kinds of companies or industries or institutions could be most vulnerable to a cyberattack from an extra-autonomous LLM like Mythos?

Speaker 1

05:50

So this is the thing: everyone, everything. Because Mythos isn't changing hacking, in a sense; it's changing the speed and the acceleration of hacking. Flaws exist in loads of software, so from the financial industry, to manufacturing, to even governments who might be concerned about espionage. Mythos isn't actively going and hacking those things right now. But if a model like Mythos was to come out, anything that's on the

06:23

internet is potentially able to be hacked, right? So we're reliant upon the industry defending itself and putting in place proper measures.

Speaker 2

06:33

So Anthropic discovers these potential concerns. Carlini and others start raising the alarm. Anthropic decides to not release Mythos publicly, but release it to this limited group called Project Glasswing. Who gets this first look, and how do they decide who is going to get Mythos in this limited release?

Speaker 1

06:57

I did ask Anthropic, and they're careful about giving too many details about how exactly they decide. But the reassuring thing about the AI industry is, despite being fierce competitors, there is an element of working together when it comes down to security. And so they came up with this idea to create a sort of cohort of companies, including cybersecurity companies, which would be their competitors, including Google and Apple and AWS, who are also trying to crush them

07:31

in the AI market. And yeah, they formed this group called Glasswing and invited people to try out Mythos and see whether they could find any flaws and bugs in their software supply chain. You've got some of the biggest tech companies in the world, you've got some finance players, and, as Anthropic would say, they're talking to government agencies.

Speaker 2

07:57

And what's the reasoning here? They've discovered the big potential threat in Mythos, these cybersecurity capabilities. Why release it to anyone? Yeah, why release it to this limited group of players at all?

Speaker 1

08:12

At first it was totally confusing. Why are you working with your competitors to give them access to, you know, your IP, this product that probably would be very valuable for you if you just released it as Claude? And you're also giving it to them for free as well, because there's kind of a credit program that Anthropic offered. Anthropic says they released it to these limited players so

08:37

they could use it for good. So rather than tone it down, not tell anyone about the scary thing that might freak everyone out, just make sure that it doesn't do that again and release it, they could kind of make a big song and dance of it and say, hey, we're gonna responsibly give it to a few people who we trust, who are going to try it out, test it out, see what we learn from it, and then in the meantime signal to the world that we have this extremely powerful model, which could

09:09

do no harm for their marketing strategy. And I think, to be fair to Anthropic, they have cared about AI safety for some time. Their founders come out of OpenAI, and their whole mission from the beginning has been

09:22

talking about using AI in a responsible way. They were thinking, look, other companies are going to have something as powerful as we do soon, if they don't already, and maybe adversaries like nation-states who are working on kind of military-style hacking operations, they may have that too. So at least let's talk about it.

Speaker 2

09:43

It does seem like a very good marketing strategy, right? Like, we release this tool, it's so scary good that you need it to protect yourself. You need it now, you need it first, before we let the rest of the world get their hands on it.

Speaker 1

09:56

It's an incredibly effective marketing tool. Everyone wants to see what it can do. Everyone wants to get their hands on it. To have access to Mythos is like a golden ticket. And you can see online there's been so many hot takes on, oh, this is just buzz, and, you know, how do we know that it's true?

10:14

But I've spoken to a lot of Glasswing members, the people who actually have access to it, in addition to the people at Anthropic who have explained how it works, and they've all said to me, it really is very powerful. Are they going to go and live in a cave now? And do they have fears that, like, satellites are going to drop from the sky?

Speaker 2

10:35

No.

Speaker 1

10:36

Do they fear that if something like Mythos got into the hands of an adversary there would be problems? Yes.

Speaker 2

10:45

Coming up: what companies are learning from their Mythos testing and how they're sorting through the results, and what Anthropic's legal battle with the US Defense Department means for other

10:56

federal agencies that want their own look. The same day that Anthropic announced the existence of Mythos, US Treasury Secretary Scott Bessent and Federal Reserve Chair Jerome Powell held this meeting with Wall Street leaders in Washington to discuss the model, its power, and the risks it could pose to banking software. Bloomberg cybersecurity reporter Margie Murphy says their directions were clear.

Speaker 1

11:29

Everyone needed to make sure that they had their house in order and that they were prepared for the potential for more hacks to take place.

Speaker 2

11:39

The fact that the US government had Mythos on its radar and that it was calling urgent meetings to tell business leaders to use it defensively was significant. Margie says it was also a bit ironic, because the US government has been feuding pretty publicly with Anthropic. In March, it declared the company a supply chain risk after Anthropic wouldn't give the Pentagon unfettered access to its tools.

Speaker 1

12:03

Anthropic had been considered one of the government's top AI tools for some time. Lots of places were using it, the Department of Defense was using it. And then there was this spat that happened between Anthropic and Pete Hegseth. Anthropic said, we don't want our tool being used for certain things, and I think the government response was, well, you can't pick and choose. You're here to provide national security, and if you don't, we'll label you a supply chain risk.

12:34

And that's what's happened, and it's going through the courts at the moment. So when Anthropic announced that they had this powerful model, I think a lot of government agencies were like, well, what do we do now? We should have access to this, we should be able to see it. My reporting suggests that the Treasury was really keen to get access to it. Anthropic has told us that they were doing government briefings and letting them know about the helpfulness of this tool.

Speaker 2

13:00

And what about Wall Street?

Speaker 1

13:03

So the banking industry has been using LLMs and AI to try and protect itself already. So I think at first what we were hearing was, oh, is this really anything new? And then, as we know, some banks have had access to it, and I think they're starting to realize that this shows that the thing that they've all been worried about for maybe the past six to twelve months is now a real threat, and that the time is now to act, because if something bad was to happen, that impacts the US economy.

Speaker 2

13:37

And as for Project Glasswing, what do you think this new era of experimentation will look like for these companies and institutions? How challenging will it be to actually fix the vulnerabilities that Mythos identifies?

Speaker 1

13:51

Yeah, so I was talking to one Glasswing member and they were talking about how they'd already used the tool. They found it to be powerful, surprisingly powerful, and would love to continue using it. However, even though they found all these flaws, because obviously I asked them, wow, did you find anything really scary? You know, what have you found? And they're like, we actually don't know what we've found yet, because then going through it all and figuring out what to patch, what to flag, is a

14:20

critical issue. I think it will take a while for the companies to figure that all out, and it will be really interesting to see the feedback, because part of Glasswing is that these companies are meant to be sharing information about what they're finding.

Speaker 2

14:35

But I'm wondering, could this create sort of an unfair dynamic in the marketplace? Like, there's a divide between these companies and governments that are part of Anthropic's trial and then companies that are kind of left to fend for themselves.

Speaker 1

14:47

Yeah, the haves and the have-nots. I think that is absolutely a concern. The people who are going to be most impacted by AI models like Mythos coming out into the world are the smaller players who maybe don't have the budgets to make sure that when this kind of bug apocalypse, as everyone's describing it, comes, they're prepared. They might be the lower-hanging fruit where lots of flaws and exploits are found, and then when someone tries to use those, they don't have the defense in place.

Speaker 2

15:19

Well, this week, Bloomberg reported that a small group of unauthorized users have already been able to access this model using various tactics. A person familiar with the situation told our colleague that they just want to play around with the models, not wreak havoc. But Margie, what does this tell us about how hard it could be for Anthropic to actually keep Mythos in a box?

Speaker 1

15:42

When I spoke to Anthropic when they first told me about this limited release, I did ask the question: when you are opening this up to a number of organizations, how do you protect what those organizations are doing? And how do you have oversight? You know, the more people you open this up to, the more avenues there are for potentially malicious actors to try

16:03

and get access to it, whether that's socially engineering them, figuring out who's the person who might have access and kind of using these manipulative tactics to get into their systems, or whether it's other means of hacking. And so it raises concerns about who potentially might already have access to it, and if AI hobbyists want it, hackers probably want it too, and they're probably working around the clock to try and get access to it right now.

Speaker 2

16:30

And what did Anthropic say about Bloomberg's reporting on this unauthorized access?

Speaker 1

16:35

So Anthropic said they're investigating. We reported that it was partially through this third-party contractor that Anthropic uses that these AI hobbyists were able to get access. They are looking into that and making sure that that access potentially gets cut off, and we'll wait and see what they come back with.

Speaker 2

16:55

And is there a plan to release Mythos to the general public eventually?

Speaker 1

17:00

So what Anthropic told me was that they don't plan to release this generally, so there won't be a Mythos update to Claude. What they have released in the past couple weeks is Opus four point seven, which is kind of a bit of a dressed down Mythos. The idea is that they may possibly broaden it out as a cybersecurity initiative, so people may be able to get use of it if they are only using it for defensive reasons, just for the cybersecurity element.

Speaker 2

17:30

What about Anthropic's competitors like OpenAI, Google, even, you know, DeepSeek? Are they close to releasing models that have similar capabilities to Mythos? And do you think they will take the same approach as Anthropic, which, you know, was explicitly founded on this pledge to prioritize safety?

Speaker 1

17:50

I do not doubt that competitors have found or developed models that have near enough or similar capabilities, but maybe chose a different path in how to release or talk about them. OpenAI has come out with one: last week, they announced that they have a kind of bug-finding capability as well. Google has Big Sleep, which offers a similar thing. The way that the AI race is going is that everyone is kind of edging quite close to

18:22

each other. It depends on whether they make that decision to release it or talk about it publicly.

Speaker 2

18:28

Part of what we're talking about is that these models are getting better faster, and especially faster than the ability of human regulators to agree on how to put guardrails on this technology. And I'm wondering, you know, is government regulating this? Can they?

Speaker 1

18:48

Yeah. So, the fact that a private company had to come to the decision to do a limited release of one of its models because it was so powerful just shows the state of AI regulation right now. Everything is moving just so quickly. Even if someone was to suggest the perfect plan to regulate, you'd probably have to tear that up a week later and start again.

Speaker 2

19:18

This is the Big Take from Bloomberg News. I'm Sarah Holder. To get more from the Big Take and unlimited access to all of Bloomberg.com, subscribe today at bloomberg.com/podcastoffer. If you liked this episode, make sure to subscribe and review The Big Take wherever you listen to podcasts. It helps people find the show. Thanks for listening. We'll be back tomorrow.

Transcript source: Provided by creator in RSS feed
