In the mid-2000s, a small TV show called Veronica Mars quietly exploded and became something of a cult hit. Ever since, its fans have begged for a revival. Now, I don't know if a reboot is in the cards, but what I do know is that on Facebook, you could ask Kristen Bell, the actor who plays the show's title character, what she thinks of the odds. I also know you could hop on Instagram and ask Awkwafina to workshop some comedy routines about obsessive fan bases and pop culture.
And on WhatsApp, I know you could get pro wrestler turned actor John Cena to try to explain why an old TV show about a young female private investigator remains such a cultural phenomenon for so many. And if you're just catching on, I'm talking about using Meta AI, the AI assistant baked into these platforms, to goof around for a little bit. Now, it wouldn't be the most world-changing use case of AI, but sometimes you just need a little AI-flavored entertainment.
And Meta is more than happy to entertain you on its social platforms. In late September, the company announced that you can now voice chat with its AI assistant instead of just typing. And Meta AI can borrow the voice of one of these celebrities, as well as Keegan-Michael Key and Dame Judi Dench. Meta also said that its chatbot can now speak multiple languages, and that it's also multimodal, in that it has the ability to understand and transform images that you might show it.
Now, it's all fun and games, to be sure. But these types of features are table stakes now. And in the quest to offer AI tools to its users as rapidly as possible, Meta is catching up to the competition quickly. But these features are just the tip of the iceberg. There's a lot more to this tech than just another large language model owned by another big tech company.
And if Veronica Mars were tasked with investigating what's really going on here, one peek below the surface would reveal a surprising accomplice: an open-source Llama. I'm Bilawal Sidhu, and this is The TED AI Show, where we figure out how to live and thrive in a world where AI is changing everything.
Hi, I'm Bilawal Sidhu, host of TED's newest podcast, The TED AI Show, where I talk with the world's leading experts, artists, and journalists to help you live and thrive in a world where AI is changing everything. I'm stoked to be working with IBM, our official sponsor for this episode. In a recent report published by the IBM Institute for Business Value, one in three companies surveyed paused an AI use case after the pilot phase. And we've all been there, right?
You get hyped about the possibilities of AI, spin up a bunch of these pilot projects, and then...crickets. Those pilots are trapped in silos. Your resources are exhausted, and scaling feels daunting. What if instead of hundreds of pilots, you had a holistic strategy that's built to scale? That's what IBM can help with. They have 65,000 consultants with generative AI expertise who can help you design, integrate, and optimize AI solutions.
Learn more at IBM.com slash consulting, because using AI is cool, but scaling AI across your business, that's the next level. Are your digital operations a well-oiled machine or a tangled mess? Is your customer experience breaking through or breaking down? It's time for an operations intervention.
If you need to consolidate software and reduce costs, if you need to mitigate risk and build resilience, and if you need to speed up your pace of innovation, the PagerDuty Operations Cloud is the essential platform for operating as a modern digital business. Get started at pagerduty.com. Teams with big ideas start in JIRA, the only project management tool you need to plan and track work across any team.
JIRA even helps our team here at TED, keeping us in sync to deliver the big ideas our listeners love. And there's a lot more that teams will love about JIRA. It keeps cross-functional tasks organized with a project timeline, which is always really key so that we make our deadlines. And for cross-functional teams like TED's, working in one tool gives leaders the important visibility they need to make better business decisions. Get started on your next big idea today in JIRA.
Llama is the name that Meta gave its own large language model. It's an impressive LLM, trained on over 15 trillion tokens of publicly available information. In September of 2024, four sizes of the model were made available, from 1 billion to 90 billion parameters: small enough to run locally on a smartphone, or big enough to run the most complex projects, which is kind of wild. Wilder still, depending on how you look at it, is Meta's unconventional approach to releasing Llama.
It's essentially an open source license. You can download it right now and tinker with it as you'd like for free. For the most part, big AI companies like Google and OpenAI tend to prefer a closed approach to distributing most of their own LLM-based products. And of course, they prefer having people pay for access too. But imagine Llama baked into Facebook, Instagram, WhatsApp, and more. This could revolutionize how we interact online.
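If you're curious what "download and tinker" actually looks like, here's a minimal sketch using the Hugging Face transformers library to run one of the small Llama models on a laptop. It assumes you've accepted Meta's license on the Hugging Face Hub and logged in; the model ID shown is illustrative, so treat this as a sketch rather than an official quickstart.

```python
# A minimal sketch of running a small Llama model locally with Hugging Face
# transformers. Assumes you've accepted Meta's license on the Hub and are
# logged in; the model ID is illustrative.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",  # the 1B model: laptop-friendly
)

result = generator(
    "Explain why Veronica Mars became a cult hit, in two sentences.",
    max_new_tokens=80,
)
print(result[0]["generated_text"])
```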
On the flip side, this raises questions about security and misuse. We can't always trust that everyone who's using open source tech will do so with the public's well-being in mind. So, will the benefits of openness outweigh the risks? And what's in it for Meta? A good person to ask is Ragavan Srinivasan. He's the vice president of Product at Meta, leading the team responsible for developing and releasing the company's herd of Llamas. Ragavan, welcome to the show. Thank you, Bilawal.
Nice to be here. So this fall, Meta announced a new update to Llama, as well as updated features to its own AI assistant called Meta AI. And I want to touch on Meta AI with you first for a moment, because I expect that's the tip of the spear, where a lot of people's first experience with an AI assistant is going to be Meta AI. Probably because it's embedded in Meta's ubiquitous social platforms, right?
We're talking Facebook, Instagram, WhatsApp. Heck, it's even on the Ray-Ban glasses and the Meta Quest headset. So Ragavan, talk to me like a user first. What are the kinds of things you find yourself doing with Meta AI? So when you think about Meta AI, it is this universal assistant that is available to you right at your fingertips. And one of the most exciting features that we announced with the Connect launch was this: Meta AI used to be primarily about text.
And now you can talk to Meta AI, and Meta AI can see. So we announced voice and vision capabilities that are being rolled out into this product. And this becomes really interesting for you because you can now start talking to it. You can ask it questions. It'll respond back to you in the languages that you know how to speak. And you can also now start sharing images with it. And all of a sudden, this assistant that's available to you is able to understand you across all of these modalities.
Are there certain things that you find yourself using on a daily basis? I'm kind of curious, knowing that you're in the weeds with the models themselves. What's the stuff that you find yourself going back to? Yeah, so the first one is on an almost daily basis. I have a lot of my friends and family living all over the world. And WhatsApp is our lifeblood. And with our friends in particular, we are always joking around, passing some memes along.
And especially, you know, I have a lot of friends from India. And now I'm able to pull Meta AI into one of these threads and have it riff along with us, almost like it's one of our friends. And the amount of entertainment that it provides is just incredible. And then I have three kids. Each one of them needs some help every now and then with homework. And sometimes, you know, calculus, I've forgotten my calculus.
And so if I need to answer them and they ping me, then I do end up cheating a little bit and asking Meta AI to help me. That's very cool. I mean, the use case I find myself coming back to is the walking use case, when I'm on a long walk and I want to just get a quick distillation of a very complicated topic. It's such a magical way to get that while you're out and about in the world and ask follow-up questions. Exactly. In my case, oftentimes I end up taking our dog out for a walk.
And so, you know, on one hand, there's already a leash, so having to type to interact with it was always hard. And now I can just talk to it, and it'll just answer back, which is great. I have to imagine this gives a huge competitive advantage to Meta, because in this AI race, as everyone's framing it up to be, Meta can now make its AI assistant available to hundreds of millions of users for free.
And so I have to ask, does that first-party distribution footprint literally help your teams learn and iterate? Like, is there a data flywheel in here where Meta is in a unique position to harvest fresh data and insights for future AI models? The way we think about this is actually twofold.
When you think about the core foundation models that we train, the Llama models, they often operate on what we call sort of general knowledge, because you want these models to be as general purpose as possible. And then over time, as you build these models, we start working really closely with our product teams for them to be able to take the next generation of these general purpose models and then start to specialize them and fine-tune them for their needs.
And so they end up deploying customized versions of Llama through Meta AI. And that's what our end users see. So what then ends up happening is, when you're out and about walking, you're able to quickly get feedback signals. Like, was this translation that I just gave you helpful? Was it not? These kinds of small but, at scale, very effective feedback loops can help us ship a very compelling product experience. Do we use that to learn back into the models?
Not as much as you might imagine, primarily because the core model itself is really focused on this notion of general knowledge. However, the product experiences start to improve a lot, because we now have tighter feedback loops to be able to say, OK, this set of conversations in this market probably isn't jibing with people, so we need to improve them in a certain way. And we have a different sort of feedback loop for those models, if that makes sense.
So it's almost like you've got these general purpose models, but when you get into these task-specific use cases, you are able to mine some very interesting signals that help you refine those models for those specific applications. Exactly. And this is something that we've done even before GenAI. So if you think about Facebook translations as an example, right? Translations are an incredibly powerful feature even today.
When a lot of people come to our apps, they don't necessarily speak the language, but they're really interested in understanding what's happening around the world. And we have amazing AI systems that are providing these translations. But there's always going to be a translation that may not be exactly colloquially right. It may not pick up the latest slang. You know, again, bringing up my kids: they use language where I'm trying to figure out, what do you actually mean by any of these words, right? Are you asking Meta AI to help you translate the latest slang? I totally am. And I try to get Meta AI to make lame dad jokes. Maybe that's the other use case I should have told you about. But that's the kind of cultural zeitgeist that you really want to be able to tap into. And that happens by us getting feedback from our users in real time and then improving the products.
You know, so I have to ask, sort of related to that, because Meta AI is so front and center for everyone to use in the Meta ecosystem: how does that change the expectations your team is under to make sure the system can support these billion-user first-party use cases, versus this, you know, thousand flowers blooming of third-party use cases and interest out there? To put it more crisply, how do you think about prioritizing these first-party and third-party use cases for Llama?
Yeah. So that is a great question, and as someone that is responsible for sort of the roadmap for Llama, it is something that is very top of mind. Because, you know, with Llama, we have at the highest level three goals as a team that we try to pursue. First and foremost, it's Meta as a company being at the forefront and trying to be the leader in making progress towards AGI. And so Llama is our vehicle for making progress towards AGI. Now that's a long term goal.
It is a long term goal. How does Meta think about AGI? Yeah. Good question. And I kind of walked myself into that. So AGI, as people might know, is artificial general intelligence. And there isn't really a set definition for what AGI is across the industry. For us at Meta, what we think about are artificial intelligence systems that are able to perform at superhuman-level capabilities in helping humans stay more connected and providing more utility and value for them.
So we imagine this to be a state where humans and these artificial intelligence systems and agents are working closely together to further entertainment, to further social utility, to further economic prosperity. Right. So that's sort of the vision that we have from Meta's perspective. And so a big focus for us in the long run, as a company and also as the Llama team, is how do we keep steadily moving and making progress towards AGI?
Another goal that we think about is how do we make sure that, on an ongoing basis, we deliver the best possible capabilities of the Llama models to the vast user base that is served by Meta. And so this is where your question around, okay, how do you make sure that you prioritize what your product teams need really comes into play. Because when we chart our path towards AGI, that can be more of a research roadmap. But when you then say, okay, what is the milestone that you want to be able to hit?
Not only do you have a research milestone in your head, you're also starting to think about what are the products that we want to be able to ship across software and hardware. And then the third goal that we talked about: as you know, Llama is open source.
And so a big responsibility for us is also to understand, from the community's perspective, what is it that they're going to need, so that we can weigh all of these against a set of prioritization criteria that we as the Llama team have to think about. And then you try and come up with the roadmap. So those are sort of the three vectors, if you will, that we think about when we set goals for Llama. It's interesting.
I mean, there are companies that have raised, I mean, a billion-dollar seed round to go headfirst towards AGI and some sort of superintelligence. Does that dual focus of having this research agenda, but also regularly shipping product, create pressure? Is it exciting? I imagine it's kind of like these two opposing constraints that you need to balance. What does that feel like in practice?
So actually, we don't look at this as opposing constraints in a lot of ways. In many ways, it's almost proof points and really important milestones to say, okay, the progress that you're making from a pure research perspective is also delivering value to humanity as you make this progress.
And so in a lot of ways, we look at this as exactly the tension we need to be able to have because if you think about how technology has historically evolved and from a product perspective, you start by saying, what is a consumer problem that exists in the world today? And can I invent technology to be able to go solve that problem? That's sort of the traditional way to do product. That's right.
The other way to do product, which is, you know, anytime you have one of these platform shifts, it's like, here comes an amazing new technical capability. Does that allow you to solve a problem that you previously were not able to solve? Or does it actually open up completely new opportunities for you to go and serve people in ways that you were not able to before? And so with Llama, we actually have the opportunity to do both. And so that is the balance that we need to be able to strike.
So when we think about the next versions of Llama, we think about, okay, Llama used to be text only. Now Llama needs to become multimodal, because as you think about progress towards AGI, these models need to understand and communicate in all of the modalities that we as humans are able to. We humans use text, you know, we use images, we use videos, we're talking now, so there's audio, right? So then you start to say, cool, what do our product teams think they need?
Do they think image understanding is going to be more important than video understanding for some reason? If that's the case, then let's actually figure out how you construct a milestone that balances both what we think we can get in the hands of consumers through our products and making progress towards AGI. So that's the sort of balancing act you have to do with every release. So let's take a step back for a second here.
Meta AI is built on Meta's own large language model called Llama. Why is it important for Meta to build its own foundational models rather than, say, partnering with another tech company? Like, for example, what Apple is doing with OpenAI and Google. They seem to be building their own smaller models, but they still seem to be leaning very heavily on third-party providers. But Meta is building the full stack. Why? Really good question.
First and foremost, if you think about this notion of artificial general intelligence and large language models being the vehicle through which you're going to experience artificial general intelligence, you get to make a lot of choices in terms of what goes into these capabilities, what goes into these models, the kinds of capabilities that you actually want to build into them.
Do you want them to understand multiple languages or do you want them to only understand, say, a handful of languages, just as an example? Do you want them to be able to speak to you? Do you want them to be able to speak to you using your own dialects?
So as you start to think about the kinds of utility that this technology needs to provide, there are a lot of choices that you end up making. And as an organization, as a company like Meta, where the surface area of our products and user bases is so vast, you have to have the ability to really shape how this technology moves: the rate at which it moves, as well as the prioritization of a lot of these capabilities.
So that's sort of the first reason. The second reason is, when you think about how technology ecosystems generally have evolved, they start with maybe one or two closed, proprietary, vertically integrated ecosystems, and then there usually emerges an open alternative, which over time ends up becoming by the community, for the community, and then the default ecosystem that a lot of people end up adopting. So we saw this even with the web.
So you had closed versions of browsers and then you had open source browsers that came up and everyone started using the open source browsers.
This technology is really powerful, it's something that is going to be very valuable to our consumers, and we need to be able to shape how it evolves. We have the strong belief that open ecosystems eventually end up winning. And if you have an opportunity now to seed and nurture an open ecosystem, well, then we have to invest in this and we have to do it as an open ecosystem.
So that's why the choice of us investing in Llama, building Llama, and then open sourcing it basically became the only way for us. Can we talk a little bit about the open sourcing of Llama there, right? Because as I understand it, the first rendition of Llama, in classic internet fashion, was leaked online, via BitTorrent no less. And then the next release from Meta was formally released as open source. I'm kind of curious, how much of a debate did that spark within Meta?
Like, clearly there was intention to make the initial version of Llama available to researchers. But were you all already contemplating this broader open release, or was there a moment where the rubber met the road, when you saw how people were responding to it, and you went all in on this open source approach? Yeah, so I have to preface this by saying I wasn't here when some of these decisions happened, but I obviously spent a lot of time talking to the people that were involved in this.
And so let me at least walk you through the philosophy and the thinking behind this. Meta's had a long history of doing AI research. And so what we wanted to do was to say, okay, in traditional fashion, let's make sure that the research community has access to these models, and here's the paper. Let's see what they do with it.
What we did not anticipate was that not only was there an amazing amount of interest from the research community, but the developer community was like, oh my gosh, this thing is amazing. Why is this only available to researchers? We want to be able to build on top of it.
And so that feedback, I think, is not something that you could have predicted. But as soon as we saw it, what we were able to do was say, okay, there is a clear opportunity here and a really important role for Meta to play, which is why the second version of Llama, Llama 2, we ended up offering under an open source license. And the rest is history. There is also what I call a form factor side.
If you think about how these large language models have evolved, they have primarily been driven by companies that have large cloud-based infrastructures and cloud-based businesses. As a result, a lot of these models are available behind APIs, which works for a vast number of use cases, obviously. Llama is also available in the cloud, so you can use it that way. But what makes Meta really different is most of our users are using mobile phones.
A lot of our developers want to be able to build mobile apps. And mobile apps means sometimes you're going to be on a spotty internet connection, so you need a solution that works even if you're not able to talk to the cloud. Sometimes you're working on really private data that lives only on your phone, and you don't want it to leave your phone. Even if your API provider says, I'm not going to look at anything, you don't want that data to go to the cloud.
And so a big piece of our strategy this year was also to say, how do we meet developers where they are? That's why, if you think about what we did with Llama this year, we built the 405B-class model, which is the largest, most capable model that you can use behind cloud APIs. And then we also released the 8B and the 70B initially, and then the 11B and 90B models, which are sort of the daily workhorse type of models that most developers would want to use for their production applications.
And then we also released the 1 billion and 3 billion parameter models. These models are super lightweight, still pack a massive punch, and can cover a lot of use cases that you as a developer might want for offline users, for private use on your phone or on your laptop. And so that's sort of been our philosophy for how we thought about Llama. I think this plurality of model sizes is very interesting, especially how it's evolving in the ecosystem. You're totally right.
Most of the chat apps that people might be using are obviously taxing some beefy NVIDIA GPU in the cloud to give your answer back. But the smaller models that you alluded to, the 1 billion and 3 billion parameter models, you can run on any phone, in a fricking browser, for crying out loud. That's just kind of mind-blowing. Are there interesting use cases you're seeing there, around summarization and things like that, where it's really making a dent?
It's so interesting you mention this, because it's exactly the use case that we talked about with the team when we were building these 1 billion and 3 billion models. So we've actually seen developers running the 3 billion model 100% locally in a browser. We've also seen a developer who connected the iMessages on their laptop to Llama 3 and was basically prompting the model to answer questions about anything in their texts.
Perfect use case, because it's private, it's secure, nothing has to ever leave your laptop. And you can actually use this when you're on an airplane in airplane mode, right? So these kinds of use cases are exactly what we imagined developers are going to build. And this was just the first two days. So I'm really excited to see what people end up doing with these things. What's the benefit in that situation of going with something like Llama?
Is it that they can fine-tune Llama models on their private code repository? Why use Llama in this scenario versus some other third-party offering? Yeah, really good question. And it's exactly what you said. So oftentimes software, and your code repository, is also one of your biggest pieces of intellectual property. And so you want to have a lot of control. You want to have a lot of security and privacy processes around it.
And so being able to bring a model like Llama into your enterprise and into your on-prem sort of deployments is something that you can't do when you're just dealing with an external API provider. And so for a lot of industry verticals where code is very, very important, and which may also be regulated, Llama is perhaps the best choice for them to deploy on their code base. So that's one of the biggest reasons. Awesome. So that was a coding example.
The second example is something that we call retrieval-augmented generation, or RAG. And this one is really around this notion that a lot of enterprises have what they call enterprise knowledge bases. So we're talking about SharePoint, we're talking about internal portals, right?
All of these deployments where people write their docs, and then there's probably some version of a wiki internally, which is where you have to go and look up your benefits information: when you can take vacation, what the vacation policy is. So pretty much any enterprise with a large number of employees has one of these, or multiple ones of these.
And then if you're a new employee who's coming in, your onboarding process typically involves at least a week's worth of training for you to know where to go and ask for information. And so now, with these large language models, you can basically just use them to understand the knowledge that is dispersed across all of these different installations: SharePoint, wikis and whatnot.
And then you have essentially a chatbot that is available for you, that is trained on just your enterprise's knowledge, and is able to help answer any questions that your employees might have. So from an employee productivity perspective, this is a huge win.
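To make that concrete, here's a toy sketch of the RAG pattern Ragavan is describing: retrieve the most relevant internal document, then ground the model's answer in it. The retrieval here uses simple TF-IDF similarity purely for illustration; production systems typically use vector embeddings, and the documents and names below are made up.

```python
# A toy sketch of RAG: rank internal docs against a question, then stuff the
# best match into the prompt of a local Llama model. TF-IDF stands in for a
# real embedding model; all documents are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Vacation policy: full-time employees accrue 20 days of PTO per year.",
    "Onboarding: new hires complete security training in their first week.",
    "Benefits: health coverage begins on your first day of employment.",
]
question = "How many vacation days do I get?"

# Rank the internal docs against the question.
matrix = TfidfVectorizer().fit_transform(docs + [question])
scores = cosine_similarity(matrix[-1], matrix[:-1])[0]
best_doc = docs[scores.argmax()]

# Ground the model's answer in the retrieved context.
prompt = f"Answer using only this context:\n{best_doc}\n\nQuestion: {question}"
print(prompt)  # in a real system, this prompt goes to a Llama model
```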
And we have a version internally that we call Metamate, deployed again on top of Llama, that is essentially a daily driver for pretty much all of our employees, whether you're writing code or trying to find out when you can take your next vacation. Many enterprises have data and data strategies, and over the years have essentially accumulated a bunch of really proprietary, high-value data assets that they store.
They're not able to take advantage of this by just tapping into an existing large language model, because these large language models have been trained on what we call the consumer internet. Right. And so this is the general knowledge of the world that you mentioned. Exactly. So it'll give you answers that are sort of based on, again, like I said, Wikipedia and the consumer internet, but not really on your proprietary information. So let's assume you're doing drug discovery research. Right.
And so you have a lot of really high-value proprietary data sets. You now have to somehow train this model to also understand concepts within drug discovery. What does it mean to understand a protein sequence? What does it mean to understand what isotopes are? Right. These are concepts that the model may understand at a very high level, but it's probably not going to be as specialized.
And so then what you need to do is go through this process of what we call fine-tuning, where you take a general purpose model and then you give it your proprietary data to teach it these concepts, so that it can start to perform the tasks that you need for your application. You still want the benefit of this large general purpose model, but you also want it to understand these concepts. That's something that is now going to require you to tweak the weights of the model.
The weights are basically how the model makes decisions, like which answers to give you, for lack of a better way to describe it. And so how do you then tweak the weights of this model? For that, you now need access to sort of the internal guts and engines, which you don't typically get if you don't have access to the weights of the teacher model.
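For a flavor of what "tweaking the weights" can look like in practice, here's a minimal sketch of parameter-efficient fine-tuning with LoRA via the Hugging Face peft library, applied to an open-weights Llama checkpoint. The model ID and adapter settings are illustrative assumptions, not a recipe Meta prescribes; the point is that this only works because the weights themselves are downloadable.

```python
# A minimal sketch of LoRA fine-tuning on an open-weights Llama model.
# Assumes the model license has been accepted on the Hugging Face Hub;
# model ID and hyperparameters are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

# Wrap the base model with low-rank adapters; the original weights stay
# frozen, and only the small adapter matrices get trained.
config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base, config)
model.print_trainable_parameters()  # a tiny fraction of the 1B weights

# From here you'd run a standard training loop over your proprietary data,
# e.g. protein-sequence annotations in the drug discovery example above.
```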
And so that's a really, really important use case, where we think about, okay, now you have the power of a 405B model, and that can teach and distill a very special-purpose, customized model for you. So that's one example. Another example is what we call flexibility, right? As developers, especially if you're scaling out your applications, you soon get to a stage where there are some classes of prompts where you just want a quick response. What's the weather today, right?
Like, stuff like that. Yeah. It doesn't require a lot of thinking, let's put it that way. Yeah. Exactly. And then there's probably a class of prompts where the user is asking a really hard question, where you want to tap into the full capacity of this really powerful model. And so you want the ability to pick and choose how you route your queries, which model you're going to target. And then maybe even have this large model distill a version of a model that is just for you.
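Here's a toy sketch of that routing idea: cheap, simple prompts go to a small on-device model, and hard ones go to the big model behind an API. The heuristic and the two stub functions are illustrative placeholders, not anything Meta ships.

```python
# A toy sketch of query routing between a small local model and a large
# hosted one. The heuristic and stubs are illustrative placeholders.
def is_simple(prompt: str) -> bool:
    """Crude heuristic router; real systems often use a trained classifier."""
    simple_keywords = ("weather", "time", "translate", "define")
    return len(prompt) < 80 and any(k in prompt.lower() for k in simple_keywords)

def answer(prompt: str) -> str:
    if is_simple(prompt):
        return run_local_model(prompt)   # e.g., Llama 1B/3B on device
    return call_cloud_model(prompt)      # e.g., Llama 405B behind an API

def run_local_model(prompt: str) -> str:   # stub for illustration
    return f"[small model] quick answer to: {prompt}"

def call_cloud_model(prompt: str) -> str:  # stub for illustration
    return f"[405B model] considered answer to: {prompt}"

print(answer("What's the weather today?"))
print(answer("Draft a migration plan for our on-prem data warehouse."))
```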
So that distilled model becomes like your workhorse, you know, daily driver for your use cases. That kind of flexibility requires you, again, to have access to the model weights, and to have access to an entire ecosystem that is providing the tooling and the infrastructure layer. And you just can't get to that level of control without an open source ecosystem to build on top of. This show is sponsored by BetterHelp. We know we're in spooky season.
October is kind of the season for wearing costumes. But what if we're feeling like we're wearing a mask and hiding our true selves more often than we want to? Well, therapy can help. It certainly helped me accept all my idiosyncrasies and the various parts of myself. But of course, I am not finished. I'm a work in progress. So if you are thinking of starting therapy, give BetterHelp a try. It is entirely online and it's designed to be convenient, flexible and suited to your schedule.
Take off the mask with BetterHelp. Visit BetterHelp.com slash TED AI today to get 10% off your first month. That's BetterHelp, H-E-L-P, dot com slash TED AI. Hi, I'm Bilawal Sidhu, host of TED's newest podcast, The TED AI Show, where I talk with the world's leading experts, artists, and journalists to help you live and thrive in a world where AI is changing everything. I'm stoked to be working with IBM, our official sponsor for this episode.
In a recent report published by the IBM Institute for Business Value, one in three companies surveyed paused an AI use case after the pilot phase. And we've all been there, right? You get hyped about the possibilities of AI, spin up a bunch of these pilot projects, and then...crickets. Those pilots are trapped in silos. Your resources are exhausted and scaling feels daunting. What if instead of hundreds of pilots, you had a holistic strategy that's built to scale?
That's what IBM can help with. They have 65,000 consultants with generative AI expertise who can help you design, integrate, and optimize AI solutions. Learn more at IBM.com slash consulting. Because using AI is cool, but scaling AI across your business, that's the next level.
You know, this idea of the really large, like, 405 billion parameter models, one trillion parameter models, essentially teaching and distilling their wisdom into smaller models for the tasks you're going to hammer them with, is very, very exciting. And it perhaps brings up what has been a common criticism of Meta's open source efforts, the "is it really open source?" debate, right?
Where perhaps there's a spectrum of open source, and it seems what Meta's offering right now are open-weights models. You all opened up a bunch of restrictions recently, enabling people to use these bigger models as teacher models. What does Meta mean by open source in this case? Look, this is obviously an active discussion, an active conversation in the broader community.
I think this notion and this idea of open source has also gone through its own sort of evolutions and changes, right? And so I think we are at a moment right now where there's a completely new type of technology. Open source, in many ways, was defined with the notion of software in mind. These large language models and the systems around them are a composition of software, data, and this other entity or artifact that's basically just a bunch of weights.
And so I think one of the things that we as a community have to now figure out is what open source actually means in this new world. And I think over time, we're going to figure these things out. We also have different flavors of this that have happened in the past, when you think about content as an example. We tried to apply this notion of open source to content, but it didn't quite fit. And so then we came up with Creative Commons, right?
And Creative Commons, if you think about it, is the equivalent of open source, but it is very in tune with the notion that content is just inherently different from software. And so you probably need a different way to say, okay, this is for the community, this is by the community, here's what you can use it for, and if you use it, here's how you attribute it.
So my expectation is that as a community, we're going to come together again, as we always do, and say, okay, this thing is different. So for this different thing, what are the values? What are the principles that we want to be able to protect? And what are the definitions that allow us to do that? For right now, at least, there's the ability for you to have access to the model weights, and then our definition of things like the Llama Stack, which gives you a very open API that allows you to expand.
And as these conversations happen, you can believe that we're going to be in the middle of them, and we're going to try and shape and evolve this as it goes forward. It's tremendous. I would say it's like this public commons that Meta is giving to the world. But of course, there's a cost associated with that.
And while Meta hasn't disclosed any costs, it's been reported that the number of GPUs that you all have used for AI development and training, plus post-training and fine-tuning, all adds up to certainly hundreds of millions of dollars of investment. What does the amount of computational power here even look like? Can you paint a picture for us of what the back end of a project this size physically looks like? I'm imagining a Borg cube in some Meta data center.
I think we've talked about some of the numbers in the past in terms of the compute capacity that it takes to train these kinds of models. The thing I want to maybe back up and talk about is, when you talk about training these large language models, the pre-training stage, which is really the stage at which you pack a lot of knowledge in and produce the first version of this model, which is more generalized but not yet tuned, that's the most expensive part of it.
After that, yes, you still require GPUs, but it's nowhere in comparison to the scale of compute infrastructure that you need for the pre-training stage. And so that's why you see only very few organizations in the world that end up doing pre-training. And Meta is the only organization that pre-trains and open sources.
And the reason I want to hit this is because I think it's really important to say: if very few companies have the ability to do this, and all but one are going to keep it closed, but there's one that is actually trying to do this and open source it, I think that's a pretty big deal.
So when we think about the investment that we put behind this and how large these compute infrastructures are, that calculus also goes into why we think this is the right thing to do: because not only are we building this for Meta, we're also building this in many ways for open sourcing, so the community has access. Now, to your question of what these data centers look like: they're massive. I don't know how to explain them, because you're now looking at tens of thousands of GPUs.
Llama 3, I believe, we trained on the order of tens of thousands of GPUs. 24,000 H100s is the stat that I have in front of me. Llama 4 is likely going to be an order of magnitude more. So that's pretty large. So if you think of 24,000, and now you have 100,000-ish or maybe even more than 100,000 GPUs, do they even fit into one data center? Because you really do want them to be as close as possible to help with training efficiency. That's really fast interconnect, right? Exactly.
And now this is not just a hardware problem. It's a physical infrastructure problem, because you now have to construct data centers. You have to find a way to power them, and then you have to find a way to cool them. Exactly. But I have to ask you, with this level of investment, right? And it is admirable. And you're totally right, you're the only lab that's really open sourcing these very large, expensive training runs.
How does a company measure the ROI, the return on investment, for these open source AI initiatives? So we fundamentally just believe that open sourcing is good for developers. Obviously, we've covered that in spades. Open sourcing is also good for Llama and for Meta, because we get a lot of contributions back.
Just from a hardware efficiency perspective: you don't just train these models, you actually have to run inference on them, which is how you deploy them, and there are optimizations that you have to do for various types of hardware. There's a huge amount of community contribution that comes in as part of that.
And then there's an entire tool chain that builds on top of this, which means if you have to find a way to connect Llama to some obscure database, or even for something that Meta is going to need, there's probably someone out there in the community that also has a similar need. And because Llama is extensible and Llama is open, they've already built an implementation, which means we can just bring that in house, right? So this isn't only altruistic.
We know that once you open source this kind of technology, the benefits will accrue for the community and for Meta. There's an interesting memo that leaked last year from a Google employee that basically suggested Llama was eating their lunch. It said, like, Google has no moat, and neither does OpenAI. And I'm kind of curious: in this world, some of your competitors are obviously seeing the gains you're making with open source.
And they're starting to selectively release smaller open source models, like Google with Gemma, for example. What does the future of this AI race look like to you? Is it simply a race to the bottom? I think the way I think about this is you can either fully commit to open source or not. And if you fully commit to open source, then that becomes a set of choices that you end up making that inform everything from your data strategy to your infrastructure strategy to your release strategy.
But if you're like, I kind of want to dip my toes a little bit in the water, but I don't really want to get into the water, then you're neither here nor there, and you end up not actually committing to the game that you're playing. You're just trying to somehow say, well, look, I also have a token open source offering here.
The important thing is, a lot of the community sees through that, because eventually, with open source software and open source systems, you want the confidence that this is going to be an enduring commitment. Maybe you're spending your nights and weekends as a hobbyist building an open source tool on top of Llama, right? You want to know that Llama is going to be around for a while. So I think that type of commitment is hard when it's not core to your strategy.
So I do think that, from a cloud-based perspective, you've already seen this: the impact of Llama has meant that the cost of inference just keeps getting slashed, because it's really hard to compete. When you have a high-quality model like Llama that is open source and really available, then what do you do? You have to cut your costs, right? And so our goal with Llama, as I said right at the beginning, is we want to be able to get to AGI.
We want Llama to be the best, and Llama is going to stay open. And so that's our long-term strategy. What our competitors do with that, I think, is a question that's best asked of them. Now, embracing open source creates a situation where you have a distribution model with no centralized authority. How much do you think embracing open source is about sort of being a step ahead of any potential regulatory oversight that might come down the road?
Obviously, this is something that governments across the world are grappling with. The UN just released their report. Doesn't open source make this murkier to oversee and administer? In some ways, it's actually the opposite, because there isn't really, again, anyone else who's doing open sourcing at our scale. Which then means, if you're a government and you're now thinking about, OK, maybe I have my own national, sovereign LLM needs, who do I go talk to?
I can go talk to all of these people who are building proprietary models, but maybe I want to control this. And historically, I've deployed Linux in-house, so I understand what it means to work in an open source community. So who's the only player that is committed long-term to building this in an open source manner? They actually end up having really interesting and important conversations with Meta, because we are trying to be responsible stewards of Llama as an open source project.
So in a lot of ways, I actually feel like us open sourcing Llama puts us in a position where we're able to educate policymakers. We are able to engage with them and help them understand the value of why technology like this should continue to stay open and not just be proprietary. Can you just distill down sort of the various measures that are in place to prevent the misuse of models like Llama? There are probably three dimensions in which we think about safety.
The first dimension is really around the choice and the control that we want to give developers to do the right thing. The second dimension is that safety is a system problem, not just a model problem, and so we approach it from that lens. And then the third one for us is that safety is an end-to-end and ongoing process. So we start all the way from when you're starting to think about what Llama 4 is going to look like.
It's at that stage of planning, all the way through the development process of the model, to its integration into our products when it finally ships in our consumer products. So this is one of those things that you don't just bolt on at the end. You have to be thinking about it very holistically. Maybe I'll zero in on the second one, because I think that's probably the most critical choice that we have made here. It's this notion of saying these large language models are powerful.
And so trying to pack all of that safety into just the model is going to be incredibly difficult, and it's going to make the model very hard to keep flexible and steerable for the types of use cases that you want developers to be able to pursue. So we bake a fair amount of security and safety into the core model itself, but then we also release what we call Llama Guard systems. There are a bunch of them; I'll pick one. There's an input and an output Llama Guard.
Let's say you're a developer and you're building an app that's aimed at college tutoring. You want the model to provide a certain type of response, because you're dealing with young people, and you want the model to be able to address that. Let's say you're another developer who's building a college dating app. Then you want the tone and the responses of the model to serve that need as well.
You can't bake all of this into just the model and say, okay, you're going to do this. So instead, what we do is give you these input and output filters. So based on your use case, you can then say, okay, what are the bars and the guardrails that I want to set within my context? And so every release of Llama comes with these guard models. And not only this, we also have cybersecurity evals.
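Here's a rough sketch of that input/output guard pattern. The guard_classify stub stands in for an actual Llama Guard call, which uses its own chat template and safety taxonomy; everything named here is an illustrative assumption, not Meta's exact API.

```python
# A rough sketch of input/output filtering: a guard model screens what goes
# into, and comes out of, the main model. Both functions are illustrative
# stubs, not Meta's actual Llama Guard interface.
def guard_classify(text: str) -> str:
    """Stub: a real deployment would run Llama Guard and parse its verdict,
    e.g. 'safe' or 'unsafe S<category>', per your configured policy."""
    banned = ("build a weapon", "steal")
    return "unsafe" if any(b in text.lower() for b in banned) else "safe"

def guarded_chat(user_prompt: str) -> str:
    if guard_classify(user_prompt) != "safe":   # input filter
        return "Sorry, I can't help with that."
    reply = main_model(user_prompt)             # your tuned Llama model
    if guard_classify(reply) != "safe":         # output filter
        return "Sorry, I can't share that response."
    return reply

def main_model(prompt: str) -> str:            # stub for illustration
    return f"Helpful tutoring answer to: {prompt}"

print(guarded_chat("Help me study for my calculus exam."))
```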
So for all of the core set of use cases where you want the model to be useful, but at the same time you want to be able to protect it, we give you the tools and the systems. And because Llama also has a very rich ecosystem, we work with a lot of cloud service providers, which is typically how a lot of developers experience and build on top of Llama. They also have access to these guard models, so they deploy them too.
And so that approach, I think, again makes it very unique, because it's only something that you can do in this kind of open source model, where it's not just an API that is making the choices. You actually get to make the right set of choices for your use case and for your user base, right? So that's sort of the approach that we take. Yeah, I think that makes a ton of sense, right?
You've got this like fungible intelligence, but then you've got these other primitives around it that filter the inputs and the outputs coming from it where you can exert control and lay those guardrails as you outlined. I have to ask you maybe a follow up question to that, which is in the background of this entire conversation, right? There's the fact that in the past, Meta's been challenged for its business practices, right?
Like, it's been accused of everything from fricking influencing elections, to inappropriately using personal data, to spreading misinformation and disinformation, all while pushing engagement to draw in ad dollars. I think it's fair to say that some folks have expressed suspicions about Meta's AI ambitions. What's your answer to anyone who might ask: why should we trust Meta with our AI future?
Yeah, look, this is an important question, and it's something that we obviously take very seriously. Because ultimately, I think all of us who are at Meta believe that when you build compelling product experiences, and you build product experiences that give people choice and control, you earn their trust. Without their trust, you're not going to be able to build something that is enduring, something that people keep coming back to.
Obviously, we've made our fair share of mistakes, but we also work through them. And so when we think about the investments that we make in AI, and the investments that we make in GenAI in particular, we pay a lot of attention to making sure that we follow best practices. And with Llama itself being open source, the community now also has visibility into how we're building Llama and how we're deploying Llama.
And that should ideally build back more trust in Meta itself, as a player in the community that is committed to making sure that there's transparency, a lot more choice, and a lot more control over the experience. Because ultimately, trust is going to be first and foremost when you use these AI applications. They're going to be powerful, and you want them to be trustworthy.
As you think about future iterations of the models, what comes to mind as the sort of North Star use case for where you're taking Llama? Yeah, look, this is a hard question, because my North Star use case, if you'd asked me two weeks ago, would be, hey, can my kids talk to my mom in a language that each of them understands? And now we have that. At the rate at which this technology is evolving, I'm feeling like my North Star use cases have to be more like Ursa Minor use cases or something.
Exactly. Exactly. Because I actually do think there are a lot of things that we showed, even in some of our Orion demos, where the seamless integration of augmented reality and these glasses with AI is really going to create things that we've only seen in movies. To be able to have a holographic conversation with someone, that's still stuff straight out of the movies. And so I'm like, okay, that would be really cool.
So those are maybe the more, not just two-week North Stars, but the longer-term North Stars. I think it's just going to be such a huge advantage for you all, having these billion-user surfaces where people are going to be engaging with this stuff every day, and then this thriving open source community with enterprise partners, but also indie hackers. People listening to this may not realize how easy it is to just go download LM Studio and start running Llama locally.
And it's kind of wild. Things that felt so out of reach just a year ago are incredibly accessible and running on my freaking MacBook Pro. I want to ask you, given Meta's really unique position at this intersection of social media, open source AI, and really global communications: what are you excited for over the next decade, and what keeps you up at night? Yeah. What am I excited about, and what keeps me up at night?
They may be just two sides of the same coin, as these things tend to go. What I'm excited about is, when you think about Llama and Llama's own journey, we think about this in terms of how do you build a system that is capable of speaking and understanding all of the modalities in which humans understand. So, sort of this universal set of modalities, starting with text, images, videos, and who knows what else, you know, even being able to understand a reel as a native format.
Humans keep coming up with new media of communication, so you want these models to be able to understand at that level and to be able to generate content at that level. So that's sort of one dimension. The second dimension is, this technology is going to have the capacity to think about and reason about and plan about things the same way humans do. And we're still at the very early stages of what these models can do. And so over time, they're going to have that capacity.
And then the third piece is, a lot of the models today are really focused on what I would call static generation. So you give it a prompt, it gives you back some content, and you have to go do something with that. But over time, they should be able to act on your behalf. And this action can be not only in the digital domain, where you're able to entrust them to go take care of things: hey, you know, my daughter really wants to go to this concert.
I have to keep hitting refresh when the tickets go online. Can you just like take care of that for me? Right. Going all the way to then being able to take actions in the real world and the physical world. So you combine all of these, being able to understand and communicate in any medium, being able to have the full capacity of the human brain in terms of, you know, being able to do long term planning and reasoning. And then to be able to act, that is incredibly powerful technology.
The flip side of this is, you want to make sure that this technology is developed with the right sort of connection to what humanity's needs are. And as you build it, you're thinking about things like responsibility and safety, and you make sure that you bring the right set of guardrails around it, so that you evolve the technology in lockstep with what we as humans would want to experience on a day-to-day basis.
And for me, especially given that Llama is at the forefront of all of this, and our team is responsible for bringing this technology to the world, dude, that's what keeps me up at night. Alright, so it's encouraging to hear that one of the key people involved in developing and releasing Llama to the world spends a lot of time thinking about how to build a system that is responsible and safe. Because I find it exciting that a company like Meta is giving away a largely open model. It's a crowded space.
And with closed-source models from Google, OpenAI, Microsoft, and Anthropic vying for the top spot, Meta's openness is refreshing. It gives an enterprise that doesn't want to give its proprietary data away a path to a custom-built and owned solution. And it also gives scrappy indie hackers building a small product on a mobile device a path that won't bury them in massive GPU costs.
This openness, combined with its vast user base across WhatsApp, Messenger, Instagram, and Facebook, positions Meta uniquely to build a seamless bridge between the physical and digital worlds. In fact, Meta is very much at work on this augmented reality future. The company revealed a pretty advanced, if still pretty clunky-looking, pair of AR glasses that could very well be where all of this is headed next.
But we're still years away from walking around with Tony Stark, Iron Man-style heads-up displays. And in the meantime, while we should always hold Meta accountable, their move suggests a commitment to the public good. This balance of caution and optimism is crucial. As we question Meta's intentions, let's also acknowledge the potential benefits of their open approach. It's a nuanced perspective, but one worth considering.
Meta's openness might actually be the key to unlocking a more inclusive, community-driven future, one where AI enhances our lives without sacrificing our agency. The TED AI Show is a part of the TED Audio Collective and is produced by TED with Cosmic Standard. Our producers are Dominic Gerard and Alex Higgins. Our editor is Banban Cheng. Our showrunner is Ivana Tucker, and our engineer is Aja Pilar Simpson. Our technical director is Jacob Winik, and our executive producer is Eliza Smith.
Our researcher and fact-checker is Christian Aparta, and I'm your host, Bilawal Sidhu. See y'all in the next one.