Hello and welcome to Decoder. I'm Nilay Patel, Editor-in-Chief of The Verge, and Decoder is my show about big ideas and other problems. Today I'm talking with Thomas Dohmke, the CEO of GitHub.
GitHub is the platform for managing code. Everyone from solo open source hobbyists to the biggest companies in the world relies on GitHub to maintain their code and manage changes. GitHub started as an independent company, but it's been owned by Microsoft since 2018, which makes this a perfect Decoder episode, since I have a lot of questions about that structure. Thomas and I talked a lot about how independent GitHub really is inside Microsoft, especially now that Microsoft is all in on AI, and GitHub Copilot, which helps people write code, is one of the biggest AI product success stories that exists right now. How much of GitHub's AI roadmap is tied to Microsoft's AI roadmap? How do those resources get moved around? And since GitHub is used by all sorts of companies for all sorts of things, how does Thomas keep all those customers feeling secure that Microsoft isn't just trying to pull them toward the services it prefers, like Azure or OpenAI?

Thomas had some surprising answers for all of this. Like any good Microsoft executive in the Satya Nadella era, he told me that the company's strength is working well with partners, but he also insisted that tech isn't a zero-sum game, and that one company winning doesn't mean another has to lose. You'll hear him tell me that he enjoys competition, and that if there was just one option, just OpenAI or just Meta's Llama, for example, that would be like a sport with just one team in the league. Of course, I also asked Thomas about AI, and whether our current AI systems can live up to all this hype. After all, he's got a front-row seat. GitHub Copilot is a successful AI product, and he can see what people are using it for, but he can also see what people are building across GitHub. I think his perspective here is pretty refreshing. It's clear there's still a long way to go. Okay, GitHub CEO Thomas Dohmke, here we go.

Thomas Dohmke, you are the CEO of GitHub. Welcome to Decoder.
Thank you so much for having me. Big fan of the show.

I appreciate that. There is a lot to talk about. There are a lot of Decoder questions to answer about how GitHub works inside Microsoft, and how Microsoft works in the industry. GitHub is everywhere in the industry. Let's start at the very beginning. I feel like part of the Decoder audience is intimately familiar with GitHub. They're probably living in it every day. But for another part of the audience, it's a bit of an enigma. Just explain quickly what GitHub is and what it's for.

GitHub is where most of the world's developers are building the software of today and tomorrow. It started as a place to store your source code in a version control system called Git, which is where the name comes from, and which was actually invented by the Linux kernel team in 2005, about two years before GitHub was founded in 2007. In the meantime, it has not only become that place where people store their open source code, but it's also used in 90% of the Fortune 100, every big and small company in the world, storing their private source code and collaborating together. That's really what I think GitHub is all about: we're using code as a primitive to collaborate across the world.

When you say code is a primitive to collaborate across the world, do people actually code in GitHub? Is it just version control, a repository? There's some blurriness there, especially with some of the news you have today.

It used to be just repositories. That's how it all started. It's actually fun to go to the Wayback Machine and look at the first GitHub homepage and how Chris and Tom and PJ, who founded us, basically described all the changes. The front page was like a change log, effectively.
In the meantime, we also have issues, where you can describe your work, you know, bug reports or feature requests. Planning and tracking is what we call that area. We have something called GitHub Actions, which lets you automate a lot of the workflows. We have GitHub Codespaces, which is a whole dev environment in the cloud. You don't even need a laptop anymore. You can just open that in the browser on your smartphone or on your iPad. You have VS Code in the browser, which is a popular IDE, or editor, I guess. You can start coding right there without ever having to install dependencies and all the libraries and all the toolchains. It's just an environment that you can leverage and then submit code back to GitHub.

I feel like I have to ask you how many people are coding in a browser on their iPhones in GitHub.

You know, more than you would think. Obviously, it's not the main way of writing software, but you can imagine a scenario where somebody pings you and says, hey, can you quickly review my pull request, which is the way developers collaborate. I make a change to the code base and I send you a pull request, and then you review it and say, yeah, this looks good, I approve it, and then you deploy it to the system. That definitely happens, that people use the browser and the GitHub mobile app on the bus, you know, on their commute to work or back from work, to just quickly review what somebody has done, you know, a small typo or maybe a bug fix or an update or something like that, and then just click approve and it goes on from there.

In fact, you know, at GitHub, we use GitHub to build GitHub, and so, for example, when one of my employees wants access to Salesforce, they have to send a pull request against an entitlements file, and then, depending on where they sit in the organization, I might be the approver, and so I often do that on my phone. So it's code, not in the sense of I'm writing a lot of code, but it's definitely code in the spirit of I have a file with a diff, and I compare the two sides against each other and say, okay, this looks good, let me approve this, and then it goes through the process.

But you manage enterprise approvals in code, in GitHub, as opposed to some horrible enterprise software.

Honestly, I feel like that might be better compared to the horrible enterprise software that those people have to use, but that is astonishing.

I think we have a blog post on this. We can Google it later and see if we can find it. It's called entitlements, and it's basically a repo that has a file, you know, with all our user names. At GitHub, almost everybody identifies with a GitHub handle. So I'm ashtom, and often, you know, we speak about each other with our handles and not with our real names. And those files have the user handles in them. And then, you know, once you do that, you have all the benefits of software processes, right?
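As a rough sketch of how an entitlements-as-code check like this might work, a CI job could validate the file and derive the approver from the requester's position in the organization. The file format, the org chart, and the handles other than ashtom here are invented for illustration; this is not GitHub's actual entitlements schema.

```python
# Hypothetical sketch of an entitlements-as-code check: validate the
# entitlements file and pick an approver from the requester's position
# in the org chart. Schema and data are invented for illustration.

ENTITLEMENTS = {
    "salesforce": ["ashtom", "octocat"],  # GitHub handles granted access
    "sentry": ["octocat"],
}

ORG_CHART = {  # handle -> manager's handle
    "octocat": "ashtom",
    "monalisa": "ashtom",
    "ashtom": None,  # CEO, no manager
}

def validate(entitlements):
    """Check the file is well formed: known handles, no duplicates."""
    for system, handles in entitlements.items():
        unknown = [h for h in handles if h not in ORG_CHART]
        if unknown:
            raise ValueError(f"{system}: unknown handles {unknown}")
        if len(set(handles)) != len(handles):
            raise ValueError(f"{system}: duplicate handles")
    return True

def approver_for(handle):
    """The requester's manager approves the change; the CEO self-approves."""
    manager = ORG_CHART.get(handle)
    return manager if manager else handle

validate(ENTITLEMENTS)
print(approver_for("octocat"))  # ashtom
```

In a real setup, checks like these would run as a required status on the pull request, so a malformed file or a missing approval blocks the merge.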
You can run test cases and see if the file is properly formatted. You can see, you know, where that person sits in the org chart, and so who needs to be the approver. You can check, you know, automatically how to grant that access, and then ultimately grant it. So in many ways, it's much easier to do that, especially if you're a developer already; you know how to modify a file and send a pull request. But yeah, we have our sales team do that. We have our legal and HR teams do that. In fact, you know, our legal team has for the longest time managed all our terms of service and privacy policy and all that in a public GitHub repository, and so everybody can see what changes we're making. So it's utterly transparent, sometimes in a good way and sometimes in less good ways, when people are debating why we are making these changes to cookies and other things. But it's a good thing, if you think about it, for legal texts to have a diff in the same way that you want to have one for code.

I ask a lot of enterprise software CEOs to come on the show, and they often say no because they know I will ask them if they use their own software. It seems like you've passed that test with flying colors. When people think about GitHub, it is expanding over time, right? It started as version control, software as a service to do version control with Git. You've expanded out into now you can actually code in GitHub. You can apparently run an entire large enterprise inside of GitHub. Where do you want that scope to end? Do you want it to actually become something like VS Code, that developers log into and do all of their work in? Are there boundaries, or stuff you don't want to do?

All of the work? I think in software development that never actually works. You know, if you think about the plethora of tools that developers use, right? From an operating system, and a containerization solution like Docker and Kubernetes, to the programming language, you know, all the tools that come with the programming language, like the compiler and the debugger and the profiler and all that, the frameworks. And of course, a lot of the open source is coming from GitHub, but it's not provided by GitHub. It's stored as open source in GitHub, and you find the readme and you consume that project. And then as you go through, you know, what we call the developer lifecycle, on the tail end is, you know, monitoring, data collection, telemetry, exception tracking, policies, you know, making sure that all data is stored within a data governance framework, all the way, you know, to security scanning. There's never a world where one vendor will offer all of that. And so we see ourselves as, you know, one planet, a big planet, in a large universe of software development tools. And it has always been important for GitHub to have APIs and webhooks and integration points for these partners to actually build that end-to-end workflow that developers want, and to give them the choice. And, you know, whether you're in the Python ecosystem and you want to use PyPI and VS Code, or whether you're in the Java ecosystem and you want to use JetBrains and Maven and other tools like that, GitHub is there for you, to help you collaborate as a team. We see ourselves as the center of collaboration. You may say, you know, the creator network or the social network of coding. I think, you know, for some time our tagline on the homepage was "social coding." And it's a very special creator network because, you know, most creator networks are financing themselves by advertisements and things like that. And you create communities around the creator with comments and things that, you know, help you engage with the community. In GitHub, it's still code. And I don't think anyone would want us to put banner ads on GitHub, even if that would provide a revenue cut back to the owner of the open source project. So we are constantly also evolving our thinking around that.

This is going to bring us, inevitably, to AI. GitHub has a lot of AI in it now. GitHub Copilot is a massively successful product. You have some news. You've announced something called GitHub Models, which I want to talk about. But I just want to stay on that vision of GitHub as a social platform or a creator network. Most other creator networks don't launch tools that let you make the thing that the creators are making, as a first-class citizen to the creators themselves. Like, Instagram is not making an AI photo tool that lets you publish photos that build explicitly on the photos that Instagram influencers have published, and then presenting those AI photos in a first-class way against them. That would be a weird thing for Instagram to do. That is more or less exactly what GitHub is allowing people to do. Copilot lets you generate code using AI, and you can present that code right back to the GitHub audience. Do you see that as being an odd dynamic, a new dynamic? Is that going the way you want it to?

It's a good question. I think back to the origins of GitHub: while we allowed you to store source code, in some ways that always spurred creation.
Once you have a file, especially in a public repository, an open source repository, we allowed somebody else to fork it and modify it. There was some type of creation there, in the sense that you're taking something existing, you're allowed to copy it into your namespace, and then you can modify it. Nobody forces you to say, when you fork my repo, that you send me back your changes. You can just keep them for yourself. We had an editing view, obviously, within the UI, a very basic HTML text box, for the longest time. And then, four years ago, we started working on Copilot. This was GPT-3 back then, and no ChatGPT was on the horizon. Generative AI was a very inside topic in the tech industry, but certainly not a top news topic that is reported on every single day. In fact, in all my customer conversations, AI was like five minutes, and then 55 minutes was something called DevOps, the developer lifecycle, agile development, and those kinds of things. But I think the original motivation was the same as GitHub's, which is: how can we make developers more productive? How can we make them more collaborative, and ultimately, how can we increase their happiness? In some ways, I always like to think about this, albeit very internally motivated, as just making our own developers faster, as we are always running out of time to implement all the ideas we have. If I look at my backlog, we have a huge repository of issues that somebody has filed over the last 15 years. There are some from 2015, 2016, great ideas that we just didn't get to yet. I'm running out of lifetime faster than GitHub is running out of ideas of all the things that we could do to make the platform better. The idea here was: how can we make developers more productive? How can we make our own developers more productive, so they can implement things a little bit faster, so we get sooner to the future that we envision?

When we think about that lifecycle of the developer, so much of what we traditionally thought of as software engineering involves talking to other people, asking questions, searching for answers. I have a lot of engineer friends who say, I spend half of my time just looking for the code that I need to implement, and then the other half trying to implement it. That's gone away in some capacity with AI, right? Platforms like Stack Overflow, which were a huge social community for developers, are seeing drops in the rates at which people use them. You see that in other places as well. Do you see that as the natural outcome of AI, or do you see a way to bring that social interaction back to the forefront?

I think the first thing that comes to mind there is that there's totally a democratizing effect of having your copilot within your editor, and you can just get started. It's easy to see that when you look over the shoulders of kids trying to build a little game, which many kids nowadays do at age six or seven, as they grow up with mobile phones. When their parents go to dinner and they want a quiet minute, you can observe, in any restaurant around the world, that scenario: a family with a three-year-old holding an iPhone or an Android phone and watching a video. Soon enough they are into Minecraft and other games, and thereafter they want to create, because that's what we do as humans. And then how do we get started? Stack Overflow is great, and I don't think Stack Overflow will go away, but you have to know that AI even exists, and who tells you that as a six-year-old, when you live in a household where the parents are not computer scientists themselves? I think the copilot will become ubiquitous enough, and here I use copilot as the category term, whether it's ChatGPT or other products doesn't really matter, where you can just say, hey, I want to build a game, a Pong game or a Snake game or something easy to start with, and it gives you an answer, and it already does link you back to where some of that answer comes from. As such, the social network gets a new feeder, where you can learn more about the answer if it doesn't solve the problem already.

But I think we are going to see more in those chat interfaces. Just a couple of minutes ago, I was in a call where somebody brought up the example: if your mom goes to Photoshop today and she wants to replace the gray sky with a blue sky, that's probably hard, because figuring out how the user interface of Photoshop works, if you're not a pro, is incredibly complicated. If you can just say, hey, replace the gray sky with a blue sky, whether it's a prompt that you're typing or actually literally speaking to a computer like Scotty in Star Trek, it's going to open up a whole new world of creation. And then, you know, typically you create something to share with others, right? That's how humans interact. I think it's actually changing how the creator economy works. It will open this up to so many more people. And if I bring that back to coding: this morning, I woke up with an idea, and then I realized, well, I have this podcast today and I have customer meetings and I have all the other things in my role as CEO. I don't have time to start a new project. What if I could just go to Copilot and start saying, hey, I want to build this app to track the weather, here's an API I can use, and I iterate over this in an hour and a half and build something as quickly as building a Lego set. I think that's the tooling change that we're going to see.

If you pull that thread out all the way, maybe you don't need to know how to code at all, right? You're just instructing the computer to do some task, or produce some application that can do some task, and you just evaluate the end result. Is that the endpoint for you, that people use GitHub who don't know how to code at all?
That endpoint already exists, right? There are low-code, no-code tools, like Retool or Microsoft Power Platform. But they don't have a natural language interface where you're like, make me an app that changes the color of the sky. You're not quite there yet, but we could be very soon.

Well, the Power Platform does. I haven't checked Retool recently, but I would be surprised if they're not working on that, at least as an assistant to get started. But I think the way this will work is that you have a spectrum of knowledge, where you can probably build a web page without knowing anything about HTML and CSS, as you can do in Scratch and many other tools, and could do for the last, I don't know, 20 years or so. But code still exists as the underlying deterministic language, right? Human language is incredibly non-deterministic. I can say something and you can say the same thing, and it means two different things. Code is deterministic, and code effectively is just an abstraction layer on top of the processor and the operating system that runs your machine, right? And the processor itself today, the CPU or the GPU, runs machine language, an instruction set. Code is just the next layer, and now we're moving higher. But that doesn't mean those layers have gone away. When we invented programming languages and replaced assembly, and before that punch cards, with code, those layers continued to exist. And I think it depends on what you're working on, whether you're going down the abstraction stack or whether you're staying at the higher level. The professional developer will know both layers, I think. The professional developer will have to know code, will have to understand the laws of scaling and the intricacies of programming languages, security, vulnerabilities, all those kinds of things. And they're going to leverage natural language to get the job done faster, to write boilerplate, to write test cases, all these kinds of things. So I think it's going to be a mix of these things, and we're going to sit on that spectrum and move back and forth. And that makes the technology so powerful, because if you're a learner, and today maybe you're in an IT role and you're only working with a no-code, low-code tool, you now have the same user interface, natural language, to move up that stack and ultimately become a pro-code developer.

So I think that brings me right to the news you announced recently, which is GitHub Models, which allows people to play with various AI models right inside of GitHub. Explain what that is exactly, because it feels like you're describing something that leads right to: you're going to play with AI models directly in GitHub.

What has changed over the last couple of years is that now models themselves have become a building block for software. It used to be code, both in the front end and in the back end, and before that we didn't even have a back end; we just built an app that ran on a PC, or even before that on a Commodore 64 or an Atari.
That didn't have a back end, because there wasn't really an internet at that time. We moved from building all of this yourself to using open source libraries as building blocks in your application. And in the last few years, we have increasingly talked about the full-stack developer, who is able to build back-end code and front-end code and all the things in the middle, deploy to the cloud, and manage the operations of that cloud service, being on call all the time.
Now what has changed is that we add models to that picture, and most modern applications that are being worked on right now have some form of AI integration, whether it's a simple chatbot or it's using a model to predict anomalies and whatnot. And so for a while now, we have been thinking, okay, GitHub offers the code and offers the open source projects, but we're missing the model as a building block. And so we're adding these with GitHub Models, in partnership with Azure AI. We're starting, you know, with a bunch of models, including those from OpenAI and Microsoft, of course, but also from Meta and from Mistral and Cohere and a couple of other partners. A nice mix of open-weights or open models, and some of them also open source, but, you know, that is a debate in itself: what do you call these models where the weights are open and the source code is not? And then, of course, commercial models like GPT-4o mini, which was just recently released.

So it allows you, on GitHub, with your GitHub account, to just play with these models. You can send prompts and you get a response, and you can ask, you know, about Shakespeare, and about coding, of course. And then you can change the parameters of the model, those that are sent during inference, like, you know, how long is your context window, or how high do you want the temperature, meaning how non-deterministic do you want the answer to be. And as such, you can start experimenting with these different models, you know, find one and bring it into your editor, into your codespace, and prototype an application, and you don't have to sign up for another account, and you don't have to worry about, you know, paying inference costs while you're doing that. You can keep that all within your GitHub workflow.

Is GitHub paying for the inference costs as part of the subscription you pay to GitHub?

So we offer, you know, the playground for free, with certain entitlements, as we call that, so a certain number of tokens that you can send per day, and then beyond that, you can sign up for an Azure subscription and pay for the overages. And of course, when you want to move to production, you definitely want to remove your GitHub token from the source code, the token that is tied to your personal account. In a larger organization, obviously, you don't want that, because the employee might leave the team or might even leave the company, and you want to move to, you know, a more productionized version of having a key or token within a key vault system, where it is stored, and then inference is done against that key and not against your personal token.
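As a minimal sketch of the pattern being described, the snippet below builds a chat-completion request where the token comes from the environment (or, in production, a key vault) rather than being hardcoded, and shows the kind of inference parameters mentioned above, like temperature and a token cap. The endpoint URL and payload shape mirror common OpenAI-style inference APIs but are illustrative assumptions, not GitHub Models' exact schema.

```python
import json
import os

# Illustrative sketch: prepare a chat-completion request for a hosted
# model. The URL and schema are assumptions in the OpenAI style, not
# the exact GitHub Models API.

def build_request(prompt: str, model: str = "gpt-4o-mini") -> dict:
    # Read the token from the environment instead of committing a
    # personal token to source code; production setups would fetch
    # it from a key vault instead.
    token = os.environ.get("INFERENCE_TOKEN", "<missing-token>")
    return {
        "url": "https://models.example.com/chat/completions",  # illustrative
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.2,  # lower = more deterministic answers
            "max_tokens": 256,   # cap the length of the response
        }),
    }

req = build_request("Write a haiku about pull requests")
```

Swapping the `model` name or the `temperature` is all it takes to experiment across models, which is the workflow the playground is meant to make frictionless.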
So when you think about what models you can make available to people, there are some open source models, or open-ish models, like the ones from Meta, which have open weights but maybe not open source code. Then there's obviously Microsoft's own models, and there's models from Microsoft's partners, like OpenAI. Is there a limit? Does Microsoft have a point of view on what models GitHub can offer and what models GitHub points people to? I'm just imagining Microsoft would like everyone to use their models and run everything on Azure, but that's not the reality of GitHub today.

I think, you know, Microsoft wants everybody to use the best model to build applications that are ultimately, hopefully, deployed on our cloud, and that starts on GitHub. And I think we want to offer choice, as a platform company that is almost 50 years old; I think next spring, the 50th birthday is coming up.
We have always offered that choice. Every time you report on a Surface launch, often there's also a number of partners there that announce their laptops, under their brands, with similar features. And I think in the model space, we think about it similarly: we want to offer the best models. We're starting with 20 or so top models with this launch, and then we'll see what the reaction is and what the feedback is, whether people want to, you know, add their own models to the list, whether they want to fine-tune these models, and what they actually use. I think that's a very interesting question, where, you know, we at GitHub love to just move fast, bring things out there, and then work with the community to figure out what's the next best thing we can build that actually solves the use case.

We have to take a quick break. We'll be back in a minute.
Welcome back. I'm talking with GitHub CEO Thomas Dohmke about the big debate over open source AI.

There's a big debate right now in the AI world about open versus closed. I think it's right next
to debate about we have to actually start building some applications to make money there's another debate about running it in the cloud versus running locally there's a lot going on where do you see that shaking as you build GitHub you probably have to make some longer term decisions that
predict how development will go right to architect GitHub correctly you have to say okay in two years a lot of applications will be built this way maybe using open source models maybe everyone's just going to use open AI's API whatever how do you see the trends going right now because the
debate is raging you know one interesting statistic I can share with you then the last year 100,000 over 100,000 AI projects have been started on GitHub open source I can't track this close source one because obviously we could not looking into private oppositors it's 100,000 open source AI
repositories have been started in the last year alone, and that's up by a magnitude from what we saw before ChatGPT. So I'd say the quantity absolutely will be in the open source space, as it has been in software for the last two decades. Open source has won; there's no question anymore that the most successful software companies all use open source in their stack. They're running mostly Linux on the server and in containers. Whether it's the Python ecosystem or the JavaScript and TypeScript ecosystem or the Ruby ecosystem, all these ecosystems have large ranges of open source libraries, and whether you start a new project in a large company or at a startup, you're pulling in all these things, like React; it's a thousand or so dependencies just by starting. And so I think, if you just look at where open source has gone, I would predict the open source models, or the open weights models, will play a very important role in democratizing access to software development. It's just so easy to get started and not worry about inference costs or license costs and those kinds of things.

And then the other pole of this is the commercial models that try to be the best models on the planet at any given point in time. They offer a different value, which is that you can get the best model, but you have to pay a vendor or a cloud provider to run inference on these models, and you don't get access to the weights, and you don't get to see what happens inside these models. So I think those two polarities will continue to exist, and nothing in tech is really a zero-sum game. That's just how we like to think about it in our heads; we like to think about everything like a sports competition where our favorite team should win: our favorite phone, our favorite operating system, our favorite cloud provider. But then a new season starts. With mobile phones it's often in the fall, when Apple launches a new iPhone and then everybody else is following on, and there are the tech conferences that kind of determine the rhythm of model launches right now. A new season starts and the competition starts anew, and I think that's actually fun, because you wouldn't want to watch your favorite sport with just one team in the league or just one team in the championship. You want different teams competing against each other, and you want to see how they can play the infinite game. In the season they play the finite game; they want to win the season. But in the long run they play the infinite game; they want to have a legacy. They want to play Minecraft as much as they play Super Mario.

It is interesting to think of OpenAI as Minecraft and Llama as Mario. I'm not sure where that metaphor goes, but I'll leave it for the audience.

It would be the other way around; I think Llama would be Minecraft, because it's more open world.

But inside of that, with Llama right now,
Meta's claim is that it is as functional as the closed source frontier models, right? It's matched the performance; it's matched the capabilities. If that keeps happening, just from what you're seeing now, do you think the open models will win? Because you have to be much better to be closed and paid versus open and free. You have to deliver some massive amount of additional value. Just based on what you're seeing in the developer ecosystem, do you think that's going to play out?

The Llama model isn't free, in the sense that you still have to deploy it to GPUs and run inference, and that's most of the cost that you get for OpenAI's models today as well. If you look at GPT-4o mini, the inference costs are now so small compared to just a few years ago on GPT-4, or even before that on 3.5 and 3, that you really have to look at inference cost as the differentiator, and not license cost in the sense that you'd have to pay OpenAI an additional license on top of that. So I think the models will be commoditized, in the sense that the chips in our laptops are commoditized. That doesn't mean that Nvidia isn't a great business (it clearly is, especially in the last year), but it doesn't matter as much for a consumer what chip is running in your laptop. I buy a new iPhone every year, and there are certainly people in the tech industry that do want the latest chip and the latest feature, but the majority of consumers and enterprise users do not actually care about that compute layer at the bottom, in the same way they don't care whether you're running a SaaS product on a certain CPU type
or a certain VM type, or whether you're using a Kubernetes cluster. That's a tech question, and maybe an operating margin question for the provider, more so than a question for the user of the product. And while the benchmarks are getting close between those two, from our perspective the GPT line still has an advantage; that's why we're using it in Copilot. But I have the freedom to move to a different model; my management at Microsoft is definitely encouraging me to look into all the opportunities to provide the best product to my customers. And in the same way, to keep going on my metaphor one more time, we have laptops with Intel chips and with AMD chips and now with Arm chips, and the customer decides which laptop they want based on different things, like battery life. So I think there will be commoditization, but there's also differentiation between the different models, and it will be the typical question of: how good is it? How much does inference cost? How many GPUs do I need? How fast is it? How long is the token window? Do I actually have a mature responsible AI pipeline around that model, and does it fit my scenario?

You mentioned that you have the freedom to choose models. In addition to letting people build on these models, you obviously have deployed a significant AI application
in GitHub Copilot. When you evaluate its performance, its cost versus its value versus the switching cost to another model, how often do you sit and think that through? Are you set with it now on GPT, or is this something you're evaluating constantly?

We're doing it constantly. In fact, we are just doing it for 4o mini, which at the time of this recording had just launched, and we're looking at how it compares to 3.5 Turbo, which is the model we're using behind auto-completion. If you actually look at Copilot today, as it is deployed to over 77,000 organizations and more than 1.8 million paid users, it's multiple models that run for multiple scenarios. We have 3.5 Turbo for auto-completion because we need low latency and a fast response time with a decent amount of accuracy, because you're typing in your editor and you're seeing the proposal; that can't be for whatever you typed a minute ago. If you look at the original GPT-4 and how long it actually took to write the whole response, the streaming was a genius user interface design, because it obscured how long it actually takes to get the full response. But in auto-completion you can't have that; it needs to show you the whole thing relatively quickly, because otherwise you're just faster and keep typing the code that you wanted to type. So we're using a fast, small-ish model in auto-completion. In chat, we have a mix of 4 Turbo, and actually 4o has rolled out in the meantime. Then for newer scenarios like Copilot Workspace, we have been on 4o for a while, and we have compared 4o to other models to see where we get the best returns in terms of code rendered and changes made to the code base to solve the problem that Copilot Workspace tries to solve. So we're comparing, within the same model generation, the releases that we're getting from OpenAI, and we're also comparing these models against other open weights, open source, and private models that are accessible to us through Azure.

You have a lot of decisions to make. There are a lot of things swirling; obviously there's Microsoft to manage as well. How do you make decisions? What's your framework for making decisions?

I have two
frameworks that we closely follow at GitHub. One is what we call the DRI, the directly responsible individual. Often the first question we ask is, who's the DRI? And if we don't have one, we should have one. So we have one person in the company that runs the project, and if a decision needs to be made, ideally the DRI can make the decision by consulting all the stakeholders, or they can bring the decision to the leadership team and me, to discuss the ideas and where we want to go. The other framework is view, voice, vote, veto: basically deciding who in the group has what rights in the discussion. Do they have a view? Do they have a voice? Do they have a vote? Or do they have a veto? For different decisions to be made, you have different of these roles. Obviously, within the large framework of Microsoft I often have a voice, while in the framework of GitHub I often have a veto. Well, I hope at least I have one; definitely I have a vote. But honestly, I also often don't want to have a voice. I like to view things because I'm interested in just browsing through GitHub issues and GitHub discussions where the company is discussing things. When engineers are talking about the ups and downs of using React, as an example, I'd love to read all that stuff, because it helps me to understand what's happening, and also learn to a certain degree. But I don't need to raise my voice or even have a vote on that. I have a strong engineering leadership team and a strong set of distinguished engineers and principal engineers that can make those decisions and will be accountable for them within the DRI framework. What I like to tell my leadership team is: give me options, give me a set of choices I can make, and tell me about the pros and cons. But also, and that may be a bit of my German DNA, I often ask the question: what about the options that are not here? What are you not telling me? What are we missing? What am I not seeing in these options? I think it's actually more important to think about what's not presented here, and why are we not even
discussing those, rather than just picking between A and B. And, you know, let's be real: many CEOs and many leaders leverage experience or intuition to make decisions. Many small decisions can just be made without a document, without data. I'd love to be data-driven, and I love to look at data, especially when it comes to things like determining pricing or determining model updates, as we talked about earlier, and whether five percent is enough or not. But many other decisions are just a question of intuition, like the tagline for our conference. That's certainly a discussion, but then we decide on it with taste and intuition; you do not A/B test your shades of blue. The reality is, for most decisions you don't get to do an A/B test; your life doesn't have A/B tests. The price point that we set for Copilot, we are kind of stuck with until we make a decision to change it, but you don't really want to sell at $19 to some set of customers and a different price point to other customers (minus discounting, obviously); that doesn't really work. And the same is true when we made the decision to launch Copilot and put considerable resources within the company onto Copilot: it also meant we removed funding from other projects that we could have done. The reality is, any company is resource-constrained, even the largest companies. In fact, I think the largest companies' biggest weakness is that they're so big they think they can do everything, and the truth is they're still resource-constrained, so they still have to say no way more often than they can say yes. I think that's the thing I remind myself of almost every day: saying no is much more important than saying yes. And especially in this age of AI, it meant that while we invested in all these AI topics, like Copilot and Copilot
Workspace, we also made the conscious decision to leave some things behind that we won't invest in anytime soon.

You mentioned that you think of models as commodities, like AMD chips versus Arm chips or whatever. Have you architected your various systems so that if you wanted to make a big model switch, to Mistral or something, you could? Would that be very costly to do? Would it be easy to do?

The costly part is the evaluation test suite and the meta-prompt, the system prompt. You can kind of imagine, in Copilot as it sits in the editor, there are a lot of these system prompts for different scenarios. There's a different system prompt for summarizing a pull request versus one that auto-completes text, or one that helps you with debugging an error. These suites of prompts are very specific today to different models, and I actually think, as we move into the next year or two, it's going to become a competitive differentiator for companies to be able to plug and play different models while keeping the prompts relatively stable; today
we're not in that place, and there is a lot of work that goes into adjusting these prompts and running the offline evaluation. And again, I think almost any Copilot or Copilot-like system runs some form of A/B testing: once they have a new model, and they have done their offline evals and their responsible AI red teaming and all those kinds of things, they actually roll out to 1%, 5%, 10% of the population, and they look at metrics like the acceptance rate I mentioned before. We see whether this new population is actually getting better results, or worse results, than with the old model, and only if we have that confidence level do we go to 100%. I think that will enable us, in the near-term future, to move to new model generations faster than we can today.
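The staged rollout described here could be sketched roughly as follows. To be clear, the stage sizes, the tolerance, and the function names are illustrative assumptions for this sketch, not GitHub's actual pipeline:

```python
# Sketch of a staged model rollout gated on suggestion acceptance rate.
# The stages, metric, and tolerance are invented for illustration.

STAGES = [0.01, 0.05, 0.10, 1.00]  # share of users on the candidate model

def acceptance_rate(events):
    """Fraction of suggestions users accepted; `events` is a list of bools."""
    return sum(events) / len(events) if events else 0.0

def next_stage(current_pct, candidate_events, baseline_rate, tolerance=0.01):
    """Advance one stage if the candidate holds up against the incumbent's
    acceptance rate (within `tolerance`); otherwise roll back to 0%."""
    if acceptance_rate(candidate_events) >= baseline_rate - tolerance:
        idx = STAGES.index(current_pct)
        return STAGES[min(idx + 1, len(STAGES) - 1)]
    return 0.0  # halt the rollout; traffic returns to the old model

# A candidate matching a 33% baseline advances from the 1% stage to 5%:
print(next_stage(0.01, [True] * 35 + [False] * 65, baseline_rate=0.33))  # 0.05

# A clearly worse candidate at the 5% stage gets rolled back:
print(next_stage(0.05, [True] * 10 + [False] * 90, baseline_rate=0.33))  # 0.0
```

The real decision, as described, also folds in offline eval results and red teaming before any cohort sees the new model; this sketch only captures the online gating step.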
If one of your engineers came to you with an argument to switch to another model, what would the winning argument be? Would it be 5% more efficient? 10% less cost? Where would the metric be where you'd say, okay, it's time to switch?

5% sounds pretty good. 10% also sounds pretty good. It's on that order, right? Although, for a lot of things, it's a lot of cost for a 5% gain.

But you're saying 5% would be a winning argument.

I think the nuance there is that we are checking in offline evals for C and C++ and C# and JavaScript and TypeScript and Python and Ruby and Go and Rust.
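A per-language offline eval of that kind might be shaped like the following sketch. The scores, the helper names, and the "no language regresses" shipping bar are invented for illustration, not GitHub's actual criteria:

```python
# Sketch of a per-language offline eval for a model update: score each
# language independently, since in practice some improve and some regress.

LANGUAGES = ["c", "cpp", "csharp", "javascript", "typescript",
             "python", "ruby", "go", "rust"]

def compare(old_scores, new_scores):
    """Split languages into those that improved and those that regressed."""
    improved = [l for l in LANGUAGES if new_scores[l] > old_scores[l]]
    regressed = [l for l in LANGUAGES if new_scores[l] < old_scores[l]]
    return improved, regressed

# Pretend scores, e.g. fraction of generated samples that build and pass tests:
old = {l: 0.70 for l in LANGUAGES}
new = dict(old, python=0.78, go=0.74, ruby=0.64)  # a mixed result

improved, regressed = compare(old, new)
print(improved)       # ['python', 'go']
print(regressed)      # ['ruby']
print(not regressed)  # False -> fails a strict "no language regresses" bar
```

A real suite would compute the underlying scores from build success, test pass rates, and the line-count comparisons discussed below; the sketch only shows the per-language comparison step.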
So far I haven't seen a model update, even within the GPT line, where all the languages across the board are better from the start; some are better and some are worse. And we're looking at different types of metrics. Obviously a successful build is one of them: does the code actually build in the test suite? But also, how many lines of code did you get compared to the previous model or the competing model? If that number of lines goes down, the question becomes, well, is that better? Is it using a smarter way of writing that same code, or an open source library? Or did it get worse, where it still builds but doesn't actually create the right output anymore? So I'd say, if one of my engineers or data scientists comes to me and says, this model is better on everything across the board and we're saving half the GPUs, that seems like a pretty good deal, and I would certainly go into a deeper evaluation process and try to figure out whether it's worth going into the two handfuls of regions where we have deployed the model. Because we are running in different Azure regions with clusters of GPUs, just to have low latency. A European Copilot user is connecting to a GPU cluster in, for instance, Switzerland, the UK, or Sweden. I think if they're in Asia, they're hitting a GPU cluster in Japan, but then India is probably closer to the European clusters; they're going that way around the world. And then we have different ones in the US, and we're expanding almost every month to a new region to get more scale. So switching the model has switching costs across all
these clusters, and then we come back to the A/B testing question of how you do that so you actually have enough confidence that the offline evaluation is matched in the online evaluation, where people work with real code and not synthetic scenarios. The way I like to think about this: in web services, ever since the cloud became a thing, 99.9 or more in terms of uptime percentage has been the gold standard. Anything less than that, and you're going to be on Hacker News and on The Verge all the time, with people saying that startup XYZ or big company XYZ is down again and it's preventing everybody from getting to work. We have seen that both with GitHub and with other collaboration tools, like Slack or Teams and others. If Slack is down on a Monday morning, everybody is like, well, I guess I'm off work today. In the model world, that still applies, because your model has to have 99.99-whatever uptime, but the model quality, the response quality, can also dip, and you have to monitor that too. You almost have to run through the exact same process with your site reliability engineering team to say, okay, something is going wrong; what is it? Maybe the stack did an operating system update, a Patch Tuesday or something like that; maybe a network router changed. Oftentimes, when we now deploy GitHub in a new data center, the big question is, can the network bandwidth actually support our load, given the scale of GitHub as a social network? So all these things play a role now, not only in model uptime but also in model output, and all these questions come into play before we make the decision of
saying, okay, we are ready to move to the latest GPT model or the competing model.

I just want to point out that you started with "5% sounds pretty good" and you ended with 50% fewer GPUs, so it feels like the number may be a little bit higher than 5%. You mentioned Microsoft earlier. I feel like we've come late to the Decoder questions in this episode; we immediately went into the weeds. GitHub is part of Microsoft; the acquisition was made several years ago. You're a newer CEO of GitHub within Microsoft, and you were at Microsoft before. How is that structured now? How does GitHub work inside of Microsoft?

I'm coming up on 10 years at Microsoft in December, which I wouldn't have believed when I started at Microsoft, given that I came through a small acquisition myself, of a small startup called HockeyApp that got acquired in late 2014. I joined GitHub six years ago and then became the CEO three years ago. Today, GitHub is very much structured within Microsoft as it was when we acquired it in 2018. I was actually on the deal team, working with Nat Friedman and others on getting the deal done, and I joined GitHub that way.
We are a limited integration company, as Microsoft calls it. We have adopted some of the Microsoft processes; our employees get stock grants for Microsoft, and they vest that stock very similar to Microsoft employees. My manager is the president of the developer division, Julia Liuson, who also has all the Microsoft developer tools, like Visual Studio Code and Visual Studio, .NET, and some of the Azure services that are near and dear to developer workflows, like Redis and API Management and whatnot. She reports to Scott Guthrie, who runs the Cloud and AI division, and of course that way we are very much aligned with what the cloud is doing, and also with what the Azure AI platform team is doing, which we partnered with on the GitHub Models launch that we talked about earlier. But as the CEO of GitHub, I have a leadership team across the whole range of functions: an engineering leader and a product leader, a COO, a chief people officer, and a chief finance officer; I'm probably missing a chief of staff. We are working together as a company, not as a functional Microsoft organization, and as such I'm operating much closer to a CEO than a typical Microsoft engineering leader. I think that's a lot of fun; it gives me a lot of energy and a lot of motivation, as we can fully focus on GitHub and making GitHub bigger. Our goal, our winning aspiration, is to get to 1 billion developers on this planet, and hopefully they also all have a GitHub account. But most of the goal is to enable about 10% or so of the population, by the time we achieve that goal, to start coding, in the way anyone can start learning to draw an image or start playing the guitar. Literacy is, hopefully, a 100% aspiration for us as humans, and I think coding should go in the same direction; everybody should be able to code and explore their creativity. So, coming back to your Microsoft question, we obviously benefit a lot from the mothership, including the partnership with OpenAI, and the power of the cloud, and having GPUs available in
different regions, and the rest of the AI stack, and whatnot. At the same time, we get to focus on what makes GitHub unique in the industry.

You've said Copilot accounts for more than 40% of GitHub's revenue growth this year. Is Copilot revenue positive? Is it still a cost for you? Is it just helping you acquire customers?

The earnings call script said that in the last year, 40% of the revenue growth came from Copilot, and the run rate is now 2 billion. Run rate, obviously, is forward-looking, so those are a bit different metrics. We're really happy about the Copilot growth and where this is going, and Satya keeps sharing the number of organizations that have adopted Copilot. I think what has been remarkable is that it's not only the cloud-native companies, the startups, the Silicon Valley core, that have adopted Copilot; it's really the largest companies in the world, the traditional industries.

Right, but is running Copilot, for you, a cost center, or is it actually profitable? Because that's really the conversation across all of AI right now: are we actually using this to make products that make money?

We're very excited about where Copilot is today and where this is helping the GitHub business. That's my best right now.

You've been running Copilot, and you have a lot of feedback from your users. What are the biggest weaknesses in Copilot that you want to address?

I think the
biggest weakness for a product like Copilot, which was early in this generative AI journey... We launched Copilot, announced the first version, the preview, in June 2021; that was like a year and a half before ChatGPT came. And then we did GA in June 2022, still almost half a year before ChatGPT. Then ChatGPT came and changed everything, right? Until that point, we thought chat was not a scenario that worked well enough for coding. We were clearly very wrong on that, and clearly we then quickly moved to add chat to the Copilot portfolio and make it great for developer scenarios within the IDE, within the editor, because that allows people to have all the context that's available. The power of Copilot has always been that it knows what's in your file, so when it suggests code, it actually has the variable names and it knows what open source frameworks you're using. It actually looks at adjacent tabs, so when you ask questions to explain code, it not only looks at the lines of code you highlighted, it also looks at the context around them. Whereas if you copy and paste stuff into a generic chat agent, you have to collect that context yourself, or give it to the tool in the prompt. So I think that shows one of the weaknesses, which is that the world is moving fast, and you have to be really, really agile, and we don't know what's the next big thing in AI, in the same way that you would have had a hard time predicting in 1994 that Amazon would become the big tech company, the member of the Magnificent Seven, that it is today. It took them a decade or so to actually turn their first profit, and so it's hard in the tech sector to predict what's coming next, especially in this AI race. I think our biggest weakness is that we already have a large product in market with a large installed base, where moving fast is then a challenge in itself. We have the benefit of that installed base helping us to grow market share, and a tight feedback loop, but at the same time, every time we want to experiment, we have to balance between that experimentation and breaking things, and keeping the current customer base happy, both on the technical side but also in how we invest the engineers and the product managers and the designers that we have into one or the other, or both, actually. We have to take another quick
break. We'll be right back.

Indeed is a hiring platform that connects businesses with job seekers, with over 350 million global monthly visitors, according to Indeed data, and a matching engine that helps you find quality candidates right away. And it's not just fast. According to Indeed, 93 percent of employers found the highest quality candidates on Indeed compared to other job sites. Listeners of this show will get a $75 sponsored job credit to get your jobs more visibility at Indeed.com slash vox business. Just go to Indeed.com slash vox business right now and support our show by saying you heard about Indeed on this podcast. Indeed.com slash vox business. Need to hire? You need Indeed.

Hey, this is Scott Galloway, host of the Prof G Podcast. One of my favorite things I get to do on this show is hear from our listeners and answer their burning questions about all sorts of things, including work, life, school, you name it. And this summer we're bringing back the hits and covering some of our favorite questions and takes. We're talking business, career advice, and even hearing a brand new, never-before-aired interview about parenthood. So tune into the best of Office Hours, the special series from the Prof G Podcast, sponsored by Mint Mobile. You can find it on the Prof G feed wherever you get your podcasts.

Hey, Sue Bird here. Um, Megan Rapinoe. Women's sports are reaching new heights these days, and there's so much to talk about. So Megan and I are launching a podcast where we're going to deep dive into all things sports, and then some. We're calling it A Touch More, because women's sports is everything: pop culture, economics, politics, you name it. And there's no better folks than us to talk about what happens on the court or on the field, and everywhere else too. And we'll have a whole bunch of friends on the show to help us break things down. We're talking athletes, actors, comedians, maybe even our moms; that'll be a fun episode. Whether it's breaking down the biggest games or discussing the latest headlines, we'll be bringing a touch more insight into the world of sports and beyond. Follow A Touch More wherever you get your podcasts. New episodes drop every Wednesday.

Welcome back. I'm talking to Thomas Dohmke about how he balances GitHub's priorities with Microsoft's. Microsoft has a lot of CEOs under Satya Nadella, who is the CEO of Microsoft. When they hire someone like Mustafa Suleyman and make him the CEO of AI, do you have to
take a meeting? What was that like? "Hey, I already have one of the biggest AI applications in the world in GitHub Copilot. Can you help?" Describe that first meeting, that conversation.

The first time I actually met him was at a tech conference in Vancouver, because he had a talk and I had a talk and we literally ran into each other backstage. That was, I think, about a month after his joining was announced, and obviously the first couple of weeks in a large company like Microsoft are always stressful and many people want to meet, so I left him alone; we just ran into each other by chance and exchanged a couple of sentences. In the meantime, we have both been in what is the senior leadership meeting under Satya, the SLT meeting every Friday, talking mostly about AI topics, and I've also met with him and his team to talk about a question similar to the one you asked earlier: how do we get more agile on models, and how do we move faster on being flexible on the next model generation? What can we learn from the Microsoft Copilot? As you know, the GitHub Copilot was the
first one that we ever built, and as such there has been a continuous learning loop across all of Microsoft since the very early days of GitHub Copilot. There has been a monthly Copilot meeting, with 100-plus people across Azure, across the Bing team, across Kevin Scott's CTO organization, that have been in a loop on what we were doing in terms of building the Copilot, deploying the Copilot, commercializing the product, but also on what they are doing and how we can leverage the stack. I think the most fascinating thing is that Copilot is the first time, at least in my time at Microsoft, where everybody from the early days started on a joint, common stack: the Azure platform, the Azure AI services that are sold to third parties. So it's not like we built our own stack and Bing built their own stack, and then somebody came and said, well, we should really standardize on a new stack, and then everybody else in the future starts with that new stack, but all the old-timers are like, well, that's way too much effort to move to that new stack, so we keep striving on what we have.

Windows, right? Now, I just want to be very clear: you said that, not I.

But very early on, we identified that we need an Azure AI platform, and so
that team under Scott Guthrie started building that in parallel to Copilot, and before we went and made Copilot generally available in June 2022, we were already on that stack. We were already benefiting from responsible AI; my team is doing red teaming and collaborating closely with Sarah Bird's team, which runs the responsible AI team in the platform, but we are mostly relying on their technology, and we collaborate very closely. I think that's the new way of working at Microsoft, which we have benefited from greatly, even though we are independent and limitedly integrated.

Is there a set of things you would want to do that run counter to Microsoft's priorities, that you would not be able to do? I don't know; let me just do an example. There's no way you're
going to go use one of Google's models, right, to run Copilot — like, that seems totally out of bounds. Unless it isn't, in which case I have huge breaking news. Well, I'd say we haven't, you know, had that discussion, because so far we haven't seen the business case for that, right? But at the end of the day, you know, we're running GitHub as a business that contributes to Microsoft's earnings and supports the overall success of the business. As I mentioned earlier, you know, we're turning 50 next year and playing the infinite game — like, that's the goal for Microsoft. But the reason I'm asking is: as a limited-integration company inside of Microsoft — GitHub did start as an independent company — it has a different relationship to the developer ecosystem than even Azure does, right? Azure is a big, important part of the developer ecosystem, but Azure exists in a much more competitive environment than GitHub,
which people think of as almost a utility, right? It's there, you can use it, everyone uses it for everything — particularly the open source community. It is a focal point of a lot of things. It doesn't seem to have the commercial aspect that something like Azure might, but it's still a business, and sometimes its priorities and the needs of its users might run against Microsoft's desires. When you're trying to suss out where that is, how do you manage that tension? If I can make the business case — at the end of the day, if I can make a successful business case where I can show, you know, that we can generate revenue, that we have healthy gross margins and ultimately profit margins in the long run — I think, you know, anything is possible. Like, I would say never say never, whether it's Google or AWS or, you know, any of the chip providers. I don't think there's a
mandate that I couldn't do that. I think it's a much bigger question: can I do it in such a way that we're still achieving our business goals, as GitHub and as Microsoft, right? And as such, you know, while I'm the CEO of GitHub, obviously I'm an executive at Microsoft, and we need to have that, you know, One Microsoft thinking in the grand scheme of things, um, to grow the overall business, right? We're all tied to the mothership — whether it's, you know, Ryan at LinkedIn, the game studios, Mustafa in AI, or Thomas at GitHub — we're all part of Microsoft, and we're working with Satya and the SLT very closely to make Microsoft successful. But I don't think, you know, it is against Microsoft's DNA to partner. I mean, a classic example is Apple, where, you know, there has been tension in that relationship on and off — or, you know, there
have been winters and summers, I guess, in that relationship. But, you know, these days my iPhone is full of Microsoft apps, and I'm having this podcast on a Mac, and I use a Mac day in and day out. In fact, when I joined Microsoft in December 2014, Microsoft bought me a new Mac — my startup had Macs — and it was at the time, already under Satya, very natural to say, well, if you want to work on a Mac and that makes you more productive, then you're totally welcome to do that, rather than us forcing you to use a Windows PC. And now, coming back to your question, I think that
anything is possible, as long as it aligns with our strategy — where do we want to go with GitHub, what are the products we want to build. And, you know, the GitHub Models launch is actually a perfect example, right? Like, we do have Meta's model in there, and it's easy to argue that Llama is a competitor to Phi-3 and GPT-4. We have Mistral in there, with actually the latest Mistral Large model as well. And so I think, you know, we are open to being the platform provider that is both competing and partnering with, you know, sometimes the same company.
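The multi-model setup described here — Llama, Mistral, and OpenAI models side by side on one platform — is easiest to picture as a single request shape where only the model name changes. Here is a minimal sketch; the endpoint URL and model identifiers are illustrative placeholders, not GitHub's actual API:

```python
import json

# Illustrative placeholder; check the GitHub Models docs for the real endpoint.
ENDPOINT = "https://models.example.com/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions payload for any hosted model."""
    return {
        "model": model,  # swapping providers is just a different model string
        "messages": [
            {"role": "system", "content": "You are a helpful coding assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.2,
    }

# The same request shape targets competing providers on one platform.
for model in ["gpt-4o", "llama-3-70b-instruct", "mistral-large"]:
    payload = build_request(model, "Write hello world in Rust.")
    print(model, "payload bytes:", len(json.dumps(payload)))
```

The point of the sketch is the design choice Dohmke describes: because the platform exposes one common request shape, "competing" models become interchangeable parameters rather than separate integrations.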
I want to end by talking about not just AI broadly, but the communities on GitHub and how they feel about it. So let me ask you a question I've been asking sort of every AI leader lately. There's a lot of burden being placed on LLM technology, right? It came out, it had the moment, there's just tons and tons of hype. Everyone has bought as many H100s as they can — Jensen Huang is doing great at Nvidia. It's not yet clear to me that LLMs can do all of the things that people say they can do. Obviously they can run Copilot, right — you have built one successful application at scale that people really like, and you also have a view of what everyone else is building, because you run GitHub. Do you think LLMs can actually do the things that people want them to do? They can do a limited set of tasks, and I think as you define those tasks in a very clear box of what you want
the LLM to achieve — like, you know, auto-completion in Copilot is one such scenario — they can be very successful, right? Like, the reason that we started with auto-completion was not that we didn't have the idea of chat, or the idea of explaining code, or, you know, building an agent that does it all. It was that the model didn't do any of those scenarios at a sufficient, you know, success rate. And developers have very high expectations, so if you deliver a product that serves 60% of scenarios, you're not going to be successful, because your reputation is going to dive down really fast, whether it's on social media or on our own community forums and whatnot. And I think, you know, those scenarios have expanded over the last four years, from auto-completion to chat to test generation to helping you plan out an idea, create a spec, and
then implement that code — what we're doing in Workspace, which takes you from an idea, you know, to implementation without ever leaving GitHub, and the AI helps you at every step of the way. But, you know, what's important is there are points in that flow where the human needs to come in and look at the plan and say, yeah, that's actually what I wanted. And, you know, often I like to think about that in the same way as I think about the relationships that we have with our co-workers: how often do you, at The Verge, give a task to somebody and then ask yourself, how specific do I have to get, and how long do I want to go until I need to check in with them and see if they're on the path that I had in my head, right? I hear that comparison a lot, but I have to ask you — I never give a task to one of my colleagues at The Verge and assume that they will just make up bullshit at scale, right? Like, that's not how that goes. And with LLMs, the thing that they do is hallucinate, and sometimes they hallucinate in a correct direction and sometimes they don't, and it's unclear to me whether they are actually reasoning or just
appearing to. Like, there's a lot of things we want these systems to do, and I'm just curious if you think the technology can actually get to the endpoint, because it requires them to be different than they are today in some meaningful way. We believe, you know, at GitHub that the human will be in the center — that's why we call the thing Copilot, and we believe there has to be a pilot. Now, that doesn't mean that the copilot, you know, doesn't fly the plane at times — they do, right, in real life. And there are going to be scenarios where a large language model is scoped enough in the task that it needs to do — to fix, for example, a security vulnerability. We have that, you know, already in public preview, what we call Autofix, which takes a vulnerability and actually writes the fix for it. But then there's still that moment where the pilot has to come back and say, yeah, that's actually the fix that I want to merge into my repository. I don't think we're anywhere close to the pilot being replaced by an AI. And from a security perspective, in itself that is also a risk that companies probably are not willing to take on anytime soon.
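That approval gate — an AI may draft the fix, but only a human can let it merge — can be sketched in a few lines. This is an illustrative toy, not GitHub's Autofix implementation; the type and function names are invented:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProposedFix:
    """An AI-drafted patch for a vulnerability (invented example type)."""
    repo: str
    diff: str
    approved_by: Optional[str] = None  # set only after a human reviews it

def merge(fix: ProposedFix) -> str:
    # The gate: however confident the model is, an unreviewed patch
    # never reaches the default branch.
    if fix.approved_by is None:
        raise PermissionError("human review required before merge")
    return f"merged into {fix.repo} (approved by {fix.approved_by})"

fix = ProposedFix(repo="octo/webapp", diff="-eval(data)\n+safe_parse(data)")
try:
    merge(fix)  # blocked: no human has signed off yet
except PermissionError as err:
    print("blocked:", err)

fix.approved_by = "alice"  # the pilot takes back control
print(merge(fix))
```

The design choice is the one Dohmke argues for: the approval field is the only path to a merge, so "AI writes, human decides" is enforced structurally rather than by policy.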
An AI and an AI working together, merging code and pushing it into the cloud with no human involved — purely from a, you know, nation-state-actor perspective, or bad-actor perspective, that's a risk vector that nobody wants to take. There needs to be a human in the loop to make sure what is deployed is actually secure code and not introducing vulnerabilities or viruses and whatnot. And, you know, I think it's really a question of how big is the task where you can trust the LLM enough that it results in a productivity improvement. Because you can easily, you know, use an AI agent to change the background color of a web page, and it takes three hours of work when you could have done it in three minutes yourself. That's not a dishwasher, right — that's just a waste of compute resources and, ultimately, energy. And so I
think, you know, we're going to see progress, and I think we're going to see better agents and better copilots in the near- and long-term future. But I don't think we're anywhere near where we can replace the human with an AI, even at the more complex tasks — and we're not even talking about, you know, giving the AI a task like "build the next GitHub." I don't think that's coming in the next decade, even — but, you know, predictions are always risky. Yeah, well, I will have you back a decade from now and see if there's a GitHub AGI. The reason I asked "can LLMs do it" is: if the answer is they can — they can take all of the weight that we're putting on them — then maybe some of the costs along the way are worth it. If they can't, maybe those costs aren't worth it. And I specifically mean costs like how people feel about AI. There's a community of
coders out there who are very unhappy that GitHub has trained on their work in various GitHub repositories and built Copilot. If we think LLMs are going to get to the finish line, maybe it's worth it — maybe that pain is worth it. If it's not going to get there, we've just pissed off a bunch of customers. How do you evaluate that? Because I see creatives across every field — whether it's coding, whether it's art, whether it's movies, whatever it is — being really upset that these AI systems are being trained on their work. Maybe they're legally upset, maybe they're morally upset, whatever it is, and the outputs might not be worth it yet. How do you think about those customers specifically, and then the bigger problem of training and how that makes people feel generally? First of all, I think the outputs are definitely worth it already, and, you know, we've seen
significant productivity gains for developers. We have seen 55% — you know, as one such statistic from a case study that we ran, with 50 developers using Copilot and 50 without, the Copilot group was 55% faster. And we see statistics from competitors and customers confirming that both in the short term and in the long term, developers are seeing significant productivity gains, even, you know, in the later parts of the developer lifecycle — in, like, successful builds and more deployments to the cloud from the team using Copilot versus the team not using Copilot. I think what's more important is, you know, we see very clear feedback in surveys — our own surveys and customer surveys — that developers are saying they're more happy, more satisfied, more fulfilled; they no longer have to do all the repetitive tasks. I think that's where the dishwasher analogy works
really well. And it's easier for them, you know, to onboard to a new project. If you think about it, one of the biggest challenges for a developer today — whether that's in open source or in a company — is onboarding to a new project, whether you are joining a team or just picking up somebody else's work to make a bug fix. Navigating the code base is incredibly hard, because you don't know what the person thought when they wrote all this, while the AI can somewhat reliably, you know, figure that out and help you navigate the code base, and you reason about it together. And if you ask questions and it gives you a wrong answer, that's okay too, because a human pair programmer does that as well. So I think the value is proven. But that said, you know — and I think that's the second piece — is that
we do need to work as an industry with, you know, these people raising the concerns, and work through those, and figure out what is the right model. The open source foundations, the open source maintainers, those that have been spending, you know, most of their private life maintaining that small library that supports half the internet — how do we put them into a place where they also see the benefits of AI, where they understand, you know, both our legal position, but also our, I'd say, human position of why we believe training the models on that code is the right thing for society? That's a complicated question. I'm not saying I have all the answers, but I can tell you that, you know, at GitHub we have always been committed to working with the open source community, working with regulators — you know, fighting for the rights
of open source maintainers with the European Commission, and ultimately, you know, giving GitHub away for free for every open source project. And we're not asking the question, you know, is it really open source, or is it just code that's public — it's public, but it's not under an open source license, right? We're giving you free repos, free issues, free Actions, free models now with GitHub Models, free Codespaces. And, you know, we've been engaging with the community with things like GitHub Sponsors, where we have, you know, an integration with Patreon and other things, where we enable maintainers to build a creator economy around that, create a community. I've noticed that you've changed the language already, right — you're evolving. So even with the launch of GitHub Models, in your blog post, it's very clear: you have a sentence, it stands all by itself — no
prompts or outputs in GitHub Models will be shared with model providers, nor used to train or improve the models. That feels important to say now, right? It's right there, you can read it. Is that something you had to learn, that you needed to say that — that this was a concern people would have? Because in the rush to AI — what you might call the ChatGPT moment — no one knew they needed to say that, and that has caused all the problems, and now it's very clear that people care a lot about where their data goes. Yeah, it's important to get out of the tech bubble, right, and
what is obvious to the people working on the product is often not obvious to the customers, and as the customer base is growing, more people ask those questions. So I think it is incredibly important — in fact, you know, it's equally important as it was with the cloud, or as it was with systems like Exchange and Gmail, to say, hey, if you're deploying your application on our cloud, we're obviously not looking at your source code and then using that source code to make other products better, or sharing that source code with other people deploying on the cloud. And the same is true for models. You know, people see these models as a compute layer, and as such, they want to use it — send something, have it computed, and get it back — and not implicitly give anyone access to that data to make the model, or the compute layer, if you will,
better. And so I think, you know, that continues to be a cornerstone of Microsoft's strategy. We have this line that every employee learns: Microsoft runs on trust. And we believe that if we lose that trust, earning it back is incredibly hard — we have gone, you know, through moments in my career at Microsoft, and certainly in Microsoft's 50 years, where a lot of that trust was lost and it took a while to get it back. And so, as such, I think the model providers themselves have enough data, and, you know, they will be finding ways to get access to data without us sharing it with them — or certainly not without the approval of the customer. Now, there's one caveat to this, which is somewhat orthogonal but easily intermingled with that question, which is that there's an increasing demand of
customers wanting to fine-tune a model based on their data, right? And what that means is taking, you know, their source code in the GitHub scenario — or other data in other scenarios — and changing the parameters of the model, changing the weights, you know, through a tuning process. And as such, now they have a customized version of that model that is a combination of, you know, the public model — the one that OpenAI or Meta has released — but also their own data, where the parameters have changed. Now, obviously, that model needs to be within the private tenant of that customer, unless the customer decides, you know, to make that model public through their own API. A common scenario you can imagine is companies having their own programming language — just like SAP has ABAP — and so they want a model that speaks ABAP, so that everybody that wants to use, you know, an SAP copilot to build ABAP can do so with a fine-tuned model that SAP has provided, right? And so those scenarios obviously exist, and there, you know, it is fine to tune on the customer data, because the customer wants to do that. I feel like I learned a lot about SAP and
how its software works just now — they're not too far from here. Thomas, you've given us so much time. What's next for GitHub and Copilot? What should people be looking for? If you look at where we've gone over the last year or so, we have extended Copilot into different parts of the developer lifecycle. We originally announced that as Copilot X — you know, Copilot coming to other parts of the workflow, not just auto-completion, not just chat, but actually bringing it into everything that developers do, because we believe there's a lot of value there. You know, a very simple feature that we launched last year is summarizing the pull request. So when you have done all your changes to the code and you submit that for review, you have to write the description yourself — now you can use Copilot to write that description for you. Now you're saying, well, that's trivial, right? Like, you can do that yourself, you're not saving that much time. But the truth is, you know, A, you're coming out of, let's say, a three-hour coding session and you have to write up all the things that you did during that time, and B, you will have incredible confirmation bias about what you believe you did versus what you actually did — because you're only remembering the changes that you thought were important, and not the ones that you maybe accidentally made, or made because you were trying out how things work and whatnot. Copilot, when it looks at the changes, just plainly writes down what it sees, and as such you get a very detailed write-up — you can obviously customize that to be shorter or longer — but it also describes to you stuff that you may have inadvertently changed. And so you're saving a lot of time by just avoiding, you know, the iteration later in the cycle. So we're bringing Copilot into all parts of the developer workflow, and we're also building what we call Copilot Workspace.
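The pull-request summary feature works because the model reads the literal diff rather than the author's memory of it. A rough sketch of that idea — the prompt wording and function are invented for illustration, not Copilot's actual pipeline:

```python
def summarize_prompt(diff: str, max_chars: int = 4000) -> str:
    """Wrap a git diff in a prompt asking a model for a PR description."""
    return (
        "Write a pull-request description for the following changes. "
        "Mention every file touched, including incidental edits:\n\n"
        + diff[:max_chars]  # naive truncation; a real system would chunk the diff
    )

diff = """\
diff --git a/app.py b/app.py
-DEBUG = True
+DEBUG = False
diff --git a/README.md b/README.md
+Added setup instructions.
"""
prompt = summarize_prompt(diff)
print(prompt[:60])
```

Because the diff itself is in the prompt, an accidental change like the `DEBUG` flip above still surfaces in the write-up, which is exactly the confirmation-bias problem being described.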
That's the AI-native development workflow, which is really cool because it allows you to take an idea and bring it into code with the help of Copilot. So it's not adding Copilot to an editor — it's reinventing the whole developer workflow from scratch. You write in your idea, and it looks at that idea and the existing code base, and it writes you a plan, and you can look at that plan and say, well, that isn't actually what I wanted. And if you think about, you know, the dynamic today between engineering and product management, you often have, like, either overspecified or underspecified issues, and so the product manager then has to go back to the engineering team and say, well, that isn't actually what I wanted, or the engineer has to go back to the product manager and say, well, this isn't specific enough. So having AI in that planning piece is already a win for both sides. In fact, we have seen product managers weighing in and saying, well, now I can implement the thing myself — at least, you know, I can try — and see what that does to the code base and how long it will take. I feel like you really ratcheted up the temperature on the PM-engineer dynamic
right there. And, I mean, I have, you know, chief product officer friends who literally say, I found the fun back in coding with the help of Copilot. Because, you know, whether you're a chief product officer or whatnot, most of your day is spent in email and meetings and town halls and podcasts, and then when you have an hour on Sunday, spending that in a productive way is incredibly hard, because you have to get back into your environment, right? Like, whether you're building model train houses or whether it's code, it's equally similar, because you have to prepare your workspace again. With something like Copilot, it actually is much easier, because you can just open your project where you left it and ask Copilot, you know, how do I do this — and you don't have to start navigating all that
complex world of open source libraries and models and whatnot. So we're building the AI-native developer workflow, and we actually think this is going to be incredibly empowering both for developers working on their private projects and for open source maintainers, right? If you look at an open source project today and you want to make a change, you know, your biggest challenge is going to be figuring out, what are the places where I have to make those changes, and how do I not piss off the maintainers by creating a pull request that is incomplete, or that doesn't follow their coding standards, or that doesn't follow, you know, the way they want to collaborate with each other? Because at the end of the day, the open source communities are defining how they want to collaborate, and that's totally cool, right? Like, every company defines
their culture, and every open source project defines their culture. And the contributors that come in, especially those that are early in career, often have these formal or informal anxieties in their heads of, well, I file my first pull request, and the reaction will be not, oh, this is so great, Thomas, that you sent that to us, but, go back and learn how to code. And — again, this is often not actually happening, but I think most people have that anxiety in their heads — they're waiting forever until they feel ready to contribute back. And I think Copilot will lower that barrier of entry. And, you know, one last thing is that I'm from Germany. I grew up with German as my first language, I learned Russian and then English, and I'll probably always have an accent when speaking English. But most
kids on this planet do not speak English at age six. Like, you know, there's a large population that does speak English, but a lot of them do not, while open source and technology is predominantly in English. For them, the barrier of entry is going way down, and it will allow them to explore their creativity before learning a second language, before learning fluency in that second language, before having the confidence of, now I can type a feature request against the Linux kernel and say, I want this, and here is actually the code I've already implemented — what do you think? And I think that's going to completely change the dynamic on this planet. Well, it feels like we're going to have to have you back very soon to see how all these projects are going. Thomas, thank you so much for being on Decoder. Thank you so
much — this was super fun. I'd like to thank Thomas for taking the time to join Decoder, and thank you for listening — I hope you enjoyed it. If you'd like to let us know what you thought about this episode, or really anything else, drop us a line. You can email us at decoder@theverge.com — we really do read all the emails. Or you can hit me up directly on Threads, I'm @reckless1280. And we have a TikTok, check it out, it's @decoderpod — it's been kind of blowing up lately. If you like the show, share it with your friends and subscribe wherever you get podcasts. Decoder is a production of The Verge, part of the Vox Media Podcast Network. Our producers are Kate Cox and Nick Statt. Our editor is Callie Wright. Our supervising producer is Liam James. The Decoder music is by Breakmaster Cylinder. We'll see you next time.