How AI reshapes the craft of software engineering, with Yoav Tzfati

Jul 24, 2025 · 46 min · Ep. 52

Summary

Patrick McKenzie and AI researcher Yoav Tzfati delve into "vibe coding," a new approach where large language models increasingly handle software engineering tasks. Yoav shares insights from his bootcamp, where programming novices successfully built full-stack web applications without directly coding. The discussion examines the profound implications for the software industry, suggesting a future where humans act more as product managers or auditors, guiding AI "junior engineers," and debating the long-term impact on programming careers and the overall demand for software.

Episode description

Patrick McKenzie is joined by AI researcher Yoav Tzfati to discuss “vibe coding” - using LLMs to delegate software engineering work to AI models. Yoav runs a bootcamp teaching programming novices to build full-stack web applications using AI, without them ever looking at code. Patrick and Yoav discuss the fundamental shift in software engineering, where humans increasingly act as product managers directing AI "junior engineers," and explore the implications for the future of programming careers and the democratization of software development.

Read full transcript here: www.complexsystemspodcast.com/how-ai-reshapes-software-engineering/

[Patrick notes: Complex Systems now produces occasional video episodes like this one! You can access them directly on YouTube: https://www.youtube.com/@patio11podcast. My kids inform me that I’m supposed to tell you to like and subscribe.]

Timestamps:
(00:00) Intro

(00:32) Defining vibe coding

(01:35) The evolution of software engineering with LLMs

(04:07) Practical applications of vibe coding

(09:37) Teaching vibe coding to novices

(18:30) Future of AI in software development

(21:42) Discussing timelines and model capabilities

(22:12) Flappy Bird and the evolution of game development

(23:27) The impact of LLMs on software engineering

(24:46) Future of coding and human roles

(29:47) Monitoring and error handling in software

(31:20) The role of LLMs in code review and maintenance

(35:12) Wireframing and project management with LLMs

(36:40) The future of software engineering careers

(43:07) Practical tips for software engineers

(44:38) Wrap


Transcript

Intro

Welcome to Complex Systems where we discuss the technical, organizational, and human factors underpinning why the world works the way it does. Hi, everybody. My name is Patrick McKenzie, better known as patio11 on the internet. I'm here with Yoav, who is an AI researcher who recently has been teaching people at a variety of skill levels this craft called, quote, vibe coding, end quote. So, Yoav, thanks for being on the program today.

Thanks for having me.

So we'll start off with a first question for people who aren't terminally online on Twitter: what does vibe coding mean?

Defining vibe coding

Vibe coding is a term that was coined recently by Andrej Karpathy. And the way I've been thinking about it is basically delegating your software engineering work to an AI model. I find it really funny that this is the term that we've kind of converged on for this. It's basically replacing a human software engineer with an AI model. But it has this cute term now.

I'm not the greatest fan of this term, candidly. And you mentioned before the program that you might not be as well. My reason is...

This is not the first time in the tech industry and assorted environs where we picked a self-minimizing term for something which is much bigger than that term implies. When discussing with other friends who have arbitrarily high levels of skill, they've told me that they think LLMs fundamentally reshaped the craft of software engineering. And given that very credible people are telling me very credible things about productivity boosts and similar, and the likely shape of things to come, I think, like...

The evolution of software engineering with LLMs

My first pass would be using LLMs in the craft of software engineering or similar. We saw this back in blogging, where blogging was for many years an excuse for people to devalue written output that was, morally speaking, an essay, but written by someone who is not serious about writing essays. And many people who are, morally speaking, bloggers are more careful about what they call that these days. And I think early-career engineering professionals, who might be sort of anxious about their skill level and similar, do not do themselves favors by saying, my skill level is actually crap, I am the worst engineer in the history of a 19-year-old to have ever done the craft of engineering. I would encourage people, particularly people early in their careers, to not call it vibe coding with respect to themselves. But that is me off my soapbox.

Yeah, that is a slightly different angle than I'm coming from. I don't think people are calling themselves vibe coders as a career title or something like that. I think it's more of an activity. People see it as an activity.

And I think in some ways it sets the right mindset for using LLMs to produce software, because often you'll get way better results if you constrain the model less and let it be creative. There's an exploratory, almost play-like feel to this, which sometimes there is and sometimes there is less of in the general craft of software engineering, right? Yeah. The people that report poor results are the people that have this big existing codebase and need to make a very specific change to it. To them, the vibes don't help at all; they just need to accomplish a specific thing. And the models are not as good at accomplishing a very specific thing that is still complicated compared to making a whole new thing from scratch that is less constrained.

Yeah, I think the acceptance bars on that sort of quote-unquote greenfield development versus brownfield development are a little bit wider, particularly if you're doing it more to learn the domain and to test out directions a program could go in, in sort of a prototype fashion, versus, no, we have a system, we need to have a bulletproof, bug-free implementation of one submodule for the system or everything comes crashing down. For people who are having a bit of trouble visualizing this, and I know that the answer is changing on almost a week-by-week basis: if I'm an engineer and I sit down at a computer and I am, quote, vibe coding, what does that activity actually look like to me? What programs am I using? What does the information flow look like?

Yeah, so this has several different possible forms.

Practical applications of vibe coding

There is the closest-to-normal software engineering version of this, which is you have your choice of IDE open, maybe VS Code, maybe Cursor, and you are looking at all the code and you have a particular change you want to make in mind, but instead of typing it out yourself, you maybe prompt a model to add a function, or you even start typing it out and then let it complete the rest, or type out the function signature or whatever. Arguably, if you're looking at the code, that is not vibe coding.

There's a step above that which is still inside one of your IDEs of choice, most likely Cursor or Windsurf over VS Code. People have not had very good success with the agent mode in VS Code yet, as far as I know. But let's say in Cursor, you have the agent mode sidebar. You type in, oh, I want you to implement blah, blah, blah. And you mention the feature set you want rather than the implementation you want, and then the agent gets to decide how to implement it. That's a little bit more along the curve.

Even more along the curve is, maybe still in these tools or maybe in Claude Code or something like that, you maybe even set up a completely new project, and you plan on not looking at the code at all whatsoever, and you say, this is the application I want. And the model then goes and makes it, and you look at the result in your browser or whatever interface it is that you're making, and then you ask for feature changes.

And then the most vibey version is you use one of these web-based tools, such as v0.dev or bolt.new or Lovable. Inside your browser, you just tell the LLM what you want and the LLM makes it, you get to see it and request changes. You probably have a one-click deploy button where you can put it on the Internet.

Does that answer the question?

Yep, it does indeed. We're starting at something which is almost cliched in the discourse about LLMs now: it's just slightly more sophisticated autocomplete. Although if folks are thinking of standard IDE autocomplete features, this is much more sophisticated than that. They are not simply guessing from the characters that you've typed recently or your 10 most-used library calls in a library; they're actually able to see the context above and below the cursor. And so they are quite good at guessing, okay, which library you're going to call next and what the probable parameters are that you're going to send to it. They also pick up patterns, either in the LLM's training set or in the context window. If you're doing a for loop, the invocation line of the for loop is very predictable, and if you've been a programmer for a while, you know that sometimes, if you've just assigned a variable X and a variable Y, everything about the for loop is preordained and it's just a matter of banging the keyboard. They are quite good at that in my experience.

Then, I have not tried the agent-based mode, but Thomas Ptacek, my erstwhile co-founder (we were at Starfighter together), who's currently at Fly.io, recently wrote an essay about agent mode and how agent mode, I think he would say to paraphrase it, fundamentally changes the craft of software engineering. Agent mode is not dissimilar, say people I've talked to, to having a staff of junior programmers working for you, where you might meet them before lunch, give them a number of tasks, like, I would like you to bang out X, Y, and Z today, and have 5, 6, 10 people that you fan things out to. Then you literally have to go away, because the LLM does need a perceptible amount of time to think. So maybe you take a meeting, maybe you take lunch, whatever. You come back and there are five, six, ten pull requests from your staff of junior programmers. And some of them will be very like, this person got exactly what I wanted from them, accept this pull request. Some of them will require a little bit of editing from you. You might not like the style, etc., etc. Then some of them are, wow, you comprehensively misunderstood the task I was attempting to get you to do. I'll give you feedback on that misunderstanding, and we'll meet back for your next pull request review in 45 minutes. Does that kind of match your experience of this?

Somewhat. I think that for most tasks and most people, the iteration loops are more like three minutes to seven minutes than 45 minutes. Now with Claude Opus 4, for some tasks, the iteration loops can be a lot longer. They talked about how Opus went and worked for seven hours on a feature. I think the way most people use this is they usually have one agent running. Some power users have multiple, but they will look at it work and see the changes it's proposing, and over the course of like two or three minutes, let it work and then give it feedback and so on, maybe go back in time to a previous checkpoint. That said, the thing that I was optimizing for in my bootcamp... have we mentioned the bootcamp yet?

Teaching vibe coding to novices

We haven't mentioned the bootcamp. Oh, we should probably mention that context for people before we go too much on. You have been recently teaching people who are earlier in their engineering careers about how to use this as an accelerant to producing commercially relevant code, right?

Almost. Rather than early in their engineering career, I would say complete novice, like have never coded in their life, was my target audience. I had a couple people that have coded in the past, but for the purposes of the three-day bootcamp,

they were not looking at the code at all. And I was thinking of it as providing people with kind of like the bare minimum skill that they need in order to produce working software without needing to know software. And some of the theory here was, I actually started planning this before Claude 4 came out, and I was anticipating Claude 4 was going to come out and just hit the threshold of making this possible. It was perfect timing. Basically, all of my students produced working full-stack web applications, which I think would not have been possible with any earlier model.

There's a range of technical sophistication among our audience here. Finger to the wind, a working full-stack web application is something which many people would find difficult to pull off without substantial guidance immediately after a four-year undergrad degree in computer science. That's my general finger to the wind, and there are some people who go their entire careers without building one. They might specialize in the backend or specialize in the frontend. The backend is typically the thing on the server that talks to the database and runs business logic; the frontend is the thing that runs in the browser and does fun animations, connects to remote APIs, maybe connects to the backend, and similar. There are people who have worked in Silicon Valley and earned a nice paycheck and would say, I'm not more than minimally competent at either the back end or the front end, depending on which I've been working on. And so having shipped a full-stack web application within one day of encountering the notion that there is code in the world is kind of wild. It's incredible.

Yeah, I was surprised to see how well it worked. My students were certainly surprised. I think that some of what is enabling this, there's the trajectory of AI that is going exponential. There's also the trajectory of developer tools. That has been quietly improving over the past, you know, since software started.

Yep. The younger listeners to this podcast might not know this, but there was a life before Git and GitHub. Let me tell you, it was terrible. You mentioned rolling back history to a previous checkpoint. I assume that's a Git-based affordance.

In Cursor, no.

Oh, okay. Sorry.

In Cursor, they have sort of like IDE checkpoints. We were not using Cursor. We were using Claude Code as the main interface to Claude. And Claude doesn't have that functionality, so we did use Git for that.

Got it. Git is also one of those forbiddingly high sort of gates you need to pass to get into productively functioning as a software engineer in industry, for example. And, presumably having attempted to teach people this before: if you can make someone minimally competent in Git in a day without them having previous version control experience, you're doing pretty well if that is your only pedagogical goal. If you have people successfully being able to, I don't know, commit, undo commits, merge, and rebase in Git at the end of 24 hours, on top of learning programming the same day, that is, again, kind of wild to me.

So this is a nice segue into one of the main underlying philosophies that went into this boot camp.

I saw one of my main roles here as preparing resources for Claude that my students are just sort of like a channel for, such that my students will be able to just talk to Claude very naturally about what they want to happen, and Claude will then execute. None of my students typed git commit or git add or whatever into their command line. They all had the experience of telling Claude, oh, that's great, I want to deploy it to the internet. Or, I like the previous version more. Can we go back?
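As a rough illustration of what such requests might turn into under the hood, here is a minimal Python sketch. It is an assumption for illustration only, not the instructions Yoav actually gave Claude: the function names are hypothetical, and the only real things in it are standard Git subcommands (`git add`, `git commit`, `git reset`).

```python
import subprocess
from typing import List

Command = List[str]


def checkpoint_commands(message: str) -> List[Command]:
    """Commands for "commit frequently": stage everything, record a checkpoint."""
    return [
        ["git", "add", "--all"],
        ["git", "commit", "--message", message],
    ]


def rollback_commands(checkpoint: str) -> List[Command]:
    """Commands for "can we go back?".

    `checkpoint` is a commit hash, tag, or branch name. A hard reset also
    discards uncommitted changes, so it is destructive.
    """
    return [["git", "reset", "--hard", checkpoint]]


def squash_commands(base: str, message: str) -> List[Command]:
    """Commands for "squash before pushing".

    A soft reset moves the branch back to `base` but keeps all changes
    staged, so the following commit contains everything in one piece.
    """
    return [
        ["git", "reset", "--soft", base],
        ["git", "commit", "--message", message],
    ]


def run_all(commands: List[Command]) -> None:
    """Run each command in the current repository, stopping on the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling `run_all(checkpoint_commands("Add storyboard page"))` inside a repository would record one checkpoint. The point of the sketch is that the student only ever says the plain-language request; the mapping to commands lives in the instructions.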

And Claude had instructions from me on committing frequently, on what to do when the user requests to go back to a checkpoint, on what to do to squash before pushing, and stuff like that.

Some technical jargon, but yep, and this is a thing that we, you know, do in the industry and have for a very long time. There are, you know, low-level technologies, and Git is not quite the lowest of the low-level technologies, but oh my, Linus Torvalds, when he goes off, he goes off. So there are things that one builds on top of Git, like GitHub, for example, to make things easier for early-career engineers reasoning about what it's doing, and having social norms within a company or within an open-source project to make people's use of Git cross-compatible with each other. Because if you have vastly different norms for how much work there is in a commit or how much work there is in a pull request review, there will be some friction on the engineering team. Instead of having an orientation where we teach people for two weeks, this is the way we do engineering at this company, given you only have three days and there's no company involved other than your own, you can pass that pre-orientation in a very natural fashion over to Claude, and then Claude transforms people's expressed intent into, okay, if I were working on this hypothetical engineering team, the way that this team would prefer expressing that intent is taking the following actions in a terminal.

Yeah, yeah, I think that's completely correct. And that also goes for any other kind of codebase-specific stylistic choices or best practices in your team. That said, my problem here and my students' problems were a lot more narrow, because the thing I'm going for here, and the thing I think that I sort of uniquely provided, is more in the direction of personal software, empowering people to just build small bespoke applications for whatever use case they might have. I had a video producer that has never coded in his life and had a little bit of computer phobia. And he built an application that takes in a screenplay, analyzes it, splits it into shots, and generates an AI-generated storyboard for it, which I thought was incredible. He said he might actually use it for his films.

That is a wonderful example of computers making people's lives better. I happen to know there are a couple of screenplay-writing software programs out there that are used in industry and similar. But if you either don't have that budget or you say, oh, where's my AI storyboard feature? I want that now rather than like three years from now when one of those companies catches up to it. That's the sort of itch that is much easier to scratch if you're a programmer, but finger to the wind, I have an engineering degree, I've shipped a SaaS company or two in my day, and, okay, yeah, that's a two-week project for me, I think. Finger to the wind. There are some little projects that are over that line and some which are under it. If you can tell people who might be in an allied trade or an entirely different walk of life than programming, hey, the computer can do this for you if you spend 45 minutes describing what you want, I think that increases the amount of bespoke software in the world. Although there are some people in less temperate parts of Twitter and similar who think that software as a service is dead. Everyone will write their own software every day, which is, of course, exactly what we found after we released word processors: everyone wrote all their own things rather than getting other people to write things for them. I'm joking a little bit, but I do think that is an important point. Speaking of important points, I think while this is extremely early for people on literally day one of, as you've mentioned, likely the rest of their engineering journey, it is difficult for me to envision a world in which professional engineers are not using these tools constantly going forward. Vibe check me on that statement. Does that sound accurate to you? Or do you think, actually, no, I could imagine engineers who just get no benefit out of this whatsoever?

Future of AI in software development

I think there will be a wide range of usefulness for different tasks and different contexts. But, you know, I believe that AI will become generally intelligent or superintelligent in the next, less than a decade, maybe less than five years. And that just straightforwardly means that eventually humans will not write software unless they are doing it for fun. Where that is, like what the trajectory looks like.

I sort of expect that a very large percentage of application development will be completely done by AI within a year and a half, something like that. And then I think other, more complicated things, like maybe systems software that is very low level and complicated, maybe Linux kernel development or underlying C++ libraries for AI inference or stuff like that, will follow relatively shortly after, but will take a little bit more time.

I, broadly speaking, have longer time frames than you, but you have probably thought about that specific question a lot more than I have. And so I don't think it's very productive for me to say reasons why I think it'll take longer. A thing I think there would be wide agreement on is that these tools are getting so much better on an every-three-to-six-month-ish cadence, where, like you said, it would have been borderline to even attempt to do this for students at this skill level even a few months ago, and then Claude 4 gets you there. If people have previously tried using LLMs to assist them in engineering, or if you're unsure, like, am I doing the kind of engineering that would really benefit from putting points into learning how to do this for a couple of weeks? You might as well pre-deploy those points now, because if you are expecting a productivity speedup from your use of LLMs and it isn't available in June 2025, just check back in October, and whatever your expenditure of points and resources and investment was in learning how to do this, the LLMs will have, in aggregate, expended a lot more training runs on getting better at how to do it between now and October, and suddenly, boom, your investment that was not quite where you wanted it to be

now gets you some rewards that you can take the advantage of. Yes, I agree. That said, I think that learning this will become easier over time as well. Yes. It's not going to be too late to get into it and you'll be behind or something. With the models getting smarter, they will also become better at teaching you how to do this.

and there will be better UI affordances built on top of them and better educational products like your own built around it, etc., etc. The models themselves and the people that are building the systems that are firing prompts off to the models will have more experience of seeing engineers over a range of the skill population on a wide variety of problems, and they'll presumably tight-loop that into either training runs or fine-tuning runs to make them function better out of the box, as it were.

Yeah, absolutely.

Discussing timelines and model capabilities

And regarding timelines, I actually would be happy to talk about that if that's interesting, because one of the reasons that I decided to do this is that I believe that the models are much smarter than people seem to think. And I wanted to show that. I think I've actually done a good job. My students were able to build really impressive things in a very short timeline.

Flappy Bird and the evolution of game development

I remember being wrong about this, and so I will cop to being wrong about it, a few years ago when Flappy Bird came out. And Flappy Bird, for those who don't remember, was an extremely minimalistic game, which was nonetheless done well. And for a while, it was a bit of an internet sensation, partly because there was a bit of a gap between how minimalistic it was and how just fun it was to play, and that gap in particular caused people to want to play it, and then an internet sensation gets to being an internet sensation. Something that someone said, and I can't remember who.

This would have been back in... oh, I didn't look this up beforehand. It would have been impressionistically in the 2016 era or so, before the modern era of LLMs. They said, eventually, you'll be able to go to a computer and just say, make me a game about birds, and this is going to pop out of it. And I was like, no. Just from an information theory perspective, there's not enough in the sentence, make me a game about birds, for this to pop out of it.

Today, I don't think it's even worth doing the experiment. If you say, make me a Flappy Bird clone, except it's got to involve sharks and water, you will get a functioning Flappy Bird clone with sharks and water basically on your first try, right? Basically did this? Yeah. The natural progression of programs over the years: in my day, hello world; these days, please make me a Flappy Bird clone. Yeah.

The impact of LLMs on software engineering

If you take away nothing from this episode except this, understand that if you do not currently think LLMs are intelligent, if you think they're like a fun bit of magic but won't have any major impact on the world, if you think that, okay, cool thing that you can use to generate slop for spam purposes,

but will never transform a white-collar occupation: this is two relatively experienced professionals trying to shake you by the neck and say, no, no, we are not talking about two years from now. We're talking about today, in terms of the things they can do, and we have a variety of confidence intervals for where we'll be in two years. I won't ask you to endorse this, but I've tried to tell people, I am the guy who thinks that the bear case for LLMs is that they're only as impactful on human society as the internet was. Simultaneously, I'm the guy who says that the internet is the most important thing that humanity has ever accomplished. And the bull case for LLMs is much more impactful than the internet. And I think, paraphrasing what you said earlier, and please tell me if this is an unfair paraphrase, you're expecting the literal end of work, at least for productivity purposes, by humans, that we can offload entirely all or almost all of that to LLMs.

Future of coding and human roles

Yeah, like basically within a few years, I expect humans, with very few exceptions, will not look at code. Maybe only audit very specific code bases that actually need human auditing because they are, I don't know, relevant to AI, like safe usage of AI. And then you have to have a human look at it because you don't trust the AI or something like that.

This prediction sounds a bit more radical than it is, I think, because when we do code, we typically do it in expressive high-level languages. Ruby, Python, what's your poison of choice?

It depends on the use case, but for most applications, TypeScript these days.

TypeScript. TypeScript is a great example. TypeScript is built on top of JavaScript with quite a lot of similarities, but files some of the rough edges off. JavaScript is built on a virtual machine, and the virtual machine is built on top of lower-level high-level languages, probably C. C is built on top of assembly. Assembly compiles down into binary code, and the CPU, the chip that runs in your computer that does all this stuff, interacts essentially with binary code. For that entire tool chain, there are currently people who are specialized in working in each of those elements of the tool chain, but you can go your entire career after undergrad in Silicon Valley, working in front-end engineering, back-end engineering, fintech, whatever it is that you do, and never see a single line of assembly code. No one around you will say, wow, this person is not an engineer, they've never seen assembly code. People will say that. They're idiots. Ignore them. Not a real engineer, you've never seen assembly code. You can go an entire career without seeing C. You can go...

There are people who, for considered choices of their own, choose to go an entire career without ever seeing JavaScript. And partly this is an argument from specialization, because humans want to accomplish a lot of things in the world and can't be expert at all the things. We offload some of the things to programs that were previously written, or other humans, or society generally. Partly, it's an argument that we have these increasing layers of abstraction built on technology which we had figured out a long time ago. That technology continues to improve due to the work of specialists, but you have to think about it much less than you did back when. There was a point where essentially all programmers working in the world knew assembly because that was your only option. You gave a very strong version of the claim earlier, the claim that engineers will not be looking at code anymore. I could quibble with it a little bit, but the notion that many engineers are looking at code a lot less than they did seems straightforwardly obvious to me, given the fullness of time. I think exactly what year that'll come by is up for grabs.

And I'm personally looking at less and less code as I'm going. For the most complex things that I still do, I will look at all of the code, but increasingly for a lot of stuff

I'll just check if it works, and if it works, I won't look at the code. So this is already happening.

Yep. I will say, as I get older, I look at less and less code, is something that many engineers would have reported before LLMs existed. Partly, it's the industrial organization of software, where the more senior employees' time is more valuable. They end up doing things that are very leveraged. A lot of them are interfacing within the engineering organization and with other stakeholders at their organization, perhaps the clients, perhaps customers, on what should this do anyway, and then transforming that into a bunch of prompts, if you will, for more junior engineers: here are things we need to get banged out in the next two weeks. And then reviewing the output of the engineers that are at a variety of levels. And one of those levels could be, you know, line-by-line code reviews, but as the person who is doing the work for you gets more senior, it is less valuable for you and for them to do line-by-line code reviews and more valuable to talk about, okay, on the level of design document, on the level of microservice, et cetera, et cetera, let's have a meeting of the minds on our goals. And then success or failure will probably not be read out of an IDE. It'll be read out of things like, okay, well, the alerting system isn't going crazy, so that's good news. We had a successful green-blue deploy, that's good news. Contingent on a well-designed process for green-blue deploys and no errors after the thing is 100% in production, then very probably you never need to go back and say, okay, but that code that they gave me, did it work really? And so, in a certain way, the future might look like a turbocharged version of the past, where as one gets more advanced in one's career, slash as the technical substrate one works on gets more advanced, you need to poke into the underlying layers less and less.

Yeah, I think that's a great take. I do think that investing in very good monitoring has always been important, but it might become even more important now, knowing when something is going wrong.

Monitoring and error handling in software

I think a thing that has been reported to me privately, but I'm interested if you've used it. So monitoring broadly, computer programs have logs to them. There are also metrics you can collect from a computer system or computer program or similar. Monitoring is just making these visible to typically human operators to allow the human operators to make decisions based on them. A thing you might do if, I don't know, a large portion of

the financial industry's computers are down overnight is to wake people up because them being down is typically bad. And you won't know they're down unless either someone's yelling at you or you have someone who is in some way tied to a computer system that can realize, okay, if the number of requests that we get per minute goes down from 100,000 to 40, that's not good news. Tell an engineer that and have them make a decision on whether to wake people up.
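As a rough illustration of the kind of check described here, the following Python sketch flags any minute whose request count falls far below the recent median. The names `Alert` and `find_traffic_drops`, the five-minute window, and the 10% threshold are all assumptions made for this example; nothing in the conversation specifies how such a monitor is built.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Alert:
    minute: int       # index of the minute that looked anomalous
    observed: int     # requests seen in that minute
    baseline: float   # median of the preceding window


def find_traffic_drops(requests_per_minute, window=5, drop_ratio=0.1):
    """Flag minutes whose traffic is below drop_ratio times the recent median.

    window and drop_ratio are illustrative defaults, not recommended values.
    """
    alerts = []
    for i in range(window, len(requests_per_minute)):
        baseline = median(requests_per_minute[i - window:i])
        observed = requests_per_minute[i]
        if baseline > 0 and observed < drop_ratio * baseline:
            alerts.append(Alert(minute=i, observed=observed, baseline=baseline))
    return alerts


# Made-up traffic: steady at roughly 100,000 requests per minute, then a collapse.
traffic = [100_000, 98_500, 101_200, 99_800, 100_400, 40, 35]
alerts = find_traffic_drops(traffic)
for alert in alerts:
    print(f"minute {alert.minute}: {alert.observed} req/min "
          f"vs baseline {alert.baseline:.0f}")
```

The median is used as the baseline so that a single collapsed minute does not immediately drag the comparison value down with it. Whether an alert should wake anyone up remains a human decision, as described above.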

That was just the context on what monitoring is. So typically, monitoring is not reporting errors on a one-by-one basis and filing them all to an intermediate software engineer for, okay, find immediately what caused this error number 3732 that we experienced today. Typically, anomalies are interesting and individual errors less so because there is a finite amount of attention that intermediate engineers have.

But it seems like attention attached to near human intelligence might be very abundant in the future. One of my friends who has previously done a lot of work has said,

The role of LLMs in code review and maintenance

One of the things that we do these days is just pass every error and every error message past an LLM and see if they can identify what caused it. If yes, add a test in the code so it doesn't occur again and do a code change that will fix that error in perpetuity. And then once a day or so, we have engineers review the proposed code changes and batch accept them.
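A minimal sketch of that loop might look like the following. Everything here is hypothetical: `ProposedFix`, `triage_errors`, and `fake_propose_fix` are names invented for illustration, and `fake_propose_fix` is a hard-coded stand-in for whatever LLM call a real team would make. The point is only the shape of the flow: deduplicate errors, ask the model about each one, and queue proposals for a human to batch review.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProposedFix:
    error_signature: str   # the error message the fix addresses
    regression_test: str   # name of the test that should prevent a recurrence
    patch: str             # description of the proposed code change


def triage_errors(
    error_messages: list[str],
    propose_fix: Callable[[str], Optional[ProposedFix]],
) -> tuple[list[ProposedFix], list[str]]:
    """Ask the model about each distinct error; return (review queue, unresolved)."""
    review_queue: list[ProposedFix] = []
    unresolved: list[str] = []
    # Counter preserves first-seen order, so each distinct error is asked about once.
    for signature in Counter(error_messages):
        fix = propose_fix(signature)
        if fix is None:
            unresolved.append(signature)   # the model could not identify a cause
        else:
            review_queue.append(fix)       # a human batch-reviews these later
    return review_queue, unresolved


def fake_propose_fix(signature: str) -> Optional[ProposedFix]:
    """Hypothetical stand-in for an LLM call; a real one would see code and logs."""
    if "KeyError" in signature:
        return ProposedFix(
            error_signature=signature,
            regression_test="test_missing_key_returns_default",
            patch="read the key with dict.get and a default value",
        )
    return None


errors = ["KeyError: 'user_id'", "KeyError: 'user_id'", "TimeoutError: upstream"]
queue, unresolved = triage_errors(errors, fake_propose_fix)
print(len(queue), "proposal(s) awaiting review;", len(unresolved), "unresolved")
```

Keeping the model behind a plain callable means the review step, not the model, decides what reaches the codebase, which matches the once-a-day batch acceptance described here.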

I don't know if his company has yet moved past the batch accept to the, okay, computer, just figure it out and we'll tell you if we don't like it later. But I suspect that is not very far away. Are you passing logs over to the LLMs yet for somewhat autonomous mode? I have not done this yet, but also I've been focused on dogfooding a lot of my content for creating these greenfield small applications.

That does sound very useful to me. And I do think people will remain nervous about auto accepting code into their codebases run by LLMs for at least a few more months. I don't know. Yep. I expect that.

That is going to be a Rubicon for a number of organizations, partly for one of the reasons, and I apologize, I don't want to drag you all into it, but one of the reasons why I somewhat fade the curves that people in the AI safety and similar communities have for impact on human life is that in some places they're not rate limited by

things that the AI or any advocate for the AI can dream up. They're rate limited by acceptance into existing organizations, institutions, and similar. And so just like it took an awful long time, decades in some circumstances, to get people comfortable with the notion of your business's valuable data should totally sit in somebody else's building. You won't even be allowed into the room with it anymore. But that's better for you than your current practice.

which was the story of cloud computing, on which Amazon et al. spent decades, again, getting companies comfortable with. I think it will take a long time to get conservative engineers at conservative institutions over the hump of, no really, just like fire unaudited changes in your codebase, what's the worst thing that can happen? Similarly, more aggressive end of the spectrum where people are, the sardonic phrase sometimes used in Silicon Valley is cat photos.

If you're only shifting cat photos around, the relative robustness of the engineering demands is somewhat less. If you look at the financial industry and look at the company that is most famous for sending cat photos around, the relative robustness at the cat photo company is actually much higher than the mean robustness in the financial industry, but that's neither here nor there.

I will say again from private conversations, it seems like people are already doing the LLM to patch to batch adoption cycle in production or very close to production at quote-unquote, serious companies doing serious things. Yeah, yeah. And there's a bunch of topics we can branch into, but I told my students, wow, this is great. You can make these...

web apps, put them on the internet, and you didn't look at the code once. That's awesome. A lot of them got nervous around, oh, will I suddenly be charged a lot of money? Or will I leak user data? Or something like that. And what I told them is, once your application looks the way you want it, does the thing you want it to, you should probably get a human software engineer to audit the code base, which I think is an interesting flow. You can generate

hundreds of small bespoke apps and then a small fraction of them will actually become quote-unquote serious. For those, you do the auditing. One way to think about this is turbocharged wireframing where in the

Wireframing and project management with LLMs

classical craft, if you're at a company and you're trying to decide on what the new feature for the product looks like, you might have a team go off for a week or two with a sketchboard and do quote-unquote wireframes: just, let's get the design working here and understand the flow, the screens that the user interacts with to get through the task in front of them. Then you might go as far as having a sub-team of that team

build something which is clickable, but which can't actually do the thing yet. It's a communication aid with your engineers that are doing the actual implementation. Here's the thing we want you to build. Here is broadly what we expect the output to be, but we haven't fully specified the output, because if a wireframe was fully specified, it would just be the program. Here we have wireframes, which are in some sense a communications aid, in some sense an exploratory tool, which are just

much higher bandwidth than traditional wireframes and happen to be executable. Partly, it'll be a matter of tempering expectations and, you know, maybe as of June 2025, don't vibe code a bank. Your regulator might not be super happy with that decision. Maybe if you are thinking of coding a bank and don't yet know that banks have regulators, ask an LLM close to you about that topic and you'll learn some things.

I feel less threatened about it than some people do. And for whatever reason, a combination of things I've written and my position in the community, I meet many people who are early in their careers who have said things like, I think this is going deep.

The future of software engineering careers

broad increase in skill levels and their increasing adoption in the craft of software engineering. I think this is going to greatly decrease demand for engineers in the future. I think it might decrease salary for engineers. I think the door might have closed. People who are in the industry already are fine, but people who are not quite in the industry yet are in a very bad way. I'm broadly optimistic about all those questions. In most worlds that I envision, the total

software employment does not decrease. My basic model and intuition for that is if you make engineers 100 times more effective, then we don't need 100 times less engineers. We just increased radically the number of value-adding projects we can go after.

And I think people might not understand this about the largest engineering firms in the world, but they have a list every quarter that they go through and the exact nomenclature differs in different places. But here are all the things we could do if we had infinite time.

Then we draw a red line somewhere down that list where the things above the red line are projects we think we can actually staff this quarter, given our current size of the engineering team, other constraints, desire for maintenance programming, and similar. There's always things below the line and almost everything is below the line. Given humanity's ability to discover new things it wants every time we alleviate

some pockets of scarcity, I think there will always be something below the line. And so essentially always the demand for engineers. I do like that intuition and framing, but I find it interesting to talk about what the humans are doing. And you mentioned earlier about the role of the senior engineer compared to the junior engineer. And there's the product manager. And the frame that I've been shifting into and that I used for this bootcamp is...

Senior software engineers often do things that kind of leverage the work of the junior engineers to be more effective and help them make fewer mistakes. If you set up your code base such that when a junior engineer makes a mistake, there is a linter that catches that mistake, or a type checker, or basically a program that you run your code through and it checks it, then all of a sudden

you are more likely to get working software out of your junior engineer. And the framing that I took on for this bootcamp was, okay, I'm sort of this senior engineer, and I'm going to set up a project template that has all of the necessary pieces to let Claude as the sort of junior engineer be able to work very productively and with minimal errors,

while my students or the people with the idea for what to build, with the idea for what they want, acting like the product manager. This was extremely effective. And I think when people see the level of, or try to estimate the level of capability from these models, they are looking at, okay, what happens if I ask the model to, from scratch, build this sort of like...

whatever application. And the model often gets it somewhat right, but makes some simple mistakes and it's not very polished and you would have to invest senior human engineering time into doing the last mile, sort of. Also associated with engineers just reading code de novo and understanding what is this thing doing. Yes, yes, true.

My approach here was, okay, what if I did the last mile at the start and set up a project with all of the necessary dependencies? For each dependency, I asked Claude, hey, what is everything you know about this dependency? Okay, great. Now go look at the docs on the internet and tell me everything you missed. Now summarize everything you missed so that I can put it in the instructions for this code base. All sorts of those types of things.

What back-end technologies can I choose such that building out new features is extremely easy? Which is a topic that I've been excited about since before the LLMs. And so when you say something like, our appetite for more software will just increase and we will build a lot more software, but we will still need humans. Humans will be X times more effective. Then I'm going, okay, but they're not going to be doing the same task. There's going to be

humans that are acting sort of like product managers. And maybe there are humans that act as sort of like auditors for untrusted LLM-generated code where you don't trust LLMs enough to comply with the bank regulations or whatever. And at some point, you know, I think that once the trust barrier is passed, like we have this aligned AGI or whatever, at that point, you no longer need the

trusted senior software engineers, like humans, because you trust the LLM to do that. And I don't know if we'll ever reach that point, because it's, you know, a technical alignment question combined with a psychological and social question. And there are indeed a lot of things that computers can do these days, trivially, which society, for whatever reason, does not trust computers to do. An example I can think of off the top of my head. Ah, adding two numbers.

And then comparing that number to a third number and saying, is it above this or not? Again, trivial for computers to do this for almost all values of numbers. If you are attempting to get accreditation as an accredited investor in the United States to invest in a startup, you really want the

agent in the world that is doing the I'm going to add two numbers together and compare against the target value to be a lawyer or CPA because they have the magical ability under the law to give you the stamp that you need and a computer even though it is capable of doing the math does not. And given some more time, I could think of other ideas there. But you mentioned linters. And one thing, pro tip for anybody who is quite experienced with doing software engineering but hasn't used these yet.

Practical tips for software engineers

One of the best uses of them I found in my limited exploration is given that we already have a linter in the code base that is flagging code patterns that have caused us problems in the past or which are aesthetically unpleasing to the team and which we simply don't want. If you have an error, ask the LLM, could you write a linter rule that would have caught this? And then you can visually inspect that linter rule and say, okay, do I like this or not? Like, is it making up something or...

will this have sufficient coverage, et cetera, et cetera. You can ask the LLM to run it against the code base and tell you what it gets from the coverage report and how many instances it flags. And then you can go one step further and say, okay.
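To make the idea concrete, here is a small Python sketch of a custom lint rule built on the standard library's `ast` module. The rule itself (flagging bare `except:` clauses) and the names `BareExceptRule` and `lint_source` are assumptions chosen for illustration; the conversation does not say which linter or which rules are involved, and a real team would more likely write the rule as a plugin for the linter it already runs.

```python
import ast


class BareExceptRule(ast.NodeVisitor):
    """Hypothetical lint rule: flag `except:` clauses that name no exception type."""

    def __init__(self):
        self.violations = []  # line numbers of each flagged handler

    def visit_ExceptHandler(self, node):
        if node.type is None:          # a bare `except:` has no exception type
            self.violations.append(node.lineno)
        self.generic_visit(node)       # keep walking into nested try blocks


def lint_source(source):
    """Return the line numbers in `source` that violate the rule."""
    rule = BareExceptRule()
    rule.visit(ast.parse(source))
    return rule.violations


# Made-up sample code standing in for "the code base".
sample = '''
def load(path):
    try:
        return open(path).read()
    except:
        return None

def parse(text):
    try:
        return int(text)
    except ValueError:
        return 0
'''

violations = lint_source(sample)
print(f"{len(violations)} instance(s) flagged at lines {violations}")
```

A rule this short is easy to inspect by eye, and the list of flagged lines is the "how many instances" number that helps decide whether reviewing every resulting patch is worth the time.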

For all the flagged instances, write me a pull request with a patch for it. Then your choice as the person who determines allocation of resources is whether you want to actually review 67 pull requests as a matter of priority today or just like, okay, these are the sort of pull requests I would expect given this rule.

and nothing's broken right now. Maybe I won't spend the time to look through all 67 of them, but I will take that linter rule in the future just to flag it for the benefit of other junior software engineers and junior LLMs that if your code hits this rule, try again. So, a tiny bit of alpha there. We've been chatting for a little while. I think we could go in many, many directions with this, but where can people follow you on the internet? Probably best right now is Twitter.

Wrap

My handle is at y-o-a-v-t-z-f-a-t-i. And if you're interested in the resources that I made for this bootcamp and plan on maintaining, they'll probably look different in a month or whenever this is released. You can go to code-bloom.app. C-O-D-E hyphen B-L-O-O-M dot app. Thanks very much for coming on the program, Yoav. And for the rest of you, thanks very much. And we'll see you next week on Complex Systems. Thanks for having me.

Thanks for tuning in to this week's episode of Complex Systems. If you have comments, drop me an email or hit me up at patio11 on Twitter. Ratings and reviews are the lifeblood of new podcasts for SEO reasons, and also because they let me know what you like. Complex Systems is produced by Turpentine, the podcast network behind Econ 102, Riff with Byrne Hobart, Turpentine VC, and more shows for experts by experts in tech.

This transcript was generated by Metacast using AI and may contain inaccuracies. Learn more about transcripts.