Making Reliable Software in 2026 with Damien Brady

Feb 18, 2026 · 1 hr 1 min

Episode description

It's always been challenging to make reliable software - is AI making it worse or better? Carl and Richard talk to Damien Brady about his experiences building software with AI tools and trying to bring that software up to an acceptable standard. Damien talks about leveraging LLMs' tendency toward detailed analysis to catch problems in code. It takes practice and experience to get good at using these tools, but they become more powerful over time!

Transcript

Speaker 1

How'd you like to listen to .NET Rocks with no ads? Easy! Become a patron for just five dollars a month. You get access to a private RSS feed where all the shows have no ads. For twenty dollars a month, we'll get you that and a special .NET Rocks patron mug. Sign up now at patreon.dotnetrocks.com. Hi, this is Carl Franklin.

Speaker 2

And this is Richard Campbell.

Speaker 1

We've got two special shows coming up soon, episode nineteen ninety nine and two thousand.

Speaker 2

For episode nineteen ninety-nine, we're collecting people's Y2K stories: what did you do to help the Y2K event not actually happen?

Speaker 1

And for episode two thousand, we're going to be sharing stories about how .NET shaped your career.

Speaker 2

We have a special page at dotnetrocks.com/voxpop where you can record messages for us that we can play on these special episodes.

Speaker 1

So tell us what you did for Y2K and what .NET means to you, and of course how long you've been listening to .NET Rocks.

Speaker 2

So go to dotnetrocks.com/voxpop now and leave us a message before the thought evaporates like whiskey left in a glass overnight.

Speaker 3

Do it?

Speaker 1

Hey, and welcome back to .NET Rocks. I'm Carl Franklin and I'm Richard Campbell. Damien Brady is here with us. You'll probably hear him before I introduce him formally. But this being show nineteen ninety, we need to talk about what happened that year.

Speaker 4

Don't wait, you're big years, aren't they?

Speaker 3

They know? Oh yeah, big years.

Speaker 1

Yeah. So I remember nineteen ninety because that's the year I got married the first time, and my wife and I were watching television and you know, we heard about the Berlin Wall coming down and the reunification of Germany and that was just like it was a crazy, crazy cool.

Speaker 4

You think about how many pieces were involved in doing that, because Germany had been broken up after World War Two, so you had to have the agreement of all of the states that were involved in that, which includes Russia, right, or the Soviet Union at that time, and then unification of the currency, the complete dismantling of the East German political system, essentially, to go over to the western one. It was a huge, huge task.

Speaker 1

I went back to college that year, and I remember my history teacher was an Ethiopian guy, and he was just like, can you believe this? It's Gorbachev. Just, like, you know, who saw that coming?

And my economics teacher, in a macroeconomics class that I had, she was like a protest hippie from the sixties, you know, and she said, we used to go around saying free Nelson Mandela, and now he's free. So that was, yeah, February eleventh.

Speaker 4

That happened in nineteen ninety as well.

Speaker 1

Yeah, February eleventh.

Speaker 4

So, and then you also saw what started in eighty-nine with the dissolution of the Eastern Bloc. Now you start having elections in Poland and in Czechoslovakia and the former Yugoslavia.

Speaker 1

But it wasn't so good for Yugoslavia though, that was collapsing there, the communist regime.

Speaker 4

Yeah, and the separate nations that form from that are quite hostile to each other. So things are going to get ugly in that part of the world.

Speaker 1

And they would go to war in the next few years. Yeah. The beginning of the Human Genome Project. Not to steal your science, but that's a good one.

Speaker 4

Yeah, and it'll take years and years, but look where we are today. I forget, they spent what, two and a half billion dollars to do that over a decade, and today you can get it for three hundred bucks. Yeah, so pretty amazing, you know, talk about transformation. Hey, before we go off of the whole Eastern Bloc thing: most importantly, in nineteen ninety, Moscow gets its first McDonald's. Wow. And its first Pizza Hut.

Speaker 1

That was the real reason they wanted to get rid of the Soviet Union, right? So they could bring McDonald's and Pizza Hut in there.

Speaker 4

I don't know the answer to that. By the way, they're also opening in China at the same time, so that was going on as well.

Speaker 3

Yeah, any others.

Speaker 1

Go ahead, No, that's enough. I mean I would go over the movies, but the best movies are in the eighties.

Speaker 4

We know that.

Speaker 3

So yeah, yeah, I think we've already had that wave.

Speaker 4

I will go one more geopolitical mention, which is the Iraqi invasion of Kuwait. That's right. Yeah, yeah, obviously that will have more dire consequences, although it takes time for folks to really have that sink in.

Speaker 1

Yeah, absolutely. I know you're going to do science and tech, but I just want to apologize because last week I said nineteen eighty-nine was the beginning of the World Wide Web. Tim Berners-Lee didn't actually invent that until nineteen ninety. So he developed the first web server, laying the foundation.

Speaker 4

Yeah, he put the proposal in in nineteen eighty-nine to CERN. In December? Yes, yeah, you know, and this is in March of eighty-nine he put the proposal in. Oh, okay, March. And they turned him down.

Speaker 3

We don't need that. That's not a good idea, Tim, come on.

Speaker 1

How do you think the people that turned him down sleep at night? Do you think they're just like, ah, that was dumb?

Speaker 4

Fortunately, he kind of just kept working on it, and his direct report, his boss, was fine with him still working on it. And yeah, by nineteen ninety, on his NeXT computer. And you can go see this machine. It's at CERN. I've seen it. It's got a big label on it. Yeah, it says: server, do not turn off. And that's really cool. First web server, the first browser, all of it.

Speaker 1

I gotta go see it.

Speaker 4

It's really a thing.

Speaker 1

So what else happened in science and tech and space?

Speaker 4

Let's do Internet first. I'll do space last. Although I think the computing stuff is huge, because the Internet obviously has a very big year. It was not only Tim getting the WWW working, but this is when the first search engine comes out. It's called Archie, written by a team of three, Alan Emtage, Bill Heelan, and Mike Parker, out of McGill University in Montreal.

Speaker 1

There's a whole bunch of named things that were about Archie and Veronica and all that, that whole comic strip.

Speaker 4

Yeah, and it all happened now, right? Like this expansion. Like, really, what's happened is that we don't have the Internet yet. It's ARPANET, but this is the year that ARPANET goes away and it becomes NSFNET, or National Science Foundation Network. So they're understanding it isn't the Cold War tool anymore. It's not that. It's something else. So besides Archie, you also get Gopher this year, out of the University of Minnesota, which was a search and retrieval tool.

Speaker 3

Yeah.

Speaker 4

Remember, we're only at one hundred thousand or so machines, so it's still reasonable for your computer to search everything.

Speaker 1

Gopher was like file folders on the Internet. It was like a hierarchical file folder system, and you can go fetch the files as you want.

Speaker 4

All right, so beyond that, this is also the year that the Electronic Frontier Foundation is formed.

Speaker 4

Ah, Mitch Kapor. Yeah, don't forget that.

Speaker 1

And John Perry Barlow, who went on to write songs with Bob Weir for the Grateful Dead, Cassidy being one of my favorites.

Speaker 3

Nice. Yeah.

Speaker 4

This is the year that the first company in history ever makes more than a billion dollars in sales. And that company is Microsoft. And what are they selling in nineteen ninety? Windows three. MS-DOS and Windows, yeah. So this is when version three comes out. It's the first multitasking version of Windows, so if you have a 386 or better, you can have multiple apps running simultaneously. It's also the first version to include Solitaire, which is clearly worth a billion dollars.

It's also the first edition of Microsoft Office. It was called The Microsoft Office for Windows, and included Word one point one, Excel two point one d, and PowerPoint two point zero.

Speaker 1

Wow.

Speaker 4

It's also the year that Microsoft and IBM end their relationship on operating systems. That's OS/2, and this is the beginning of NT and all those sorts of things. You have two other important pieces of software. The first version of Photoshop is in nineteen ninety, only for the Mac; that's the beginning of it. And one you know and love, Snagit, is in nineteen ninety.

Speaker 3

Yeah. All right.

Speaker 4

On the space side, I'll go down two stacks. There are six Shuttle flights this year. A lot of them are just satellite missions, but there are three I will talk about specifically.

In April, Space Shuttle Discovery deploys the Hubble Space Telescope.

Speaker 1

That's right.

Speaker 4

Originally supposed to launch in nineteen eighty-three, delayed by the Challenger disaster and also budget. There's some evidence that it was actually based on the KH-11 spy satellite, which is still classified, so we don't really know for sure, but every image of it looks an awful lot like Hubble.

It probably was a prototype that wasn't quite right for their application, so they gave it to NASA and they built the Hubble around it. It has a two-point-four-meter mirror that was, of course, in nineteen ninety, discovered to have been ground incorrectly, and it'll be a few years before they figure out how to fix that. So it didn't work at all like it should.

Speaker 1

So I just got to say this, this is my favorite David Letterman top ten: top ten Hubble Space Telescope excuses. And the number one was: bum with squeegee smeared lens at stoplight.

Speaker 3

All right.

Speaker 4

Two more Shuttle missions. One is Discovery in October, launching the Ulysses solar probe. There had been solar probes before this, but what Ulysses did, and it was with the ESA that they did this one, is they actually did a slingshot maneuver off of Jupiter so that they could get imaging of the poles of the Sun, which had never been done before. It meant that its orbit ran several years long.

It was a very large elliptical orbit, but it means they've got the first images of the north and south poles of the Sun. And finally, Columbia in December with Spacelab, with Astro-1. So Spacelab was this laboratory before the space station came along.

It was in the cargo compartment of the Space Shuttle, gave more pressurized space to work in, and behind that was this non-pressurized pallet. There were a few different options for Spacelab depending on what you were doing. This is the first time they flew with Astro-1, which was four telescopes mounted on a pointing rig built by Dornier out the back of the shuttle. The four telescopes were the Hopkins Ultraviolet Telescope, the Wisconsin Ultraviolet Photo-Polarimeter Experiment, the Ultraviolet Imaging Telescope, and the Broad Band X-Ray Telescope. Wow, so lots of science.

And I promised that I'd be talking about Mir going forward. In nineteen ninety the Kristall module is lifted for Mir. This was an experimental module that had materials processing furnaces on it, so they were attempting to make ultra-pure versions of silicon and a bunch of other experiments. They also had one of the very first androgynous docking ports, initially for the Buran shuttle, which of course had only flown once a few years before and will never fly again, but that port would be used by Space Shuttle Atlantis in nineteen ninety-five for the first docking of the Shuttle to Mir.

Speaker 1

Okay, and that's what I got, all right, that's very good stuff. Yeah, and I think Damien's just kind of quiet because it's like seven o'clock in the morning where you are, right.

Speaker 3

It's not too bad. But I mean, I was listening to all this. It's fascinating. I was ten, so yeah. Yeah, although the first computer I ever got was that year, nineteen ninety. Yeah, and I say I got it; Dad might have had some say over it. Yeah. Yeah, he slowly lost control over it over the next few years.

Speaker 4

I suspect ten years old is when I got grabbed by computing too, dude, just a few years earlier than yours.

Speaker 1

Well, yeah. Well, let's keep going with Better Know a Framework. Roll the music. All right, so, you know, I look for trending GitHub repos, right, and I found this one. And when I saw it, I swore that we had done this before, but apparently not. Maybe it was, I just don't know. Anyway, this is Harden Windows Security from HotCakeX. And it's not code, really, it's documentation. It's Harden Windows Security, a new threat to malware: harden Windows safely, securely, with only official Microsoft methods.

So it's basically, you know, documentation and all the things that you need to know to shore up your Windows security, which is really becoming more and more important for everyday people to do, especially, you know, those who don't understand what the difference is between just plugging into your cable modem and using a router, and using NAT addresses and private networks, and allowing your friends to come over and access your Wi-Fi.

Can they access your machine? There are so many things people need to know about Windows security. And you may think, oh, well, I trust my friends, but you know what, somebody could be sitting outside in your driveway and hacking your stuff. So, all sorts of things to know here.

Speaker 4

I think it's important to remember that Windows is configured for maximum compatibility, not maximum security, right? And so often there's a whole lot of features turned on that are very much edge cases. But, you know, if those things weren't turned on, normal people would return the machines that didn't work for them, and that's more expensive to Microsoft than leaving it somewhat insecure. So it's well worth running some of these tools.

Speaker 1

As our friend Patrick Hynes likes to say, convenience is the enemy of security, and vice versa. Totally. So that's what I got. Richard, who's talking to us today?

Speaker 4

Well, I grabbed a comment off of show fourteen twenty-five.

Speaker 3

That's right, baby, going way back.

Speaker 4

That should be a good nine years ago. Wow. The show is called Brownfield DevOps, with one Damien Brady. Maybe you've heard of him. We actually shot this at NDC London, you and I in a booth with Damien. Back in the day, we were talking about how, you know, the DevOps movement was more mature by then; it's twenty seventeen. And so I'd say, how do we take these practices and put them onto existing applications, sort of get them geared up to the way they ought to be? And this comment's a few years old. Robert Mattingley says: thanks for the great show, guys. Legacy shows are always thought-provoking, but the common trend seems to be defining the term, quote, legacy code. Every time the subject is discussed, it needs to be redefined for the current conversation. I think the big hairy problem is that we use the term legacy to define three separate categories of code.

The first and best category is code which is battle-tested, reliable, makes the best use of the language features at the time, and may be hard to improve with modern features. A lot of COBOL could fall into this category. It may seem arcane, but it's reliable and it uses the platform well. The second category is code that is functional and reliable but uses conventions which are now considered outdated or harmful. I'm looking at you, GOTO command. The code doesn't need to change, but could be improved by taking advantage of new features or conventions. And the final category is code that just plain sucks. It was bad when it was written, and it's bad now. We've all written it, so there's no use pointing fingers. But it is an ugly baby.

Ugly things like using a cascade of seventeen try-catch blocks to handle a value assignment, or those SQL-injectable stored procedures because, hey, we need dynamic behavior, or not checking if your user is authenticated anywhere but the login page, letting anyone who can guess the URL have full admin access to your web app. If we call all these things legacy code, it's as if they all had the same value and needed to be treated the same way. They really don't, and it's important to recognize that they need to be treated differently. It would be nice to have a term for each of these categories so we could differentiate what should be preserved and what genuinely needs a rewrite.
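As a concrete illustration of the "SQL-injectable" point in Robert's comment, here is a minimal sketch in Python. It is not from the episode: the `users` table, the function names, and the in-memory SQLite database are all assumptions made up for the example. It only contrasts SQL built by string concatenation with a query that uses a bound parameter.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory database with a tiny users table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the caller's input becomes part of the SQL text itself.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: the value is bound as a parameter and never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    attack = "nobody' OR '1'='1"
    print("unsafe:", find_user_unsafe(conn, attack))  # matches every row
    print("safe:  ", find_user_safe(conn, attack))    # matches nothing
```

The same idea applies to stored procedures that assemble dynamic SQL: the dynamic part can usually be passed as a parameter rather than spliced into the statement text.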

Speaker 1

I consider legacy code any code that nobody on staff can read and understand. That's fair.

Speaker 4

Or no longer compilable. That's a good one.

Speaker 1

That's a good one. Yeah, that's pretty good.

Speaker 4

I don't know that I have much to add to Robert's points here, because I think he's completely correct. That's one of the reasons I read it. There really are these different kinds of software, and yeah, it would be nice if we treated them, especially that old good code, with more respect. Right? And certainly, you know, we tend to package up those old reliable apps, like those COBOL apps, in wrappers so that we can call to them with more modern things. But often that code does not need to be replaced.

Speaker 3

It works.

Speaker 4

But we've all made some ugly. Yeah, we'd all love to go back and clean up the ugly. It's usually, you know, no application's finished, it's just taken away from you and deployed. Mm-hmm. What can you do about that? Robert, thank you so much for your comment, and a copy of Music to Code By is on its way to you.

And if you'd like a copy of Music to Code By, write a comment on the website at dotnetrocks.com or on the Facebooks. We publish every show there, and if you comment there and I read it on the show, we'll send you a copy of Music to Code By.

Speaker 1

Yeah, musictocodeby.net is where you can get that if you don't win it yourself and you just want to buy it. We have the entire twenty-two-track collection in MP3, WAV, and FLAC versions. All right, so I guess it's time to introduce Damien. His bio has changed a little bit since the last time he's been on the show. Damien Brady is a developer, speaker, and author specializing in DevOps, MLOps, developer process, and software architecture. He's a senior manager on the Developer Advocacy team at GitHub, based in Brisbane, Australia. Formerly a cloud advocate at Microsoft for four years, and prior to that a dev at Octopus Deploy and a Microsoft MVP, Damien has over twenty years in software development and consulting in a broad range of industries. In Australia, he co-organized the Brisbane .NET User Group and launched the annual DDD Brisbane conference. Welcome back, Damien.

Speaker 3

Thanks really good to be back.

Speaker 4

I think we've interviewed you in every one of your jobs. I'm sure we did a show when you were with Octopus Deploy.

Speaker 1

Definitely, yeah, yeah.

Speaker 4

And definitely when you were at Microsoft, and now GitHub, although as I understand it now, the line between Microsoft and GitHub is almost nonexistent these days, isn't it?

Speaker 3

Look, it depends which part of GitHub and Microsoft you're in. But yes, yeah, exactly right. Yeah, there's some very interesting reporting chains going on at Microsoft and GitHub at the moment. But the short version is that we're much closer than we used to be. So there's people on my team, like right next to me on my team, who work for Microsoft and have the same manager as me, even though I'm employed by GitHub. So there's a lot of that kind of thing happening. It means that, you know, the stuff we're developing, particularly, you know, Copilot and all the dev tools, you know, we're working on them together.

Speaker 4

I guess sounds like our friend Martin Woodward has got a big, fat promotion.

Speaker 3

He is a VP of DevRel, and very recently that includes a huge number of developer advocates. Like your old job? Yeah, yeah, exactly. So half of the Microsoft DevRel team kind of reports up through him.

Speaker 4

Now, I've always been a big fan of Martin Woodward. I'm glad he's doing well.

Speaker 1

Me too, he is.

Speaker 3

I saw him just before NDC London, which we were both at, Richard, the other day.

Speaker 4

Yeah, you were a keynoter for that show.

Speaker 3

As I recall, I was, yeah, that was fun.

Speaker 1

I like the way Martin says bogs and to fix the bugs.

Speaker 3

I think the phrase is naughty. I think we get to say naughty a lot: those naughty bogs, right?

Speaker 4

Love me an Irish accent.

Speaker 1

So what are you thinking about these days in terms of software and the big picture? It seems to have changed a bit since you were last with us.

Speaker 3

Yeah, yes, well, yes, definitely, but in some respects, no. It was interesting listening to that, you know, the title of that last podcast I recorded with you guys, the Brownfield DevOps stuff, and I remember that distinctly. That was really all about taking code that did not have, you know, the safety nets around it and adding those safety nets in. And in that respect, I'm kind of talking about the same stuff these days. Sure, it's just instead of, you know, legacy code that nobody understands, that has no tests around it...

Speaker 4

Written by a person that's no longer at the company. Now it's just not written by a person at all.

Speaker 3

That's right. Yeah, we've evolved. We're not using the people anymore. Yeah, so it's a similar kind of thing, where, I mean, the way we write code largely has changed, depending on how much of the code you're physically typing out yourself.

And there's a huge range, everything from, you know, just using Copilot or Claude or Cursor or any of those things to, you know, just autocomplete the line you're on, all the way through to, hey, give me an app that does this and then ship it to prod. But either way, you know, that code is not coming out of your fingers directly. One could argue the things that run on the machine have not come out of your fingers directly for quite some time. But it's a little bit less deterministic now, you know.

Speaker 4

Yeah, so, you know, none of us has written a garbage collector any time recently, or any cryptographic library. You know, we depend on code all the time.

Speaker 1

Yeah, exactly. And that's really it: software development and coding is really about expression. You want to express your intent to the computer to do something and have it do it, and, you know, you express the instructions through code. But now we're at such a high level with, you know, the agents and all of that, that we're expressing to an agent, in English, what to do, and it's just another higher level of programming if you think about it.

Speaker 3

I think there's always been this level at which you don't quite understand what's happening underneath. I mean, you write your code, and if you understand what that code is supposed to do, then you can largely trust that the OS and the hardware and the platform and the framework and all the things you're using are going to do what you expect them to do. There was an interesting story told in this keynote as well, actually. It basically showed people some code, and it was, you know, setting some variables and then a try-catch statement, and then the catch, you know, would write some stuff out to the console. And, you know, this is .NET, you know, some C#. It took people, you know, a few seconds, but everybody kind of said, okay, well, great, it writes this value out.

And that's true. It absolutely does, unless you compiled it in September of nineteen ninety-five or something like that. Not nineteen ninety-five, twenty fifteen. I think there was a version of the .NET Framework, four point six, where they changed the just-in-time compiler. So if you compiled that in release mode with compiler optimizations on, sometimes it optimized out that catch statement.

Speaker 4

Well, nothing's more optimal than not running code. That's very optimal.

Speaker 3

That's true.

Speaker 1

Well, you know it's still deterministic though you just didn't know the rule.

Speaker 3

Yeah, right, right, And it's very very difficult to debug that if it only runs in an optimized release mode. But like to credit, they fixed it very quickly. However, you know, it did highlight that at some point you don't really know what's going to happen. I think the huge difference that we've got now is that all of that stuff is largely trustworthy. You know, if if that doesn't work for you, it's also not working for the

banker around the corner that has all your money. You know, your inventory system is probably the least of your worries right now. But with AI driven stuff, where you are giving it the intent, you actually don't know what code is running underneath, and so there's a huge gap, much more, much more of a significant gap between your intent and what actually comes out of the other side. And that's that's something that we probably and then need to think.

Speaker 1

And the nondeterministic nature of these agents means that, you know, you could express your intent once on a Monday and express that same intent on Tuesday, and get two completely different results, and one of them could be absolutely wrong.

Speaker 3

That's yeah, that's exactly right, all right.

Speaker 4

I've run two prompts simultaneously and gotten two different results back from the same prompt. Like, you don't have to wait. It'll give you something different every time.

Speaker 3

So if you show a programmer a line of code and say, what does this do, they will be able to tell you. If you show them a prompt, a single-line prompt, and say, what does this do, there's no way to know. You can guess. You know, the tools are getting better and a little bit more deterministic, but at some point, you know, especially when you're asking for large pieces of work, at some point you really just don't know what's going to come out.

Speaker 4

Right, So is this all about testing?

Speaker 3

Then?

Speaker 4

Like, is that really the thing we've got to get better at.

Speaker 3

I mean, that's definitely a large part of it. But there's only so much you can really do with tests, especially when asking for large pieces of code. There's kind of a sliding scale of, you know, how much you're asking an LLM or a tool for, and, you know, how close to your desired outcome it's going to get. If you ask for a single line of code that does this, it's very, very likely that that's going to be correct. If you ask for a function, it depends on the function, but it's pretty likely that you're going to get a good result as long as you've defined it correctly and you've given it a good prompt and it's got some good context. But if you ask for an entire application, you know, even if it's a relatively small one, you're probably going to have to iterate on that a handful of times.

Speaker 1

So I'm much more comfortable iterating than I am just, you know, giving it everything. And I didn't used to think that. Before I even started with this stuff, I was sitting down with Scott Guthrie at a conference somewhere, and we were both in agreement that you just, you know, use your imagination. It's so big, just think big and give it this great, big prompt, and, you know, if it doesn't work, it doesn't work, you just try again. And I don't think that anymore. I think small iterations are much better, especially because you can kind of track the progress and test things, you know, and if something doesn't work, you can say, no, no, no, no, that was wrong. Yeah, that's not what I meant. Erase that and try again.

Speaker 3

Yeah. That's how I'm comfortable doing it. And that means that all of your existing testing strategies, you know, TDD and stuff like that, you can still kind of do that if you can give it an entry point and an expected result. So: here is a function header, go implement that function for me, and I can write some tests around that to prove that it works. I can even get, you know, the tools to help me write those tests. I can get the tools to help me look for edge cases I hadn't thought of. But yeah, if you're asking for a web app to do X thing, then you don't know what shape that code is going to take, and so it's a little hard to write tests around that. At least, you know, very specific tests. You could probably write some broad functional tests, I guess, but even then they've got to match the shape of the code that comes out.
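The "function header plus tests" workflow described here can be sketched roughly as follows. This is an illustrative Python example, not something shown in the episode: the `slugify` function, its behavior, and the test cases are hypothetical. The idea is that the signature and the tests are pinned down by the developer, and the body is the part one might ask a tool to fill in and then check against the tests.

```python
import re
import unittest


def slugify(title: str) -> str:
    """Hypothetical function under test: lower-case, hyphen-separated slug.

    The signature and docstring are the "entry point" handed to the tool;
    the body below stands in for whatever implementation comes back.
    """
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


class SlugifyTests(unittest.TestCase):
    # Written by the developer up front; they encode the expected results.
    def test_simple_title(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_punctuation_and_padding(self):
        self.assertEqual(slugify("  .NET Rocks!  "), "net-rocks")

    def test_empty_input_edge_case(self):
        self.assertEqual(slugify(""), "")


if __name__ == "__main__":
    unittest.main()
```

Because the tests depend only on the function's name and signature, they stay valid however the generated body is written, which is what makes this style workable at function scale and harder at whole-application scale.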

Speaker 4

The token economy actually encourages this practice. I mean, the problem with writing the mother of all prompts is it takes the mother of all tokens, and then it's wrong and you've got to do it again.

Speaker 3

Yeah, a little bit. Yeah, I mean, behind the scenes, and this is not how you're charged necessarily, but behind the scenes there's usually a lot of iteration going on anyway, because it's very difficult to give it the mother of all prompts and then say, okay, go implement those thirty-five files in one cycle. I mean, the context window, even if it's huge enough, you know, you lose fidelity, I guess, once it gets...

Speaker 4

It reminds me of the guy who writes a giant stored procedure, that I had to swat with a newspaper. Bad! Don't! It's not good. Like, I feel like you're taking the architect's responsibility here to do a decomposition of the problem space, break it down into a series of steps, and then feed it to the machine to execute each piece in turn.

Speaker 1

And there's a danger here that you don't pay attention. You know, you're so eager to finish the feature and get it working and check it in that you don't check it out. You don't understand what it did. You didn't read through the code. And then you're trying to figure it out because there's a bug or something like that that you have to fix, and you think, oh, I'll just look through the code and see what it is. What's that? Oh, I didn't write that. Well, no, you didn't. Your agent wrote that. Oh, now I've got to figure that out at the wrong time. You should know what it does when it's written.

Speaker 3

And if it appears correct, like it seems to be doing the right thing, then that's great, but how correct is the code underneath? You are probably not looking at all the edge cases when you're just clicking through. It's kind of the old-school QA, where somebody had a spreadsheet of, you know, here are the things that it should do, and they just click on the buttons and say, yep, that's great, and then, yeah.

Speaker 1

And I like to be engaged in the process. Like, I like thinking about the architecture and what I need to do to implement a feature. Oh, this is going to have to go there, and then this guy's got to reference that, and then you're going to throw it over the wall to that guy and blah blah blah. You know, I kind of really like that because it keeps me engaged in the software itself. It'd be boring for me if I just handed all that over to an agent. So, like, there's a nice balance there that I've gotten myself into that works.

Speaker 3

Well, yeah, I agree. I work very similarly. So I have, right in front of me on the screen, actually, a little time zone helper. I'm in Australia, obviously, but a lot of the team, especially now that it's grown a bit, is scattered, kind of close by time zone wise but not exactly. And yes, there are some really good time zone tools on the internet, but I just wanted one where I could glance at it and say, for these time zones, this is what time it is right now.

That was largely a one shot using the Copilot CLI, but I still iterated on it four or five times to say, oh, this is great, but can you let me color code these things? And can you change it so these things line up instead? And so that's a relatively large piece to start off with. But then once I have that in front of me, I'm like, all right, here are the new things that I want, and what did you actually do to implement that? Yeah, but this is also, this is like a little toy that I have right in front of me. This is not a... Super personal piece of software, is what it is. You literally built a thing for you. Yeah.

Speaker 1

I have so many of those little tools, and I mention them from time to time, and I almost don't want to mention them because I don't want people to think I just sit around writing dumb tools all the time. Like, I have a professional software development life, trust me.

Speaker 3

Yeah, I have a friend, who I'm not going to name, but I have a friend whose daughter was going to look after their kid for a few days, and they wrote... The daughter is an adult.

Speaker 1

I got it.

Speaker 3

But they wrote a several-page document about all the stuff in the house and what the kid does, and what the kid needs, and all the things that they need for school and all that kind of stuff. But of course, you know, this person is not going to read that document. So they vibe coded up a little, like, chatbot, and you can ask anything you want and it pulls the information just using a pretty basic RAG and then it will answer the questions. And it took half an hour. Right. It's a fun, little, very specific tool for that person to say, hey, here's a solution to the problem I'm inevitably going to have. That's great. Yeah, those little tools are fantastic, to be able to do them even if you don't code. I mean, that's, you know, if you can't look at that code and see exactly what it's doing.
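To make the "pretty basic RAG" idea concrete, here is a minimal sketch of the retrieval half only. It is not the friend's actual tool: the notes, the function names, and the word-overlap scoring are all assumptions for illustration. A real tool would more likely use embeddings for retrieval and would send the resulting prompt to a language model, which is left out here.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks that share the most words with the question."""
    q = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble the text that would be sent to a language model (not called here)."""
    context = "\n".join(retrieve(question, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical house notes, standing in for the several-page document.
NOTES = [
    "School starts at 8:30 and pickup is at 3:15.",
    "The spare house key is in the kitchen drawer.",
    "Bedtime is 7:30 after a bath and one story.",
]

print(build_prompt("What time is bedtime?", NOTES))
```

The point of the sketch is that the model only ever sees the small slice of the document that looks relevant to the question, which is why even a long document works with a short prompt.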

Speaker 1

You know, we used to talk about expert systems back in the eighties, right? That was, you know, a more linearly coded thing that you could ask questions, and it had the knowledge of the business or whatever, and you could ask it questions and it would tell you. I mean, now this has sort of come to reality, and I like the fact that you could take all the business knowledge that you have. Or let's say somebody starts a small business and it grows, right, and this person has all the knowledge and everybody's always coming to them for advice, and how does this work, and all that stuff. That'd be perfect for a little RAG application.

Speaker 3

Yeah.

Speaker 1

Yeah, it's like a brain dump into a chat.

Speaker 3

But so the thing is, these little, you know, super personal little tools that you have, they're one thing, and writing tests and having, you know, CI around that, it's not, you know, super important. I mean, this is a little tool that you're running. But I've been around long enough to know that sometimes those little tools, you know, if you do them at work, get seen by a manager and then suddenly your machine is getting shipped to prod. So then, like, if this has been vibe coded, you haven't seen any of the code. How do you get from that to something that other people can work on and that you can trust, and that, you know, you've touched the edge cases, you know that it's secure, it's reliable? And that is... It's not a new problem, but it is a problem we can get to a lot faster than we used to.

Speaker 4

And much more of it. This used to be the Access guy who got out of hand, right, and suddenly they want to deploy it to one thousand users.

Speaker 3

I mean, one of the first jobs I had, there was a 4GL, I don't know if you ever touched that. It was basically... yeah, yeah, so a 4GL system that was running in production.

Speaker 1

Very high level language.

Speaker 3

Yeah. One person knew how to use it, back to your point on the legacy software. And when that person left, it's a black box, like, you don't know what's in this thing. Yeah. So it's not a new problem, but we can do it so much faster. Now we can get a running application, you know, in a minute or two. Yeah, really we can.

Speaker 4

And, Damien, I need to interrupt for one moment for these very important messages. And we're back. It's dot net Rocks. I'm Richard Campbell. That's Carl Franklin. Hey. Hey, hanging with our Aussie buddy Damien Brady, talking a little bit about, well, trying to make good software out of all this. Like, this sounds like the same thing that I did for Access apps, for dBase apps, for 4GLs. This sounds like a job for a consultant. Like, this is when you call in an expert to go through what you've got here and kind of make the assessments around security and reliability, all of those sorts of things.

Speaker 3

Yeah, I think so. I don't think there's necessarily anything wrong with the code that comes out. It's just, once it gets to a point where people trust it and need to run it in production and more people need to add to it, then suddenly you've got to start putting some rigor around that code again. So having, you know, the pipelines that you used to have when you hand coded everything, that's super important. And then, yeah, the testing around it, code coverage, things like that.

The good news, I guess, is that all these generative AI systems give us opportunities to help with that process as well. I was probably for a little while one of the few people who actually didn't mind hand coding GitHub Actions. You're it. You're the only one. Everybody else got a piece of... Yeah, but I don't do that anymore. Really, I mean, it's very, very easy to ask any of these tools, hey, can you write an action to do this, and check it? And, you know, you don't have to write YAML anymore if you don't really want to.

Speaker 1

Do you have to read it, though? Nobody wants to. That's my problem. I don't want to read it. I don't want to write it. I don't even want to see it.

Speaker 3

It's so readable though. Yeah, of all the formats... Yeah, we're not reading, at least. So, but that means that, you know, if you do want to create a, hey, I want to make sure I can build this and run the tests, that's a pretty quick prompt to say, get me a GitHub Actions workflow to do this.

Speaker 4

But if it's all about the prompt, I mean, don't you just have to say take this application, make it secure and maintainable.

Speaker 3

I mean, you'd get something out, I'm sure. I think what that actually would be is a bit of an issue. But I mean, even within GitHub, if you don't actively do that work... We've been using gen AI inside GitHub to do a bunch of things, like Dependabot. So that's been around for a while. It just looks at your dependencies and tells you whether there's any that have security vulnerabilities, and it will alert you to that. Right. But now it will also give you a PR to change that. It's kind of got a little bit better in that now, because sometimes upgrading the dependency isn't just a one-for-one replacement. You've got to actually, sure, go and find the code that you need to change, and it makes suggestions to say, here's a suggested code change that goes with that, you know, dependency update.

Speaker 1

Sometimes those dependencies are interdependent, so changing one requires that you change others.

Speaker 3

And even, I mean, there's other things in the whole GitHub Advanced Security suite where it'll look for potential issues in code and give you code suggestions to update them. And that's using gen AI in the back. So that's using Copilot, and that kind of gives you a little bit more rigor around that code that you're putting into prod. Code reviews is another one, so, you know, assign a PR to Copilot and it will do a code review for you. And there's plenty of other tools that do that as well, that'll look at GitHub PRs and, yeah, do that work as well, like third-party tools.

Speaker 4

I have to wonder how far away we are from having a really good code review bot like that is more thorough than any human.

Speaker 3

Yeah, it's interesting. There's there's I mean that's also dependent on your company's standards and what you care about with a code review as well, so it is a little bit personal. I know with Copilot code review, you can set you can give it custom instructions to say, hey, when you're doing a code review, here is some additional

context I want to give you. And pretty recently I think we changed that so you can give custom instructions that are different for when you're writing code and when you're reviewing code, and when you're, you know, doing the hands-off coding agent. So you can be a little bit more specific about what stuff you care about. So you could have some custom instructions in there to say, every time you do a code review, make sure that all of the code has testing around it and the code coverage...

Speaker 4

And all criticism wrapped in a compliment sandwich.

Speaker 3

I mean you're absolutely good? Yeah, I mean you really?

Speaker 4

You talk about the challenge. For me, as someone who's done a lot of code review, been around a lot of code reviewers: like, the best reviewers often are kind of a difficult personality because they're very detail oriented, right, and so actually having to present the results in a form that is palatable to the reviewee is important too. Like, whereas the LLM, with its own unwavering obsequiousness, like, this might be meticulous and report. Like, yeah, what more could you ask for in a code review?

Speaker 1

Yeah, you're talking about, like, the code review people. I imagine Alton Brown watching you cook, you know, and giving commentary. Why didn't you do that?

Speaker 4

I would wet myself if Alton was standing at my shoulder.

Speaker 3

I was cooking, like, there's no way, no way.

Speaker 1

Why didn't you temper those eggs?

Speaker 3

I mean, what's the matter with you? I asked GPT the other day, this was a comedy thing, I asked GPT the other day to abuse me when I gave it an error or something, and it refused. It wouldn't do it. Yeah, we did a kind of a middle ground where it told me the code was terrible, but it wouldn't abuse me. So I don't know, you might get...

Speaker 4

Billy Hollis in there because people abuse the software, all of them.

Speaker 1

Yeah, it's like I have Billy Hollis program that model.

Speaker 3

You must suck as a coder. Yeah, I guess he could do that. You could even like have a look at who committed that and then look them up in a database to see how they prefer their feedback.

Speaker 4

Yeah, it could be tailored.

Speaker 1

I have a tangent here, speaking of non sequiturs. Do you ever see a possible day when GitHub Copilot CLI, for example, could work against a local inference engine like Ollama?

Speaker 3

So I think you can do that with other tools in VS Code at the moment. I mean, yes, is the short answer. I can see that possibility for sure. One of the things that I know GitHub has been doing lately is trying to give you a bit more choice, even if it's not inside the GitHub ecosystem. So your Copilot subscription can now include, you know, Claude and Codex, and I think there's another one.

Speaker 1

Great. And so if it can reach out to Claude, it could reach out to Ollama just as easily. I mean, the real secret sauce there is knowing about your local repo and, you know, I can do my own check-ins and all that stuff.

Speaker 3

But yeah, I mean, you can. You can definitely call out to your own LLMs that are hosted in Azure, so you can say, I want to, you know, host a Phi-3, you know, language model in Foundry and then use that as my Cop...

Speaker 1

It's got to be a Microsoft URL, not a local URL. Yeah, because obviously it's a Microsoft thing.

Speaker 3

And yeah, I also I'm not one hundred percent sure when the show goes out, and I don't know everything about what's going on in engineering.

Speaker 4

So I guess just, like, moving... So it's coming out next week, we're pretty close to... It's basically, if you're invoking into Foundry, you have access to everything there. DeepSeek's there, like, you name it, right?

Speaker 3

Yeah, absolutely, Well.

Speaker 1

The issue I'm getting at is that we see a future where companies want my customers won't allow me to use a tool that shares its data across the internet in any way because they're so concerned about privacy and digital rights and all that stuff that they don't want their database schemas and their code, all that stuff going out. So we're going to have to use I'm going to have to use In one case, I have a customer coming up here that's going to demand it. It's already

in the contract. So unless I have access to a local LLM, I can't use an agent.

Speaker 3

Yeah, I can absolutely understand that as a requirement for organizations, I think. And I'm biased because I speak to GitHub customers, but I often get the question about, I need to run my LLM locally because I don't want to send to Copilot. And I ask them where their code is stored. It's on GitHub. Like, well, then we have your code already. I mean, if you're worried about that boundary, it's already there.

But I can understand, yeah, definitely, if there's some parts of your code that you do not want to expose outside at all, then running local models definitely works. I think that part of the issue is that the models that we tend to use by default, so the Claudes and the Codexes and the, you know, the rest of those, they are huge things that run on A100s, you know, in the cloud, and, you know, this show is very successful, but you don't have the money to run one of those in your basement, I don't think.

Speaker 1

Oh yeah, well, you know, the machine that I have running Ollama is very fast and it produces results instantly, in fact, even quicker than I can get out of the cloud. Yes, because I'm the only one using it.

Speaker 3

It definitely is no longer the case that you have to use the cloud ones to get a good result. I mean, for a while, you know, you could run these medium language models, I guess, or small language models, on your machine and get results. But they weren't very good. These days, they're pretty good. Like, you get to...

Speaker 1

Around a thirty billion model and it's fast. Yeah, they do it because they got.

Speaker 3

Nasty, stinky fast the boring part, without a doubt.

Speaker 4

But yeah, it does speak to a future of a programming team, you know, ten, fifteen people working together with a rack of their own compute, just from the token expense perspective, like, if you're really going to go grind on this stuff. This whole excuse of, I didn't get my feature finished because I ran out of tokens...

Speaker 3

That's got to end. There's got to be a solution to that one, right? Yeah, and that is actually one of the selling points of these tools that GitHub, as well as Microsoft, as well as a bunch of other companies, have been trying to push: this idea of abundance, where you can fire off a problem to ten different agents and then see what the best result is. But you're right, that does cost tokens. Yeah, and I mentioned before that you don't necessarily get billed by the number of, you know, cycles that you go through, and that's true, but you're still using those tokens behind the... Somebody is still using those cycles behind the scenes.

Speaker 4

You remind me, and I think I said it last show too. Like, I come from the OLAP world, where we used to build these huge cubes and do analysis on them. And back then there were several different data mining strategies, but all of them were computationally complex enough that you would literally spend weeks figuring out which one to run. And then at some point, when the cloud came along, it's like, you know what an easy answer is? Run them all, and then have another test to figure out which one was best.

Yeah, you know, I think at one point we tested, there was like twenty-two different strategies of data mining on this data set. So just... and you can get there with the software development side too, where it's like, yeah, there's a bunch of different ways to try and solve this particular problem.

Speaker 1

Just try them all, try them all, especially getting an agent to help you with that.

Speaker 3

Yeah, the fan out model of kind of like get five agents to try and solve it and then come back and see which one is the best. Yeah.
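The "run them all, then have another test pick the best" idea can be sketched in a few lines. This is an illustration under assumptions, not how any particular product does it: the two "agents" are stubs that return canned code strings, and the scoring function simply counts how many known test cases each candidate passes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def fan_out(
    task: str,
    agents: list[Callable[[str], str]],
    score: Callable[[str], float],
) -> str:
    """Run every agent on the same task in parallel and return the best-scoring result."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        candidates = list(pool.map(lambda agent: agent(task), agents))
    return max(candidates, key=score)


# Stub agents (hypothetical): each returns a candidate implementation as text.
def agent_a(task: str) -> str:
    return "def add(a, b): return a - b"  # plausible-looking but wrong


def agent_b(task: str) -> str:
    return "def add(a, b): return a + b"


def passing_tests(candidate: str) -> float:
    """The 'other test': count how many known cases the candidate gets right."""
    namespace: dict = {}
    exec(candidate, namespace)  # acceptable here only because the stubs are trusted
    add = namespace["add"]
    cases = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
    return sum(add(*args) == expected for args, expected in cases)


best = fan_out("write an add function", [agent_a, agent_b], passing_tests)
print(best)
```

The selection step is the part that matters: fanning out is only useful if there is an automatic way, such as a test suite, to tell the candidates apart. Every extra agent still spends tokens, as noted above.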

Speaker 4

Yeah, that's crazy, man. And, I mean, right now, where there's all this conversation about GPU constraints and compute costs and so forth, like, that's clearly not what we're going to do at the moment, but you can see a world like that.

Speaker 3

The irony is, though, that to get the best results, sometimes you have to break things down into just the smaller parts. So I think we've mentioned this before, but breaking it down into the smaller tasks and then firing those tasks off. So the idea of, you know, implement this grand feature and just use one, you know, agent and one, like, back-and-forth cycle, that's not feasible. You're not going to get a good result.

So there's probably a happy medium here where you can break it down at a level that's going to give you a good result and then fire that work off, and maybe use a smaller model to do those initial pieces of work locally. So I know that there's definitely work behind the scenes to do things that give you a bit more control over what models are used for different pieces of work. So, you know, if you want to create a specification document, maybe you don't need the gigantic, huge thinking model. Maybe you just need a smaller one that costs a lot less money, that's quicker, and you can iterate a bit more.

Speaker 1

That seems like something that could be a preprocessor, right that could look at your prompt and determine which model to send it to.

Speaker 3

Yeah, well, honestly, that's what happens these days. When you send something to Gemini or Claude or ChatGPT, the first iteration really is, what does this person want to do and where do I send that information? So when you're, like, MCP for example, which has become really, really popular, that kind of first initial, what does this person ask for? It's a, hey, here are the tools that you've got available to you, and here's what they do. And here's the question that the person asked. What tools do I need to solve that problem? And the skills. Yeah, the skills or whatever else you want to call it. I guess, yeah, winning around. But yeah, that first iteration can be done with quite a small model. And I think there's some of that that happens locally in VS Code. So if you have a look at the folder structure, I think I did this a little while ago.

There is, like, a small model that sits there that does some of that work up front, so you don't necessarily have to send that information.

Speaker 4

And you can literally do this with confidence expressions too, right? It's like, I run it through a small model, and it may be a complicated request, but it's such a known function, it brings back a chunk of code with ninety-something percent confidence. It's like, do not continue. But if it comes back at sixty percent, all right, kick it up into a higher model and see if we can get to a higher percentage.

So you're only consuming as much compute as you absolutely need, recognizing all those small models are far cheaper by the time you get to the big one. And if you needed that one time, you know, the time is worth it.
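The escalation idea described here, try the cheap model first and only kick the request up when confidence is low, can be sketched as a simple cascade. Everything below is an assumption for illustration: the models are stubs that return a made-up confidence number, and the 0.9 threshold is arbitrary. In practice a model's self-reported confidence is not necessarily well calibrated, so the number would need validating before being trusted.

```python
from typing import Callable

# A "model" is any callable that returns (answer, confidence between 0 and 1).
Model = Callable[[str], tuple[str, float]]


def cascade(prompt: str, models: list[Model], threshold: float = 0.9) -> tuple[str, float, int]:
    """Try models from cheapest to most expensive; stop at the first confident answer.

    Returns (answer, confidence, index of the model that produced it). If no model
    clears the threshold, the last (largest) model's answer is returned.
    """
    if not models:
        raise ValueError("at least one model is required")
    for index, model in enumerate(models):
        answer, confidence = model(prompt)
        if confidence >= threshold:
            break
    return answer, confidence, index


# Stub models (hypothetical): the small one is only confident on simple requests.
def small_model(prompt: str) -> tuple[str, float]:
    return ("small answer", 0.95 if "rename" in prompt else 0.6)


def large_model(prompt: str) -> tuple[str, float]:
    return ("large answer", 0.92)


print(cascade("rename this variable", [small_model, large_model]))
print(cascade("refactor the billing module", [small_model, large_model]))
```

With these stubs the first request never touches the large model, while the second one escalates, which is the "only as much compute as you need" behavior.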

Speaker 3

And they don't necessarily need to be language models. They can be a bit more specific, you know, the old school, what we used to do with machine learning before we just threw everything at an LLM, where, you know, it's a little bit more specific to the task. Yeah, this person asked this question, here are the tools, what do you need to do? You don't need to have absorbed the entire Internet's text to answer that question. Yeah, I keep waiting.

Speaker 1

We were just talking about this last week.

Speaker 4

Yeah, sure, I keep waiting for a completely software development centric model to be built. It's not good at anything else except software, and even then you'd probably want to tailor it down. Hey, I only need one that's good at C#, JavaScript, and SQL.

Speaker 3

Yeah. I think the sticking point at the moment with those is that if you're asking for things in natural language, you get a huge shortcut just by absorbing the world's natural language. And then the pattern matching means that it kind of can interpret that really well, and then there's the coding bit. But yeah, I mean, you could break that up into two pieces. One that's really good at interpreting what's required, another one that's really good at writing code.

I'm not deep enough into the engineering side of Copilot to, you know, know what they're doing along those lines, but that sounds totally...

Speaker 4

I know from some of the papers that I've read, there's lots of conversation about specialist models and an orchestrator in front that makes that assessment and then sends to the various specific models, in order. It's almost like a MapReduce: break it up, fire it off to multiple entities, roll it back together.
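The orchestrator-plus-specialists shape can be sketched as a map step and a reduce step. This is a toy under stated assumptions: the decomposition into labeled subtasks is taken as already done (in a real system that would be the orchestrator model's job), and the specialists are stub functions rather than actual models.

```python
from typing import Callable

Specialist = Callable[[str], str]


def orchestrate(
    subtasks: list[tuple[str, str]],
    specialists: dict[str, Specialist],
    fallback: Specialist,
) -> str:
    """Map each (kind, text) subtask to its specialist, then reduce the results in order."""
    results = [specialists.get(kind, fallback)(text) for kind, text in subtasks]
    return "\n".join(results)


# Stub specialists (hypothetical stand-ins for narrow models).
def sql_model(text: str) -> str:
    return f"[sql] {text}"


def csharp_model(text: str) -> str:
    return f"[csharp] {text}"


def general_model(text: str) -> str:
    return f"[general] {text}"


plan = [
    ("sql", "add an index on orders.customer_id"),
    ("csharp", "add an endpoint that lists orders"),
    ("docs", "update the readme"),
]
print(orchestrate(plan, {"sql": sql_model, "csharp": csharp_model}, general_model))
```

Anything without a matching specialist falls through to the general model, so the orchestrator can always return something for every piece.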

Speaker 3

Yeah. I suspect that's the future for sure. Yeah.

Speaker 4

Yes, it's back to that thing you said we couldn't do, which was sending the mother of all prompts. Like, the fact is, if this architecture works, you'll send the mother of all prompts out and it'll carve it up for you.

Speaker 3

Yep. So there's some really good projects that help you do that. I mean, the problem with sending the mother of all prompts is we have never been wonderful at defining exactly what we want, to people, even. So by the time you get the requirements document and then the specification document and then the, you know, the functional spec list to a programmer, something has been lost.

Speaker 4

And the English language especially is you know, almost intentionally vague. Yes, so often it's tough to be clear of what you intend once you get past the first paragraph.

Speaker 3

Yeah. Like, we say the LLMs are nondeterministic, but I mean, really, human language is nondeterministic. So even if you could send that mother of all prompts and get, you know, exactly what you've asked for back, there's going to be some assumptions made along the way, and then you've probably got some things wrong. So there's projects like Spec Kit, or just that technique, I guess, which kind of walk you through that process of, okay, so what are the requirements that you have? Here's a big markdown document, look through that, make some changes however you want, either directly yourself or just asking the agent to do that work for you. That's also in plan mode in VS Code as well.

And the idea is, you know, you are defining it as accurately as you can, and you're working kind of with the agents to do that, and then you move on to the specification of, this is what technical implementation we need, and it's still using large language models and coding tools to do that work or to help you do that work. But ultimately what you end up with is these quite well defined individual tasks that then you can get good results from when you send them to a coding agent. They're small pieces of work. You can put tests around them, you can examine what comes back from each of those pieces, and then, you know, that gives you a little bit more rigor around what code ends up coming out.

Then even if you send a mother of all prompts to something like the Copilot coding agent, or if you send an issue to that agent, that's kind of what it's doing in the background as well. It says, right, I've got this issue, I've got this description that the person's given me. Let's break that down into requirements, let's break it down into its specification. Let's define a bunch of tasks to do, and then let's do those tasks. And I think we've found that you get a much better result doing it that way than you would otherwise.

Speaker 4

I've certainly seen a lot of folks now starting to use the tools to write the prompts before running the prompts, and, you know, I wonder if we shouldn't be making more of this confidence equation visible as you go along. It's like, hey, I don't want to actually attempt to execute this prompt until it's in the ninetieth percentile of confidence. So let's keep iterating on it until we can clear up the vagaries that will probably, you know, send the tools down the wrong path, until you're really confident this is an awesome prompt that has a very good chance of producing code that's useful.

It's interesting, right? Like, that's a whole can of worms. We go back and forth. But it's something I think I'd done as a PM anyway, where, you know, the customer gave me a set of requirements, and it's like, wow, these look like requirements, except that you didn't actually clarify anything here. Let's go work through the... yeah.

Speaker 3

That kind of gets to the core of the problem: if the people doing the work don't actually care a lot about the quality of what's coming out, then you can't make them care. And that's again not a new problem. It's just, you can get a lot further without caring these days, I think, a lot faster. And so, yeah, and again, that's kind of where you can put some rigor in.

So if you have these CI builds and these quality checks and gates, you can even do gates, like, with AI in them to say, hey, can you just make sure that this has been looked at, and this has been looked at? And it's vague enough that the large language model will be able to interpret, you know, yeah, it doesn't look like they've looked at accessibility on this page at all.

Those things can be used as well. So having that pipeline that we've always had, even if it's kind of AI enhanced, can mean that if you do have people who don't really care about the quality of what comes out, you still have those checks and balances in the way.
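A quality gate of the kind described here boils down to "collect checks, block if any failed." The sketch below is hypothetical throughout: the check names are invented, and `ai_review_stub` is a keyword match standing in for what would really be a language model reading the change and judging whether a topic was considered.

```python
from dataclasses import dataclass


@dataclass
class Check:
    name: str
    passed: bool
    detail: str = ""


def ai_review_stub(pr_description: str, topics: list[str]) -> list[Check]:
    """Stand-in for an LLM reviewer: flags topics the description never mentions."""
    text = pr_description.lower()
    return [
        Check(
            name=f"considered {topic}",
            passed=topic in text,
            detail="" if topic in text else f"no mention of {topic}",
        )
        for topic in topics
    ]


def gate(checks: list[Check]) -> tuple[bool, list[str]]:
    """Pass only if every check passed; also return the names of the failed checks."""
    failed = [check.name for check in checks if not check.passed]
    return (not failed, failed)


checks = [Check("build", True), Check("tests", True)]
checks += ai_review_stub("Adds login page. Security reviewed.", ["security", "accessibility"])
ok, failed = gate(checks)
print("gate passed" if ok else f"gate blocked: {failed}")
```

The conventional checks and the AI-assisted ones go through the same gate, so the pipeline stays the single place where a change can be stopped.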

Speaker 4

I appreciate that. And you're again leveraging the strength of these tools to be meticulous and good at going down checklists and catching all those details, so that you can be sure of things. When the tool gets through all of the checklists and they're all, you know, green checks, you've done something good.

Speaker 3

Yeah, yeah, that's exactly right. Yeah. I mean, they're powerful tools. Using them for the right reasons in the right way is, you know, the way we should be doing it. Just, like, you know, one-sentence prompt, it looks okay, YOLO it into prod, that's not really the right way to use them.

Speaker 4

You're going to find out in a hurry you were wrong.

Speaker 1

That's right. Damien, is there anything we missed that we should bring up before we sign off?

Speaker 4

That?

Speaker 3

Nothing huge. I just wanted to jump on that comment about the confidence intervals that Richard had. I feel like that's something you could absolutely add into your custom instructions or your AGENTS.md file or whatever it is that you use, to say, each time I ask you to help me define something, can you give me a confidence interval at the end to tell me how accurate you think you are? And that being a, you know, standing order would mean it's just in the prompt. Yeah, that kind of sits in there.

Speaker 4

It's instead of looking at the developer and seeing if they look confused or not as your prompt isn't that good?

Speaker 3

You know?

Speaker 4

Now you have the software giving you a number. Although if you're smart, you just make it into a set of faces too, right? It goes, oh, well, that's the sixty percent, look confused. You could, yeah. Think about how much we already do this when you're trying to pass instructions along, and it's just like, do you get this? Are you on board? Like, we need to do this to the software too.

Speaker 1

I still wish these things would say I don't know the answer when they don't instead of making stuff up.

Speaker 3

Yeah, you can be a little explicit about that. And again, custom instructions, AGENTS.md, that kind of thing, to say, look, if, you know, you're not absolutely sure about something I've asked, then please prompt me to define that a bit more, or ask something to clarify. Yeah. But then, I mean, you sometimes end up in a loop where it tries to be far too helpful and you're constantly fielding questions, yeah, and you almost have to interrupt and say, no, I think you've got it. Just go.

Speaker 4

You've had enough detail, now go work. But you know what, you've also had that employee that, you know, rather than do the work, would like to just keep asking questions.

Speaker 1

I've had to press Control-C a couple of times, you know, when, you know, Claude in GitHub Copilot CLI says, ah, I see the problem, and it's completely wrong, you know, and then it goes off trying to fix it. Control-C, break. No, you're wrong about that. And then it comes back with, of course, you're exactly right. Yeah, yeah, you're right.

Speaker 4

Turn the shining is down.

Speaker 1

It's been great talking to you, Damien, and come back again soon.

Speaker 3

Absolutely, I won't wait until the next job this time.

Speaker 1

I think definitely. We have a lot to talk about, all right, and we'll talk to you.

Speaker 5

Next time on dot net rocks.

Speaker 1

Dot net Rocks is brought to you by Franklins.Net and produced by PWOP Studios, a full service audio, video and post production facility located physically in New London, Connecticut, and of course in the cloud, online at pwop dot com. Visit our website at D O T N E T R O C K S dot com for RSS feeds, downloads, mobile apps, comments, and access to the full archives going back to show number one, recorded in September two thousand and two. And make sure you check out our sponsors.

They keep us in business. Now go write some code. See you next time.

Speaker 4

And

Transcript source: Provided by creator in RSS feed: download file