AI tools for software engineers, but without the hype – with Simon Willison (co-creator of Django)

Sep 25, 2024 • 1 hr 13 min

Episode description

The first episode of The Pragmatic Engineer Podcast is out. Expect similar episodes every other Wednesday. You can add the podcast in your favorite podcast player, and have future episodes downloaded automatically.

Listen now on Apple, Spotify, and YouTube.

Brought to you by:

Codeium: Join the 700K+ developers using the IT-approved AI-powered code assistant.

TLDR: Keep up with tech in 5 minutes

On the first episode of the Pragmatic Engineer Podcast, I am joined by Simon Willison.

Simon is one of the best-known software engineers experimenting with LLMs to boost his own productivity: he’s been doing this for more than three years, blogging about it in the open.

Simon is the creator of Datasette, an open-source tool for exploring and publishing data. He works full-time developing open-source tools for data journalism, centered on Datasette and SQLite. Previously, he was an engineering director at Eventbrite, joining through the acquisition of Lanyrd, a Y Combinator startup he co-founded in 2010. Simon is also a co-creator of the Django Web Framework. He has been blogging about web development since the early 2000s.

In today’s conversation, we dive deep into the realm of Gen AI and talk about the following: 

• Simon’s initial experiments with LLMs and coding tools

• Why fine-tuning is generally a waste of time—and when it’s not

• RAG: an overview

• Interacting with ChatGPT's voice mode

• Simon’s day-to-day LLM stack

• Common misconceptions about LLMs and ethical gray areas 

• How Simon’s productivity has increased and his generally optimistic view on these tools

• Tips, tricks, and hacks for interacting with GenAI tools

• And more!

I hope you enjoy this episode.

In this episode, we cover:

(02:15) Welcome

(05:28) Simon’s ‘scary’ experience with ChatGPT

(10:58) Simon’s initial experiments with LLMs and coding tools

(12:21) The languages that LLMs excel at

(14:50) Getting started with LLMs: learn the theory first, or play around?

(16:35) Fine-tuning: what it is, and why it’s mostly a waste of time

(18:03) Where fine-tuning works

(18:31) RAG: an explanation

(21:34) The expense of running tests (evals) on LLMs

(23:15) Simon’s current AI stack 

(29:55) Common misconceptions about using LLM tools

(30:09) Simon’s stack – continued 

(32:51) Learnings from running local models

(33:56) The impact of Firebug and the introduction of open-source 

(39:42) How Simon’s productivity has increased using LLM tools

(41:55) Why most people should limit themselves to 3-4 programming languages

(45:18) Addressing ethical issues and resistance to using generative AI

(49:11) Are LLMs plateauing? Is AGI overhyped?

(55:45) Coding vs. professional coding, looking ahead

(57:27) The importance of systems thinking for software engineers 

(1:01:00) Simon’s advice for experienced engineers

(1:06:29) Rapid-fire questions

Where to find Simon Willison:

• X: https://x.com/simonw

• LinkedIn: https://www.linkedin.com/in/simonwillison/

• Website: https://simonwillison.net/

• Mastodon: https://fedi.simonwillison.net/@simon

Referenced:

• Simon’s LLM project: https://github.com/simonw/llm

• Jeremy Howard’s fast.ai: https://www.fast.ai/

• jq programming language: https://en.wikipedia.org/wiki/Jq_(programming_language)

• Datasette: https://datasette.io/

• GPT Code Interpreter: https://platform.openai.com/docs/assistants/tools/code-interpreter

• OpenAI Playground: https://platform.openai.com/playground/chat

• Advent of Code: https://adventofcode.com/

• Rust programming language: https://www.rust-lang.org/

• Applied AI Software Engineering: RAG: https://newsletter.pragmaticengineer.com/p/rag

• Claude: https://claude.ai/

• Claude 3.5 Sonnet: https://www.anthropic.com/news/claude-3-5-sonnet

• ChatGPT can now see, hear, and speak: https://openai.com/index/chatgpt-can-now-see-hear-and-speak/

• GitHub Copilot: https://github.com/features/copilot

• What are Artifacts and how do I use them?: https://support.anthropic.com/en/articles/9487310-what-are-artifacts-and-how-do-i-use-them

• Large Language Models on the command line: https://simonwillison.net/2024/Jun/17/cli-language-models/

• Llama: https://www.llama.com/

• MLC chat on the app store: https://apps.apple.com/us/app/mlc-chat/id6448482937

• Firebug: https://en.wikipedia.org/wiki/Firebug_(software)#

• NPM: https://www.npmjs.com/

• Django: https://www.djangoproject.com/

• SourceForge: https://sourceforge.net/

• CPAN: https://www.cpan.org/

• OOP: https://en.wikipedia.org/wiki/Object-oriented_programming

• Prolog: https://en.wikipedia.org/wiki/Prolog

• SML: https://en.wikipedia.org/wiki/Standard_ML

• Stable Diffusion (Stability AI): https://stability.ai/

• Chain of thought prompting: https://www.promptingguide.ai/techniques/cot

• Cognition AI: https://www.cognition.ai/

• In the Race to Artificial General Intelligence, Where’s the Finish Line?: https://www.scientificamerican.com/article/what-does-artificial-general-intelligence-actually-mean/

• Black swan theory: https://en.wikipedia.org/wiki/Black_swan_theory

• Copilot workspace: https://githubnext.com/projects/copilot-workspace

• Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems: https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321

• Bluesky Global: https://www.blueskyglobal.org/

• The Atrocity Archives (Laundry Files #1): https://www.amazon.com/Atrocity-Archives-Laundry-Files/dp/0441013651

• Rivers of London: https://www.amazon.com/Rivers-London-Ben-Aaronovitch/dp/1625676158/

• Vanilla JavaScript: http://vanilla-js.com/

• jQuery: https://jquery.com/

• Fly.io: https://fly.io/

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].



Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe

Transcript

Every programmer who works with these models, the first time it spits out like 20 lines of actually good code that solves your problem, does it faster than you would, there's that moment when you're like, hang on a second, what am I even for? And then I tried this new feature of ChatGPT that they launched last year called Code Interpreter mode. And I asked a question and it flawlessly answered it by composing the right SQL query, running that using the Python SQLite library, and spitting out the answer. What am I even for? Like, I thought my life's purpose was to solve this problem.

But, I think, it's a potential threat. It is scary when you think, okay, I earn a very good salary because I have worked through the trivia of understanding Python and JavaScript, and I am better with that trivia than most other people. And now you've got this machine that comes along, and it's better with the trivia than I am. I feel like there's a pessimistic and an optimistic way to look at it. The optimistic version: I can use these tools better than anyone else for programming. I can take my existing programming knowledge, and when I combine it with these tools, I will run circles around somebody who's never written a line of code in their life.

Welcome to the Pragmatic Engineer Podcast. In this show, we cover software engineering at Big Tech and startups from the inside. You'll get deep dives with experienced engineers and tech professionals who share their hard-earned lessons, interesting stories, and practical advice they have on building software. After each episode, you'll walk away with pragmatic approaches you can use to build stuff, whether you're a software engineer or a manager of engineers.

In this first episode, we go into a really timely topic: using GenAI for coding. Now, there's no shortage of AI companies hyping up their capabilities, but we'll slice through all of that. I turned to longtime software engineer Simon Willison, who it's safe to refer to as an independent researcher of large language models, because he's been using them so much to improve his personal productivity for the last four years.

With Simon, we have a refreshingly honest conversation on how these tools actually work for us developers as of now. We talk about common LLM use cases like fine-tuning and RAG, Simon's day-to-day large language model stack, and misconceptions about large language models. This is the first episode of many such deep dives to come. Subscribe to get notified whenever a new episode is out.

So Simon, welcome to the podcast. Hey, it's really great to be here. It's great to have you here. You're an experienced software engineer, and you've definitely been around the block. Some people will know you from your prolific open source contributions: co-creating the Django framework, which is a rapid web development tool written in Python.

You're also the creator of Datasette, a tool for exploring and publishing data. And then you're also a startup founder: I remember you were the founder of Lanyrd, a conference directory site, which was funded by Y Combinator and acquired by Eventbrite. And then you were there for six years, as an engineer and as a manager. So you've really done all the things: open source, founder, working at a large company.

I got to do all of it. The startup-to-large-company thing is particularly interesting, you know, moving from the speed of a startup to the speed of a much larger company, where bugs matter and people lose money if your software breaks.

And when I started to notice you more was around the time ChatGPT came out, and you were very hands-on in trying out how this works for your development workflow. You shared a lot of things on your blog. And really, this is what we're going to talk about today: firsthand learnings about how this AI-enhanced development helps your specific workflow, where it doesn't help, and what you've learned through this. How many years has it been, two, three years?

Well, I was on GPT-3 before ChatGPT came out, so I'm verging on three years of using this stuff frequently. It got exciting when ChatGPT came out. GPT-3 was interesting, but ChatGPT is when the whole world started paying attention to it. To kick off, I'm interested in how you got started with these large language model tools. What was the first time you came across them and you were like, all right, let me have a go?

So I've been paying attention to the field of machine learning as a sort of side interest for five or six years. I did the fast.ai course, Jeremy Howard's course, back in, I think, 2018.

And then GPT-2 came out in, was that 2019? Yeah, it was 2019. GPT-2 was the first of these models where you could see there was something interesting there, but it was not very good. You could give it text to sort of complete a sentence, and sometimes it would be useful.

There was a thing I did back then where I tried to generate New York Times headlines for different decades, by feeding in, say, all the New York Times headlines in the 1950s, then the 1960s and 1970s, and then giving it stories to complete. I poked around for a bunch, and the results were not exactly super exciting.

I lost interest at that point, to be honest. And then GPT-3, which came out in 2020 but sort of began to be more available in 2021, that's when things started getting super interesting, because GPT-3 was the first of these models that was large enough that it could actually do useful things.

One of the earliest code things I was using it for was jq, the little JSON query language, which I've always found really difficult; it just doesn't quite fit in my head. And I was finding that GPT-3 could do it if I prompted it in the right way. And this was a model where you had to do completion prompts. You don't ask a question and get an answer. You write something like: "The jq program needed to turn this input into this output is:"

And then you stop, and you run it through the model, and it finishes the sentence. Which I think is the reason most people weren't playing with it: it's a weird way of interacting with something. In many ways, the big innovation of ChatGPT was that they added a chat interface on top of this model. And so you didn't have to think in terms of completions.

You could ask a question and get an answer back. But yeah, it was very clear back then, and I was running with it for about 12 months before ChatGPT came along, that there was something really interesting about this model and what it could do. And that was also the point where it became clear that code was actually something it was surprisingly good at.
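The completion-style prompting Simon describes can be sketched in a few lines. This is illustrative only: the prompt text and JSON are made-up examples and no API is called; it just shows the difference in shape between a completion prompt and a chat prompt.

```python
# Completion-style prompting (GPT-3 era): you write the start of a
# statement and the model's job is to finish it.
completion_prompt = (
    'The jq program needed to turn {"items": [{"name": "a"}]} '
    "into a list of names is:"
)
# The model would continue the sentence, e.g. with: .items[].name

# Chat-style prompting (ChatGPT era): the same task phrased as a
# question in a message list, which is what made the interface click
# for most people.
chat_messages = [
    {
        "role": "user",
        "content": 'Write a jq program that extracts every "name" '
                   'from {"items": [{"name": "a"}]}.',
    }
]

# The trailing "is:" is what invites the model to complete the sentence.
print(completion_prompt.endswith("is:"))
```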

And I talked to somebody who worked on these models, and I asked them: were you expecting it to be good at code? And they said, you know, we thought maybe, but it wasn't one of our original goals. The original goals of these models were much more things like translation from one human language to another, which they do incredibly well. But when you think about it, the fact that they can write code well isn't that surprising, because code is so much simpler than English or Chinese or German.

Yeah, when you put it together, I think it's pretty obvious. And I think we'll talk about implications, but let's jump a little bit ahead. I personally had a "wow, this is amazing" moment with LLMs, and I've also had a bit of a scared moment: could this actually replace part of what I do? And you had a really interesting story with a proper "this is scary" moment. Can you talk about that?

I mean, I've definitely had a few of those. I think every programmer who works with these models, the first time it spits out like 20 lines of actually good code that solves your problem, does it faster than you would, there's that moment when you're like, hang on a second, what am I even for?

But I had a bigger version of that with my main open source project. I build this tool called Datasette, which is an interface for querying databases and analyzing data, creating JSON APIs on top of data, that kind of stuff.

And the thing I've always been trying to solve with that is, I feel like every human being should be able to ask questions of databases. It's absurd that everyone's got all of this data about them, but we don't give them tools that let them actually, you know, dig in and explore it and filter it and try to answer questions that way.

And then I tried this new feature of ChatGPT that they launched last year called Code Interpreter mode. This is the thing where you can ask ChatGPT a question, it can write some Python code, and then it can execute that Python code for you and use the result to continue answering your question.

And Code Interpreter mode has a feature where you can upload files to it. So I uploaded a SQLite database file to it, just the same database files that I use in my own software. And I asked a question, and it flawlessly answered it by composing the right SQL query, running that using the Python SQLite library, and spitting out the answer.

And I sat there looking at this, thinking: on the one hand, this is the most incredible example of being able to ask questions of your data that I've ever seen. But on the other hand, what am I even for? Like, I thought my life's purpose was to solve this problem, and this new tool is solving my problem without even really thinking about it. They didn't even mention that it can do SQLite SQL queries as part of what it does; it just knows Python.
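What Code Interpreter did under the hood is roughly the following: open the uploaded SQLite file with Python's sqlite3 module, compose a query, and read back the answer. The table and question here are hypothetical stand-ins, not the actual database from the episode.

```python
import sqlite3

# Stand-in for the uploaded .db file (an in-memory database here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plants (id INTEGER, species TEXT)")
conn.executemany(
    "INSERT INTO plants VALUES (?, ?)",
    [(1, "oak"), (2, "oak"), (3, "maple")],
)

# User question: "How many oak trees are in this database?"
# The model composes the SQL, runs it, and reads back the result.
(count,) = conn.execute(
    "SELECT COUNT(*) FROM plants WHERE species = ?", ("oak",)
).fetchone()
print(count)  # 2
```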

That was fun, and also a little bit of existential dread. The way I've been coping with it is thinking, okay, well, my software needs to be better than ChatGPT Code Interpreter at this particular problem, maybe by mixing AI features into it.

So I've started exploring what plugins for my software would look like that add large language model features, like building a SQL query against this schema, all of that kind of stuff. But it's interesting: it very much changed my mental model of the problem I was trying to solve, because it took such a big bite out of that problem.

This episode was brought to you by Codeium, the AI tool of choice for professional software developers; that is C-O-D-E-I-U-M. Codeium powers team development through a suite of state-of-the-art AI capabilities, available as extensions in all of your favorite IDEs, such as VS Code, JetBrains, Visual Studio, Eclipse, Xcode, Neovim, Jupyter notebooks, and more.

Uniquely, Codeium is fully enterprise-ready; as proof, it has multiple regulated Fortune 500 companies counted within its thousands of enterprise customers. Join the 700,000 developers using Codeium's individual free plan, and ask your company to consider a free trial of the enterprise plan. To learn more about Codeium, visit codeium.com/pragmatic; that is C-O-D-E-I-U-M dot com slash pragmatic.

What I noticed is you have been experimenting a lot with how different LLMs work. You've been running models locally, and obviously trying a lot of the usual suspect tools, but even beyond that.

Can you share a little bit about your initial impressions? Because you were already on the early versions of these tools, from ChatGPT to Copilot to some other things. And how has your stack changed or been refined to actually make you more productive? Because it sounds like you are more productive.

Yes, very much so. I've been calling myself an independent researcher when it comes to this kind of stuff, because I've got the time to dig into these things, and I write a lot; I've been blogging about this since I first started investigating it. With GPT-3, I was basically using it through their Playground interface, which still exists today; it's the API debugging tool for this stuff.

And it was fine. I experimented with having it write documentation, but I've always felt a bit funny about publishing words that I didn't write, because I do so much writing myself, and little bits and pieces of code. But I didn't really get into the coding side until after ChatGPT came out and I did Advent of Code that December, the sort of month-long programming challenge.

This was 2022, right? Yes, November 30th is when ChatGPT came out. And so I spent December trying to learn Rust with its assistance, which was interesting. I got a reasonably long way, though I still don't know Rust. The memory management in Rust is just difficult enough that language models still have trouble with it. One of my tests of a new language model is: OK, can it explain Rust borrowing to me?

They're getting to a point where I'm almost understanding it, but it's an interesting sort of stress test. Whereas if you use these models for JavaScript and Python, they're phenomenally good; there's so much more training data about JavaScript and Python out there than there is for a language like Rust. And that's great for me, because the languages I use every day are Python and JavaScript and SQL, and those are the three languages the language models are best at, so I'm perfectly positioned to have these things be useful and helpful for me. I also tend to pick boring technology, like Django, which the language models know already. If you stick with Django, they're going to be able to do pretty much anything you ask of them. So I tried learning Rust, and that was a really good exercise for just every day

trying these things out and seeing what could happen. One of the key things I've learned, that I think people don't necessarily acknowledge, is that these things are really difficult to use. It's not just skill; there's a lot of intuition you have to build up in order to use them effectively. If you just sit down and ask a question like you'd ask on Stack Overflow, you'll probably not get a great response. And a lot of people do that, and then

they write the whole thing off. They're like, okay, it didn't give me what I wanted; this is all hype; there's no value here. The trick is, firstly, you have to learn how to prompt them. More importantly, you have to learn what kinds of things they're good at and what kinds of things they're bad at. I know, because I've spent so much time with them, that Python and JavaScript they're great at; Rust, they're not quite as good at yet. I know that you shouldn't ask them about

current events, because they've got a training cutoff in terms of what they understand. I know that they're terrible at mathematics and logic puzzles. Don't ask them to count anything, which is bizarre, because computers are really good at maths and counting and looking things up, and language models, those are the three things they're not good at, even though they're supposedly our most advanced computers. So you have to build this quite intricate

mental model of what these things can do and how to get them to do those things. And if you build that mental model, if you put the work in, you can scream along with them. You can work so quickly at solving specific problems when you say, oh, this is the kind of thing a language model can do, and then you just outsource it to, I call it my weird intern sometimes.

And there are other things where you're like, okay, well, it's not even worth trying that on a language model, because I know from past experience that it won't do a good job with it. So as software engineers, we do have a bit of an engineering mindset, and when you see a new technology, I mean, clearly this is here, it's not going away. But there are two ways you can look at it. One is what I think you kind of

explained: you start playing with it, you try stress testing it, you see where it works and where it doesn't. And the other one is you start from the theory: you understand how it's built, how it works, what's behind the scenes, and then you start probing. You know,

I think this is a little bit the way computer science is taught. When I went to study computer science at university, we started with algebra and some formal methods and languages, and coding, we only got there by the end. And you're like, well, yeah, I guess I now know what happens under the compiler. But

obviously there's the other route as well. In your view, it sounds like you kind of jumped straight into "let me see how this actually works" and didn't overthink the theory. Right now, if you start with the theory, it will hold you back. With this specific technology, it's weirdly harmful to spend too much time trying to understand how they actually work before you start playing with

them, which is very unintuitive. I have friends who say that if you're a machine learning researcher, if you've been training models and stuff for years, you're actually at a disadvantage when you start using these tools, compared to if you come in completely fresh, because they're very weird. They don't react like other machine learning models you'd expect. And machine learning people always jump straight to fine-tuning, and fine-tuning on these things

is mostly a waste of time. It takes a lot of people a long time to get to the point where they realize, you know what, there's no point in fine-tuning my own custom version of this. Just to break down fine-tuning, because I think we hear this word a lot: by fine-tuning, you mean that you take the model and then you add more training to it, you run more training cycles. And it's a very confusing term, because

yeah, so the idea with fine-tuning is you take an existing model. It might be one of the openly licensed models, or, I think Claude has this now, and OpenAI has APIs where you can upload a CSV file of a million examples, and spend a lot of money with them, and they will give you a model trained on that. And it sounds so tempting. Everyone's like, wow, I could have a model that's perfectly attuned to my specific needs. It's really difficult

to do. It's really expensive. And for most of the things that people want to do, it turns out it doesn't actually solve the problem. Lots of people think: I want the model to know about my documentation, my company's internal documentation; I want to ask questions about that; surely I fine-tune a model to solve that. That, it turns out, just plain doesn't work, because the weight of all of the existing knowledge the model has completely overwhelms anything that you try and

add into it with fine-tuning. The models hallucinate more on questions about things if you've done that extra fine-tuning step to add knowledge, which is a surprising thing. Where fine-tuning does work is for tasks. If you want a model that's just really good at SQL, you can give it 10,000 examples of: here's a human question, here's the SQL schema, and here's the SQL query, and that will make your model stronger at that kind of activity. But for adding new facts

into the model, it just doesn't work, which confuses people. And so then you have to look at the other techniques for solving that problem. There's a thing called RAG, which is a very fancy acronym for a very simple trick. It stands for retrieval-augmented generation. All it means is: the user asks a question, you search your documentation for things that might be relevant to that question, you copy and paste the whole lot into the model (and these models can take quite a lot of input now), and then

you put the user's question at the end. That's it. Right, super, super simple. I didn't get how simple it is at first. I actually wrote an article about it, and one of the people who guest-wrote it built an open source tool to do your own RAG pipeline, and you could plug in ChatGPT, and I tried it. The code itself was very simple, and I was like, is this all there is to it? You just break it up into chunks, you get some embeddings so you can

figure out where your search will land you, and then you just add in that extra context. Obviously you can go down the rabbit hole, but for simple RAG, you mostly just decide on the context window size. And I was amazed at how well it worked. As you said, it seems so simple. I looked at the code and thought, I'm not expecting much, and when I tried it out, it worked really well. So it's one of those,

I feel, counterintuitive things. Yeah. So RAG, it's the "hello world" of building software on top of LLMs. You don't get it to print hello world; you get it to answer questions about your documentation. It can be like 30 lines of Python; I've got one version that's like two dozen lines of Bash, I think. It's very easy to get the basic version working. But getting good RAG working is incredibly difficult, because the problem is that if you built the system and you

know how it works, you're naturally going to ask questions of it in the right kind of format. The moment you expose it to real human beings, they will come up with an infinite quantity of weird ways that they might ask questions. And so the art of building good RAG systems, the reason that it could take six months to actually get it production-ready, is figuring out all of these different ways that it can go wrong. And the key trick in RAG is always: how do we fill that context?
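The basic loop being described — search the docs, fill the context, append the question — can be sketched like this. It's a deliberately naive version: retrieval here is keyword overlap rather than embeddings, and the documents are made-up examples, but the shape of the trick is the same.

```python
# Minimal RAG sketch: pick the doc chunks that best match the question,
# paste them into the prompt, and put the question at the end.
DOCS = [
    "Datasette publishes SQLite databases as a JSON API.",
    "To install Datasette run: pip install datasette",
    "Django is a Python web framework.",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank docs by words shared with the question; keep the top k.
    Real systems use embeddings here -- this is the 'fill the context' step."""
    q_words = set(question.lower().replace("?", "").split())
    return sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))[:k]

def build_prompt(question: str) -> str:
    context = "\n".join(retrieve(question, DOCS))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("How do I install Datasette?")
# This prompt string is what would be sent to the model.
print("pip install datasette" in prompt)  # True
```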

How do we pick the information that's most relevant to what the user is asking? Which is really hard. It's an information retrieval problem; it's what search engineers have been trying to figure out for 30 years, and there's a lot of depth to that field. So RAG, just like everything else in language models, is fractally interesting and complicated. It's simple at the top, and then

each little aspect of it gets more and more involved the further you look. One of my favorite difficult problems in this is what's called, in the industry, evals: automated evaluations. When you're writing software, we write automated tests. We write unit tests, and they tell us if our software works, and that's great. You can't do that with language models, because they're

non-deterministic. They very rarely return exactly the same answer. So we don't even have unit testing. But with things like RAG, we need to have automated tests that can tell us: okay, we tweaked our algorithm for picking content; is it better? Does it do a better job of answering questions? It's really difficult. I'm still trying to figure out the right

path to this myself. And I saw what someone was working on at an AI company. The weird thing, which just breaks all that we know, is they have this eval test suite that runs against their model. Whenever they make a change, they run it, and they told me: it costs us $50 to run this every single time. This is just something we haven't been used to. Like, I run my tests as a software engineer, my unit

tests, my integration tests, and I know how much time they cost me. But suddenly, they're using paid APIs from whichever vendor. It feels like this used to be a thing before my time, back when there were, you know, servers or mainframes and computing time was expensive, but suddenly this is yet another interesting variable. Yes, so you don't want to run those on every commit to your repository; that will bankrupt you

pretty quickly. Yeah. A funny thing with evals: one of the most common techniques is what's called LLM as a judge. So, if you're trying to say, okay, I'm building a summarizer, here's an article I want to summarize, here's the summary: how can you write tests against a summary to check that it's actually good? What a lot of people do is they outsource that to another model. They produce two summaries, and then they say, hey, GPT-4, which of these two

summaries is best? And I find that so uncomfortable. This stuff is all so weird and difficult to evaluate already, and now we're throwing in another layer of weird language models to try and give us a score on our previous language models. But these are the kinds of options that we're exploring at the moment. Yeah, that's interesting. Speaking of options: you've experimented a lot with trying out different tools, including building

your own, and obviously Copilot and other models. I saw you mentioned Claude, for example, is something you're playing with. What is your current LLM stack, and day-to-day, how do you use it for actually coding on Datasette or your other projects? So my default stack right now: my default model is Claude 3.5 Sonnet, which is brand new; it came out maybe three weeks ago. I heard it's amazing for coding. It's amazing for everything. It is the first time somebody

who's not OpenAI has had the clearly best model. It's just better than OpenAI's best available models at the moment. The team behind it, the company behind it, Anthropic, are actually a splinter group from OpenAI. They split a couple of years ago, and apparently it's because they tried to get Sam Altman fired, which you can't do. Right, we saw this happen again maybe six months ago. But they were early: two and a half years ago, they tried to get Sam Altman out

and fired. It didn't work. They quit and spelt up their own company. They were some of the people who built the built GPT-4. So it's actually the sort of GPT-4 original team. But anyway, Claude 3.5 Sonnet is unbelievably good. It's my default for most of the work that I'm doing. I still use GPT-4O, which is open AI's probably their best available model for mainly because mainly for two features. It's got code interpreter mode. This thing where it can write Python code and then execute that

Python code. Sometimes I'll throw a fiddly problem at it and watch it try five or six times until it works; I just sit there and watch it go through the motions. So I use that a

lot, and then ChatGPT has the voice mode, which I use when I'm walking my dog. You can stick in a pair of AirPods, go for an hour-long walk with the dog, and talk to this weird AI assistant and have it write you code, because it can do Code Interpreter and it can look things up on the internet and such. So you can have a very productive hour-long conversation while you're walking the dog on the beach. I was not expecting that, of all things. Yeah. That's

the most dystopian sci-fi future thing as well. And this isn't the fancy new voice mode they demoed a few weeks ago; this is the one they've had for like six months. It's so good. The voices — wow. It's like having a conversation with an intern who can go and look things up for you. And then, so you mentioned the stack, but if I imagine your setup — you've got your terminal, your editor. There's more to my stack.

So those are the ones I'm using in my browser and on my phone. I use GitHub Copilot; I've always got that turned on. And I've built my own command-line tool called LLM. Just a question on Copilot: which features do you use? Because it now has a chat window if you want to use that, and it has autocomplete. Which ones do you find most useful for your use case? Mostly

autocomplete, like old-school Copilot. I've recently started using the thing where you can select some lines of code, there's a little sparkly icon, you can click that and give it a prompt to run against those lines of code, and it'll do that. I don't use the chat window at all; I use Claude 3 in the browser for what I would use that for. And it's great.

You know, Copilot's another interesting one where you get people who say: I turned it on and it just gave me a bunch of junk, so I turned it off again because it's clearly not useful. And again, with Copilot you have to learn how to use it. There's no manual for any of this stuff, especially not for Copilot. You have to learn things like: if you type out the start of a function name and give it carefully named parameters with type annotations,

it will complete the function for you. And if you add a comment — you essentially learn to prompt it through the comments that you write. Yeah, I've actually started

to use that. Again, no one tells you this, but once you figure it out it can be really useful, because that's how you can generate either a small part of a function or a whole function; it just gets it. And it's not surprising, but the more context you give in the comment, the more it'll do what you write, if you're lucky.
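To make that concrete, here is the kind of Copilot-friendly scaffolding being described: a typed signature plus a descriptive comment. The completed body below is one plausible completion written by hand for illustration, not Copilot's actual output:

```python
from datetime import date

# Return the number of whole days between two dates, ignoring order.
def days_between(start: date, end: date) -> int:
    # With the typed signature and the comment above in place,
    # Copilot will typically propose a body much like this one.
    return abs((end - start).days)
```

The more specific the comment and the annotations, the closer the suggestion tends to be to what you meant.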

I think the other thing to know about Copilot is that it's actually running RAG. It's got an incredibly sophisticated retrieval mechanism where every time it does a completion for you, Copilot tries to include context from nearby in your file, but it also looks for other files in your project that have similar keywords in them. So that's why it sometimes picks things up from your tests.
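Copilot's actual retrieval mechanism is undocumented, but the "similar keywords" idea can be sketched as a toy similarity score over identifiers. Everything here — the crude tokenizer, the Jaccard score — is illustrative, not Copilot's real algorithm:

```python
import re

def identifiers(source: str) -> set:
    # Crude tokenizer: pull out identifier-like words from source text.
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source))

def similarity(file_a: str, file_b: str) -> float:
    # Jaccard similarity over shared identifiers - a toy stand-in for
    # whatever keyword matching Copilot actually does (undocumented).
    a, b = identifiers(file_a), identifiers(file_b)
    return len(a & b) / len(a | b) if a | b else 0.0

editing = "def days_between(start, end): return (end - start).days"
test_file = "def test_days_between(): assert days_between(d1, d2) == 30"
readme = "Installation instructions for the project"
# The test file shares identifiers with the code being edited,
# so it would rank higher as extra context for a completion.
print(similarity(editing, test_file) > similarity(editing, readme))
```

A scorer like this is enough to explain the observed behavior: files that mention the same names as the one you're editing get pulled in as context.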

That's really interesting, because — we're going to get to the misconceptions, but we've been running an AI survey, and one of the things people really complain about is: I use Copilot because it's the one that's easy to turn on in your IDE, but it only uses my files, and I wish it would look at the project or understand the whole project. It's interesting you say that, because I think a lot of people don't realize that it is trying to do that

in smart ways. A lot of people assume it only looks at whatever you're seeing on the screen. No, it is looking at bits of other files, but it's undocumented and it's weird, and it's trying to do semantic similarity and all of that sort of stuff. What I do a lot is sometimes I'll just copy and paste a chunk of one file into a comment in another, so that

it's definitely visible to Copilot. That's great for things like writing tests: you can literally copy the code that you're testing into your tests.py, then start writing the test, and it will understand it much better. You know, when you said you need to learn how to use it — it sounds like you're coming at it from the other direction, instead of trying it out once and saying yea or nay. And I guess since you're working for yourself, it makes sense that you

want to make yourself more productive, and you figure out how these things can actually do that. Right, absolutely, and it's so much work. I think the biggest misconception about all of this is that you'll get this tool and it'll make you productive on day one, and it absolutely won't. You have to put in so much effort to explore and experiment and learn how to use it, and there's no guidance. Like I said, Copilot doesn't have a manual, which is crazy.

To its credit, Claude is the only one of these things that actually has documentation that's really good. If you want to learn how to prompt LLMs, the Anthropic prompting guide is the best thing I've seen anywhere. OpenAI have almost nothing. There are so many hypesters and blogs and tweets and LinkedIn posts full of junk advice — all of the things like "always tell it that you are the world's greatest expert in X" before you ask.

Most of that is rubbish. But there's so much superstition, because this stuff isn't documented, and even the people who created the models don't fully understand how they do what they do. It's very easy to form superstitions. You try the "you're the world's greatest expert in Python" thing and you get a good answer, so you're like, okay, I'll do that from now on.

It's kind of like if your dog finds a hamburger in a bush: every time you walk past that bush for the next two years, they will check for a hamburger, because dogs are very superstitious. It's that, but for software engineering. And then, going back to your stack — yeah, there are a few more. I talked about Code Interpreter. One of my favorite Claude features, again a feature from a few weeks ago, is called Artifacts, which is this thing

where Claude can now write HTML and CSS and JavaScript and then show it to you in a little secure iframe, so it can build you tools and interfaces and prototypes on demand. It's quite limited: it can't make API calls from in there, and it can't actually see the results,

so it doesn't have that debug loop that Code Interpreter has. But still, it's amazing. I've redesigned pages on my blog by pasting in a screenshot of my blog and then saying, suggest a better color scheme for this and show me a prototype in an artifact, and it did. So cool.

So I'm doing a lot more frontend stuff now, because I can get Claude to build me little interactive prototypes along the way to help speed that up. So I'm spending a lot of time with that. And I have my command-line tool, LLM, which lets you run prompts from the command line. The key feature is

that you can pipe things into it, so I can cat a file into it and say "write tests," and it will output them. Just to understand — with this command-line tool, are you running a local model or a model server? LLM, the tool, is based around plugins, and it can talk

to over a hundred different models. Is this your own tool? Yes, it's my big open source language model project. Oh, awesome — we'll link it in the show notes as well. So it's plugin-based. Originally it could just do OpenAI, and then I added plugins, and

now it can run local models and talk to other models too. I mainly use it with Claude, because that's the best available model, but I've also run Microsoft's Phi-3 and Llama and Mistral and things. I can run those locally, which, to be honest, I don't do on a day-to-day

basis, because they're just not as good. The local models are very impressive, but the really high-end, best-of-the-best models run circles around them. So when I'm trying to be productive, I'm mostly working with the best available models. I love running the local models

for research and for playing around, and they're also a great way to learn more about how language models actually work and what they can do. People talk about hallucination a lot, and I think it's really useful to have a model hallucinate at you early, because

it helps you build a better mental model of what these things can do, and the local models hallucinate wildly. So if you really want to learn more about language models, run a tiny one — some of them are two- or three-gigabyte files that you can run on a laptop. I've got one that runs on my phone,

which is actually really surprising. Yeah, there's an app called MLC Chat, and it can run Microsoft's Phi-3 and Google's Gemma, and it's got Mistral 7B. These are pretty capable models, but if you ask them, say, "Who is Simon Willison?", they will make things up. I love that.

Yeah, I use ego searches to basically see how much they hallucinate. They'll say he was the CTO of GitHub, and I'm like, well, I really wasn't, but I do use GitHub. But I use these on planes; they're good enough at Python that I can use them to look up little bits of

API documentation that I can't remember, things like that. And it runs on your phone, which is really fun. Awesome. So, looking back, you've now been coding for more than 20 years, right? Yeah, depending on how you count it — professionally, people have been paying me for 20 years at

this point. So throughout that time, we have seen some jumps in productivity, be that Firebug coming out for developers or other things. Could you talk through what the bumps were when you became more productive

as a developer, and then, when we get to LLMs, how this bump compares to those? I love that you mentioned Firebug, because that was a big bump, right? Yeah, Firebug was the Chrome dev tools before browsers had them built in. It was an extension for Firefox

that added essentially what you'd recognize as the developer tools now, and it was an absolute revelation when it came out, especially for me, because I've spent most of my career as a Python programmer. My favorite feature of Python is the interactive prompt. I love being able to code by writing a line

of code, hitting enter, and seeing what it does. You end up copying and pasting a bunch of those explorations into a file, but you know it's going to work because you worked it out interactively. Firebug instantly brought that to JavaScript: suddenly you could interactively

code against a live web page and figure things out that way. So that was a big one. And I think it's worth a reminder, because some listeners were not necessarily around, but before Firebug, I was in web development, and the way you debugged your JavaScript applications,

which were pretty simple at the time, was with alerts. We didn't even have console.log — the console was invented by Firebug. So it was just really painful and really hard to debug, and you also couldn't really inspect the elements you were changing. It was like doing it

in the dark, and as you say, it was a game changer. These days the Chrome developer tools are better than what Firebug used to be, but Firebug was almost as good as the Chrome developer tools of today, in my memory at least. So it was this huge leap, and for frontend developers

it's hard to tell how much more productive it made you, but I'm sure it was at least twice the productivity, because it used to take so much longer to fix things or to understand why things were happening. So yeah, that was a big jump. So Firebug was big, but the biggest productivity boost in my

entire career is just open source generally. When I started out 25 years ago, you had to really fight to use anything open source at all. A lot of companies had blanket bans on anything open source; Microsoft were making the case that this was a very

risky thing for you to even try. That's completely gone out of the window. I don't think there's a company left on earth that can have that policy now, because how are you going to write any frontend code without npm? So it was open source as a concept,

and I was very early on in open source. Django — we open-sourced Django in 2005 — Python and PHP and so forth all came out of the open source community, and that was huge, because prior to open source, the way you wrote software was you sat down and implemented the same thing that everyone

else had already built, or, if you had the money, you bought something from a vendor. But good luck buying a decent thing, and then of course you can't customize it because it's proprietary. And then, on top of open source as a concept, it really was GitHub coming

along that massively accelerated open source, because prior to that it was SourceForge and mailing lists and CVS and Subversion, and just starting a new project — I started open source projects where I had to begin by installing Trac, which meant I needed to run a virtual private server,

then get Linux secured, and then install all the open source tooling. Trac was great software, but it was not exactly a one-click experience. So open source was absolutely huge, and then you had GitHub making open source way more productive and accessible and massively

accelerating it. Then the package managers — PyPI for Python and npm for JavaScript, and the OG of that was CPAN for Perl, which was up and running in the late 90s. We owe so much to CPAN and how it made that kind of thing happen. But today,

the productivity boost you get from just being able to pip install or npm install a thing that solves your problem — my hunch is that developers who grew up with that already in place have no idea how much of a difference it makes. When I did my software engineering degree

20 years ago, one of the big challenges everyone talked about was software reusability: why are we writing the same software over and over again? At the time, people thought OOP was the answer: if we do everything as classes in Java, then we can subclass

those classes, and that's how we'll solve reusable software. With hindsight, that wasn't the fix. The fix was open source: having a diverse and vibrant open source community releasing software that's documented and that you can package and install. That's been

incredible. The cost of building software today is a fraction of what it was 20 years ago, purely thanks to open source. It's interesting, because when we talk about developer productivity — it's a topic that will come back, and obviously it's very

popular and very important for people in leadership positions who are hiring a certain number of people, and whose CEOs will ask how those people are being used. Right now there's a big push to say that GenAI is adding this and this much productivity, but two

things are interesting. One is that we don't really talk about how much just having open source — not having to build it all ourselves — adds; I guess we just take it for granted. And the other thing I want to ask you is how much more productive you think you are with this current

workflow of yours, which sounds pretty advanced: you're using a bunch of different tools, you spend a lot of time tweaking them, so I'm going to assume you're one of the software engineers using this stuff more efficiently for your own personal productivity. How do you feel —

how much more productive does this make you? And there's a caveat here: obviously it's hard to be honest about yourself. The good thing is, right now we don't have any reliable polls. AI vendors will obviously have a bias to say it's

helping more, and people who don't like these tools might have a bias to say it's not even helping them. So I think the best sense we can get right now is from people like you honestly assessing themselves. Yeah, okay, so I think I've got

two answers to this. It's difficult to quantify, but my guess for a while has been that I've had a giant productivity boost in the portion of my job which is typing code at a computer. I would estimate I am two to three times more productive — faster

at turning thoughts into working code — than I was before. But that's only 10% of my job. As a more senior software engineer, the typing-in-the-code bit is not the bulk of it: you spend way more time researching and figuring out what the requirements

for the thing are, and all of those other activities. So: huge boost for typing code. And it does speed up a lot of the other activities too, the research activity in particular. If I need a little JavaScript library to solve a particular

problem — and I have a bias towards boring technology anyway — I'll ask Claude or GPT-4, and I always ask for options: give me options for solving this problem. It spits out three or four, and then I can go and look at those. It's effectively a slightly

better, slightly faster, more productive Google search, because you can say things to it like, okay, now show me example code that uses that option, and if you're using Claude Sonnet you can say, show me an interactive prototype of that option. So all of that research happens

more quickly for me. There's a whole bunch of those smaller productivity boosts. The bigger one, the more interesting one for me, is that I can take on much more ambitious projects, because I'm no longer limited to the things I already know all of the trivia about. I feel like

this is one of the most important aspects of all of this: if you want to program in Python or JavaScript or Go or Bash or whatever, there's a baseline of trivia that you need to have at the front of your mind. You need to know how for loops work and how conditionals work and all that kind of

stuff, and so I think there's a limit on the number of programming languages most people can work in. I've found personally I cap out at about four or five programming languages, and if I want to start using another one, there's potentially a month-long spin-up for me to get started,

and that means I won't do it, right? Why would I use Go to solve a problem if I have to spend a month spinning up on Go when I can solve it with Python today? That is gone. I am using a much wider range of programming languages and tools right now, because I don't need to know how for loops

in Go work. I need to understand the higher-level concepts of Go, like memory management and goroutines and all of that kind of stuff, but I don't have to memorize the trivia. And so I've actually shipped Go code to production, despite not being a Go programmer, just about six months

ago. It's been running happily every day, and it has unit tests and continuous integration and continuous deployment and all of the stuff I think is important for code. And I could do that because the language model could fill in all of those little trivia bits for me.

This episode is sponsored by TLDR. TLDR is a free daily newsletter covering the most interesting stories in tech, startups, and programming. Join more than one million readers and sign up at TLDR.tech. That is TLDR.tech. I sometimes dread going back to certain side projects where it takes me a while

to spin up and remember, and it's in a language or an outdated framework that I just don't want to touch. And it's like what you said: the confidence is higher, and I can just paste parts into ChatGPT or turn on GitHub Copilot, because I know what good looks like. So

I think that's the key — you do need to have that experience. If I were a brand-new programmer, I don't think I'd be using it to write Go despite not knowing Go. But I've got 20 years of experience; I can read code that it's written in a language I don't know

very well and still make a pretty good evaluation of whether it's doing what I need it to do and whether it looks good. And that's a pretty important disclaimer, right: most of these languages, as long as it's an imperative language,

you can read. I think it would be different with languages that are not as popular, like Prolog and SML and some of those. Yeah, I would not trust myself to just look at Prolog code it had written me and make a judgment as to whether

that was good Prolog code, but I feel like I can do that with languages like Go and Rust. Yeah. So with that — and by the way, thanks for sharing — I think it's great to see that you're getting real productivity gains, but it also took a lot of work. The big takeaway for me would be: anyone who's trying this out, put in the work and experiment to figure out what workflow works for you, because there are no ready answers. I mean, you've

been experimenting a lot more than most people have, and it still sounds like it's a work in progress. Good. Now I really want to touch on misconceptions and doubts — they might not be misconceptions, but doubts and questions that a lot of people have about these tools. Let's talk

about resistance a little bit, because I see so much resistance to this, and it's a very natural and very understandable thing. This stuff is really weird, it is uncomfortable, and the ethics around it are so murky. These models were trained

on vast quantities of unlicensed, copyrighted data. Whether or not that's legal — I'm not a lawyer, I'm not going to go into that — but the morals, the ethics of it, especially when you look at things like image models: Stable Diffusion is now being used

where you would have commissioned an artist instead, and it was trained on that artist's work. I don't care if that's legal; that's blatantly unfair, right? And then there are people who tried it out and it didn't work that well for them,

plus they don't want to use it because they disagree with it fundamentally, and honestly, I respect that position. I've compared it to being vegan in the past. With veganism, I think there's a very strong argument for why you should be a vegan,

and I understand that argument, and I'm not a vegan, so I have made that personal ethical choice. All of this stuff does come down to personal ethical choices. If you say, I am not going to use these models until somebody produces one that was trained entirely on licensed data, I

absolutely respect that; I've just not made that decision myself. And for the code stuff, it's basically trained on every piece of open source code they could get hold of, but it is ignoring the license terms — the GPL licenses that say attribution is

important. You can't attribute what comes out of a model, because it's been scrambled with everything else. So yeah, there are the ethical concerns, which I completely respect. But then there's also — it's scary, right? It is scary when you think: okay, I earn a very good salary because I have worked

through the trivia of understanding Python and JavaScript, and I'm better with that trivia than most other people. And now you've got this machine that comes along, and it's better at the trivia than I am. It knows the things that I know — I mean "knows" in scare quotes.

That is disconcerting, and I feel like there's a pessimistic and an optimistic way of taking this. The pessimistic way is saying: okay, I'd better go into the trades, I need to learn plumbing, because my job is not going to exist in a few years' time.

The optimistic version, the version I take, is: I can use these tools better than anyone else for programming. I know I can take my existing programming knowledge, and when I combine it with these tools, I will run circles around somebody who's never written a line of code in their life

and is trying to build an iPhone app using ChatGPT. I can just do this stuff better. So we've essentially got these tools that are actually power-user tools: you have to put a lot of work into mastering them, and when you combine expertise in using the tools with

expertise in a subject matter, you can operate so far above other people — the competitive advantage you get is enormous. That's the thing that actually worries me most about the resistance: I like the people who are resisting this stuff. I like that they're not falling

for the hype, I like that they care about the ethics of it, I like that they're questioning it. It would upset me if that put them at a serious professional disadvantage over the next few years, as other people who don't share their ethics start being able to turn out more stuff because

they've got this additional tool. It's like if you were to say, I don't like search engines, I'm never going to search for an answer to my programming problem — that would set you back enormously right now, and I feel like it's in a similar kind of space to that.

Yeah. So another opinion I hear a lot is: well, it seems like this whole technology has plateaued. If we look at the past 18 months — GPT-4 is okay, Claude might be a little bit better, Sonnet, okay, cool — but, ignoring that for a second, GitHub Copilot

hasn't changed all that much. So I do see a sense, especially among people who are managing engineers and also playing with these tools, of saying: well, it sounds like this is what it's going to be; we'll just use it as is. Is this all? You're more in the weeds — do you see drastic improvements, or little improvements? That's a really interesting question. From my perspective, I'd kind of welcome a plateau at this point; it's been a bit

exhausting keeping up with this stuff over the last two years. I feel like if there were no improvements — if what we have today is what we're stuck with for the next two years — it would still get better, because we'd all figure out better ways to use it. One of my

favorite advances in language models is this thing called chain-of-thought prompting. This is the thing where, if you say to a language model, "solve this puzzle," it'll often get it wrong, but if you say, "solve this puzzle, think step by step," it'll then say: okay, step one this, step two that,

step three, and often it'll get it right. The wild thing about chain-of-thought prompting is that it was discovered against GPT-3 about a year after GPT-3 came out, in an independent research paper saying: hey, it turns out if you tell this model to think step by step,

it gets better. Nobody knew that. The people who built GPT-3 didn't know; it was an independent discovery. We've had quite a few examples like this, and so if we are in a plateau, I think we'll still get lots of advances just from people figuring out

better ways to use the tooling. A lot of this also comes down to whether or not you buy into the whole AGI thing, and so much of the mainstream coverage does. It's kind of like Tesla's self-driving cars: you've got the CEOs of these companies going out and saying we're

going to have AGI in two years' time, it's coming, nobody will have to work again — which helps you raise a lot of money, but it also scares me. I'm not convinced that human economies will work if all knowledge work is replaced by AI, and it

also gives a very unrealistic idea of what these things can do. And don't forget it's also happening with software engineers: there are companies out there whose pitch is, we will replace software engineers with AI engineers, which is a very direct claim. Although I'm now starting to see

a pattern of how this is really good for fundraising, because it implies a huge potential market — and don't forget, that's who they're talking to. Once they raise the money, they have that money and they can operate, and often, like in this case with Cognition AI,

the claims get toned down to the point where it's pretty much a Copilot-style tool. But it is scary, because you see this claim in the mainstream media everywhere — I think someone said we are replacing our own jobs as software engineers.

And as you said — I think it's the first time I've seen that written in the press; maybe it happened before I was born, but not recently. It's funny, isn't it? Who would have thought that AI would come for the lawyers and the software engineers and the

illustrators and all of these professions you don't normally think of as automatable? But yeah, the AGI thing also leads to disappointment: people say, well, I asked it this dumb logic puzzle and it got it wrong — how is this AGI? It also ties into

science fiction: everyone thinks about The Matrix and Terminator and all of that kind of stuff. Honestly, the key problem here is that these things can talk now. They can imitate human speech, and throughout human

society, being able to write well and convincingly has always been how we evaluate intelligence. But these things are not intelligent at all — they can just write really well. They can produce very convincing text, which kind of throws everyone off. So yeah, if you're captured

by the AGI hype, then I think, yes, we're going to have a plateau. I'd be very surprised if we had anything that was AGI-like. And like I said, I've not been sold that it would be a net win for humanity; I don't know how society would cope with that.

but if we what we are seeing is incremental improvements like Claude 3.5 sonnet is a substantial incremental improvement over GPT 40 and Claude 3 opus and the andthropic the interesting thing about Claude 3.5 sonnet is that it's named sonnet because their previous Claude

3 had three levels those high coup sonnet and opus high coup was the cheap one sonnet in the middle opus was the really fancy one they're clear they they have said they're going to release high coup 3.5 which will be cheap and amazing and opus 3.5 which is going to be a step up from sonnet those

I try to ignore the it's coming soon those ones I am excited about in terms of it's coming soon but yeah so if you're buying into the agi stuff then I I don't buy into it I don't think you get to agi from autocompleteting sentences no matter how good you are at autocompleteting sentences

and in terms of the plateau, incremental improvements are enough for me. I just want to be faster.

Yeah, if we look back through history, I'm a little bit skeptical of the idea that suddenly fundamental things would change in the software industry. People sometimes project that this time it will be very different, and again, there's always innovation, but looking back, we've always had innovation: some new technologies and then incremental improvements. So pattern matching, that would be the logical expectation. Obviously there are black swan events, like who could have seen COVID coming, and this is also a breakthrough, but we're not in a vacuum; it's not just this one event. And AGI has been predicted to be just around the corner by different people since the start of computing, really.

To be fair, something I think about a lot is the impact of TikTok and YouTube on professional video creation. The iPhone is a really great video camera, and TikTok and YouTube mean you can now publish videos to the entire world, and that has not killed off professional video: people who work professionally in that industry are doing fine. What happened is that millions of people who would never even have dreamed of learning to stand in front of a camera, or of operating that equipment, are now publishing different kinds of content online. That's my ideal version of the AI programming thing. I want the number of people who can do basic programming to go up by an order of magnitude. I think every human being deserves to be able to automate dull things in their lives with a computer, and today you almost need a computer science degree just to automate a dull thing in your life with a computer. That's the thing language models are taking a huge bite out of.

So maybe there is a version of this where the demand for professional software engineers goes down, because the more basic stuff can be done by other means. The alternative version is that because a professional software engineer can now do five times the work they used to, two times, five times, whatever it is, companies that would never have built custom software now do, which means the number of jobs for software engineers goes up. A company that would never have built its own custom CRM for their industry, because you'd have to hire 20 people and wait six months, can now do it with five people in two months, and that makes it feasible for them. Those five people are still getting paid very well; it's just that the value they provide to companies has gone up. That's the demand curve I'd like to see.

Well, and also don't forget one thing we do talk about, and I think it's common knowledge, correct me if it's wrong: code equals liability. The more code you have, the more liability you have, and one thing we're seeing is that more code will be generated. Have you ever worked at a company or a team with only less experienced developers, one or two years of experience, left to themselves for a while? Fast forward two years without adding anyone experienced, and my observation is you get spaghetti code. It's a mess, it's hard to work with. Then you pull in some people with more experience, who look around and point out some seemingly simple changes, ones that are not that simple for the original team, and they simplify things, you might delete a lot of code, and all will be good in the world again, or those people gain more experience. But I do think about the part where, a year in, everything still seems fine: the CEO of the company thinks this team is shipping quickly, people are enthusiastic. My sense is there will be, there should be, demand, and I'm curious to hear your thoughts on this, for engineers who can go into the generated code and, for example, explain and reason even when the machine fails to explain this complicated mumbo jumbo, or just say: we're going to delete all of this, and it makes sense, I'm confident, and I can tell you why I'm doing it.

Right, and that's the skill that you need. It turns out that typing code and remembering how for loops work, that's the piece of our jobs that has been devalued: remembering that sort of trivia and typing really quickly. Nobody cares if you can type faster than anyone else anymore; that's not a thing. But the systems thinking and evaluating: a skill I think is really important right now is QA, old-fashioned manual testing, being able to take some code and really hammer at it and make sure it does exactly what it needs to do, combined with automated testing, combined with system design and prioritization. There's so much to what we do that isn't just typing code on a keyboard, and those are the skills that matter. Language models can do a lot of this stuff, but only if you ask the right questions of them. If you ask a language model to write five paragraphs on how you should refactor your microservices, maybe it'll do an okay job, but who's going to know to even pose that question, and who's going to know how to evaluate what it says? I don't think you should have these things make decisions for you; I think you should use them as tools to support decisions that you're making. That's one of the reasons I love saying "give me options for X"

and that's what we become as software engineers: the people making the high-level design decisions, the people evaluating what's going on. I don't think you should commit a line of code that a language model wrote if you don't understand it yourself; that's the personal line that I draw. So I do not feel threatened as a software engineer, and honestly, partly as a software engineer who's gotten good with this stuff, I really don't feel threatened by it. There are a whole bunch of bits of the job that these tools accelerate, some of which are a bit tedious, some of which are kind of interesting, and it gives me so much more scope to take on more exciting problems overall. I love it.

If you could offer advice to two different groups of people, so two separate pieces: to experienced engineers like yourself, who have put in the years and worked across different stacks, and also to less experienced engineers who are already working inside the industry but obviously not at that level yet. What would you suggest they do to make the most of these tools, or to make themselves more future-proof, if you will?

My universal advice is always to have side projects on the go, which doesn't necessarily work for everyone; if you've got a family and a demanding job and so forth, it can be difficult to carve that time out. A trick I've used at companies in the past: I love advocating for internal hack days, saying let's, once a quarter, have everyone spend a day or two working on their own projects. That kind of thing can be great. Good employers should always be able to leave a little bit of wiggle room for that sort of exploratory programming. Some employers don't, but if you can get that, it's amazing. If you're early in your career, people in their 20s can normally get away with a lot of side projects because they have a lot less going on. When I'm managing people, I don't like people working super long hours and all of that, but it's hard to talk a 22-year-old out of it; that's just how people are wired early in their careers

so take advantage of that while you can. But yeah, on my personal weblog I'm using all sorts of weird AI tools to hack on it, because the stakes could not be lower: a bug in it will break a page, and I'll fix it. That's where I've been using this thing called GitHub Copilot Workspace, which they've just started trialing. It's a beta, you're in the beta, and I've added four or five features to my blog using it, some of them live in meetings with people, as a demo: "let me show you this tool, I'm going to add autocomplete to the tags on my blog," and I did that last week. So I'm using my blog as a sort of fun exploration space for that kind of thing.

If you can afford to do a side project with these tools, sit yourself down and have these tools write every line of code for you, I think that's a great thing you can do. If you can't afford side projects, just use them: get an account with them. Both of the best models are free right now, GPT-4o from OpenAI and Claude 3.5 Sonnet. You have to log in, you might have to give them a phone number, but you can use a free account. Use those and just throw questions at them. Sometimes, when you have a question where you think "it definitely won't get this," throw that in too, because that's useful information. Throwing basic things at them and working with them in that way is definitely worthwhile.

And play with the Claude 3.5 Artifacts thing; it's just so much fun. The other day I wanted to add a box shadow to a thing on a page, and I thought, what I really need is a sort of very light, subtle box shadow. I was halfway through prompting Claude for that when I said, actually, build me a little tool where, I think I said, I can twiddle with the settings. "Let me twiddle with the settings on the box shadow," that was my prompt. And it built me this little interactive thing with a box shadow and sliders for the different settings and a copy-and-paste CSS box. If I'd spent an extra 15 seconds I could have found an existing tool on Google, but it was faster to get Claude to build me a custom tool on demand, because with a Google search you have to evaluate the answers you get back and then click through and all that. No: I know what I want, so just do it.

Right, that's why it's entertaining. I feel this is the heart of what you're saying. It's easier said than done, but experiment. And your blog, which we're going to link in the show notes, is just a really good example. I did find myself a little bit re-energized reading how much weird stuff you're doing. It's going to be fun, right? I can see that you're having fun with it. And again, thanks for sharing, because you put it out there. Honestly, with these tools it's a bit easier to write it up as well, so that's helpful advice.

This is a crucial thing: these things are absolutely hilarious. It's not that they can write a good joke; sometimes they can, but that's not what makes them funny. It's trying out weird dystopian things, trying something you didn't think would work and having it work. I use the voice mode to make prank phone calls to my dog.

So I'll be like: "hey, ChatGPT, I need to give my dog a pill covered in peanut butter. I need you to pretend to be from the Government Department of Peanut Butter and make up an elaborate story about why she has to have it. Now go." And it does it, it does its spiel, and I hold the speaker up to my dog. It's just really, really amusing. Stuff like that is so much fun. For a while I was always trying to throw a twist into my prompts, so I'd say "answer this" and at the bottom add "oh, and pretend you're a golden eagle and use golden eagle analogies," and it would say things like "well, if you're soaring above the competition..." Stupid things like that. You can get it to rap, kind of, and it's awful, really absolutely appalling, but with the voice mode you can say "now do a rap about that answer," and it's wonderfully cringeworthy.

I don't remember ever having a tool like this. We're talking about programming here, but you can get it to do all these things, potentially even in a work context; just throw things in there. As you said, it is fun, and I like to look at that part of it. So thank you for the insight, and let's end with some rapid questions, if you're okay with that. These are questions I'm just going to ask, and you just throw out whatever comes up. Could you recommend two or three books that you enjoyed reading?

Martin Kleppmann's book Designing Data-Intensive Applications. It's on my shelf, and it's absolutely incredible. The Bluesky team told me this is the book they all have on their shelves, because it describes everything you need to know to build Bluesky. It's kind of amazing. At Eventbrite we had a book club, and one of the things we did, because nobody reads the book for book clubs, it turns out that just doesn't work, is assign chapters to different people, and they have to provide a summary of their chapter at the book club. You parallelize the act of reading the book. That worked so well, and I think that was the best book we did.

And is there maybe a fiction book that you can recommend?

My favorite genre of fiction is British wizards tangled up in old-school British bureaucracy. So I like Charles Stross's Laundry Files series, which is about sort of secret, MI5-style wizards, and the Rivers of London series by Ben Aaronovitch, about a Metropolitan Police officer who gets tangled up in magic. I really enjoy those.

Oh, nice. What's your favorite programming language and framework, and you cannot say Django and Python?

You're really putting me on the spot with this one. Okay: JavaScript, and no framework at all. I love doing the vanilla JavaScript thing.
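The vanilla-JavaScript approach Simon mentions can be sketched like this. It is a minimal illustration, not anything from the episode: the `.item` selector and the sample strings are made up, and the DOM lines are guarded so the snippet also runs outside a browser, e.g. under Node.

```javascript
// Modern built-ins cover much of what jQuery was once needed for.

// jQuery: $(".item")  →  vanilla equivalent:
const items = typeof document !== "undefined"
  ? Array.from(document.querySelectorAll(".item")) // NodeList → real Array
  : []; // no DOM available outside the browser

// jQuery: $(".item").map(fn)  →  plain Array.prototype.map:
const labels = items.map((el) => el.textContent);

// The same array methods work on ordinary data too:
const lengths = ["ab", "cde", "f"].map((s) => s.length);
console.log(labels, lengths); // under Node: [] [ 2, 3, 1 ]
```

`querySelectorAll` returns a static `NodeList`, which `Array.from` converts so the full set of array methods is available, which is most of what jQuery's `$()` wrapper used to provide.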

Basically, that's because I used to love jQuery, and now document.querySelectorAll and Array.map and so on mean most of what jQuery did is built into browsers; you don't need an extra library. It is kind of wild.

Yeah, I remember when I used to use it. I'm surprised. Nice. What's an exciting company that you're interested in, and why?

I'm going to plug Fly.io here, the hosting company, partly because they sponsor some of my work, but actually completely independently of their sponsorship, I picked them to build my Datasette Cloud SaaS platform on, because they're a hosting company that makes it incredibly easy to spin up secure containers as part of your infrastructure. Basically, I was trying to build this stuff on top of Kubernetes, which is not easy to use, and then I realized that Fly.io's Machines layer is effectively what you can do with Kubernetes, but with an API that actually makes sense and pricing that makes sense. So I'm able to build out this SaaS platform where every one of my paying customers gets a private, separate container running my software, with its own encrypted volumes and all of that kind of thing, so I don't have to worry about data leaking from one container to another. And it scales to zero in between requests, all of that kind of stuff. So I'm really excited about Fly as a platform specifically for that thing where you've got an open-source project and you want to run it for your customers, like paid hosting of open source. I feel like Fly is a really great platform for that.

Awesome. Well, thanks very much, it was great having you on; this has been really fun. Thanks a lot.

Thanks a lot to Simon for this. If you'd like to find Simon online, you can

do so on his blog, simonwillison.net, and on social media like X and Mastodon, all in the show notes below. You can also check out his open-source projects, Datasette and LLM, which are also in the notes.

As closing, here are my top three takeaways from this episode.

Takeaway number one: if you're not using LLMs in your software engineering workflow, you are falling behind, so use them. Simon outlined a bunch of reasons that hold many devs back from using these tools, from ethical concerns to energy concerns, but LLM tools are here to stay, and those who use them get more productive. So give yourself a chance with these.

Takeaway number two: it takes a ton of effort to learn how to use these tools efficiently. As Simon put it, you have to put in so much effort to learn, explore, and experiment with how to use them, and there's no guidance, so you really need to put in the time and experimentation. By the way, in a survey run in The Pragmatic Engineer about AI tools, with about 200 software engineers responding, we saw similar evidence: those who had not used AI tools for six months were more likely to have a negative perception of them. In fact, very common feedback from engineers not using these tools was that they used them a few times, the tools didn't live up to their expectations, and they just stopped using them. I asked Simon how long it took him to get good at these tools, and he told me it just took a lot of time. He couldn't put an exact number of months on it, but it took a bunch of time, experimentation, and figuring out what works.

My third and final takeaway is that using local models to learn more about large language models is a smart strategy.
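As one concrete illustration of how approachable local models have become: the episode mentions Hugging Face tooling, but a hedged sketch is easier to show against Ollama, another popular local-model runner, which by default listens on port 11434 and exposes a `/api/generate` endpoint. The model name `llama3` and the prompt are just example values.

```javascript
// Build the JSON body for Ollama's /api/generate endpoint.
// stream:false asks for one complete response instead of a token stream.
function buildGenerateRequest(model, prompt) {
  return JSON.stringify({ model, prompt, stream: false });
}

// Send a prompt to a locally running model (requires an Ollama server).
async function askLocalModel(prompt) {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    body: buildGenerateRequest("llama3", prompt),
  });
  const data = await res.json();
  return data.response; // Ollama puts the completion text in `response`
}

// Example usage (only works with a local server actually running):
// askLocalModel("Why is the sky blue?").then(console.log);

console.log(buildGenerateRequest("llama3", "hi"));
```

That request-building step is the whole protocol; the rest is an ordinary HTTP call, which is part of why local experimentation is less intimidating than it sounds.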

Running local models has two big benefits. Number one, you figure out how to run models locally; it's actually less complicated than one would think, thanks to tools like Hugging Face that make downloading and running models a lot easier, so just go and play around with them and see what a smaller model feels like. The second benefit is that you learn a lot more about how large language models work, because local models are just less capable, so they feel less magical. Simon said it's really useful to have a model hallucinate at you early, because it helps you build a better mental model of what these things can do, and local models do hallucinate wildly.

You'll also find some additional resources in The Pragmatic Engineer. One of them is about RAG, retrieval-augmented generation, an approach Simon talked about in this episode that is a common building block for AI applications; we did a deep dive in The Pragmatic Engineer about this approach, and it's linked in the show notes below. Also in The Pragmatic Engineer is a three-part series, "AI Tooling for Software Engineers: Reality Check," where we looked at how engineers are using these tools, what their perceptions are, and what advice they have for using these tools more efficiently.

Personally, I cannot remember any developer tool or development approach that has been adopted so quickly by the majority of backend and frontend developers in the first two years of its release, like large language model tools have done since 2022. So it's a good idea to not sleep on this topic.

And this marks the end of the first episode of The Pragmatic Engineer Podcast. Thanks a lot for listening and watching. If you enjoyed the episode, I'd greatly appreciate it if you subscribed and left a review. Thanks, and see you in the next one.

This transcript was generated by Metacast using AI and may contain inaccuracies. Learn more about transcripts.