#526: Building Data Science with Foundation LLM Models

Michael Kennedy

00:00

AI has changed how we write code and data science is right in the blast radius. Today we move past autocomplete to systems that file PRs, comment in Slack, and even police our CI. Hugo Bown Anderson is back to map the new stack. When classic PyData still wins, where small local models beat the cloud, and how tests become your specs. We dig into cursor and copilot, proactive agents, and practical patterns you can ship this week.

00:26

This episode is all about leveling up your data science workflow, not replacing it. This is Talk Python To Me, episode 526, recorded October 7th, 2025. Welcome to Talk Python To Me, the number one podcast for Python developers and data scientists. This is your host, Michael Kennedy. I'm a PSF fellow who's been coding for over 25 years. Let's connect on social media. You'll find me and Talk Python on Mastodon, BlueSky, and X.

01:11

The social links are all in the show notes. You can find over 10 years of past episodes at Talk Python.fm. And if you want to be part of the show, you can join our recording live streams. That's right. We live streamed the raw uncut version of each episode on YouTube. Just visit talkpython.fm/youtube to see the schedule of upcoming events. And be sure to subscribe and press the bell so you'll get notified anytime we're recording.

01:36

This episode is sponsored by Posit Connect from the makers of Shiny. Publish, share, and deploy all of your data projects that you're creating using Python. Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards, and APIs. Posit Connect supports all of them. Try Posit Connect for free by going to talkpython.fm/posit, P-O-S-I-T. And it's brought to you by Nordstellar.

02:02

Nordstellar is a threat exposure management platform from the Nord security family, the folks behind NordVPN, that combines dark web intelligence, session hijacking prevention, brand and domain abuse detection, and external attack surface management. Learn more and get started keeping your team safe at talkpython.fm/nordstellar. Hey, I want to take just a minute and talk to you guys. I just released a really cool new course called Agentic AI Programming for Python Developers and Data Scientists.

02:33

You've heard me mention a couple times on the podcast how I've had some incredible success with some of these agentic AI coding tools. I hear people talking about how they're not really working for them. And then I look at the results that I'm getting and think, wow, that's something that would have taken two weeks. It's built in two hours and it's well factored and good looking code. What gives? Why is this difference here?

02:58

Well, I decided to create this course to share all the things that I'm doing with these agentic coding tools with the idea of making you as successful and productive as well. Yes, I know we're all tired about hearing about how AI is going to change everything for software developers. But there are some tools here that will give you truly difference-making levels of productivity. And that's what this course is about. So check it out at talkpython.fm/agentic AI.

03:27

The link's in your podcast player show notes. Let's get to the interview. Hey, Hugo. Welcome to Talk Python.

Hugo Bowne-Anderson

03:33

Awesome to have you, man. Such a pleasure to be here and to be back. I think this is my third time over the years on Talk Python.

Michael Kennedy

03:39

I do think it is your third time. It's been a lot of fun data science things. And I think people will see that our conversation, the thing that we're going to really focus on this time, will have evolved a little bit as the times have changed. Since the last couple of years, the data science and the entirety of programming has gotten a little different.

Hugo Bowne-Anderson

03:57

Actually, just our conversations over the years, like kind of art can be a symbol of the trajectory from like early days of data science, PyDataStack, all of these amazing things as it went from academia to industry. then to, you know, large-scale distributed compute when I came and talked about Dask and Coiled. And now a couple of years after our quote-unquote GPT moment and our ChatGPT moment and our stable diffusion moment here to talk about LLM's foundation models meet data science.

Michael Kennedy

04:22

That's right. We've gone from local machine to cloud, AI. Wow. I do see that we'll actually maybe be back. I see a world where we build together smaller models that maybe run locally, doing other things, Maybe some servers, connect MCP servers, connecting all these things, like little special agents. That's going to be something we can potentially dive into. We'll see. But I think the arc is not done.

Hugo Bowne-Anderson

04:46

The other thing that comes to mind is I do, and we'll get into this with AI-assisted programming, which I think superpowers people who know what they're doing, but may not be great for beginners. I do imagine we might have boot camps or classes where you're in a cave and no access to AI or the internet and you learn, you actually learn to code on a laptop without any of this stuff.

Michael Kennedy

05:07

Yeah. There's going to be a local PyPI and be a local set of docs. You know, there was this really cool app. Gosh, I've covered on Python bytes, the news podcast I do. And I wish I could remember, but what it would do is it had like hundreds, maybe thousands of different projects like Flask or Tailwind or whatever, Span Technologies. And you could click it and say, I want to have Flask 3.1 offline, get me the docs.

05:33

And you could create like a catalog of all these different projects you're using, all offline searchable docs across. Amazing. I feel like something like that might come back. You know what I mean?

Hugo Bowne-Anderson

05:42

It's funny you mentioned that because I was chatting with Innes Montani of spaCy and Prodigy fame the other night. She's here in Sydney. And she reminded me that Sebastian Ramirez of FastAPI, when he was building FastAPI originally, I think what she reminded me was they didn't have great internet access where they were. So they did download a lot of things at very slow speeds, then build everything locally.

Michael Kennedy

06:04

I'm reliving that, by the way, in a very weird way. My fiber modem died. And if anybody watches the video, they'll see that I'm actually in the library here, which is great. I preserve a little private room. I got high speed internet here. But at home, I'm tethered with one bar of LTE. So anything I do is at like 30 kilobits. And it brings me back to my youth. But boy, you got to decide what you want to do next. That's it.

06:25

Yeah. So I do, circling back and closing this up, I do think this true learning the code thing is both, it's going to be something that comes back around. I think it's actually a challenge with all the tools and cheats. I don't consider them bad cheats, but the things that can do the work for you, it requires a lot of willpower to stay focused.

06:44

And I do think it's going to be kind of a kobold moment as well, where somewhere down the line, people are going to be like, we need to just get some people that used to type this stuff in by hand. And we need them to look at it and figure out why this doesn't work.

Hugo Bowne-Anderson

06:55

Without a doubt. And, you know, learn Python the hard way. Learn X the hard way. Like sometimes most things you got to do the work, right?

Michael Kennedy

07:02

You definitely do. Well, Hugo, before we get into the topic too much, quick introduction on yourself. Who are you? I know you've been on the show a few times, but it's spanned many years and I say it often, but it's always worth repeating. And like 50% of the people in the Python community are new over the last two years. Like they've only been here two years or less, which blows my mind. So those people probably want to listen to the podcast before they got into Python. Who are you?

Hugo Bowne-Anderson

07:24

So firstly, what is up Python community? Clearly, I'm a huge fan of Python. Used it for many years. Love it. Background many moons ago in scientific research, biology, math, physics. Was working in academic research at Yale University, New Haven, Connecticut. Living in New York City just over a decade ago. The data science ML meetups, hackathons there blew my mind so much. Moved to industry, a small startup at the time, Datacamp. Worked on curriculum, education, internal data science product.

07:52

Wore many hats, as you do. And worked a lot on Pythonic education there. Since then, I've been working in a mixture of DevRel, marketing, product. A year and a half ago on wonderful projects, such as Dask at Coiled with Matt Rocklin, then Metaflow out of Netflix with the wonderful team at Outer Bounds. A year and a half ago or so, space was so exciting, man. I decided to go freelance.

08:13

And so I mixed my time, essentially helping people build, ship, and maintain AI, LLM, ML, data-powered products more generally. I do this through consulting. I do it through advising. I do it through education and developer relations. So helping open source frameworks and products reach developers and getting material that helps them. My former colleague and boss at Outer Bounds, Vilay, who really gets developer relations, refers to DevRel as the wisdom layer.

08:44

And he puts it firmly beside product as a pillar. And I love that because I think a lot of the time we consider education or DevRel as a necessary thing you have to do as opposed to being at a foundational pillar. And once again, that's why I'm in such admiration of the work you do in bringing so many resources to the community

Michael Kennedy

09:03

at large. Thank you. And foreshadowing a little bit, I would like to kind of reinforce that quote you just said. It's the wisdom layer. Like as data scientists, your job is to provide insight and knowledge and trends, forecasting. Developers, our job is to provide solutions and things that that we can use apps and tools and whatnot. And I think a lot of us, myself included, get tied down and like, oh, I'm really good at coding and I'm good at this library.

09:32

And we can kind of forget that like the real first tier job of ours is to provide answers and solutions and apps. And I think a lot of the pushback on AI is like, it's taken my coding.

Hugo Bowne-Anderson

09:44

One way I think about it is it has taken my coding in some ways, but as we'll get to, I never particularly enjoyed writing, love Pandas, never particularly enjoyed writing Pandas code, for example, incredible tool. But if I can help me write my Pandas code, I read it, make sure it's all good in the hood. And then I get to focus on building systems.

Michael Kennedy

10:04

I think that's a huge win. 100%. I would be remiss to not give a little shout out to a couple of things that you have done or are doing. Take it chronologically. A while ago, you worked on the fundamentals of Dask, high performance data science course over at Talk Python. This course is 100% free. People want to dive into it and learn from you. They can absolutely take it.

Hugo Bowne-Anderson

10:23

I might even take it now. It's free.

Michael Kennedy

10:24

It's just over an hour. Yeah, people can drop in and I'll be sure to put a link in the show notes for that. So that's awesome.

Hugo Bowne-Anderson

10:30

That was really fun to build with you as well, Michael. That was during the early days of COVID we were working on that. So, you know.

Michael Kennedy

10:36

What else do we have to do? No, it was really great working on it with you as well. I appreciate that tons. And then you since then have started a data podcast called Vanishing Gradients.

Hugo Bowne-Anderson

10:47

Exactly.

Michael Kennedy

10:48

Tell people about that.

Hugo Bowne-Anderson

10:49

So this is a podcast and I still call it a data podcast. Although a lot of people like you have to call it AI, Hugo. And AI is data. And as we'll get to a lot of the principles in building AI powered products are the same. Modulo implementation details of building data powered products. It's a podcast where I talk with industry practitioners and builders about what they're doing in the space, how they're building, and essentially trying to help propagate knowledge

11:13

from the bleeding edge back to builders and leaders in the space. So recently had a conversation with Hamil Hussain there, who he's the evals guy, among other things, but all about the eval space and how you can use evaluation in the LLM powered software development lifecycle to improve your product. I've spoken with Shell Genteman at NASA, conversations with Jeremy Howard. So one of the things, as I'm sure you do, I love about podcasting is I get to invite people I admire and who I think

11:45

are awesome to chat about stuff and then share it with the public. So that's the rationale there. But it's really to help people propagate knowledge, wisdom, and skills back the adoption curve.

Michael Kennedy

11:56

Along the adoption curve. That's good work. Who should listen to it? Beginners, experts, data people, programmers? Everyone should listen to it. Everyone who's interested in building and

Hugo Bowne-Anderson

12:05

shipping data-powered stuff. And the way I actually chat about it with guests is the first third of any conversation, I want everyone to understand, everyone who's somewhat technical. The middle, we can go a bit deeper. And the third is a free-for-all. I definitely encourage everyone to jump in. And we've got specific episodes on evals, of course, and that type of stuff.

12:26

But we also have industry-specific episodes, such as chatting about what was happening in the early days of shipping LLM-powered software at Honeycomb or at NASA and these types of places.

Michael Kennedy

12:36

That is one of the little secrets of being podcast hosts is you get to talk to people about amazing stuff. You're like, huh, it'd be really cool to talk to the people that made that Fusion breakthrough. They did Python. Why don't we invite them? And don't draw by,

Hugo Bowne-Anderson

12:48

you know, that's amazing. And I get like a lot of my friends who work in the space connect me with other people. So I'm actually chatting with a data leader at Mozilla and then the VP of learning at Duolingo. So we're going to have a lot of really fun episodes coming up. What's up with

Michael Kennedy

12:59

the name, Vanishing Gradients? Where'd that come from? There's the Vanishing Gradient problem

Hugo Bowne-Anderson

13:04

in deep learning. So when you do stochastic gradient descent, you compute gradients and climb down in order to optimize neural networks. And there's a challenge that sometimes gradients vanish and you stop learning. So the rationale was, what happens when you stop learning? And

13:20

let's bring back the idea of learning in this space. The opposite, of course, is the exploding gradients problem, which I also considered calling it, where the gradients just explode, of course, But we went with vanishing for that reason.

Michael Kennedy

13:30

I like that. That's a very clever, very subtle. Nice. So let's talk data science in 2025. And to be clear, I didn't ask, let's talk using AI for data science. Let's talk data science in 2025. And surely, I think there's two things here. I think there's some really interesting, what I don't know how we want to think about, like pure programming libraries and tools that are super powerful. And we could give a quick shout out to some of them.

13:58

But then also, anytime you're exploring data, using some of these LLMs, especially the agentic tooling, it's a game changer. So let's start with the first one. What tools, you know, things like polars maybe or whatever is like jumping out at you over the last year or so that's like, wow.

Hugo Bowne-Anderson

14:14

I chatted about this on Vanishing Gradients with Akshay Agrawal, who built Marimo and develops Marimo, which I encourage everyone to check out. So let's actually rewind slightly and think about what we've been using over the past decade, plus plus. And it's the PyData stack, Jupyter Notebooks, Pandas, SQL, SQLite databases, and in production, maybe Postgres and these types of things. And how has this evolved now? What are modern, really cutting edge tools that we use in similar ways?

14:46

You mentioned polars. I totally agree that this is something we're seeing a lot of activity on and a lot of use on. On the database side, we've got DuckDB, right?

Michael Kennedy

14:54

DuckDB is making a huge impact.

Hugo Bowne-Anderson

14:56

Beautiful to use, but it's so fast as well, right? And I mean, and that's what you want there. And then on the literate programming side, you've got Marimo, which I'm a huge fan of.

Michael Kennedy

15:08

This portion of Talk Python To Me is brought to you by the folks at Posit. Posit has made a huge investment in the Python community lately. Known originally for RStudio, they've been building out a suite of tools and services for Team Python. Over the past few years, we've all learned some pretty scary terms. Hypersquatting, supply chain attack, obfuscated code, and more.

15:30

These all orbit around the idea that when you install Python packages, you're effectively running arbitrary code off the internet on your dev machine, and usually even on your servers. The thought alone makes me shudder, and this doesn't even touch the reproducibility issues surrounding external packages. But there are tools to help. Posit Package Manager can solve both problems for you. Think of Posit Package Manager as your personal package concierge.

15:56

You use it to build your own package repositories within your firewall that keep your project safe. You can upload your own internal packages to share or import packages directly from PyPI. Your team members can install from these repos in normal ways using tools like pip, Poetry, and uv. Posit Package Manager can help you manage updates, ensuring you're using the latest, most secure versions of your packages.

16:19

But it also takes point-in-time snapshots of your repos, which you can use to rerun your code reproducibly in the future. Posit Package Manager reports on packages with known CVEs and other vulnerabilities so you can keep ahead of threats. And if you need the highest level of security, you can even run Posit Package Manager in air-gapped environments. If you work on a data science team where security matters, You owe it to you and your org to check out Posit Package Manager.

16:46

Visit talkpython.fm/ppm today and get a three-month free trial to see if it's a good fit. That's talkpython.fm/ppm. The link is in your podcast player's show notes. Thank you to Posit for supporting the show.

Hugo Bowne-Anderson

17:00

I still use Jupyter Notebooks, but one thing Marimo affords me, because it's actually a.py file as well, you can convert them. Well, they're essentially scripts as well. So the notebook to production story is really interesting there.

Michael Kennedy

17:13

I think Marimo is super interesting. I think when I look at it, when I see people working with it or when I work with it, the limited extent to which I have, it just looks smooth and polished and modern. And I don't know, I just, when I use it, I feel, feel like it's something great. It also solves the problem that while JupyterNetbooks or JupyterNetbooks, JupyterLab, whatever, in general is like an incredible tool for data exploration and presenting data. It has this,

17:42

this crazy implicit go-to sort of sequence, right? Like if you don't just go run all cells and you start bouncing around, you end up potentially running stuff out of order or skipping a step that would have made a different answer, the step below. And that's, that's real dangerous. And so Marimo uses the abstract syntax tree to look at dependencies across cells and make sure they run in order, which I think is an underappreciated benefit. It's like, oh, that's kind of nice.

18:08

Like, no, like, do you want the wrong answer or the right answer? This is really important in data science and science in general.

Hugo Bowne-Anderson

18:14

You're right. My understanding is it uses the AST to build a DAG of cells and execute some. And what that means is, yeah, and it means you can't redefine something in a cell below, but it'll give you a scratch pad to do so if you want to. Now, I just want to say that's fantastic for a lot of cases. There are cases when you just want to explore an experiment where Jupyter notebooks like absolutely excel. So it's not an either or here as well.

18:38

And I do want to say Jupyter notebooks, in all honesty, like get a bunch of hate for that. And neither you nor I feel that way. But I just want to be very explicit that that's not what we're saying here at all.

Michael Kennedy

18:49

Yeah, I have a lot of reverence for notebooks. Not only did they change the game for data science in general, but they changed it for Python. So if you look at the popularity and the people participating in Python, like one of its really powerful aspects is people are coming from all these different angles with different ideas and perspectives and different tools they want to build and so on. That's made it so rich.

19:11

And that started basically in 2012 with the PyData stack with notebooks and all of that.

Hugo Bowne-Anderson

19:18

Yeah, I remember the first notebook I opened was called an IPython notebook, not even a Jupyter notebook. And of course, it's all based around an IPython also. So we got to give a shout out to that. And as I said, I was actually working biology in research at the time. And in biology, we have notebooks, right? Like you write, you put your PCR gel, you put your figures there, you write things. This idea of literate programming is exactly that. And what it does is it brings experimentation.

19:45

It brings scientific rigor and scientific research into computation. And very important for this space where we are, really what we're talking about in data science, ML, AI is the convergence of software meeting data and experimentation. So we need new

Michael Kennedy

19:59

tools for this. And notebooks are one of the most awesome examples of that. Okay. I took us down a bit of a hole with the Marimo stuff because it is cool. Anything else that jumps out to you? I have one at the end that I want to riff on before we get off this topic, but what else jumps out to you? Schools 2024, 2025, that's kind of like, oh, that's different. To step back a bit, we are talking

Hugo Bowne-Anderson

20:19

about like data science with AI and that type of stuff. And this works both ways, right? Like data science plays into AI and building with foundation models, but I'd have to fire myself if I didn't talk about the other way, which is AI helping us do data science. And the tools are AI assisted. The biggest tool is AI assisted programming for data science, which is revolutionary. I think maybe isn't even as big a term as we need for this. Absolutely groundbreaking.

Michael Kennedy

20:48

It's easy to get frustrated saying it's bad for the environment. I'm a good data science, good developer. I don't need this stuff. But for the most part, I feel like the cat is out of the bag. The Pandora's box is open, whatever analogy you want to use here. And it's made such a difference.

Hugo Bowne-Anderson

21:03

Without a doubt. And I definitely agree on climate concerns. We should be having larger conversations around this. You can start using smaller models and local models for AI-assisted programming. They won't superpower you as much. there are, you know, to say all, and this is to say all of AI is very bad for the climate is, I suppose, like saying both hummers and electric cars in the same bucket, right? So, but I totally

21:25

agree that that's a concern we need. And the other thing though, man, if we're talking about like, you know, software's writing the software, AI's writing the software, vibe coding, I don't necessarily understand it. Guess what? We were all copy and pasting from Stack Overflow. We've been doing that for a long time, right? And I don't necessarily understand all that code. So in some ways scaling and superpowering that behavior.

Michael Kennedy

21:47

It's on you. It's on me. It's on everyone who uses it to either, and we're going to get into this more in detail later, use that as a learning experience or as a, well, I don't need to know that. I'll just, whatever it says, right? That was true with Stack Overflow as well. Like you would go to Stack Overflow and you would just copy something. And the knock on people who would take stuff from Stack Overflow and paste it was they had no idea what it meant.

22:11

They just saw that it solved the problem. there was even that joke keyboard that Stack Overflow created. All it had was like a control and a C and a V and they had a Stack Overflow logo. It's like, this is all you need, right?

Hugo Bowne-Anderson

22:21

It was hilarious, right? I'd like to system prompt ChatGBT to not be so sycophantic and treat me like people on Stack Overflow used to treat people sometimes as well.

Michael Kennedy

22:29

My ego is doing way too well today. I need to be beat down. But the thing is like,

22:33

you could go to Stack Overflow and you could go, wow, okay, I didn't know that. And then you learn it and you don't need to go back to Stack Overflow and copy that thing because now you've understood something deep and that's different that's on you when you're copying my stack overflow and it's it's on you a hundred times over if you use these tools right because a lot of times especially the agentic stuff it explains what it does like here's what it was here's why i changed it it could let

22:56

that scroll by or you can go slow and study it and become smarter not more brain dead you know what

Hugo Bowne-Anderson

23:01

i mean if i was getting like pandas or scikit-learn code from stack overflow i'd really like want to understand it because that was my bread and butter whereas if it was front end stuff Like I'd probably go and find the same issue question time and time again. Same with like environment stuff, like getting environments working in Jupyter notebooks or something. I just, I still can't grok that stuff.

Michael Kennedy

23:23

I know. I was just thinking of bash scripts, like shell scripts. I'm like, you know what? This is just, I don't need to, I don't need to remember this. I just bookmark that puppy. And long as it doesn't have RMRF or something destructive in it, I'm just right. Incredible. Okay, one more thing before we move on to like the AI depths that I think is, we got to talk about it. Because today is October 7th, 2025, as we record this, not as I release it.

23:47

So I'll have to be a bit nostalgic for a few weeks. But today is Python Pi Day. Python 3.14 came out today, right? And one of the main features of Python 3.14 is the free threaded aspect being sort of officially taken in. And I know one of the big challenges that's been solved with C extensions and Rust and other stuff, but it's still a bit of a challenge is like, I've got a ton of data. I want to process it in my codes in Python. How do I take advantage of the 32 cores I got?

24:15

Or do I get one 32nd of a computer, right? And so I think starting to think about parallel programming a little bit, it's going to take on whatever significance it takes, it's going to take on more than it has traditionally. Totally agree. And I think it's the data scientists who are going to need it more than anyone. Yeah. Our web frameworks handle that kind of stuff for us. They fan that out into processes and other things.

24:36

But when you've got real computational stuff, there's no IO blocking that you can work around, right, to leverage async, right? You've got to do the CPU stuff.

Hugo Bowne-Anderson

24:45

Exactly. And it's a really good question because I would have a trillion percent agreed with you. And I 100%, but I would have a trillion percent agreed with you pre our chat GBT moment when data scientists, ML engineers, all of these types of people weren't only building products, serving models, that type of stuff, but they were responsible for training as well.

25:06

And I think you're totally right with large-scale analytics. Think about Dask and geospatial, large-scale, multidimensional, geospatial, atmospheric data, these types of things,

25:17

and basic analytics. I'm not even talking about machine learning there, but we have entered a regime now where you can build ml and ai powered products by pinging apis or hosting your own models and that type of stuff whether it's hugging face from hugging face or wherever it may be or you know using olama locally and in that case i think because you're not doing the training yourself you're able to do a lot of things without requiring massive massive compute yeah there is a

Michael Kennedy

25:44

bit of a it seems like a big area but it's a bit of a thin area because you've got the regular programming you can do then you need that async you need that parallelism for higher compute but just you don't go very far until someone says fine i'm doing it in rust i'm doing a C++ or it's an api and then you don't need it again you know what i mean there's like a little stratosphere sort of bit of it and happy python pi day yes happy python pi day that's pretty cool i've not even installed

26:09

it yet today because uv has not shipped their support for it yet and that's when i mentioned

Hugo Bowne-Anderson

26:15

new modern tools like marimo and polars and duck db uv has has to be in there as well also very excited about other package management but plus plus for lack of a better term tools like pixie that uh wolf olprecht and his his team who i know have been on the show and you know who people may know from mamba which helped us with conda so much so uv is not the only only story

Michael Kennedy

26:39

out there solving these working to solve these problems i do think it's really interesting especially for the data science crowd, because things have gotten better for those of us that use pip exclusively. It's like faster, a little bit better resolution, a little bit better workflow. Like some of the tools are brought together, like pip-tools plus just regular pip. But I think it's a bigger consideration for your side of the fence in that there's Conda. And now do you stick with

27:03

Conda or do you use UV? Like that's, they kind of compete more than pip did with uv, I think, actually. I use uv. I agree. I think uv is pushed it over the edge. And then the pyx that charlie's released if you're like an org where they help with building like the layers of

Hugo Bowne-Anderson

27:17

machine models and pytorch and stuff is it's pretty interesting i also just want to say on python pi day that is so geeky and i i would want to i do want to say show my geeky t-shirt to everyone which says schrodinger's cat wanted dead and alive it's a wanted poster so speaking of geeky stuff that's

Michael Kennedy

27:33

perfect that's perfect i i just have a sweater i'm sorry i didn't i didn't prepare yeah that's true i'm in a library it's full of books so i can i'm sure something geeky is behind me all right so We've talked about maybe some of the other not so necessarily AI focused tools is like what people should focus on. But like we both said, it's so transformative. And if people haven't actually tried it and seen it in action, you got to see it to believe it.

27:56

Because I was a skeptic until my friend's like, no, let's sit down and let me show you. I'm like, oh, OK, I get it. And so let's talk about some of the AI tools.

Hugo Bowne-Anderson

28:04

I've done an initial slicing into different levels, which may be useful thinking through the evolution of these tools. And how do we use AI to help us code? And I jokingly level zero because we zero index here is copy and pasting from Stack Overflow. Right now, that's not quite using AI, but it is AI is using collective wisdom as opposed to stuff in yourself. Right. So we've got that.

28:27

Okay. But then after our chat GBT moment, we've people started copy instead of copying and pasting between Stack Overflow and their IDE, VS Code, Jupyter Notebooks, whatever it may be. people started copy and pasting between ChatGPT and the IDE. So you would, you know, let's say you get an error message from your IDE, copy it into ChatGPT, give it a bit more context, whatever it may be. And then such as, you know, give it the code you wrote, plus the error message, plus your environment.

28:56

And it will be pretty good at helping you, depending on what packages you're using, what framework. If you're working in PyData, absolutely fantastic, right? Of course, it's trained on scikit-learn. It's trained on MapLotlib. It's trained on Seaborn. It's trained on spaCy. One of the biggest frustrations at that point was it wasn't trained on ChatGPT's current API. So it would always give you, even if you corrected it, it would always revert to previous OpenAI API.

29:23

Okay. So level one, copy and pasting. Level two, code completion in IDE. So you can think co-pilot or whatever. You're writing code and it will start suggesting things, right? So that's just working your IDE. Now, I think these things people probably know about. But then level three is where things get really wild for me, where you actually, you have an agent in your IDE or your terminal.

29:45

So cursor, which is a VS CodeFork, has an agent and a chat in it where you can have an empty repository and say, hey, I want you to write a program that creates a RAG pipeline over the documents in this subdirectory or something like that. And it will go and do that immediately. It will be, when you do that, it won't be great. It'll do all types of nonsense. It might write lots of directories and subdirectories.

30:13

So we can talk about some of the gotchas, but this is agentic coding where you're chatting and it just throws stuff in. I will also add, and this is something we mentioned briefly beforehand. I don't like typing that much, to be honest. So I use Super Whisper. There are other tools to do this where I dictate to it. And I also have a stream deck, which is what we mentioned beforehand, which a lot of content creators use this. It's like buttons

30:39

and knobs that you can assign to macros. And so when I have a button that opens cursor, puts it in agent mode, attaches Claude, Sonnet 4.5 or Gemma 2.5, whatever it may be. Exactly. That's a stream deck. And then I have accept code buttons and reject code and that type of stuff. So I can actually build a not insignificant amount of software with my voice and three buttons,

Michael Kennedy

31:02

which is so powerful. I also find that if I do dictation, I can be a lot more patient and thorough in a lot of ways, right? Like I don't know about you, but I've had like RSI issues. So I got to be real cognizant of like how much typing I do. I've got my Microsoft Sculpt ergonomic keyboard that I drag with me everywhere because square keyboards will destroy me in like a week. And so it lets me go on without worrying about those kinds of things.

31:27

I use a Mac Whisper, but it's super similar, I believe, right? It just uses the same underlying engine. And it's really nice that you can, I do it for email. I do it for lots of things.

Hugo Bowne-Anderson

31:37

I'll just give that. I love how you mentioned it does help one be more patient because yeah, when typing, well, when chatting with an AI, you can get frustrated and the friction in typing and correcting yourself and that type of stuff, you just don't have when speaking natural language.

Michael Kennedy

31:51

Yeah. And I find people, I don't, I mean, this is not scientific, but my experience has been that people say like, oh, this stuff is not good. it always just gives me junk results and so on, is so often there's not enough information given. It'd be like, create me a graph, not use Plotly to create this type of graph with this type of focus from this data like I did before. You know what I mean? Those are really different things.

32:14

And the more specificity you can give these tools, the better. I use a lot of AI stuff. And I would say I have plenty of prompts that are pages long. And I was like, here's a file, Here's four pages of what I want you to do with it. Let's go. And it is not always writing. So but incredible results compared to what if you just say, you know, analyze this or whatever.

Hugo Bowne-Anderson

32:34

We'll get to this when we talk about gotchas. But having a conversation with your system before getting it to write anything is incredibly important and productive.

Michael Kennedy

32:46

This portion of Talk Python To Me is brought to you by NordStellar. NordStellar is a threat exposure management platform from the Nord security family, the folks behind NordVPN that combines dark web intelligence, session hijacking prevention, brand and abuse detection, and external attack service management. Keeping your team and your

33:04

company secure is a daunting challenge. That's why you need NordStellar on your side. It's a comprehensive set of services, monitoring, and alerts to limit your exposure to breaches and attacks and act instantly if something does happen. Here's how it works. NordStellar detects compromised employee and consumer credentials. It detects stolen authentication cookies found in InfoSteeler logs and dark web sources, then revokes live sessions and flags compromised devices,

33:34

reducing MFA bypass ATOs without extra code in your app. Nordstellar scans the dark web for cyber threats targeting your company. It monitors forums, markets, ransomware blogs, and over 25,000 cybercrime telegram channels with alerting and searchable context you can route to Slack or your IRR tool. Nordstellar adds brand and domain protection. It detects cyber squats and lookalikes via visual, content similarity, and search transparency logs, plus broader brand abuse

34:05

takedowns across the web, social, and app stores to cut the phishing risk for your users. They don't just alert you about impersonation, they file and manage the removals. Finally, Nordstellar is developer-friendly. It's available as a platform and API with integrations for Splunk, QRadar, Datadog, Sentinel, Elastic, and Cortex. No agents to install. If security is important to you and your organization, check out Nordstellar. Visit talkpython.fm/nordstellar.

34:31

The link is in your podcast player's show notes and on the episode page. Please use our link, talkpython.fm/nordstellar, so that they know that you heard about their service from us. Thank you to the whole Nord security team for supporting Talk Python To Me.

Hugo Bowne-Anderson

34:46

Isaac Flath, who he has a wonderful course on Maven called Elite AI Assisted Coding, which I'm actually starting as a student this week. And I may give a guest talk there. He wrote, I don't know whether he came up with it, but it's, it's, he calls it Socratic and dialogue driven development where you essentially pair program with the AI. Don't expect it to do everything, but have, have conversations. So the other thing is a lot of these agentic systems like

35:08

cursor, which I use daily, probably a bit too much. You can plug into any, you know, state of the art API. So, you know, Sonnet 4.5 came out recently and the day after you could use that in Cursor. Gemini 2.5 came out a while ago, then you couldn't plug it in. GPT-5 and so on. The other thing I just wanted to give a shout out to is Continue. Tyler Dunn, speaking of modern tools and open

35:31

source tools, it's like an open source cursor of sorts. And I mean, he wouldn't frame it that way, but you can have all your own local models, data preserving, privacy preserving, and use those in in this way. So just going back to this slicing though, level one, copy and paste code, level two, code completion, level three agents in an IDE or terminal. So Claude code, for example, can be in your terminal, cursor, VS Code, fork. Level four is embedded in other tools like Slack or Discord or

36:01

email. You can tag cursor in Slack to, if you notice a documentation fault, you can tag it in Slack to fix that. Manus, you can tag in email threads now. Level five is more proactive. So all of these are reactive systems. Level five is more proactive. So you can have Cursor, for example, and a lot of these systems, I just use Cursor the most, to do code review in CI, in continuous integration. So whenever I submit a PR, Cursor can come in and do a code review there.

36:31

Now I'm getting a bit to future music. At level six, we've got async or background agents that just do stuff in the background, essentially, which we're going to see a lot more of. And then level seven, which we haven't seen so much proactive agents. And I think these are going to be huge that will just notice stuff happening in production. Like, hey, we have this outlier here. Oh, this didn't quite work. Or agents that come to me on a Monday morning and be like, hey, check this out.

36:57

Check that out. Check that. Like a good colleague, right? A team member. Yeah. So, but we're already at, you know, level five and kind of got background agents as well. So we're getting, yeah, to a lot of really exciting places.

Michael Kennedy

37:13

Yeah. So I'm a hundred percent bought in up to level four, probably like certainly the agent decoding a little bit of the code review, not so much in CI, but like looking at the stuff that I'm, I might ask it like, Hey, what's going on here? Why is it like this background agents? I just haven't, I haven't got there. They seem, they seem like they don't have enough to work with, right? they don't have my whole machine and all the setups and all the things they need. But I can

37:36

see how they would be useful. Certainly the way you're describing like a good colleague.

Hugo Bowne-Anderson

37:39

And so one example is not necessarily in software, but background agents, and I haven't done this, but I've got friends and colleagues who've built background agents that monitor their inbox and will ping them like their email, sorry, and will cluster emails and be like, hey, you really should reply to this one. This is a prospect or a client that really needs your attention right now.

Michael Kennedy

37:57

I think that kind of stuff would be really neat. I will throw out there now, this comes from Sentry, who is a sponsor of the show, just to be fair, but I'm not doing this because they sponsored it. They added this thing called Seer, S-E-E-R, that when your app collects an exception or something or doesn't collect it and it gets up there, depending on how you send it up there, is it'll apply AI to whatever it receives. And it'll look, if you bind it to your GitHub and so

38:19

on, it'll like try to understand the project. And maybe by the time you get to look at the bug report, it actually has a solution suggested as well, which, and it'll do a PR, which that's on the verge of what you were suggesting as this sort of like proactive buddy that's just hanging out

Hugo Bowne-Anderson

38:34

there. As I said, I do want to mention just a few gotchas. Or actually, let me just say some of the really powerful use cases in data science, since we are talking about foundation models for data science, just writing code. I mean, text to seek all these LLMs are trained on now, right? So writing SQL code based on what you say or what you type in natural language. Now, chat with it beforehand so that you make sure it understands your schema. It may not be great at complex joins,

39:03

this type of stuff. Understand the system you're working with, get a feel for how it works. Actually think of it as like a super excited ADHD-esque, perhaps slightly autistic as well. And I mean that with all the love in terms of, you know, absolute like deep memory, be able to recall a lot of information. And on the ADHD spectrum, in terms of just how it will do, like it will spread its attention over a lot of different places and create lots of different stuff, some of which may

39:30

not work, but some of which will be incredible. So have empathy for your system in that sense, but writing SQL code, amazing. PyDataStack, incredible. I want to run an idea by you and

Michael Kennedy

39:39

just see what you think here with what you just said with like, think of it as this super excited, somewhat scatterbrain, junior helper, excited friend. If you had hired somebody, even if they went to Stanford, but they hadn't really done work on any like major data science projects, and they came to your company and you gave them a job, would you expect 100%, like absolutely 100% correctness? And they're, no. So I don't know why people expect the AI to be literally 100%.

40:05

I think I have an idea why, but expecting the AI to be 100% right, where it's kind of doing some of this human level type of reason. I mean, not thinking, I'm not saying that, but this kind of problem, this creative problem solving, we should have a little bit of patience for if it gets it wrong, especially if we give it poor directions. You know what I mean? And I just feel like people so often think, well, it's a computer and it's wrong, so it's trash. It's no good. Well, was it 95% right?

40:31

Because that's really helpful.

Hugo Bowne-Anderson

40:32

This is so important. And there are several things in there. Firstly, I think when after our chat champion moment, Jeremy Howard noticed a complaint that people were like, oh, I need to chat with it like it doesn't do the thing the first time and I need to correct it. And Jeremy was like, what type of human don't you need to have a conversation with to learn stuff? The other thing is in nearly all of these systems now, you have memory. And cursor, for example, has cursor rules.

40:55

Whereas if you notice stuff that it does or doesn't do, put it in the rules. You can have project-specific rules. You can have cursor general rules, this type of stuff. Such as always explain your reasoning. And to your point, I don't think these things think, but they mimic reasoning and thought in pretty sophisticated ways. So I totally agree with that. Also, use these things to not only write code, but to write tests for code producers, right? To debug, to add assertions.

41:24

Now, also make sure you're always reading the code on average that it writes. If things are really important, it's about your appetite for risk. Maybe if things are really important, make sure you know what the code does and read it. If things aren't as important, maybe you don't need to. And an example I'll give there is one of the biggest wins here is being able to vibe code your own data viewers. So let's say I'm building.

Michael Kennedy

41:48

100% I agree with you on this. Incredible, right?

Hugo Bowne-Anderson

41:51

So if I'm building an LLM powered app where, okay, building an LLM powered app and I've got all these conversations, I can build a vibe code, a custom, a bespoke custom viewer to view that. If it's an agent that writes emails, I can even build a viewer that displays the conversations as emails or whatever it may be, right? Now, I need to make sure that it's displaying the correct information. So I need to make sure that my traces are actually looking correct. So I need to understand that code.

42:20

But the ins and outs of the front end it's building, not the biggest deal for me. So figure out what's important and what isn't.

Michael Kennedy

42:26

There's so many tools like this that are like, it's not me worth taking a week to write that. But wouldn't it be cool if I had that? And nothing depends on it. It's kind of like its own side little utility. There's no, it's not a building block that's going to become an important foundational thing. It just, if it works, you're like, that's awesome. I have that.

42:44

People got to take advantage of just going like, oh, I need a utility that does X or I need a view in our admin section of the web app that does this. Nothing depends on it. If it works, amazing. If it doesn't, well, it didn't exist anyway. So whatever. And there's so many opportunities.

Hugo Bowne-Anderson

43:01

I love this so much. And I want to take a slight detour because it's so important. I actually think the surface area of what software is, is expanding and changing completely. So let's just look at, take a bit of a sociological big picture look at history where software, classically, you've, has been very expensive to build, right? You've needed to pay a lot of not inexpensive engineers to build out a sophisticated product. For that reason, you've had to get a lot of demand.

43:32

You've had to have a lot of service area of the market. What that's meant is that you've needed to cover a lot of edge cases to satisfy a large market so that you can make revenue based on the costs that you're accruing, right? Now, with the ability to even vibe code or use AI-assisted coding to build this type stuff, it changes what we can build and what it needs, what type of people it needs to satisfy.

43:58

So that's why the conversation of like vibe coding is going to bury Salesforce or SaaS versus like old men screaming at clouds, vibe coding sucks or whatever it is, misses the entire middle ground of the types of things that are possible. And I do think internal products, such as we've talked about data viewers, internal products, such as, you know, lots of companies are ripping out their marketing automation stacks and building things internally through to, I mean, I've got a friend

Michael Kennedy

44:26

who built a chess tutor app. I mean, it's not chess.com, but like a hundred of his friends

Hugo Bowne-Anderson

44:32

use it, right? So the idea of the different types of software we can build now and thinking about, you know, ephemeral software or just in time software, disposable software, fast software that we can build to solve a problem right now and then move on. I think we need to shift our

Michael Kennedy

44:48

model of what software actually is. I 100% agree. And it's such an exciting time. I also think this actually has an implication for IPI packages and a lot of these just external packages. So think about the ones you use, like this is not going to happen for pandas or polars or Jupyter or

45:05

something like that. But how often have you gone like, oh, I need, I need a package that will let me look up both my ipv4 and ipv6 ip address and you you'd make you take that as a dependency that's like one fun using one function out of whatever thing you grab that might give you that information and there's there's a lot more opportunity to have fewer building blocks and dependencies that you need in your application if you can just say hey agent thing i need this and

45:34

put it in its own file and boom and now you're not dependent on well did that thing upgrade to Python 3.15, sorry, you're stuck. It didn't. You know what I mean? Like it's, you were sort of free from, from these like little weird, I depend on this whole tree of dependencies because I need one little piece of functionality. You can vendor in stuff a lot easier if it's a low stakes.

Hugo Bowne-Anderson

45:54

The other thing that I think AI assisted coding help I've seen help with is, now this is, this will be a bit controversial, but it's going from prototype to production. And what I really mean by that? Let's say your production stack is in like Databricks, Spark, whatever. You can prototype, write your pandas code. And then because once again, Spark, Databricks, all that documentation

46:14

is in training data. You can then convert it relatively easily. But of course, read, test your code that you're pushing to prod just as you would with a junior software engineer, right? The other thing is some gotchas with this. It'll do lots of things you don't ask it to do, okay? such as it will just start downloading packages, for example, if you're trying to write some code, right? And so it'll do lots of things you don't ask it to do, and it will also do things you ask it to not do, right?

46:44

So you'll say only write in one file, and Devon, for example, will create nested subdirectories each with 15 notebooks, something like that. It will also forget lots of things in your conversation. So make sure to remind it of things. It will do something called dead looping. And this is one of the most frustrating and pernicious things, depending on how long the loop is. But it will try to solve your most recent concern. It will really optimize locally around the conversation.

47:15

And so it will solve that. Then another error will appear and it will solve that. And it will another error appear. It will solve that. And after a while, it might actually go back to the initial state. So that's what dead looping is. And this can be a loop of three.

Michael Kennedy

47:26

And it just cycles. Yeah.

Hugo Bowne-Anderson

47:27

five, seven, nine. So be very careful about that. Solutions are get it to zoom out. Say, hey, let's zoom out and have a holistic conversation about this. Now, this is wild that this is how we're interacting with software as well. Now, this is part of the- It's science fiction. It really is. Science fiction. But write product requirement drops with it before any code. Now, it may still just start writing code, even when you tell it to do that, right? Plan with it.

47:50

Write rules. Have empathy for your super excited, bright, fast, forgetful intern.

Michael Kennedy

47:57

I've really embraced this plan thing. All my major projects have a plans folder. And every time I start, I create a markdown file. I say, we're going to plan it out. And I want you to write in this markdown file what we're going to do. And I'll do that with a really expensive model, you know, like a thinking something or whatever, and it'll do it. Then I'll switch. I'll completely throw away that chat. Get another one, a little lower model. Say, we're going to do phase one of this plan.

48:19

Let's go. And then two and then three. And every time I say, when you're done, you update the plan. So you know where we are and what we've done. What's not. And it's tremendously successful.

Hugo Bowne-Anderson

48:28

There is a concern of sometimes you want memory from one chat. Like sometimes a chat just degrades. So start a new conversation. And there are clever ways, different for different products and models, but to get it to summarize the conversation so far, like literally say, I'm going to take another instance of you. Let's squash this so I can pass the important memories to it and so on.

48:48

The other thing is Cursor, a lot of these products have, it used to be called YOLO mode on Cursor, Like Y-O-L-O mode, right? Where it just executes.

Michael Kennedy

48:58

I run in YOLO mode, by the way, I do it.

Hugo Bowne-Anderson

49:00

I have too much anxiety around that.

Michael Kennedy

49:02

So here's the thing that's really gotten that I've really noticed that's interesting is my Git discipline is significantly better now that I'm, if I do that. So I'll open it up and I'll go and I'll start having a chat and I'll make sure everything is checked in. I might do a separate branch if I think it's going to go bonkers. And then it'll do a little bit of work. And I'm like, okay, that's successful. So I'll stage those changes, but not even commit them.

49:26

And then I'll let it keep going and it'll create more work. And I see if it's successful, I'll just like put, and then eventually I'll commit it at the end. It won't even push it necessarily. And then you can also, while it's running, have the get diff window open and you can just sort of see what it's doing by looking at the diffs that start to appear. And so that's why I'm okay with YOLO mode because I can always just get revert and we're fine again.

Hugo Bowne-Anderson

49:49

Totally. And also, I didn't mention this, but these AI assistants are great at looking at diffs with you. And I also, I'm really thinking through what happens when we have so much AI generated code. And I think part of the future of work, I don't want to get too sci-fi-esque, but how many agents can you manage simultaneously? Like maybe the SuperSpa employees will be people who can manage 100 agents.

Michael Kennedy

50:12

It's like the revenge of the program manager, I'm telling you.

Hugo Bowne-Anderson

50:15

And in fact, project management, product management are some of the most important skills moving forward. But also when you have so many agents generating code, what happens to CI/CD? What happens to Jenkins? And there's going to be a whole new space of products and agents that may deal with these types of systems, right? Look during the industrial revolution, what sprung up when looms like appeared and the satellite industries that happened there.

50:39

So I think we're talking about job automation, job automation and job displacement, but the amount of new jobs that will become available, I'm actually very excited about. And we'll put in the show notes, please. But there's an essay by Tim O'Reilly. And I did a podcast with him about it, actually, that we can link to if you're up for it, called The End of Programming As We Know It.

50:57

OK. And he essentially, with his great depth of historical knowledge and his forward thinking through vectors, prevents a really wonderful vision and ideas around where software is heading. Exactly. What about exploring data?

Michael Kennedy

51:13

So we've been talking about mostly writing data, testing, writing code, testing code in the context of data science, but I think it's actually really pretty powerful to say, here's a CSV. This is basically what it means. Let's start looking at, get me some graphs, pull me out some trends. What's important? What do you see that I didn't see? What do you think about this, like

51:34

this exploratory analysis side of it? And that's not even, you know, that doesn't even worry you about like, is there maybe a bug in the code because I don't want to put it in production, right? It's just, it's fooling around to get a jumpstart on understanding.

Hugo Bowne-Anderson

51:45

So firstly, exploratory data analysis is one of a scientist's and data scientist's most important jobs. So the question then becomes is how can we get AI to help us see what's happening that's so integral to what we do? And the truth is they're wonderful. They can be wonderful at pulling out

52:02

insights that I just haven't noticed, or I don't even think about how to visualize. Now, if I ask it to find the mean or median, it may suck at that on average, unless it writes the code to do so, But when you get it to do EDA or exploratory data analysis, it will provide insights that I haven't thought of. So one example I saw recently from a client was throwing in thousands of rows of customer data and website data of customers.

52:33

And it immediately showed clusters of high usage versus low usage. You could see power users. You could see power users who spent a lot on the platform. You could see power users who didn't spend much on the platform. And this is the type of work that takes hours for data scientists to sift through and develop hypotheses around. So when we're talking about, you know, data science is exploration and hypothesis driven, right?

52:58

So when we're talking about exploration and hypothesis driven data science, it's wonderful there. And once again, this isn't to replace what we do, but it's to help us.

Michael Kennedy

53:07

It's having a thought partner, which is fast. Beyond the computer, a different bicycle of the mind. of a sort. Yeah. Maybe like a, maybe it's like an e-bike of the mind. What do you think? I love it. Yeah. E-bikes. I love e-bikes.

Hugo Bowne-Anderson

53:19

They're awesome. And I especially do because I live in Sydney, right? So getting to the beach, it's actually quite hilly depending where, where you are. And the other thing I, yeah, when, so hopefully we have time to talk about this, but something I do a lot of work on is what I call evaluation driven development for LLM and AI powered applications where a really important part of this is error analysis and failure analysis. So seeing where your applications fail. So let's

53:49

say we're building a chatbot, which is RAG. So it has some corpus of documents and it retrieves stuff from it. And we want to interact with it. When we're building that, some of the first steps and ongoing process is seeing what failure modes they are. Is it hallucinating or is it not retrieving things correctly? Is it looking at the wrong documents? These types of things. And you do that to drive the development and iterative process of AI powered software. Now you can also use AI

54:18

to look at that and look at the results in the data exploration. And it can really bring out a lot of different failure modes. It can say, Hey, look at this cluster of conversations where the user finally got the correct response, but they asked to be connected to a representative first, and they didn't get connected to a representative. So this is actually an example where the conversation looks like it's resolved. The support ticket was resolved or whatever it is, but there's a deep

54:44

failure mode within there. So AI can be very good in terms of the data exploration and hypothesis driving process with that. So that was an example, but yeah, should we get into building LLM powered

Michael Kennedy

54:56

software? Yeah, sure. So I wonder whether you'd like to bring up the figure that I will talk

Hugo Bowne-Anderson

55:01

through because I know there'll be people listening and I'll share it in the chat with you. So this is a slightly tongue-in-cheek figure. It speaks to real pain we have. So on the x-axis, we have time and we're talking about building software. On the y-axis, we have excitement or dopamine, if you will, if you want to measure it. And for the scientists out there, I apologize for not having units on my

55:20

axes, but it's all good. We know with traditional software, excitement increases over time. Things are pretty boring at the start. You have a hello world and basic features and add unit tests, then you scale and optimize and load balance. And over time, excitement increases and increases and increases. This is absolutely inverted with generative AI powered software, right? Where you have a flashy demo at the start. You're like, wow, check this out. Okay. Then you're like, oh, wait,

55:44

does it actually work? Or can someone else use it? And so you have issues with basic functionality then. So excitement goes down. Then you're like, oh no, all these hallucinations, excitement goes down. Then you have monitoring challenges. You're like, how can I even look at all my conversations and tool calls and actions and this type of stuff? Excitement goes down. Then you have integration

56:03

issues? How do I integrate into my enterprise stack? Excitement goes down again. And I don't think it's a coincidence that this is one of the most exciting technologies of a generation that's totally addicted to Instagram as well, right? But the question is, and a lot of work I do, is in helping people raise the curve, not change anything, not even change the excitement of the

56:23

flashy demo, but just make sure that excitement goes up and up and up as you build. And once again, There's no free lunch. You've got to do this the hard way, right? So this is something, and a short plug for a course I teach, but this is something I teach a lot on. And I teach a Maven course, building LM powered applications for data science and software engineers with my colleague, Stefan Krauchik, who he works on agent infrastructure at Salesforce. I bring a lot of the data and ML stuff.

56:50

He brings a lot of the ops stuff. But yeah, in this course, we teach a lot of these things. And the way we do that, one way to think about it is what we call evaluation-driven development. So that's not necessarily having very sophisticated evals and tests and that type of stuff to start, but it's about having a sense of where you want your product to go and having a sense of evaluation of how to drive it in that direction. And the skillset that's required for this is so similar

57:22

to people who've built data science and ML powered products, right? So it's a curiosity to explore data. It's a hacker mindset. It's experimenting with different tools. And what have we been doing in the PyDataStack? I mean, look at the data viz landscape in Python. I think it was maybe PyCon 2018 that Jake Vanderplass, I think first time gave his talk, you know, data visualization in Python. And, you know, there were so many tools one could use there. It's this type of mindset

57:49

you need. And then in terms of the product workflow, how you make sure you don't kind of go down this curve, lose excitement in what I call proof of concept purgatory. People call the plateau of productivity. There are all types of fun names about it. But what you do is you use the machine learning mindset, which is get some data in and out of your system. If you don't have users yet,

58:11

you can generate synthetic data to do so, right? Then you label it, pass or fail initially. And you give it a failure mode such as it was hallucination retrieval error wrong tool call and a tool call essentially is what an agent does so an agent is an llm plus tool calls a tool call could be ping an api send an email whatever it is agent plus tool call in some you know while loop or for loop right and so you have did it did this particular interaction pass or fail what was the

58:43

failure mode and then how do I fix these particular failure modes? And I mean, one way to do this initially, depending on the complexity of your system, is to put all of this in a spreadsheet and do a pivot table. I know AI engineers hate it when I tell them to do pivot tables, but if you rank order, use a pivot table to rank order your failure modes by frequency, then you can see what

59:03

to fix first. And if it's a retrieval error, maybe you want to fix the rag part of your system and the retrieval part, right? As opposed to the generative part. So focus on your embeddings or chunks. If it's a tool call, focus on how that tool call is defined, heuristics that the LLM uses there.

Michael Kennedy

59:20

A lot of the time, if you're doing... I would like to add, well, this is great. I would like to add that you can use those agentic coding tools to help build better analytics. Absolutely. Other support. So you're like, God, we don't really have, we can't really track that, right? Like, well, give it half an hour and you can, you know what I mean? Exactly. And so you got to kind of think out of the box and be a little creative there.

59:42

Maybe build an MCP server that specializes in solving a specialized LLM that addresses a shortcoming that you can then force it to use the MCP server and work more constrained or whatever.

Hugo Bowne-Anderson

59:53

Exactly. And as we kind of hinted at earlier, one of the wacky things here is that a lot of this comes down to prompting. And people like, should I fine tune or prompt engineer or rag or that type of stuff? Prompt and prompt and prompt initially, because you can get so much lift by prompting. Now, if it is actually a retrieval error, perhaps you want to improve your embeddings or your chunking strategy or that type of stuff. The other thing is, of course, data, metadata, data ingestion.

01:00:19

Draw an architectural diagram of your system where you see, you know, you have your RAG, you have your output, you have your embeddings, you have your OCR, or if you've got PDS, however you're ingesting your data, a huge amount of the time, fixing how you do your OCR on your PDFs will be far more significant lift than switching out to Claude Sonnet 4.5, okay?

Michael Kennedy

01:00:40

At that point, I totally understand

Hugo Bowne-Anderson

01:00:42

why people want to try the newest and sexiest model. And I'm not telling people not to. What I'm saying is focus on the fundamentals. And then when you have this set of evals of labeled data of what works, what doesn't, You want a test set, essentially, right? Like in machine learning. So it's the same process. You have this test set, your gold standard, which ideally covers, has coverage over all your failure modes. You want eval coverage.

01:01:09

Then when you switch out to a new model, you can see how it performed on your test set. Imagine that. Imagine being able to switch out a model and seeing what's up there.

Michael Kennedy

01:01:18

We will say it's better concretely with data, not just it feels better, which is often.

Hugo Bowne-Anderson

01:01:23

There are all these eval conversations about we don't want these evals. We want online evals. And then there are people do things only by vibes. And all of these things are absolutely valid as well. I think it's a healthy combination for whatever your product needs at any point in time.

Michael Kennedy

01:01:36

I do want to take just a couple minutes, literally, and get your thoughts on people coming into the industry. Data scientists who are just graduating now, or they got their first job, or they're about to get their first job. Like, I imagine this is something of a scary time. I think, well, now I'm not just competing with all the other people. Now these like AI things are biting against me getting a job as well. But I think it's both a blessing and a curse. What do you think?

Hugo Bowne-Anderson

01:02:00

Focus on three things. What value you can deliver? What's your skill as a data scientist? And it is looking at data. It is building and it is tying that to business value. So if you focus on your skills and you build, build, build and consistently tie it to business value, I think you'll go a long way. And this actually speaks to how we think about evaluation more generally.

01:02:23

And I do want to give a shout out to a Sydney-based company called Lorikeet that they build customer support agents for all types of industries. And when they do evaluation, they always evaluate was the ticket solved or not. That's the most important evaluation. It's not did this LLM call result in what we wanted it to be or anything along those lines. Of course, once they see a failure, they tie it back to these more technical LLM-based failure modes.

01:02:53

But I wonder if, yeah, maybe Google Lorikeet AI or something like that.

Michael Kennedy

01:02:58

I had it in another tab. I don't know. I'll get it back.

Hugo Bowne-Anderson

01:03:00

One thing I love about, and I'm indexing on them. There are lots of companies that do this. But one thing that's very interesting about them is that their pricing model is ticket resolution based, right? So if they've built a concierge or customer service agent for you, you pay based on how many tickets are resolved. A lot of their competitors, their pricing is based on tokens, right? Because that's what they pay for.

01:03:24

And in terms of aligning incentives, having it based around resolution is incredibly important for how you build as well. And the reason, the question was, what should early stage data scientists focus on? The reason I detoured on that story is supreme focus on business value, how your skills can be tied to business value via building stuff.

Michael Kennedy

01:03:46

It's great advice. And I will just throw one more thing out there for people is don't let these AI tools undercut your desire to actually learn the details. If you just go like, all right, I asked it and it gave me the answer and it just streamed by and you didn't pay attention, you're doing it wrong. You've got to stop, read, pay attention, question why did it do that. You can even ask it, why did you do this? Rather than that way, it might give you a documentation link.

01:04:11

Like you got to stay active and it's so easy to just go next, next, next, because it's exciting that it's building something. Absolutely.

Hugo Bowne-Anderson

01:04:18

And I will add one other thing there, which is we need to carve out time. These things won't always work. You can spend a day working with AI and make less progress than if you'd done it yourself as well. I want to be very clear about that. What we all need to do is figure out organizations we can work at, work with, and then time ourselves to experiment with these seriously emerging and rapidly changing technologies. So it won't always be wins is my point.

01:04:45

So don't get discouraged when it isn't.

Michael Kennedy

01:04:47

Awesome, Hugo. Thank you for being on the show, sharing what you've been up to, your view of this. It's got your pleasure. Absolutely. I'll put a link to your podcasts, your courses, stuff like that in the show notes for people.

Hugo Bowne-Anderson

01:04:56

I'd love that. Oh, and I mentioned this to you earlier, but always love Talk Python, of course, and I'm so grateful for having me on three times. And I'd love to offer your audience 20% off my course as well. So we'll include that link in the show notes.

Michael Kennedy

01:05:09

Beautiful. We'll put it right by the link. All right. Well, thanks for being here. We live in weird and amazing and crazy times. And yeah, I'm going to leave it with that.

Hugo Bowne-Anderson

01:05:17

On the journey together. Thank you. Thanks, Michael.

Michael Kennedy

01:05:19

That's right. See you later. Bye, everyone. Ciao. This has been another episode of Talk Python To Me. Thank you to our sponsors. Be sure to check out what they're offering. It really helps support the show. This episode is sponsored by Posit Connect from the makers of Shiny.

01:05:34

publish share and deploy all of your data projects that you're creating using python streamlet dash shiny bokeh fast api flask quarto reports dashboards and apis posit connect supports all of them try posit connect for free by going to talkpython.fm/ posit p-o-s-i-t this episode is brought to you by nord stellar nord stellar is a threat exposure management platform from the Nord security family, the folks behind Nord VPN that combines dark web intelligence, session

01:06:05

hijacking prevention, brand and domain abuse detection, and external attack surface management. Learn more and get started keeping your team safe at talkpython.fm/nordstellar. If you or your team needs to learn Python, we have over 270 hours of beginner and advanced courses on topics ranging from complete beginners to async code, Flask, Django, HTML and even LLMs. Best of all, there's not a subscription in sight. Browse the catalog at talkpython.fm. Be sure and subscribe to the show.

01:06:37

Open your favorite podcast player app, search for Python, we should be right at the top. If you enjoy the Geeky Rap theme song, you will be right at the top. You can download the full track. The link is your podcast player show notes. This Is Your Host Michael Kennedy. Thank you so much for listening. I really appreciate it. Now get out there and write some Python code.

01:07:03

Talk Python To Me, yeah we ready to roll Upgrading the code, no fear of getting whole We tapped into that modern vibe, overcame each storm Talk Python To Me, I-Sync is the norm you

Transcript source: Provided by creator in RSS feed: download file

Episode description

Transcript