
Testing at Scale: What Every Developer Needs to Know with Maurício Aniche

Jan 30, 2025 · 56 min · Ep. 191

Transcript

Hi everyone, my name is Patrick Akil, and joining me today is Maurício Aniche, previously professor and researcher over at Delft University of Technology and now tech lead for testing enablement over at Adyen, an organization over 1,000 engineers strong, quite a large group actually.

So we talk about the testing strategy in an organization like that, how you enable testing with developer tooling and platform engineering, and how AI has been influencing testing and will influence testing with LLMs now introduced in the picture. Lots of fun this episode. But before we go there, I want to say a big, big thank you to everyone, each and every single one of you, because we are 30,000 strong now. Big 30. And this is also the year that I'll be turning 30.

So it's a big gift to me. 30s and 30, everything coming together, a lot of fun. Enjoy the episode. As a developer in industry, you are more constrained in the set of problems you can work on day-to-day, because at the end of the day you have to make an impact on the business of the company, right? So you have a smaller set of problems that you can tackle. But when you do, the impact is right away, right? You ship software, you're changing people's lives on a daily basis.

As an academic, you have the opposite. You have full freedom, so you can pick any problem on the planet to work on, and that is amazing. But at the same time, for you to truly see your research having an impact on the real world, that takes too long, right? I would spend like a year working on a project and then writing a paper and then going through all, you know, the peer-reviewed publication process. That is quite tough. And then I had a paper and I

was like, now what? How does this get to the real world, right? So it's a two-sided knife in a way, right? You have all the freedom in the world, but it's so hard for you to identify the right problem to work on, and then to have your idea picked up by industry and people seeing it applied. And for me it was always about seeing my stuff being applied. So I couldn't just be a theoretical guy whose work, you know, will be spoken about 50 years from now.

That's not really for me, if you like. But academia also allows you to really think deeply about a problem, right? As an engineer, what I see is you have a lot of very complex problems where you could come up with amazing, beautiful tools to solve the problem, but you usually do the smallest thing that you can to solve the problem so that the business benefits from it. And then you move to sort of the next thing that the business wants.

And that's just life. But in academia, you can look at a problem and say, I'm going to go as deep as humans can go on this problem, right? And there's no business pressure telling you, hey, just move to another problem. So the people that I admire the most in academia, those people are working for 15, 20 years on the same problem, right? And that's truly amazing. Not really for me, but that's really amazing, right?

And I think that's why both complement each other, because you need people solving problems right now, so society benefits right now from small improvements, but you also need people doing moonshots, right? That maybe 10 years, 15 years from now will pay off, and then we can use their technology. It's a nice balance, I feel. Yeah, I love that perspective, and I didn't think about it from that side.

Like the freedom to explore a problem and to, I mean, you can time-box and you can self-manage, but to actually have the freedom to go as deep as you want. I know a lot of engineers that go really, really deep into some of the problems, and indeed they do it more so on their own time, from an educational standpoint, from a curiosity standpoint. Some people can't just drop it at a surface level.

They truly need to understand before they actually execute, and they usually do that in their own time. So having the freedom to do that in academia, and work on problems for, let's say, our future selves or our future planet, sounds amazing. I do wonder though, about this length you mentioned: from idea to execution to paper and then people actually adopting it. It sounds like quite a long cycle. And then conversely, on the business side, sometimes we're too short. Everything is deadlines, fast-paced, too many problems to solve. Solve it in the quickest way and move on to the next. It's like we need to find a balance, I feel, on both sides. Yeah, for sure. Yeah. But then this fast-paced industry world, that's where you're in now. What kind of problems are you solving that you can share with us?

So I'm currently at Adyen, right? The famous financial technology company. It started as a payments company, for those that don't know it, and now Adyen offers a plethora of financial services to large companies like Meta, Uber, Spotify, Netflix and the like. And I'm part of platform engineering. So we have a big group that does

platform, right? And the idea is, let's just give our engineers a platform so that they can be more productive in delivering business value, financial products, to the market. Right. Is the platform developed in-house? Yes. OK. Yes, yeah, we rely on open source, of course, we have some foundation on open source systems, but we have lots of layers that were done by us, and my team in particular, we do everything related to testing.

So we build and maintain the tools that our Adyen engineers use to write tests for their systems. So think of everything from, you know, very small helpers or layers on top of JUnit so people can write simple unit tests, up to the CI/CD running complex integration tests, and static analysis. So we are the ones that sort of control the static analysis tools that we have in-house, and sometimes we write new static analysis rules ourselves for specific problems that engineers have.
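To make that concrete, here is a minimal sketch of what one thin in-house layer on top of JUnit 5 could look like: an extension that resets shared test state before every test so engineers don't have to remember to do it themselves. The class names (FakeBalanceStore, CleanStoreExtension, BalanceTest) are invented for illustration; this is not Adyen's actual tooling.

```java
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.BeforeEachCallback;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.jupiter.api.extension.ExtensionContext;

import java.util.HashMap;
import java.util.Map;

import static org.junit.jupiter.api.Assertions.assertEquals;

// Hypothetical in-memory fixture shared by tests.
class FakeBalanceStore {
    static final Map<String, Long> BALANCES = new HashMap<>();
    static void reset() { BALANCES.clear(); }
}

// The "thin layer on top of JUnit": an extension that wipes shared state
// before each test so every test starts from a clean slate.
class CleanStoreExtension implements BeforeEachCallback {
    @Override
    public void beforeEach(ExtensionContext context) {
        FakeBalanceStore.reset();
    }
}

@ExtendWith(CleanStoreExtension.class)
class BalanceTest {
    @Test
    void depositIncreasesBalance() {
        FakeBalanceStore.BALANCES.merge("acct-1", 500L, Long::sum);
        assertEquals(500L, FakeBalanceStore.BALANCES.get("acct-1").longValue());
    }
}
```

In practice such a layer would presumably also hide database setup, test data factories and company-specific assertions, which is where most of the value of a platform team's test tooling tends to sit.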

So sort of everything that is related to quality, my team owns it, which, back to the academic side, matches my profile perfectly, because I can work on developer tools and we do have some freedom to try to go deeper on some of the problems. Yeah, and that's nice. So that's the perfect combination. That's awesome. Like, for context and for my kind of idea of the scale of things.

I recently had a person on that specialises in platform engineering and internal developer tools, and we had a conversation about when platform engineering makes sense. They mentioned that when an organization reaches a tipping point with regards to engineering capacity, that's when a team that really owns this platform as a product makes a lot of sense. What is the scale of your organization with regards to engineering capacity? Yeah, good question. And I indeed agree.

At some point when you're too big, yeah, you start to notice a lot of repetition, a lot of duplicated work. Things that could be solved once are now solved 10 times across the organization, so that's where a platform makes you more productive. Adyen is at this moment a 4,000-people company, out of which maybe 1,000, or maybe a little bit more than that, are engineers. And engineers of all

types, right. So we have backenders, we have mobile developers, we have embedded developers, because we also have some C code in there. We have data people, frontenders. We have sort of everything. Yeah, my team focuses more on the backend part of Adyen, so where all the financial logic happens. And it's a super large Java code base. Yeah, and it's a good question: out of these 1,000 engineers, how many of them are Java engineers?

I don't know, should I guess? My guess would be maybe 500, something like that, but I might be wrong. But it's on the scale of hundreds, for sure. Yeah. I talked to, I think, two distinct people that I still remember: one was more involved in eBay and Marktplaats, kind of that ecosystem from Adevinta, and one was more involved

on the Booking side. And we both talked about testing and quality, and when your engineering capacity reaches this scale, right, a thousand engineers, maybe 50% of that specifically on backend with a specific language, code gets produced like it's nothing, basically. But the funny part is all of that code has a cost of

ownership. So code that is maybe not effective, and especially a test suite that needs to be fast with regards to feedback and iteration, can be quite compromising when that goes the wrong way, I feel like. How are you trying to manage, or even educate, the people that create the code? Yeah. So we try to approach testing from different angles. One is, and the one that I mentioned, is tooling. But we also try to do, not education per se, we call it strategy.

We try to guide teams on how they can write better and more effective tests, right. So maybe three and a half, four years ago, we actually formalized this in a document. We call it the testing strategy. And the idea was, you know, this is how we see the testing pyramid, and this is when we believe you should write a unit test and mock stuff, this is when we believe you should go a bit bigger, maybe write a component test, and then you test your SQL together with your code, so on and so forth.

So we try to be more prescriptive, at the end trying to reduce the friction for the engineer, right. How do I test this? Should I go for a unit test or a component test or an integration test, whatever. So we do have this part where we try to say to people, this is how you should write your tests, and then these are the tools we give to you so you can write your tests. And if you use these tools, well, you benefit, because they're sort of stable, right? They work, and they probably solve your problem faster because you can reuse a lot of stuff. Yeah, that's how we've been trying to combine it.

And it's funny, because sometimes we focus more on tooling and then we notice, hey, now we need to push people towards a better strategy on how to write tests. And then when we focus more on that part, then, hey, now we're missing tooling. So the two sort of walk hand in hand in the way we are doing it. Yeah, yeah. I could imagine something like tooling, especially for something that's so day-to-day in an engineer's tool belt like testing, can be quite crucial, right? Having it the right way is going to be very effective, is going to make people more productive. Having it even slightly not great will just hold people back from writing tests or from doing it right.

I like Go as a programming language. That's how I grew up in the industry, and testing is very much ingrained in the culture of how you write software, also in how you structure Go code. I mean, there's not many ways to do the same thing, which is also kind of their vision to keep things simple. But testing is a first-class citizen. And I also do trainings with regards to Go. I'm going to give one tomorrow and the day after, and it's always been a lot of fun to talk about testing, designing for testability in the first place as well, and then executing on that. And the funny thing is, if you design your code the right way, testing is easy in the first place. If you design it in a way that it's not easily testable, then it's very hard.

So software design has always been a proper principle that you need to adhere to. I'm wondering what you think of designing for testability, because I see a small niche of people advocating for test-first development, and also another part of that smaller subset that is very dogmatic about it: write your test first, then do everything else, basically. And I am a bit more flexible with regards to that. I'm curious to hear your

perspective. Yes, and I'm also quite a bit more flexible when it comes to test-driven development. But I'm a big fan of: you should design your code in a way that facilitates testing afterwards, right? Because testing is very important, and maybe you have to change one or two design decisions here so that your code is more easily testable. Just do it, right? Make your code easier to be tested. Now, some people advocate that you have to do test-driven

development, so you have to write a test first so that you can see precisely what your code should look like. I don't truly believe that that's the only way. Maybe it works for you; go for it. It worked for me for many years, to be honest. Maybe the first 5 or 6 years of my career, I was quite influenced by test-driven development and I was doing it a lot. But I don't do test-driven development today, right? Maybe because all those patterns

are now more natural; I can just see them. But I feel like it doesn't matter if you write the tests before or after, as long as you're writing tests timely, right? A little bit of production code, a little bit of testing, or vice versa. And as long as your code is easily testable and you don't have to do weird tricks to test your code, I think you're fine.
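As a small illustration of the "no weird tricks" point, here is a minimal sketch, with invented class names, of code designed for testability: the clock is passed in through the constructor, so the test can pin down "now" without mocking frameworks or other gymnastics.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertTrue;

// The dependency (time) is injected, so the class is easy to test afterwards.
class InvoiceService {
    private final Clock clock;

    InvoiceService(Clock clock) {
        this.clock = clock;
    }

    boolean isOverdue(Instant dueDate) {
        return clock.instant().isAfter(dueDate);
    }
}

class InvoiceServiceTest {
    @Test
    void invoicePastItsDueDateIsOverdue() {
        // A fixed clock lets the test control "now" instead of depending on the wall clock.
        Clock fixed = Clock.fixed(Instant.parse("2025-01-30T10:00:00Z"), ZoneOffset.UTC);
        InvoiceService service = new InvoiceService(fixed);

        assertTrue(service.isOverdue(Instant.parse("2025-01-01T00:00:00Z")));
    }
}
```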

Yeah. Now, I think one of the biggest challenges in such a big, large, complex and old organization is that it's impossible to only have beautiful code, right? And in organisations where code dates back 15 or 20 years, you know, back then we knew way less. The tooling was, well, not really bad, it was the best that we could have at that moment, right? So all those large organisations, they suffer from code that is not truly

easily testable. You know, it's not easy. If you want to mock something, you're going to suffer. If you want to test a piece of code disconnected from the rest, it's also hard. So on and so forth.

Part of our exercise there at the beginning was like: sure, if you're writing new code, try your best, you know the best practices, you understand design for testability, dependency inversion, blah, blah, blah, all that cool stuff in Java, well, Java or other languages, whatever. But you know you're going to have to handle this legacy, and you also have to test it. Yeah, right. So how can we facilitate your

life so you can write a test for it? So we sort of created guidelines on how you can test this legacy stuff. It is hard for me to give details, because it's really focused on the way we coded for 20 years or so, right. But I think the gist is: if you're in such a large organization, you have two

fronts. One is make sure new code is born with the practices we know today, and for old code, you find ways for engineers to still be able to test it in a productive way, right. When people say people don't want to test the legacy, it's not that they don't want to. Even if they want to, they're not going to be able to, right. And maybe they're going to try, and they're going to spend one, two or three days to write one, two or

three tests. Of course, at some point the engineer will say, I'm going to stop this, I have something more useful to do, more impactful to do, right. But if you reduce the cost a little bit, things get a little bit better. Yeah. So I think that's where a platform team can benefit, because you can then sort of see what are the common patterns, or bad patterns, that happen in the code base, and what you can do to reduce their impact on testability.
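One common shape such guidelines take, shown here as a hedged sketch rather than Adyen's actual approach, is to introduce a small interface in front of a hard-wired legacy dependency so the surrounding logic can be tested with a fake. All names below are invented; in production an adapter class would implement ExchangeRates by delegating to the old client.

```java
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;

// The seam: a tiny interface the legacy dependency can hide behind.
interface ExchangeRates {
    double rate(String from, String to);
}

// Logic that used to call the legacy client directly now depends on the seam.
class CurrencyConverter {
    private final ExchangeRates rates;

    CurrencyConverter(ExchangeRates rates) {
        this.rates = rates;
    }

    long convertCents(long amountCents, String from, String to) {
        return Math.round(amountCents * rates.rate(from, to));
    }
}

class CurrencyConverterTest {
    @Test
    void convertsUsingTheProvidedRate() {
        // A lambda is enough as a fake: no network, no legacy client, no mocks.
        ExchangeRates fake = (from, to) -> 1.10;

        assertEquals(110L, new CurrencyConverter(fake).convertCents(100L, "EUR", "USD"));
    }
}
```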

Yeah. Yeah, something that you mentioned triggered me to think about my own journey with testing, and I think it has a lot to do with maturity as well, right? I also see people with no testing background starting to test, and they're like, OK, I should test every single function that I create. And then we're like, OK, but that's not really the point, right? You test from an outcome perspective. You have a certain input and a

certain outcome. And it doesn't have to be every function in between a chain, but it does have to be the chain, because that's the most valuable, right? And then it's a matter of: if you have so many engineers writing tests and their maturity level varies, you're going to have pretty strong tests and you're going to have weak tests. And I wonder how you clean up, or at least keep a test suite that

is of high quality. Because I feel like if you go wide and you're like, OK, every test is going to be valuable, then at some point your test suite is just going to run hours on end, and it's not going to be effective for people to iterate quickly and cycle through their changes with regards to feedback anymore. How do you make sure that your test suite is actually of high quality as well?

I think you nailed it, in the sense that I feel like, at an engineer level, one of the hardest decisions is at what level should I test what I'm coding, right? Should I go full unit test and exercise this function in isolation from the rest? Or should I go a bit bigger and exercise maybe the entire flow, right? And I have a funny story at Adyen about this. A couple of years ago, we wanted to organise the work of teams when it comes to testing their legacies.

So we said to them, you can create test plans, and those test plans will contain the tests you want to write, so you feel more confident when shipping the legacy to production, right? And the idea was, because you're going to formalize this, product managers can understand it, they can prioritize it, and they understand the importance. So you make the work visible so people can actually work on it, right?

Instead of just saying, hey, engineer, write tests for your legacy, right? It's not that simple, right? So we gave this task to the teams, and teams assigned engineers to create those test plans. And in retrospect, what we noticed was that a bunch of those test plans contained things like, let's just unit test this function, or, you know, let's test this class. And at the end they put a lot of effort into, for example, unit

testing a lot of stuff. But then at the end we were like, do you trust your test suite now? If it's green, do you feel comfortable shipping? And some teams were like, no, not yet. Like, why is that, right? And that was just because they sort of picked the wrong level to test what they needed to test. And then, a bit less than a year ago, we released this new tool, because what we noticed was, sure, this decision, should I unit test or should I test larger?

It is a tricky one, but you should never forget the business, right? And I think you mentioned this, right, focusing on the business when writing the test, you said something like that. We created this tool where product managers can, in an easy way, say these are the main, maybe critical, flows of our app. And then engineers can see this, and it's just written in English, right? Very high level. And the engineers can look at it and then write an automated test

for that. And we're still working towards more adoption of this tool. But our premise is that if it is very clear what the main flows are that you have to test, then you can just write maybe larger tests for those. As an engineer, when coding the feature, you can still write your unit tests, right? And that will give you fast feedback that your new function works. But you also have the safety of a test suite that focuses on the business.

So we wanted to combine both, you know, the super technical aspect of writing tests that developers are usually quite good at, and the business perspective on it. So at some point people were like, is this BDD? Are you bringing behaviour-driven development to Adyen? And we're like, we're not giving it names. We just want to make sure that you understand, from a business perspective, what you're testing in an automated fashion.
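As a rough sketch of what "a test named after a business flow" can look like (the flow wording and the in-memory ledger below are invented for illustration, not the tool described in the episode):

```java
import org.junit.jupiter.api.Test;

import java.util.HashMap;
import java.util.Map;

import static org.junit.jupiter.api.Assertions.assertEquals;

// Tiny in-memory stand-in so the example compiles on its own; in a real code
// base this would be the actual services wired together for a larger test.
class PaymentLedger {
    private final Map<String, String> status = new HashMap<>();

    String authorise(String ref) { status.put(ref, "AUTHORISED"); return ref; }
    void settle(String ref)      { status.replace(ref, "AUTHORISED", "SETTLED"); }
    String statusOf(String ref)  { return status.getOrDefault(ref, "UNKNOWN"); }
}

class CheckoutCriticalFlowTest {

    // Flow as product might phrase it in plain English:
    // "a shopper pays and the merchant eventually sees the payment settled".
    @Test
    void shopperPaysAndMerchantSeesSettledPayment() {
        PaymentLedger ledger = new PaymentLedger();

        String ref = ledger.authorise("psp-001");
        ledger.settle(ref);

        assertEquals("SETTLED", ledger.statusOf(ref));
    }
}
```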

And I feel this is so important if you're on a complex software system, right? Because in this type of systems, no engineers truly understand what's happening, right? It's, it's so big, it touches so many systems and modules that no one truly has this, this big vision. So it's, it's natural that if you're just coding a small change in some function, you don't see the big picture.

So I think combining both the business perspective and the super low-level perspective of testing is very important. Yeah. You can see that, especially from the business side, covering those critical paths is like a must-have, right? And allowing those people that have a bit more business context, and maybe are a bit less technical, to at least try and vocalize what needs to be covered.

Like at a minimum, especially in financial services, there's a lot of stuff that needs to be covered at a minimum, I feel like and in whatever chain you're working with, especially as your engineering capacity grows. Your engineering capacity grows because you want to go faster, but that also means that separate teams are going to be

part of one chain in the end. And indeed, having an overview of that chain, maybe even product managers can't have that because they are responsible for their own teams. Maybe overarching product leads have that, but it's more and more abstract, right? The further away you are from execution, the further away you know what's actually happening and what's being executed. So this is like a safety net that you can't go without, I feel like.

And the funny part was, in this first part where we told them to come up with test plans, my audience was engineers, right? So I was either talking to an engineer, an individual contributor, or talking to the team lead, the technical team leads, right? And the product side was like, team lead, you talk to your product

person, I don't care, right? But then in this second version, where we created this product where you put the critical flows, I decided to speak directly to product managers, and they found it so nice to be included in the quality assurance process, right? For the three, four, five people that I spoke to directly about this, you know, it's nice to be useful in all ways possible, right?

And I feel like they didn't have a clear way to make sure that the product worked, and now they have one. And that was such a click for me, because people usually say, yeah, but you have to convince product that you have to write tests, right? And from this experience, well, of course it is hard, because their agendas, their backlogs are full, right?

And there's business pressure for them to deliver more product stuff. But once they understand that they can be part of making sure that the product is reliable, a lot of them are taking this opportunity, right. Yeah, yeah. I don't know, maybe it's because I've done a year of product management now, and my view on product management is that it's product management and product ownership, right? Own your product, and quality is a huge part of that.

Doesn't matter if your backlog is full; there will always be stuff to add on top of things. But making sure that what you build is of high quality and is resilient, I feel like, is a bare minimum. So it should be non-negotiable that engineers indeed get the capacity to put in the quality that's necessary. It's just a matter of also having an engineering team that is mature enough to decide, OK,

these are the critical paths, and this is actually not going to benefit us at all if I test at that small a scale, because the input and output are never going to change. So adding this test is just more cost of ownership.

So yeah, it's, it's very difficult, I feel like to manage that or to have a person that really owns that and understands it. Also from a technical level, I feel like it's, it's growing closer though, because from an education standpoint, especially now with AI educating yourself on different domains that you're unaware of, I feel like knowledge is closer and closer, right?

It used to be a Google search. Now I feel like I have a trusted advisor, and whatever question I ask, I get a pretty good level of detail, at least something that either confirms my assumptions or gives me stuff to research and explore more, which enables me from a product side, and I think from a technical side, to either understand tech or understand business better, which hopefully helps us make more informed decisions in the end.

I'm wondering though, what your perspective is on kind of the code generation and code automation part of it. Because if we're having a context where business people are kind of defining their critical paths and we're saying, OK, these need to be covered with regards to a test suite, that does sound like something that can be automated from a code perspective. If we're in a mature phase with

regards to code automation. And then conversely, if we have such a requirement, or such detail on what needs to be done, then I feel like the code can also be automated. What's your perspective on that? Yeah, good question. So I think my opinion has been changing. I think there's a clear division between before LLMs and after LLMs, right? Because AI for software engineering is not a new topic, especially in academia; people have been

working on this for years. And before the power of LLMs, my feeling was we need both. We need humans writing tests that make sense from a business point of view, that make sense to another human. And we need AI to explore more paths in the application, because as a human you're sort of limited, right? And complex applications have so many different possibilities, so you can have AI to help

you there. And I think a prime example for me was a tool that started at University College London and then Facebook bought; back then it was called Sapienz. You can Google it. And this was a sort of smartish tool that would randomly walk through the mobile application, so think of Instagram, WhatsApp, just clicking randomly, doing random actions on the app from a user perspective. So this was sort of end-to-end, indeed.

But you can do this at whatever level you want, right? And at some point, if the application crashed, then the AI would consider, you know, I'm successful, I found a crash, right? It would generate the report for developers. And the power was that this AI could do this much faster than you, right? And it started, of course, sort of naively, with random walks on the application, and then the people there started to make improvements to the algorithm.

So, for those that understand AI, it's basically a search-based algorithm, right? So it's AI, but not as fancy as today's. And so for me the division was clear: humans writing tests, AI complementing the tests for things we will never find. You know, maybe another example that I used to see a lot is testing highly complex distributed systems, right? Think of systems like Hadoop and that type of stuff, highly distributed

systems. The other example that people use a lot in papers is distributed databases, whatever. How do you make sure that, you know, you send messages to those systems, and maybe there's a bug that only happens because two messages arrive one millisecond after each other, or five milliseconds? As a human, you cannot write a test that explores this space, and the AI can just keep randomly trying stuff.
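A toy illustration of that "let the machine randomly explore" idea is sketched below. It is not Sapienz or any particular tool, just the shape of the technique: apply a long run of random operations to a small, invented model of the system and check an invariant that a hand-written, example-based test would rarely exercise.

```java
import java.util.Random;

class RandomWalkExploration {

    // A tiny, invented system under test.
    static class Account {
        long balanceCents = 0;

        void deposit(long cents) { balanceCents += cents; }

        void withdraw(long cents) {
            if (cents <= balanceCents) {
                balanceCents -= cents;
            }
        }
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed so a failing run can be replayed
        Account account = new Account();

        for (int step = 0; step < 100_000; step++) {
            long amount = random.nextInt(10_000);
            if (random.nextBoolean()) {
                account.deposit(amount);
            } else {
                account.withdraw(amount);
            }

            // Invariant: whatever sequence the random walk produced,
            // the balance must never go negative.
            if (account.balanceCents < 0) {
                throw new AssertionError("Invariant broken at step " + step);
            }
        }
        System.out.println("100,000 random steps, invariant held.");
    }
}
```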

But then LLMs came, and they are super powerful, and you can just give them a piece of code and ask the LLM to write tests for you. And my opinion right now is that you should certainly use LLMs. I use ChatGPT, or our internal LLM, every day, basically, right? I delegate a lot of simple tasks to my LLM. When it comes to writing tests, I feel like we're going to get there, but we're not there yet,

right? So if you try even ChatGPT, if you give a simple function to ChatGPT, it will generate beautiful tests for you that may even compile right away. But as soon as you give it a more complex class, which is what we basically have in a complex system, it gets lost in the dependencies, right? It doesn't know if it should mock or not mock something. It doesn't understand how your database works, because maybe your test involves a database.

And so it gets confused really quickly. But I feel like it's more about keeping iterating on the tooling, so that at some point it will get there. Now, what I see is, in my testing book, in chapter 1, I make a division. One is testing from the developer's point of view: you know, as a developer, you just want to have a bunch of tests, so if you change something and a test breaks, you know you

broke something, right? But you also want to do testing from the perspective of, let me just explore this program looking for a bug, right? Looking for something that you didn't see in the implementation, that you didn't see in the tests. So there are these two perspectives. And before LLMs, all the research was focused on, let's use AI to find bugs you as a human could not see. Now, with LLMs, more people are exploring the first one: let's just write tests for an

existing piece of code, rather than, let's use the capability of the LLM to find something that I didn't think about. So there's so much space still to be explored, but I'm very positive. I feel like we will eventually catch up, and we will be able to delegate this to the LLM and say, write tests for me, did I miss something, that type of stuff. I like that a lot. I think it's going to get

there at some point as well. And I feel like it's going to be a tool in a person's toolbox to ensure quality and resilience, because in the end that's a means to an end, right? Make sure that we have that test suite, so that wherever in this engineering capacity something changes, it doesn't crash a critical path, because that is, in the end, going to be very costly, especially if you're in a bigger and bigger

organization. So having something that is automated, for me, is going to be quite powerful. I wonder though, because we as humans, because we carry this software and we carry the weight of code, we are very conscious of adding code, not even from a testing perspective, but just in general. Especially if I'm in a small team, I'm like, OK, do we need this? Can we afford this?

Do we need to outsource this or can we own this like those are continuous conversations because that's what you need to have. You can't just work on the coding side of things and you can't just add things from a product perspective. We also know that because adding features is not going to make the best product, you make sure that whatever features you add are going to be crucial. Small bets, high usage, high impact and high value and keep it simple, right?

Don't create a toolbox for everything in a product that you have. If you can find a niche and you can make it very small, then be successful and expand from there. With automation and code generation, especially what LLMs bring now, I feel like generating code is going to be easier, so we do it more often, which means we carry this weight of code more and more. And that's the only concern I have right now.

But I don't know if it needs to be a concern, because if we don't necessarily care what's being generated, as long as it can compile fast and it doesn't kind of hit our iteration cycle and it doesn't slow us down, might not be a problem. Yeah, two years ago or so, I was in a workshop on software engineering education. This year, 2025, it's actually going to happen in the Netherlands, by the way.

And one of the big discussions we had there was that we usually train engineers to be very good at writing code. We don't train them so much on reading code, right. Take a look at your undergrad, right? You're mostly writing. What lectures did you have of, like, let's just read this code? And I agreed with the discussion in the workshop that in the LLM world, the LLM will vomit a lot of code.

And you as an engineer, you have to be able to read that code, first to make sure it does what you want, and also that it does it in the most effective way that it can, right. So I catch myself looking at the code from ChatGPT and saying, very nice, but you can simplify this. And then ChatGPT rewrites the simpler version of that code, because it just takes the first solution it can see, sort of, right.

And then I caught myself really doing thorough code reviews on ChatGPT's code. And I feel like that's, at least for now, while the technology is still maturing. Maybe 10 years from now we don't have to think about this, right? But now I think it's part of the job to be able to read ChatGPT's code and then help it simplify as much as it can be simplified.

Because at the end of the day, for now, no applications that I have been part of are written 100% by LLMs. It's bits and pieces, right. And so I feel like still having a human reviewing it, and then making sure that it can be pasted into the code base and developers will understand it, makes a lot of sense to me. Yeah. Now, it's so hard to make predictions.

As I said, maybe 10 years from now that's not a problem, but today I think it is. More often than not, I find the code from ChatGPT to be very good, but it could be just a bit better, right? And I iterate on my prompts to make it better. I'm very curious, because I've had this conversation with other people and they don't have the same background in academia as you have. From my perspective, if indeed we focus right now in

academia on writing code, and you grow out of university and go into the industry, and writing code all of a sudden is automated, your skill set is going to be kind of outdated, I feel like. I like that you highlight that reading code is going to be more and more crucial. But it also requires a certain level of maturity to be able to prompt and then say and understand, this is simple, this is maintainable, this is what we

can adopt, and all the other stuff is garbage, or we have to iterate on top of that. It needs that level of maturity. And then there's this kind of chicken-and-egg problem, where new people need a certain level of maturity to refine code automation, but they don't have that maturity, and writing code is all of a sudden not a task for them anymore because that's automated. I don't know how to solve that yet.

Yeah, so I feel like today, at least in 2025, AIs are not replacing engineers, as some people have been predicting, like a week ago or so, right. But I do feel you have to know how to use this assistant in order to be productive, right? And I think all your background as a programmer helps you, because of, for example, computational thinking, right?

If you can truly give the task to ChatGPT in a sort of more structured way, it will just produce better results, right? So sometimes I see people writing prompts to ChatGPT and the prompt is super generic, like, do this for me, and then ChatGPT produces what it can based on that information. But if you give a more specific prompt, like, write this Python program for me that receives this as an input and then produces this as an output,

and for you to solve this problem, take this alternative, this path, then the code is quite good. Yeah. And for me, it feels like I'm still coding it. I'm just not doing the boring part of coding the ifs and the loops. So I delegate that to ChatGPT, and then I also do the part of, let me review the code to see if it works. Is this testable? Does this fit with my code base? Does this do what the business wants?

So at this point, I don't feel like my engineering skills are being thrown away because ChatGPT can write code. I think ChatGPT is augmenting my powers, because I can deliver much faster. You know, usually when you're coding, you have this abstract thinking, and then you sort of understand what you want to do, and then you just go into coding mode, and sometimes in 10 minutes you find the solution for a very hard problem and then you spend

two hours coding the solution. This part, these two hours, just becomes 10 minutes, because ChatGPT will give you a super nice head start, right? So it's just a superpower. I think the challenge is, I feel like the more senior you are, the better you can use ChatGPT, because you have more maturity. You know more about a lot of stuff: design, coding, etcetera. The more junior you are, the harder it is for you to use ChatGPT.

So as educators in general, I feel like we have to make sure that the new engineers learn how to use ChatGPT. Otherwise it's going to be just harder for them to catch up with the senior folks, because it's as if I gave this super powerful weapon to seniors, so they now move very fast, but the juniors cannot use this tool with the same power as the seniors.

So I feel like that's truly the gap that we need to cover in 2024. Yeah, I can see that as well. I wonder where it's going to be covered, because from the perspective of a business, I see fewer and fewer roles opening up for juniors, which is a sad thing to see, because that means that, OK, they're looking for people that have experience, that can go fast, they can iterate, etcetera, etcetera. But at some point there will be a new generation coming from academia.

I think it's still a beautiful field to be in. The fact that you can go to conferences, you have an open community, knowledge is being shared, and you don't have to pay for it at all. Things are out there in the open for everyone, to learn and to be better. I think it's a beautiful environment, and a lot of people want to learn and a lot of people are interested, right? If you grow up with an iPhone or an iPad, you have that

technology aspect. If you see a game, you might want to become a game developer. If you found a really cool website with a really cool design, it might intrinsically motivate you to think about what makes it tick, or, from a design side, what can I do to influence that? So a lot of people are funneling in. It's just that, from the business side, I do think there needs to be more room for people to enter, kind of educate themselves, and grow.

I think with tooling, educating yourself is going to be more crucial, and the people that can learn the fastest probably can fight for a spot for themselves. But I see it being more and more limited, which is concerning. Yeah, yeah. It's been a tough market in the past years. A lot of companies hire juniors because they cannot afford hiring seniors at the speed that they need, so they have to hire juniors; they are more available in the market.

And because companies naturally started to hire less, this makes it harder for juniors. But I'm hopeful that, you know, the economy will get better, all the variables there will get better, and then we, as a society, will be able to start hiring more juniors again. But indeed, being a junior, you're now in a tough spot, especially because you also need to master more stuff.

I think, though, because seniors are sometimes a bit more apprehensive, like I've seen some people say, I don't use it, I don't trust it, I used it years ago and it's rubbish. And I'm like, yeah, well, things are changing fast. The new generation is kind of more on top of the tooling that comes out and has the room to experiment, so that, I feel, can

be a differentiating factor. If you learn faster how to excel with the tooling that's available, that can definitely give you an edge compared to the people that don't. Yeah, for sure. I think if you're a senior and you're not keeping up to date, at some point the market will catch you. Dinosaur, yeah. I'm very curious, because you mentioned you are in a platform engineering squad and also in more of a leadership

type role. Platform engineering for me is fascinating, kind of a product with developers as users. I can just see a lot of priorities, and then decision making, kind of managing your own agenda and still executing and iterating, becomes very crucial. How do you decide what to execute on, or as a team, what to solve first? Because I feel like you can go a long way in a lot of directions, basically.

Yeah, for sure. I can talk a lot about how my team has been doing it, and then maybe I can say a few words about how the organization is doing it, because of course I'm less involved in the whole platform engineering planning. In my team, which is focused on test tooling, we try as much as possible to be close to the engineers. At Adyen that's easy, because there are no hidden secrets; all channels are sort of open. We have public channels with all developers.

We have a channel only about testing, so people know if they have a question about testing, they can post there. So we're very close to them, we understand where they are and what types of pains they have, and this gives us a lot of ideas. Yeah. We also, from time to time, go to teams and try to really spend a day or a week with them to understand precisely. I do that a lot, by the way; I

love to be close to teams. And especially as soon as you make some friends, it's very easy: hey, let me spend one day with you, right, that type of stuff. So I think being very close to them helps a lot in terms of ideas. I think the challenge then comes in prioritizing, because there are just too many ideas. Oh yeah, right. We see too many things to be done. And how do you pick the ones that will have the most

impact? And I feel like that's where we, not only Adyen but as a community, are trying to get better. For example, trying to have more concrete metrics that you can measure to truly see the impact that we're having, right. So you see the emergence of things like the DX Core, whatever metrics people are now trying to collect, to be better able to understand where you are in terms of software engineering evolution, so to say. Are you effective?

Are you not effective? And so, so I feel like that's been also for us a path of how can we stop trusting only on our feelings and our feelings are good. Our, our guesses are very good because we're very close to people. But can we somehow automate numbers, get numbers? So maybe an example we, I gave this talk in November on on the Amsterdam Java user group that was about the two that we have at agent that does test

selection. OK, so instead of running all the tests in the code base, whenever someone opens a pull request, we via an algorithm, we try to to get a smaller set. So the build is just faster. Nice, right? And we're on the scale of imagine hundreds of thousands of unit tests. So if you can reduce the execution, if if you can skip a couple 100,000 unit tests, the build is just faster, right? So not bad.
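The rough shape of the idea (a hedged sketch, not Adyen's actual algorithm) is to keep a mapping from production code to the tests that exercise it and, for a given pull request, run only the tests reachable from the changed classes; anything the map knows nothing about would still have to run to stay safe.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class TestSelectionSketch {

    // For each changed class, collect the tests recorded as exercising it.
    static Set<String> selectTests(Set<String> changedClasses,
                                   Map<String, Set<String>> testsByClass) {
        Set<String> selected = new HashSet<>();
        for (String changed : changedClasses) {
            selected.addAll(testsByClass.getOrDefault(changed, Set.of()));
        }
        return selected;
    }

    public static void main(String[] args) {
        // Invented coverage map; in practice this would come from build or
        // coverage tooling, not be written by hand.
        Map<String, Set<String>> testsByClass = Map.of(
                "RefundService", Set.of("RefundServiceTest", "RefundFlowTest"),
                "FxConverter", Set.of("FxConverterTest"));

        // Only RefundService changed in this hypothetical pull request,
        // so the FxConverter tests can be skipped for this build.
        System.out.println(selectTests(Set.of("RefundService"), testsByClass));
    }
}
```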

So that's what we wanted to do. And at the very beginning we had this hunch that we were running way more tests than needed, but how much? Very hard to know. We started to make PoCs, and at some point the PoC was like, yeah, the tool works, we can select a smaller subset, but how good is this going to be? And then we sort of stopped for three weeks or so. We didn't have a tool to collect data, so we truly took a lot of open pull requests.

We ran those pull requests with and without our tooling. I wrote this in the blog post, I don't remember exactly, maybe around 100 or so. And this was very boring, you know, semi-automated work, and we had to wait for the CI, and sometimes it just crashes, right? Because that's how it works. And then you run it again. Yeah, I can see that. So we could get numbers on what type of benefits to expect from this tool.

And then we saw, yeah, those numbers look nice; let's invest and make it into a proper tool. So I feel we're getting better as a team, as we progress, at finding the priorities. Yesterday we had this huge two-hour meeting on the cool stuff we want to build now. And we got to a point where we were like, this is cool and this is cool and this is cool. And at the end of those two hours, the outcome was, OK, we just learned that there's

some impact somewhere; we just have to find out precisely what, right? And I think that's the hardest part. Building is always the fun part: you have a lot of fun, you're coding something challenging. Deciding what to build is the hard part. And that's why we're now getting more product people to help us with those decisions, right? Because I do feel like we now need someone that spends a good part of their day thinking about this. We cannot just do this on

hunches anymore, right? So I think product in platform engineering is growing everywhere, OK, right. And I feel that's the way. For me, that's a lot of fun. Like, I always wanted to do product management. I've now done it for a year, and it came from, OK, we have these priorities, and priorities are set, or indeed we're trying to figure it out ourselves.

But decision making is hard and having one person to be responsible and to also communicate the why, for me, that was always really important. So that's why I'm now doing it myself kind of more and more. And I don't know if I'm very biased because indeed I have like a technical software engineering background. I look at very much functionally what's going to be the value.

How can we measure that? I look at small bets versus long bets, what is clear versus unclear, even though priorities might sometimes come from a business side. If we're doing this for the future and we can do things to reap the benefits now, like those are all the thoughts that

come into that. That's why I think it's fascinating from a product engineer or platform engineering standpoint because your platform is your product and your users are developers and developers making them more effective.

If we're talking about the scale of 1,000 engineers, engineering capacity in any way, I talked to a guy at Miro, and they're really measuring everything, because for them a marginal change is going to have a business impact financially, cost reduction, which is substantial. So everything from test automation to enablement with regards to productivity is going to be crucial. That's why I think platform engineering is more and more

fascinating. And from a product perspective, those user metrics that you touched on are going to be crucial in decision making. For sure. And for some problems that we are solving, you can clearly see metrics. Let's say build time, right: easy to see that if you make it smaller, it's better. No developer will complain about a faster CI. But when it comes to testing, some metrics are just very hard. How do you measure if the tests of a specific team

are good? Yeah, I mean, coverage is a proxy, but it's also a flawed metric, right? Or take the static analysis tools: how do you measure the impact of those tools? If the tool finds an issue in their code, is this good or is this bad? How do I measure the impact of, like, we use SonarQube and SpotBugs; how do I measure the impact of those tools? This is truly hard, right?

And I think, for me, sometimes people come with metrics, and then my sort of scientific background starts to scream, like, yes, we can measure this, but this is totally unsound, right? The other day, so last year, we worked on rolling out NullAway at Adyen. NullAway is this tool from Uber that helps you: you annotate your Java code a little bit, and at compile time it helps you find possible null pointer exceptions, right?
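Roughly the kind of mistake such a checker catches, as a sketch (the exact @Nullable annotation package and the diagnostics depend on how NullAway and Error Prone are configured in the build; the class below is invented):

```java
import javax.annotation.Nullable;

class CustomerLookup {

    @Nullable
    String findEmail(String customerId) {
        // May legitimately return null when the customer has no email on file.
        return customerId.startsWith("guest-") ? null : customerId + "@example.com";
    }

    int emailLength(String customerId) {
        String email = findEmail(customerId);
        // A NullAway-style checker would flag this line at compile time:
        // dereferencing a @Nullable value without a null check.
        return email.length();
        // The fix it pushes you towards:
        // return email == null ? 0 : email.length();
    }
}
```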

Yeah. And then the question is, how do you truly measure the impact of developers using NullAway? And sure, one can say you can measure the number of null pointer exceptions in production before and after, but it's just way more complex than that, right? It works as a proxy metric, but it's also not perfect. So it's like that for some products in platform engineering.

And I guess I'm biased, right? I don't know other types of products, but I can imagine you have some things that are very easy to be measured and you can see metrics right away and some things you can't. And then what do you do with those, right. And I was talking to a guy, he leads platform engineering. I'm not going to mention the name of the company, a Unicorn in Brazil.

And this guy was telling me that they separate things: some things are just, what was the word he used, something like hygiene. You know, like, for example, improving the builds. Some stuff you've got to do and you don't have to measure so much, like, for example, helping teams to write more tests. You don't have to be super precise; you sort of know that more tests relates to being good, or that reducing build time relates

to being good. And then other things do have to be truly measured. If you're going to build a complex product, you have to be careful about metrics, but for some stuff you don't have to. And I was like, that makes sense. That simplifies a lot, because then you can have some axioms in your company, right? Tests are good. Yeah, small build times are good.

And then you don't have to think so much, as long as you're working towards those things and the numbers are getting smaller. But it's such a cool problem to work on, metrics, product metrics for platform engineering; so much to be done there. Yeah, yeah. I'm thinking more and more, you have certain outcomes as goals,

right, as a platform team. And even though you're looking at it from a tools and metrics standpoint, I do think there's a lot of qualitative data that you're going to look at as well, because developers are, like it or not, not black-and-white thinkers; they're still human, and humans have psychology, and sometimes an emotional connection to a tool also really helps. So usage and ease of use, and also kind of the buy-in that you need to have, and people that

actually drive adoption, maybe even product champions for whatever you're building: those all mean that it's going to be good for your outcome. So all those key contributors, all those key results, are going to benefit your outcome. And indeed, figuring out whether something contributes more or less is going to be very hard, because

that's going to be nuanced. I love the example of the implementation on a code level that you gave, because indeed it might reduce bugs in production, but all our efforts will reduce bugs in production. So how can we quantify this small change that we do? Yeah, at the end we cannot. So we have to trust, I think, our gut feeling more than anything. Yeah, yeah. And I think that's where things get super funny, right?

Because it's gut feeling plus quantitative plus qualitative, yeah, so that you can take a decision and then learn from the decision. Such a beautiful space. Yeah, right. Yeah, as a last thought: we touched on testing, and I think resilience more than anything. I'm always of the mindset, and I'm curious to hear what you think as well, that bugs will happen, right? We do everything we can, we try to prevent our critical

path from breaking. But, for example, on LinkedIn I saw GitHub was down for a few hours. I don't know, on the scale of GitHub, what went wrong, but a lot of people that host their code on GitHub are impacted, right? No builds, no changes, pending pull requests, maybe something crucial in another business that needed a release, right? GitHub being down cannot be prevented. It will happen, and it has happened; that's what we see. So then, how do you recover from

it? How do you educate, or have that level of maturity in your team, to consciously know that things will break, and when they break, this is what we do? Yeah. So I think all complex systems out there will break, and they do break, right. Maybe you don't see it, but they are breaking basically on a daily basis, right. It's just too complex for anyone to be able to see those problems beforehand, right. So, and maybe a parenthesis here.

I feel like sometimes I hear people working on such complex systems say, we don't need tests, just because we cannot truly prevent the real incident, right. And I sort of see the point, but the way I see it is, in such a complex system you still have to write a lot of tests, because then, as a developer, you can open a pull request and know that your 10 new lines of code work, right?

So you have this small safety net that tells you if something is working, but you know that you will never be able to truly test a problem that happens because there was a networking issue, and then whatever Kubernetes issue, and the combination of those things. Those things will just happen live, and you have to understand they will happen and then plan so that the

impact is very small, right? And at this level, think of observability, think of releasing not to your whole audience but in small bits, right, feature flags and that type of stuff. So you have to do engineering to reduce the impact: if something goes wrong, you can quickly roll back, yeah, and you don't affect your entire user base, right. So at Adyen, for example, my team bothers people to

write tests, so on and so forth, but we have another team that focuses on our SREs and our feature teams, and, like, if you're pushing something to production, it has to be gated. Yeah, right. So we don't have to deploy to disable your feature; we just press a button, change a flag here on the UI, and suddenly your feature is off and you're back to what was working. All of those things complement each other, and they have to be part of the quality assurance of a complex system.
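A minimal sketch of that "everything new goes behind a gate" idea, with invented names: the flag is read at runtime from something an operator can flip in a UI, so turning a misbehaving feature off is a click rather than a redeploy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for whatever flag store the platform exposes; flags default to off.
class FeatureFlags {
    private final Map<String, Boolean> flags = new ConcurrentHashMap<>();

    void set(String name, boolean enabled) {
        flags.put(name, enabled);
    }

    boolean isEnabled(String name) {
        return flags.getOrDefault(name, false);
    }
}

class RefundController {
    private final FeatureFlags flags;

    RefundController(FeatureFlags flags) {
        this.flags = flags;
    }

    String handleRefund(String paymentRef) {
        if (flags.isEnabled("instant-refunds")) {
            return "instant refund issued for " + paymentRef;       // the new, gated path
        }
        return "refund queued for batch processing: " + paymentRef; // the old, known-good path
    }
}
```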

Yes, yeah, I think so. And, I remember now what I wanted to say: you see the companies that are doing state-of-the-art software engineering, they do testing from a developer's perspective, they do observability, rollbacks, blah, blah, blah, because they know life is hard. And they also start to have tooling to explore and, hopefully, identify the problems beforehand. For example, I think the famous example is Netflix doing

Chaos Monkey, right? They purposely break stuff to see how the infrastructure will behave, right? Uber wrote a blog post, maybe a year and a half or two years ago, about their testing platform where you as a developer, let's say you change one service, you can deploy this one service, but then the whole flow happens through production, and production knows it's a testing flow.

It will switch the flow just so that it hits your test container, and then the flow continues, right? So they are creating stuff so you can get as close to production as possible to explore problems in different ways, or teams doing fuzz testing. You know, people are now trying, OK, we've got the developer testing part covered; what else can we add so that hopefully we find bugs in advance? Yeah. So I feel like that's the third wave now, because we now all know

testing works, we all know we have to pay attention to monitoring, etcetera. And now the third wave that you see the big companies doing is, let's just create crazy tools that will try to simulate problems that only happen in production, right? Yeah. So the more we learn, the more we are able to build tools. And I think we are now on this third wave as a society, right? Yeah. I'm very curious to see what from that third pillar is going to stick, right?

Because whatever software we create, we know it has to be of high quality; it's going to be covered with regards to a certain level of test suite. And then indeed, in production, we make sure our observability is there, and when we have an issue, we know what to do. It's like education and tooling to enable. And then what remains? It's the things that we

don't know. You gave the example of AI kind of browsing through the application, finding issues that a human might not be able to find. Or indeed, when I remove a certain piece of code, do my tests still work, or is there missing coverage? Or does my infrastructure break when I disable a component, or what happens? Indeed, I'm wondering what practice from that third pillar, the chaos side, is going to stick and also become kind of a first-class citizen in a developer's toolkit.

And for me, it's funny, because it feels like that's the one problem we don't understand so much yet, and that's why you don't have good abstractions, right? If you think of developer testing, you have JUnit, and all Java companies in the world can use JUnit to build tests, right? And for observability, you also have a million observability companies that

work for everyone. And that's because we understand the problem. For this third one, what you see is companies creating their own tooling and not really open sourcing it, because they also don't see a common layer that they can open up to the rest of the world. And to me, that indicates we're still learning about the problem. Once we know it very well, you will start to see tools that are generic enough that industry can

adopt them at large scale. And they are not even writing much about it beyond the academic world: you see Facebook and Google and Uber writing papers about this, so they're targeting academic conferences, not even the, you know, normal industry conferences. So this, to me, says we're still learning, right? But we'll get there. I love that observation, I really do. And I love this conversation so far.

It's been really fun picking your brain on kind of leadership in testing, enabling people with regards to testing and design, also from a strategic and education standpoint. And I love the knowledge you have from academia as well and what's coming. Is there anything that's missing that you still wanted to share? No, I think we covered a lot. Yeah, that was a fun conversation for me as well.

Thank you. Thank you so much for coming on, and I'll round it off here. If you're still here, let us know in the comments section what you think, like the episode if you liked it, and otherwise we'll see you on the next one.
