Hello and welcome to a new episode of Podcast Software Testing. I'm your host Richie, and I've brought you a little episode from the QS-Tag 2023 in Frankfurt. This time it's again an episode in English. My guest was Nikhil Bhartwal, and I talked to him about property-based testing: what this test method is about, what it is for, and how to use it, you will hear in this episode.
And if you like this podcast, if it gives you beautiful new ideas, inspiration, and you learn something, then let me know at [email protected], and of course I'm always happy about likes and ratings on the usual podcast portals. And now have fun with the episode. Hi Nikhil, nice to have you on the show here in the podcast. Absolutely, thank you so much for having me. Yeah, you traveled a long way here to Frankfurt to the QS-Tag. Absolutely.
Yeah, and you already told me you'll combine it with some sightseeing and visiting Europe. It's actually a beautiful place. Just for context, I actually spent my childhood in Germany, which was way back. My father was a nuclear scientist in Jülich, which at that time was West Germany, right, still divided. Jülich has the Forschungszentrum, the German research center for nuclear physics. I don't know German anymore, so it's something like that. Yeah, yeah. And he was a scientist there for a few years.
So I went to German kindergarten and I learned German. I didn't know English, I only knew German. So it was kind of my first language. And then I forgot German because I came back; since then, I have rarely ever spoken German. Yeah, yeah, yeah. But my mom is traveling with me, so she's like, yeah, we used to live there, we went to this place. It's like those memories coming back. Yeah, that's great. So maybe you can reactivate your German language.
I don't know if it's a good idea on a podcast, because it would sound really silly. Yeah, we can do it in English. It's no problem. OK, no worries. Yes, we are here at the QS-Tag from imbus, and you have a talk about property-based testing. I read your abstract and thought, oh, property-based testing is not a topic I know very well, so I wanted to talk to you. So yeah, let's start. What is property-based testing? Yeah, so let me start with giving some background.
My area generally is web services, right? Large-scale microservices. You have thousands of services, and now you want to test them. The thing is, when you're testing 1,000 microservices, they can behave in a certain way, in the sense that one service fails, the failure passes to the second one, and you don't actually know it until the fifth or sixth service fails. A cascading failure, right? The number of test cases that you want to test, or the number of failure modes of a distributed system, is huge.
So in simple mathematical terms, assume that you have n services, right? And certain failures occur only when two or three services fail together. So by simple math, it's on the order of n cubed, where n is, I don't know, 1,000 maybe. So n cubed becomes what? A billion, something like that. And you want to test all these. How do you test it? Are you going to write a billion test cases? That's where the idea of property-based testing comes in.
So the fundamental proposition of property-based testing is: when the system gets too complicated, it is not possible for humans to write every test case. So rather, we want these test cases to be generated automatically. How do we generate them? We know what the behavior of the service is, right?
So if we say that this is the behavior of the service, that if you pass a certain class of input, we expect a certain class of output, then a system can basically write test cases for you automatically, generate thousands and thousands of combinations of these tests, and automatically run them rather than you writing them.
Okay, so for example, in a very simple way, let's say you have a service that takes two inputs, A and B, and gives you the output A plus B. Sounds very simple, right? You would write some test cases: 0, 1, 2, 3, some negative numbers, positive numbers, all those things. And if those test cases pass, everything is great. But the problem is that in real-life systems you have buffer overflows, you have integer overflows at certain combinations where you approach that edge.
It might be that the service fails because you had a buffer overflow or an integer overflow, right? How do you test those? So rather than testing that, you describe the property: if I send A plus B and the output is C, then if I subtract C minus B, the output should be A, because if A plus B equals C, then C minus B equals A. So I can generate thousands and thousands of combinations of A and B, right?
Get the output, subtract B, and make sure it is A. So your test case is not actually what you're writing, what input you're sending, which is typically how you test, right? You have, like, okay, this is my set of inputs I1, I2, I3, and I expect the outputs O1, O2, O3. Rather, you describe your system. Right?
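As a rough illustration of that round-trip property, here is a minimal sketch in plain Python. A real framework such as QuickCheck or Hypothesis would handle the generation and reporting for you; `add_service` and `subtract_service` are hypothetical stand-ins for the web service from the example.

```python
import random

def add_service(a, b):
    # Hypothetical stand-in for the service that returns A + B.
    return a + b

def subtract_service(c, b):
    # Hypothetical stand-in for the inverse operation.
    return c - b

def check_round_trip(trials=1000):
    # Generator: draw random (a, b) pairs, including values near the
    # 32-bit boundaries where overflow bugs would live in languages
    # with fixed-width integers.
    for _ in range(trials):
        a = random.randint(-2**31, 2**31 - 1)
        b = random.randint(-2**31, 2**31 - 1)
        c = add_service(a, b)
        # The property: if a + b == c, then c - b must equal a.
        assert subtract_service(c, b) == a, f"violated for a={a}, b={b}"

check_round_trip()
```

The test never lists concrete expected outputs; it only states the relation that must hold for every generated pair.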
So I have described my system that the property of the service is that if I take the output, and if I take one of the inputs, and I subtract, the answer would be the second input. I just described this property. I don't tell anything else. It's the system's job to write these test cases. Now, this is a very simple example. Yeah. Right? But in real life, things are not that simple. So typically, property-based testing consists of three components.
You have a modeling language, which in this case is a simple equation, but in real life it will be a more complicated modeling language. We have a lot of formal mathematics to describe these systems, right? I use TLA+, but there are a bunch of these formal specification languages. So we describe these formal specifications; that's one component, right? Then we have a generator, which will take the specification, write the test cases for you, and find the violations.
So now you've found that something is wrong if your test cases are failing. Then you have a third component called the shrinker. Now, typically, because these test cases are, I would say, not super random, but generated from your specification, they can have long failure sequences. So I'll give a real-life example. A test was done on Google's LevelDB database, or something like that.
If you replicate 17 unique steps, there will be a bug. Which human can anticipate that these are the 17 steps that will reproduce it? Property-based testing was used, and within one hour it found that bug. Right? Now, when it generates these test cases, it generates long sequences of test cases, right? That's what the generator does, and when something goes wrong, you know something's wrong.
But now, for a human, if I give you a sequence of 17 steps and say, well, this is what the failure is, you'll be like, I don't know where exactly the failure is. Yeah, I mean, I understand there's a failure, but these are 17 steps; I don't know which of the 17 steps failed. So the third component is the shrinker, also called a reducer. What it does is, once you have found a fault, it starts shrinking it and saying, okay, let me try to dig down to exactly where things fail.
So now you have these three components. I describe my system. I have a generator; you can write your own custom generator, or you can use a standard one, right? This generator takes the specification and generates these million complicated cases. These are machine-readable, right? We generate these cases, we find the faults, and the shrinker will basically reduce a complicated test case to a human-readable test case and say, well, this is exactly where the problem is.
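A minimal sketch of what such a shrinker does, in plain Python, under assumed simplifications: a single integer input, shrinking toward zero, and a made-up `buggy_abs` function that misbehaves for values at or below -1000. Real shrinkers handle structured inputs and whole step sequences, but the greedy loop is the same idea.

```python
def buggy_abs(x):
    # Deliberately buggy made-up function: wrong for x <= -1000.
    return x if x >= 0 else (-x if x > -1000 else x)

def holds(x):
    # The property under test: the result should never be negative.
    return buggy_abs(x) >= 0

def candidates(x):
    # Smaller inputs to try: zero, half, and one step toward zero.
    step = x + 1 if x < 0 else x - 1
    return sorted({c for c in (0, x // 2, step) if abs(c) < abs(x)}, key=abs)

def shrink(failing):
    # Greedily replace the failing input with a smaller input that
    # still violates the property, until no candidate still fails.
    current = failing
    shrunk = True
    while shrunk:
        shrunk = False
        for cand in candidates(current):
            if not holds(cand):
                current, shrunk = cand, True
                break
    return current

# Suppose the generator stumbled on the failure x = -987654;
# the shrinker reduces it to the minimal failing input.
print(shrink(-987654))  # -1000, the exact boundary of the bug
```

The tester then debugs the boundary case -1000 instead of an arbitrary huge input.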
That's how property-based testing works. And obviously, for every language and technology stack, you have different frameworks, right? And these frameworks, which technologies do they cover? So there are a lot of them; some are commercial, some are open source. There are frameworks for .NET. I actually have a very small demo for a .NET framework, and I also give a list of frameworks for Python.
This concept started with a tool called QuickCheck, written in Haskell. And then it was ported to multiple languages. Now you have FsCheck for .NET, and for Python there's one too, PyCheck or something, I think. So for a bunch of technology stacks, a bunch of programming languages, you have these frameworks. Property-based testing itself is a concept, a way of testing a system, right?
Different people have implemented it in different ways; that's why there are multiple frameworks. Some people say, well, this is how we generate the test cases, this is how we reduce them. Of course, there are also commercial tools available. I think Quviq's QuickCheck is being used in automobiles, because, you know, that is mission-critical software, right? If the software fails, people die. Yes, you want to be absolutely sure everything works correctly.
Yeah. So the main goal is to get very broad coverage of all these combinations. Yeah. And to get more reliability for the system. Yeah, ultimately the goal is reliability. I mean, reliability, bug-free, whatever you call it, they all converge to the same point. But the key idea is to go beyond human imagination. When a human is writing a test case, there's a limit to what a human can do. Right?
If you look at the whole branch of formal mathematics, not just for property-based testing, you could look at testing other systems with formal methods too; the whole point is that the number of permutations and combinations of things that can go wrong is huge, way beyond what a human can imagine. So I'll give you a very different example from property-based testing. There's a branch of formal methods with a tool called TLA+ that deals with concurrency.
Now, what happens is that, you know, in concurrent systems, you have threads, and you have locks and all those things, and things can go wrong. The problem with these systems is that the bugs are intermittent. Right? Because a certain set of conditions, usually caused by the external environment, has to occur before your system crashes. So you run your unit test cases, everything works. You run your integration test cases, everything works. You run on production.
Everything runs for two months, and then on the first day of the third month, it fails. How do you test those cases? Right? Because this is intermittent: it may happen, it may not happen. How do you test it? The threads can operate in a very different way based on the situation, based on the environment, which may be different from the environment you're testing on.
So that's where formal methods come in: go beyond human imagination. The tool creates a state-space diagram of every possible move and combination, and then checks whether there's a thread deadlock here, a concurrency issue there, what happens then, and can find out, well, this is a problem that you would have. And this particular one, TLA+, is actually being used by Amazon AWS. There's an article I can point out; it's publicly available.
If you Google, I think, AWS TLA+ formal methods, they actually wrote a paper on how AWS used these formal methods to uncover bugs which would otherwise cause the system to fail intermittently, not always. And basically, the idea was that to fix those, you sometimes have to do a complete rewrite of your system, because if there is a fundamental problem in your design, then what do you do? Property-based testing is kind of the same thing the other way around.
Here in this example, I've described how you design a system and how you build it. Property-based testing takes the same idea and says: okay, fine, I have a software system that is working, how do I test it? But the key idea remains the same: human imagination works up to a certain point. Once you go beyond that scale, the number of combinations of things that can go wrong is so huge that it is not practical for a human to think of or implement those test cases.
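The state-space exploration for the thread-and-lock case can be sketched in plain Python. This is only a toy model checker in the spirit of TLA+, not its actual tooling; the two hard-coded thread programs are an assumed example, acquiring the same two locks in opposite order, the classic deadlock that unit tests on a single interleaving almost never hit.

```python
from collections import deque

# Two threads taking the same two locks in opposite order.
THREADS = [
    [("acquire", "L1"), ("acquire", "L2"), ("release", "L2"), ("release", "L1")],
    [("acquire", "L2"), ("acquire", "L1"), ("release", "L1"), ("release", "L2")],
]

def step(state, tid):
    # Run thread tid one step; return the new state, or None if the
    # thread is finished or blocked waiting for a lock.
    pcs, locks = state
    if pcs[tid] >= len(THREADS[tid]):
        return None
    op, lock = THREADS[tid][pcs[tid]]
    held = dict(locks)
    if op == "acquire":
        if lock in held:
            return None  # blocked: lock already held
        held[lock] = tid
    else:
        del held[lock]
    pcs = list(pcs)
    pcs[tid] += 1
    return (tuple(pcs), tuple(sorted(held.items())))

def find_deadlock():
    # Breadth-first search over every interleaving of the threads,
    # looking for a state where work remains but nobody can move.
    start = ((0, 0), ())
    seen, frontier = {start}, deque([start])
    while frontier:
        state = frontier.popleft()
        pcs = state[0]
        nexts = [s for s in (step(state, 0), step(state, 1)) if s]
        unfinished = any(pcs[t] < len(THREADS[t]) for t in (0, 1))
        if unfinished and not nexts:
            return state  # deadlock found
        for s in nexts:
            if s not in seen:
                seen.add(s)
                frontier.append(s)
    return None

print(find_deadlock())
```

The search reports the state where thread 0 holds L1, thread 1 holds L2, and neither can proceed, which random testing might only stumble on intermittently.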
So in this specific case, you design a model of all these points. You describe your system, is how I'm going to word it, because for tests to be generated, the tool has to have some kind of specification. Property-based testing in its simple form is: you describe properties of your system. So for example, let's say I'm designing an e-commerce system, say amazon.com, right? You can describe a property that every time a person makes a purchase, let's say I purchase a box of chocolates, because I have a sweet tooth, I love chocolates, right, then if n is the number of chocolates I've purchased, the inventory before the transaction and after the transaction should differ by exactly n, because I've removed n chocolates from the inventory, right? I'm also going to have a property that the revenue for Amazon has to increase by n times the price of the chocolate. So I am now describing properties of a system based on the transaction. Now, you would say, what's the big deal? The big deal is that real-life systems use eventual consistency, because they are multi-threaded, distributed systems working with multiple databases in different data centers, right? It could happen that this property is true in one data center, but not in data center two. Yeah. Right? So how do you test those conditions? You are describing the properties of the system, and the system is writing tests for you. Okay, yeah, yeah. So do I understand it correctly: beyond the properties you already mentioned, taking the n items out of the stock and increasing the revenue, you go one step further out of the system and describe it with the data center properties as well?
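The two purchase properties just described can be sketched in plain Python. The `Store` class is a made-up in-memory stand-in for the real inventory and billing services, so this only illustrates the idea of stating properties over a transaction; a distributed, eventually consistent implementation is exactly where such checks would catch real bugs.

```python
import random
from decimal import Decimal

class Store:
    # Made-up in-memory stand-in for the inventory plus revenue state.
    def __init__(self, stock, price):
        self.stock = stock
        self.price = price
        self.revenue = Decimal("0")

    def purchase(self, n):
        if n <= 0 or n > self.stock:
            raise ValueError("invalid quantity")
        self.stock -= n
        self.revenue += n * self.price

def check_purchase_properties(trials=500):
    for _ in range(trials):
        store = Store(stock=random.randint(1, 10_000), price=Decimal("2.50"))
        stock_before, revenue_before = store.stock, store.revenue
        n = random.randint(1, store.stock)
        store.purchase(n)
        # Property 1: the inventory drops by exactly n.
        assert stock_before - store.stock == n
        # Property 2: the revenue rises by exactly n times the price.
        assert store.revenue - revenue_before == n * store.price

check_purchase_properties()
```

Note that the properties never mention concrete quantities; they only relate the state before and after any transaction the generator invents.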
Yeah, yeah. So essentially, here's the thing: there are multiple ways of specifying. You could have a specification at a service or function level, or you could have a specification at a system level. If you specify at a service level, that's typically what a unit test is. If you specify at a system level, that's an integration test. Yeah. Right. You could do both.
And people usually do both. Yeah, makes sense to do both. And what do you think, where are the limits, or where is it not good to use property-based testing? I think the limit is that certain tests tend to be destructive or expensive. For example, way back, and maybe things are better now, but way back I was actually working at BlackBerry. When you test there, it's an actual phone call that you're making, for which you're being billed.
Yeah, right. If you're writing millions and millions of test cases, then you're making a million phone calls, right? And for those million phone calls, you're going to get a huge bill. The same thing is true if you're testing a system built on top of, let's say, AWS, and you're writing tests that require writing to storage and so on. You know, you're being charged for all the resources that you're
using for test cases. Yeah. And now you're writing a million-plus test cases automatically, so your costs can go out of control. So I would say property-based test cases are not good where the resources you're using for testing incur a huge cost for you; it can spiral out of control. Yeah, yeah, yeah, I understand. And when somebody listens now and says, oh, I want to start with this too: do you have one, two, three tips for starting, and also one or two
things we have to be aware of when we start? I think there's a lot of information on the internet, and there are good references in terms of books and the net. Information is quite cheaply available these days, so depending on your framework and tools, if you search the net you will get a lot of references; I actually did that. The key thing, and that's true for all testing, not just property-based, is
to always start small and then grow big. So I would say, okay, if I'm designing a distributed system, let me try with one service and see how practical it is. Yes. And then, you know, go on and expand my scope. It's just better to grow organically, because when you're introducing any new tool in an organization, whether it is property-based testing or something else, it's not just a programming thing, right? You have to make cultural changes;
people think in a certain way, and now you're forcing them to think in a different way. It's not as simple as: here is a new tool, write these four lines. You have to build that culture, the shift of mindset, right? So there's a cultural component to it. And then you have a large army of people in a large organization, right? Changing a thousand people's minds is not that easy. And I've worked on similar projects, not so much
on testing but on new build systems. I had this shiny new build system, and I was showcasing it, and nobody used it. Why? Because they were skeptical, right? A very tough part of my job, probably the toughest part, is not writing code. The toughest part of my job is convincing people. Yeah, at the stage where I am, convincing people is a hard job. And these are people like: I have 20 years of experience, I know what this is, what are you
telling me? Yeah, right. And now you have to deal with that. So generally, you start small and grow big, and I'm not just talking about property-based testing; this is true even if you introduce chaos testing or something like that. The principles stay the same: when you start small and grow big, if there are any implementation risks, any roadblocks, you will find out early, right,
the agile philosophy: fail fast. And also, if your tool is working for one service, then it will probably work for another service, and it's easier for you to convince people. One thing that I've always done is pick one customer: if you're working in a large organization with 50 teams, pick one team that is willing to cooperate with me, right? We'll try property-based testing, this new tool, whatever change you're making,
show them the value, and then let them go and preach those values to the others. Because when you're introducing a new technology, basically you are a seller and your developers are the customers; they have to buy the idea: let's use property-based testing, right. And you know, standard marketing: if some seller is telling you my product is great, the customer is like, well, he's going to say that, obviously. But when you get a recommendation from a
fellow customer, oh no, I use that product, it's awesome, then you have a lot more confidence, because obviously there's a conflict of interest, right? I would want you to buy my service, so I'm going to say good things about it anyway, even if it's shitty. Yeah, yeah. Right. So that organic growth has to be there in a large organization if you're introducing something new. From a learning perspective, that's a whole different thing.
I think one major topic in a team or an organization can be this: if you have all these combinations and this formal description, there are many more test cases than I can understand. So how do you deal with the topic of trust, trusting all this stuff that comes out of the tools? I'm not quite clear on your question. So normally, a tester has his test cases, and they are very clear, one step after another,
and I know what they do, I can check them, and they are my thing, so I trust these test cases I designed. With property-based testing, a lot of combinations run, and maybe I don't even see them all. So how can I trust them? When you run a property-based tool and a test case fails, you actually get the complete sequence of what happened. So you can actually see if it's a valid failure, and you can write a test case for that specific one. Okay, right.
I mean, there are millions of things that can go wrong. I agree that in certain cases there's a possibility of false positives, right? That it actually is not a failure, but maybe a bug in how you described your system. Yeah, that can happen. Right. So the thing is, you can, and you should, verify that the failure that happened is actually a failure. Right? Yeah. So the system is
not so much replacing a tester as assisting a tester. Yeah. A million things can go wrong; a human can't cover them all. So what the system has done is, from those million, it has picked the 20 things that could potentially go wrong. Then I, as a human, don't have to go through all the million combinations; I just focus on those 20. Now, among those 20, it could be that some are false positives. But there's always human
supervision. Because once you decide, okay, something has failed and I have to fix it, you first have to understand what is happening. And this is where the role of the shrinker comes in; that's exactly what the shrinker is trying to tell you: exactly where things went wrong. Yeah, yeah. I think this is the shift: to see this as an assistant for the tester. Yeah, exactly. So, you know, separate from property-based testing, with all this generative
AI there's a lot of debate about humans being replaced. I'm like: humans are not going to be replaced. AI is not replacing humans; it's assisting humans in becoming more productive. And that's exactly what property-based testing is doing for you. It's just making you more productive, not replacing you. Yeah, yeah. A great topic. Thank you very much, Nikhil. It was very interesting for me, such a new topic for me too, so I have a lot of
questions. Thank you very much for giving us this insight. We'll put some information in the show notes, maybe some tips and references you have, so everybody can read more. Absolutely, there are a lot. And yeah, I wish you a good time here at the QS-Tag, a good talk afterwards, and a good time traveling in Europe. Absolutely, I look forward to it quite a lot. Yeah, thank you very much for being here. Thank you. Thank you. Bye. Bye.