Welcome to Software Testing Unleashed, the podcast for testers, developers and software makers who live quality as an attitude. Get fresh ideas and sharp insights to grow your mindset, to learn new methods and to drive real change in how we build software: better software and better teams for a better world. Hi, I'm Ritchie, software quality coach, keynote speaker and book author. My guest today is Nikhil Bhatwar.
Nikhil is an expert in decentralized systems. He has worked in both startups and large corporations, giving him a wide view of real-world complexity. He also mentors other startups and speaks regularly at international conferences. His talks cover cutting-edge topics like software quality and distributed systems. In short, Nikhil knows how to build systems that scale and survive the real world. In this episode, we talked about property-based testing.
It is a method to go beyond human imagination in software testing, because the number of possible combinations is so huge. How can a system test itself by generating its own test cases? What do you do when your test space grows into billions of combinations? And why do some of the most critical bugs show up only after seventeen exact steps? We explored how property-based testing helps uncover edge cases that humans miss.
Nikhil explained it in a super practical way, from modeling and generators to shrinking failing tests so that a human can understand them. You'll also hear his take on where this method reaches its limits and how to start using it step by step in your team. So now enjoy the episode! Nice to have you on the show, here in the podcast. Absolutely, thank you so much for having me!
Yeah. You traveled a long way here to Frankfurt, to the QS-Tag, and you already told me that you'll combine it with some sightseeing and visiting Europe. It's actually a beautiful place. Just for context, I actually spent my childhood in Germany, which was way back. My father was a nuclear scientist in Jülich. That was at the time when Germany used to be divided. Jülich is the German research center for nuclear physics. I don't know German, so something like that.
Yeah. He was a scientist there for a few years. So I went to a German kindergarten and I learned German. I didn't know English, I only knew German. It's kind of my first language. Then I forgot German because I came back. Since then I've rarely ever spoken German. But my mom is traveling with me, so she was like, yeah, we used to live there and we went this way, so those memories are coming back. Yeah, that's great!
So maybe you can reactivate your German language too? I don't know if it's a good idea on a podcast, because people can look really silly. We can do it in English, no problem. Yes, we are here at the QS-Tag from imbus, and you have a talk about property-based testing. I read your abstract and found it very specific, so I wanted to talk to you. So let's start. What is property-based testing? Yeah! Let me start by giving some background. My area generally is web services, right?
Large-scale microservices, where you have a thousand services. Now you're going to test them, and the thing is they can behave in a certain way, in the sense that one service fails, the failure passes to the second one, and you don't actually know it until the fifth or sixth service fails. A cascading failure, right? The number of test cases that you want to test, or the number of failure modes of a distributed system, is huge. So, in simple mathematical terms, assume that you have n services, right?
And certain failures occur when two, three or more services fail together. So by simple maths it's like n cubed, where n is, I don't know, a thousand maybe. So n cubed becomes what? A billion, something like that, and you want to test all of these. How do you test that? Are you going to write a million test cases? That's where the idea of property-based testing comes in. The fundamental proposition of property-based testing is that when a system gets too complicated, it is not possible for humans to write every test case.
Rather, we want these test cases to be generated automatically. How do we generate them? We know the behavior of the services, right? So if you say that this is the behavior of the service, that if you pass a certain class of input you expect a certain class of output, the system can basically write test cases for you automatically. It can generate thousands and thousands of combinations of these tests and run them automatically, rather than you writing them.
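To put rough numbers on the scale argument, the following Python sketch computes the n-cubed estimate for n = 1000 and, for comparison, the number of unordered sets of three services. It then samples a few random failure sets instead of enumerating them all. The value of n and the helper name `sample_failure_sets` are illustrative assumptions, not something taken from the talk.

```python
import math
import random

N_SERVICES = 1000  # the "n" from the conversation; the exact value is illustrative

# Ordered triples (the rough "n cubed" estimate) versus unordered sets of three.
ordered_triples = N_SERVICES ** 3
unordered_triples = math.comb(N_SERVICES, 3)


def sample_failure_sets(n_services, how_many, seed=0):
    """Yield random sets of three distinct failing services (hypothetical helper)."""
    rng = random.Random(seed)
    for _ in range(how_many):
        yield tuple(sorted(rng.sample(range(n_services), 3)))


if __name__ == "__main__":
    print(ordered_triples)    # 1000 ** 3
    print(unordered_triples)  # "1000 choose 3"
    for failing in sample_failure_sets(N_SERVICES, 5):
        print(failing)
```

Either way of counting gives far more cases than anyone would write by hand, which is why the cases are sampled by a generator rather than listed.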
Yeah, okay. So for example, in a very simple case, let's say you have a service that takes two inputs, A and B, and gives an output, A plus B. Sounds very simple, right? You would write some test cases: zero, one, two, three, some negative numbers, some positive numbers, all those things. And if those test cases pass, everything is great.
But the problem is that in real-life systems you have buffer overflows, you have integer overflows, and at a certain combination where you approach that edge, it is possible that the service would fail because you had a buffer overflow or an integer overflow, right? How do you test those? So rather than testing that, you describe the property: if I send A and B and the output is C, then if I subtract B from C, the output should be A. Because if A plus B is equal to C,
then C minus B is equal to A. So I can generate thousands and thousands of combinations of A and B, get the output, subtract B and make sure it is A. So your test case is not actually going to be about which input you are sending, which is how you typically test, right? You have, okay, this is my set of inputs, I1, I2, I3, and I expect the outputs O1, O2, O3. Rather, you describe your system.
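The addition example can be sketched in a few lines of Python using only the standard library. This is a hand-rolled illustration, not the API of any property-based testing framework. `saturating_add` is a hypothetical buggy service that clamps its result to the 32-bit range, and `find_counterexample` checks the property "output minus B equals A" on random inputs.

```python
import random

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def saturating_add(a, b):
    """Hypothetical buggy service: clamps the sum to the 32-bit range."""
    return max(INT32_MIN, min(INT32_MAX, a + b))


def find_counterexample(add, trials=1000, seed=0):
    """Check the property add(a, b) - b == a on random 32-bit inputs.

    Returns the first (a, b) that violates it, or None if all trials pass.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.randint(INT32_MIN, INT32_MAX)
        b = rng.randint(INT32_MIN, INT32_MAX)
        if add(a, b) - b != a:
            return a, b
    return None


if __name__ == "__main__":
    print(find_counterexample(lambda a, b: a + b))  # plain addition: no violation
    print(find_counterexample(saturating_add))      # clamped addition: a violating pair
```

Nobody had to pick the overflow inputs by hand. Drawing from the full input range is enough for the property to expose the clamping.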
So I have described my system: the property of this service is that if I take the output and subtract one input, I get the other input. It's the system's job to write these test cases. But in real life things are not so simple. So typically property-based testing consists of three components. You have a modeling language, which in this case is a simple equation, but in real life there will be more complicated modeling languages. We now have a lot of formal mathematics to describe these systems.
Some use TLA+, but there are a bunch of them, the formal specification languages. So we describe the system with these formal specifications. That's one component, right? Then we have a generator, which will take the specification and write the test cases for you, or find the violation. So now you have found that something is wrong, because your test cases are failing. Then we have a third component, called the shrinker.
Now typically, because these test cases are generated, and I would say not super randomly but intelligently from the specification, they can be long sequences of steps leading to a failure. I'll give a real-life example. A test was done on Google's LevelDB database. If you replicate seventeen specific steps, there is a bug, which no human could anticipate. Nobody would say, these are the seventeen steps we should do, and then replicate all of them.
Property-based testing was used, and within one hour it found that bug, right? Now when it generates these test cases, it generates long sequences as cases. That's what the generator does. Something goes wrong, and you know something is wrong. But now, for a human, if I give you a sequence of seventeen steps and say, well, this is the failure, you will be like, yeah, I don't know where exactly the failure is. I understand there's a failure, but these are seventeen steps.
I don't know which of the seventeen steps failed. So the third component is a reducer. What the reducer does is, once you have found a fault, it starts shrinking the test case and says, okay, let me try to find exactly where things fail. So now we have these three components.
When I describe my system, I have a generator. You can write your own custom generator or use the standard generator, right? This generator takes the specification and generates these millions of complicated cases. These are machine-readable. We generate these cases, we find the faults, and the shrinker basically reduces a complicated test case to a human-readable test case and says, well, this is exactly where the problem is. That's how property-based testing works.
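The three components described here (specification, generator, shrinker) can be sketched together in plain Python. Everything in this sketch is an assumption made for illustration. `BuggyStore` is a made-up key-value store with a planted bug, the specification is "the store always agrees with a plain dict", and the shrinker simply drops one step at a time. Real frameworks use more sophisticated generation and shrinking strategies.

```python
import random


class BuggyStore:
    """Hypothetical key-value store with a planted bug: put() never clears a tombstone."""

    def __init__(self):
        self.data = {}
        self.tombstones = set()

    def put(self, key, value):
        self.data[key] = value  # bug: should also remove key from self.tombstones

    def delete(self, key):
        self.data.pop(key, None)
        self.tombstones.add(key)

    def get(self, key):
        if key in self.tombstones:
            return None
        return self.data.get(key)


def fails(steps):
    """Specification: every get() must agree with a plain dict used as the model."""
    store, model = BuggyStore(), {}
    for op, key, value in steps:
        if op == "put":
            store.put(key, value)
            model[key] = value
        elif op == "delete":
            store.delete(key)
            model.pop(key, None)
        elif op == "get" and store.get(key) != model.get(key):
            return True
    return False


def generate(rng, length):
    """Generator: a random sequence of operations over a small key space."""
    return [
        (rng.choice(["put", "delete", "get"]), rng.choice("abc"), rng.randint(1, 9))
        for _ in range(length)
    ]


def shrink(steps):
    """Shrinker: keep dropping single steps while the sequence still fails."""
    changed = True
    while changed:
        changed = False
        for i in range(len(steps)):
            candidate = steps[:i] + steps[i + 1:]
            if fails(candidate):
                steps, changed = candidate, True
                break
    return steps


def run(seed=0, attempts=200, length=30):
    """Generate sequences until one fails, then return (original, shrunk)."""
    rng = random.Random(seed)
    for _ in range(attempts):
        steps = generate(rng, length)
        if fails(steps):
            return steps, shrink(steps)
    return None


if __name__ == "__main__":
    original, minimal = run()
    print(len(original), "steps shrunk to", len(minimal))
    print(minimal)
```

The generator finds the failure inside a thirty-step sequence, and the shrinker reduces it to the delete, put, get pattern on a single key that actually triggers the bug. That reduced sequence is the human-readable test case the conversation refers to.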
Obviously, in every language you have different kinds of frameworks, right? Exactly! Which frameworks are there? Which technologies do they cover? So, there are a lot of them. Typically, there are some commercial ones and there are open-source frameworks. For .NET I actually have a very small demo, but I also give a list of the frameworks.
This concept started with a tool called QuickCheck, which was written in Haskell, and then it was ported to multiple languages. So now you have FsCheck for .NET, you can find something like PyCheck for Python, and for a bunch of other technology stacks and programming languages you have these frameworks. Property-based testing is a concept of how you test the system, right? Different people have implemented it in different ways.
Yeah, so that's why there are multiple frameworks. Each one says, well, this is how we generate the test cases. Of course, there are commercial tools available too. I think there is Quviq, which is being used in automobiles, because, you know, that is mission-critical software, right? If the software fails, people die. Yes, you want to be absolutely sure that everything works correctly. Yeah.
So the main goal is to get very broad coverage of all these combinations, and also to get more reliability for the system. Yeah, ultimately the goal is reliability. I mean, reliability, being bug-free, they all come down to the same point. But the key idea is to go beyond human imagination. I mean, when a human is writing a test case, there's a limit to what a human can do, right?
If you look at the whole branch of formal methods, not just property-based testing, the whole thing is that the number of permutations and combinations of things that can go wrong is way beyond what a human can imagine. So I'll give you a very different example from property-based testing. There is a tool called TLA+ that deals with concurrency. Now what happens
in concurrent systems is that you have threads and you have locks, and all these things go wrong. The problem for these systems is that the bugs are intermittent. A certain set of conditions, usually caused by the external environment, has to happen before the system crashes. So you run your unit test cases, and everything works. You run your integration test cases, and everything is fine. You run on production, and everything runs for two months. On the first day of the third month, it fails.
How do we test those cases? Because this is intermittent and may not happen, how are you testing it? Here you have the same thing, because threads can operate in a very different way based on the situation and the environment, which may be different from your testing. That's where formal methods come in, to go beyond human imagination.
The tool creates a state-space diagram of every possible combination of moves and then sees whether there's a thread deadlock or a concurrency problem, and then it can tell you, well, this is the problem that you would have. And this particular one, TLA+, has actually been used by Amazon AWS. There's an article I can point you to. It's publicly available, or you can Google AWS TLA+ formal methods.
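The idea of exploring every possible interleaving can be illustrated with a small breadth-first search in Python. This is not TLA+ and not how its tooling is implemented. It is a hand-written toy under stated assumptions: two hypothetical threads take the locks A and B in opposite order, and the search looks for reachable states where neither thread has finished and neither can move.

```python
from collections import deque

# Each thread is a fixed list of lock operations (a made-up example).
PROGRAMS = (
    [("acquire", "A"), ("acquire", "B"), ("release", "B"), ("release", "A")],
    [("acquire", "B"), ("acquire", "A"), ("release", "A"), ("release", "B")],
)


def held_by(thread, pc):
    """Locks a thread holds after executing its first `pc` steps."""
    held = set()
    for action, lock in PROGRAMS[thread][:pc]:
        if action == "acquire":
            held.add(lock)
        else:
            held.discard(lock)
    return held


def successors(state):
    """All states reachable by letting exactly one thread take one step."""
    result = []
    for thread, pc in enumerate(state):
        if pc == len(PROGRAMS[thread]):
            continue  # this thread has finished
        action, lock = PROGRAMS[thread][pc]
        other = 1 - thread
        if action == "acquire" and lock in held_by(other, state[other]):
            continue  # blocked: the other thread holds this lock
        nxt = list(state)
        nxt[thread] = pc + 1
        result.append(tuple(nxt))
    return result


def find_deadlocks(start=(0, 0)):
    """Breadth-first search over every interleaving; collect stuck, unfinished states."""
    finished = tuple(len(program) for program in PROGRAMS)
    seen, queue, deadlocks = {start}, deque([start]), []
    while queue:
        state = queue.popleft()
        next_states = successors(state)
        if not next_states and state != finished:
            deadlocks.append(state)
        for nxt in next_states:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return deadlocks


if __name__ == "__main__":
    print(find_deadlocks())  # [(1, 1)]: each thread holds one lock and waits for the other
```

A test run might never hit this interleaving, but the exhaustive search reaches it immediately, which is the point being made about intermittent concurrency bugs.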
They wrote a paper on how AWS uses these formal methods to uncover bugs which would otherwise cause the system to fail intermittently. And basically, the idea was that you sometimes have to do a complete rewrite of your system, because if there is a fundamental problem in the design, what else do you do? Property-based testing is kind of the same thing the other way around. There, you describe how you design the system. Property-based testing takes the same idea and says, okay, fine, I have software that is working.
How do I test it? But the key idea remains the same: human imagination works up to a certain point. Once you go up in scale, the number of combinations of things that can go wrong is so huge that it is not practical for a human to think of or implement those cases. So in this specific case you design a model, with points you can combine? You describe your system, is how I'm going to word it, because to generate tests, the tool needs some kind of specification.
Property-based testing in its simple form is: you describe properties of the system. So for example, let's say I am designing an e-commerce system like Amazon.com. You could describe a property that holds every time a person makes a purchase. Let's say I purchase a box of chocolates, because I have a sweet tooth. I love chocolates, right?
So every time I make a purchase, if n is the number of chocolates that I've purchased, then the difference in the inventory before the transaction and after the transaction should be n, because I've removed n chocolates from the inventory, right? I'm also going to have a property of the system that the revenue for Amazon has to increase by n times the price of the chocolate. So I am now describing properties of a system based on the transaction.
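The two purchase properties can be written down as a small Python check. The `Shop` class below is a hypothetical single-node stand-in with illustrative names, not a model of any real e-commerce system. The check draws random stock levels, prices and quantities and asserts both properties after every purchase.

```python
import random


class Shop:
    """Hypothetical single-node shop; names and fields are illustrative only."""

    def __init__(self, stock, price_cents):
        self.stock = stock
        self.price_cents = price_cents
        self.revenue_cents = 0

    def purchase(self, n):
        if n > self.stock:
            raise ValueError("not enough stock")
        self.stock -= n
        self.revenue_cents += n * self.price_cents


def check_purchase_properties(trials=500, seed=0):
    """Assert both purchase properties on random inputs; return the number of trials."""
    rng = random.Random(seed)
    for _ in range(trials):
        shop = Shop(stock=rng.randint(0, 100), price_cents=rng.randint(1, 5000))
        n = rng.randint(0, shop.stock)
        stock_before, revenue_before = shop.stock, shop.revenue_cents
        shop.purchase(n)
        # Property 1: inventory drops by exactly n.
        assert stock_before - shop.stock == n
        # Property 2: revenue rises by exactly n times the unit price.
        assert shop.revenue_cents - revenue_before == n * shop.price_cents
    return trials


if __name__ == "__main__":
    print(check_purchase_properties(), "random purchases satisfied both properties")
```

On a single node like this the properties hold trivially. They become interesting when the same assertions are run against a system whose stock and revenue live in different places.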
Now you would say, what's the big deal? The big deal is that real-life systems use eventual consistency, because they are multi-threaded systems, right? In a simple system, if you were to implement this, it is a very simple thing, unlikely to run into problems. But that's not how these systems are implemented, because they are distributed systems working with multiple databases in different data centers, right?
It could happen that this property is true for one data center but not for data center two. So you test those conditions. So you're now describing properties of the system, and the system is writing the tests. So, do I understand it correctly: you have the properties you already mentioned, taking the n items out of the stock and increasing the revenue, and you go one step further out in the system and also describe the system with the data center properties? Yeah, yes, essentially.
Now here the thing is, there's a difference. There are multiple ways of specifying. You could have a specification at the service level or function level, and you can have one at the system level. If you specify at the service level, then that's typically what a unit test would be. But if you specify at the system level, this is an integration test, right? And people usually do both. Makes sense to me. And what do you think?
Where are the limits, or where is it not good to use property-based testing? I think the limits would be that certain tests tend to be destructive. For example, way back, and maybe things are better now, I was working at BlackBerry. When you test a phone, you make phone calls, for which you are being billed. If you're writing millions and millions of test cases, then you're making a million phone calls, right? And for those million phone calls you are going to get a huge bill!
The same thing is going to be true if you're testing a system that, let's say, you're building on top of AWS, and you write some tests that require writing to storage and so on. You're being charged for all the resources that you are using for test cases. And now you are automatically generating a million-plus test cases, so your costs can go out of control. So I would say property-based testing isn't a good fit where the resources you use in testing incur a huge cost, because it can spiral.
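One simple way to keep that spiral in check is to cap the number of generated cases by a budget. The Python sketch below is an assumption-laden illustration: the cost per case and the budget are made-up numbers in cents, and `run_with_budget` is a hypothetical helper rather than a feature of any particular framework.

```python
import random


def run_with_budget(check, cost_per_case_cents, budget_cents, seed=0):
    """Run random cases until the next one would exceed the budget.

    `check` is any property check that takes one generated integer.
    Returns (cases executed, cents spent).
    """
    rng = random.Random(seed)
    spent, executed = 0, 0
    while spent + cost_per_case_cents <= budget_cents:
        check(rng.randint(-1000, 1000))
        spent += cost_per_case_cents
        executed += 1
    return executed, spent


def abs_is_non_negative(x):
    """Trivial stand-in property, so the sketch has something to check."""
    assert abs(x) >= 0


if __name__ == "__main__":
    # Illustrative numbers: 2 cents per case against a 500 cent budget.
    print(run_with_budget(abs_is_non_negative, cost_per_case_cents=2, budget_cents=500))
```

With these numbers the run stops after 250 cases, so the bill is bounded no matter how many cases the generator could produce.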
When somebody listens now and says, oh, I want to start with this too, do you have one, two or three tips for starting? And also one or two things we have to be aware of when we start? I think there's a lot of information on the internet, and there are good references in terms of books. These days information is quite cheaply available on the net. So depending on your framework and tools, if you search on the net, and I actually did that, you get a lot of references.
The key thing, and that's true for all testing, not just property-based testing, is to always start small and then grow big. If I am designing a distributed system, let me try with one service, then go on and expand my scope. It's just better to grow organically, because introducing any new tool in an organization, whether it is property-based testing or something else, is not just a programming thing, right? You have to make cultural changes. People think in a certain way.
Now we are forcing them to think in a different way. You have to build that culture, the shift of mindset, right? So there's a cultural component to it. And then you have a large army of people in a large organization, right? Changing a thousand people's minds is not that easy. Yeah. And I've worked on similar projects, not so much in testing but on new build systems. I had this shiny new build system that I was showcasing, and nobody used it. Why? Because they were skeptical.
A very tough part of my job, actually probably the toughest part of my job, is not writing code. The toughest part of my job is convincing people. Doing that at the stage where I am, you know, convincing people is a hard job. These are people who say, I have twenty years of experience, what is this, what are you telling me? And now you have to deal with it. So generally, you start small and grow big. And I'm not just talking about property-based testing.
This is true even if you use chaos testing or something like that; the principle stays the same. When you start small and grow big, if there are any implementation risks or roadblocks, you will find out early, right? The agile philosophy: fail fast! Also, if your tool is working for one service, then it probably will work for another service. It's easier to convince people. One thing that I've always tried is this: if you're working in a large organization with, say, fifty teams,
pick one team which is willing to cooperate with you. Give them property-based testing, this new tool, whatever changes you're making, show them the value, and let them go and preach those values to the other teams. Because when you're introducing a new technology, basically you are a seller and your developers are the customers. They have to buy the idea: let's use property-based testing, right? And it's standard marketing.
If some seller is telling me his product is great, well, he is going to say that, obviously! But then you get a recommendation from a fellow customer: oh yes, I used this product, it's awesome. Then you have a lot more confidence, because for the seller there's obviously a conflict of interest, right? I want people to buy this from me, so I'm going to say good things about it anyway, even if it's shitty. So that organic growth has to be there in a large organization. If you're just using it from a learning perspective, that's a whole different thing.
One major topic can come up in a team or an organization. If you have all these combinations and this formal description, there are many more test cases than I can understand. So how do you deal with that, with the topic of trust, trusting all this stuff that comes out? I'm not sure about these tests. Normally a tester has his test cases and they are very clear, one step after another, and I know what they do and can check them.
That's my point. How can I trust these test cases that I designed with property-based testing, where there are lots of combinations that run and I maybe don't even see them all? How can I trust them? When you run a property-based testing tool and a test case fails, you actually get the complete sequence of what happened. So there you can see if it's a valid one, and write a test for that specific one. There are millions of things that can go wrong. I agree that in certain cases there is a possibility that they're false negatives.
It isn't a failure of the system, but maybe a bug in how you described your system. That can happen, right? So the thing is, you should verify the failure that happened. Is it actually a failure? Right! So what is the system doing? It's not so much replacing a tester as assisting a tester. A million things can go wrong. A human can't do what this system has done. From those millions, it tells you these are the twenty things that could potentially go wrong.
Then I, as a human, don't have to go through all the million combinations. Now, among those twenty, it could be that some of them are false negatives, but there's always human supervision, because once you see that something has failed and you have to fix it, the first thing is to understand what is happening. And this is where the role of the shrinker comes in. That's exactly what the shrinker is trying to tell you. It is telling you where things went wrong. Yeah, I think this is a shift too,
to see it as an assistant. Yeah. Separately from property-based testing, with all this generative AI, there's a lot of debate on whether humans are going to be replaced. I'm like, humans aren't going to be replaced! AI is not replacing humans. It's assisting humans in becoming more productive, and that's exactly what property-based testing is doing for you. It is just making you more productive, not replacing you. A great topic. Thank you very much, Nikhil.
It was very interesting for me, such a new topic for me too, so I had a lot of questions. So thank you very much for giving us this insight. Maybe you have some tips and references that we can put in the show notes so everybody can read them. Absolutely, there are a lot. And I wish you a good time here at the QS-Tag, a good talk afterwards, and a good time traveling in Europe. Absolutely! I look forward to it quite a lot. Very happy to be here.
Thank you, bye!
