Special Guest: Clare Macrae

Matt Godbolt

00:00

Hi, I'm Matt Godbolt.

Ben Rady

00:01

And I'm Ben Rady.

Matt Godbolt

00:02

And this is Two's Complement: a Programming podcast. Today we're delighted to have a special guest: Clare Macrae is joining us. Hi Clare.

Clare Macrae

00:24

Hi Matt. Hi Ben.

Ben Rady

00:26

How are you?

Matt Godbolt

00:27

It's amazing to have another guest on our podcast. So we're really excited to talk with you about all sorts of things, but do you want to give yourself a little bit of introduction? Tell us about yourself.

Clare Macrae

00:37

Sure. Uh, so yeah, I've been programming for a living for more than 30 years now. Um, but in my childhood, uh, the early days, the first language I programmed with was BASIC. And my dad, um, got into computing in the very early years of industrial computing in the UK. And so when I started learning BASIC at school, he got really excited and he got an amazing book, "Illustrating BASIC" by Donald Alcock, spiral bound, but with the most amazing pictorial explanations of how to program. He was absolutely brilliant. And Dad even bought an early home computer, a Trans-Am Tuscan, one of which I've seen at the National Museum of Computing in Bletchley Park. Um, really incredible.

Matt Godbolt

01:24

My hobby is retro-computing and I have never heard of that computer. That's amazing. What was it called?

Clare Macrae

01:31

The Trans-Am Tuscan.

Matt Godbolt

01:33

That sounds like a car.

Clare Macrae

01:35

It's an amazing, amazing thing.

Ben Rady

01:37

Or a Star Wars character.

Clare Macrae

01:40

So I started with BASIC at home and about the same sort of time at the high school or upper school I was at, they had some kind of setup with some, I dunno, remote mainframe. It didn't matter to me. We had these sort of squares of paper and we could write down our BASIC programs and we'd send them off, and a few days later we'd get back a printout with some information or whatever, a message, and then rinse and repeat. And then just before I left school, the first BBC Micro arrived, which I know is something quite close to Matt's heart. And, um, so yeah, I have this weird memory of, uh, the classroom where it actually was and where we could go at lunch times, but mostly my memories involve standing behind the other people in a long queue to get to use it for a few minutes.

02:28

So I don't actually know how much I used it, but it's a really major memory from my childhood. And then I went on to do a chemistry degree, um, but I was just rubbish at practical chemistry, two left thumbs, always pouring things away at the last moment, right at the end. And I had an opportunity to do a year-long, um, computing project in chemistry and Fortran. And that in turn stood me in good stead for my career in the next 30 years. So I think probably originally from my dad, but it's just sort of, I liked the programming. I liked the logic. It seemed to make sense. Um, didn't always succeed, but you know, you could always learn to get better. So that's my early years, really.

Matt Godbolt

03:13

Gosh. And so when you say computational chemistry, what, what, I don't even understand what that means. Like I have very basic GCSE level chemistry understanding of like, well, water is two hydrogens and one oxygen kind of level, but what can we do with a computer program and chemistry?

Clare Macrae

03:30

So I kind of have slightly weird imposter syndrome over this because really I'm a computer programmer at heart rather than a chemist. Um, but chemists said, yeah, but look at the software you've written, you've obviously figured out quite a lot. So you kind of learn over time. Um, computational chemistry is really divided into, sort of, the theoretical side of things, where people are trying to predict, um, new facts, um, or trying to predict the results of experiments. Um, and for years and years and years, the impossible example that was given of that was predicting, um, the 3D crystal structures of proteins, which was always like, you can't even predict the crystal structures of small molecules. So you have a crystal, you know, where a molecule aggregates in three dimensions, you know, like in your salt shaker, salt crystals, that kind of thing.

Matt Godbolt

04:25

Grand memories of seeing like a, a lattice for like the sodium, whatever salt, salty, salt crystal things. Yeah.

Clare Macrae

04:33

Yeah. So it's hard enough, um, to predict, um, to calculate theoretically, the crystal structures of small molecules. So proteins, it was like, you know, maybe in a few decades they'll solve that, but in the last few months, um, AlphaFold, I think is the name of the project, has just blown any expectations out of the water with what they've done.

Matt Godbolt

04:56

This is Alpha like Google, Alphabet, um, DeepMind, the AI company?

Clare Macrae

05:02

Yeah. So like AlphaGo a few years ago, that shocked the Go world. So that's the theoretical side of things and that's not really something I've had any involvement in, but then there's, um, uh, software based on experimental data. And that's where my main experience is, of working for a nonprofit organization for more than 30 years that exists to collect, uh, the results of crystallography experiments, essentially to collect information about the shapes of molecules in three dimensions and to collect, crucially, the symmetry operations that say how the molecules aggregate together across the whole crystal structure. And that turns out to be really valuable information and there's a lot of software involved at all stages of the process.

Matt Godbolt

05:50

Got it. And so what language were you writing all this stuff in? So you were saying you're collecting and aggregating across lots of experiments. Um, and yeah, what form did that take?

Clare Macrae

06:01

So the company, well, nonprofit organization that I worked for for more than 30 years, when I joined, it was Fortran 77. I joined in '87. Um, but the organization is well over 50 years old now. It was formed in 1965 and they really were at the forefront of electronic publishing in the late sixties and early seventies, publishing books for which they'd written Fortran IV software to plot the chemical diagrams of the structures that are in the database and do the typesetting with Greek characters and subscripts and superscripts. That was unheard of at that time. Um, and by the time I joined, that Fortran IV had been ported to, um, Fortran 77, and over time we evolved to other languages. And, uh, the majority of my later years there was all C++ and Qt. So I think it was 1999, uh, somebody had the idea of porting all our Fortran to C++ and getting rid of the old Fortran. And it took quite a while, but we eventually achieved that.

Matt Godbolt

07:08

I was going to say, because Fortran is deeply embedded in the science community, and from my own, like, very limited dalliance with it, um, Fortran has some interesting guarantees about arrays and aliasing that C and C++ don't provide, which means that naively converting large Fortran programs to C can actually be a pessimization if performance is important. I mean, maybe performance isn't as big a deal there. Um, so that's an interesting thing to have to sell people on, to actually convert over to a more modern language. But then I suppose the benefits are it's easier to work on or easier to test or easier to extend. What kind of benefits, what was the reasoning behind moving to, um, I say more modern, but obviously Fortran's still going, um, a more modern language?

Clare Macrae

07:59

Um, I think better design. We were really, so I don't have experience of, of modern Fortran, but the code we had, there was lots of global data. We, we were really dependent on common blocks and things like that. And I mean, it's hard as someone who's been programming for a long time to try and convey to newer colleagues that, yes, this might look ancient to you and not how you do it, but the people who did it at the time, they were, you know, what they were doing was really powerful with, with what they knew at the time and, uh, providing capabilities that weren't weren't otherwise there. Um, the, it was very object-oriented C++ that we ended up with and we, there were lots of things that it would never have been possible to implement in the Fortran days. Um, my career went through really kind of weird, um, evolution of, of responsibilities.

08:51

So when I started in the late eighties, early nineties, we, you know, it was a nonprofit organization that evolved out of Cambridge University. We didn't have sales and marketing teams. We didn't even have user support. So the developers did all of those things and internal systems administration as well. So it meant there wasn't a huge amount of time. You certainly didn't have a hundred percent of your time on programming, but you were speaking to users directly. So you had a really good sense of what people needed. And it's a really painful feeling of, yeah, I understand what you need to do, but I have no concept at all of how to do that in Fortran and with the code that we had. Things like, you started a search of the database, and when it got to the end, people wanted to do a new search, but our software just halted and we couldn't see how to rework it to go back to the start. Really basic things.

09:44

And then over time, we ended up with, you know, powerful user interfaces and really maintainable code. But by that point, there were multiple layers in the organization, meeting the customers, meeting the users and dealing with them. So by then we got all our requests secondhand or third hand, um, and it was much harder to get that sort of, you know, empathy and direct connection of what do users really want to do. I think Ben, you were talking in an earlier episode about how, before you're implementing a feature, you want to actually run it from a user's perspective and really understand what the user needs to do. And I saw that, for a long time, get harder and harder. And then later on with agile, we ended up sort of getting more and more direct contact as well. So it kinda got better, but...

Ben Rady

10:31

Yeah, not only do you lose that empathy, but at least I find that you also lose a, um, a fair amount of, I don't know, a better way to describe this, but almost problem negotiation, right. Where, you know, somebody is trying to accomplish a goal and they might think of it in certain terms, you know, it's sorta like the old Henry Ford aphorism about if I asked my customers what they want, they would say faster horses. Right. Um, and so as a technologist is like a lot of times you have, because of your knowledge of technology, you have different ways of looking at a problem. And if you can't have a face-to-face conversation with the person who's trying to solve that problem, if that is through a second party that may be technical or non-technical, it's, uh, it's tricky, right? Um, you wind up building things that, you know, don't, aren't, aren't either aren't the best solution or in the worst case, don't really solve the problem at all, because it's sort of a preconceived notion of what the problem should be. Um, is that, so I know, I understand that you are now you've sort of moved on in your career and you're now, uh, starting to consult. Um, is, is that one of the things that sort of motivated you to do that?

Clare Macrae

11:38

Well, the real motivation for that was, uh, for the last two or three years, I've been volunteering on some open source software, uh, for testing hard-to-test code, legacy code where you think there's no way we can add tests to that, there's no way we could break it down. And that's an approach called approval tests, which was invented by someone called Llewellyn Falco. And it reached the point where I was just learning so much and I was in this fantastic kind of virtuous circle. I would speak at a local meetup group or a conference, um, which meant I had to learn more about the software in order to be able to talk about it, but then people would ask questions and I would learn more. And that was just taking up more and more of my time. And I kind of reached the stage where, um, you know, I'd had a fantastic 30 years, but, um, I was learning so much that, um, it kind of felt like I could speak to a different group of people and a wider group of people to share some stuff that I'd learned and worked out, but also a lot of stuff I was learning from other people, and it wasn't really being talked about at C++ conferences.

12:46

And so I thought it was a way that I could, you know, in the time left, I've got in my career a way that I could really try and help other people out. So the motivation wasn't primarily the consulting company. It was, I want to carry on traveling and going to conferences and that's worked out really well this year.

Matt Godbolt

13:09

So you mentioned approval testing, what are approval tests?

Clare Macrae

13:14

Approval tests are... It's this really strange, sort of, it's a small body of code that turns out to be fantastically powerful. And it's easiest to explain by describing something that I think a lot of people end up inventing themselves and rolling their own, which is: you've got some code that already exists and you want to add a test for it and you can't break it down into small chunks. So you call some function that generates a lump of data, and then you write it out to a text file and you save that as your master version, your anointed version. And then you make your test repeat that and do some kind of diff. And if it changes in future, then, um, you get this great big wall of output that says these 5,000 characters differ from these 5,000 characters.

14:05

And then you cut and paste it into a diff tool and you try and work it out. Um, so that's kind of the fundamental that it's based upon. And lots of people invent their sort of homegrown versions of it. I know I certainly have done. And what approval tests is, is I guess I might call it an abstraction built on top of that, but it has a lot of sensible behavior built in by default. So for example, if there is a failure, it pops up a differencing tool, and it plugs into differencing tools. And so if your differencing tool shows you these five characters on the left, and your new file differs with these 10 characters on the right, it makes it much easier to decide whether, oh, I've made a mistake, I need to fix my code. Or, oh, that's good, the new feature I just implemented has worked, I'll use my favorite differencing tool to copy the received, the what-we-just-got, over and anoint that as the new version.
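
As a rough illustration of the homegrown pattern described here, the following Python sketch saves output against an approved file, compares later runs with it, and returns a line diff on mismatch. It is not the Approval Tests library's API: the names `verify` and `approve` and the `.approved.` / `.received.` file naming are assumptions made for this sketch.

```python
import difflib
import tempfile
from pathlib import Path


def _received_path(approved_path: Path) -> Path:
    # Hypothetical naming scheme: report.approved.txt <-> report.received.txt
    return approved_path.with_name(
        approved_path.name.replace(".approved.", ".received.")
    )


def verify(actual: str, approved_path: Path) -> list:
    """Compare output with the approved file. An empty list means 'pass'."""
    received_path = _received_path(approved_path)
    # A missing approved file is treated as empty, so the first run fails.
    approved = approved_path.read_text() if approved_path.exists() else ""
    if actual == approved:
        received_path.unlink(missing_ok=True)  # remove any stale received file
        return []
    received_path.write_text(actual)  # keep the new output for inspection
    return list(
        difflib.unified_diff(
            approved.splitlines(),
            actual.splitlines(),
            fromfile=approved_path.name,
            tofile=received_path.name,
            lineterm="",
        )
    )


def approve(approved_path: Path) -> None:
    """'Anoint' the most recent received output as the approved version."""
    _received_path(approved_path).replace(approved_path)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as folder:
    approved_file = Path(folder) / "report.approved.txt"
    first_diff = verify("total: 10\n", approved_file)   # nothing approved yet
    approve(approved_file)                              # accept current output
    second_diff = verify("total: 10\n", approved_file)  # now matches
    print(len(first_diff) > 0, second_diff == [])
```

A real tool would hand the two files to a graphical diff program instead of returning diff lines, which is the convenience being described in the conversation.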

Matt Godbolt

15:13

It's funny the language we use for that. Cause we always call that blessing. And you say anoint. There are some very religious overtones to this kind of thing.

Clare Macrae

15:19

[inaudible]

Matt Godbolt

15:19

This holy sacred text which is like what I was expecting to get to be.

Ben Rady

15:23

You have to have a special hat.

Matt Godbolt

15:24

I wonder what it is at all that makes us sort of go towards those things. The other thing is, like, golden is the other sort of term I've heard for these, golden tests and stuff. And certainly for giant, I know like GCC's internals have like thousands and thousands of test cases they've collected along the, uh, along the years, and like, well, this is what it should come out to be. You know, we found this weird situation where it was broken, and rather than, well, maybe as well as writing individual tests to test those components, we're like, well, this is the piece of code the user had, and then this is what it should have generated. And so it's a really powerful thing, but to plug it into a diffing tool, presumably this is something that you can configure, so my CI build can just go "no" if it doesn't match, and how does, yeah.

Clare Macrae

16:06

Yeah. So it recognizes, uh, through environment variables, if it's running on a bunch of well-known, uh, CI systems. And in that case, it writes out a text diff rather than, uh, um, spinning up a GUI tool. Cause obviously you don't want to block your, um, your CI system if you happen to have a graphical tool installed on it. Um, so, uh, yeah, Llewellyn is a big fan of convention over configuration. So this library, this approach, is implemented in Python and C# and many different layers of .NET and, um, loads and loads of different languages, with the same vocabulary describing the steps and the options and the configuration on each of them. So I've actually made quite heavy use of the Python version myself for home projects. Um, but it turns out that even though it's a really simple approach, and, and for the C++ version, it's written in vanilla C++11.

17:11

So we have no, um, kind of process control or anything like that. So it's got a list of 20 or 30 diffing tools that it looks for in standard locations and on the path. And you can always tell it to use other ones if you want. But it turns out that although it's a really simple idea and it's incredibly powerful and convenient, it's also really possible to write code, write tests, that generate walls of output that hide the detail that you're actually testing and that become fragile to maintain. And the worst possible thing is to have it set up so that it writes out too much output and not all of the developers understand the purpose of the test, because then if you get a test failure and actually it showed a bug, somebody comes along, sees the differences, and thinks: I want to make that test pass.

18:06

I'll just approve the new output. And they've lost the signal that actually there was something wrong. So there turn out to be lots of nice patterns, won through experience, of what information you choose to write: only focus it on the information that's relevant to a particular test case; include the inputs and a description of what's happening in the output, as well as what the actual outputs are; make columns line up so it's easy to glance at; and make sure developers understand the difference between seeing a bug and improving the output. And then with that comes incredible power and incredible convenience for testing legacy code and hard-to-test new code. Right?
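
The CI detection mentioned earlier in this answer could look roughly like the Python sketch below. It is hypothetical: the function names and reporter labels are invented, and the environment variables listed are ones that common CI services are known to set, not necessarily the ones the library itself checks.

```python
import os

# Assumed markers: GitHub Actions sets CI and GITHUB_ACTIONS, Jenkins sets
# JENKINS_URL, TeamCity sets TEAMCITY_VERSION. The real library's list may differ.
CI_ENV_VARS = ("CI", "GITHUB_ACTIONS", "JENKINS_URL", "TEAMCITY_VERSION")


def running_on_ci(env=None) -> bool:
    """Return True if any known CI marker variable is set and non-empty."""
    env = os.environ if env is None else env
    return any(env.get(name) for name in CI_ENV_VARS)


def choose_reporter(env=None) -> str:
    """Pick a plain text diff on CI, and a graphical diff tool otherwise."""
    return "text-diff" if running_on_ci(env) else "gui-diff-tool"


print(choose_reporter({"GITHUB_ACTIONS": "true"}))  # text-diff
print(choose_reporter({}))                          # gui-diff-tool
```

Passing the environment in as a parameter keeps the decision itself easy to test without touching the real process environment.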

Ben Rady

18:49

Yeah. I was going to ask, it sounds like this is a tool that you would normally use on a legacy code base, right. And like something that was never designed to be tested. Um, but you also said you were using it in your, some of your personal projects. So is this also something that you use for, for other things that are not legacy code?

Clare Macrae

19:06

Yeah, so I guess there's at least three broad areas for it. So certainly, um, you've got, uh, a lump of code that you can't yet divide up and write unit tests. And so if you can find, um, a place to hook into it, um, to, to run the code that you need to run and that, again that comes with practice, that's, that's, uh, um, a big use of it. It's also really useful. So an example I give is on my PC at home. I have some Python scripts. I use, I download, uh, bank statements and, um, munge the data so I can import it into, um, the finance software I use. Okay. No big deal. And when I wrote it, I kind of had this little personal nagging doubt that I ought to be writing tests for this, but I was never going to, and then I learned about approval tests and I added an old downloaded set of transactions into my, because of course it's version controlled because why wouldn't you?

20:07

And then I run approval tests on it. And whenever I rework the software, which isn't that often, I can rerun the approval tests and see if the behavior has changed. So I'm not sure whether I would do that sort of thing in a commercial environment, but for small projects where you want the convenience of knowing if you've broken it and you don't want to have to write small sort of user centered tests, that's really good. But it's got other things built in. Like, it's fantastic if you've got large numbers of inputs. So if you've got some function that takes, for now, say six or eight arguments, it's got a thing called "verify all combinations" where you can pass in a container with a set of values for each of the input parameters. And again, there's skill about formatting the output and so on.

20:56

So, but you know, in a few lines of code, if you're looking to get good test coverage, even of new code, it's very, very quick to keep running it through a test coverage tool and add a new data point, add a new data point to that existing array. And so what if you end up with, uh, a file with 20,000 lines of output, uh, because your diffing tool is only going to show you the differences, and you'll often get it by seeing the patterns in the failure. Oh, look, it's all of the values where parameter three is negative. That's where it's gone wrong. Okay. That tells us where to look. Um, so yeah, it's fun and it's exciting. And it, um, it answers some of the questions that you, between you, have touched on in some of the earlier episodes of, yeah, how would you even begin testing here?
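
To make the "all combinations" idea concrete, here is a minimal Python sketch that runs a function over every combination of input values and writes one aligned line per call, with the inputs included next to the result. It does not use the library's own "verify all combinations" helper; `all_combinations_report` and the example function `clamp` are hypothetical names chosen for illustration.

```python
from itertools import product


def clamp(value, low, high):
    """Hypothetical function under test: restrict value to [low, high]."""
    return max(low, min(value, high))


def all_combinations_report(func, *input_lists) -> str:
    """Call func with every combination of inputs; one output line per call."""
    lines = []
    for args in product(*input_lists):
        try:
            result = repr(func(*args))
        except Exception as exc:  # record failures instead of stopping
            result = f"raised {type(exc).__name__}"
        # Right-align each argument so the columns line up in a diff tool.
        arg_text = ", ".join(f"{arg!r:>4}" for arg in args)
        lines.append(f"({arg_text}) => {result}")
    return "\n".join(lines) + "\n"


report = all_combinations_report(clamp, [-5, 0, 5], [0, 1], [3, 10])
print(report)
```

Three values, two values and two values give twelve lines here. The resulting text is what would be saved and compared as the approved output, so adding one more value to any list extends the coverage without writing another test.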

Ben Rady

21:47

Yeah, when it comes to legacy systems in general, there's a lot of chicken and egg problems that you run into where, you know, in order to make the code testable, you have to be able to change it confidently. And in order to change it confidently, you have to test it. And it's sort of, what do you do there? Yeah. This whole approach actually kind of reminds me of a technique that I've used for a while when writing simple bash scripts, where I will, if you're familiar with the watch tool in Linux, uh, it will run a command over and over again, and there's an option for it to show diffs. So it'll highlight the differences in the output from one run to the next. And so if I've got a bash script that is fairly simple, right? There's not really any branches or anything, it's just processing a bunch of stuff and putting it on the screen. I'll just run the bash script over and over again in watch, you know, at like a two second or three second interval, and then edit the script and watch for those differences. Now, you know, I have to be quick...

Clare Macrae

22:42

Hmm.

Ben Rady

22:43

But, for simple things that actually works really well. And I almost wonder if, you know, being able to for more complicated things, um, being able to pipe the output through a tool like this and run it using, um, something else like, um, probably like with entr or some other sort of, uh, file system event based tool where it's not just running every few seconds it's running only when I make a change, uh, could be another way to sort of test. Cause bash scripts are notoriously difficult to test. Like there are ways and they're all terrible. Um, and so I wonder if this could be yet another tool in the toolbox of, I have a 500 line bash script, that's like incredibly important to my company and if it ever breaks, we're all screwed, but I still need to change it today. So what do I do?

Clare Macrae

23:30

Yeah. Yeah. That's really interesting. So, um, I guess, uh, one of the things I like about bash is you can turn on all sorts of warnings and make it fail if you have, say, a use of an unset variable.

Ben Rady

23:42

set -euo pipefail?

Clare Macrae

23:44

Yeah. I don't do it often enough to memorize the runes, but I can always find it really, really quickly. Yeah. One of the things I'm quite envious of in some other languages than, um, uh, than C++ is tools like NCrunch, so tools that are built into IDEs, and they're always running your tests as you're typing. So, you know, it comes back to your, your rule of eights, and you need your tests to be really, really fast, or to be able to select only the tests for the area that you're working on. Um, so that, um, what was that tool that you mentioned, um, that runs, would you say it was watch?

Ben Rady

24:26

Oh yeah. Watch. So when I'm working with Bash scripts, I'll generally, I mean, I actually do this, I use watch for lots of things. It's a super useful tool. Um, but it just runs a command over and over again. And there are options to show, uh, like differences from one set of output to the next, and it can be really useful for, uh, all kinds of things. But yeah, running bash scripts is a great use of it. And just seeing those diffs.

Matt Godbolt

24:51

More generally as well, you can use it for, like, uh, there's an option that says exit when the output changes. So it runs the command once and then caches the output, and then it keeps running the command every two seconds until the output changes. So if you've got like a directory you're waiting for a program to write something into, you can do like watch, and I forget the command line, dash dash quit on change, ls. And then you walk away from your computer. You come back and it's like, oh, it completed, when someone has dropped the file in the directory you are expecting, which is like a super poor man's, uh, watching tool, a file system monitoring tool. But, uh, there's a number of things you can do with that, you know.

Clare Macrae

25:25

Makes me want to play around, um, with, if I've got, you know, small C++ projects I'm building, and as I'm typing, I want to rerun the tests. If I had watch building the code and running the tests. And obviously, cause, you know, modern IDEs, you can make them save as you type, then you don't even have to remember to run the tests in the IDE. You, uh, just have watch up and running and you can keep an eye on it, you'll see when your tests pass or fail. That's interesting.

Matt Godbolt

25:57

Yeah. I mean, you do something like this, I think, Ben, when it comes to, like, Python.

Ben Rady

26:01

Yeah. The times when I've worked in C++ too, I actually do this with another tool, which I think I mentioned, called entr, where I have entr watch my, um, like, you know, resulting test binary, and then whenever it changes, I run it. So it's like, I'll just kind of count on the IDE to do the compilation for me. Um, but then whenever it successfully compiles a new test binary, entr will run the tests. So even if my IDE doesn't support automatically running tests, I can kind of make it do it in a terminal. Um, and it's just one of those things where it's like, you know, why do two steps when I can do only one, right? Especially if I'm doing it all the time. Um, so if I can turn, you know, Control+S into the feedback loop that tells me everything that I need to know about my code, whether it's, you know, did it compile successfully? Okay, great. Uh, you know, did the tests run? Okay, great. Um, it just makes life easier. And, um,

26:57

Yeah, those, those tools are super easy to use and install, cause it's just, you know, apt install entr and point it at a file and tell it what command you want to run. And Bob's your uncle.
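
The feedback loop described here, rerunning a command whenever a file such as the test binary changes, can be approximated in a few lines of Python. This sketch polls the file's modification time, which is a simpler technique than the file system events that entr relies on; `run_on_change` and its parameters are hypothetical names for illustration only.

```python
import subprocess
import time
from pathlib import Path
from typing import List, Optional


def run_on_change(
    watched: Path,
    command: List[str],
    poll_seconds: float = 0.5,
    max_runs: Optional[int] = None,
) -> List[int]:
    """Run command each time watched's modification time changes.

    Returns the exit codes of the runs. With max_runs=None it loops forever,
    like leaving a terminal open beside the editor.
    """
    exit_codes: List[int] = []
    last_mtime = None
    while max_runs is None or len(exit_codes) < max_runs:
        mtime = watched.stat().st_mtime_ns if watched.exists() else None
        if mtime is not None and mtime != last_mtime:
            last_mtime = mtime
            completed = subprocess.run(command, check=False)
            exit_codes.append(completed.returncode)
        else:
            time.sleep(poll_seconds)  # nothing new yet, wait and look again
    return exit_codes
```

A call such as `run_on_change(Path("build/tests"), ["build/tests"])`, where the path is a made-up example, would rerun a freshly compiled test binary every time the build replaces it.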

Clare Macrae

27:08

Really exciting.

Matt Godbolt

27:10

The very first version of Compiler Explorer was in fact just a watch and GCC and a bunch of pipes. So it's a... yeah. It's a very valuable, uh, prototyping, web development prototyping tool, it turns out.

Ben Rady

27:26

Yeah, totally.

Matt Godbolt

27:29

So I've got a question about how you'd start approval testing. How does one go? You mentioned, for example, earlier, like making a procedural set of, like, run all these six different parameters and here's three different values for the A parameter, and true and false for B, or whatever it is. Obviously that's going to generate your, you know, 200,000 line output, which you correctly say, you know, like, if you catch a mistake, hopefully, if you've done it right, it's a subset of those lines in your diff tool. It shows you the lines that make sense. But when you go from nothing to that, how do you, do you just kind of go, I hope there's no bugs to begin with, let's just bless whatever comes out the first time we run it, and then we're using it more to see if I changed it?

Clare Macrae

28:07

Yes. Yeah. So built into approval tests is the idea that, the first time you run a test, of course you don't have the approved output. It doesn't know what it is. And lots of diffing tools don't like it if you give them a non-existent file on the command line. So the first time you run a test, and it derives the file names from the names of your tests, so it comes up with sensible names, you don't have to think about any of this. You can change the name if you want, but by default it does sensible things. So it says: I see you haven't got an approved file for this. I'll create an empty file and then show you the empty file and the current output, side by side. And when you're working with legacy code, code that exists, and you're trying to sort of lock down the behavior by writing tests, the right answer is always the current behavior, because usually at that point, well, you either want to add a feature or fix a bug. Maybe you need to do some refactoring first, but you want to make sure that you don't accidentally change the behavior.

Matt Godbolt

29:13

You lock in the existing behavior and say what, even if it's not, doesn't make sense, even if it's not exactly what the customer wants at the moment, I'm in the, do no harm stage of the sort of cycle I, I make my, my, uh, my test, I lock in that behavior and then I maybe go do my refactoring. And then I run my test again. Then if it all still passes, and I know I haven't changed anything that I had done before. That makes perfect sense. I mean, it's a standard sort of cycle, except that instead of me having to invest the time thinking about what is it, I must test about my code in order to be confident that the changes I'm about to make don't make it, I'm kind of using a sort of global view of like, well, just the, as long as we execute the code path and as long as the representation of the output is to your point, uh, captures all of the things that are interesting about what I want to test, then I just need to run it and output that. And that's my, my starting point.

Clare Macrae

30:06

That's exactly right. And it's, it, wouldn't be unusual at that point for you to see something in the output and go, ah, that looks wrong! And so, um, don't mix up changing the behavior with refactoring, the code, those uh those actions need to be kept separate. And I heard, I've heard some fun war stories about, and, and if you think there is a bug, don't go and change it until you've spoken to somebody who knows the product.

Ben Rady

30:36

Yeah. People have been relying on that bug for years. Don't go changing it now.

Clare Macrae

30:42

Yeah, exactly. Hyrum's Law. Yes.

Ben Rady

30:43

Yeah. Yeah. How do you deal with, um, the sort of like, I mean, I can think of a certain categories of things where you might get tripped up, like time, you know, uh, things that are things that vary by time, uh, unique identifiers, uh, GUIDs, you know, potentially depending on what the input values are, like, how do you, how do you deal with those kinds of things in the textual output?

Clare Macrae

31:04

Um, yeah, that's a good question. So the approval tests approach has a vocabulary for code to deal with that situation called scrubbers. And so the kind of, um, so, uh, the way you do it is you, you know, you run a test, you approve the output, then you run the test again and it fails and you see there's a date and timestamp, that's different, for example, well, we have various helpers that you can use to say, well, if it's a GUID, we have a thing which says convert any, anything that matches a GUID regular expression to some kind of placeholder text and even says, it keeps track of the GUIDs it's matched. So if the same GUID, it appears three times, it will say, I think it's in square brackets or something like that, GUID one. And then the next different it's, if it comes up with a different GUID, it says, GUID two.

31:56

Date and time is a bit harder because of locales and things like that. It's out of the scope of our project to deal with every locale worldwide. So we've also got regular expression functions that you can call: you pass in a regular expression and then some replacement text. And so you can say, convert everything that matches this date and time regular expression to something, perhaps in square brackets, that says date and timestamp. So we've focused on having helper functions to make it easy to do that, and you program the logic that you want for the pattern you want to replace. And the conventional answer with date times is, well, you create a policy object that reports what the current date and time is, and you inject that all the way down your code until the actual point where it's used. And then who knows what you've broken on the way and how much time you've spent. Whereas the approval tests approach is: no, just write out the file and then reread it afterwards, and munge it in any way you want to sort of beat it into submission, to serve your needs and save your time. It's just a fantastic, different way of looking at things. Um, but I love it.
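The scrubber idea can be sketched in a few lines of Python using only the standard `re` module. This is an illustration under stated assumptions, not the API of any approval tests library: the function names (`scrub_guids`, `scrub_regex`, `scrub`), the placeholder spellings (`[guid_1]`, `[date_and_timestamp]`) and the timestamp format are all made up for the example.

```python
import re

# Canonical 8-4-4-4-12 hex GUID layout.
GUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
# Assumed timestamp format for this example only: 2021-02-16 14:03:59
DATETIME_RE = re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")


def scrub_guids(text):
    """Replace each distinct GUID with a numbered placeholder.

    The same GUID always maps to the same number, so the scrubbed text
    still shows which lines refer to the same object.
    """
    seen = {}

    def replace(match):
        guid = match.group(0).lower()
        if guid not in seen:
            seen[guid] = len(seen) + 1
        return f"[guid_{seen[guid]}]"

    return GUID_RE.sub(replace, text)


def scrub_regex(text, pattern, replacement):
    """Generic helper: replace everything matching pattern."""
    return re.sub(pattern, replacement, text)


def scrub(text):
    """Apply both scrubbers to the raw output before comparing it."""
    return scrub_regex(scrub_guids(text), DATETIME_RE, "[date_and_timestamp]")


raw = (
    "2021-02-16 14:03:59 created order 3f2b8c1e-9d4a-4c6b-8e2f-1a2b3c4d5e6f\n"
    "2021-02-16 14:04:01 shipped order 3f2b8c1e-9d4a-4c6b-8e2f-1a2b3c4d5e6f\n"
    "2021-02-16 14:04:02 created order 0a1b2c3d-4e5f-4a6b-8c7d-9e0f1a2b3c4d"
)
print(scrub(raw))
# [date_and_timestamp] created order [guid_1]
# [date_and_timestamp] shipped order [guid_1]
# [date_and_timestamp] created order [guid_2]
```

Note that the production code is untouched: no clock object is injected anywhere, and the varying parts are normalised only in the text that gets compared.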

Ben Rady

33:13

Uh, have you ever used this technique on things that aren't particularly well-represented as text? I mean, you know, if you base64 encode anything, you can represent it as text, but say, like, podcast audio, or video, or a picture, or something like that, where the textual representation isn't particularly informative.

Clare Macrae

33:32

Yeah. So when I was working on a 3D visualization program for visualizing crystal structures, I had to add a new style of visualization that users had been asking for, for decades. And we finally made the time to implement that. And I was on quite a tight loop of, I don't even know the maths at this point, so I'll have a first stab and then run a few hundred or a few thousand crystal structures through it and see which ones crash. And, um, so I used approval tests for that. So my approved and received images were PNG files, I think. And at that point I was never going to be able to commit those to our version control system, which was already several gigabytes in size. Um, but I was able to run it repeatedly on my machine, and it made for really good conversations with the product owner, who was a fantastic product owner, really helpful, really responsive.

34:30

And, uh, sometimes he would say, yeah, that matters, we need to fix that. And other times he would say, people using that kind of structure are not going to be using this display style, so it doesn't matter. And if I hadn't had that conversation, I would have tried to make it look pretty, rather than saying, well, at least stop it crashing, but don't try and make it look nice. The other thing I've seen it used a lot in is audio. And there seems to be a pattern of people asking for help with testing audio outputs, and it turns out there are some nice approaches around that. Say that what you're generating is an audio wave. You'd save that as a bunch of numbers, and that would be the right answer. But that's really hard to understand when there's a failure. Well, it's easy to build in extra code very quickly that says, and also generate a picture of this. So, some kind of visual representation.

Ben Rady

35:28

Is this like ASCII art?

Clare Macrae

35:31

Well, I mean, if you've got some software that can generate, like, the audio wave... um, and you probably wouldn't version control those, you wouldn't need to; the text output is the master thing. But if you get a failure, approval tests has a concept called "reporters", and you can say: as well as popping up a difference of the numbers, also convert them both to something that you can visualize, and then open them up in a diffing tool that visualizes differences in images really well. And you can show it to the domain expert, who says, yes, that matters, or, actually, no human is going to be able to distinguish between those two, just accept the new answer. So anything that you can create any kind of visual representation of is really amenable to this.
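A rough sketch of that two-representation idea, in Python. The numbers are the thing that gets compared; a picture is generated only to help a human judge a failure. ASCII art is used here purely as a stand-in for a real image and an image-diffing tool, and `wave_as_text`, `wave_as_ascii` and `report_difference` are hypothetical names, not part of any approval tests library. Samples are assumed to lie in the range -1 to 1.

```python
def wave_as_text(samples, digits=3):
    """The 'master' representation: one rounded sample per line."""
    return "\n".join(f"{s:.{digits}f}" for s in samples)


def wave_as_ascii(samples, height=5):
    """Crude picture of the wave: one column per sample, top row = +1."""
    rows = []
    for level in range(height - 1, -1, -1):
        row = ""
        for s in samples:
            # Map a sample in [-1, 1] onto a row index 0..height-1.
            idx = min(height - 1, int((s + 1) / 2 * height))
            row += "*" if idx == level else "."
        rows.append(row)
    return "\n".join(rows)


def report_difference(approved, received):
    """Hypothetical reporter: on a mismatch, show both waves as pictures."""
    if wave_as_text(approved) == wave_as_text(received):
        return None  # nothing to report, the test passes
    return (
        "approved:\n" + wave_as_ascii(approved)
        + "\n\nreceived:\n" + wave_as_ascii(received)
    )


print(wave_as_ascii([0.0, 1.0, 0.0, -1.0]))
# .*..
# ....
# *.*.
# ....
# ...*
```

Rounding in `wave_as_text` is itself a design choice: it decides how small a numerical change counts as "no human could tell the difference".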

Ben Rady

36:23

Interesting. Yeah. I can imagine that the, uh, combination of factors there could definitely get challenging, where it's like, okay, I have date and time sensitive information inside of an SVG file that I've generated. And the only way to see that the differences are material is to open them up in two SVG editors and compare them and say, like, yep, that makes sense, versus, like, no, this isn't really important. And then building all the tools to filter all that stuff out so you don't get false positives.

Clare Macrae

36:55

Yes. Yeah. There's another implementation of approval tests that's in Python, called, um, I want to say text diff, but I'll make sure the right link is in the show notes in case I've got that wrong (http://texttest.sourceforge.net/). Um, and it allows you to do things like approving web pages and PDF files and things like that. And it does that by building on the power of Python to convert those files to a text representation. So for example, you've got a PDF and, yeah, someone might want to inspect the styling, but mostly you want to make sure you haven't changed the content of it, the text content. So you run a Python tool to extract and nicely format the displayed text content, and then use that for approving. So once you sort of free your mind from, well, there's this exact output and I have to make sure it doesn't change, and move more to, well, what do users, or perhaps domain experts, need to see in order to understand whether there's a meaningful change or not, it becomes super powerful.
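To make the "convert it to text first" idea concrete, here is a small Python sketch for web pages that uses only the standard library's `html.parser`. It is an illustration of the general approach, not how TextTest does it; the class `VisibleText` and the function `html_to_approvable_text` are names invented for this example.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect the text a reader would see, ignoring markup and styling."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # normalise whitespace
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_approvable_text(html):
    """Return one line per visible text fragment, ready to be approved."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


page = (
    "<html><head><style>p { color: red }</style></head>"
    "<body><h1>Invoice</h1><p class='x'>Total:   <b>42</b> GBP</p></body></html>"
)
print(html_to_approvable_text(page))
# Invoice
# Total:
# 42
# GBP
```

With this representation, a change to the stylesheet or to a `class` attribute leaves the approved text untouched, while a change to the displayed total shows up as a one-line difference.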

38:07

And depending on the time that you've got, you could leave the testing being done via approval tests, you could leave a few tests in as integration tests down the line, or maybe you want to use it as a placeholder on the way. And I use it to learn about the behavior of the code by feeding in lots of different inputs, and then write down, um, kind of business logic based, user-expressed tests. So, yeah, depending on your scenario, approval tests might be the end point or it might be a step on the way.

Ben Rady

38:49

Yeah. I mean, I would argue in most cases that, if you can make that additional transition and go from something where, you know, you're using a sort of capture based testing solution, whether it's approval tests or, um, you know, there's other tools for this as well, to something where you can have more targeted tests, like, in general, that's good. But I certainly wouldn't want to recommend that everybody do that because, uh, given the state of certain legacy code bases, that just is not economically feasible. Um, but yeah, it's an interesting line, and sometimes it's hard to find those lines. I guess one question would be, in your experience, how long do these tests take to run? I would assume it's kind of highly dependent on the system, right?

Clare Macrae

39:40

Yeah. Yeah. So if you were running incredibly fast calculations through them, then I think the majority of the time would be the file IO, because it writes out the current output, and then it finds a matching file, and then it reads it and it says, is it equivalent? And if you've put in a regular expression, it's applying that, and then it launches a diff tool if it's different. So there is a certain amount of overhead with that. And I do know the theory of, uh, if it touches the file system, it's not a unit test, but the people I'm talking to have no tests at all and no confidence in changing their code. And so at that point, any test is better than no test.
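The sequence of file operations described here can be sketched as follows. This is a simplified illustration in Python, not the real implementation: the `verify` function, its parameters and the `.approved.txt` / `.received.txt` file names are assumptions for the example, and where a real tool would launch a diff tool, this sketch just returns `False`.

```python
import tempfile
from pathlib import Path


def verify(name, received_text, directory, scrub=lambda text: text):
    """Compare received_text with the approved file for this test name.

    Returns True on a match. On a mismatch (or when nothing has been
    approved yet) the received file is left on disk for inspection.
    """
    directory = Path(directory)
    approved_path = directory / f"{name}.approved.txt"
    received_path = directory / f"{name}.received.txt"

    # 1. Apply any scrubbing, then write out the current output.
    received_text = scrub(received_text)
    received_path.write_text(received_text, encoding="utf-8")

    # 2. Find and read the matching approved file, if there is one.
    approved_text = None
    if approved_path.exists():
        approved_text = approved_path.read_text(encoding="utf-8")

    # 3. Compare; clean up on success.
    if approved_text == received_text:
        received_path.unlink()
        return True

    # 4. A real tool would launch a diff tool on the two files here.
    return False


with tempfile.TemporaryDirectory() as tmp:
    first_run = verify("greeting", "hello", tmp)      # nothing approved yet
    # "Approving" means promoting the received file to the approved file.
    (Path(tmp) / "greeting.received.txt").rename(Path(tmp) / "greeting.approved.txt")
    second_run = verify("greeting", "hello", tmp)     # now matches
    third_run = verify("greeting", "goodbye", tmp)    # behaviour changed
```

Each call involves at least one file write and usually one read, which is where the per-test overhead mentioned above comes from.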

Ben Rady

40:23

Yeah, absolutely.

Clare Macrae

40:25

Um, and so we haven't talked about, well, what if you can't pull out a usable chunk of code to throw, uh, data in and, and run through approval tests? Uh, so there's a whole extra skill that, that comes there with, um, learning enough about safe refactoring to be able to, and how much you trust the refactoring tools in your IDE, um, and that you aren't using some obscure corner of a language where you've, I don't know, got inconsistent assignment operators or something, you know,

Matt Godbolt

41:00

Who would write in such a language anyway, that would allow that to be a thing?

Clare Macrae

41:06

Um, so it turns out the more that you look at this stuff, the more experience there is out there, and the more there is to be learned about using IDEs to, uh, separate out chunks of code into functions that you can call and reuse. Um, I haven't attended it yet, but Llewellyn Falco and Jay Bazuzi, in the States, have been running some workshops on how to safely get code that has no tests under test, by almost entirely doing completely automated refactoring. So at each stage, you see what the next tiny change you want to make is, and they do amazing transformations of code. Um, it sounds really exciting, and they're running a series of those right now, and I'm really looking forward to attending. It happens to be in C#, but because you're using the IDE to do the work, all you need is to be able to read C#, and that's no problem.

Ben Rady

42:14

Um, yeah. Yeah. Certainly when I was working at Object Mentor with James, uh, that was, our preferred technique, or at least my preferred technique for doing this in Java, um, and part of this is just the Java IDEs are so powerful, but doing this in Java was you just rely on the automated refactorings and, you know, read the code that was produced, like make sure that it also makes sense to you. Don't just blindly hit buttons, but, um, rely on the automated refactorings that come from, you know, IntelliJ or Eclipse to be semantically equivalent in most cases, and try to find the minimal number of automated refactorings it takes to get the tests in place. And then from there you can start refactoring the code that is covered by the tests manually. Um, have you, I mean, I would imagine that that kind of technique would not only be useful for unit tests, but also for approval tests. Like, you know, have you, have you done that where you're sort of like using automated refactoring techniques to create a hook for these kinds of things?

Clare Macrae

43:18

Yes. Yeah. Um, the main IDEs I use these days are CLion for C++ and PyCharm for Python. And the refactoring tools work so much better in Python than in C++, and they tend to be quite a lot quicker as well. So it becomes a case of learning what tools work well. And perhaps occasionally you need to move a variable around before you divide the code up, so it's got less scope to search. But, um, so yeah, that is something that I have been enjoying learning about. And, um, by coincidence, I'm going to be putting my money, or my time, where my mouth is, coming up soon. So on the 16th of February, I'm doing a live webinar for JetBrains. I'm talking about using refactoring in CLion to add tests from no tests. That builds on an earlier webinar that Arne Mertz did a few years ago. Um, he was showing CLion refactoring tools, and I learned a bunch from watching his video, but at the end, somebody said, what if you don't have tests? And I thought, well, that's a good thing to explain, right?

Matt Godbolt

44:37

Your experiences can come in and show, well, this is how you can get tests from essentially from nothing almost right. You can use the existing behavior to build your tests.

Clare Macrae

44:46

So if you're listening to this podcast, sometime soon after the 16th of February, the video of that will be available. And the main focus of that is going from no tests to having tests that you can trust and have confidence in. And then in March, I'm doing a talk at the ACCU conference online conference, and that's much more specifically about refactoring and getting the IDE to do your work for you. Uh, so there'll be a lot of, a lot of practicing and a lot of experimenting and different IDEs and comparing the refactoring tools and so on.

Matt Godbolt

45:19

So you mentioned actually, just going back a little bit here that, um, the approval test library that you've been working on is open source. So, uh, presumably we can go and find out what, remind me what it's called and where we might find it.

Clare Macrae

45:32

So the approach is called approval tests, and there's a couple of good URLs to go to. Uh, approvaltests.com has got a bunch of links from there, and many of the implementations are in the GitHub user "approvals" (https://github.com/approvals). Uh, so GitHub slash approvals is the other possibility. And you'll see all the different languages that are supported there.

Matt Godbolt

45:53

Right? So there's more than just C++ there. You said Python before now and, um, many other languages as well. So this sounds like something which is just, as you say, a generally applicable approach. And, um, given the same sort of set of... what was it, "scrubbers" was one of the terms that you used. And then, um, what was the thing that does the transformation from the text output to something else for diffing at the end, like you mentioned with the audio example? What was the name of that?

Clare Macrae

46:22

So the vocabulary it uses for diffing tools is "reporters". So that's how it shows differences, and you can write your own reporter that converts to a different file format and then does the diffs and things like that. Um, looking at approvaltests.com, it's got the logos of about 12 or more different languages: Java, C#, C++, PHP, Python, Swift, JavaScript, um, Lua, Objective-C, Ruby and, um, Perl.

Matt Godbolt

46:54

Perl?

Clare Macrae

46:54

Yeah.

Matt Godbolt

46:55

I mean, I guess it makes sense. It's still going. That's fantastic. So how can people find out more about you, and how can they contact you, um, online?

Clare Macrae

47:05

On Twitter, I'm ClareMacraeUK, and my website is claremacrae.co.uk. And I would say, I love testing challenges. If you've got small testing challenges, just contact me and we can chat, perhaps share code online or something like that. I'm also available for training and consulting on code as well. But, um, I'm in the lucky position that this is my fun and my hobby. Um, and, uh, I love sharing what I've learnt, so help me learn more.

Ben Rady

47:37

Yeah, sounds great.

Matt Godbolt

47:38

That's fantastic. Well, thank you so much for being with us today and yeah, it's been absolutely brilliant to have you.

Clare Macrae

47:43

Thank you. It's been huge fun, and great to meet you too, Ben.

Ben Rady

47:59

Great to meet you, Clare.

Transcript source: Provided by creator in RSS feed: download file
