Hey Ben,
Hey Matt,
How are you doing?
I'm doing great.
Glad to hear it. So we've been talking a bit about performance and I've realized that kind of my career has taken me from hacking around in assembly through to trying to coerce compilers into doing all the work for me. Right. That seems to be the journey that I've taken. And I realized that I, there are people still that are much, much better at making the compiler, do all the heavy lifting for, for us than I am. And so I figured we should talk to somebody who can tell us all about the kinds of things you can do if you really, really work hard at the, the compiler and, and some of the tricks you can do with C++.
Yeah, well, that sounds cool.
So
And, and so we have a guest today.
We have a guest, we have a guest, there's another window on our screen, which no one else can see, obviously, but so surprise to nobody on this call, but surprise to our listeners: We have a very special guest today. We're joined by Hana Dusíková. Hi, Hana.
Hi,
How are you doing?
Thank you so much for, for agreeing to do this. We are excited to have you here.
It's it's too late here, but it's, it's great. And last few days was kind of up and down, so I'm great.
Oh good. All right. I'm glad to hear that. And thank you as well for, for dealing with the, the time zone difference. We, we, we are happily in our lovely early afternoon, whereas you are into way into your evening, so we appreciate you taking the time. So I just wanna introduce you to our guests, our guests!? No introduce you. I wanna introduce you to our listeners. You are our guest. You can see how often we do this. So Hana, you are a staff scientist at Avast and you represent the Czech body for C++, right? Is that the right way to say that?
Yeah, it's that's correct. And also I'm an Officer here in C++ committee: I'm chairing SG7, which is study group for Compile-Time programming and static reflection.
That's not a, that's not a, a sci-fi series, right? I thought
S no, no, I figured maybe it's prequel to Terminator probably.
Well, we'll get to that in a minute. I'm sure. But yeah, I'm always confused because: was it WG 21 is the C++ standards committee. Is that what its name is?
Yeah, it's Working Group 21.
And then SG is special group or
Study group
Study group. Okay. So you're SG7, which is compile-time programming and what was the second part?
Static reflection,
Static reflection. Wow. These are all sound like very exciting things that I want now in my programming language.
I think it's, it's really, it's a really boring topic,
It's it? But now you tell us!? We just invited you here to talk!? About these kind of, well, we talk about all things, but
We talk about a lot of the boring stuff on this podcast. Other people seem to find it interesting. I don't understand why, but this is just Matt and I talking about the stuff that we always talk about.
I mean we can just listen to Matt talking and it's exciting. I think at CppCon, you did, you did like 30 minutes. Standup comedy was it?
I don't think that's true. I mean, maybe that you just thought that my, my presentation was just so funny, but no. Well, yeah. I don't
I think you, you did great because you cannot like you didn't any feedback and you just go and go,
Well, normally the feedback is "please stop talking, we've had enough," or at least that's what I read in people's faces.
No,
Bless you. So my understanding is when you're not hacking on C++ and dealing with the committee stuff, you, you're also a keen photographer and a Greyhound owner, but there's something cool about kind of photography you do, right?
Yeah. I'm doing panoramic photography. It's like these spherical pictures, like Google StreetView before Google Streetview was cool actually.
Right.
And it's quite technical. It's not like, it's not like typical photography. You need to have like a tripod, you need to have like special head for that. And also wide l. And you need to take, I think usually 30 pictures and make sure no one is moving or no, like not mean moving like going between pictures. Because then someone is cut in half and it looks, or
You get two copies of the dog that was running at the same time through your picture, right?
Yeah, exactly. But you can also like erase them if they move between pictures, there is like always some overlap. So you can like remove all picture, like almost a whole crowd.
That's amazing.
If you have time and patience.
In, right. Which it strikes me as if you're writing C++, especially the kind of template C++ the perhaps or one writes yeah. You might have a lot of spare time and patience.
Yeah. Right.
So what, what got you into Panoramic photography? Because, I remember once taking photographs, like with my, my, my handheld and like using some photo panorama stitching software to kind of put it together. Is that the same kind of thing, but just more,
Yeah. It's actually really similar and it's the technical part is to rotate your lens around a nodal point in aligned. So there is no parallax error and then it's really simple.
Oh, I see.
So this is like the, the center of the tripod is creates a point and the lens is always around that as opposed to just turning the lens and then you the, the point. Oh, okay. That, that makes sense. That's interesting.
Usually when you, when you do panorama with like a phone, or like take it in your hand and do like stuff like this, right?
Yeah, yeah,
Yeah. It's hard to like to describe on a podcast or
Yeah, no, but the thing where you hold your phone out and slowly pan it from one side to the other.
Yeah. It makes error. But actually Google is using it in Andoid to extract depth from pictures.
Oh gosh. Interesting. I had no idea they were doing so much. Like I just take photographs and then every now and then Google Photos says, Hey, I made a Panorama out of four photographs. You just happened to take together. That looked like they could have been put together. And I, I have no idea how they do that.
Yeah. But like I approach is a bit different. It's not machine learning at all. It's just processing power and it's a lot of, lot of, Gigapixels.
And yeah. Yeah. I you've sent me some images before that have barely, almost crashed my browser because they're so huge. So what, what kind of resolutions are we talking about?
We are talking about limits of a JPEG
Right. I didn't know JPEGs have limits. What, what they
Like. Yeah. Yeah. It's 16,000 something and it depends on implementation. So
So ther's a uint16 somewhere is this is like a width and height.
Interesting.
Well today I learn.
And there is actually funny thing, not every implementation supports the whole specification of JPEG. And it depends about the last five pixels does the like difference between good implementation and bad implementation. The last five pixels is like in middle of the like this square in JPEG,
The, the macro blocks, right. There are 16 by 16 or whatever. Yeah.
Yeah. And not, not every implementation is able to handle it.
How peculiar? I would've thought this was a completely solved problem by now, but you know, obviously not.
Yeah. I found out when I like tried to load JPEG in Apple preview and it was only in only gray, nothing else.
That's amazing. I just thought, you know, JPEG's a JPEG, I know it's a complicated format and I'm really excited by like the clever tricks that JPEG uses to sort of get the compression levels that it does. But I didn't realize that it was much space left over for them to, to get things wrong. Turns out this computing lark is pretty difficult after all getting things. Right.
Yeah. Then then I ask Sean Parent that like what's, what's going what's going on
Sean Parent of Adobe Photoshop fame, right?
Yeah, yeah. You should, you should invite him, right?
Yeah. He,
Yeah, he has always some story. And he told me that he makes sure that the JPEG encoder and decoder in Photoshop makes all the corner cases and it was really hard.
I, so have you been sending him test images now or is it, was it already handling all of these things, right. You
Not, not to him, but of to Tony Van Eerd. Yep. He's working on a projection and he asked me for some big picture. So I sent him one, one of them and he told me it crashed their application and he had to do some fixes
That's that's for really cool, I mean, images. Right. I know, you know, every now, and as I say, like you crash browsers, there, you load up a big image. Like I'm fond of looking at these high resolution, similar Panorama, actually similar technology, funnily enough where people have de-capped silicon. So you, you use acid to etch off the top of the chip and then you'd use a microscope and you take loads and loads of photographs, really highly zoomed in, and then you stitch together the die shot of the, of the chip. And it's super cool because now you can start reading off how the chip actually worked for, you know, like 6502-era technology, at least. And at least that's the stuff that I, I can tractably look at and get some idea what's going on. But those images typically are yeah, like they're PNGs and they are, you know, 60,000 by 60,000. And so yeah, you load them into your browser and it, of course the browser immediately shrinks it down. Cause it knows it can't fit that into, but then it's kind of coming down like, like an an old school modem, you know, 14K4 modem. Why on earth in this day and age, is it taking so long for an image to come down and then it kind of goes bump and it's like, Chrome's dead. And you're like, oh, I see...
Or sometimes again in Apple Preview you can open the image. It's totally fine. And, and when you zoom it it's suddenly gray. And if you un-zoom it, it's still gray. Oh.
And gray is like, "I've stopped responding" gray. Yeah, probably I see. Right. That's like, I know Ubuntu does this thing where, you know, sometimes if the application's not listening anymore, it's like, well, I'm gonna render what the last picture you had. But like in black and white to show the user that you are not the pictures there, but the application isn't.
Yeah.
It's funny. It is. I mean, it's really interesting these things because you know, we, in our sort of day jobs, we, well, I, I speak for myself and maybe for Ben here and obviously Hana, your, your jobs are probably are very different, but like memory is kind of something we don't worry about as much anymore. Like we work on service side stuff for trading. So, you know, we just put more RAM in the machines and we're fine. Right. But when you're dealing with images that are, as, as that, you have to write your software differently, you don't necessarily want to, or can even rely on say the virtual memory system doing a good job because it can't possibly page things in and out as well as you can, if you know how your software is going to access the image in the first place. And I don't think I'd ever really thought about that.
I think this applies to every domain. If, you know, if you knew a semantic of your algorithm well enough, you will always be better than some generic algorithms,
I guess. So I guess so. And
That's probably true.
So talking of knowing more about your domain and a very gramaphone-needle-scratch-noise segue. Yeah. One of the really cool things that I know you do is work on, as we say to the, in the intro, some, some compiled time programming things and you're most well known for, or at least I'm, I know you most well through the, a regular expression library, the compiled time, regular expression library.
No, that's not true. I'm actually are most well known from my slides.
That is also true. Yes. But the slides, if I'm not wrong, are the things that explain the tricks that you use in your CTRE library. Yeah. That's so, yeah, we should. All right. We should let you know, we'll put some, some things in the notes about this, but Hana's slides are absolutely legendary and have basically changed the way people do presentations in the C++ space because.
They are super interactive and she'll show like a parse tree of a, you know, what looks like a photograph or a picture, a nice rendered picture of a parse tree of a regular expression, including like, this is this node, this type, here's the multiple of this one or whatever that kind of like tree view. And then everyone's like, oh, this is cool. Now we can see how this works. And then she'll like single step through a matching a string and different things are highlight. And you're like so far, so you just hack this in PowerPoint and you just done all manually. And then she'll just edit live, give me a, give me a regular expression and she'll type it in. And the tree changes and it's like live updating all the stuff. It's the best.
Oh! That's pretty amazing.
I mean, yeah, it's definitely something to aspire to. I know when I did a, a redo of a presentation on like memory accesses, I spent a whole bunch of time trying to do similar highlighting of like, this is where the cache comes in and this up, and then this cache line lights up like this.
Oh yeah.
Do you remember that presentation? Right. And it's a pale imitation of what Hana is able to do. So, so yeah, Hana, what, how are you doing all of that stuff? And
I'm actually programming it, right. So implemented a really simple version and a really hacky version of expression in JavaScript too, to be able to hook them into like into the D3 library to draw the nice diagrams and visualize it. And I spent lot of like month
You, reimplemented the same thing you'd written in C++ in, in your own particular style, which we'll talk about in a second about why it's so cool that you've done it that way. Wow. You wrote it in JavaScript just so that you could draw the slides to then totally answer how you did the C++ mode
Also. Yeah. Because, because it was less work than making slides, like with some animations, because then I changed something and I need to like redo everything or no, no way.
Right. That's I I've
Still, I was gonna say, and I guess because of the nature of the library would be difficult to like transpile that, right. Like with the, what is the thing that I'm thinking of?
Graphviz?
No, no, no, no, no. Like the, the regular expression, you reimplemented the regular expression library to make this work in JavaScript. Yes. Yes. So like there, I don't know why I'm blanking on the name of this thing right now, but there's this the tool that you can use to, to basically compile oh,
C++ into JavaScript, so like WASM and ?
Stuff. And I think
Twitter, yeah. That thing, like now I'm blanking on it. Yeah. Emscritpen!, And so on. Yes.
Yes. I
Was
Actually, but the whole point is, is that you're doing compiler to make that work. So like the chances of that being supported by the, you know, WASM compiler is basically zero,
Right. The thing is the WASM compiler will happily do it. It's just that the everything happens at compile time.
Exactly.
So you have to actually put the compiler in the webpage in order to then actually change it!
Of course. Yes. It's too late. Yes.
Who would be daft enough to put a compiler on a webpage anyway?
So you actually, for, for like visualization of the compile time library. Yeah. You need to have runtime library if it hooks through it. So it's a different engine.
Tell, tell us about the compile time nature of it and bear in mind, you know, like, I, I, I'm pretty good at the C++ stuff. Ben is got a working knowledge and we don't really assume that our listeners have much of a thing. We try and draw 'em from a wide community. So what is compile time programming. Okay. and, and why would, why, why should we even know what, why, why should we care, I guess?
Okay. since I learn about C++ at first I didn't like templates.
This is, this is why we can be friends,
Super complex, hard. Yeah, exactly. And then I read somewhere that they are actually Turing complete. So, and someone in some article jokingly said, you should be able to make regular expressions in templates.
Oh.
And it bothered me in my mind somehow how I would do that. And I was think like through many years I was like experimenting with some parts only, like then I finally was able to match only like string against another string. And, and then I learned more and more, and then suddenly I was able to match like, a regular expression, but not like regular expression in string, but like expressed as a template, like like the expression tree of the regular expression, but included as a template type name in C++. So it's like, oh, repeat angle, brackets, string character one, character, B character C.
Got it. And so youHave to kind of, as a, as a programmer writing the regular expression, you, you actually had to write the encoded regular expression as the tree that it would ultimately be rather than just the string that we know and love, but yeah, regular expressions
Tolerate, we love and tolerate, we know and tolerate.
And actually, I like the, I, I like the form because it's more expressive than like, like regular expression, which is
Likes going on. They're very terse aren't they? And that's, it's, it's nice if I can just imagine seeing one of these and knowing that the, one of the things that, that you could put in this are comments inside your regular expression in C style comments. So at least explain what on earth's going on, which is harder to do, or at least with the, the standard regular expressions.
Yes, exactly. And then I was thinking how to convert a string into this form, and that's simple LL(1) parser and I
Simple, you've just said simple, you glossed over something, which is very challenging.
Like the only data structure you have in template meta programming is a stack.
Okay.
You can you can have type type list and you can other type type in front and type remove type from front. That's all you have.
So this is kind of very LISP-like primitives that you have a access to. Right. But
When you have stack you actually Turing complete, then you can do whatever you want.
Wow. I don't think I registered that. That was the only thing you really needed in order to be able to, to, to do this kind of stuff.
I mean for Turing Complete, you need to have two stacks, but yeah.
Okay. Right, right. And just so just be clear, right? The, the thing that you're sort of stacking here is that you're making a new type, which is specialized in different ways and you're concatenating onto a type. So this is all in the type system of the compiler. There is no actual stack that you are manipulating. That's gonna be in your runtime program. It's all in the compiler at compile time. So
Yeah, it's a type. And if you other like type representing your current state in a parser, and then you add a type representing one character on input,
Right.
And modifies to another type which represent different state of the parser. Then you apply this repeatedly in in a like recursive way in a function. Yep. And at the end, it will, your result, which will give, give you only, this is regular expression. Or this is not a regular expression. That's like, step one. Okay. Then you like, so then
Let me just try, run that past you again. So you build, you parse this thing and you're building this type up and you're sort of pushing and popping as you're... ...Following the LALR parser rules. And at the end, it either succeeds and you get back, do you get back a bool or do you just get back a type that is the tag type of saying, this is a valid regular expression, or this is not a valid, regular expression?
Yeah. You can fail in multiple fashion. You can fail in some horrible like error in compiler. Also you can propagate boolean that expression is wrong. And then you can have only one static assert checking the boolean and with a nice error this is wrong. This is an incorrect expression.
I see. So static assert is an assert that runs that compile time. Right. It's just either true or false, and there's a compiler error. If the, whatever you passed at a static error is not a constant known-at compile time and true.
Yeah. It must be know at compile time.
Right.
Right, right. And then next step, when you have a parser...
Iso this is just the beginnings. Okay. Right.
And then next step is to have some, like context as a third parameter during parsing: another stack where you based on the symbol in the grammar do some modification on the stack.
Okay. So you've got one, that's just like the parse stack yeah. Which you were using just to sort of like note, keep track of where you are, how many open parens and close parens; that kind of thing. Yeah. And the other one is kind of where you're putting, I'm guessing this is gonna be like where the actual sort of compiled regular expression is gonna
For like typical expression grammar. You can have like a number and number and then a action plus. And you take two items from top of the stack and wrap them into number into, into plus operation. And the plus operation is actually can easily map to the first stage like the expression as expression and by expression. So you have like concatenation. So there is, operator concatenation when you just take two elements from the top of the stack, grab them and place them back to the stack.
Okay. So it's like a, like a stack machine that you would write if you're like, even like you're writing, like a maths, very simple maths parser-type thing, you know, three, two push, two push to add
You've ever used a TI calculator or
Texa Instruments calculator from like that.
Yeah, exactly.
But, but instead like multiplying numbers from input, you actually store the operation with the numbers itself.
Okay. So you kind of just keep it as that. You don't actually add them. You say this is an additional operation with one and one or whatever the equivalent. Okay.
Yeah. The input is actually from the grammar, the LL(1) parser will give you, give it, you a stream of operations in postfix notation.
Okay. Yep.
So then it's just two atoms. Concatenation atom, concatenation atom conation and then you have another, sub-expressionand then at the end there is a multiple options or, "or", "and". And then you, you will have the expression as a tree as a type.
Got it.
And then you can, easily run because you just call the operator, evaluate over the type and it'll, go recurively in all branches is, and at the end, it'll give you true or false if the input matches this regular expression.
So at this time, at this point, we have transitioned from compile time to run time. Right. So that the yes, again, right. I'm just gonna write, read this back to you to make sure I've got this straight because my head is spinning, but the, the result of the parses is accumulated into this, this, the second stack that you're talking about, which effectively ends up being a bespoke type, who's only responsibility in the entire world is to hold the compiled operations that will match the regular expression that was given.
Right.
So if I say "A|B" it's a tree representation of the A and the B, and the "OR", or something, something ish like that. And then that type I can now at runtime give a string and it will do the matching of A or B in this instance. That's amazing because presumably that means if I make a mistake, I think, well, as you just said, if I make a mistake in my regular expression, I don't have to wait till runtime for it thing to throw an exception saying, oh, by the way, that's not valid. It happens at compile time,
But that's actually the first phase.
Right. That's the bit we were just talking about before with the boolean or the, the tag type, right? Yeah. And then, so what implications does this have for, and what's the difference between say, you know, me just using a normal, normal, regular expression
Your compiler knows everything you are doing, and it can inline everything. And it's actually going down to assembly, like compare, compare, compare, compare, "ret".
So really does net out all this sort of very complicated sounding, sounding highfaluting type based stuff. Once it's all boiled down to like actual code generation, the compiler just essentially writes the handwritten code that you would've done yourself. If you have said, okay, you, this needs to match either A or B.
Yeah.
And it's just the two compares in an or, or whatever. And we're done. Wow!
My like first approach, because I, when I wanted regular expressions in like, in with a very performant one, I was thinking about using LLVM to write the code and use the LLVM optimizer.
Right.
LLVM is really hard to understand. And So I knew C++, so I write it in C++ kinda, and then optimizer will do it for me anyway.
That's a, so you still get to use LLVM if you're compiling with Clang, you already are using LLVM, but it's just, you're using it through the medium of, of normal C++ as opposed to like, were saying, if you were to, you could use LLVM directly yourself, in which case you'd build up nodes and you'd then build a dag and you'd build the comparison operations and the basic blocks that are doing the comparisons and whatever. And then you'd say to LLVM, Hey, can you generate me the assembly for that? Which would still be a, which would allow you to do some of that at a run time, I suppose, which is the only advantage that you might need if you actually wanted to have dynamic regular expressions, which is usually a security risk, right. If you let somebody type in a regular expression, you're probably doing it wrong. Like run time-based.
I think even, even evaluating a regular expression at run time is security risk,
Right? Because depending on the regular expression, you could have some absolutely catastrophic backtracking stuff going on where it could just blow up.
Not even backtracking. You can just run out of stack.
Right. Right. I suppose so potentially unbounded. Right. I had don't think I thought about that. That's, that's actually mildly terrifying given that my website is, is entirely regular expression based in terms of all of the horrible things...
But it's more mostly around in the clients. So it should be fine.
Yeah, no, there's plenty of stuff that's going on on the server, unfortunately.
OK good luck!
But, you know, like, I don't think a regular expression overrun or stack smash is my biggest source of security risk, given that I just take arbitrary bits of user code and run them on my servers. So, you know, anyway, that's
Also a true fact fact.
...That's Yeah, so that's amazing. So compile time is, is letting us push stuff that would traditionally be done with like a, a static offline parser generator stuff, and then a, a code generator, like you said, like LLVM, or even just writing your own stuff and being able to hoist it into mostly sane looking C++, you know, like I have a very low threshold for template meta programming.
Yeah. But like, from a user perspective, it's just CTRE::match
So from a user's point of view, it looks, it's just a very slightly different way of looking at use, like the angle brackets instead of the round brackets.
I think Corentin [Jabot] says they are called Hana brackets because of that. But
Hana brackets are well, that's what I'm gonna call them. That's we, we've had all sorts of funny names of the things over the time. Like, I, I I'll, I'll call them chevrons, you know, like the things that like are and then operator like this and, and no one can see this, of course, on the podcast, but I'm doing like two crocodile hands facing one way is like the output operator. And then the other way is the input operator. And, and then a good friend this
Like alligator, and this is crocodile, something
Like that. Yeah. I dunno how that works. What do you call that? I mean, we've stopped using these except for their, you know, as, as nature intended to shift variables around you know, bit shift stuff, that's the right thing. Hopefully everyone's using fmt, libfmt; this the now nowadays for any kind of actual string manipulation. And then the only other one that springs to mind is something that a, a colleague of mine and Ben's used to call mummy and daddy duck and their baby ducks, which is whenever you have to use double ampersands and then the dot-dot-dot in the template list, "&&...". And it's like the, the, the two mummy and daddy ducks, and then the little little baby ducks. And I think so now I can't unsee that now every time it's not very often I can write code that needs that level of like thing, but it it's there. I, what do you call that? It's like a variadic R-value...
It's perfectforwarding and a variadic pack of arguments.
Gosh, that sounds like tongue twister
I'm probably incorrect and probably someone from [WG21] Core said to me you should call it universal references. You should call it r-value references or no,
Stop talking about it. Double duck. Yeah.
Double-Duck operator
Double duck is best.
Yeah, I think, I think you all
Right.
No, that's cool. So, I mean, other other things, I mean, regular expressions are probably the only the beginning of this, right? Like the, in terms of like very small grammars that make sense to be able to compiled into a program. You know, we've all kind of realized the string matching is painful. If you have to write it yourself and, and regular expressions for all the warts that we were joking about earlier, they can be a very efficient way of encoding something you want to get done. And certainly CTRE gives you a pragmatic way of saying here's a string matching thing that everyone understands. And yet I don't have to pay any kind of runtime performance costs. I get perfect assembly output. That's gonna match it as fast as anyone could possibly write handwritten stuff. What else can we do?
Yeah: I was gonna say like, does this general technique, I mean, obviously for regular expressions, you've, you've, you've got this library, but does this mean that you can do more things generally at compile time using this technique? Like, you know, I, I can imagine like, you know a few situations in which you might want to have other things calculated at compile time and be like, oh, it'd be super cool if I could have a regular expression here, but I can't because it's too early in the process. And I mean, I don't know, that's sort of just my, my naive impression, but is, does this open the door to more of those kinds of things?
You can calculate anything you want, if you don't have any input output. And we we are working on input too, so you can actually open a file or read, read it, and then do some transformation and then store it in your binary or generate code based on it. It's from JeanHeyd [Meneide] "embed".
The std::embed thing. Right.
Oh, okay.
This is long, you know, we've always wanted to have one has always wanted to have like a a configuration file kind of thing where, you know, you wanna have something which isn't code necessarily, but does go, go into the code. That's a brilliant way of doing it. I can just imagine, you know, JSON parsers, YAML parser things, being able to generate bespoke types potentially. I mean, template based types, I guess.
Yeah. Not, not YAML please
Not YAML why I like, YAML, what's wrong with YAML Hana?
You need to write a parser for it. And then I will ask again,
Okay. Right. Subset of YAML I get that. The, the ampersand thing that lets you refer to other parts is probably are very, very tricky, but maybe there's something else I'm not thinking of. Yeah. All right. But I, I don't like JSON, can I just say it out loud? Right. It's not very fashionable. I know JSON is like the Lingua Franca of everything, but like, I can't put comments in JSON and that makes it really hard for me to write anything where I'm explaining why some, you know, like package.json for like a, a node project. Right. You have to have a package.json. It has to be valid JSON. Not only does it have to be exactly valid JSON, like I can't have a trailing comma and I have to always quote everything all the time. I can't explain that, "I bumped a package version, please don't do you know, this is why this is it".
I don't directly depend on this package. I want package, but I need version three. Otherwise I get a security warning. And I can't put a comment on the line saying, I don't care about this package, but it needs to be three or greater or something like that. And there's just written everywhere. And I know I used to have a trick where I would always just put an extra key of like "__comment". And then I could put a comment in the middle of a dictionary. Just hope that no one cared that there was the thing. Anyway, rant about JSON over.
I think actually like parsing JSON is like 200 lines of code in C++ I recently write one. So
That, I mean, I've, I've heard, I know Ben Deane, and ironically of course, Jason Turner,udid a, a presentation a long while ago on, on parsing JSON and I, I kind of like, it feels like that's great and it shows the, the technique off, but I would never want to do that because I hate JSON so much. Well, I like Jason Turner, just to be clear if you listen to this...
Jason, the person not JSON, the file format.
Yeah. Right. No, I mean I'm so I'm wondering what other things, I mean, could one write a C compiler?
What?
I think someone did already
Get out !
A few years ago, there was some, someone on reddit with a constexpr C compiler.
That's absolutely daft. I, I love it. I mean, I've seen people with, you know, the brain language, brain F language that I won't swear on on our podcast, but that one I've seen that that kind of makes sense. I can see how that could go together, even like with my primitive understanding of template trickery
But actually it's not like it's not template meta programming. It's actually constexpr.
Oh, so now we, yeah, yeah. Go on. Tell us about the, so what's the difference between template and constexpr?
Yeah. Template. Meta programming is actually abusing the proprieties of your language. They came with templates to make easier write some containers and etcetera, and then someone found they are actually Turing complete.
I see. So that was never intentional. Like, like,
No, that's not never intentional.
My, my naive thing of like, I want, you know, I wanna write an array of, or a vector yeah. And expanding a rate of type T that's what I think of templates. Right. And you're like, well, that's not Turing Complete. That's just like, yeah. Fill in the dots with the T in all the places. Yeah. Right.
Then there is a constexpr programming, which is just really bad name for this function should or may be evaluated at compile time. And then you are writing just ordinary C++. So we can write, actually write a really easy code, really simple code. And mostly in most cases you can just add a constexpr modifier to all your functions it'll work anyway. So if you, if you implement that C compiler; just mark everything constexpr and it should probably work.
Wow. That's I mean, so that's where we've come along from. So what, what use was a kind of amusing hack that was discovered by folks back in the day of like, Hey, this is accidentally T complete, which, I mean, I think so many things and life end up being accidentally Turing complete. Right. to constexpr being sort of brought into language as a way of tagging, just normal functions. And if I remember rightly the first few additions of languages, the language that had constexpr, it was super limited. You could only like have a single return statement.
They were afraid of that,
That right. You're right. You, they were afraid of this. And so you could write your constexpr multiply two numbers together, which was, you know, constexpr int multiply(int a, int b) { return a * b; }, that was all you could do. And you go, oh, there, well, that's really exciting. I was like, well, what does that help me here? And
You, you, you could do like recursion, but you can do like conditional conditions. No, you can, you can do any like cycles and then allow, then they allow it and they allow it more in C++ 20 we allow allocations. So you can actually allocate, you can use a std::vector. Now you can use
In constexpr time? So this is just a normal C++ program that's running in the compiler effectively. And the output of it can then be used in things that are in the compiled program. Wow.
Yeah. Actually cannot return a std::vector from the function to out from the compiled time, because there is like problem that if you allocate something in compiled time where it'll live in runtime. I see. So there is like the so
Yeah, dynamic allocation that you did during compile time can't live on into the lifetime of the program because it's not dynamic anymore. Right. How do you delete the memory that was like, well, it was in the compiler.
Yeah. All your allocation must be freed when you are leaving the constexpr evaluation context. Right. But you can, is, is it inside of your function to make the algorithms much easier, simple than just using templates and tricks? I see. You can actually implement your own stack with vector,
With actual normal code, rather than like cons-ing on the side of some LISP- like thing that's getting bigger and bigger and bigger. And so you're no longer abusing the type system to hold the internal state of your algorithm. You can just freaking variables, right? Like, like, like a normal person would do. That's so cool.
Exactly. And then you can implement the parser in a much more ordinary way and it'll be much faster. And I mean not like one or magnitude, but like exponential versus linear. And
This is what run times speed or you talking compile time speed. So the compiler itself is much better at, I dunno, understanding the code that it normally reads all day every day to compile our code than it is dealing with the abuse of the template system and all the instantiations that requires.
Yeah. For example, if you have like regular expression, which is few kilobyte longs crazy, by the way.
I've seen the ones for like the "match email addresses" and they look a bit like that.
Yeah. That's yeah. Yeah. And if you try to compile with a CTRE I don't know if it ever end And it usually it crashes on insufficient memory.
Right: I can imagine because of what, but
Then I made a prototype of CTRE using constexpr parsing, which is like limited limiting also the template instantiation. And then I tried few megabytes-long regular expression, and it took like half second or something to compile.
Oh wow.
That's such a big deal. That is huge. Yeah. Like we've talked a lot, Ben and I, one of the things of the themes of this of this podcast is like, my impatience waiting for code to build. And, and Ben's impatient waiting for tests to run. Right. That's all like our shtick here.
That's kind of true.
Yeah. And, and so having tools in our toolbox that allow us to use these advanced techniques that are coming down the pipeline, or even here already that don't also suck in compile time is very important to be close to my heart. So I'm really excited to hear these. That's amazing.
Yeah. And you can actually easily test your calls,
I was gonna ask.
Yeah, exactly. And your code won't ever compile. If it's like, if there is any problem, you can just run it in compile time and place it into static_assert like, like a require in test. And if, if you are okay, it'll compile, if not, your compilation will fail.
So you, the compiler runs the test implicitly as it's, as it's compiling that's that sounds like Ben's heaven.
Yes. Yeah. Well, it's like, so the, the, the, the answer to the question of, should I check things with the compiler or should I check things with test is yes.
Yes. Both. All of the above. I mean, that's great,
But actually you should, that's both in C++ 20, you can actually have different code, which will be run in time and in, on time, in same function. Oh,
Wow.
Ah, yeah. This is something I wanted to talk about, actually, that is the, one of the, the things about constexpr is that it can, any code you write that's constexpr can also be run dynamically. So for example, if you call a function like my, a multiply routine that I just sort of said, and you needed it at compile time, the compiler will obviously evaluate it with all the constants available and promise you that, like, you know, you've got a, an array of size 10 because you did two times five, right? End of story. But if you call it with dynamic parameters, it will just generate the code and do it like normal. So there in and in the Hana's just reminded me or remind us, you can now ask the compiler, am I being evaluated inside the compiler? Or am I being evaluated at run time? Like, am I in the Matrix or not? Yeah. And then you can change the answer, which of course is like terrifying.
And you shouldn't do that. You shouldn't do that.
You shouldn't do that. So when my, all right, you said you shouldn't do it, but why would you need to know? Yeah.
Sometimes it's really hard to implement something in constexpr limitations. It's usually recursive algorithms, but in runtime you can use some intrinsics, for example,
I see.
Or inline assembly.
I see. So, like, for example, if I had a matrix multiply routine four by four matrix multiply routine, then of course I want to be able to do that at constexpr, who knows, I might be rotating some points or something, that's in a big list of things, but if I'm running on the real time system, I actually want to use the, the, the underlying x86 architectural instructions myself to do it. So you need to be able to switch on the that. So
You can have like two parts of one for compiler and one for your runtime.
So back to your point that you made, which is that, you know, having the compiler run, your test is fine provided you don't have any code that differs. But if you do have code that differs, you should also test in with the normal techniques, which brings me into the next question I have actually about this, which is how do you debug this? Because I'm used to putting break points and printfs everywhere. And I don't think the compiler lets me do that's or maybe it does?
Yeah. there, there is no tooling for that. And like one trick I found is to create a, a template, which is undefined like only a declared if I'm correct.
Okay so you just a, there exists a thing called Bob, but you don't give a body. So it's like the compilers can't do anything with Bob. If you try to do,
It's actually a template
Bob. Okay. So Bob is, yeah. Sorry, this is terrible. Now I use Bob for anything (or Ian is the other one), but Bob is of T right. And you know, so you've got a T that you can give it. Okay.
And if you try to instantiate it somewhere, like it's like print for debugging. Yeah. It'll fail in compilation. It'll tell you, there is no instance of
Print for debugging of, and then whatever you put inside the, so yeah. You can do your yeah. You can throw an exception kind of thing with a, with a helpful message. But the helpful message, unfortunately is just a type name inside a deliberately left out bodied.
This is like function good for debbugging. But for like when I wrote CTRE I start from the parser and I wrote a lot of tests to make sure it's working properly. And then I built on that. So, so when I change something it'll immediately break. So I know something is not right, and right. It helps. So obviously test helps.
Well that that's very on brand for us. So I'm glad, glad to hear that. Even, even when the compiler's doing all the work for you testing is useful, but
You're actuall writing code. So you need to test it. If you, if you don't test it, then it's like meaningless.
Absolutely.
So how long until we have a template programming language testing framework?
Actually I think Catch2 other like some like macros, which will force evaluation something in compile time to check it. So there definitely are some tools. All right. Like my, but in CTRE I was lazy and my tests are actually bunch of static asserts. And if any of the CPP file fails to compile, then there is a problem.
Right. Right.
And every time I get some like issue or mostly issues, then I try to like, make it happen, the problem. And then I fix it
It right.
If I try to fix it before that, I don't know what I fixed.
Yeah. Yeah. That never works out. Gotta watch it fail first. Yeah.
Yeah. Start from a failing test, you know, and then work backwards otherwise. Yeah. You, so
Actually it's exactly same as normal programming. Right? Exactly. Just like awkward.
Yeah. Yeah. I guess I've gotten so lazy with debuggers, like actual, like gdb and whatever that I'm. So, you know, I'll happily write something without testing it as much as perhaps I ought to, or at least certainly if I'm trying to understand someone else's code, you can really do far wrong than single stepping through and kind go and getting the feel of it. And I feel that like, there perhaps there's tooling for that, that would be useful.
I think you can always hook into compiler with GDB.
That's true. You, if you can understand what's going on inside GDB using, sorry. Inside GCC using GDB. I mean, I wouldn't surprise me that you can understand that Hana, frankly, but no,
Definitely. I cannot.
I have to ask by the way can, so there, there is a library called HANA already. That's a template meta programming library. Are you who, who first, who, who, who stole, were you before the library or were you named after the library?
I was born before the library.
Yeah. And then they stole your name?
No, no, no, no. There is no connection at all.
I find that hard to believe. I mean, there's certainly some nominative determinism of going on here. Surely, you know, you, no, now I,
It took me a while. It was like, at CppCon I did my lighting talk about CTRE and people were like talking like, you must know Louis [Deonne]. Oh,
That's right. It's Louis', right? Yeah, yeah,
Yeah, yeah. Out of boost HANA, no, I what's going on. I didn't know. So
You didn't know that there is like, yeah. A big compile-time sort of pre-processory I actually, I don't even know what HANA is. I'll be honest with you. I know you, but
It's like a style of using templates, but without you actually using knowing that you are using templates, I see using like function, argument deduction tricks. Okay. And actually last week we had no, might be this week. I don't, I'm not sure anymore.
No, what time it is or what day?
Yeah, yeah. Yet study group seven meeting. And we were discussing like future of reflection. And we were reminding people that boost HANA style or Hana style function were rejected. Right.
And as you saying this, right?
No, it was Daveed [Vandevoorde] actually, but it's funny,
But no, no, literally no connection. That's funny. I, it is, is easy.
Maybe it was prophecy?
Maybe it was prophecy that, well, take it as, see, that seems like a great, great point to end on here as we've got to time. But thank you so much for being with us today. This has been really, really interesting. And I think I've finally gotten, I mean, so full disclosure before now, I've been to several of Hana's presentations on this kind of stuff, and I've never really had it stick in my head the way even when no, no
...Matt was my Remote Clicker.
So before, before it was fashionable to do presentations, remotely. Hana was ahead of the curve. I forget, what was the, the reasoning you, the
My flight was canceled. I couldn't go to the conference. Oh, wow. So I did my presentation remotely, but something failed for, with remote control of his computer, because I wanted to run slides on Matt's computer because they're like animations and everything. And I was supposed to do control it with VNC, but sometimes, sometimes, somehow it fails. Matt was my clicker, so I was telling him like, next slide, next slide, next slide. And then there was like the audience and there was the slide with D 3 expression tree. And I asked Matt, click on the text box and write some regular expression.
So I got to be, do the cool thing with the presentation. And take all the glory. And then poor Hana did all the work. Wow. No, that was, that was fun.
Wow. Oh man.
That was fun, but that's,
And that was actually funny conference and yeah. Yeah. Matt is my clicker.
That's that's my, I peaked, I peaked in my career now. I've been Hana's clicker and I was very proud to be. All right, friends. Well, once again, Hana, thank you so much for your time. And for explaining this all to me, as I say, I think I have a much better understanding now than I think I've had before.
And thank you for inviting me.
Yeah, this is great. Super interesting.
