Compile-Time Programming (with Hana Dusíková) - podcast episode cover

Compile-Time Programming (with Hana Dusíková)

Feb 20, 202250 min
--:--
--:--
Download Metacast podcast app
Listen to this episode in Metacast mobile app
Don't just listen to podcasts. Learn from them with transcripts, summaries, and chapters for every episode. Skim, search, and bookmark insights. Learn more

Episode description

Ben and Matt are joined by Hana Dusíková and discuss panoramic photographs, Matt's career peak, and compile-time programming, including her ground-breaking regular expression library. Links from the show:

Transcript

Matt Godbolt

Hey Ben,

Ben Rady

Hey Matt,

Matt Godbolt

How are you doing?

Ben Rady

I'm doing great.

Matt Godbolt

Glad to hear it. So we've been talking a bit about performance and I've realized that kind of my career has taken me from hacking around in assembly through to trying to coerce compilers into doing all the work for me. Right. That seems to be the journey that I've taken. And I realized that I, there are people still that are much, much better at making the compiler, do all the heavy lifting for, for us than I am. And so I figured we should talk to somebody who can tell us all about the kinds of things you can do if you really, really work hard at the, the compiler and, and some of the tricks you can do with C++.

Ben Rady

Yeah, well, that sounds cool.

Matt Godbolt

So

Ben Rady

And, and so we have a guest today.

Matt Godbolt

We have a guest, we have a guest, there's another window on our screen, which no one else can see, obviously, but so surprise to nobody on this call, but surprise to our listeners: We have a very special guest today. We're joined by Hana Dusíková. Hi, Hana.

Hana Dusíková

Hi,

Ben Rady

How are you doing?

Matt Godbolt

Thank you so much for, for agreeing to do this. We are excited to have you here.

Hana Dusíková

It's it's too late here, but it's, it's great. And last few days was kind of up and down, so I'm great.

Matt Godbolt

Oh good. All right. I'm glad to hear that. And thank you as well for, for dealing with the, the time zone difference. We, we, we are happily in our lovely early afternoon, whereas you are into way into your evening, so we appreciate you taking the time. So I just wanna introduce you to our guests, our guests!? No introduce you. I wanna introduce you to our listeners. You are our guest. You can see how often we do this. So Hana, you are a staff scientist at Avast and you represent the Czech body for C++, right? Is that the right way to say that?

Hana Dusíková

Yeah, it's that's correct. And also I'm an Officer here in C++ committee: I'm chairing SG7, which is study group for Compile-Time programming and static reflection.

Matt Godbolt

That's not a, that's not a, a sci-fi series, right? I thought

Hana Dusíková

S no, no, I figured maybe it's prequel to Terminator probably.

Matt Godbolt

Well, we'll get to that in a minute. I'm sure. But yeah, I'm always confused because: was it WG 21 is the C++ standards committee. Is that what its name is?

Hana Dusíková

Yeah, it's Working Group 21.

Matt Godbolt

And then SG is special group or

Hana Dusíková

Study group

Matt Godbolt

Study group. Okay. So you're SG7, which is compile-time programming and what was the second part?

Hana Dusíková

Static reflection,

Matt Godbolt

Static reflection. Wow. These are all sound like very exciting things that I want now in my programming language.

Hana Dusíková

I think it's, it's really, it's a really boring topic,

Matt Godbolt

It's it? But now you tell us!? We just invited you here to talk!? About these kind of, well, we talk about all things, but

Ben Rady

We talk about a lot of the boring stuff on this podcast. Other people seem to find it interesting. I don't understand why, but this is just Matt and I talking about the stuff that we always talk about.

Hana Dusíková

I mean we can just listen to Matt talking and it's exciting. I think at CppCon, you did, you did like 30 minutes. Standup comedy was it?

Matt Godbolt

I don't think that's true. I mean, maybe that you just thought that my, my presentation was just so funny, but no. Well, yeah. I don't

Hana Dusíková

I think you, you did great because you cannot like you didn't any feedback and you just go and go,

Matt Godbolt

Well, normally the feedback is "please stop talking, we've had enough," or at least that's what I read in people's faces.

Hana Dusíková

No,

Matt Godbolt

Bless you. So my understanding is when you're not hacking on C++ and dealing with the committee stuff, you, you're also a keen photographer and a Greyhound owner, but there's something cool about kind of photography you do, right?

Hana Dusíková

Yeah. I'm doing panoramic photography. It's like these spherical pictures, like Google StreetView before Google Streetview was cool actually.

Ben Rady

Right.

Hana Dusíková

And it's quite technical. It's not like, it's not like typical photography. You need to have like a tripod, you need to have like special head for that. And also wide l. And you need to take, I think usually 30 pictures and make sure no one is moving or no, like not mean moving like going between pictures. Because then someone is cut in half and it looks, or

Matt Godbolt

You get two copies of the dog that was running at the same time through your picture, right?

Hana Dusíková

Yeah, exactly. But you can also like erase them if they move between pictures, there is like always some overlap. So you can like remove all picture, like almost a whole crowd.

Matt Godbolt

That's amazing.

Hana Dusíková

If you have time and patience.

Matt Godbolt

In, right. Which it strikes me as if you're writing C++, especially the kind of template C++ the perhaps or one writes yeah. You might have a lot of spare time and patience.

Ben Rady

Yeah. Right.

Matt Godbolt

So what, what got you into Panoramic photography? Because, I remember once taking photographs, like with my, my, my handheld and like using some photo panorama stitching software to kind of put it together. Is that the same kind of thing, but just more,

Hana Dusíková

Yeah. It's actually really similar and it's the technical part is to rotate your lens around a nodal point in aligned. So there is no parallax error and then it's really simple.

Matt Godbolt

Oh, I see.

Ben Rady

So this is like the, the center of the tripod is creates a point and the lens is always around that as opposed to just turning the lens and then you the, the point. Oh, okay. That, that makes sense. That's interesting.

Hana Dusíková

Usually when you, when you do panorama with like a phone, or like take it in your hand and do like stuff like this, right?

Matt Godbolt

Yeah, yeah,

Hana Dusíková

Yeah. It's hard to like to describe on a podcast or

Matt Godbolt

Yeah, no, but the thing where you hold your phone out and slowly pan it from one side to the other.

Hana Dusíková

Yeah. It makes error. But actually Google is using it in Andoid to extract depth from pictures.

Matt Godbolt

Oh gosh. Interesting. I had no idea they were doing so much. Like I just take photographs and then every now and then Google Photos says, Hey, I made a Panorama out of four photographs. You just happened to take together. That looked like they could have been put together. And I, I have no idea how they do that.

Hana Dusíková

Yeah. But like I approach is a bit different. It's not machine learning at all. It's just processing power and it's a lot of, lot of, Gigapixels.

Matt Godbolt

And yeah. Yeah. I you've sent me some images before that have barely, almost crashed my browser because they're so huge. So what, what kind of resolutions are we talking about?

Hana Dusíková

We are talking about limits of a JPEG

Matt Godbolt

Right. I didn't know JPEGs have limits. What, what they

Hana Dusíková

Like. Yeah. Yeah. It's 16,000 something and it depends on implementation. So

Matt Godbolt

So ther's a uint16 somewhere is this is like a width and height.

Ben Rady

Interesting.

Matt Godbolt

Well today I learn.

Hana Dusíková

And there is actually funny thing, not every implementation supports the whole specification of JPEG. And it depends about the last five pixels does the like difference between good implementation and bad implementation. The last five pixels is like in middle of the like this square in JPEG,

Matt Godbolt

The, the macro blocks, right. There are 16 by 16 or whatever. Yeah.

Hana Dusíková

Yeah. And not, not every implementation is able to handle it.

Matt Godbolt

How peculiar? I would've thought this was a completely solved problem by now, but you know, obviously not.

Hana Dusíková

Yeah. I found out when I like tried to load JPEG in Apple preview and it was only in only gray, nothing else.

Matt Godbolt

That's amazing. I just thought, you know, JPEG's a JPEG, I know it's a complicated format and I'm really excited by like the clever tricks that JPEG uses to sort of get the compression levels that it does. But I didn't realize that it was much space left over for them to, to get things wrong. Turns out this computing lark is pretty difficult after all getting things. Right.

Hana Dusíková

Yeah. Then then I ask Sean Parent that like what's, what's going what's going on

Matt Godbolt

Sean Parent of Adobe Photoshop fame, right?

Hana Dusíková

Yeah, yeah. You should, you should invite him, right?

Matt Godbolt

Yeah. He,

Hana Dusíková

Yeah, he has always some story. And he told me that he makes sure that the JPEG encoder and decoder in Photoshop makes all the corner cases and it was really hard.

Matt Godbolt

I, so have you been sending him test images now or is it, was it already handling all of these things, right. You

Hana Dusíková

Not, not to him, but of to Tony Van Eerd. Yep. He's working on a projection and he asked me for some big picture. So I sent him one, one of them and he told me it crashed their application and he had to do some fixes

Matt Godbolt

That's that's for really cool, I mean, images. Right. I know, you know, every now, and as I say, like you crash browsers, there, you load up a big image. Like I'm fond of looking at these high resolution, similar Panorama, actually similar technology, funnily enough where people have de-capped silicon. So you, you use acid to etch off the top of the chip and then you'd use a microscope and you take loads and loads of photographs, really highly zoomed in, and then you stitch together the die shot of the, of the chip. And it's super cool because now you can start reading off how the chip actually worked for, you know, like 6502-era technology, at least. And at least that's the stuff that I, I can tractably look at and get some idea what's going on. But those images typically are yeah, like they're PNGs and they are, you know, 60,000 by 60,000. And so yeah, you load them into your browser and it, of course the browser immediately shrinks it down. Cause it knows it can't fit that into, but then it's kind of coming down like, like an an old school modem, you know, 14K4 modem. Why on earth in this day and age, is it taking so long for an image to come down and then it kind of goes bump and it's like, Chrome's dead. And you're like, oh, I see...

Hana Dusíková

Or sometimes again in Apple Preview you can open the image. It's totally fine. And, and when you zoom it it's suddenly gray. And if you un-zoom it, it's still gray. Oh.

Matt Godbolt

And gray is like, "I've stopped responding" gray. Yeah, probably I see. Right. That's like, I know Ubuntu does this thing where, you know, sometimes if the application's not listening anymore, it's like, well, I'm gonna render what the last picture you had. But like in black and white to show the user that you are not the pictures there, but the application isn't.

Ben Rady

Yeah.

Matt Godbolt

It's funny. It is. I mean, it's really interesting these things because you know, we, in our sort of day jobs, we, well, I, I speak for myself and maybe for Ben here and obviously Hana, your, your jobs are probably are very different, but like memory is kind of something we don't worry about as much anymore. Like we work on service side stuff for trading. So, you know, we just put more RAM in the machines and we're fine. Right. But when you're dealing with images that are, as, as that, you have to write your software differently, you don't necessarily want to, or can even rely on say the virtual memory system doing a good job because it can't possibly page things in and out as well as you can, if you know how your software is going to access the image in the first place. And I don't think I'd ever really thought about that.

Hana Dusíková

I think this applies to every domain. If, you know, if you knew a semantic of your algorithm well enough, you will always be better than some generic algorithms,

Matt Godbolt

I guess. So I guess so. And

Ben Rady

That's probably true.

Matt Godbolt

So talking of knowing more about your domain and a very gramaphone-needle-scratch-noise segue. Yeah. One of the really cool things that I know you do is work on, as we say to the, in the intro, some, some compiled time programming things and you're most well known for, or at least I'm, I know you most well through the, a regular expression library, the compiled time, regular expression library.

Hana Dusíková

No, that's not true. I'm actually are most well known from my slides.

Matt Godbolt

That is also true. Yes. But the slides, if I'm not wrong, are the things that explain the tricks that you use in your CTRE library. Yeah. That's so, yeah, we should. All right. We should let you know, we'll put some, some things in the notes about this, but Hana's slides are absolutely legendary and have basically changed the way people do presentations in the C++ space because.

They are super interactive and she'll show like a parse tree of a, you know, what looks like a photograph or a picture, a nice rendered picture of a parse tree of a regular expression, including like, this is this node, this type, here's the multiple of this one or whatever that kind of like tree view. And then everyone's like, oh, this is cool. Now we can see how this works. And then she'll like single step through a matching a string and different things are highlight. And you're like so far, so you just hack this in PowerPoint and you just done all manually. And then she'll just edit live, give me a, give me a regular expression and she'll type it in. And the tree changes and it's like live updating all the stuff. It's the best.

Ben Rady

Oh! That's pretty amazing.

Matt Godbolt

I mean, yeah, it's definitely something to aspire to. I know when I did a, a redo of a presentation on like memory accesses, I spent a whole bunch of time trying to do similar highlighting of like, this is where the cache comes in and this up, and then this cache line lights up like this.

Ben Rady

Oh yeah.

Matt Godbolt

Do you remember that presentation? Right. And it's a pale imitation of what Hana is able to do. So, so yeah, Hana, what, how are you doing all of that stuff? And

Hana Dusíková

I'm actually programming it, right. So implemented a really simple version and a really hacky version of expression in JavaScript too, to be able to hook them into like into the D3 library to draw the nice diagrams and visualize it. And I spent lot of like month

Matt Godbolt

You, reimplemented the same thing you'd written in C++ in, in your own particular style, which we'll talk about in a second about why it's so cool that you've done it that way. Wow. You wrote it in JavaScript just so that you could draw the slides to then totally answer how you did the C++ mode

Hana Dusíková

Also. Yeah. Because, because it was less work than making slides, like with some animations, because then I changed something and I need to like redo everything or no, no way.

Matt Godbolt

Right. That's I I've

Ben Rady

Still, I was gonna say, and I guess because of the nature of the library would be difficult to like transpile that, right. Like with the, what is the thing that I'm thinking of?

Matt Godbolt

Graphviz?

Ben Rady

No, no, no, no, no. Like the, the regular expression, you reimplemented the regular expression library to make this work in JavaScript. Yes. Yes. So like there, I don't know why I'm blanking on the name of this thing right now, but there's this the tool that you can use to, to basically compile oh,

Matt Godbolt

C++ into JavaScript, so like WASM and ?

Ben Rady

Stuff. And I think

Matt Godbolt

Twitter, yeah. That thing, like now I'm blanking on it. Yeah. Emscritpen!, And so on. Yes.

Ben Rady

Yes. I

Hana Dusíková

Was

Ben Rady

Actually, but the whole point is, is that you're doing compiler to make that work. So like the chances of that being supported by the, you know, WASM compiler is basically zero,

Matt Godbolt

Right. The thing is the WASM compiler will happily do it. It's just that the everything happens at compile time.

Hana Dusíková

Exactly.

Matt Godbolt

So you have to actually put the compiler in the webpage in order to then actually change it!

Ben Rady

Of course. Yes. It's too late. Yes.

Matt Godbolt

Who would be daft enough to put a compiler on a webpage anyway?

Hana Dusíková

So you actually, for, for like visualization of the compile time library. Yeah. You need to have runtime library if it hooks through it. So it's a different engine.

Matt Godbolt

Tell, tell us about the compile time nature of it and bear in mind, you know, like, I, I, I'm pretty good at the C++ stuff. Ben is got a working knowledge and we don't really assume that our listeners have much of a thing. We try and draw 'em from a wide community. So what is compile time programming. Okay. and, and why would, why, why should we even know what, why, why should we care, I guess?

Hana Dusíková

Okay. since I learn about C++ at first I didn't like templates.

Matt Godbolt

This is, this is why we can be friends,

Hana Dusíková

Super complex, hard. Yeah, exactly. And then I read somewhere that they are actually Turing complete. So, and someone in some article jokingly said, you should be able to make regular expressions in templates.

Matt Godbolt

Oh.

Hana Dusíková

And it bothered me in my mind somehow how I would do that. And I was think like through many years I was like experimenting with some parts only, like then I finally was able to match only like string against another string. And, and then I learned more and more, and then suddenly I was able to match like, a regular expression, but not like regular expression in string, but like expressed as a template, like like the expression tree of the regular expression, but included as a template type name in C++. So it's like, oh, repeat angle, brackets, string character one, character, B character C.

Matt Godbolt

Got it. And so youHave to kind of, as a, as a programmer writing the regular expression, you, you actually had to write the encoded regular expression as the tree that it would ultimately be rather than just the string that we know and love, but yeah, regular expressions

Ben Rady

Tolerate, we love and tolerate, we know and tolerate.

Hana Dusíková

And actually, I like the, I, I like the form because it's more expressive than like, like regular expression, which is

Matt Godbolt

Likes going on. They're very terse aren't they? And that's, it's, it's nice if I can just imagine seeing one of these and knowing that the, one of the things that, that you could put in this are comments inside your regular expression in C style comments. So at least explain what on earth's going on, which is harder to do, or at least with the, the standard regular expressions.

Hana Dusíková

Yes, exactly. And then I was thinking how to convert a string into this form, and that's simple LL(1) parser and I

Matt Godbolt

Simple, you've just said simple, you glossed over something, which is very challenging.

Hana Dusíková

Like the only data structure you have in template meta programming is a stack.

Matt Godbolt

Okay.

Hana Dusíková

You can you can have type type list and you can other type type in front and type remove type from front. That's all you have.

Matt Godbolt

So this is kind of very LISP-like primitives that you have a access to. Right. But

Hana Dusíková

When you have stack you actually Turing complete, then you can do whatever you want.

Matt Godbolt

Wow. I don't think I registered that. That was the only thing you really needed in order to be able to, to, to do this kind of stuff.

Hana Dusíková

I mean for Turing Complete, you need to have two stacks, but yeah.

Matt Godbolt

Okay. Right, right. And just so just be clear, right? The, the thing that you're sort of stacking here is that you're making a new type, which is specialized in different ways and you're concatenating onto a type. So this is all in the type system of the compiler. There is no actual stack that you are manipulating. That's gonna be in your runtime program. It's all in the compiler at compile time. So

Hana Dusíková

Yeah, it's a type. And if you other like type representing your current state in a parser, and then you add a type representing one character on input,

Matt Godbolt

Right.

Hana Dusíková

And modifies to another type which represent different state of the parser. Then you apply this repeatedly in in a like recursive way in a function. Yep. And at the end, it will, your result, which will give, give you only, this is regular expression. Or this is not a regular expression. That's like, step one. Okay. Then you like, so then

Matt Godbolt

Let me just try, run that past you again. So you build, you parse this thing and you're building this type up and you're sort of pushing and popping as you're... ...Following the LALR parser rules. And at the end, it either succeeds and you get back, do you get back a bool or do you just get back a type that is the tag type of saying, this is a valid regular expression, or this is not a valid, regular expression?

Hana Dusíková

Yeah. You can fail in multiple fashion. You can fail in some horrible like error in compiler. Also you can propagate boolean that expression is wrong. And then you can have only one static assert checking the boolean and with a nice error this is wrong. This is an incorrect expression.

Matt Godbolt

I see. So static assert is an assert that runs that compile time. Right. It's just either true or false, and there's a compiler error. If the, whatever you passed at a static error is not a constant known-at compile time and true.

Hana Dusíková

Yeah. It must be know at compile time.

Matt Godbolt

Right.

Hana Dusíková

Right, right. And then next step, when you have a parser...

Matt Godbolt

Iso this is just the beginnings. Okay. Right.

Hana Dusíková

And then next step is to have some, like context as a third parameter during parsing: another stack where you based on the symbol in the grammar do some modification on the stack.

Matt Godbolt

Okay. So you've got one, that's just like the parse stack yeah. Which you were using just to sort of like note, keep track of where you are, how many open parens and close parens; that kind of thing. Yeah. And the other one is kind of where you're putting, I'm guessing this is gonna be like where the actual sort of compiled regular expression is gonna

Hana Dusíková

For like typical expression grammar. You can have like a number and number and then a action plus. And you take two items from top of the stack and wrap them into number into, into plus operation. And the plus operation is actually can easily map to the first stage like the expression as expression and by expression. So you have like concatenation. So there is, operator concatenation when you just take two elements from the top of the stack, grab them and place them back to the stack.

Matt Godbolt

Okay. So it's like a, like a stack machine that you would write if you're like, even like you're writing, like a maths, very simple maths parser-type thing, you know, three, two push, two push to add

Ben Rady

You've ever used a TI calculator or

Matt Godbolt

Texa Instruments calculator from like that.

Ben Rady

Yeah, exactly.

Hana Dusíková

But, but instead like multiplying numbers from input, you actually store the operation with the numbers itself.

Matt Godbolt

Okay. So you kind of just keep it as that. You don't actually add them. You say this is an additional operation with one and one or whatever the equivalent. Okay.

Hana Dusíková

Yeah. The input is actually from the grammar, the LL(1) parser will give you, give it, you a stream of operations in postfix notation.

Matt Godbolt

Okay. Yep.

Hana Dusíková

So then it's just two atoms. Concatenation atom, concatenation atom conation and then you have another, sub-expressionand then at the end there is a multiple options or, "or", "and". And then you, you will have the expression as a tree as a type.

Matt Godbolt

Got it.

Hana Dusíková

And then you can, easily run because you just call the operator, evaluate over the type and it'll, go recurively in all branches is, and at the end, it'll give you true or false if the input matches this regular expression.

Matt Godbolt

So at this time, at this point, we have transitioned from compile time to run time. Right. So that the yes, again, right. I'm just gonna write, read this back to you to make sure I've got this straight because my head is spinning, but the, the result of the parses is accumulated into this, this, the second stack that you're talking about, which effectively ends up being a bespoke type, who's only responsibility in the entire world is to hold the compiled operations that will match the regular expression that was given.

Hana Dusíková

Right.

Matt Godbolt

So if I say "A|B" it's a tree representation of the A and the B, and the "OR", or something, something ish like that. And then that type I can now at runtime give a string and it will do the matching of A or B in this instance. That's amazing because presumably that means if I make a mistake, I think, well, as you just said, if I make a mistake in my regular expression, I don't have to wait till runtime for it thing to throw an exception saying, oh, by the way, that's not valid. It happens at compile time,

Hana Dusíková

But that's actually the first phase.

Matt Godbolt

Right. That's the bit we were just talking about before with the boolean or the, the tag type, right? Yeah. And then, so what implications does this have for, and what's the difference between say, you know, me just using a normal, normal, regular expression

Hana Dusíková

Your compiler knows everything you are doing, and it can inline everything. And it's actually going down to assembly, like compare, compare, compare, compare, "ret".

Matt Godbolt

So really does net out all this sort of very complicated sounding, sounding highfaluting type based stuff. Once it's all boiled down to like actual code generation, the compiler just essentially writes the handwritten code that you would've done yourself. If you have said, okay, you, this needs to match either A or B.

Hana Dusíková

Yeah.

Matt Godbolt

And it's just the two compares in an or, or whatever. And we're done. Wow!

Hana Dusíková

My like first approach, because I, when I wanted regular expressions in like, in with a very performant one, I was thinking about using LLVM to write the code and use the LLVM optimizer.

Matt Godbolt

Right.

Hana Dusíková

LLVM is really hard to understand. And So I knew C++, so I write it in C++ kinda, and then optimizer will do it for me anyway.

Matt Godbolt

That's a, so you still get to use LLVM if you're compiling with Clang, you already are using LLVM, but it's just, you're using it through the medium of, of normal C++ as opposed to like, were saying, if you were to, you could use LLVM directly yourself, in which case you'd build up nodes and you'd then build a dag and you'd build the comparison operations and the basic blocks that are doing the comparisons and whatever. And then you'd say to LLVM, Hey, can you generate me the assembly for that? Which would still be a, which would allow you to do some of that at a run time, I suppose, which is the only advantage that you might need if you actually wanted to have dynamic regular expressions, which is usually a security risk, right. If you let somebody type in a regular expression, you're probably doing it wrong. Like run time-based.

Hana Dusíková

I think even, even evaluating a regular expression at run time is security risk,

Matt Godbolt

Right? Because depending on the regular expression, you could have some absolutely catastrophic backtracking stuff going on where it could just blow up.

Hana Dusíková

Not even backtracking. You can just run out of stack.

Matt Godbolt

Right. Right. I suppose so potentially unbounded. Right. I had don't think I thought about that. That's, that's actually mildly terrifying given that my website is, is entirely regular expression based in terms of all of the horrible things...

Hana Dusíková

But it's more mostly around in the clients. So it should be fine.

Matt Godbolt

Yeah, no, there's plenty of stuff that's going on on the server, unfortunately.

Hana Dusíková

OK good luck!

Matt Godbolt

But, you know, like, I don't think a regular expression overrun or stack smash is my biggest source of security risk, given that I just take arbitrary bits of user code and run them on my servers. So, you know, anyway, that's

Ben Rady

Also a true fact fact.

Matt Godbolt

...That's Yeah, so that's amazing. So compile time is, is letting us push stuff that would traditionally be done with like a, a static offline parser generator stuff, and then a, a code generator, like you said, like LLVM, or even just writing your own stuff and being able to hoist it into mostly sane looking C++, you know, like I have a very low threshold for template meta programming.

Hana Dusíková

Yeah. But like, from a user perspective, it's just CTRE::match

Matt Godbolt

So from a user's point of view, it looks, it's just a very slightly different way of looking at use, like the angle brackets instead of the round brackets.

Hana Dusíková

I think Corentin [Jabot] says they are called Hana brackets because of that. But

Matt Godbolt

Hana brackets are well, that's what I'm gonna call them. That's we, we've had all sorts of funny names of the things over the time. Like, I, I I'll, I'll call them chevrons, you know, like the things that like are and then operator like this and, and no one can see this, of course, on the podcast, but I'm doing like two crocodile hands facing one way is like the output operator. And then the other way is the input operator. And, and then a good friend this

Ben Rady

Like alligator, and this is crocodile, something

Matt Godbolt

Like that. Yeah. I dunno how that works. What do you call that? I mean, we've stopped using these except for their, you know, as, as nature intended to shift variables around you know, bit shift stuff, that's the right thing. Hopefully everyone's using fmt, libfmt; this the now nowadays for any kind of actual string manipulation. And then the only other one that springs to mind is something that a, a colleague of mine and Ben's used to call mummy and daddy duck and their baby ducks, which is whenever you have to use double ampersands and then the dot-dot-dot in the template list, "&&...". And it's like the, the, the two mummy and daddy ducks, and then the little little baby ducks. And I think so now I can't unsee that now every time it's not very often I can write code that needs that level of like thing, but it it's there. I, what do you call that? It's like a variadic R-value...

Hana Dusíková

It's perfectforwarding and a variadic pack of arguments.

Matt Godbolt

Gosh, that sounds like tongue twister

Hana Dusíková

I'm probably incorrect and probably someone from [WG21] Core said to me you should call it universal references. You should call it r-value references or no,

Matt Godbolt

Stop talking about it. Double duck. Yeah.

Ben Rady

Double-Duck operator

Hana Dusíková

Double duck is best.

Matt Godbolt

Yeah, I think, I think you all

Ben Rady

Right.

Matt Godbolt

No, that's cool. So, I mean, other other things, I mean, regular expressions are probably the only the beginning of this, right? Like the, in terms of like very small grammars that make sense to be able to compiled into a program. You know, we've all kind of realized the string matching is painful. If you have to write it yourself and, and regular expressions for all the warts that we were joking about earlier, they can be a very efficient way of encoding something you want to get done. And certainly CTRE gives you a pragmatic way of saying here's a string matching thing that everyone understands. And yet I don't have to pay any kind of runtime performance costs. I get perfect assembly output. That's gonna match it as fast as anyone could possibly write handwritten stuff. What else can we do?

Ben Rady

Yeah: I was gonna say like, does this general technique, I mean, obviously for regular expressions, you've, you've, you've got this library, but does this mean that you can do more things generally at compile time using this technique? Like, you know, I, I can imagine like, you know a few situations in which you might want to have other things calculated at compile time and be like, oh, it'd be super cool if I could have a regular expression here, but I can't because it's too early in the process. And I mean, I don't know, that's sort of just my, my naive impression, but is, does this open the door to more of those kinds of things?

Hana Dusíková

You can calculate anything you want, if you don't have any input output. And we we are working on input too, so you can actually open a file or read, read it, and then do some transformation and then store it in your binary or generate code based on it. It's from JeanHeyd [Meneide] "embed".

Matt Godbolt

The std::embed thing. Right.

Ben Rady

Oh, okay.

Matt Godbolt

This is long, you know, we've always wanted to have one has always wanted to have like a a configuration file kind of thing where, you know, you wanna have something which isn't code necessarily, but does go, go into the code. That's a brilliant way of doing it. I can just imagine, you know, JSON parsers, YAML parser things, being able to generate bespoke types potentially. I mean, template based types, I guess.

Hana Dusíková

Yeah. Not, not YAML please

Matt Godbolt

Not YAML why I like, YAML, what's wrong with YAML Hana?

Hana Dusíková

You need to write a parser for it. And then I will ask again,

Matt Godbolt

Okay. Right. Subset of YAML I get that. The, the ampersand thing that lets you refer to other parts is probably are very, very tricky, but maybe there's something else I'm not thinking of. Yeah. All right. But I, I don't like JSON, can I just say it out loud? Right. It's not very fashionable. I know JSON is like the Lingua Franca of everything, but like, I can't put comments in JSON and that makes it really hard for me to write anything where I'm explaining why some, you know, like package.json for like a, a node project. Right. You have to have a package.json. It has to be valid JSON. Not only does it have to be exactly valid JSON, like I can't have a trailing comma and I have to always quote everything all the time. I can't explain that, "I bumped a package version, please don't do you know, this is why this is it".

I don't directly depend on this package. I want package, but I need version three. Otherwise I get a security warning. And I can't put a comment on the line saying, I don't care about this package, but it needs to be three or greater or something like that. And there's just written everywhere. And I know I used to have a trick where I would always just put an extra key of like "__comment". And then I could put a comment in the middle of a dictionary. Just hope that no one cared that there was the thing. Anyway, rant about JSON over.

Hana Dusíková

I think actually like parsing JSON is like 200 lines of code in C++ I recently write one. So

Matt Godbolt

That, I mean, I've, I've heard, I know Ben Deane, and ironically of course, Jason Turner,udid a, a presentation a long while ago on, on parsing JSON and I, I kind of like, it feels like that's great and it shows the, the technique off, but I would never want to do that because I hate JSON so much. Well, I like Jason Turner, just to be clear if you listen to this...

Ben Rady

Jason, the person not JSON, the file format.

Matt Godbolt

Yeah. Right. No, I mean I'm so I'm wondering what other things, I mean, could one write a C compiler?

Ben Rady

What?

Hana Dusíková

I think someone did already

Matt Godbolt

Get out !

Hana Dusíková

A few years ago, there was some, someone on reddit with a constexpr C compiler.

Matt Godbolt

That's absolutely daft. I, I love it. I mean, I've seen people with, you know, the brain language, brain F language that I won't swear on on our podcast, but that one I've seen that that kind of makes sense. I can see how that could go together, even like with my primitive understanding of template trickery

Hana Dusíková

But actually it's not like it's not template meta programming. It's actually constexpr.

Matt Godbolt

Oh, so now we, yeah, yeah. Go on. Tell us about the, so what's the difference between template and constexpr?

Hana Dusíková

Yeah. Template. Meta programming is actually abusing the proprieties of your language. They came with templates to make easier write some containers and etcetera, and then someone found they are actually Turing complete.

Matt Godbolt

I see. So that was never intentional. Like, like,

Hana Dusíková

No, that's not never intentional.

Matt Godbolt

My, my naive thing of like, I want, you know, I wanna write an array of, or a vector yeah. And expanding a rate of type T that's what I think of templates. Right. And you're like, well, that's not Turing Complete. That's just like, yeah. Fill in the dots with the T in all the places. Yeah. Right.

Hana Dusíková

Then there is a constexpr programming, which is just really bad name for this function should or may be evaluated at compile time. And then you are writing just ordinary C++. So we can write, actually write a really easy code, really simple code. And mostly in most cases you can just add a constexpr modifier to all your functions it'll work anyway. So if you, if you implement that C compiler; just mark everything constexpr and it should probably work.

Matt Godbolt

Wow. That's I mean, so that's where we've come along from. So what, what use was a kind of amusing hack that was discovered by folks back in the day of like, Hey, this is accidentally T complete, which, I mean, I think so many things and life end up being accidentally Turing complete. Right. to constexpr being sort of brought into language as a way of tagging, just normal functions. And if I remember rightly the first few additions of languages, the language that had constexpr, it was super limited. You could only like have a single return statement.

Hana Dusíková

They were afraid of that,

Matt Godbolt

That right. You're right. You, they were afraid of this. And so you could write your constexpr multiply two numbers together, which was, you know, constexpr int multiply(int a, int b) { return a * b; }, that was all you could do. And you go, oh, there, well, that's really exciting. I was like, well, what does that help me here? And

Hana Dusíková

You, you, you could do like recursion, but you can do like conditional conditions. No, you can, you can do any like cycles and then allow, then they allow it and they allow it more in C++ 20 we allow allocations. So you can actually allocate, you can use a std::vector. Now you can use

Matt Godbolt

In constexpr time? So this is just a normal C++ program that's running in the compiler effectively. And the output of it can then be used in things that are in the compiled program. Wow.

Hana Dusíková

Yeah. Actually cannot return a std::vector from the function to out from the compiled time, because there is like problem that if you allocate something in compiled time where it'll live in runtime. I see. So there is like the so

Matt Godbolt

Yeah, dynamic allocation that you did during compile time can't live on into the lifetime of the program because it's not dynamic anymore. Right. How do you delete the memory that was like, well, it was in the compiler.

Hana Dusíková

Yeah. All your allocation must be freed when you are leaving the constexpr evaluation context. Right. But you can, is, is it inside of your function to make the algorithms much easier, simple than just using templates and tricks? I see. You can actually implement your own stack with vector,

Matt Godbolt

With actual normal code, rather than like cons-ing on the side of some LISP- like thing that's getting bigger and bigger and bigger. And so you're no longer abusing the type system to hold the internal state of your algorithm. You can just freaking variables, right? Like, like, like a normal person would do. That's so cool.

Hana Dusíková

Exactly. And then you can implement the parser in a much more ordinary way and it'll be much faster. And I mean not like one or magnitude, but like exponential versus linear. And

Matt Godbolt

This is what run times speed or you talking compile time speed. So the compiler itself is much better at, I dunno, understanding the code that it normally reads all day every day to compile our code than it is dealing with the abuse of the template system and all the instantiations that requires.

Hana Dusíková

Yeah. For example, if you have like regular expression, which is few kilobyte longs crazy, by the way.

Matt Godbolt

I've seen the ones for like the "match email addresses" and they look a bit like that.

Hana Dusíková

Yeah. That's yeah. Yeah. And if you try to compile with a CTRE I don't know if it ever end And it usually it crashes on insufficient memory.

Matt Godbolt

Right: I can imagine because of what, but

Hana Dusíková

Then I made a prototype of CTRE using constexpr parsing, which is like limited limiting also the template instantiation. And then I tried few megabytes-long regular expression, and it took like half second or something to compile.

Ben Rady

Oh wow.

Matt Godbolt

That's such a big deal. That is huge. Yeah. Like we've talked a lot, Ben and I, one of the things of the themes of this of this podcast is like, my impatience waiting for code to build. And, and Ben's impatient waiting for tests to run. Right. That's all like our shtick here.

Ben Rady

That's kind of true.

Matt Godbolt

Yeah. And, and so having tools in our toolbox that allow us to use these advanced techniques that are coming down the pipeline, or even here already that don't also suck in compile time is very important to be close to my heart. So I'm really excited to hear these. That's amazing.

Hana Dusíková

Yeah. And you can actually easily test your calls,

Matt Godbolt

I was gonna ask.

Hana Dusíková

Yeah, exactly. And your code won't ever compile. If it's like, if there is any problem, you can just run it in compile time and place it into static_assert like, like a require in test. And if, if you are okay, it'll compile, if not, your compilation will fail.

Matt Godbolt

So you, the compiler runs the test implicitly as it's, as it's compiling that's that sounds like Ben's heaven.

Ben Rady

Yes. Yeah. Well, it's like, so the, the, the, the answer to the question of, should I check things with the compiler or should I check things with test is yes.

Matt Godbolt

Yes. Both. All of the above. I mean, that's great,

Hana Dusíková

But actually you should, that's both in C++ 20, you can actually have different code, which will be run in time and in, on time, in same function. Oh,

Ben Rady

Wow.

Matt Godbolt

Ah, yeah. This is something I wanted to talk about, actually, that is the, one of the, the things about constexpr is that it can, any code you write that's constexpr can also be run dynamically. So for example, if you call a function like my, a multiply routine that I just sort of said, and you needed it at compile time, the compiler will obviously evaluate it with all the constants available and promise you that, like, you know, you've got a, an array of size 10 because you did two times five, right? End of story. But if you call it with dynamic parameters, it will just generate the code and do it like normal. So there in and in the Hana's just reminded me or remind us, you can now ask the compiler, am I being evaluated inside the compiler? Or am I being evaluated at run time? Like, am I in the Matrix or not? Yeah. And then you can change the answer, which of course is like terrifying.

Hana Dusíková

And you shouldn't do that. You shouldn't do that.

Matt Godbolt

You shouldn't do that. So when my, all right, you said you shouldn't do it, but why would you need to know? Yeah.

Hana Dusíková

Sometimes it's really hard to implement something in constexpr limitations. It's usually recursive algorithms, but in runtime you can use some intrinsics, for example,

Matt Godbolt

I see.

Hana Dusíková

Or inline assembly.

Matt Godbolt

I see. So, like, for example, if I had a matrix multiply routine four by four matrix multiply routine, then of course I want to be able to do that at constexpr, who knows, I might be rotating some points or something, that's in a big list of things, but if I'm running on the real time system, I actually want to use the, the, the underlying x86 architectural instructions myself to do it. So you need to be able to switch on the that. So

Hana Dusíková

You can have like two parts of one for compiler and one for your runtime.

Matt Godbolt

So back to your point that you made, which is that, you know, having the compiler run, your test is fine provided you don't have any code that differs. But if you do have code that differs, you should also test in with the normal techniques, which brings me into the next question I have actually about this, which is how do you debug this? Because I'm used to putting break points and printfs everywhere. And I don't think the compiler lets me do that's or maybe it does?

Hana Dusíková

Yeah. there, there is no tooling for that. And like one trick I found is to create a, a template, which is undefined like only a declared if I'm correct.

Matt Godbolt

Okay so you just a, there exists a thing called Bob, but you don't give a body. So it's like the compilers can't do anything with Bob. If you try to do,

Hana Dusíková

It's actually a template

Matt Godbolt

Bob. Okay. So Bob is, yeah. Sorry, this is terrible. Now I use Bob for anything (or Ian is the other one), but Bob is of T right. And you know, so you've got a T that you can give it. Okay.

Hana Dusíková

And if you try to instantiate it somewhere, like it's like print for debugging. Yeah. It'll fail in compilation. It'll tell you, there is no instance of

Matt Godbolt

Print for debugging of, and then whatever you put inside the, so yeah. You can do your yeah. You can throw an exception kind of thing with a, with a helpful message. But the helpful message, unfortunately is just a type name inside a deliberately left out bodied.

Hana Dusíková

This is like function good for debbugging. But for like when I wrote CTRE I start from the parser and I wrote a lot of tests to make sure it's working properly. And then I built on that. So, so when I change something it'll immediately break. So I know something is not right, and right. It helps. So obviously test helps.

Matt Godbolt

Well that that's very on brand for us. So I'm glad, glad to hear that. Even, even when the compiler's doing all the work for you testing is useful, but

Hana Dusíková

You're actuall writing code. So you need to test it. If you, if you don't test it, then it's like meaningless.

Matt Godbolt

Absolutely.

Ben Rady

So how long until we have a template programming language testing framework?

Hana Dusíková

Actually I think Catch2 other like some like macros, which will force evaluation something in compile time to check it. So there definitely are some tools. All right. Like my, but in CTRE I was lazy and my tests are actually bunch of static asserts. And if any of the CPP file fails to compile, then there is a problem.

Ben Rady

Right. Right.

Hana Dusíková

And every time I get some like issue or mostly issues, then I try to like, make it happen, the problem. And then I fix it

Ben Rady

It right.

Hana Dusíková

If I try to fix it before that, I don't know what I fixed.

Ben Rady

Yeah. Yeah. That never works out. Gotta watch it fail first. Yeah.

Matt Godbolt

Yeah. Start from a failing test, you know, and then work backwards otherwise. Yeah. You, so

Hana Dusíková

Actually it's exactly same as normal programming. Right? Exactly. Just like awkward.

Matt Godbolt

Yeah. Yeah. I guess I've gotten so lazy with debuggers, like actual, like gdb and whatever that I'm. So, you know, I'll happily write something without testing it as much as perhaps I ought to, or at least certainly if I'm trying to understand someone else's code, you can really do far wrong than single stepping through and kind go and getting the feel of it. And I feel that like, there perhaps there's tooling for that, that would be useful.

Hana Dusíková

I think you can always hook into compiler with GDB.

Matt Godbolt

That's true. You, if you can understand what's going on inside GDB using, sorry. Inside GCC using GDB. I mean, I wouldn't surprise me that you can understand that Hana, frankly, but no,

Hana Dusíková

Definitely. I cannot.

Matt Godbolt

I have to ask by the way can, so there, there is a library called HANA already. That's a template meta programming library. Are you who, who first, who, who, who stole, were you before the library or were you named after the library?

Hana Dusíková

I was born before the library.

Matt Godbolt

Yeah. And then they stole your name?

Hana Dusíková

No, no, no, no. There is no connection at all.

Matt Godbolt

I find that hard to believe. I mean, there's certainly some nominative determinism of going on here. Surely, you know, you, no, now I,

Hana Dusíková

It took me a while. It was like, at CppCon I did my lighting talk about CTRE and people were like talking like, you must know Louis [Deonne]. Oh,

Matt Godbolt

That's right. It's Louis', right? Yeah, yeah,

Hana Dusíková

Yeah, yeah. Out of boost HANA, no, I what's going on. I didn't know. So

Matt Godbolt

You didn't know that there is like, yeah. A big compile-time sort of pre-processory I actually, I don't even know what HANA is. I'll be honest with you. I know you, but

Hana Dusíková

It's like a style of using templates, but without you actually using knowing that you are using templates, I see using like function, argument deduction tricks. Okay. And actually last week we had no, might be this week. I don't, I'm not sure anymore.

Matt Godbolt

No, what time it is or what day?

Hana Dusíková

Yeah, yeah. Yet study group seven meeting. And we were discussing like future of reflection. And we were reminding people that boost HANA style or Hana style function were rejected. Right.

Matt Godbolt

And as you saying this, right?

Hana Dusíková

No, it was Daveed [Vandevoorde] actually, but it's funny,

Matt Godbolt

But no, no, literally no connection. That's funny. I, it is, is easy.

Hana Dusíková

Maybe it was prophecy?

Matt Godbolt

Maybe it was prophecy that, well, take it as, see, that seems like a great, great point to end on here as we've got to time. But thank you so much for being with us today. This has been really, really interesting. And I think I've finally gotten, I mean, so full disclosure before now, I've been to several of Hana's presentations on this kind of stuff, and I've never really had it stick in my head the way even when no, no

Hana Dusíková

...Matt was my Remote Clicker.

Matt Godbolt

So before, before it was fashionable to do presentations, remotely. Hana was ahead of the curve. I forget, what was the, the reasoning you, the

Hana Dusíková

My flight was canceled. I couldn't go to the conference. Oh, wow. So I did my presentation remotely, but something failed for, with remote control of his computer, because I wanted to run slides on Matt's computer because they're like animations and everything. And I was supposed to do control it with VNC, but sometimes, sometimes, somehow it fails. Matt was my clicker, so I was telling him like, next slide, next slide, next slide. And then there was like the audience and there was the slide with D 3 expression tree. And I asked Matt, click on the text box and write some regular expression.

Matt Godbolt

So I got to be, do the cool thing with the presentation. And take all the glory. And then poor Hana did all the work. Wow. No, that was, that was fun.

Ben Rady

Wow. Oh man.

Matt Godbolt

That was fun, but that's,

Hana Dusíková

And that was actually funny conference and yeah. Yeah. Matt is my clicker.

Matt Godbolt

That's that's my, I peaked, I peaked in my career now. I've been Hana's clicker and I was very proud to be. All right, friends. Well, once again, Hana, thank you so much for your time. And for explaining this all to me, as I say, I think I have a much better understanding now than I think I've had before.

Hana Dusíková

And thank you for inviting me.

Ben Rady

Yeah, this is great. Super interesting.

Transcript source: Provided by creator in RSS feed: download file
For the best experience, listen in Metacast app for iOS or Android