Java and/or C++?

Matt Godbolt

00:19

Hey Ben.

Ben Rady

00:20

Hey Matt.

Matt Godbolt

00:20

How you doing, buddy?

Ben Rady

00:22

I'm great.

Matt Godbolt

00:23

Lovely, good to hear. So recently at work we've been discussing, uh, the benefits of Java versus, uh, C++, and I figured we should talk about it here, because first of all, it means that you and I can talk about something which we needed to discuss at work, but in a way that other people can listen in,

Ben Rady

00:42

Right? Yes. So this officially counts as our day job, right?

Matt Godbolt

00:45

That's right. We can claim this back and everything, I think.

Ben Rady

00:48

Right.

Matt Godbolt

00:51

So, you know, uh, the, we have a new project and we're spiking it out in Java, which makes a lot of sense, I think, but we, we should just discuss, what are the trade offs in deciding a language? We've asked some of our guests, I think this kind of stuff is like, what makes you pick a particular language over another? And, um, I figured we should discuss a little bit about that.

Ben Rady

01:09

Yeah. I think we're gonna have to try to describe a little bit about what this project is cause with any of these decisions, right. It's all about the constraints of the project and what you're trying to achieve and a bunch of other things.

Matt Godbolt

01:19

Absolutely.

Ben Rady

01:20

And it's gonna be hard for us to have that discussion in a very effective...It's like, well, we're doing a thing. Yes. Right. So maybe, maybe, I don't know. You could give your description of what this project is and, and sort of what the constraints are.

Matt Godbolt

01:31

Right. I think so. So, um, obviously we work in finance and, uh, so there's a certain amount of secrecy about what we're exactly doing. So we can't go into huge amounts of detail, but you're right. To have some idea about what we're, we're talking about here. So we're talking about a system which will observe events in the world, many thousands of them, potentially thousands per second, in some cases, and make some kind of decision about when is the right time to either buy or sell, um, stocks based on a whole bunch of signals that we are gonna get a team of smart people to come along and do like you and I are not the kind of people who understand the economics of it or whatever, but we can certainly build a framework that lets people come in and make, um, smart, um, signals out of the massive information coming in. And there's a ton of information. We have to sift through it somewhat quickly. And we're talking something which is somewhat performant, not necessarily, uh, like nanoseconds, but in the milliseconds, let's wave our hands a bit and say around that to make sort of decisions. So does, is that your, was that a fair description, do you think?

Ben Rady

02:41

Yeah, I think, I think that's good. And one of the points you hit on is really important, which is that this project is gonna be worked on by a mixture of people. Some of whom are professional software engineers and have been doing it for decades. And some of whom have been working maybe for decades, but not as software engineers. Right. Their background is in mathematics or statistics and yes, they can code because, you know, these days programming is like literacy, right? Like you gotta be able to do it a little bit, otherwise, you know, it's like, you know, you can't even write an email. Um, but, um,

03:13

Uh, but they're not professional software engineers, right? Like, you know, just like somebody writing an email isn't a professional writer. So, um, because of that, we don't want what I would normally want on a lot of these types of projects, which is everybody solves all the problems together, right? Like you have a big group of people and yes, people have different strengths and weaknesses, but the expectation is that like, Hey, you know, maybe Ben, your SQL's a little bit rusty and out of date, let's have you work with this other person on the team who, you know, is a relational database master. And they know every little trick and tip and, if you work with them, the two of you are gonna build great stuff and you'll get better and they'll get better. Right. I don't know that we necessarily want on this project to say, Hey, let's take this person with a PhD in statistics or, uh, mathematics and have them, you know, learn how to write really great multi-threaded code. Right? Like that's not a, that's

Matt Godbolt

04:12

Right, that's a non-goal. A very specific non-goal.

Ben Rady

04:13

Not a good use of everyone's time.

Matt Godbolt

04:14

Not that those folks aren't interested in it. They, they, I think we've talked about this before the difference between, I think, you know, programmers and, and non-programmer folks is like, I think as you and I, as programmers, there's a certain amount of pride in just building a nice program. And the fact that it does something useful in the world is a pleasant side effect of that. Yes. We hope. Right. You know? Yeah.

Ben Rady

04:34

Pays the bills?

Matt Godbolt

04:34

Our day job. We, yeah. The thing that pays the bills. Yeah. My mortgage is paid, thank you very much, because the nice thing that I build happens to make somebody some money. That's great. Um, for the folks that we are talking about here, the program is very much a means to the end. Their goal is, can I model this thing? Can I, uh, can I predict the future in some way? And the program is how they achieve that. And they're relatively unexcited by the beauty of the program itself. Some of them are, yes, but most of 'em just wanna get stuff done and that's completely understandable.

Ben Rady

05:03

So that's one constraint. And then the other constraint, which you kind of mentioned is that we do have a performance constraint and actually at prev-prev-co, we had a name for this, which is named after a, a good friend of yours and mine, Matt Hellige cuz we called it the Hellige Middle. Do you remember me talking about this? Oh,

Matt Godbolt

05:21

I, it rings a very faint bell. I can't believe it. Yeah. Tell, tell me about the Hellige middle.

Ben Rady

05:25

So the Hellige middle is the middle world between sort of very low latency, extremely performance sensitive systems and batch systems. Right. Where in a very low latency system, there are things that, like, physics will not let you do. Right?

Matt Godbolt

05:43

Right.

Ben Rady

05:44

And the discussion with like a stakeholder or somebody like that is you, you, you sort of have physics to, it's the old thing of, of, of, um,

Matt Godbolt

05:53

Grace Hopper,

Ben Rady

05:53

Grace Hopper, and, and the wire. Right.

Matt Godbolt

05:55

I can see you brandishing it.

Ben Rady

05:56

This is a nanosecond, right. There's nothing you can do to change that. Right. Yeah. And so you have that to just sort of fall back on and be like, Hey, uh, you know, can't we just do this, can't we just do that? It's like, no, light doesn't move that fast. We cannot do that. I am a hundred percent certain we cannot do that. And that obviously has a lot of constraints and can be difficult to work with, but it also provides a sort of fundamental layer to work from that is non-negotiable, right?

Matt Godbolt

06:21

Right. There's a floor that's well understood. And you, you, yeah, no amount of, of, of changing the problem can get away from it in most situations.

Ben Rady

06:30

Right. Exactly. Yeah, yeah. Yeah. And then in the batch world it's like, okay, well, you know, can you guys do, can you add this? Can you add that? It's like, well, yeah, it's gonna maybe make the batch job run a little bit longer. It's like, yeah. Okay. That's fine. If it's a daily batch job and it takes more than a day, then maybe you gotta change some things around.

Matt Godbolt

06:44

Right. But ultimately the constraint there is not, at least from the point of view of the business goal, it doesn't matter. Right. I mean, as a programmer working on that batch system, if every time you make a change and need to test it, it takes you 23 hours and 35 minutes, you might just wanna, uh, speed it up a little bit, but that's not gonna prevent the business from achieving its goals. Right.

Ben Rady

07:05

Right. Right.

Matt Godbolt

07:05

So then the middle...

Ben Rady

07:06

So the Hellige middle is you care about both. You care about performance. Oh. But not so much that you're constrained by physics. Right. Right. Which means when somebody comes along and says, Hey, could you just add this in the answer is an unfortunate, it's not, no, and it's not yes. It's like, well, probably

Matt Godbolt

07:28

It depends. Everyone hates that.

Ben Rady

07:30

It's gonna be really difficult to figure out if that was a bad idea until six months after we've done it. Right. Like, um, and so it's the sort of middle world where you don't have as many hard constraints, but you have a whole bunch of soft constraints. And the intersection of all those soft constraints can be, A, very difficult to just figure out, and B, very difficult to sort of negotiate from like a, you know, business and strategy standpoint about like, okay, we have a whole bunch of things we could do. What will happen if we do them? Which one should we do? Right. We almost don't have the constraints there to tell you. It's like, well, that one's physically impossible, so we can take that off the list right now.

Matt Godbolt

08:08

Got it. Right. It strikes me that most software development falls into this category.

Ben Rady

08:13

I think it does actually.

Matt Godbolt

08:14

It's just that I, I, I have previously, you know, I worked with Hellige on a bunch of this stuff. So, so we, we would spend a lot of our time down at the, the physics constrained end. And I think like any kind of, uh, uh, I was gonna say art, because that's the only thing I can think of, but this isn't art, or maybe it is. With any kind of art, the constraint really helps. Right? The more constrained your environment is, the more creative you can be within that, uh, that constraint system. And I think some of the best code I ever wrote and the best ideas I ever had came from constrained systems, be they ridiculous, you know, 35 year old, uh, eight bit hardware that only can do so much, or even games consoles, or any of those things, or, well, here's the fastest computer money can buy. Here is the shortest cable we can possibly fit. These are the constraints. Everyone else is laboring under the same constraints. Go make it do the thing we need it to do as fast as possible. So when you're in that, and then maybe in the batch job, there are just literally no constraints. I'm gonna put an asterisk. I'm sure

Ben Rady

09:12

There's no constraints. Yeah, there, there are few, I think there's fewer of them generally. And it is a little bit of a, a spectrum. It's got one firm side on, on like one side of the spectrum. And then the other one just sort of trails off into infinity.

Matt Godbolt

09:25

But that, that Hellige middle is where, it's, it's a very high dimensional space. There are reasonably many trade offs you can make and still have a good solution that sits there in a, in a minimum. Whether it's a global minimum is like, who knows, because there's just so many other choices you could have made. Right. I see. Okay. So yeah.

Ben Rady

09:44

Yeah. You sort of like millisecond performance range, I think is, is the sort of, when we were talking about it with Matt, that was sort of the prototypical thing of like, so you were building the system and it sort of had these same kind of performance constraints to it. And he's like, oh yeah. You know, the problem with that is, is that you wind up in this place where, you know, you you've got fewer constraints and that actually makes the problem harder, right? Yes. Yes.

Matt Godbolt

10:07

So it's very easy to answer, like how important is it that I spend this time poring over my cache layout and my hashing algorithm to find my, my thing in, in, in this set of, of, of possible things, uh, compared to, well, you know, we're just using a normal default unordered map thing or a default hashmap container or whatever, and then it's probably fine until it isn't probably fine. Then you're never quite sure if it's in the middle of that or not. Is that contributing to it or not? Sorry. Yeah, I'll have to excuse my, my dog. It occurred to me actually earlier when I was making a cup of coffee before we came to do this recording that we started doing this when my now one year old dog was absolutely tiny. So I wonder, I can't remember when we did our first episode, but we must be coming closer to an anniversary of sorts, although when this goes out, who knows, but, uh, yeah.

10:59

Now I have a full size dog running around and barking at everyone. Anyhoo, the, the Hellige middle and the milliseconds and the, and the under constrained thing, which means that I, I suppose charitably, you could say there's a lot of scope for individual choice for the people working on the project, which I think probably leads us to the idea that what we're developing in is Java because of the various people that are available and the skill sets they bring. But obviously there are other reasons why it's a great choice.

Ben Rady

11:29

Yeah. But, you know, we wouldn't even be having this conversation if it, if we weren't in the Hellige middle, right. Like the, the, the choice of programming language would be well FPGA, that's

Matt Godbolt

11:41

The very far low extreme. Yes.

Ben Rady

11:44

Right, right, right,

Matt Godbolt

11:45

Right. And then bumping up above it is, you know, C/C++.

Ben Rady

11:48

Exactly

Matt Godbolt

11:49

Style, maybe, uh, you know, although again, you know, we're gonna talk more generally about Java. It's not, it's not impossible to have Java at that end. And in fact, we worked on larger trading systems at previous companies that were all in Java and they did pretty well, thank you very much. And it's, you know, it's all about choosing the right thing for the right team and that kind of stuff, which I guess is the theme for today is like why, what are the trade offs and why, what, what, uh,

Ben Rady

12:12

So, so in it's interesting cuz in those worlds, I actually think Java is not a great solution and this is my opinion. This is just sort of like based on my experience. But it's like if you start getting into the world of having to write low or no garbage Java where you're like kind of torturing the language to try to figure out how to get it, to prevent it from doing garbage collections of any kind,

Matt Godbolt

12:34

A fundamental property of the language, you have to basically disable, and there's no help from the system to, to do that. Right. So I think maybe we should go dig into that a little bit because maybe it's not, you know, and also it's worth saying while we're here, when we're talking about Java here, really we mean any kind of JVM based language, right. Because they're pretty fungible, which is a really excellent property of the ecosystem, that you can mix and match Clojure and Java and Scala. I think Scala is one of those?

Ben Rady

12:59

Scala. Kotlin

Matt Godbolt

13:01

Kotlin yeah. Great call.

Ben Rady

13:02

Groovy uh, man, there's a whole, there's a whole bunch of them,

Matt Godbolt

13:06

But talking about like the, the no garbage, um, aspect here. So a trading system has to react to events coming in, and you wanna react in a timely fashion to those events. And ideally it's not just the fact that you can rely, uh, sorry. It's not just that it's a fast response. It's that the, um, the properties of the, uh, the time that's taken are sort of well understood. You know, you, you can say it's like between three and five microseconds, you know, would be a really lovely thing to be able to say, I can react between, right. And I know that there's some variance because the computer isn't a, a perfect, uh, environment anymore for, for programs. There are other things going on, cache lines are going and whatever, but what you want is low variance in your, your response time, because you can always account for a relatively slow response.

14:00

To some extent you can, you can deal with slowness. Um, if, if you know it's gonna happen, but if it's a wide range of like, yeah, most of the time it takes five microseconds to respond, but sometimes it takes 50 milliseconds. You're like, well, those are the times when you're gonna lose a ton of money in, in the kind of world that we are talking about. And so I would rather have a 30 millisecond all day, every day, reliable time than something which has the properties of three to five mics most of the time and 50 milliseconds, some of the time that's a worse

Ben Rady

14:29

Under some conditions, which are very difficult to understand and reproduce
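
To make the variance point concrete, here is a minimal Java sketch that compares the two latency profiles Matt describes. The distributions are hypothetical (a system that takes 4 microseconds but pauses for 50 milliseconds on every 100th event, against one that always takes 30 milliseconds), and the class and method names are invented for illustration. The point is only that a median hides the pause, while a tail percentile or the maximum exposes it.

```java
import java.util.Arrays;

final class LatencyTailSketch {

    /** Nearest-rank percentile of an ascending-sorted array, with p in (0, 100]. */
    static long percentile(long[] sortedMicros, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sortedMicros.length);
        int index = Math.min(Math.max(rank, 1), sortedMicros.length) - 1;
        return sortedMicros[index];
    }

    /** Hypothetical "fast but spiky" system: 4 us normally, a 50 ms pause on every 100th event. */
    static long[] spikySamples(int n) {
        long[] samples = new long[n];
        for (int i = 0; i < n; i++) {
            samples[i] = (i % 100 == 99) ? 50_000L : 4L;
        }
        return samples;
    }

    /** Hypothetical "slow but steady" system: always 30 ms. */
    static long[] steadySamples(int n) {
        long[] samples = new long[n];
        Arrays.fill(samples, 30_000L);
        return samples;
    }

    /** Formats the median, a tail percentile, and the maximum of the samples. */
    static String summarize(String name, long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return String.format("%-6s p50=%dus p99.9=%dus max=%dus",
                name, percentile(sorted, 50), percentile(sorted, 99.9), sorted[sorted.length - 1]);
    }

    public static void main(String[] args) {
        System.out.println(summarize("spiky", spikySamples(10_000)));
        System.out.println(summarize("steady", steadySamples(10_000)));
    }
}
```

Judged by the median alone, the spiky system looks thousands of times faster. Judged by the worst case, which is what costs money in the scenario described, the steady system wins.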

Matt Godbolt

14:33

Exactly so, right, right. Given those things, and obviously as more of a native programmer, um, the things I think about in that world are, is my cache warm, is my branch predictor, uh, well trained, all those things, right? In the Java world, you have the, uh, the, the runtime that is trying to adapt to make your program run as fast as possible. It can do JIT compilation. And that's great. Um, once that's settled down, though, usually the JIT is, is good. You know, you've run your code a few times and now it's warm. There's a whole bunch of things we can talk about maybe about that. But like most of the time one thinks the JIT sort of stabilizes at some point, and you don't have to worry about it anymore. But if you are creating new objects, uh, which is a very idiomatic Java thing to do, then at some point, the garbage collector is gonna kick in.

15:23

And of course the garbage collector is an amazingly sophisticated and clever piece of software, using multiple threads, using non-stopping, um, garbage collection techniques, so it doesn't actually have to stop the world and all that kind of stuff, but every now and then you'll bump up against it. And then you get a big block of variance in your program. And so the folks that we've worked with before have kind of come to the conclusion that you can use Java in these situations provided you've got yourself into a steady state. You create no new objects.

Ben Rady

15:54

Right.

Matt Godbolt

15:54

Which means it's garbage free. Nothing's gonna get thrown away. The garbage collector doesn't tend to run anymore because there's nothing for it to do. And now you're in a world where you can have a relatively reliable response time. And that's great. And I think the point that you're making is that's not Java anymore.
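
One common shape for this steady-state style is to allocate every object at start-up and overwrite it in place afterwards. The sketch below is an illustration only, not the code from any system mentioned here: the `Event` fields and the `EventRing` class are hypothetical names, and the ring assumes the caller is finished with a slot before the ring wraps around to it again.

```java
final class PreallocatedEventsSketch {

    /** Mutable event, overwritten in place rather than reallocated per message. */
    static final class Event {
        long sequence;
        long priceTicks;
        int quantity;

        void set(long sequence, long priceTicks, int quantity) {
            this.sequence = sequence;
            this.priceTicks = priceTicks;
            this.quantity = quantity;
        }
    }

    /** Fixed ring of events allocated once at start-up; claim() never allocates. */
    static final class EventRing {
        private final Event[] slots;
        private final int mask;
        private long next;

        EventRing(int capacity) {
            if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
                throw new IllegalArgumentException("capacity must be a power of two");
            }
            slots = new Event[capacity];
            for (int i = 0; i < capacity; i++) {
                slots[i] = new Event(); // all allocation happens here, before the hot path
            }
            mask = capacity - 1;
        }

        /** Hands out the next slot, reusing the oldest one once the ring has wrapped. */
        Event claim() {
            return slots[(int) (next++ & mask)];
        }
    }

    public static void main(String[] args) {
        EventRing ring = new EventRing(1024);
        long checksum = 0;
        for (long seq = 0; seq < 1_000_000; seq++) {
            Event e = ring.claim();
            e.set(seq, 10_000 + (seq % 7), 100);
            checksum += e.priceTicks;
        }
        System.out.println("processed 1,000,000 events, checksum=" + checksum);
    }
}
```

The cost is visible even in a toy like this: mutable shared objects, manual lifetime rules, and no use of ordinary collections or library calls that allocate behind your back.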

Ben Rady

16:12

It is a, it is. I think in that world, you lose a lot of the benefits of Java, right. Which is, the JVM has a really great ecosystem of a lot of different tools and libraries and people that can help you troubleshoot problems, that can help you solve problems with open source tools and other tools. Um, and you throw away about 80 to 90% of that when you're like, oh yeah, I can't make any garbage, or I can only make enough garbage to make it to the end of the day.

Matt Godbolt

16:45

That's an, I mean, so that's another technique we're both aware of, is that, you know, you just throw tons of memory into your computer and you say, right, the heap size is 1 trillion bytes. You know, like the whole memory of the computer is one heap. Go. Hopefully we don't get to the end of it. Yeah, yeah.

Ben Rady

17:01

Yep. Um, or you start doing, you know, weird things where you like garbage collect at very specific times, but again, like you're now breaking the, the model of the JVM when you're doing that. And can you do it? Sure. Have people been successful doing that? Absolutely. Absolutely, they have, but that's not a world that is, um, uh, well-populated, right. You can't just like Google and find the Stack Overflow answer.

Matt Godbolt

17:26

Right. If you're hiring someone straight out of university and they've got Java on their list of, of, uh, skills, that's gonna be a new thing for them, right. No one really does this. It's not a, it's not a very well known, uh, issue.

Ben Rady

17:43

Yeah. Yeah. So, yeah. So I mean, in those situations, you know, I am as much as Java and the JVM are familiar technologies to me and I feel comfortable using them. It's a place where, you know, I would question pretty heavily, like, is this really the right solution to this problem?

Matt Godbolt

18:01

Right. Right. And that's specifically for these cases where the variance needs to be kept low.

Ben Rady

18:08

Right.

Matt Godbolt

18:10

So how, how does that compare to our situation? Would you say?

Ben Rady

18:16

Well, I think we have a little bit of both on this project. Right. We have some situations in which, you know, we're, um, listening to messages over an unreliable protocol. And so, um, if we don't receive those messages, uh, we will miss them. It's not like a TCP connection where it's like, well, it'll buffer a little bit. Maybe you'll get some back pressure. Right. That's not gonna happen. Like if we don't hear the message, the tree has fallen in the woods and there was no one there.

Matt Godbolt

18:54

Did it really fall if no one heard

Ben Rady

18:55

Did it, did it, did the packet ever even get sent? Who knows? Um, and so in those situations, like latency is really important. It's not just a question of throughput. Now. I have definitely seen lots of situations in which the throughput between C++ and Java is at least comparable. I've seen some, even some situations in which Java is better because of some of the tricks that it's able to pull.
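
With an unreliable protocol, a receiver that falls behind usually only finds out after the fact. A minimal Java sketch of that detection follows, assuming each message carries a monotonically increasing sequence number. That is an assumption for illustration (the episode does not describe the actual feed), and `SequenceGapSketch` is a hypothetical name.

```java
final class SequenceGapSketch {

    private long expected = -1; // -1 means no message has been seen yet
    private long missed;

    /** Records one received sequence number and returns how many messages were skipped before it. */
    long onMessage(long sequence) {
        if (expected < 0) {
            expected = sequence + 1; // the first message seen defines the starting point
            return 0;
        }
        if (sequence < expected) {
            return 0; // duplicate or late arrival; this sketch simply ignores it
        }
        long gap = sequence - expected;
        missed += gap;
        expected = sequence + 1;
        return gap;
    }

    /** Total number of messages known to have been skipped so far. */
    long missed() {
        return missed;
    }

    public static void main(String[] args) {
        SequenceGapSketch detector = new SequenceGapSketch();
        long[] received = {1, 2, 5, 6}; // 3 and 4 never arrived
        for (long seq : received) {
            long gap = detector.onMessage(seq);
            if (gap > 0) {
                System.out.println("gap of " + gap + " before sequence " + seq);
            }
        }
        System.out.println("total missed: " + detector.missed());
    }
}
```

Detecting the gap does not recover the data. Unlike TCP there is no retransmission at this layer, which is why the reading side has to keep up in the first place.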

Matt Godbolt

19:20

Right, absolutely.

Ben Rady

19:22

And, and I think that is sort of a natural use of like, in those contexts that wasn't like the sort of tortured, garbage, free stuff that I was talking about. It's just like, no, it's like the JIT runs and it warms up and it gets used to your pattern of execution and it actually can be faster.

Matt Godbolt

19:35

And the GC, most of the time keeps up and then every now and then you stop and pause, but like the, the, the price you pay is not so high and amortized over a longish program run, it's, it's probably even faster than tracking it yourself under some cases. Right.

Ben Rady

19:50

Exactly. It can be at least comparable. Yes. But in this case, it's like, no, if, if, if you don't read that packet in time, you're never gonna get it. And now, you know, you have a whole different category of problems, which can be difficult to deal with. And so, you know, for those kinds of things, I think something that is more predictable, um, is just the simpler solution. Right. Now it's not one that I personally feel comfortable building. Um, because you know, I did my undergraduate degree in C++, and that was in, you know, 1996. Um, and you know, I wrote a little bit of it when I first got outta school and I've had some little sort of side projects with it, but I wouldn't ever, uh, ask someone to pay me to write C++.

Matt Godbolt

20:46

To write. Ha ha!

Ben Rady

20:48

That is, that is unprofessional to be like, you should give me money and I'm gonna give you C++ source code.

Matt Godbolt

20:54

Right. Yeah. A lapsed C++ programmer. And, and so, yeah, yeah, yeah,

Ben Rady

20:59

Yeah. So, so even though that's, you know, not necessarily the kind of thing that I would feel comfortable doing, I feel like it is definitely the right solution.

Matt Godbolt

21:06

Right. So, I mean, obviously that's like an aspect of the system we described, that there's a number of events coming in, and you've just described something almost like a physical issue. This is more like the physics based thing. There are packets arriving and they're coming in as fast as, uh, they're being sent by the sender, which is as fast as things are happening in the world outside. Sometimes that reaches line rate on a 10 gig network card. And you can't not be dequeuing packets off of the network card, because eventually the buffers run out and you're done and you've missed the information there. And so that's a perfect example where you want, let's say, absolute predictability. You know, anyone here who does like hard real time systems is laughing at my idea of, you know, absolutely dependable performance, like no, nothing like that, but dependable enough for you to never be more than a few thousand packets behind what the network card is dequeuing, so that you can keep up with it and keep feeding it new buffers and all that kind of stuff.

22:06

Yeah. Um, and then there's some amount of processing. And then the rest of the system, you know, obviously we're having this discussion and, you know, you and I obviously know where we're going with this. Right. But what we described really was a system where, um, as I described it, you know, you are trying to react to events in the market and you have to react relatively quickly and with low-ish variance, and we've just talked about the physical characteristics of how you queue, accrue, uh, events happening. But making those decisions and handing code over to, uh, people who are not computer science majors, uh, to write features, to say, predict when they should buy or sell or whatever, that's not in C++.

Ben Rady

22:48

Yeah. Well, in fairness, you don't have a computer science degree either and you do just fine.

Matt Godbolt

22:54

That's very true. Yeah.

Ben Rady

22:57

Um, but, but yes, they're, they're, they're not professional software engineers and they don't want to be, right? Um, that's, that's not what their career path is. So yeah. So we, so we sort of have this, this system, um, that I don't know. I mean, I, I feel like it is, is probably best done in, in at least two parts. And, um, even within the parts, there might be some sort of layers of abstraction, um, that, that sort of make sense and, and, and might sort of subdivide what, what certain people work on just for their own, um, uh, simplicity, their own, their own, like, you know, uh, the scope of things that they have to worry about in their, in their jobs. Um, I think it is interesting though, to sort of talk about, you know, as we've discussed before, it's not like modern processors are actually executing the X86 instructions you're sending them.

Matt Godbolt

23:48

Exactly!

Ben Rady

23:50

Yeah. So, so a lot of this is just about abstraction and the sort of, um, the cost of abstraction, whether it's the processor, that's not actually executing those instructions or the JVM doing things that you don't fully understand, or the VMware virtual machine, that's doing things to the network that you don't understand either. Yeah. Um, every time you add these sort of layers of abstraction, it has at least the possibility of introducing behavior that you don't understand. And there's sort of the old joke about, you know, the difference between scientists and engineers is when scientists are surprised, they're happy and when engineers are surprised, they're sad. Um, and so, you know.

Matt Godbolt

24:29

I've never heard that before. That's brilliant.

Ben Rady

24:32

Yeah. It's a thing. So, so like as engineers, you know, when we see something and we're like, I don't understand why that just did that. It's like, oh, well, crap, this is gonna be my whole week now.

Matt Godbolt

24:40

I'm not going home on time today. Yeah. I'm now gonna have three people crowded around my monitor all staring. Like, how can this possibly be?

Ben Rady

24:46

Right. Exactly. So the more of those layers you have, the more likely it is that you do that. Now, if you want to use the silicon rocks we've tricked into thinking as the basis for your system, you're gonna have that x86, you know, microcode thing, like that's happening, but you don't have to add other ones to it if you don't want to. Um, and so in the situation of, you know, we have people who are gonna be working on the system who are not software engineers, you know, we're gonna have to add some layers of abstraction for them. And, you know, the question is, what do those layers really look like? And are there people that can understand the layers beneath, so that when they sort of come to us and they say like, Hey, so I ran this thing and it didn't do what I expected, somebody can peel back the covers and sort of explain, okay, well, here's why it did this. And here's why it did that. Right. Um, but a lot of that also sort of depends on, you know, like developer or, you know, basically like workflow ergonomics.

Matt Godbolt

25:47

Right. That's what I was gonna get to, like this layering and abstraction. Obviously you add the right number of layers of abstraction to make your life as easy and as flexible as it needs to be. And when you're down at the physics level, you're almost required to use fewer levels of abstraction, or to use layers of abstraction which ultimately net out, which is kind of the C++ way. And indeed the JVM's ability to, you know, in terms of software layers like interfaces and whatever, eventually they can all boil down to, yeah, look, we know what you're doing, you're just reading a byte here. Right. Good stuff. Um, but then another sort of, uh, aspect of this is the various people on the team. I think we've talked about this before as well.

26:29

The team dynamics are almost the most important thing. It's like, how high velocity can we have our teams be? How happy are they at working? How comfortable are they with their tool sets? And when you then add in the folks we're talking about who have more of a maths background, how confident are we that we can give them, uh, something which doesn't turn out to be a loaded, uh, RPG pointing at their own head, which is kind of a C++ thing. You know, I maintain that it would be perfectly feasible with enough time and engineering effort to hand a sort of gilded environment of like, put your code here in C++, to non C++ folks, and mostly not have them shoot themselves in the face, but that would be a huge undertaking, I believe, compared to having essentially language level

27:21

safety nets that something like Java gives them, or, you know, a lot of the rest of the world in our industry uses Python for this. Because again, there are a number of ways you can blow yourself up with Python, but almost none of 'em involve strange memory leaks or, you know, crashing with threading behavior and all that kind of stuff. Right. So a lot of the decision process is around, how can we give an ergonomic, um, uh, environment to the researchers such that they can do their job really well, and we can still be performant enough.

Ben Rady

27:55

Right? Yeah. The performant enough is, is, uh, one of the places where this gets tricky. I think part of that problem is hard, no matter what language you choose for this. Right. Um, you know, I, I think the, the, you have to really think about like the, the sort of the platform and the structure that you're giving them. And it is gonna be like language specific. Like obviously in Java, like to an extent you don't have to worry about memory. Uh, well, you don't have to worry about memory freeing, uh, right. You have to worry about memory allocation, because if you allocate too much of it, you're gonna burn all your CPU time in the garbage collector, and then nothing good is gonna happen from that.

Matt Godbolt

28:37

But, that is a feature as well. That is a feature, right. That is part of that is a feature as like, well, okay, we don't have to worry about this and that's a good thing.

Ben Rady

28:45

Right. Right. And I mean, a lot of the stuff that you, you do get in, uh, well designed C++ I can see ways where you could structure things where they were very difficult to use incorrectly. Whereas, um, one of the things that I have certainly seen with just not even, not necessarily researchers, but novice software engineers, is they start to develop superstitions around garbage collection and the JVM about like, oh yeah, you shouldn't use this because it creates too much garbage or, you know, stuff like that. And like,

Matt Godbolt

29:18

I've seen, you know, folks using, like, uh, patterns from, like, the late nineties about always creating string builders in Java and doing all this horrible concat concat concat concat type stuff, rather than just writing out the obvious stuff. And like, since about Java 10, well, now I can't remember what it was, but like, the compiler, the Java compiler, has been able to do the thing for you. Right. And yes, you know, again, there's those superstitions, and it's because it's not easy to measure the effects of it. You end up doing these kinds of things. I mean, and like, that's not unique to Java folks. Let's be absolutely clear: the number of people that will, like, pack structures in weird ways in C, because they think, well, this will be better. And you're like, well, is it right? Yeah. Yeah. Yeah.
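The string-builder superstition can be illustrated with a small Java sketch. The class and method names here (`ConcatSketch`, `describePlain`, `describeWithBuilder`) are hypothetical and not from the discussion. For a single concatenation expression, `javac` has long generated the builder code itself, and since JDK 9 it emits an `invokedynamic`-based concatenation (JEP 280), so the hand-written builder buys nothing in this case.

```java
final class ConcatSketch {
    // The "obvious" version: a single concatenation expression.
    // The compiler turns this into efficient concatenation on its own.
    static String describePlain(String symbol, int quantity, double price) {
        return "order " + symbol + " x" + quantity + " @ " + price;
    }

    // The hand-rolled version that the superstition asks for.
    // It produces exactly the same string, with more noise.
    static String describeWithBuilder(String symbol, int quantity, double price) {
        StringBuilder sb = new StringBuilder();
        sb.append("order ").append(symbol)
          .append(" x").append(quantity)
          .append(" @ ").append(price);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describePlain("ABC", 100, 10.5));
        System.out.println(describeWithBuilder("ABC", 100, 10.5));
    }
}
```

An explicit `StringBuilder` can still be the better choice when a string is built up across loop iterations with `+=`, which is one reason the habit persists; the point is to measure before adopting it everywhere.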

Ben Rady

30:01

So, yeah. So I think a lot of this actually comes down to some of, you know, the things that we talked about in an earlier episode about performance testing, and like, having sane and intersubjective ways to do performance testing where, like, when someone, whoever they are, when someone introduces a, a performance problem, whether it's speed or memory or both, um, there are ways to measure it. And there are ways to talk about it that are, um, not, you know, "the system doesn't feel crisp today."

Matt Godbolt

30:32

Oh, my word that is such an in joke.

Ben Rady

30:37

Do you wanna?

Matt Godbolt

30:38

We, should we explain? I think we have to explain that.

Ben Rady

30:41

I, yes, I, I have only heard this second hand. I was not there when this was uttered. So I, I...

Matt Godbolt

30:45

It's, I have a sort of, alright, I'll give an abridged version where, uh, if you are the operator of a trading system, you basically turn it on. You can adjust some parameters and you're watching the buys and sells go in, and that's all your interaction with it, and hopefully you notice that you make money at the end of the day, or during the day as well. And famously, one of our, our desk heads, we would make a, a change, and then he would describe the feelings that he got turning the system on and observing the trades with, like, literally no information, really, at all, but going, "Oh, it seems really, you know, crisp. It's, like, a good trade. Whatever changes you made, made this better." You know, like, like, like it was some kind of, I don't know, air freshener variant. I, it seemed very, very strange to me.

31:29

And in, in fairness, in the games industry, we used to have similar things, you know, oftentimes there wouldn't have been any obvious change that would've made it faster, slower or anything. But yeah. I remember, um, a, a colleague of mine in the games industry saying that, um, that they would add in all these extra variables that weren't wired into stuff to then hand to the, like, game designers. And they would be, like, tweaking the boss fight by moving these sliders around, going, like, "Yeah, this is, it needs, you know, 78, um, you know, secondary aggression," which literally did nothing, but it was like a placebo thing to make them feel better. And they would come up with similar words to describe how it would feel, but, you know, that's actually unkind. I don't want, I don't want people to think that's okay. You know, don't, don't pull the wool over folks' eyes.

Ben Rady

32:11

It is funny though.

Matt Godbolt

32:13

It's funny.

Ben Rady

32:14

It's unkind, but it is funny. Uh, um, so yeah, I mean, I think, I think it, it's a hard problem. Working in this environment where you're gonna support people that are gonna write code, but they're not professionals at it, they don't wanna be professionals at it, they have another profession that they're professionals at, um, is, is a challenge, no matter what tools you use. Um, and, and.

Matt Godbolt

32:36

Right, for certain.

Ben Rady

32:36

And the tools that you choose will inform the kinds of problems that you have to solve for them. Um, but you know, the premise that we have going in this project right now at least, is that we think Java will be a better environment to support those, those researchers on the code that they write. Um, but it's certainly not the case that the entire system I feel

Matt Godbolt

33:00

Right. I think that's something we glossed over, I think, you know, a bit earlier, is that, you know, we, one of the, the layers of abstraction, such as we described it, is, is less of an abstraction layer and more of a physical layer between a part of the system which can run in pure native and keep up with the network card, and then have a way of communicating to the rest of the system through an elastic enough buffer to handle the fact that maybe from time to time the other part of the system isn't quite keeping up, but it doesn't matter, because the packets are being dequeued and put somewhere where people are getting, and they've been processed enough to make it straightforward for, for the downstream, uh, um, consumer to, to handle too. So that, that sort of seems to be the right compromise, where we are putting the native engineering effort in the place where it's required.
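The "elastic enough buffer" idea can be sketched in Java with a bounded queue between a fast producer and a slower consumer. This is only an in-process stand-in under stated assumptions: the system described above has a native producer and an unspecified transport, and the names `ElasticBufferSketch`, `pumpThrough`, and `END_OF_STREAM` are hypothetical. Note also that `put` here blocks when the buffer is full, whereas a real feed handler would need its own policy for that case.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class ElasticBufferSketch {
    // Hypothetical sentinel telling the consumer that no more messages follow.
    private static final long END_OF_STREAM = -1L;

    // Pushes messageCount messages (0, 1, 2, ...) through a bounded buffer
    // and returns the sum of everything the consumer received.
    static long pumpThrough(int messageCount, int capacity) {
        BlockingQueue<Long> buffer = new ArrayBlockingQueue<>(capacity);

        // Producer: stands in for the part that keeps up with the network.
        Thread producer = new Thread(() -> {
            try {
                for (long i = 0; i < messageCount; i++) {
                    buffer.put(i); // waits only while the buffer is full
                }
                buffer.put(END_OF_STREAM);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // Consumer: stands in for the downstream code that may lag at times.
        long total = 0;
        try {
            while (true) {
                long message = buffer.take(); // waits while the buffer is empty
                if (message == END_OF_STREAM) {
                    break;
                }
                total += message;
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining buffer", e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(pumpThrough(1000, 16));
    }
}
```

The capacity is what makes the buffer elastic: short bursts where the consumer falls behind are absorbed by the queue, and nothing is lost as long as the consumer catches up before the buffer fills.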

33:57

We, we are keeping our options open in the other world. There could be, be places where we could still use native aspects or other languages where, but like for now, the rest of the system might as well be in a language where out of the gate, we can get folks in and they can start writing code and be productive without necessarily being aware of all the foot guns that C++ brings. And it still has a decent chance of being performant. I mean, like we've said, Java can be very, very performant. Sometimes it can beat C++. So it's a, it's a sort of mix and match approach I think is what yeah. That we've come to here.

Ben Rady

34:33

Yeah. And I mean, there is a certain additional strength in that diversity of, you know, the more of these platforms that you have, the greater access to sort of tools and libraries and things that you have. Right. If we find some library that solves a particularly important problem for us, um, and it's a native library, we don't have to pull any, like, JNI shenanigans to shoehorn it into our Java system. We can just build some native code around it and have that be a major component of the system, and that will work great. Um, and cuz, you know, I mean, there, there's nothing more disappointing to me than when I'm working in Java and I, I clone some project and I've got the JDK installed, or the makefile automatically installs it like it would usually do, and I go to run it and it's like, oh yeah, you haven't installed this apt package. And I'm just like, WHYYYYYY?

Matt Godbolt

35:22

No,

Ben Rady

35:23

Why? Yeah. It's just, it's the worst.

Matt Godbolt

35:28

It's, it's a sad, unfortunate, yeah, situation that many things that do need native stuff just require you to apt install it as, like, a global somewhere and sudo this. You're like, no, there's, we shouldn't have to sudo into anything these days.

35:44

So yeah. Building, building our own, um, native components allows us to, to put in, certainly if they're native, um... and that's been a, a mixture, even in the, the trading systems that I mentioned before, at the previous companies, that were mostly written in Java. Ultimately there was usually a little thing at the very end that was native, and it was either shared memory communications or literally Unix pipes or, or actual JNI. And that would, you know, the, the things that needed it, you could target and say, well, this is the thing that generates tons of garbage, so why don't we just have the native code do the work and then net it out to the answer, and there's no garbage then. Right. You just,

Ben Rady

36:23

Right. Right. Yeah. So I mean, at least for this project, for the constraints that we have and for the people that we have, both researchers and engineers, uh, the, the answer to the question of "C++ or Java?" in my mind is "yes."

Matt Godbolt

36:39

Right.

Ben Rady

36:41

And, and I think that, you know, as, as the project evolves, it will be easy to see, you know, maybe the proportions on each of either of those, but I can't imagine doing this project where that would be purely one way,

Matt Godbolt

36:50

One or the other. Yeah. Yeah. That makes sense to me too. So yeah. I, I dunno what, um, what our listener will think of this episode. It's been an interesting one because, you know, this has been the names-have-been-changed-to-protect-the-innocent version of, uh, a discussion we've been having for a month probably, yeah, or so, and this is kind of the, the outcome we've come to. And I think it's a great compromise between all of the various pullings in this Hellige middle that we find ourselves in. So hopefully it's been interesting for those listening along, and we haven't gone around in too many circles, as we have been doing in our own discussions about this.

Ben Rady

37:29

Yeah. This is the condensed version of this.

Matt Godbolt

37:31

This is, yeah, the, the 35-minute version of what has been a month of conversation about this kind of stuff and trying to find and justify and prove to ourselves that this is the right path to go down.

Ben Rady

37:43

Yeah. Yeah.

Matt Godbolt

37:44

Well, I guess we should, uh, leave it at that. I mean, we haven't really touched much on the, some of the really cool, specific things that Java can do inside the JVM and some of the JIT tricks it can pull, uh, but you know, maybe we should leave that for another episode or if you can find somebody.

Ben Rady

37:57

It sounds like a great other episode

Matt Godbolt

37:59

But, but we, for now we're sort of relying on all that kind of stuff and going, it will definitely be fast enough by a long way. And, and, uh, and the, the native code will be applied where it's, where it's required. And maybe in other places where it will be useful to have it too.

Ben Rady

38:13

Yeah. Who knows, maybe we'll revisit this in six months and be like, wow, this was a bad decision. And let's explain why.

Matt Godbolt

38:18

Yeah. Well, if that does happen, you'll hear it here.

Ben Rady

38:21

Yeah, that's right. That's right.

Matt Godbolt

38:23

Cool. All right. Until next time, then, my friend.

Ben Rady

38:25

All right.

Transcript source: Provided by creator in RSS feed