Virtual Infrastructure

Jul 15, 2022 · 52 min

Episode description

Ben and Matt compare container technologies like Docker to virtual machines, and discuss the tradeoffs when deploying applications. Matt explains the scary things that can happen when you share a VM with strangers. A visitor enters through the couch.

Transcript

Matt Godbolt

Hey Ben,

Ben Rady

Hey Matt,

Matt Godbolt

How you doing?

Ben Rady

Very good.

Matt Godbolt

Excellent. Well, we don't normally refer to, uh, things that have happened in the news because that gives us a certain flexibility in the order that we release these recordings. But you and I were literally just talking about the fact that Broadcom has bought VMware and we were gonna talk about some level of containers versus not containers versus virtualization versus whatever. And it seems like we should, we should bring that up. So, let's talk! What do you...

Ben Rady

Seems like a good topic.

Matt Godbolt

Right? Exactly. It's such a deep one. And you know, we've got varying levels of experience in different technologies for essentially what is, how do I make sure my software works in the environment that I'm expecting it to? And I'm, I'm thinking personally from this point of view, like a developer who deploys server type applications, headless applications that run on machines in the cloud or in data centers and whatever like that. But I guess actually now I'm saying that I was sort of giving that so that, you know, I can't say, um, too much about how UI stuff is developed, but then there's a number of software packages I see for Linux these days that come as a pack file or a, um, uh, what are those things called? There's, there's a, there's a bunch of different snaps there snapshots, which are essentially, here's a, here's a whole operating system's worth of application wrapped into a file system and then presented as if it's like a, a single thing. And it's like a Docker container. Right. But, but differently. So, so it, it it's, it is everywhere. And I think we all want software, that's easy to deploy and run, but there's a number of ways of achieving it. What are your thoughts?

Ben Rady

Uh, well, I mean, I think it's interesting that I think you and I have a sort of a similar perspective in that we look at those tools that way. And if we asked somebody who, you know, was a little bit more focused on infrastructure, they would probably tell us something similar, but definitely not the same in terms of like being able to take, um, you know, a fixed amount of hardware that they have provisioned and paid for and, you know, have, uh, prepurchased the electricity for, and have backup batteries for, and have networking for and say, okay, well, how do I take this sort of fixed resource and allocate it out to all of the, uh, needy, greedy software engineers who keep telling me that they want more servers? Um, well, I gotta have a server for this app and I have a server for this app and a server for this app.

Um, and so, you know, I think from their perspective, they might see some of these virtualization tools as a way to, um, you know, manage those resources more effectively and have, um, a little bit more control over not only just the resources themselves in terms of memory and compute, but also those, the, the sort of blast radius, if one of them, you know, goes horribly wrong. Right? Like, you know, being able to, uh, you know, wipe an image and, and give someone a fresh new server, uh, with a few clicks, a button is way easier than driving down to the data center and unracking a machine that is no longer responsive, um, because somebody did something terrible to it. So...

Matt Godbolt

Never, I dunno what you're talking about.

Ben Rady

I've never done that.

Matt Godbolt

Never done that.

Ben Rady

Uh.

Matt Godbolt

Oops. I just fork bombed my own machine

So you're right. Actually, that's a very valid point. Um, those infrastructural things are super important and it's sort of a, a funny thing. We, I was talking about it at lunch with a bunch of folks the other day, and, um, regaling with a story from, from my past and a friend of mine who used to work at, uh, uh, a big airline and those folks are still using mainframes and mainframes have always been able to do all the things that we are now kind of starting to rediscover in that virtualization world. Right. You like, Hey, you want more CPUs? Yeah. We can bolt more CPUs on while the mainframe's still running. Hey, you wanna shut down the mainframe, do maintenance and bring it back up again. Sure. What we can do is we can teleport the mainframe's image up to a backup site. Nobody even notices that your connection is now going to Manchester instead of London.

Um, and your terminals keep on responding. Everyone's still doing their re requests and the batch jobs are still going. Meanwhile, they power down the main machine, fix the Ram, and then you can teleport it back. And those things have been around for, you know, half a century. And yet we are rediscovering them in terms of what, I mean, specifically you were mentioning things, uh, like, like VMware, um, allow you to manage the resources, really fine grained and make those kinds of like, Hey, we need to move from one machine to another machine. I it's, it's sort of miraculous. Yeah. That, that it works as well as it does.

Ben Rady

Yes, yes, yes. Yeah. Cuz those mainframes were clearly designed with those specific use cases in mind, right.

Matt Godbolt

Hardware capabilities to do those things.

Ben Rady

Right, right. They built those things from the ground up with like, okay, we're gonna be able to do this offsite backup and we're gonna make sure that it all works. And with all these other things, we sort of backed our way into it because it's like, clearly there's a need to do that. But you know, the old school operating systems and CPU architectures and all these other things that we have, uh, maybe someone gave that a thought a long time ago, but they certainly didn't design the whole ecosystem from the ground up to be able to do that. And so now we're sort of, um, in this state where it's like, that need is still there. The desire is still there and it's this sort of tricky problem of, okay, well how do you actually do that?

Matt Godbolt

And yeah, folks like, uh, VMware have got their solution for it. There are obviously other vendors that can do it. And, and of course, I mean, one should, should note as well that the, the chip manufacturers have been slowly heading this way too, adding more and more like hardware level virtualization things. Cause you know, like we've always been able to do these things. It's like my, my hobby of writing little, uh, emulators for old machines, once you can fully emulate something, of course it's state is just a bunch of numbers that you've got, you can move that around anywhere you like, and then kind of carry on somewhere else and have, you know, your single step through each frame of a game and then go backwards because you could just emulate from a snapshot, one fewer frame forward and keep, you know, that kind of stuff. So this has always been possible, but it was just infeasible to do it without actually running the same CPU as you are, you are trying to virtualize and, and the same hardware, but then things have come along. But I feel like we're going off base from where we, where I was thinking are going

Ben Rady

Yeah. Yeah, no, I mean, I, I think these, all these things are all are all kind of related, you know, you can, um, and maybe I, I, I don't think we should necessarily dive into this at the start, but one place that you could maybe take, this is like this isn't just about virtual computers. It's also about virtual networking equipment, right? Like if you look at, you know, some of the tools that are out there, it's like, yeah, you, you think that this IP address is a switch, but it's, it's not like

Matt Godbolt

I mean, one only imagines what's going on in, like, the AWSes and the Google Cloud infrastructure in terms of their physical network separation and their ability to, as you say, make it look like you have your own cloud to yourself, knowing that actually no, those fibers are the same fibers that everyone else is using between all of the racks. It's all magic.

Ben Rady

Right, right. Yeah. Um, but yeah, talking specifically about the part of this that is, you know, I, as a software developer, as somebody who's sort of building a total application, you wanna be able to deploy it. That's really important. You want to be able to connect to the machine that's running it and troubleshoot it and read the logs and run tcpdump and netstat and all the other wonderful tools that we talked about in some prior podcast. And still have the flexibility of the things that we were talking about in terms of making maximal use of those resources and being able to stand it up and tear it down and being able to build the definition of what that system is in a configuration file, rather than, you know, in PCPartPicker.

Matt Godbolt

Here's a checklist that Barry has to go down and make sure they all look the same. Right?

Ben Rady

Yeah, yeah, yeah, yeah. Right, right. Um, and so, you know, there's, there's, there's lots of, lots of different tools to do this, but I think they all sort of serve the same needs. So what are some of the tools that you've, that you've actually used in anger to do this?

Matt Godbolt

Well, I mean, the main one that springs to mind, other than bespoke ones that I guess became Kubernetes when I was at Google, there were some things that became sort of like strange containery type things. I dunno if that's exactly the same thing, now I think out loud, but the one I have the most experience with is Docker. And Docker is a great solution to the "I want to have a reproducible environment that's incrementally built with layers" problem. And so it's relatively efficient if you are only changing the end layers. And you can definitely have that kind of feeling of like, well, if I have a Docker image that I can give to you and you are gonna run it, then I am 99 point, you know, six nines positive that if it worked on my machine, it'll work on your machine, because what I in fact did was ship my machine to you.

And now you are running my machine, which is a blessing and a curse. And I think that's the problem, right? It can be misused, like any technology. Um, my experiences with Docker: so Compiler Explorer started out, actually it didn't start out with anything. It just started out with a shell script running Node.js on a bare machine, and then very quickly it was like, how am I gonna manage this? So I decided to use Docker, rather sensibly, at the time. And Docker served us well for many, many years. Um, Docker did not scale with the gigabytes and gigabytes, you know, hundreds of gigabytes of compilers that I wanted to build into the image. The images took longer and longer to build every time. We're gonna take a pause there where my wife comes in through,

Ben Rady

Through the couch?

Matt Godbolt

Through the back. Yeah. Through the couch,

Ben Rady

How have I never realized that there's a door behind your couch?

Matt Godbolt

How else do you get between, um, places? You know, we put Floo powder in and then we can go anywhere to any other couch.

Ben Rady

Diagon Alley. That's how, that's how this works?

Matt Godbolt

That is exactly how this works.

A very nice lady

Sorry.

Matt Godbolt

It's okay.

A very nice lady

The back door's jammed.

Matt Godbolt

The back door's jammed. Okay. The back door's.

Ben Rady

Oh, no. So you had to come in through the couch door.

Matt Godbolt

Come in through the couch door. All right. There goes my dog. And then we're gonna have to try and remember what I was saying and work something out, or just pretend it didn't happen and just put this in and, you know,

Ben Rady

Oh yeah.

Matt Godbolt

To do Barry.

Ben Rady

I say Steve, for some reason. I don't know why that is Steve that's Steve. Yeah.

Matt Godbolt

So we were talking about Docker.

Ben Rady

Yeah. You're talking about using Docker in, in, uh, Compiler Explorer.

Matt Godbolt

That's right. So the problem with bigger and bigger images is that, no matter how you cut it, you are uploading layers upon layers upon layers of a piece of software with more and more compilers. And it was just getting unwieldy, and there are definitely tricks you could do with volumes and other things like that. And we looked at them for a while, but ultimately we backed out when we realized we needed more security than Docker would give us. At the time, there were some relatively high privilege exploits for breaking out of Docker containers into the wider world. And we were kind of tacitly relying on Docker to also be a sort of protection domain. And the other thing is that if you're running inside that container, even if you don't get privilege escalation outside that container, that container is long lived.

So if you're, like, servicing somebody's request and it was a poison request, and it was able to monkey with the system, it's now monkeyed with that running Docker container. And so it's gonna be there until we restart the machine or restart the Docker container. So there were some properties that we wanted to get in terms of jailing that we didn't have. And once you're in one container, you can't have a container inside a container inside a container arbitrarily. At least at the time you couldn't. So we switched out to a different approach where we just have tarballs and run them on the operating system, but it did serve a need for a long time. And a frequent question we get asked is, "Hey, do you publish a Docker container of a Compiler Explorer instance that I can just get started with?" Because people do just want to do docker run blah, and you get that benefit. It just works. Yeah. Um, we have different ways of achieving that, I think. But anyway, that's my experience with Docker. I also have used it at a number of places at work, and I think it works great if you plan your Docker image layout very carefully and the layers are sensible and well managed. And yeah.

Ben Rady

So when you say a Docker layer, what do you, what do you mean what's a layer?

Matt Godbolt

So Docker logically is, um, a file system, a whole operating system's file system. It literally untars, for want of a better explanation, into a bunch of temporary directories and then overlays each directory, one directory over the other. So you start with a base image, which is maybe, you know, like your entire Ubuntu distribution. And then you're like, oh, the first thing I'm gonna do is I'm gonna install these 20 apt packages that I need. And so the next layer will be another file system that only contains the things that change between the base system and the system where you ran sudo apt install of my hundred packages, my extra packages that I needed. And then the next layer might be, oh, and now I'm gonna copy some files from my git repo that I'm running it in into the container at a particular location.

And that's another layer of the file system that only contains those copied files, and then so on and so forth; each layer would add in more bits of the software and configuration. And the cool thing, of course, is that you only need to regenerate layers that changed, and of course the layers that come after them. So if I change, for example, the base Ubuntu image, of course, everything depends upon that. So I'll have to rerun the commands that populated the later layers and create new layers. But if I'm just changing my application software and I don't change my dependencies and my system dependencies, then oftentimes it's only that last layer of a few hundred kilobytes or so that changes. And so not only is the build time faster, but the way that Docker distributes images is as compressed layers. And very often, of course, if you're upgrading software time and time again, those base layers are already on the system, in the cache somewhere. And the only thing that you need to do is upload the few hundred K each time, which is fabulous. Yeah. So that's a really good way of having a sort of incremental deployment of your software.
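A minimal sketch of the layering idea described above, written in Python rather than taken from Docker's actual implementation. Each layer is modelled as a plain dictionary from path to contents, later layers win when paths collide, and only layers the other side does not already hold need uploading. The layer names (`base`, `pkgs`, `app-v1`, `app-v2`) and file paths are made up for illustration.

```python
from typing import Dict, List, Set

# Simplified model (an assumption for illustration): a layer is just a
# mapping from file path to file contents.
Layer = Dict[str, str]


def flatten(layers: List[Layer]) -> Dict[str, str]:
    """Overlay layers in order; a later layer wins for the same path."""
    merged: Dict[str, str] = {}
    for layer in layers:
        merged.update(layer)
    return merged


def layers_to_upload(local_ids: List[str], remote_ids: Set[str]) -> List[str]:
    """Return only the layer IDs the remote side does not already hold."""
    return [layer_id for layer_id in local_ids if layer_id not in remote_ids]


base = {"/etc/os-release": "Ubuntu 20.04", "/bin/sh": "shell"}
packages = {"/usr/bin/node": "node binary"}
app_v2 = {"/app/server.js": "v2"}

image = flatten([base, packages, app_v2])
print(image["/app/server.js"])  # v2

# The remote already has the base and package layers from the last deploy,
# so only the small application layer needs to travel.
print(layers_to_upload(["base", "pkgs", "app-v2"], {"base", "pkgs", "app-v1"}))
```

The second print shows the incremental-deployment effect: only `app-v2` is listed for upload.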

Ben Rady

Yeah. So what happens if I have like a, a layer that's like fetched the latest version of this thing from the internet?

Matt Godbolt

Well, that is an excellent question. And that is one of the biggest problems with something like Docker: Docker caches based on the text contents of the command that's run. Oh, okay. So if you just say curl, get me the latest version of something, pipe through tar -zxf or whatever to extract it, then that command will run exactly once on your machine, when it populates that layer. And then if you run again, having changed the contents on the website that you're curling from, or like a new version of the software is released and the URL doesn't encode that in some way, you know, you're getting, you know, like, right.

Ben Rady

You're getting latest or whatever,

Matt Godbolt

Bob dot latest exactly. Right? Yeah. Then you won't see that, but unfortunately, anyone who later builds with your Docker container will see that change.

And so these things will not necessarily agree. And so it's really important, and it's so easy not to get this right, that if you are fetching external resources, you get a specifically named version of everything that you want, for two reasons. One, it means that you get reproducibility if someone else grabs your Dockerfile and just says, build me this please. And the second thing is that necessarily, if you want to change that image, you have to edit the URL that encodes the git SHA or the version number or whatever, which means that it will be rebuilt automatically. But it's hard to do that right. And it's hard to make sure you apply that everywhere. Even things like the base image itself, you know, oftentimes you say in the Dockerfile, hey, I'd like to build something based on Ubuntu 20.04.

That's essentially what you say: you say FROM ubuntu:20.04 or FROM ubuntu:latest or something like that. And those are kind of like a git pull of whatever someone has tagged as being the 20.04 for Ubuntu. If you really, really want to make sure you get reproducible builds, you need to put the SHA hash of that particular layer in the FROM command as well. So that, you know, you're always gonna start with the same version. And of course there's a duality there, right? From my mindset, it's great to have a totally reproducible build. And that means that I can hand you a Dockerfile, not the contents of the Docker image, right, that's different. But if I hand you just the text that says, this is how to build my world, you will get the same answer that I got every time.

And that's really powerful, but it's super inconvenient because every time some little trivial fix in the base image is pushed, you know, a security patch or whatever, then I have to think to go back and change the SHA to be the latest one, and that kind of thing, if I want to keep those things going. And of course the first thing you're gonna do, this is almost always the first line after the FROM ubuntu, is sudo... not sudo, cause you're running as root. It's apt-get update and, sorry, upgrade, right? Because you want to pull in all of the things that are latest. There's no kind of version for that. There's no bi-temporality to that. So you're a bit stuck at that point. Um, and that factors into some of the problems that one has with something like Docker. It's a boon, but you have to be really careful how you use it and you have to understand these slightly sharp edges. And maybe most people don't care about those, but I know that it's affected us before. And you and I are definitely in an industry where we really want to be able to reproduce what we did before and understand it.
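A small Python sketch of the caching behaviour discussed above, under the simplifying assumption that a layer's cache key depends only on the parent layer's key and the text of the instruction. Real Docker considers more than this (for example, checksums of copied files), so treat this as a model rather than the actual algorithm. The `example.com` URLs and the `tool` archive names are hypothetical.

```python
import hashlib
from typing import List, Optional


def layer_key(parent_key: Optional[str], instruction: str) -> str:
    """Key a layer on its parent's key plus the instruction text only."""
    digest = hashlib.sha256()
    digest.update((parent_key or "").encode("utf-8"))
    digest.update(b"\0")
    digest.update(instruction.encode("utf-8"))
    return digest.hexdigest()


def build_keys(instructions: List[str]) -> List[str]:
    """Chain the keys so a change invalidates every later layer too."""
    keys: List[str] = []
    parent: Optional[str] = None
    for instruction in instructions:
        parent = layer_key(parent, instruction)
        keys.append(parent)
    return keys


# Unpinned: the text never changes, so the key never changes, even if the
# file behind the URL does. The cached layer is silently reused.
unpinned = [
    "FROM ubuntu:20.04",
    "RUN curl -sL https://example.com/tool-latest.tar.gz | tar -zxf -",
    "COPY . /app",
]
print(build_keys(unpinned) == build_keys(list(unpinned)))  # True

# Pinned: bumping the version edits the text, which changes that layer's
# key and the key of everything after it.
pinned_v1 = [unpinned[0], "RUN curl -sL https://example.com/tool-1.0.tar.gz | tar -zxf -", unpinned[2]]
pinned_v2 = [unpinned[0], "RUN curl -sL https://example.com/tool-1.1.tar.gz | tar -zxf -", unpinned[2]]
changed = [a != b for a, b in zip(build_keys(pinned_v1), build_keys(pinned_v2))]
print(changed)  # [False, True, True]
```

The same reasoning applies to the base image line: a tag such as `ubuntu:20.04` is stable text that can point at different contents over time, which is why pinning to a digest trades convenience for reproducibility.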

It's also very easy to generate gigantic layers if you don't design your Dockerfile correctly. You know, so in the example I just gave of apt update, apt upgrade, apt install, right? Those are like sensible commands I might type myself if I had a fresh new computer that you handed me,

Ben Rady

Right.

Matt Godbolt

The simple thing to do would be to run them as three separate layers. And that makes a lot of sense, but I've pulled down a whole bunch of stuff, and there's a load of temporary files that get pulled into the apt directory that I probably don't need in my production image. Um, I've then updated a whole bunch of stuff, which has replaced a bunch of stuff. And then I'm like maybe installing my own packages. And maybe I remove some system packages that I don't want. Right. And so I've got three or four layers, each of which is strictly additive. And then sometimes you have to delete files, so you might be tempted at the end of that to go: and the last thing I do is rm -rf /var/apt/cache, right?

Kill the cache. I don't want it anymore. It's like gigabytes of all the intermediate crap that was downloaded while I was installing my packages. But if you put it as a separate step, unfortunately those intermediate files already exist in a layer, and that delete can't remove them from the layer. It just marks them as being, you can't see them anymore. It puts tombstones in there. And so your overall size, the number of bytes you need to ship around, still contains the layer that has all of those files in it. And then a separate layer that says, and by the way, all those files are gone now.

Ben Rady

Right, right.

Matt Godbolt

So you have to be really careful. So what people end up doing is writing a long stanza of, like, apt-get update and whatnot as one giant long single bash command. And at the very end of that, rm -rf /var/apt/cache and dpkg dash dash, you know, purge caches, all the things, as one thing. So atomically all those things happen. And then it's just the end result that gets shipped as the layer.
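The tombstone effect can be sketched with a toy model in Python. This is an illustration under assumed numbers, not Docker's storage format: each layer maps a path to a size in megabytes, and `None` stands for a "this file is gone now" marker. The paths and sizes are invented.

```python
from typing import Dict, List, Optional

# A layer maps a path to its size in MB; None is a tombstone saying the
# path was deleted. Paths and sizes are made up for illustration.
Layer = Dict[str, Optional[int]]


def shipped_mb(layers: List[Layer]) -> int:
    """Megabytes that must be shipped: every layer counts in full."""
    return sum(size for layer in layers for size in layer.values() if size is not None)


def visible_mb(layers: List[Layer]) -> int:
    """Megabytes visible inside the running container after overlaying."""
    merged: Dict[str, int] = {}
    for layer in layers:
        for path, size in layer.items():
            if size is None:
                merged.pop(path, None)  # the tombstone hides the file
            else:
                merged[path] = size
    return sum(merged.values())


# Install in one step, clean up in a later step: the download cache is
# hidden, but the layer that contains it still ships.
separate_steps: List[Layer] = [
    {"/usr/bin/tool": 20, "/cache/tool.deb": 900},
    {"/cache/tool.deb": None},
]

# Install and clean up in a single step: the cache never lands in a layer.
single_step: List[Layer] = [{"/usr/bin/tool": 20}]

print(shipped_mb(separate_steps), visible_mb(separate_steps))  # 920 20
print(shipped_mb(single_step), visible_mb(single_step))        # 20 20
```

Both images look identical from the inside, but the first one ships 900 MB of files nobody can see.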

Ben Rady

Yeah. Yeah. And that, I've definitely seen that in Docker files and it's sort of this, like, you know, uh, it just reads as gobbledygook as at the start of the file and you sort of parse it and you sort of figure out what's going on there, but it's, it's not the sort of like clean, you know, one instruction at a time, maybe with a helpful comment as to why you're doing it, um, that you'd want.

Matt Godbolt

You know, you see, sometimes people will write shell scripts that they then copy into the image to run and then delete again afterwards, just because then the shell script is essentially atomic from the point of view of the layers. And, I mean, it could be a tooling thing. It could be just what you get used to. I don't know, but it's easy to get wrong. Yeah. And the thing is that as a developer running locally, you tend not to notice these mistakes because it's necessarily incremental. You've been building on and building on and building on. Right. And then when you docker push for the first time, you discover that you've got several layers of, you know, gigabytes each. And I'm sure you've done this as well, when you've pulled someone else's Docker image and you're like, oh my golly, what on earth is it pulling down?

Ben Rady

Why is this Docker image so big is a game that many have played and few have won.

Matt Godbolt

Right. Right. And I think a lot of the time, people reach for Docker because it's super convenient. Everyone understands it and it does solve a very real need. But I think oftentimes, in my experience with the kind of things that we do at least, um, a tarball of the code that you're gonna run, maybe containing the Node.js binary you wanna run it with, or maybe, you know, cause we are in a luxurious position where we own our machines, they live in a data center, we know which machines they're running on, which are, you know, probably virtual machines as it happens. So that's another layer of virtuality above all of this. Um, but we know a lot of things, like what version of libc is it running? What, you know, base operating system are we running?

What things can I assume are there? Which of course is a dangerous game to play, which Docker kind of makes you address fully. But most of the time you're like, well, okay, if I've got libc this version, I'll just pass along all my dependencies. Right. And it's not that big, you know, for native applications; often it's a few environment variables, and suddenly now all of your DLLs will be looked for inside the directory you ship. And then you just, like, copy them all with you. And that's a bit bigger, but you know, we're talking tens of megabytes of library files here, right? In a little tarball that you extract and that will run on a developer's machine and a remote machine. And I guess the other sort of critical part about Docker is that it requires elevated privileges, which means that there's a lot of monkeying around with which user you are running as, right?
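A hedged sketch of the "few environment variables" trick for shipping shared libraries alongside a native binary in a tarball. It assumes a Linux dynamic loader that honours `LD_LIBRARY_PATH`, and a hypothetical bundle layout with a `lib` directory inside the extracted tarball; neither detail comes from the conversation, and `/opt/myapp` is a made-up path.

```python
import os
from typing import Dict, Mapping, Optional


def bundled_env(bundle_dir: str, base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return an environment whose library search path starts in the bundle.

    Assumes the extracted tarball keeps its shared libraries in
    <bundle_dir>/lib (a hypothetical layout chosen for this example).
    """
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = os.path.join(bundle_dir, "lib")
    existing = env.get("LD_LIBRARY_PATH")
    # Put the bundled libraries first so they win over system copies.
    env["LD_LIBRARY_PATH"] = lib_dir if not existing else lib_dir + os.pathsep + existing
    return env


env = bundled_env("/opt/myapp", {"PATH": "/usr/bin"})
print(env["LD_LIBRARY_PATH"])
```

The resulting dictionary could be passed as the `env` argument to `subprocess.run` when launching the bundled binary, so the same tarball behaves the same way on a developer's machine and on a server with a compatible libc.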

And sometimes that's useful. Sometimes you want a totally unprivileged user that's isolated from the rest of the system. Um, and, you know, in the kind of, what was it, twelve-factor type model, where an application only logs to standard out and only reads and writes to external things through TCP, that's fine. You treat it as like a black box. But very often it's tempting for developers to go, well, it would be really convenient if I could get to this set of files on the network, or if I could write to this log directory. And so you start puncturing the isolation that Docker gives you. And then suddenly you wonder why on earth you've got a hundred files that are owned by the wrong user,

Ben Rady

Right.

Matt Godbolt

Excuse me, there's a truck going past. Um, but you know, you run this command and then you try to delete the files afterwards. And it goes, I'm sorry, I can't delete that. You know, you need to be root. And you're like, wait a second. How are you root?

Ben Rady

Yes. How did you write this as root? And I think it is really an unfortunate thing that the default behavior of Docker is to run as root, cuz it's really easy to sort of fall into a trap of building an application that accidentally, for really no good reason, needs those elevated privileges. Right? Like if you had just been forced to think about it for a minute, you would've been like, oh, well we don't. I mean, the dumbest example I can think of is like we're binding to port 100 instead of, you know, 2000, right? Like there's no reason in the world why that integer matters to anyone. But if you build a whole application, it's like, yeah, there's 30 other apps that connect to port 100, cuz that's the port that we chose. Um, and not realizing that that requires elevated privileges. Um, then you've added a constraint completely by accident.
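The port example can be checked with a short Python sketch. It assumes the common Linux default where ports below 1024 need elevated privileges to bind; that threshold is configurable on some systems, so the constant here is an assumption rather than a guarantee. Note that `can_bind` also returns `False` when a port is simply already in use.

```python
import socket

# Assumption: the common Linux default, where binding below this port
# number needs elevated privileges.
PRIVILEGED_PORT_LIMIT = 1024


def is_privileged_port(port: int) -> bool:
    """True if binding this port would normally need elevated privileges."""
    return 0 < port < PRIVILEGED_PORT_LIMIT


def can_bind(port: int, host: str = "127.0.0.1") -> bool:
    """Try to bind the port as the current user and report the outcome."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((host, port))
        return True
    except OSError:  # covers PermissionError and "address already in use"
        return False


for port in (100, 2000):
    kind = "privileged" if is_privileged_port(port) else "unprivileged"
    outcome = "bind ok" if can_bind(port) else "bind refused"
    print(port, kind, outcome)
```

Run as an ordinary user, the attempt on port 100 would typically be refused while port 2000 succeeds; run as root inside a container, both succeed and the accidental constraint goes unnoticed.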

Matt Godbolt

Right.

Ben Rady

Um, and, and running as a non-privileged user, you'll find that out right away. Um, and there are other things like that too. And I, and I feel like it's almost like the testing thing. Right. And I on brand,

Matt Godbolt

Oh my gosh. Testing you say, tell me more!

Ben Rady

I know I haven't talked about testing in like a podcast and a half, so

Matt Godbolt

I know. All right. You've got...

Ben Rady

It's the, you know, part of the reason you write the test first is to make sure that the resulting solution that you come up with is testable. Right. If you build something and you don't think about tests and then you try to add the tests later, it's really hard. And so most people don't, right? And the reason for that is, well, you came up with a perfectly reasonable solution if you completely ignore this other constraint. Yeah. And then you try to add it in later. Right. And so you're doing kind of the same thing when you run, you know, apps as root in Docker: you've got a constraint that would be nice, but you don't even think about it until it's too late.

Matt Godbolt

It's invisible, which, okay. So I'm gonna take the other side of that, just to sort of speak in defense of a Docker style thing. I know obviously there's many a nuance here, but right. One of the things that Docker gives you kind of out of the gate is deployability, which is another thing that if you don't think about right at the beginning, it's hard to retrofit. We've all seen applications where you're like, well, this is well and good if I can git clone and I've got full access to the internet and then I can run these commands and I've got access to these things and I can do whatever. And you're like, that's great on my developer machine. Again, the loudest truck in the world is now outside my house.

Ben Rady

They're circling, just circling.

Matt Godbolt

They really, no, it's just, he's taunting me. He's reversing it up. This has been the most... I will try and edit some of these things, but I think if you can hear this, dear listener, then I failed to edit the podcast very well. All right. I think they've gone. So, but yeah. Where were we? Um, I was ranting about something

Ben Rady

You were about to defend Docker. It was shocking.

Matt Godbolt

It was. I was defending, no, the deployability is an important thing to not have to retrofit afterwards, and Docker kind of hands you that straight away. You're like, well, docker pull, docker run, amazing. Right. My CI is docker build and docker push. And my run time is docker pull and docker run. And the cool thing is that my developers can run as if they have the CI build, because they can docker pull as well, and then docker run as well. And so it ticks tons of boxes. Right? Yeah. It's so lovely, right, from that point of view. Yeah. Again, until you discover that half of your computer is now owned by root and you don't actually have root privileges on it. And then you're like, well, I'm stuck with these files, I guess.

Ben Rady

Yes. Right. Until you fire up the container and then, uh, rm them from the containers,

Matt Godbolt

Inside the container

Ben Rady

The container has root.

Matt Godbolt

Yes. I mean, a good friend of mine, I will not drop them in it, but a good friend of mine has a one liner that gives you actual root privileges on the machine that you're on, if you have Docker available without sudo. It's a convenient little thing to remember and just clicking it. Oh, that's so.

Ben Rady

Right.

Matt Godbolt

You have Docker, you basically have root. Yeah. Even if you weren't allowed it in the first place.

Ben Rady

If you, and if you live, if you work in one of those horrible environments where they don't let you have sudo on your own machines, which is insane, but they do exist. You can maybe put in a request for Docker instead and get basically the same thing.

Matt Godbolt

Let me just say that this, this, uh, sec, this is a personal opinion that Ben and I hold. Um, don't wanna get anyone in trouble with their security teams. Please don't do anything daft with that information, but it is true. Yeah. And it's great for taunting your infrastructure and SecOps folks, if, uh, you indeed need Docker for whatever. Anyway, that's, that's Docker. Other containment, containment, container solutions. I mean containment

Ben Rady

Containment solutions like from the Ghostbusters.

Matt Godbolt

Like from, yeah. I was actually thinking the same thing. Yeah. The light is green. The trap is clean.

Ben Rady

Uhhuh

Matt Godbolt

Your that's? Well, my virtual machines have all been, um, eight bit, um, if I'm, which makes them considerably easier on some axes, but yeah, so the let's explain a little bit about how Docker is working. So at least Docker on Linux, which is my only experience here. So Linux supports, um, namespacing. That is the ability to make groups and, uh, resource allocations that are kind of contained and have their own namespace away from anyone else running on the system. And now obviously you can think about a user as a sort of namespace, vaguely, but you know, if you type ps as a particular user, or ps aux, you can see all of the other users that are running on the system. In this instance, namespaces can contain off areas of the operating system so that, like, the main operating system can see what's going on.

But if you are inside that namespace, if a process is inside that namespace, it only sees things in its own namespace. And namespaces can be file systems. They can be users. They can be, um, oh, uh, CPUs. And that may be secrets, but there's a number of things. Number of like, um, aspects of the system, which can be compartmentalized and held separate. Um, but you're still running the same operating system. And you're still doing all the things that you were doing before. You're just making a new namespace. So what Docker effectively is doing is making a new namespace, um, creating inside that namespace a bunch of links to the outside world, for things like the terminal, for things like, um, oh yeah. Network is another namespace you can create, and you can make a namespace. You can make a bridge.

That then talks one namespace to another as if it was, uh, one of those network devices that we're talking about. Like a, uh, uh, um, and, and then you're basically running like a regular process, except that if you type ps or if you type ls, you'll only see the world that the container gave you through giving you your own namespace. And it's a bit like, if someone's ever looked at, like, chroot jails, which was like the, the precursor to this, where you could say, Hey, start a new process and pretend that the root directory, like the slash, the top of the hierarchy, is this subfolder I just made. And then you can never see outside of there. And you could imagine that you are effectively in a jail. You can't see outside of there, and your process can run along and, um, and, and be isolated.
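To make the namespace idea a little more concrete, here is a minimal sketch, assuming a Linux system with `/proc` mounted. It lists the namespaces the current process belongs to; the function name `namespaces_of` is made up for this example.

```python
import os


def namespaces_of(pid: str = "self", proc_root: str = "/proc") -> dict[str, str]:
    """Map namespace type (e.g. 'pid', 'net', 'mnt') to its identifier.

    On Linux each entry in /proc/<pid>/ns is a symlink whose target looks
    like 'pid:[4026531836]'. Two processes in the same namespace see the
    same identifier. Returns an empty dict where /proc is not available.
    """
    ns_dir = os.path.join(proc_root, pid, "ns")
    result: dict[str, str] = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result  # not Linux, or no permission to look
    for name in sorted(names):
        try:
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry vanished or is not readable
    return result


for kind, ident in namespaces_of().items():
    print(f"{kind:20} {ident}")
```

Running this once on the host and once inside a container should show different identifiers for whichever namespaces the container was given, even though both processes are served by the same kernel.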

And you can see how you might build like a, a duplicate operating system image in there and then run it. But it's running really on the main operating system. And that has a really interesting side effect. The kernel calls that you are making are going straight to the host operating system's kernel. There is no kernel that you are running inside your Docker container. So if you're running on, um, kernel version five point star, um, and there's some whizbang new feature that's in kernel version six and above, and you've got a Docker image that's Ubuntu 24, whatever, that wants to use that, it ain't gonna work. No amount of Docker magic will make new features appear in your running kernel. Virtualization, on the other hand, takes this down to the hardware level, pretending effectively like you've are, oh God.
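On the shared-kernel point: a process inside a container can check which kernel it is really talking to, and it will be the host's. A small sketch, assuming a Linux-style release string; the helper names and the version threshold are illustrative only.

```python
import platform
import re
from typing import Optional, Tuple


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(required: Tuple[int, int], release: Optional[str] = None) -> bool:
    """True if the running kernel is at least the required (major, minor)."""
    if release is None:
        # Inside a container this still reports the host's kernel,
        # whatever distribution the image claims to be.
        release = platform.release()
    return parse_kernel_release(release) >= required


# An image built for a newer distribution does not bring a newer kernel with it.
print(kernel_at_least((6, 0), release="5.15.0-91-generic"))
```

A check like this at start-up can turn a confusing failure deep inside an application into a clear "this host's kernel is too old" message.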

Now the distractions are a cat hitting the microphone. Uh, at the virtualization level, you are pretending that you have, uh, a CPU and resources, network resources and hardware resources, that don't actually exist. And then a, a, a full-on kernel boots up in that world. And as far as that kernel is concerned, with a few caveats, it thinks that it's running on a real computer, but it's actually running on a simulation of a computer that's running on the real computer. Now it's.

Ben Rady

Kinda like how we're all living in a simulation.

Matt Godbolt

We are all living in a simulation, which explains an awful lot. Yes. But yeah, we're all living in some kind of the matrix and all we're doing is we're putting another matrix in our matrix so that we can run. Yes. Uh, another copy inside of that. So as far as that virtual machine is concerned, it is a full sovereign computer in its own.

Right. And it can do anything. It likes unaware that when it says, Hey, oh, I've got a network device over here what's really happening is that some kind of, um, trap is happening in the CPU when it's accessing or trying to access that device and an operating system, one layer up in the list of, of matrices

Um, and then when you say emulation, you think it's gonna be super slow. And in fact, you know, you could obviously write a genuine emulator and then you would, um, you could, you know, pretend to be an ARM machine when you're running on an x86 or whatever. What typically happens is that, um, uh, these are hardware accelerated. The CPU "knows", in quotes, that there are layers and rungs of the, of the hierarchy of ma, uh, of, of simulation environments. And, um, it gives the hypervisor more privileges than the, um, the operating system underneath. And in fact, mostly nowadays, um, the guest operating systems, as they're called, are in cahoots with the virtualization layer. They actually do know that they are living in a simulation, and that allows certain things to be a lot faster. So instead of actually having to emulate a real network card, and like, as with this sort of two-way back and forth between the hypervisor and the underlying, uh, operating system, there can be some kind of agreed thing of like, Hey, I'd like to talk to the network card.

I'm just gonna pull all the, the data I would like you to look at over here. And then, Hey, hypervisor, imagine that a network, you did the, whatever the network card thing gives you. There's a certain amount of collaboration. I'm making that up, in full disclosure. But yeah, what that means is that when you go to your, uh, Amazon account and say, I'd like a new computer, please, that computer is not a real computer. It is just a virtual computer running on someone else's infrastructure. And you get a certain number of CPUs, which, and a certain number of disk I/Os per second, and all that good stuff. And this then comes back to the VMware thing that you were saying in the beginning. This is why infrastructure folks love it, because I can buy two 128-core, uh, terabyte-RAM machines. And then I can hand them out to as many developers as I'd like in like two or three or four CPU slices, which I can't even buy. I can't buy a two-CPU computer anymore. And they get to share it and they all have root on their machine and there's no way they can bust out of their virtualization environment to get to the hypervisor. But they have, they, and then they can, they can like blue screen, their kernel can panic. The whole thing can go down. It's exactly like a normal computer, except that really it's just one tenth of the physical machine you're running on.
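The arithmetic behind that is simple enough to write down. This sketch assumes no oversubscription and ignores memory and hypervisor overhead, which real deployments would not; the function name is hypothetical.

```python
def vm_slices(hosts: int, cores_per_host: int, cores_per_vm: int) -> int:
    """How many fixed-size VMs fit on the hosts, one physical core per virtual CPU."""
    if cores_per_vm <= 0:
        raise ValueError("cores_per_vm must be positive")
    # A VM cannot straddle two hosts, so divide per host and then multiply.
    return hosts * (cores_per_host // cores_per_vm)


# Matt's example: two 128-core machines handed out in 2-, 3- or 4-CPU slices.
for size in (2, 3, 4):
    print(f"{size}-CPU slices: {vm_slices(2, 128, size)}")
```

With those numbers, two hosts yield 128 two-CPU machines, 84 three-CPU machines or 64 four-CPU machines.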

Ben Rady

Right. Right. So when the annoying developer tells you that they need a server to run their app and you ask what the app is, and they're like, well, this is a Node.js app that runs in one thread. You're like, there's no way on the planet I'm giving you a $10,000 server to run a single-threaded Node.js app. So I'm just gonna give you this one little slice.

Matt Godbolt

And you think it's a server and it has its own operating system, which means obviously there is a, you know, your storage requirements, both in terms of memory and in terms of disk space, go up. Because, you know, like there is a real honest to God Linux kernel running there, and probably on the sibling CPU, like literally on the die, you know, two millimeters away from you, is another CPU running someone else's Linux kernel. Right. And never the twain shall talk to each other.

Ben Rady

Right. Rowhammer issues and other things aside

Matt Godbolt

Yeah, don't, don't gimme an in to talk about that kind of stuff. Right. You know, so actually, yeah, right. We are gonna, we're gonna have to now because you poke my buttons,

Ben Rady

Row hammered them!

Matt Godbolt

Not Rowhammer, but that's, that's definitely one for another conversation. But what, um, what a reasonable person might do given what I just said, is say, well, the hypervisor is sat there, not doing very much. Doesn't need any CPU resources most of the time, cuz it's reactive to the guest operating systems that are really running on the CPUs. Right. But we could potentially say, well, let's give one or so CPU to the hypervisor itself. And it can do some background maintenance activities. What if it scanned through all of the physical memory of the computer and went, wait a second, I've seen this 4K page before. Right? Every single one of my 60 guest operating systems have all loaded up variants of the same Linux operating system. Why the hell would I have the same 4K pages? You know, like many, many, many 4K pages that are exactly the same.

Cause they all loaded like, you know, vmlinuz 4.5.29, whatever. Why don't I just point them all at the same actual physical location and then discard the copies of it? But, like, pretend to all of the individual guest operating systems that they have their own copy, and then it's just copy-on-write. If they try to write to it, then they get their own copy, a bit like, you know, when you fork a process on a single operating system. The same tricks happen. Makes perfect sense. Now obviously you have to do it retroactively. When you fork, you know that every page that you currently have is gonna be shared in the child process, but this is a sort of emergent property of, once you've booted a machine up, eventually some pages will be the same on one machine as they are on another, in which case you de-duplicate them. And then, you're right, you've got more free memory for the system as a whole. And it seems like there could be no, there could be nothing wrong with that until the security people come along.
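A toy model may help picture the page sharing. This is a sketch under heavy assumptions: it de-duplicates at write time by content hash, whereas the scheme Matt describes scans memory after the fact, and the class and method names are invented for the example.

```python
import hashlib

PAGE_SIZE = 4096


class DedupMemory:
    """Toy model of content-based page sharing with copy-on-write."""

    def __init__(self) -> None:
        self.frames: dict[int, bytes] = {}              # frame id -> contents
        self.refcount: dict[int, int] = {}              # frame id -> number of mappings
        self.by_hash: dict[str, int] = {}               # content hash -> frame id
        self.mapping: dict[tuple[str, int], int] = {}   # (guest, page number) -> frame id
        self._next_frame = 0

    def _acquire(self, data: bytes) -> int:
        """Return a frame holding data, reusing an identical frame if one exists."""
        digest = hashlib.sha256(data).hexdigest()
        frame = self.by_hash.get(digest)
        if frame is None:
            frame = self._next_frame
            self._next_frame += 1
            self.frames[frame] = data
            self.refcount[frame] = 0
            self.by_hash[digest] = frame
        self.refcount[frame] += 1
        return frame

    def _release(self, frame: int) -> None:
        """Drop one mapping; free the frame when nobody maps it any more."""
        self.refcount[frame] -= 1
        if self.refcount[frame] == 0:
            digest = hashlib.sha256(self.frames[frame]).hexdigest()
            del self.by_hash[digest], self.frames[frame], self.refcount[frame]

    def write(self, guest: str, page: int, data: bytes) -> None:
        """Copy-on-write: a write remaps the guest's page, never edits a shared frame."""
        if len(data) != PAGE_SIZE:
            raise ValueError("writes are whole pages in this model")
        old = self.mapping.get((guest, page))
        self.mapping[(guest, page)] = self._acquire(data)
        if old is not None:
            self._release(old)

    def read(self, guest: str, page: int) -> bytes:
        return self.frames[self.mapping[(guest, page)]]


mem = DedupMemory()
kernel_page = b"\x90" * PAGE_SIZE
for guest in ("vm-a", "vm-b", "vm-c"):
    mem.write(guest, 0, kernel_page)
print("frames after boot:", len(mem.frames))
mem.write("vm-b", 0, b"\x00" * PAGE_SIZE)
print("frames after vm-b writes:", len(mem.frames))
```

Three guests loading the same page cost one physical frame until one of them writes, at which point only the writer gets a private copy.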

Ben Rady

Yes. And ruin everyone's day

Matt Godbolt

And ruin everyone's day, exactly, exactly!

Uh, so it was shown that, and maybe I won't go into too much detail for two reasons. One, I don't necessarily know the details. And two, we've probably talked too much about this already. Um, it was shown that if you have the same implementation of OpenSSL or one of the other cryptographic libraries as a co-located virtual machine to you, so I, I'm gonna just go to Amazon and I'm gonna ask for a hundred EC2 instances and then I'm gonna run a test to see if I can find that I'm co-located with my target. Just by coincidence, I happen to be running on a machine that also has an SSL process somewhere in it, all right? The chances are that obviously those 4K pages will be de-duplicated, cuz it's the same .so that we've both got. It's OpenSSL, Ubuntu, whatever version, right. Now I can start doing timing attacks because I know my physical RAM is associated with the same physical RAM that they have. And so if I know which code paths are taken in their code, I can poke around in my cache and sort of try and determine whether.

Ben Rady

Oh and start getting the the keys basically.

Matt Godbolt

Exactly.

Ben Rady

I got this byte of the key and I got that byte of the key and I don't have it all yet, but that's close enough.

Matt Godbolt

It took a long while to read this bit out. Cause it must have been in an L3, but if it wasn't then I know it must be in someone's L2 somewhere and that someone might be, and all these kinds of things. And you could imagine how terrifying that is from a point of view of, of, of security. You're like, you've lost the isolation between the virtual machines that aren't even meant to know that their siblings exist. So that's your own fault, Ben.
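The shape of that timing side channel can be shown with a deliberately simplified simulation. Everything here is a toy: the latencies are made-up constants, the "cache" is a Python set, the page names are hypothetical, and a real attack has to cope with noise that this model does not have.

```python
FAST, SLOW = 1, 100  # made-up latencies: page was cached vs. not cached


class SharedCache:
    """Toy cache: the set of shared pages that are currently 'hot'."""

    def __init__(self) -> None:
        self.hot: set[str] = set()

    def flush(self, page: str) -> None:
        self.hot.discard(page)

    def access(self, page: str) -> int:
        """Touch a page and return the simulated latency of that access."""
        latency = FAST if page in self.hot else SLOW
        self.hot.add(page)
        return latency


def victim_step(cache: SharedCache, key_bit: int) -> None:
    """The victim executes a different shared code page depending on a secret bit."""
    cache.access("multiply" if key_bit else "square")


def spy_on(cache: SharedCache, secret_bits: list[int]) -> list[int]:
    """Flush a shared page, let the victim run, then time a reload of that page."""
    recovered = []
    for bit in secret_bits:
        cache.flush("multiply")
        victim_step(cache, bit)
        latency = cache.access("multiply")
        # A fast reload means the victim touched the page, so the bit was 1.
        recovered.append(1 if latency == FAST else 0)
    return recovered


print(spy_on(SharedCache(), [1, 0, 1, 1, 0]))
```

The attacker never reads the victim's memory. It only learns which shared pages the victim touched, which is why de-duplicating pages across guests is what opens the door.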

Ben Rady

Worth it.

Matt Godbolt

We can talk about Rowhammer, another time

Ben Rady

Worth it, worth it.

Matt Godbolt

So in terms of deployment though, I mean, it's, you sort of alluded to that by saying that like as a developer, it's convenient to be able to go to your infrastructure folks and say, can I just have a server to run my right little Node.js app.

Ben Rady

Or not even talk to them and just, like, run a script that generates one for you, and they keep tabs on it and they know who it's allocated to, and they can call you up and say, Hey, you're using 35 servers. Do you really need them? But you know,

Matt Godbolt

that's very true.

Ben Rady

automate those things. Right. And it's really great when you do.

Matt Godbolt

That's very true. I mean, I forget of course that that's, this is what Terraform and the like do for, for, for me in Amazon. Right. I just say, Hey, I want another computer, and another computer appears. It has never occurred to me that really somewhere behind the scenes, some, all this magic is going on to make that happen, but you know, it just does. Right. And yeah, that puts a lot of power and responsibility, but a lot of power into developers' hands. You don't have to, like, overload a machine, and you get the isolation that, say, a Docker container would give you, but at a much deeper level. Now, different problems again. Right. You know, at least in your own server, if it's running as root, well, it's only running as root because you made it run as root. As root, right.

Ben Rady

Yeah.

Matt Godbolt

So what do we think about that in terms of like the, the trade offs? What would, what would make you choose one method over another?

Ben Rady

I mean, I, I, I tend to lean more toward, you know, having virtual machines and, you know, having, uh, more of like the I'm gonna get this virtual machine. I, I will probably build some very lightweight automation to set it up. But again, the setup of it is mostly just, you know, kind of like you were saying, the apt update apt upgrade, you know, maybe install one or two system packages, but hopefully not if I can avoid it and then just run all my applications as a user, as an unprivileged user. And you know, every version is a new tarball that gets copied up to the computer or maybe have some automated thing that pulls 'em down from a central repository.

Matt Godbolt

You've got like a deployment thing that use you've got like a, is it git-deploy?

Ben Rady

Oh, git-deploy. Yeah.

Matt Godbolt

Is that open? That is open source, right?

Ben Rady

That is open source. Yeah. So git-deploy is sort of my Heroku-style deployment script that I made, um, that will let you take any server that you have SSH access to, uh, and, um, basically push to it as, as if it was a git repository. And as a side effect of that, if the, if the push works, that is, your code is not out of sync with everyone else that's deployed to it, it will deploy your application and start it up. And so you get to sort of use the git semantics around push and pull as your mechanism to make sure that you don't accidentally clobber someone else's deployment.
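The safety property Ben is describing, that a push only deploys if it builds on what is already deployed, can be modelled in a few lines. This is not how git-deploy itself is implemented; it is a sketch that treats history as a simple linear list of commit ids, with invented class and method names.

```python
from typing import List, Optional


class DeployTarget:
    """Toy server that accepts a push only when it is a fast-forward."""

    def __init__(self) -> None:
        self.history: List[str] = []          # commit ids deployed so far, oldest first
        self.deployed: Optional[str] = None   # commit currently running

    def push(self, commits: List[str]) -> bool:
        """Deploy the tip of commits if it extends the server's history."""
        if commits[: len(self.history)] != self.history:
            # The pusher has not seen someone else's deployment: reject,
            # the way git rejects a non-fast-forward push.
            return False
        self.history = list(commits)
        self.deployed = commits[-1] if commits else None
        return True


server = DeployTarget()
print(server.push(["a1", "b2"]))        # first deployment goes through
print(server.push(["a1", "c3"]))        # stale: missing b2, so it is refused
print(server.push(["a1", "b2", "c3"]))  # rebased on top of b2, accepted
```

The rejected developer has to pull first, which is exactly the moment they find out somebody else deployed in the meantime.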

Matt Godbolt

I see.

Ben Rady

Right? Um, and it's, so it's sort of a safer way to be able to empower people to deploy locally from their machines, if that makes sense to do. Now, sometimes that doesn't make sense to do.

Matt Godbolt

It doesn't always make sense. Right. Yeah.

Ben Rady

But in fact it sort of usually doesn't make sense, but sometimes it makes a ton of sense. And it's really nice to be able to do that in a way that is safer than just, you know, scp.

Matt Godbolt

Right. But I mean, often, you know, there, there are, there are also places where, or times when you want to be able to push to like a development machine.

Ben Rady

Oh absolutely, yeah.

Matt Godbolt

A development cluster. And that seems like a good thing there, where I would actually want the feature: I have code on my machine that I want to have running in an environment that I can't reproduce myself locally. It's not ideal to be in a situation where you can't quite reproduce it locally, but sometimes, you know, I wanna batter it with 200, um, machines that are gonna send queries to it. And so I wanna deploy my version that has my fix or whatever.

Ben Rady

Yep. Yep. And I mean, you know, you can take speaking of virtual virtualization, like you can take these things a lot further. And one of the things that I've been playing around with one of my projects is sort of getting rid of the idea of the production environment. So all of the environments in this project that I'm working on are just branches. There's the main environment for the main branch. And that's where the DNS entry for the top level domain points to. But if you make a new branch, it will automatically spin up a new environment and it will marshal all the services that that environment needs.

Matt Godbolt

Ooooo.

Ben Rady

And it will do everything that it does. And so if you want to make a change that involves potentially making changes to the infrastructure like, oh, I'm gonna change a security group, or I'm gonna change, you know, the number of servers from, from four to five or whatever it might be. You just create a new branch, you push that branch to GitHub and the infrastructure magically appears.

Matt Godbolt

That is awesome.

Ben Rady

And the name of that infrastructure is literally the name of the branch. So they're tied together in that way. And when you delete the branch, the infrastructure gets torn down.
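One way to picture "a branch is an environment" is as a reconciliation step that runs whenever the set of branches changes. The sketch below is an assumption about how such a system could work, not a description of Ben's project; `reconcile` and the branch names are made up.

```python
from typing import Set, Tuple


def reconcile(branches: Set[str], environments: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (environments to create, environments to tear down).

    Environments are named after branches, so the desired state is simply
    'one environment per branch, and nothing else'.
    """
    to_create = branches - environments
    to_destroy = environments - branches
    return to_create, to_destroy


branches = {"main", "bigger-cluster"}
environments = {"main", "old-experiment"}
create, destroy = reconcile(branches, environments)
print("create:", sorted(create))
print("destroy:", sorted(destroy))
```

Nothing in this logic treats `main` specially, which matches what Ben goes on to say: delete the main branch and its environment would be torn down like any other.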

Matt Godbolt

That's super cool.

Ben Rady

So, the main branch is always there. That's sort of the quote-unquote production environment. Um, but if you were to ever delete the main branch, it would also actually tear down the, the, the

Matt Godbolt

I mean that's probably what you want though,

Ben Rady

You know, it's like, it's sort of a weird thing, but it's like, it's like,

Matt Godbolt

No, I like it.

Ben Rady

coupling those two things together very tightly and saying a branch is an environment. There's no such thing as the dev or the test or the UAT or the production, they're just names of branches. Um, and that is only possible because of virtualization. You couldn't do that any other way.

Matt Godbolt

Way on a real machine. No. Well, yeah, for all the reasons, I mean, cost was what I was about to bring up because you know, that, um, I, I'm sort of trying to move Compiler Explorer towards a system, which is a tiny bit more like that where instead of the staging environment that we do testing being kind of like just a subcategory under the production environment, it's like its own AWS account effectively. And then I can do the kind of things you're talking about like, Hey, let's have a new, um, load balancer. Let's try out a different way of doing everything in the staging environment. Uh, but for me that's prohibitively expensive because those resources are not free and they're quite expensive. Like having one load balancer is expensive enough. Uh, and, and I can configure that one load balancer to kind of say, well, if it has slash staging in the thing, then goes to this, this subsection. Right. And that's how it works at the moment. But, um, so there's a trade off to be had there. And obviously, in, in a world of infinite resources, it's no problem that if you create 12 different branches, you've got 12 environments.
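The single-load-balancer arrangement Matt mentions amounts to a routing rule on the URL path. A minimal sketch of that rule follows; the target names are placeholders and this is not Compiler Explorer's actual configuration.

```python
def pick_target(path: str) -> str:
    """Route /staging and anything beneath it to staging, everything else to production."""
    if path == "/staging" or path.startswith("/staging/"):
        return "staging"
    return "production"


for path in ("/", "/staging", "/staging/api/compile", "/stagingx"):
    print(f"{path:24} -> {pick_target(path)}")
```

One rule on a shared load balancer is much cheaper than a second load balancer, which is the trade-off being described, at the price of staging and production no longer being fully separate.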

Ben Rady

Right, right. Right. Well, one of my subtle motivations for doing this, and again, I'm trying this on my own project, but you know, maybe one day I'll get to do this in a, in a, a, a more, um, you know, widely shared, um, company environment is to directly manifest to the bean counters, the cost of so many different branches.

Matt Godbolt

Amazing.

Ben Rady

It's sort of like, yeah, you know, branches have a cost and it's hard for you to measure that cost. Cause it's mostly cognitive load on developers. What if we just turn that into dollars.

Matt Godbolt

Actual dollars

Ben Rady

And then you could measure them and be like, why then you'd have accountants, yelling, "Why do we have so many branches?"

Matt Godbolt

As, as one of those folks that sends out the emails and the, the nags to people saying like, Hey, this PR's been open for three years. Is there any chance of it being closed? I'm totally down with that. Yeah. The cognitive load, when I hit auto complete in, uh, in the

Is this here? Those kinds of things. Yeah. Yeah. No, that's, that's, I, I like that approach. I like the idea of, of manifesting. And I think, you know, obviously you've talked about doing it in terms of virtualization. There are ways and means of doing it with the, uh, Docker-style approach as well. And I know we're kind of getting close on the amount of time that we've got available, but, um, I'd like to sort of suggest, you know, there's the Kubernetes, um, type approach. There's, uh, Nomad, which is a HashiCorp system, which can be used to run Docker containers. And then there are definitely load balancer type things that can talk to those containers. And you could definitely have it so that every branch that you commit builds a Docker image and pushes it to a, a tag, a named tag for that branch, and then auto registers a container, a job running in Nomad, to say, Hey, I'm the your-project-hyphen-your-branch-name environment that has all of these machines running.
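The naming convention in that idea is easy to sketch. Branch names can contain characters that are not valid in a Docker tag, so some mapping is needed; the sanitising rule and function names below are one possible choice, not something from the episode.

```python
import re


def docker_tag_for(branch: str) -> str:
    """Turn a git branch name into a valid Docker tag.

    Docker tags may contain letters, digits, underscores, periods and dashes,
    may not start with a period or dash, and are at most 128 characters.
    """
    tag = re.sub(r"[^A-Za-z0-9_.-]", "-", branch)
    tag = tag.lstrip(".-")
    return tag[:128] or "unnamed"


def job_name_for(project: str, branch: str) -> str:
    """The 'your project, hyphen, your branch name' environment name."""
    return f"{project}-{docker_tag_for(branch)}"


print(docker_tag_for("feature/new-lb"))
print(job_name_for("myproject", "feature/new-lb"))
```

Note that a mapping like this can make two different branches collide on one tag, for example `feature/x` and `feature-x`, so a real setup would need to decide how to handle that.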

Ben Rady

Yeah. Why is this here?

Matt Godbolt

So it can be done too. You are still paying the cost. There are processes running in one operating system or one set of operating systems that is the Nomad cluster or the Kubernetes cluster or whatever. It's a sort of lower, um, I guess it's higher level. I dunno how you describe it, where it, where it's cutting, you know, is it lower level or higher level, um, than virtualization? It's definitely higher level, like, measured on the axis that makes sense to me right now. But, um, but, uh, yeah, so, so you can achieve it using that too, which is, which is great. Um, and I mean, I think what all of this kind of comes down to, what we're talking about in, at this instance, is infrastructure as code, however it's achieved, be it VMs or Nomad machines running Docker stuff. And so we've kind of strayed from the original point about like, what, how does one do deployment? How does one use virtualization? How is it, what different things are available? But...

Ben Rady

Yeah, they're all related though.

Matt Godbolt

They are, I guess, related. Yeah. Yeah, yeah. And that infrastructure as code thing is, is, is super important, to be able to say like, yeah, I, it's not, no one has to rack a machine. No one has to, um, physically move any cables around when I stand up this instance here. Um, and that instance is defined by a piece of code or a configuration file that's generated by code or just a configuration file that a human edits, which you know, is, is a fabulous way of tracking. I mean, we all, especially software engineers, we know where we stand with source control and CI and things like that. So having, having the mach, the, the physical world work that way too, and be able to roll back and all that kind of stuff, is super cool. However it's achieved.

Ben Rady

Yeah.

Matt Godbolt

All right. Well, this is, this has been a fun exposition of what on earth can we remember about how all this stuff fits together?

Ben Rady

Yeah. I feel like we only really, like, touched on it. Like, you could probably do a whole other hour on these topics, like you and I talking about Kubernetes.

Matt Godbolt

Absolutely. Yeah. I mean, I don't know enough about Kubernetes. I mean, I use Borg at Google, which I, I believe to be related in some way, but I don't know either. I remember them trying to pretend that it wasn't called that. And they used to pretend that it was Anita Borg. It was named after not clearly the, the, the evil people in, in Star Trek.

Ben Rady

Right.

Matt Godbolt

Uh, which yeah. And I think that was because they, it leaked out because, um, they weren't laundering their, uh, referrers. And so people were running internal services from machines and then people would, like, link to, I know not YouTube videos, cause that would be Google too, but you know, link to other people's websites. And then it, someone went through, like, what this, all the machines had, like, um, names, DNS names, that uniquely refer to the job that was running on it. Yeah. Which is super convenient for everything. You know, you wanna hit your job and it's running a web server, then you just go to that long name and it hits the machine, and the machine then looks at the mach, uh, the, the name you gave it. And then it redirects it to the correct port for that particular instance that you were running on. And then off, there, there you are. There's your, there's your job running and you can look at it. Um, but obviously if you then have a webpage on there that has a, Hey, click the cat animation, you click the cat animation, then you've leaked, you know, twelve.seven.borg.google.dns or whatever.
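The leak Matt describes comes from the `Referer` request header, which tells the destination site which page the click came from. A small sketch of what the receiving site can read out of it; the hostname below is a placeholder, and the `Referrer-Policy` header is shown as one standard mitigation, not as what Google actually did.

```python
from typing import Optional
from urllib.parse import urlsplit


def host_leaked_by(referer: str) -> Optional[str]:
    """What an external site learns about the linking page's hostname."""
    return urlsplit(referer).hostname


# Request headers as an outside web server might receive them after a click
# on a link from an internal page (placeholder hostname).
incoming = {"Referer": "http://twelve.seven.borg.example/status"}
print(host_leaked_by(incoming["Referer"]))

# A response header an internal service can send so that browsers omit
# the Referer header on outgoing navigations from its pages.
NO_REFERRER_HEADERS = {"Referrer-Policy": "no-referrer"}
```

The internal hostname shows up in the outside site's access logs even though that site could never resolve or reach it, which is enough to reveal the naming scheme.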

Ben Rady

Right, right, right, right.

Matt Godbolt

Oops.

Ben Rady

Referers referer headers, man. Yeah.

Matt Godbolt

Yeah, yeah. All right. We should stop talking. We should stop talking. We've plenty of things to audit, uh, to edit

Ben Rady

and audit.

Matt Godbolt

and audit

Ben Rady

we got, we can't let the Borg stuff leak out.

Matt Godbolt

That's true. That ship has sailed.

Ben Rady

Cool.

Matt Godbolt

All right. I'll see ya. Next time.

Ben Rady

Bye.

Transcript source: Provided by creator in RSS feed