Boring is Awesome

Matt Godbolt

00:14

Hey, Ben.

Ben Rady

00:19

Hey Matt.

Matt Godbolt

00:20

I hear you've been thinking about databases.

Ben Rady

00:23

A little bit. I have some ideas about databases.

Matt Godbolt

00:26

So when you say database, you mean, like, SQL databases, like server databases, right? As opposed to other, more colloquial...

Ben Rady

00:34

MySQL, you know, Oracle, if anyone still uses that, I don't know.

Matt Godbolt

00:40

I'm sure. I'm pretty sure someone's still using Oracle.

Ben Rady

00:43

Gotta be at least one or two people. Oracle seems like they have money. So there has to be someone somewhere.

Matt Godbolt

00:49

What about SQLite? Does that fit into your general pantheon of databi? That is the plural of database, I'm sure.

Ben Rady

00:54

Yes, but in a more interesting way. Uh, but yes, it does.

Matt Godbolt

01:00

So what's your beef with databases? I'm assuming it's a beef, cause otherwise you wouldn't have mentioned it.

Ben Rady

01:05

I have a minor beef with, with databases in that I think that databases should be used to store relational data. If you don't have relational data, you should strongly consider not using a database. That is my beef with databases, because I see many, many situations where people do not have relational data and they're like, we have data, where should I put this data? I'll put it in a database. And that creates a whole host of problems.

Matt Godbolt

01:43

So to be clear here, you're also talking about relational databases, not like the NoSQL style, which I think was a Lord of the Rings thing, wasn't it? The NoSQL, they were like the Ringwraiths. I've completely caught you off guard there. No-SQL, sorry, for the listeners who aren't playing along with the stupid pronunciation thing. Uh, NoSQL databases, which are much more like just data dumps, um, with no relational component, or only with complicated layering.

Ben Rady

02:13

Document databases are kinda the...yes

Matt Godbolt

02:14

Got it, you're talking about traditional RDBMS things. And your observation is that some data that is non-relational ends up in a database because it's convenient, probably, or it's just there.

Ben Rady

02:26

It's familiar, I think, is a lot of what it comes down to. Right. People, they know how to configure them. They know how to use them. They know how to interact with them. They're comfortable with SQL, you know, they have operational teams that can support them. There are lots of vendors that'll give you one. You know, you've been using Postgres for 10 years and it's, like, super comfy. It's like a warm blanket.

Matt Godbolt

02:47

Right, right. What's wrong with that?

Ben Rady

02:50

Well, the problem is, I mean, it gets back to the old programmer adage of use the right tool for the right job. Right. And just because a tool is familiar doesn't necessarily mean it's the right thing to use. Now, I'm going to have a slightly hard time blaming anyone for using, you know, we had this phrase at PrevCo, uh, we said boring is awesome. And, you know, using boring technologies, old technologies that work really well and are well-trodden and have all the bugs sort of wrung out of them already, I'm going to have a really hard time complaining that somebody is using one of those to solve a problem. 'Cause it's like, yeah, okay. You want to actually get this thing working, and you don't want to be a technology fetishist and just use the newest, coolest NoSQL thing or whatever. Sure. Fine. Respect. But there are lots of times when you could get even more boring than a relational database and put it in, oh, I don't know, a file.

Matt Godbolt

03:47

Well, I was going to say, what could be more boring than a database? And you're telling me a file. What is this boring axis that you're referring to? What is boring a proxy for in this particular instance? What are you getting from the, yeah.

Ben Rady

04:00

What are we getting from boring? Well, the first thing we're getting from boring is that there's a large community. The product, the, the tool that you're using has been used in a lot of different contexts, has very nice documentation, has a large community around it that can help you use it. Um, a lot of the bugs have been worked out of it. It's been used in like, you know, production environments and a lot of different ways. And it's just a generally reliable tool. And

Matt Godbolt

04:24

So, like, Postgres definitely ticks the box of boring in that respect, because it just works. Everyone understands it. You can advertise for a job and say, hey, if you've got Postgres experience, someone can come in, and you know roughly where you stand with that. But you're saying that, even more, like, before there was Postgres, before there was any database, there was a file.

Ben Rady

04:47

File systems, file systems are real boring.

Matt Godbolt

04:50

They're even more boring than databases.

Ben Rady

04:52

You put, you put file systems on your resume and people are going to go, like, what? Right? Like, it's super boring, but it turns out, if you don't have relational data, files are a great way to store data. You can write them to the file system and read them. Right.

Matt Godbolt

05:11

So give me an example of the kind of data you're thinking about when you're very specifically saying like non-relational data. Give me an example because I'm wanting to sort of get yeah.

Ben Rady

05:25

Anything binary, first of all. Right. Whenever I see people putting, like, JPEGs as binary blobs into a database, I'm thinking, like, okay, I get that you have a data storage device of some kind and you want to put the stuff in there. Um, but, you know, is this really the best place to put it? Now, again, if you have other relational data around it, you know, you have relational data that needs to refer...

Matt Godbolt

05:54

Maybe, you know, maybe it's your avatar in your forum system and everything else is relational. You just don't have the PNG stored in there, but it's not like you can do select png_decode of blah and, like, actually do something with it. The data is just so... But, I mean, that's why databases have blobs, after all. That is what these are meant for. Right. But you're saying that, like, if that's all you're doing, maybe you need to think carefully.

Ben Rady

06:16

If primarily what you're doing is storing images, do you really need a relational database to store your images? I think probably you don't. Um, and there are also other reasons why, you know, it's obviously important to think about the context in which things can be used. Scalability is a big concern in a lot of different contexts. You can get scalability through relational databases, but if you're using the relational database simply as a mechanism for scalability, what about an object store? Right? You don't necessarily need to use files, or you can't use files, because it's like, okay, well, this has to run on multiple computers that can't all share the same file system without some NFS trickery that we don't want to do.

Matt Godbolt

06:58

Redis or S3 type stuff. Or, yeah.

Ben Rady

06:58

What about a file store? Again, you don't have relations in your data. Don't use a relational database, use something simpler. So, you know, there's stuff like that. So, like, for example, if you're building internal tools, right? Some internal website, something like that, where the number of people that will ever use this tool in its entire lifetime is, like, two dozen, right? Do you really need to run that on anything more than, like, a single server or a virtual machine or something, where, you know, you can just write files and read files and back those files up and it'll be fine? I don't think that you do. And if you can do that, then you get a whole bunch of other stuff for free. We had talked on an earlier episode about the importance of manual testing, being able to run things locally and do exactly what the users do on your local development workstation, and be able to reproduce the steps that they take and troubleshoot things in the same way that they will. And I think that's extremely important, so important, in fact, that I would be reluctant to add any technology to a project that hindered me in that effort. One of those things can be a relational database. If you have to have a database loaded with all the data, with all the right schemas loaded up on your machine, then you have to have a lot of automation around creating that.

Matt Godbolt

08:28

It's harder to get it right. It's a barrier to doing it. It's not impossible. And, like, virtualization technology and other things have come along to make it a bit easier, but it's not straightforward compared to "here's a file".

Ben Rady

08:42

But the question is, will you?

Matt Godbolt

08:44

Right

Ben Rady

08:45

Can you and will you are two different... Yes, I can make a Docker container for my, uh, database, and I can write all the scripts and tools to load all the things into it. And I can integrate that in with all my projects, so when I fire up the server or whatever it is, it also fires up the database and it tears it down properly, so I'm not leaking Docker containers. And my laptop crashed? Well, 35 instances of Postgres running, that's why it crashed. Right? All those things. You can solve all of those problems and take the time to solve all those problems. A lot of people don't. They just say, well, I'll just run the database myself. And then they, and then...

Matt Godbolt

09:21

Or everyone points to the dev database, which you just assume is running somewhere. Oh, is someone else using dev right now? Because I'm running my tests against it. Oh, sorry. I'm, yeah. You know, that's, yeah. We've all been there.

Ben Rady

09:33

Yeah. Yeah. So I think that's what people generally do because they don't want to take the time to automate it. But I think that there's an even better option there, which is why do we need a database for this thing? Right.

Matt Godbolt

09:45

So you mentioned files. That makes sense to me. Right. Um, I remember, uh, at prev-prevco, the sort of internal link shortener that was hacked together used exclusively an append-only text file to store, you know, a space-separated "here's the short URL, here's the long URL". Let me just load it into memory every time, you know, it started back up again, and there you are. That's a simple enough thing if it fits into memory. And that's a perfect example, I think, of what you're talking about: there's no relational aspect to this whatsoever. It can all live in memory. And so it was implemented as a JavaScript, like, map, literally from the short URL to the long URL. And it was, yeah, as I say, a text file. But what if your data is bigger than that, or needs to be indexed more than that? You get an awful lot for free with something like a SQL database. It's like, hey, I need to look up by this field or this other field. And now, I can write the code to do that, but it's kind of easier if I just let the database manage that bit, and I could create two indices now.

10:44

Right.
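
A minimal sketch of the kind of append-only link shortener Matt describes, written in Python rather than the JavaScript of the original tool. The `LinkStore` class, its method names, and the exact file layout (one `short long` pair per line) are assumptions for illustration; the episode only says the file was space-separated and replayed into an in-memory map on startup.

```python
import os


class LinkStore:
    """Append-only text file of 'short long' lines, replayed into a dict on startup."""

    def __init__(self, path):
        self.path = path
        self.links = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                for line in f:
                    line = line.rstrip("\n")
                    if not line:
                        continue
                    # Split on the first space only, so the long URL may contain spaces.
                    short, long_url = line.split(" ", 1)
                    self.links[short] = long_url  # later lines override earlier ones

    def add(self, short, long_url):
        if " " in short or "\n" in short or "\n" in long_url:
            raise ValueError("short codes cannot contain spaces; nothing may contain newlines")
        # Append to the file first, then update the in-memory map.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(f"{short} {long_url}\n")
        self.links[short] = long_url

    def resolve(self, short):
        return self.links.get(short)
```

Startup cost grows with the length of the file, which is the scaling problem Matt raises a little later in the conversation.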

Ben Rady

10:44

I mean, if you're looking things up by various fields that sort of smells a little bit like relational data to me, in which case I would say...

Matt Godbolt

10:51

Not necessarily. Like, for, you know, take my URL shortener, for example. What if I wanted to say, okay, well, um, what are the short URLs that lead to this big URL? Like, essentially an inverted index, right? Um, obviously I can write the code just to have, you know, two maps, one that goes from short to long and one that goes from long to short, but I'm basically making a database. And what if anyone needs anything more, like, hey, what about, uh, you know, it has a user field in it as well, right? And that could be who created it. And I say, well, find the ones created by me. Now you're definitely straying into relational...
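
The "two maps" idea can be sketched in a few lines of Python. The `TwoWayLinks` name and its methods are hypothetical, chosen only to illustrate keeping a forward map and an inverted index in step by hand.

```python
from collections import defaultdict


class TwoWayLinks:
    """Forward map (short -> long) plus a hand-maintained inverted index (long -> shorts)."""

    def __init__(self):
        self.short_to_long = {}
        self.long_to_shorts = defaultdict(set)

    def add(self, short, long_url):
        # If the short code is being repointed, remove it from the old reverse entry.
        previous = self.short_to_long.get(short)
        if previous is not None:
            self.long_to_shorts[previous].discard(short)
        self.short_to_long[short] = long_url
        self.long_to_shorts[long_url].add(short)

    def shorts_for(self, long_url):
        # .get avoids creating empty entries for URLs that were never stored.
        return sorted(self.long_to_shorts.get(long_url, ()))
```

Every extra lookup (by user, by date, and so on) means another structure to keep consistent, which is Matt's point about slowly rebuilding a database.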

Ben Rady

11:24

The aspect here is that if you're making a URL shortener for an internal application, then over the lifetime of this application it's going to have megabytes of data at the most, which means you can do just about anything that you want.

Matt Godbolt

11:39

Right. I'm, I'm sort of just making the point that like, if you start down a road where you end up writing all of these things, and then one day you're like, Oh no, it doesn't fit in memory. Well, this is annoying. How are we going to make it scale? How are we going to be? And,

Ben Rady

11:51

If one day it's not going to fit in memory, okay, sure. But there are certain categories of things where you can be like, the number of short URLs at this company...

Matt Godbolt

12:01

Well, you say that. We actually did hit this problem, which is why I kind of bring this up. There was a, uh, API that a server could use to generate the short URLs, which meant that you could very quickly churn through and create thousands of them, which was fine until it then took days for the machine to restart, because it had to read through terabytes of this text file. But it was, you know, again, it wasn't a big problem, and adding layers in your software could mean you could switch it out later on and put something else there. Right. I'm just saying it's...

Ben Rady

12:28

I mean, it's a hell of a lot easier to go from a set of flat files to a database than it is to go the other way.

Matt Godbolt

12:33

Correct. Correct. But, like, the API that a database gives you is one that is sort of, um, sort and search and find and reorder and limit and all these things, which might not need a relational data store, but, you know, you still want to be able to do those things, you know, aggregate-like stuff. Um, and you end up writing that yourself; the file doesn't give you that. So the database is both the storage mechanism and the querying mechanism for that data, whereas if it's a file, it's just a storage mechanism, and then it's up to you to kind of layer on top of it, which is probably a feature, but I just want to sort of talk about that.

Ben Rady

13:07

I mean, certainly if you find yourself building your own indexing system into flat files, you're probably, it's probably time to move on to something else. Right. But

Matt Godbolt

13:15

Maybe not a relational database, perhaps.

Ben Rady

13:18

Maybe not a relational database. And one of the places where I definitely see people abusing relational databases is with messaging based systems, right.

Matt Godbolt

13:26

Oh my.

Ben Rady

13:27

With a message or an event based system people using the database as a, as a terrible message queue where they're writing things in and reading things back out and trying to time those things

Matt Godbolt

13:38

Select star from this where ID is greater than or equal to the last ID I got from you. And keep retrying. Oh no.

Ben Rady

13:45

Did I get any new rows in this table? Did I get any new rows in this table? And that is that, that is a huge dysfunction, right?

Matt Godbolt

13:53

That sounds. Yeah. That's definitely somebody running around with a hammer thinking everything else is a nail at that point, you're like, well, I've got my database. What else can I do with it?

Ben Rady

14:01

Right. And the thing is, uh, quite often systems devolve into, or evolve into, sort of more event-based systems, because people want things faster. They want them in real time, they want things to update automatically, right? Like, I want my webpage to update automatically. I want my report to update automatically. And so they sort of evolve into these systems over time, and people don't stop and sort of reevaluate and be like, all right, well, while we have, like, you know, gigabytes of data and not terabytes or petabytes of data, maybe now should be the time where we take the leap to go from something that is relational to something that is more event-based. Um. And there are lots of tools for that. One of the things that I will say is that I personally think it is generally easier to create the sort of, like, in-memory slash stub slash fake implementations of most message systems than it is to reproduce all of SQL.

15:00

And you alluded earlier to SQLite, which is possibly the one exception to this, right? And it's a great tool, and you should use it if you find yourselves in these situations. But generally, if you've got N consumers and M producers, and you just want to tie them together, you can do that in memory for a single node to test locally pretty easily, right? So you can have your real producer-consumer that talks to, you know, Kafka or RabbitMQ or ZeroMQ, or, you know, your message bus of choice, whatever. Um, and you can have an alternate implementation of that, that is not talking to any of those services and just runs entirely in memory. And as soon as it receives a message, it sends it to all the consumers, 'cause they're all in the same process. And that makes it really easy to run things locally and test locally in a pretty realistic way. Like, obviously you're not going to be able to tease out all the weird things about your messaging system by doing that, but, you know, you can do most things. Um, and so that sort of particular dysfunction, I think, is a bad one, when people don't sort of take a beat to just say, like, maybe we should switch...
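
As a rough sketch of the in-memory alternative Ben describes: application code depends on a small bus interface, and local runs and tests plug in an implementation that delivers messages synchronously in the same process. The `MessageBus` protocol, the `InMemoryBus` class, and the topic-string API are all assumptions made for this example; a production implementation of the same interface would wrap whichever real client library the project uses.

```python
from typing import Callable, Dict, List, Protocol

Handler = Callable[[str], None]


class MessageBus(Protocol):
    """The interface application code depends on (hypothetical)."""

    def subscribe(self, topic: str, handler: Handler) -> None: ...

    def publish(self, topic: str, message: str) -> None: ...


class InMemoryBus:
    """Single-process stand-in: delivers each message to every subscriber immediately."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: str) -> None:
        # Copy the list so a handler may subscribe during delivery without breaking iteration.
        for handler in list(self._handlers.get(topic, [])):
            handler(message)
```

This deliberately ignores ordering across nodes, retries, and persistence, which are exactly the "weird things about your messaging system" that a fake like this cannot exercise.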

Matt Godbolt

16:02

The thing you just said about SQL is an interesting one, though, because whenever I have used SQL, um, and usually it's with something like SQLite, where I'm actually using a file somewhere, because, for some of the reasons that you said, you know, I don't need a server or stuff. Um, but I ended up having to wrap all of my, uh, objects', um, interactions with the database in a very high-level, abstract API so that I can test them, because there's no way in heck I'm going to test the SQL query itself. Or maybe I am, but there's only so much you can do and be sure that you've done the right thing there. So, you know, or, I guess there's the traditional solution to this, to have an ORM, which then maps your objects into a database, and you kind of assume the ORM just works, and then you test the objects, or the, uh, ORM-mapped objects and their interactions with those. And just assume. But yeah, having to sort of stub out something that looks like SQL doesn't sound very testable.

Ben Rady

17:01

Yeah. Well, it's just, it's just, you, you wind up with this sort of mock magic approach where it's like, okay, and maybe you do the thing where you, um, you know, you do, uh, make it till you fake it, right? So you like run it against the real database and you make sure that it really works. And then you stub out the parts that were interacting with a real database using the data that you got back.

Matt Godbolt

17:22

The results that you got back.

Ben Rady

17:24

Basically, it's like, you know, write your tests and don't mock anything out, connect to the real database, and then copy, paste, and then edit, you know, shrink it down, you know, all that good stuff. You can do those techniques. And that's fine. It's kind of a brittle test.

Matt Godbolt

17:38

I was going to say...

Ben Rady

17:39

I mean, it's a reliable test, in that it will give you the same behavior every single time. You're not going to get any weird effects where it's like, I ran it and it failed, and I ran it and it passed. It's like, no, you're getting the opposite of that with that approach, which is good. But if you ever change your mind about what you want that SQL to be, you have to go through the whole process again and basically take the mocking out, and then redo it and put it back in, right?

Matt Godbolt

18:01

A bit more like, um, the thing we discussed with Claire, with the sort of golden tests, acceptance testing, except there isn't an obvious place to put an automated system such as the, uh, the tests that she was talking about, because you have to talk to the database, and then you have this manual process of, like, separating the wheat from the chaff. Yeah. That's, yeah. You, yes. We talked about SQLite, although I sort of glossed over that a little bit, but SQLite is an intermediate kind of form, because it has some benefits. It certainly doesn't have the drawbacks of needing a central database server with all the Docker-y thing or the dev instance or whatever. It can be, it is, just a local file on disk. Um, so what are your feelings about that?

Ben Rady

18:46

I think that's a really good intermediate thing. There was a project that I worked on... oh God, when was that? I want to say it was, like, 10 years ago, but I don't even remember. But basically we had made the conscious choice to stick to, like, very generic, you know, ANSI SQL, basically to say, we are going to be able to work with any database, not just Postgres, not just something else, basically only for testing, right? So that we could run against SQLite, and you could bring the whole system up with SQLite and be very confident that when you moved over to Postgres or MySQL or whatever we were using for production, we were using Postgres in production, it would just work. Right. And obviously, you know, there are cases where you can find different database vendors interpreting things in different ways, and problems around that, but for the most part, that was a pretty good solution. And honestly, I feel like this was a while ago. I don't know if that's a realistic solution anymore, honestly. Um, I feel like there might be people that are like, yeah, if you're gonna use Postgres, there's no way you can write standard SQL and actually get the value of Postgres that you want. Like, okay, maybe that's true.
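
A hedged sketch of that approach using Python's built-in `sqlite3` module: keep the SQL deliberately generic and run it against an in-memory SQLite database for local runs and tests. The `links` table and the helper function names are invented for this example, not taken from the project Ben mentions. One known wrinkle is that parameter placeholder syntax differs between database drivers (`sqlite3` uses `?`), so a real project would need a thin layer to hide that when switching to the production database.

```python
import sqlite3

# Plain, portable DDL: no vendor-specific types or extensions.
SCHEMA = """
CREATE TABLE links (
    short_code VARCHAR(32) PRIMARY KEY,
    long_url   VARCHAR(2048) NOT NULL
)
"""


def create_schema(conn):
    cur = conn.cursor()
    cur.execute(SCHEMA)
    conn.commit()


def add_link(conn, short_code, long_url):
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO links (short_code, long_url) VALUES (?, ?)",
        (short_code, long_url),
    )
    conn.commit()


def find_link(conn, short_code):
    cur = conn.cursor()
    cur.execute("SELECT long_url FROM links WHERE short_code = ?", (short_code,))
    row = cur.fetchone()
    return row[0] if row else None
```

For a local run, `sqlite3.connect(":memory:")` gives a throwaway database with no server to install, which is what makes the whole system easy to bring up on a workstation.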

Matt Godbolt

19:53

That's what I was going to say, the value there. You know, like, as soon as you start down the road of, like, stored procedures to update things, which of course you typically would only do if you're starting to take the benefit of, like, maybe some of the more relational things in the database, because you have to atomically update three tables or something like that. In which case, you know, maybe we've moved out of the part that you were talking about, where you're, like, misusing relational databases to store non-relational information. But that's like, no, that's a valid use of a database. If it's database-like, RDBMS stuff, fine, go knock yourself out. But is it just a file store of JPEGs or, you know, a URL shortener? Even a URL shortener thing is more on the fence, but yeah. Um, is this top of mind because of things that you're thinking about at the moment, or is this just something that came to you?

Ben Rady

20:38

I mean, it sorta touches on something that I think we were going to maybe talk about in a different podcast, but maybe this will be the blend of these two things. Um, which is, like, the project Makefile. You know what I mean? Where, like, I personally think that, um... Oh, who said this? This is probably not... I'm just going to attribute everything on this podcast that I can't remember who said it to Michael Feathers, and that will be right, like, sixty percent of the time.

Matt Godbolt

21:08

Seems a fair...a fair guess.

Ben Rady

21:08

I've gotten a lot of wisdom in my career from Mr. Feathers. He's a wonderful person. Um, but he said, uh, you know, code is a way you treat your coworkers.

Matt Godbolt

21:17

Yes. I think that was him,

Ben Rady

21:19

It probably is him. Um, and one of those aspects, I think, is if you want to bring people onto a project, right? You want people to help you. Fundamentally, you have to help them help you. Right. You have to do things for them to make it easy for them to contribute. You can't just push it on them and be like, well, if you're a real programmer you would just read through all of these things and figure out how it works, or, you know, read my partially up-to-date documentation that I wrote three years ago or whatever it is. You have to create an environment that is welcoming and friendly and easy to use. Otherwise they're either not going to work on it, or they're going to be forced to work on it and they're going to hate you, right? Or they're going to hate the code.

Matt Godbolt

22:02

Not you, although...they will grumble.

Ben Rady

22:02

They probably won't hate you, but they might grumble. They might grumble about you a little bit, but mostly they'll just hate the code, right? They'll hate the thing that they're doing, which is not good. It's just like not filling up the coffee machine or leaving your smelly lunch in the fridge. This is a bad thing that you can do to your coworkers and you should not do this. And so one of the aspects of this, I think, is you should be able to check out a repository and run a simple command and do all the things that we have talked about on these podcasts many times over. You should be able to run it locally and manually test it. You should be able to run the tests and verify that they pass. You should be able to deploy it. You should be able to build an artifact that is deployable.

22:43

You should be able to do all of these things. And there's not that many, it's like maybe half a dozen, right? It's like run the system, run the tests, build a deployable artifact, deploy the artifact, right? If you can do those things, then you can do most things that software engineers need to do. And you should automate all of those things. How do you automate all of those things? That's another question. The way that I've been doing it in recent years is by using make, because make is a tool that is good at resolving dependent tasks, sometimes in parallel. And it's ubiquitous; like, basically any Linux environment that you're going to be in is going to have make. And yeah, Makefiles aren't the easiest thing in the world to write, but they're actually not crazy hard to read. Like, if you've already got one and you sort of understand how targets work, they're not crazy hard to read.

23:33

And if you're working in a compiled language, you might want to use make or CMake, but you might want to use make to do some stuff anyway. So you probably have it all there anyway. So it's not bad. Now, can you do this with shell scripts? Absolutely. You can. I have. It works great. Can you do it with other tools? Sure. Again, applying boring is awesome, I would go for a more boring tool here, because there are definitely some boring solutions, but that is the thing that I think is important. So to answer your question of why this is top of mind for me, it's because I've had a few projects recently that have had data that was marginally relational and certainly not very big, that depended on a relational database. That was like...

Matt Godbolt

24:18

I figured there was a wound here.

Ben Rady

24:20

And the instructions in the README are, like, install Postgres, load these schemas, you know, create these tables by loading the schema in, and then configure the Postgres URL to this. And then you can start the system up.

Matt Godbolt

24:32

And you're like, no, make. I want to do make test. And if it needs Postgres, then fine. Maybe it can even bring up a Docker, whatever, something, or a Podman, uh, database. There should be no manual steps in this. That's the critical thing there. Exactly. Anytime that the, yeah, I think, you know, well, you and I agree on this very, very strongly, right? Every project that I've worked on, and I've had so much positive feedback from people that say, like, I love it when it's your project, because I just do git clone, and then I type make, and the compiler itself even gets installed on my computer and it just works. I'm like, yes, that's how it should be. If I need a magical version of GCC, because I need this particular flag, then I will arrange for that to be on your computer as a result of typing make, as opposed to here's a list of sudo apt-get install crap that you have to do first.

25:21

Like, that should not ever, um, be, uh, allowed. Um, and yeah, I mean, there's a variety of, uh, open source projects that I've worked on that all have a similar thing. And I think it's a big bit. And actually, I heard someone raised a bug recently, 'cause it's one of the things that stopped working. But mostly I can point people at, say, Compiler Explorer and say, yeah, you know how you get it running locally? Make. And it'll churn away for a bit. And then you go to port 10240, and then you've got your own local install of it. And people are like, oh, I was expecting there to be more. No, it's just that, because that's all you need. Again, it's broken right now, apparently, but, but, but yeah.

Ben Rady

25:59

Working on it.

Matt Godbolt

26:00

It's, I think it's a valuable, um, an important thing. It's just a... and as an API, make, you can go far worse than make, as you say. I mean, NPM sort of does it for the JavaScript community at some point, and there's, you know, Mavens and things and whatever, but, you know, with make, you can run those. Um, I usually have a Makefile at the top of my project that maybe even runs CMake, that then runs Ninja for all, you know, but you don't have to know that if you're just saying, no, make the project, I don't care. It's like, well, there's layers and layers of things going on. You don't need to know about it. Hey, Conan is being installed in a Python virtualenv on your machine, and we're installing all the dependencies through Conan. Right. But again, you don't need to know that. It just works for you, and it's all done through the magic of make.

Ben Rady

26:41

Yep. And that serves two really important purposes, actually. One is that it is this sort of, like, you know, code is a way you treat your coworkers thing. Um, but the other thing is that it is an absolutely correct form of documentation, right? Like, how do you configure and build and deploy the system? Well, it's all here. I'm a hundred percent sure it's correct, because we use it every day, all day.

Matt Godbolt

27:06

It's how my CI runs, it's how my deploys run. It's how I run locally. Yeah.

Ben Rady

27:11

Right. So not only can you read that to figure out how it all works, but you can confidently change it and know, like, oh, if I make this change here, everyone's going to do it like this with this version of the code. There's no, like, separate, oh, well, there's the build, but then there's the code, and you have to keep them in sync, and if you roll back one, you've got to roll back the other. It's all together. It's all in one place and it all works. And so there's huge value, sort of documentation and coordination value, in automating those things. And this is, I mean, to me, it's sort of one of those things you just have to choose to do, right? Like, we kind of talked before about, you know, we're wizards, we can do anything. What you choose to do in the sea of all possible things is going to determine a lot about what your working environment is like, and what you're able to do and what you're not able to do. I don't think anyone, I don't think any of our listeners listening to this podcast are...

Matt Godbolt

28:08

I'm reliably informed that we have at least two, actually. Now, I was talking to somebody other than our respective spouses.

Ben Rady

28:16

Right. So both of our listeners would agree, I think, that any of these things that we've talked about on this podcast are possible to do, right? It's just a question of, should you do them? And I think that you kind of just have to start with the decision of, like, yes, I'm going to automate this stuff entirely, so that you can just, right, type make. And yes, that might lead me down some strange paths where I'm building tools to make sure that it is possible to do this, but if you make the decision to do it, then everything sort of will follow on from that, if you're committed to it.

Matt Godbolt

28:53

Just like we said earlier as well, you know, if you start, if you can start from the beginning in that way, it's harder than, sorry, it's easier than retrofitting it later. So if you're like, well, it's just always been the case, you type make and it gets everything. And we started with just hello world and, you know, we've got the compiler and we've got the thing building. And then we, we added a dependency on a third party library. Okay. We're going to make sure that that comes down as part of the makefile, and you sort of incrementally put it on rather than having it, um, uh, trying to sort of retrofit it. Yeah. It's, it's easier to do those kinds of things, but again, I think you're right. It's an effort of will on your own part that you have to make that decision.

29:31

This is going to be worth it. I'm going to take a hit early on. And I mean, once you've done it a few times, it's not even a hit, it's just a way of life. Right? So it's sort of the zen, the Dao, of a new project is, you know, new directory. The very first thing I do is vi Makefile. And I'll paste in, uh, we're saying, there's a really nice little pattern that we've, um, we've picked up along the way. And, uh, both of them, well, I've picked up from you, but I think you picked up from Jake McCrary who picked up from someone else, of having like a help target that sort of greps itself out of the makefile and just with a bit of like awk and sed and magical things kind of makes an auto help page for your makefile. And so you can just, maybe your default target is that as well. So if you type make it just says, Hey, these are the things you can do. And you're like, that's great. Um, but yeah, so that's what I'll, I'll paste that snippet in, into my makefile. And then I'll like, just have a make echo target that just says hello world. And then start from there.
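As a rough illustration of the self-documenting help target described here: the pattern in the episode lives inside the makefile and uses awk and sed, but the same scan can be sketched in Python. This sketch assumes a convention, not spelled out in the episode, where each documented target carries a `## description` comment on its rule line. The names `extract_help` and `format_help` and the sample targets are hypothetical.

```python
import re

# Assumed convention: "target: deps ## description" on the rule line.
HELP_LINE = re.compile(r"^([A-Za-z0-9_.-]+):[^#]*## (.*)$")


def extract_help(makefile_text: str) -> list[tuple[str, str]]:
    """Return (target, description) pairs for every documented target."""
    entries = []
    for line in makefile_text.splitlines():
        match = HELP_LINE.match(line)
        if match:
            entries.append((match.group(1), match.group(2).strip()))
    return entries


def format_help(entries: list[tuple[str, str]]) -> str:
    """Render the entries as an aligned, two-column help page."""
    if not entries:
        return ""
    width = max(len(target) for target, _ in entries)
    return "\n".join(f"{target.ljust(width)}  {desc}" for target, desc in entries)


# Hypothetical makefile contents used only for the demonstration.
SAMPLE_MAKEFILE = """\
help: ## Show this help
\t@echo "see extract_help"

echo: ## Say hello world
\t@echo "hello world"

internal-step:
\t@true
"""

if __name__ == "__main__":
    print(format_help(extract_help(SAMPLE_MAKEFILE)))
```

Targets without a `##` comment, such as `internal-step` above, are left out of the help page, which keeps the list down to the handful of things a developer is meant to run.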

Ben Rady

30:26

Yeah. That help file thing I think is, is nice. Like, you know, it just sort of gives you that sort of half dozen "here are the things you can do as a developer" and you sort of get people started. The other thing about this is that not only is this, uh, something that, you know, if you started early, you know, sure you get that momentum going, it's easier if you start it early, there's actually lots of situations where you can tap into the power of laziness in order to get people to do the right thing. And a great example of this, I think, is continuous deployment. So if on day one, you, you know, follow my advice and say, the first thing you do is deploy, right? So you deploy your hello world that does basically nothing and have it automatically deploy whenever you push to the main branch, then it will be difficult to not have production in sync with the main branch, because it's going to do that automatically whenever you deploy.

31:22

And people will just orient their behaviors around that from the start. They'll be like, well, if we push to the main branch it's going to deploy, so how do we make sure that that doesn't break? Well, I know, I'll write a test, or I know, I'll do this other thing or whatever. Right? You've got smart people, they'll figure it out. But if, if you start with that philosophy, it actually becomes the easy thing to do to do it right, as opposed to this extra step that you have to take. Um, but you have to start there or you have to very quickly get there. Um, because if you go in later and it's like, well, we're going to, we're going to deploy to production every time you push to the main branch, there are a hundred very valid reasons why that's a bad idea, right. And you should not do that.

Matt Godbolt

32:04

And that's actually, what you, you said reminded me of, um, uh, a couple of issues I've seen in the last couple of weeks, which have both come down to projects not auto-pushing the latest version. And then later on, somebody accidentally or, you know, as a side effect, pushing a new version of a project and breaking other things, because, uh, it was like a, a relatively significant number of changes that got rolled out to a system. And you're like, no, if you, if, if it's pushed every time you push, then we'd find out a lot earlier. And it would be causally linked with the thing that you had just done as opposed to, but I just did this thing. How on earth can that affect this other thing? Oh, I picked up two weeks' worth of changes in one go, ah!

Ben Rady

32:46

It's shocking how much, and I mean, if you, if you talk to anybody that was into like lean systems and the lean stuff, like, you know, 10 years ago or whatever, they'll tell you this, obviously, but it's like, there's, it's shocking how much queuing theory there is in software, software development management, and process and stuff, right? Like if you, if you understand queuing theory really well, you can start to see those things in how developers push out changes, right. And, and, you know, the whole Toyota production system and all that sort of fed into all this stuff. This was, this is what the cool kids were talking about, like, 10 years ago.

Matt Godbolt

33:24

I'm not one of those.

Ben Rady

33:24

Really the post, the post agile people, uh, the agile, some of the agile refugees that were like, you know, why are y'all talking about stand-ups and cards and things. I just want to build stuff.

33:38

Um, but yeah, like, like queuing up changes. So like a perfect example of this is exactly what you're talking about: the longer you queue up changes, the more, uh, the more cost there is to actually deploying those changes. And that happens in multiple dimensions. One is that you've lost context, right? The people who made the changes, uh, just have slept since then. And they just don't have the sort of top of mind, uh, knowledge that they would've had if they were like, all right, I just built this thing and now I'm going to deploy this thing. Hey, it broke. That's probably the thing that I just changed, you know exactly what's going on. And it's all, the cache is all still warm. Right? It's all, it's all up there. Um, the other thing is that you, you can, uh, unfortunately, sometimes, uh, defer those bugs for your coworkers, which, you know, not only have they slept since then, they are not you, which means they don't know anything about this change that's going out.

Matt Godbolt

34:34

Which is what happened to me, yep.

Ben Rady

34:35

Exactly, exactly. So that can increase the cost. And the other thing is, is that you get errors on top of errors, right? So somebody checks in a change that breaks something, somebody goes and makes another change and they look at your code and they go, okay, well, apparently that's how it works now. I'll do that and they're doing something wrong. And then they make two of those things that are wrong. And that just sort of compounds on top of each other until the thing finally hits the real world. And then that whole chain of things breaks because we were building wrong on top of wrong this whole time. And you never knew it. Um, so those are, I mean, those are just some sort of basic ways, but it's like this, this, this general problem of, if you're queuing up changes to your system, you're taking on a lot of risk and you gotta be really careful that that risk is actually worth it.

35:19

Sometimes it is. Sometimes you can't just do things where it's like, yeah, literally every change just goes right to prod. You know, there are, there are situations where that can't happen. Lots of situations. Right. But, but understanding that your goal should always be to shrink it. Right. And to, and to also just recognize, like, if you can't do this, well, here are some of the problems that you're going to encounter. You're going to encounter the problem like you saw today. Okay. How do we deal with that problem when it happens? Um, one of the things that I have advocated for a long time is that git revert is not a personal insult. Um, reverting commits is something that you should take advantage of, right? Like it's not, you're not, you will have a much more complicated operational process if you have this mentality of everything that everyone has ever committed to this repository must be either fixed or remain pristine or never get rolled back.

36:13

Like your life will be made so much easier if you just sort of have a meeting where we all come together and be like, all right, everyone in this room, we're all going to agree: if I revert your commit, it's because I love you. And I want you to be able to go on vacation and not have to worry that the code that you've committed to the repository is perfect and unassailable in all ways. You can leave the building and go home to your family and loved ones. And if I see that you've made a mistake, as we all do, I'm just going to revert it. And I'm going to tell you that when you come in the next day, and you'll be like, yep, Ben reverted my commit. Thank you for reverting my commit. That means I can fix this now at my own leisure and not have to be woken up at two o'clock in the morning by pages or interrupted at dinner saying, Hey, Ben, you committed a bunch of bad code. And then you left the building and now we need you to fix it right now. It's like, can't you...can't you just revert it.

Matt Godbolt

36:58

I, I love this. I think this is a brilliant analogy. Yeah. Because there is, you're right, I mean, there is that funny social issue that, yeah, I do feel guilty reverting someone else's changes, like it's, you know, uh, somehow, uh, uh, a bad reflection on them, when it's like, no, it is a pragmatic thing that I'm doing to buy us back the stability that we had before. Unless the change was required for operational reasons, then often, as you say, it's like, well, okay, you can come back in tomorrow and you can revert the revert and then you can fix whatever issue it was. And then you can, yeah. No harm, no harm done. And I'd like to think that if someone reverted one of my changes, I wouldn't feel put upon, but you know, it's a, it's, it's, I, yeah, I do like the "I revert your change because I love you and I want you to have a lovely evening without me, or a vacation or whatever."

Ben Rady

37:51

Yes. Whatever it is, whatever it is, I want you to be happy. I want you to be uninterrupted in your life and I'm just going to revert your change and then we'll talk about it tomorrow. Right. Or whatever, right after lunch, whatever it might be. Um, and it's, it's, it's one of these things of like, I feel like if you can adapt some of these things we've talked before on this, on this podcast about like, you know, the reason that I got so interested in engineering practices, agile engineering practices in particular is because I sort of realized it's like, if there's certain things that you do and you do them well, there's a whole host of other things that you don't need to do. Right. And I feel like this is an example of that, where it's like, if you get comfortable with this as a team, as a, as an organization where it's like, yeah.

38:32

When we commit, it goes right to master, or it goes to the main branch, main branch gets deployed to production. That's just how everything works. If we run into problems, we revert the commit and then we've got a reverted commit. And then that gets deployed. Now the problem is fixed. Right? If you do that, you don't have the queuing problems. You don't really have to worry that much about versioning and like keeping old versions. Like you have sort of a nice thing of like, you know, depending on the particulars of your project, not every project is gonna be able to do this, but you can get into situations where it's like, you know, unless you find yourself very often needing to roll back and your deployment system, however it is, doesn't let you just roll back to a SHA, which some do, right?

39:16

Like, depending on how you set it up, you can say, okay, well, everything we've ever deployed is just, you know, marked by the SHA. And if you want to roll back to a particular commit, you can roll back, you can rerun that SHA. But there's like a whole bunch of versioning things that you probably also don't have to worry that much about. Like your solutions to those can be significantly simpler because, you know, you're just reverting commits instead of, you know, oh, we need to roll back to version 1.27, and then where's version 1.27? I don't know, it's stored in Artifactory or whatever, fetch it from Artifactory. There's just a bunch of stuff you don't have to build. So I think, and again, not every project is going to be able to do this. This is not a universal solution, but I think the main thing is just sort of thinking in these terms and trying to like simplify things in these ways. You'd be surprised at what clever solutions you can come up with if you just embrace the philosophy of it, right. You start with the philosophy and be like, how do I, how do we get as close to this as we can? Work back from that.
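The revert-and-redeploy flow described above can be pictured with a toy model. This is only a sketch under simplifying assumptions and not how git itself works internally: a commit is modeled as a dictionary of key changes, production always mirrors the tip of the main branch, and a revert is a new commit that restores whatever the bad commit overwrote. It ignores the conflicts a real `git revert` can hit when later commits touch the same lines. The `Commit` and `MainBranch` classes and the sample SHAs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    sha: str
    changes: dict  # key -> new value (None means "remove the key")


@dataclass
class MainBranch:
    commits: list = field(default_factory=list)

    def state_at(self, count: int) -> dict:
        """Replay the first `count` commits to get the resulting state."""
        state: dict = {}
        for commit in self.commits[:count]:
            for key, value in commit.changes.items():
                if value is None:
                    state.pop(key, None)
                else:
                    state[key] = value
        return state

    def deployed_state(self) -> dict:
        """Production always mirrors the tip of main in this model."""
        return self.state_at(len(self.commits))

    def push(self, sha: str, changes: dict) -> None:
        self.commits.append(Commit(sha, changes))

    def revert(self, sha: str, revert_sha: str) -> None:
        """Append a new commit that restores what `sha` overwrote."""
        index = next(i for i, c in enumerate(self.commits) if c.sha == sha)
        before = self.state_at(index)
        undo = {key: before.get(key) for key in self.commits[index].changes}
        self.push(revert_sha, undo)


if __name__ == "__main__":
    main = MainBranch()
    main.push("a1", {"timeout": 30})
    main.push("b2", {"timeout": 0})      # the change that broke things
    main.revert("b2", revert_sha="c3")   # history only moves forward
    print(main.deployed_state())
    print([c.sha for c in main.commits])
```

The point of the model is that history never moves backwards: the fix is one more commit on the main branch, so the same push-to-deploy path that shipped the problem also ships the repair, and there is no separate versioned artifact to locate.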

Matt Godbolt

40:13

So how do we get to that from databases? I feel like somewhere on the line, I know there is a link, but you kind of switched gears. And another thing, makefiles!

Ben Rady

40:21

I did, but yeah.

Matt Godbolt

40:25

So my understanding is that we got there from like, if you don't have to fire up a giant database or run against the big database, then that enables you to have the kind of self-contained hermetic project where you just clone the project and type make, and you can run all the tests. You can do all the deployment, you can do everything within that world without having some exogenous dependency an exogenous unnecessary dependency on a database, just trying to make sure that we've got the trail.

Ben Rady

40:54

To tie all these ranty pieces together? That's a good idea. That's exactly right. It's it's, it's, you know, everyone sort of agrees that simpler is better and, and all we disagree about is what does it, what does it mean to be simpler? Some people would say like, why are you, you know, building, you're writing your own code to scan a file to query things that you could just throw into a relational database. Isn't it simpler to just write a little bit of query instead of to write a hundred lines of code. And my argument a lot of the time, and again, this is very context sensitive, but a lot of the time is no, I'd rather write a hundred lines of code than have a database, because if I have to maintain a database, then I can't do all these other things. Other things are more valuable to me.

Matt Godbolt

41:35

There's a hidden cost.

Ben Rady

41:35

Yes, exactly. The other things are more valuable to me than saving myself a hundred lines of code. Right. I'll just write the hundred lines of code. It'll be fine. And then that means that when you clone my repository and type make run, the system comes up and you can use it just as a user would, with no special stuff to have to make it work. And when I deploy it, I know exactly how it's going to work, because I don't have to coordinate the deployment of the software with the deployment of a database or write database migrations that go from one thing to the other thing or any of that, because I have my hundred lines of code to replace all of that.
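A minimal sketch of the "write the hundred lines of code" side of that trade-off. It assumes, and this is not from the episode, that the data lives in a newline-delimited JSON file that ships alongside the code. The file name, field names, and the `scan_records` helper are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterator


def scan_records(path: Path, predicate: Callable[[dict], bool]) -> Iterator[dict]:
    """Yield each JSON record in the file for which predicate is true."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            if predicate(record):
                yield record


if __name__ == "__main__":
    # Build a throwaway data file so the example is self-contained.
    with tempfile.TemporaryDirectory() as tmp:
        data_file = Path(tmp) / "orders.jsonl"
        data_file.write_text(
            '{"id": 1, "status": "open"}\n'
            '{"id": 2, "status": "closed"}\n'
            '{"id": 3, "status": "open"}\n',
            encoding="utf-8",
        )
        open_orders = scan_records(data_file, lambda r: r["status"] == "open")
        print([record["id"] for record in open_orders])
```

A linear scan like this has no server to start, no schema to migrate, and nothing outside the repository to coordinate at deploy time. Whether that beats a real query engine is, as the conversation stresses, very context sensitive.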

Matt Godbolt

42:09

Cool. Well, I think that is databases fully covered. We need to come up with a better ending than that. Maybe we could stop a bit earlier than this and I'll just do some magic editing. Cause that seemed like a natural end point. Or maybe I can, maybe we just put this into it and then everyone can just see how rubbish we are at finishing things.

Ben Rady

42:31

How bad are we at endings? We are this bad at ending.

Transcript source: Provided by creator in RSS feed
