Rails at Super Scale with Kyle d'Oliveira - RUBY 667 | Ruby Rogues podcast

Speaker 1

00:05

Hi, everyone. Welcome to another episode of Ruby Rogues. I'm David Kimura, and today on our panel we have Matt Smith and Luke Stutters, and we have a special guest, Kyle d'Oliveira. Did I say that right?

Speaker 2

00:19

That's d'Oliveira. Yeah.

Speaker 1

00:22

So, Kyle, would you mind telling us a bit about who you are, who you work for, and some of the things that you're doing?

Speaker 2

00:28

Sure. My name is Kyle. I've been working for a company named Clio, a legal practice management SaaS. It's based out of Vancouver, Canada. It makes practice management software aimed at lawyers. We're looking at transforming the legal space. Our mission is to transform the practice of law for good. There's a nice little double entendre there. And it's been really interesting seeing some of the changes in legal that we've kind of made an impact with over the

00:57

last few years. I've been working on Ruby on Rails for the better part of the last decade. When I started working on Rails it was Rails version zero, and I've been upgrading Rails ever since, so now we're finally up to Rails 6, touching all of the major versions. My major focus at Clio, which I've been at now for eight years, has been on the back

01:24

end infrastructure side of things. So the main focus is scalability for the codebase, but also in terms of the organization: what happens when we have two hundred developers working, what happens when the datasets get to the size where we exhaust regular integers and actually need to go to bigints. We also look at approachability: how easily can we take a new developer, drop them into the codebase, and have them up and running?

01:51

Because as things go to scale, there are obviously new patterns that need to be adhered to that, you know, we don't necessarily need to focus on with small projects, but we do need to focus on for large projects, and my team has focused a lot on making the experience for all of the developers easy and fast.

Speaker 1

02:10

Yeah, absolutely. One thing that kind of rings true is you always have to think about scalability when you're developing, but don't actually write for scalability when you're developing. So keep it in the back of your head, asking: is this going to come back and bite me later, or is it, you know, really a non-issue? I remember one time I had a situation where I was storing just three kilobytes of data in a database, and I thought, okay,

02:40

this is going to get used a little bit. They were images, so you can kind of see where this is going. I'm like, you know, that's not a big deal. It's only three kilobytes. But unexpectedly the consumers loved the feature that it was supporting. And now that single table is over thirty gigabytes and it has millions upon millions of records. I'm like, oh, that was unexpected. But I guess that's kind of where I did not

03:08

think of scale at the time in the proper way. So introducing that kind of technical debt kind of painted us into a corner, because now transitioning away from that model is going to be a pain when you're dealing with that much data.

Speaker 2

03:22

Yeah, absolutely. It's hard to know what you don't know. So if you don't think about the scale at that point in time, it's hard to know what problems you're even going to run into.

Speaker 1

03:31

So you gave a talk last year, Death by a Thousand Commits. Could you give us a high-level overview of that talk and some of the things that it entails?

Speaker 2

03:44

Yeah. So working at Clio, the codebase is quite large. We have tens of thousands of commits that we go through, and it's really easy to see patterns of developers working on features. The features go live, and at some point in the next six months to a year, those features come back to bite us. So the first commit is great. By the tenth commit, you're starting to notice some things. By the hundredth, there are maybe some problems. And

04:14

by the thousandth commit, you've stopped, because now you have to completely refactor and rebuild to pay down a lot of this technical debt that you introduced. So my talk was about some of the lessons that we've learned. And although the lessons are very specific to specific problems, there's kind of a generalized idea of some approaches that you can take to dealing with

04:38

technical debt in your own projects. If you're able to, for instance, automate technical debt away entirely, well, now there's a whole class of problems you no longer need to think about, and you can feel confident that you're automatically protected against them. And if you are cleaning up after yourself as you go and making it easier to handle the curveballs being thrown at you, fixing technical debt and dealing with it when

05:02

you hit scale doesn't have to stop you entirely. It just becomes a constant, small tax that you pay. But if you invest in the tools, you can actually start moving faster even as you scale.

Speaker 1

05:12

Right. And so would you mind also explaining what technical debt is? What would you consider technical debt, and what are some things that you would maybe not consider technical debt? Kind of like some myth busting about technical debt.

Speaker 2

05:28

I would say technical debt is an accumulation of decisions made while coding that you eventually need to correct in the future. And as developers, I think we're always making these decisions. Can we cut a corner here to deliver a feature a little bit early? And I think technical debt isn't bad. I think when you are willing to get something in front of the users and deliver value earlier by incurring a little bit of this technical debt that you then have to clean up,

05:59

I think that's totally okay. But I think technical debt often comes from developers deciding that a framework needs to be super generic, and it gets a little bit speculative. Then they come to implement something in the future and it's just really difficult to deal with, because it's so generic and hard to understand that new developers have to unpack it and unwind it just to implement something new in it.

06:27

Some things that I think are not necessarily technical debt can come from decisions that actually made sense at the time and aren't necessarily cutting a corner. So, I mean, it may make sense to build a system that is very generic, and maybe that is the correct choice, and you build it, and then things change. When things change, that's when the technical debt comes back. But until things change, it actually might not be debt. I think that's

06:57

a bit of a generic answer. But it's hard to pin down a concept like technical debt, because almost everything we write is debt of some form.

Speaker 1

07:08

Yeah, I definitely have to agree with that. So what are some of the real-world examples that you guys have experienced over the years, where at the time you made a decision and you or the team thought this was a great choice, this is the right way to do it, but then later you found that it became more troublesome or more of a headache than it was worth?

Speaker 2

07:32

One of the things that popped up is actually something that, you know, we decided on because it's what the Rails community pushes for and what comes out of the box. So think about Rails migrations and how they're often applied. Think about some examples that you've worked on: there are often times where you use a tool like Capistrano, which deploys some code, and as part of the deploy a database migration

07:54

gets run. And for most projects that's fine. For most small things, the migration that runs is fast and it's not a problem. So this is an example of a decision where we were kind of like, let's just inherit what the community uses. But as we started scaling out, we started encountering problems with it. So, for instance, we had a table where, if you ran a migration on it, it took thirty minutes. This means that our deployment took thirty minutes. It also timed out, so we lost

08:22

all of the context of it. But also, during this period of time, the table was locked, so any queries that went to that table stopped being answered. So all of our servers effectively shut down, and we couldn't kill the ALTER TABLE because it was already in progress. And after it finished, we had a table in a new state, but the code hadn't actually finished deploying, so now we were running into different problems.

08:50

So this is a decision that makes a lot of sense when you're small: go really quick, because you can. But when you hit a certain scale, you can no longer run with those assumptions and you need to change them. So a new process needs to be built. And for database migrations, we need to build them in a way that is entirely asynchronous to the deployment process.

Speaker 3

09:11

Thirty minutes, that's quite a migration.

Speaker 1

09:14

Yeah.

Speaker 2

09:15

I think this table stores a little bit of all of the activity that users do, and it was the first table we ran into that exhausted 32-bit integers, so we needed to flip the IDs to be bigints. We didn't think that would be a problem either, and it's leaps and bounds bigger than any of the other tables we have in our system.
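The integer exhaustion mentioned here comes down to simple arithmetic: a signed 32-bit INT column tops out at 2,147,483,647. A minimal sketch of a headroom check follows; the helper names and the 75% threshold are hypothetical and not taken from Clio's tooling.

```ruby
# Largest value a signed 32-bit INT primary key can hold.
INT32_MAX = 2**31 - 1
# Largest value a signed 64-bit BIGINT primary key can hold.
INT64_MAX = 2**63 - 1

# Fraction of the key space already consumed.
# `current_max_id` is a hypothetical input, e.g. the result of SELECT MAX(id).
def id_space_used(current_max_id, limit = INT32_MAX)
  current_max_id.to_f / limit
end

# Hypothetical alert rule: flag the table once 75% of the INT range is used,
# leaving time to plan the switch to BIGINT.
def needs_bigint?(current_max_id, threshold: 0.75)
  id_space_used(current_max_id) >= threshold
end
```

Running a check like this on a schedule is one way to find out about the limit well before inserts start failing.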

Speaker 3

09:37

I'm going to ask you this question now: how do you make your system capable of asynchronous table migrations?

Speaker 2

09:46

That's a good question, and there are actually a lot of tools that exist, so we don't necessarily need to build this ourselves. GitHub has a tool called gh-ost. There's another tool by Percona in the Percona Toolkit. It's maybe "online schema replacement"; I can't remember the exact name. But the general strategy is that instead of changing a table with an ALTER TABLE, you actually create a

10:09

brand new table and populate that table with various mechanisms. Some of them use triggers, some of them use the binary logs. You get the new table in sync, and then do quick renames. So you rename the old table to mark it as old, you rename the new table to take its place, and the new queries start flowing into this new table. And

10:28

you can take as long as you want. It's entirely non-blocking, but it has to be a process that exists entirely outside of the deployment stack.
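The shadow-table strategy described here (the approach used by gh-ost and by Percona Toolkit's pt-online-schema-change) can be sketched as an ordered list of SQL steps. This is only an illustration of the idea: the `_new` and `_old` table suffixes and the `shadow_table_plan` helper are assumptions, and the real tools handle batching, throttling, and cut-over far more carefully.

```ruby
# Build the ordered steps for a shadow-table schema change on MySQL.
# The naming scheme ("_<table>_new", "_<table>_old") is illustrative only.
def shadow_table_plan(table, alter_clause)
  shadow  = "_#{table}_new"
  retired = "_#{table}_old"
  [
    # 1. Create an empty copy of the table with the same structure.
    "CREATE TABLE #{shadow} LIKE #{table}",
    # 2. Apply the schema change to the empty copy, where it is cheap.
    "ALTER TABLE #{shadow} #{alter_clause}",
    # 3. Placeholder: the tool copies rows in batches and keeps the copy
    #    in sync via triggers or by reading the binary log.
    "-- copy existing rows in batches and replay ongoing writes",
    # 4. Swap the tables in one atomic rename.
    "RENAME TABLE #{table} TO #{retired}, #{shadow} TO #{table}"
  ]
end

puts shadow_table_plan("activities", "MODIFY id BIGINT NOT NULL AUTO_INCREMENT")
```

Because the original table is only touched by the final rename, the long-running part of the work can happen outside a deploy.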

Speaker 1

10:37

Yeah, and that could have its own issues if you have, you know, thousands of requests per second coming in. So yeah, definitely not a fun problem to solve. And it's also, I guess, good to know what kind of migration, or really what kind of SQL statements, will cause a table lock. So adding an index or adding a column and stuff can lock your table. So being aware of what actually is going to lock the table is really good information to know.

Speaker 2

11:07

Some of them seem obvious, like I think if you're dropping a column or adding a column, that could potentially lock. But some of them are not. Like if you change a VARCHAR from a VARCHAR(100) to a VARCHAR(200) and you're just increasing it, does that lock? Maybe? I actually don't know off the top of my head. What if you change the character set? What if you change the collation? I don't know.
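One way to avoid guessing, on MySQL 5.6 and later (and on servers derived from it), is to state the locking requirement in the statement itself: with `ALGORITHM=INPLACE, LOCK=NONE` the server rejects the ALTER with an error if it cannot honor that, rather than quietly locking the table. A small sketch, where `non_blocking_alter` is a hypothetical helper name:

```ruby
# Append explicit algorithm and lock requirements to an ALTER TABLE statement.
# If the server cannot perform the change in place without locking, the
# statement fails up front instead of blocking other queries.
def non_blocking_alter(table, change)
  "ALTER TABLE #{table} #{change}, ALGORITHM=INPLACE, LOCK=NONE"
end

puts non_blocking_alter("users", "MODIFY name VARCHAR(200)")
```

Whether a given change qualifies still depends on the server version and the column involved, so this turns "does that lock?" into a question the database answers for you.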

Speaker 3

11:25

Is this on MySQL or Postgres?

Speaker 2

11:28

We use Percona, which is just an offshoot of MySQL, so it'll also be different between databases. So Percona might make different decisions.

Speaker 3

11:36

Shout out to the Percona guys. I've done some work in a place where we had some Percona consultancy. They were really good, really delivered.

Speaker 1

11:44

So that kind of covers the database and schema side of things. To step away from the code, you had mentioned onboarding people with a larger codebase. What does that process look like for you guys? And how do you really bring a junior or mid-level developer into the company and have them productive quickly?

Speaker 2

12:05

Yeah, so a lot of this comes from tooling and education. As senior developers, or people who just have different experience from different places, we've accumulated huge amounts of knowledge, and it's kind of all tribal. If you join a company that doesn't have a great strategy, a lot of the strategy for sharing that knowledge is just: work together, go submit pull requests, have them code reviewed, and learn from the code review.

12:33

And I think that's okay, you can learn that way. But there are better ways to push information to people, and this is the concept of just-in-time education. An interesting example of this can be through the linters. I did a talk about this as well for the 2020 Couch Edition of RailsConf, called Communicating with Cops, that focused on using RuboCop as a mechanism to provide education. It did a little bit of a deep dive into how RuboCop works

13:04

and how to build your own custom cop. But one of the approaches we take at Clio is that as people make mistakes and learn about bad patterns, we try to codify those patterns so that it doesn't happen again, and people get education about it right as it happens. A good example of this, one that is super trivial and doesn't often bite people until there's an unexpected case, would be the Rails convention for naming files.

13:32

We've seen cases where people maybe make a User model, but then make a typo in the spec's file name. So rather than calling it user_spec, they called it users_spec, plural, or something along those lines. And, you know, the spec still runs, but there might be some tooling that expects files to adhere to the Rails convention, and it doesn't quite line up.
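A check for this kind of naming mismatch does not need much machinery. The sketch below is plain Ruby rather than a RuboCop cop, and it assumes the common layout where `app/models/user.rb` is tested by `spec/models/user_spec.rb`; both helper names are hypothetical.

```ruby
# Expected spec path for a source file under the assumed app/ -> spec/ layout.
def expected_spec_path(source_path)
  source_path.sub(%r{\Aapp/}, "spec/").sub(/\.rb\z/, "_spec.rb")
end

# Return the spec files that do not correspond to any source file,
# which is usually a sign of a typo such as users_spec.rb.
def misnamed_specs(source_paths, spec_paths)
  expected = source_paths.map { |path| expected_spec_path(path) }
  spec_paths.reject { |path| expected.include?(path) }
end

misnamed_specs(["app/models/user.rb"], ["spec/models/users_spec.rb"]).each do |path|
  warn "#{path} does not match any source file; possible typo in the file name?"
end
```

Hooked into an editor or a pre-commit step, a check like this gives the feedback at the moment the file is saved.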

13:51

So you can have a linter that basically checks the names of the files and the names of the classes and makes sure that they're in line, and if not, alerts people, either as part of their editor or as part of committing code. They get warnings and they get education as they're writing code. So they just wrote something, they save the file, and they get a little warning popping up, like, hey, you may have made a typo here. And this goes

14:15

even as far as behavior. If we know that there are bad patterns, for instance making an HTTP call inside of a transaction, which we know is going to be potentially bad, we can actually automatically prevent that as soon as it starts happening, as soon as we're able to detect it. So it might be in a test, it might be as part of a linter. We provide that education right back to the developer so that they understand what they did wrong and the avenues for what they

14:40

need to do to fix it. So now when a junior developer enters the company, they can just feel free to start writing code, even code that maybe breaks some patterns, and a lot of the time they're going to start getting education right away. And then we can do all of the usual things as well. As pull requests come in, we

14:59

can review them and provide more education that way. And if we find a constant pattern where every junior developer who comes in makes the same mistake, let's codify that so that they get the feedback immediately.
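The "HTTP call inside a transaction" guard mentioned above can be illustrated with a small runtime check. This is a sketch under stated assumptions, not Clio's implementation: `TransactionGuard`, its methods, and `HttpInTransactionError` are hypothetical names, and a real version would hook into the database adapter and HTTP client instead of wrapping blocks by hand.

```ruby
class HttpInTransactionError < StandardError; end

module TransactionGuard
  # Current transaction nesting depth for this thread.
  def self.depth
    Thread.current[:transaction_guard_depth] || 0
  end

  # Stand-in for a database transaction block: track nesting while it runs.
  def self.transaction
    Thread.current[:transaction_guard_depth] = depth + 1
    yield
  ensure
    Thread.current[:transaction_guard_depth] = depth - 1
  end

  # Stand-in for an HTTP client call: refuse to run inside a transaction and
  # explain why, so the developer learns the pattern when the error appears.
  def self.http_call(url)
    if depth > 0
      raise HttpInTransactionError,
            "HTTP call to #{url} inside a transaction holds the transaction " \
            "open while waiting on the network; move the call outside of it."
    end
    yield
  end
end
```

Raising only in development and test, and reporting without raising in production, is one possible rollout choice.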

Speaker 1

15:10

Yeah, that's kind of one of my pet peeves, I guess you could say, with linting: if a particular project has a set of practices it likes to follow, maybe no more than one hundred characters on one line, that kind of feedback should never happen in a code review. If you have those kinds of expectations, then they need to be known expectations via a linter, whether it's RuboCop or standardrb, and it should never be an unknown

15:42

expectation to the developer. So I'm definitely on board with that. And that's something that I've had to fight and struggle with: going through code reviews and having everything kind of nitpicked, because it decreases the morale of the developer if every pull request they make is just getting bombarded with styling quirks or requests to change. So I definitely agree with that point.

16:14

And I think that every project should adopt some kind of linter if there are expectations of what they're doing. Even if you bring in RuboCop, disable everything by default, and then just start adding in or allowing the expectations your team follows on that particular project.

Speaker 2

16:34

Yeah, absolutely. And I think there's even one step further: a lot of linters can do auto-correcting. So, you know, if you care about having one blank line between methods, don't even have RuboCop or a linter warn about that. Just auto-fix it. That's something a developer just doesn't need to worry about. And, you know, it also removes a lot of the arguments over, like, should I use double quotes? Should I use single quotes? If it's just automatically fixed and

17:00

the developer can write whatever they want, that's fine. But I've also run into issues of pull requests being bombarded by style comments, and it really distracts from the code review about the behavior.

Speaker 1

17:11

Yeah, absolutely, although you do have to be careful about the auto-correction. I remember one time in my earlier days of development, when RubyMine came out, I tried out RubyMine's code refactoring thing. I forget what they call it, but I had some really poorly written classes

17:30

and it just absolutely broke everything. Like, I have no idea how that happened, but things just were not working the way they were before, and I had to pull that merge back out because, you know, of course, as an early developer, I didn't have any tests on the application, so I didn't really notice that things were broken until they got deployed.

Speaker 2

17:50

Yeah. Yeah, you definitely need to be careful there.

Speaker 1

17:54

So you also previously mentioned, not necessarily onboarding developers, but having a lot of developers work on the project. At what point do you go from a small shop to a large shop where you have to start putting different kinds of practices in place? And what are those practices when you're dealing with a lot of developers on a single codebase?

Speaker 2

18:16

So, actually, it's not clear where that point exists. I think it's probably going to be different for every organization, and probably different for exactly the work that you're running into. I think the thing is to listen to the pain points of the developers. So if you notice that there are, you know, pieces of friction that occur between developers, that's the point where maybe there are actually some tools that need to be built to make this easier.

18:41

So one thing that I think comes up really quickly in organizations is the concept of a testing server. So you've got your developers' environments, you've got maybe your CI, but maybe you want a production-like environment for things, and so you have a staging server. You know, when there are five developers, it's really easy to just coordinate and be like, oh, staging is mine now, I'm going

19:02

to test something. When I'm done, I will hand it off and maybe reset it back to the master branch and let people work that way. But that really falls apart when you have one hundred developers. How do you coordinate one server where everyone is trying to test something? If you have one hundred developers fighting for that resource, you can kind of budget a little bit by maybe having a fixed number of servers and, you know, round-robining them out, but again, at some point that's going to

19:29

break down. So if you think about what the problem here is, it's that every developer wants to potentially test something on an asynchronous schedule. Maybe it actually makes sense to build some tooling so that you can spin up staging servers on Amazon EC2 or on Google, on demand, and just route people there. And that's something that we ended up having to do really early: building our own tooling for what we call beta environments, where we can have

19:58

an arbitrary number of them. Someone spent the effort so that you can basically say, for this branch on GitHub, I want a clone of the site on Amazon, and within like ten minutes, you've got a domain that points to it. You've got the full stack, you have full control, you can do whatever you want, you can break it. It gives developers a lot of autonomy to test the things that they want, and, you know, removes a lot of this, oh,

20:25

let's deploy it and see what happens. You have a full environment that you have full control over. Go test it, go seed it with as much data as you want, and then see what happens. Another example along those lines is deployments. Do you have a handful of senior developers who can deploy? Or do you deploy on a schedule, like every Monday you do a big deployment? That's going to start really breaking down when you have

20:47

a lot of developers. You know, at Clio, everyone has the ability to deploy, everyone has the ability to merge code. So we give the power to the developers, and now, you know, a junior developer can come in, write a fix to a README, merge the code, and deploy it without having to really bother people outside of getting a code review. And now we're deploying code probably upwards of thirty-ish times a day, and that number is

21:12

just only going to go up. And so as we're running into these issues, we are just looking at what can we do to build tooling so that it's no longer frustrating for developers. And the important part of this is developers need to voice things, and you know, managers and companies need to listen. If we're wasting five hours a week per developer on this one thing that's frustrating, like build tooling around it.

Speaker 1

21:34

Yeah, that's one of the things that I did just for my own hobby project and continual learning. I have a self-hosted GitLab instance, and I set up a Kubernetes server which will automatically create the infrastructure for the application that got pushed. It happens on any push to a development or master branch, and also on each commit pushed up to the repository, and it'll spin up an entire infrastructure within Kubernetes with an FQDN where that feature can then be tested. So it

22:10

works on smaller applications. I don't know how it would work on applications that consume thirty gigs of RAM, but I think on smaller applications that kind of thing can really save you from having to have dedicated test servers that are shared by several people.

Speaker 3

22:27

When are you going to do an episode on that, Dave?

Speaker 1

22:32

I do have a Drifting Ruby episode on Kubernetes, which is where I got the inspiration from. In that episode, I just didn't tie it into the CI/CD portion.

Speaker 3

22:44

I've got a question for you, Kyle. It sounds like you've got a lot of data if you're running thirty-minute migrations, and you've got a lot of developers, and you've got good testing and good infrastructure. What I've found is that a lot of the really memorable problems I've had are where you get something running and it feels like it's going to be fine, but then it gets deployed against the master database, and that's the point at which there's some

23:15

bad data in there. There's something in there from ages ago, from a previous version, and it absolutely sinks you. And these days, whenever I possibly can, I just pull the entire production database out and test against that. Do you do that, or is your database just so huge to throw around that you can't do that, especially with a lot of developers?

Speaker 2

23:40

It used to be something that we did. We used to call it the snapshot, and you could point environments at the snapshot and run test queries on it. But we actually did hit a size where the time it took to set up the snapshot every day was taking longer than it would take to actually back it up. So it was just

23:59

starting to become unfeasible for us. And we're also dealing with sensitive data, and we don't necessarily want to give free access to all of our clients' data, so we instead try to invest in a little bit of tooling. We definitely still have issues where everything looks good in development, everything looks good in beta or test, and we deploy to production and something is wrong. So we think about what we can do to make that better.

24:24

And so, you know, if it's about a missing index on a database query or something like that, we can try to check that ahead of time, build some tooling, and alert people when something goes wrong. But also in production we can say, hey, this query took thirty minutes, that's unacceptable, or this query took five minutes, and return that information as an exception to the developers that they need to fix, but without

24:48

interrupting the actual request behavior. And if things go really south, just roll it back. It's not about blame if someone deploys something, it goes south, and they quickly roll it back. We just try to take that as a learning opportunity. How can we take that learning opportunity and share it with everybody so that everyone learns from it? Does that answer your question?
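Reporting a slow query "as an exception" without interrupting the request can be sketched as a timing wrapper that records a report and still returns the result. The names here (`with_slow_query_report`, `SLOW_QUERY_REPORTS`) are hypothetical; in a real application the report would go to an error tracker, and the timing would more likely come from the database layer's instrumentation.

```ruby
# Stand-in for an exception tracker; a plain array keeps the sketch runnable.
SLOW_QUERY_REPORTS = []

# Run the block, measure wall-clock time with a monotonic clock, and record
# a report if it exceeded the threshold. The block's result is always
# returned, so the caller's request is never interrupted.
def with_slow_query_report(label, threshold_seconds:, reports: SLOW_QUERY_REPORTS)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  if elapsed > threshold_seconds
    reports << { label: label, elapsed: elapsed, threshold: threshold_seconds }
  end
  result
end

rows = with_slow_query_report("recent activities", threshold_seconds: 5.0) do
  [:row_one, :row_two] # placeholder for the real query
end
```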

Speaker 3

25:10

Yeah, I mean, you must be dealing with a lot of data. I've worked with what you call HIPAA data in the States, where it's confidential data, and that hugely complicates testing and data transfers, because you either have to heavily anonymize or write your own tools to replicate a few hundred thousand medical records.

Speaker 2

25:34

Yeah. What we can also do, as I mentioned earlier, is use these beta environments that we can spin up. We just use a SQL dump to store data in there. And although this isn't necessarily production data, developers have full control over what that data looks like. And so, you know, if we wanted to see what happens if there are tens of thousands of something in a table, or more, we can just build little scripts that seed that

25:59

database and then test it outside of production. It's not perfect, because it doesn't always match the same shape as production, but it's an iterative process, and that information gets codified, so you can keep adding to the seeds in that manner so that they become a better and better representation as we go forward.

Speaker 1

26:15

Yeah, so kind of back to the technical debt. I have an unfortunate story of something that I inherited one time. I think metaprogramming is awesome and can do a lot of really cool things and can really get you out of a bind in certain situations, but it can also be overly abused. I was searching for a function that was not working properly within Ruby, and I couldn't find it in the codebase at all. So I thought, okay, well, surely

26:47

this is in a gem or something. So I started looking at all the gems that are included in this Rails application, started tearing apart the gems, opening them to search for this function. Still couldn't find it. Turns out they were doing a class_eval on something that's pulled from the database. So they actually stored Ruby functions as data within a column in the database, and that's what was getting executed. That's where the function was defined. So to me, that's a... What's that?

Speaker 3

27:23

What's wrong with that?

Speaker 1

27:26

Yeah, so, you know, other than that you could not possibly even test that bit of code with any kind of reason. It was a nightmare. So just a warning for when you think that you're doing something really cool and elegant that's avoiding code duplication or whatever: I would much rather have code duplication all across my application than that level of obfuscation, where you're never going to be able to even remotely troubleshoot it.
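The reason that method was impossible to find can be reproduced in a few lines. In this sketch the `Report` class and the `stored_source` string are hypothetical stand-ins for the real class and for the text loaded from a database column.

```ruby
class Report; end

# Stand-in for Ruby source text that was loaded from a database column.
stored_source = "def total; 40 + 2; end"

# Defining the method from a string at runtime, as in the story above.
Report.class_eval(stored_source)

# The method works, but nothing in the repository contains "def total".
puts Report.new.total

# Ruby reports an "(eval ...)" placeholder instead of a real file path,
# so neither grep nor an editor's jump-to-definition can locate it.
p Report.instance_method(:total).source_location
```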

Speaker 2

27:55

Yeah, metaprogramming is actually one of the best strengths of Ruby. You can do so much with it, but once you have it, it's the hammer and everything is a nail, and you want to use it. And that's often a trap that new developers, when they learn about metaprogramming, really want to

28:11

go into. I think a good lesson to come out of that story is that if you think about code, it's written once but read countless times, and so if you can take the little things to optimize the code for the reader, that is much better than sacrificing readability

28:30

to optimize for the writer. So if it takes you an extra thirty minutes to write a whole bunch of cookie-cutter methods, but now those methods are in place and they're static and easy to read, reason about, and test, that is well worth that thirty minutes, because you're going to lose more than that reading that piece of code in the future.

Speaker 1

28:50

Yeah, absolutely. And it could even be taken into something like private methods: if you have a class which has a bunch of methods, start sorting out which ones are private methods, so they do not need to be accessible to the consumer. I've had situations where I've worked on a class that grew over a thousand lines and there were hundreds of methods in there, and I had no idea which ones were truly supposed to be publicly accessible and which ones

29:24

were really meant to be private. So not having that level of abstraction, so to speak, you lose a lot of visibility in how important is this class to the consumer.
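Marking helpers as private makes the intended interface something you can ask Ruby for directly. A minimal sketch with a hypothetical `Invoice` class and a made-up 10% tax rate:

```ruby
class Invoice
  def initialize(line_items)
    @line_items = line_items
  end

  # Public API: the only method consumers are meant to call.
  def total
    subtotal + tax
  end

  private

  # Implementation details, free to change without breaking callers.
  def subtotal
    @line_items.sum
  end

  def tax
    (subtotal * 0.1).round(2)
  end
end

# Lists only the public methods defined on Invoice itself.
p Invoice.public_instance_methods(false)
```

With the helpers private, a reader of a thousand-line class can see its real surface area in one call.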

Speaker 2

29:37

Yeah, absolutely. Anything that you can do to make those kinds of classes easier to understand and read for a new person is great. And also, just backing up a little bit to your example, this is an instance where metaprogramming bit you, but metaprogramming is also interesting in that it could save you, because you can also ask Ruby

29:55

about Ruby. So if anyone didn't know, this is a tactic that I use all the time for debugging pieces of code that I'm not familiar with. If you have access to a console, you can ask Ruby what methods are available with a .methods call. You can also get access to the method itself and then ask it, like, what is its source, where does it live? That can make life easier when tracking down methods that may be dynamic or created by gems.
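The "ask Ruby about Ruby" tactic uses only core methods. The `Greeter` class below is a hypothetical example; note that `Method#source` (printing the method body) comes from the method_source gem that Pry uses, so this sketch sticks to `owner` and `source_location`, which are built in.

```ruby
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

greeter = Greeter.new

# Is the method available on this object?
p greeter.methods.include?(:greet)

# Grab the method object itself and ask where it comes from.
greet_method = greeter.method(:greet)
p greet_method.owner            # the class or module that defines it
p greet_method.source_location  # [file, line] for methods written in Ruby
```

For methods implemented in C, `source_location` returns nil, which is itself a useful hint about where to look.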

Speaker 3

30:23

I recently learned how to use the ls command in Pry, and now I just live out of the Pry ls command. My Ruby API traffic has dropped off considerably. I find .methods to be quite noisy. It's very verbose if you're trying to pick out which command it is. And I really like the Pry ls command.

Speaker 4

30:47

Yeah, I mean, one thing you can do to make that less noisy is take Object.new and subtract its methods out of that and sort it and all that sort of stuff, and you can do it all in a one-liner, because we're in Ruby. But yeah, ls is another great option.
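The one-liner described here looks like this in practice; `Account` is a hypothetical class used only to have something to inspect.

```ruby
class Account
  def deposit(amount); end
  def withdraw(amount); end
end

# Subtract the methods every object has, then sort what is left.
interesting = (Account.new.methods - Object.new.methods).sort
p interesting
```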

Speaker 3

31:03

My documentation has suffered for it, I must admit. Now my attitude is just, I can just ls the class and see what's going on, man.

Speaker 2

31:10

I think that's another example of someone making some tooling that, you know, makes something easier. Yeah, if you knew to call .methods and subtract Object.new.methods or Object.methods, it's great. But now it's two characters and it's nice and easy, and it's much more approachable, and you can have access to things that you may not have known existed.

Speaker 3

31:30

Can we turn back the clock and ask you about Rails zero?

Speaker 2

31:36

Oh, it's been a long, long time since I've worked on Rails zero. I can try to answer questions, but.

Speaker 3

31:41

So it sounds like you've been on a bit of a journey with scaling things up. What did you do before Rails zero?

Speaker 2

31:48

Oh, actually, most of my career has been working with Rails. So before Rails zero, I was working at, like, an enterprise Java shop, and I don't remember a lot of the details of it anymore. It's kind of too far in the past. But I think I've been working with Rails now for eleven years, I think, so it's been just a long time of just Rails. I don't remember a lot of the pre-Rails world, to be honest.

Speaker 3

32:13

That is the correct answer. There is no other system. I ask because we were talking about the N+1 queries, and my complaint is that Rails makes it too easy to do N+1 queries, because if you just kind of follow all the guides, that's what you get. If you kind of do a .all.each, then you're going to be there for a while, and you start noticing that when you start getting into a few

32:42

thousand objects. So you can be sitting there prototyping something and think this is great, and then when people start using it, you drop it in, and that's when you start hitting these gotchas. But I think people forget what the bad old days were like before you had the Rails tooling, the amount of time it took when you had to write your

33:01

own queries. It's really quite significant. And you mentioned enterprise Java; there was not a whole lot of object-relational mapping going on in that. So it is a double-edged sword. When you're operating at the scale you do, what are the parts of Rails that start to bite?
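The N+1 pattern under discussion can be illustrated without a database. The sketch below is a plain-Ruby simulation, not Active Record: `FakeDB` and its methods are hypothetical stand-ins that only count how many "queries" each access pattern would issue.

```ruby
# Hypothetical in-memory stand-in for a database that counts queries.
class FakeDB
  attr_reader :query_count

  def initialize(authors, posts)
    @authors = authors
    @posts = posts
    @query_count = 0
  end

  def all_posts
    @query_count += 1
    @posts
  end

  def author_by_id(id)
    @query_count += 1
    @authors.find { |author| author[:id] == id }
  end

  def authors_by_ids(ids)
    @query_count += 1
    @authors.select { |author| ids.include?(author[:id]) }
  end
end

authors = (1..5).map { |i| { id: i, name: "Author #{i}" } }
posts = (1..100).map { |i| { id: i, author_id: (i % 5) + 1 } }

# N+1: one query for the posts, then one more query per post.
naive = FakeDB.new(authors, posts)
naive.all_posts.each { |post| naive.author_by_id(post[:author_id]) }

# Batched: one query for the posts, one query for all their authors.
batched = FakeDB.new(authors, posts)
loaded_posts = batched.all_posts
author_ids = loaded_posts.map { |post| post[:author_id] }.uniq
authors_by_id = batched.authors_by_ids(author_ids).to_h { |a| [a[:id], a] }
loaded_posts.each { |post| authors_by_id.fetch(post[:author_id]) }

puts "naive:   #{naive.query_count} queries"   # 101
puts "batched: #{batched.query_count} queries" # 2
```

With a few records the difference is invisible, which matches the point made above: the problem tends to surface only once a prototype meets a few thousand rows.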

Speaker 2

33:23

We've definitely been bitten by how easy it's been to make N+1 queries in the past. I think pretty much any Rails shop is going to be doing it. Rails offers tooling to help with that, but the tooling still requires a lot of effort. You have to kind of know what N+1 query you're introducing and fix it. So that's where you can build some more tooling. There exists a gem that we built, the

33:47

JIT preloader. There's also another community gem called Goldiloader that removes stuff like N+1 queries, and those are ways to, like, basically eliminate those kinds of problems. Some other things that kind of come up on Rails

34:02

as we are building is, like, discoverability of templates. So I think one of the previous episodes of Ruby Rogues was talking about this, but as it scales up, like, Rails ERB makes it really easy to render partials all over the place, but it's really hard to understand, like, if you're looking at a page, where are those partials actually coming from? And how can you dig back into them? So that's a challenging thing with

34:27

Rails as well. There's also some things with the community for things like paging that can be problematic at scale. If you look at what some of the basic gems offer, it often comes down to a limit and offset, which is really fine on small data sets, but as you get to data sets that are really, really large and you're going to page really deep into them, it actually starts really falling apart and breaking down, and these are things that you might not know until you actually just

34:54

hit that scale. I think some of the Rails conventions also start becoming a little bit problematic, and you see a little bit of discussion about this. You know, Rails at one point said throw all the logic into the controller, and then eventually the controllers became skinny and all the models became really fat. And I'm sure everyone has that god object that exists in their project, the user object or the account object that is five thousand

35:19

lines and really difficult to reason about. And people are offering opinions of having, like, service classes or various different patterns to try to combat that. But we're still trying to unpack some of the things that you started Rails projects with.
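On the paging point made earlier in this answer: the speaker does not name a replacement for limit and offset. One commonly used alternative is keyset (or "seek") pagination, sketched below on an in-memory array as an assumption rather than as what Clio does. The `Record` struct and helper names are hypothetical. In SQL terms, the first helper corresponds to `ORDER BY id LIMIT n OFFSET m`, where the database still has to walk past the skipped rows; the second corresponds to `WHERE id > last_seen_id ORDER BY id LIMIT n`, which can seek directly using an index on `id`.

```ruby
# Hypothetical record type; ids have gaps, as they would after deletes.
Record = Struct.new(:id, :name)
records = (1..1_000).map { |i| Record.new(i * 3, "row #{i}") }

# Offset paging: skip (page - 1) * per_page rows, then take per_page.
def offset_page(rows, page:, per_page:)
  rows.sort_by(&:id).drop((page - 1) * per_page).first(per_page)
end

# Keyset paging: remember the last id seen and continue after it.
def keyset_page(rows, after_id:, per_page:)
  rows.select { |row| row.id > after_id }.sort_by(&:id).first(per_page)
end

page_40 = offset_page(records, page: 40, per_page: 25)
last_seen_id = offset_page(records, page: 39, per_page: 25).last.id
same_page = keyset_page(records, after_id: last_seen_id, per_page: 25)

puts page_40.map(&:id) == same_page.map(&:id) # true
```

The trade-off is that keyset paging cannot jump to an arbitrary page number; it only moves forward (or backward) from a known position.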

Speaker 4

35:35

One question on that, as far as what you've seen in the progression of the companies you've worked at: how do you do documentation right? Like, on the one hand, we've just talked about how you can use cops, you can use linters, and say go out and try things, break things, autocorrect things, experiment, basically. Then there's self-documentation, making sure you're writing good method names, good class names that are intuitive, and then there's inline documentation,

36:04

and then there's high-level documentation of, hey, we're using this set of conventions and everything else. This is a big question, but what do you think is the right thing to put in each of those buckets in order to make an intuitive project that scales across, you know, more than twenty developers, up to one hundred developers?

Speaker 2

36:25

Yeah, and you know, here's a little bit of, like, my kind of thoughts on it. But I'm not going to say, like, my thoughts here are perfect. I think everyone's mileage will vary, because documentation is a tricky thing.

36:34

So if you're getting to, like, gotchas, like if you ever tell someone, like, oh, if you see this pattern, don't do it like this, or if you have code reviews that are like, oh, I've been bitten by this before, that should be something that falls into, like, the linting or, like, the

36:51

just-in-time education where you try to codify that. If you see people that have inline comments in code that say, you know, like, these next few lines are going to iterate over something and do these operations, that's probably an indication that their code is not written well enough to describe itself, and that comment is not super valuable. So it might actually be that that comment shouldn't exist, and instead we should maybe extract a method that describes it

37:16

better and kind of move in the direction of code describing itself. When you are implementing something that's specifically tied to code, it should probably exist at the code level. So if you have, like, a module that you want things to include, and developers need to implement certain methods in there, maybe the module should define those methods and raise, like, a NotImplementedError that has a very clear: this is what this method

37:41

should do, this is what it should return, here are some examples, and just link to them in your own codebase. And so now when a developer looks at that specific piece of code, it's still tied to the codebase. But all of that's, you know, at the codebase level.
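The module pattern described here might look like the following minimal sketch. The module, classes, and method names are all hypothetical; the point is only that the contract and its error message live next to the code that depends on them.

```ruby
# Hypothetical module for illustration only.
module Exportable
  # Classes that include Exportable must override this method.
  # It should return a Hash of column name => value for one row.
  def export_row
    raise NotImplementedError,
          "#{self.class} must implement #export_row and return a Hash, " \
          "for example { id: 1, name: 'Acme' }"
  end

  # Shared behavior that relies on the method above.
  def to_csv_line
    export_row.values.join(",")
  end
end

class Client
  include Exportable

  def initialize(id, name)
    @id = id
    @name = name
  end

  def export_row
    { id: @id, name: @name }
  end
end

# This class forgot to implement export_row.
class Matter
  include Exportable
end

puts Client.new(1, "Acme").to_csv_line # 1,Acme

begin
  Matter.new.to_csv_line
rescue NotImplementedError => e
  puts e.message
end
```

One detail worth knowing: `NotImplementedError` descends from `ScriptError`, not `StandardError`, so a bare `rescue` will not catch it. Some teams define their own error class for this reason.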

37:55

There still needs to be something at, like, a higher level, that's like a README in the documentation or in something else entirely. So we have stuff that exists in READMEs that's kind of more about, like, process, but process that is specifically related to our codebase. So a good example of this would be: how do you do these asynchronous migrations? So, like, this isn't really super tied to code, because you might make a migration,

38:22

but then what's the process for getting that live? So we have, like, a step-by-step guide for Clio: if you want to do a migration, here are the steps that you need to take. And as much as we can, we just link back to code rather than reimplement the code. But we'll also just describe things in English and offer templates there. And then we go one level higher, to things that exist more at, like, a process level for the organization. So

38:47

for that we use a tool called Confluence. There's lots of tools that exist that kind of do similar things, but that's for things that exist outside of the codebase. So if an incident happened, how do you do a post-mortem or root cause analysis on that,

39:02

and there'll be documents for that. Or, you know, if you wanted to propose a new style or a new feature, and you wanted to get some buy-in on using some new architecture, or just wanted to make sure that the approach is correct, you can do, like, a design doc in Confluence and get people kind of bought in well before you've actually written the code. But once the code is written, that document is less relevant.

Speaker 4

39:24

Absolutely. I was kind of going from the standpoint of, like, we were talking about bringing a new developer in and getting them used to the whole environment, and you've definitely tackled some of that in terms of, you know, here's the process, the migration example there. What about just getting them used to the entire structure of your application, where certain logic lives, certain design paradigms that you've talked about?

39:47

Some of those can be encapsulated in linters, but some of them are larger than linters. And so is that when you're doing the specific guide for walking them through that process?

Speaker 2

39:58

Yeah, so there are definitely things that linters aren't going to be able to do. Like, a linter won't be able to tell whether this thing should be a model or a service class or something, right? It doesn't understand the business logic of it. So for things like that, we kind of have to rely on, like, little handbooks. Like, we've codified our style guide, and we try to make sure that we keep that up to date.

40:20

There are some things that we still teach through kind of tribal knowledge and code reviews. Like, if someone submits a pull request and we notice it, we'll still correct it there, and we'll do a lot of pairing, so we'll get developers up to speed by working with people as opposed to just going off on their own. But I think this is just a learning process. Like, I don't think we are perfect at getting developers onboarded,

40:41

and I don't think anyone is. And I think that's the important distinction: it's an iterative process. If you bring in three developers and they all have the same issue, that's probably when you might need to introduce some new documentation and be like, hey, here's our new developer handbook. You might want to read it.

Speaker 4

40:58

Absolutely. And then on top of that, you have personalities too, and, you know, certain people gravitate towards certain things. What are your methods when you have what I would consider external documentation, whether that's living in a README, where it's not in a Ruby file

41:15

or an HTML file or something like that? Do you guys have any triggers in order to say, hey, if something happens over here and we decide on a new paradigm, make sure you go update that guide documentation? Or is it, oh, we just brought a new person in and we've got this new convention that's not documented, oh geez, we've got to go update that documentation, and it's kind of an only-when-you-discover-it type of issue?

Speaker 2

41:42

So I think the answer is both. I definitely think we still have places where our documentation drifts and then somebody notices and we're like, oh shit, we've

41:50

gotta fix that. But we also do leverage tools like Danger, Danger JS, with, like, GitHub, where it can look at code, and it's not necessarily like a linter basically saying, like, hey, this is bad, but it can make a comment being like, hey, you're doing something, maybe this is related to this link over here, and direct developers or whoever's reviewing to go take a look at the documentation. Maybe there are no changes required there. And we definitely need to be careful about how much

42:16

noise we generate. But, you know, in the case of, like, a migration, if a developer writes some migration and then submits it, we could basically say, hey, did you add a new file to, like, the db/migrate directory? If so, make sure you're following the guide steps in here and make sure that it aligns, and kind of point them back at the documentation, both for the writer of the pull request and the reader, and that kind of helps make sure that things stay

42:38

in sync. It's not a perfect process. I think we're just slowly getting better at making sure that documentation stays up to date.
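The migration reminder described above could be sketched as a small rule like the one below. This is plain Ruby, not Danger's actual API: in a real setup the list of added files and the commenting mechanism would come from the review tool, and the guide path and message wording here are hypothetical.

```ruby
# Hypothetical location of the team's migration guide.
MIGRATION_GUIDE = "docs/migrations.md"

# Given the paths added in a pull request, return the reminder
# comments a review bot might post. An empty array means no noise.
def documentation_reminders(added_files)
  reminders = []

  if added_files.any? { |path| path.start_with?("db/migrate/") }
    reminders << "This pull request adds a migration. " \
                 "Please check it against the steps in #{MIGRATION_GUIDE}."
  end

  reminders
end

puts documentation_reminders(["db/migrate/20200601_add_index.rb", "app/models/user.rb"])
puts documentation_reminders(["app/models/user.rb"]).empty?
```

Keeping each rule narrow, so that it fires only when a specific kind of file changes, is one way to respect the concern about noise raised above.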

Speaker 4

42:45

Yeah, that's always the painful part. Those are great insights.

Speaker 3

42:49

What do you think about DHH? Guy, it's a Weirdosney I was, I was. It did denounce that just me. No, I love DHH. He did a book quite a few years ago called Rework, which was prophetic, really, in the current situation, about working from home. He did a RailsConf talk, you know, I think it was a couple of years ago, where he said that at Basecamp they have never had a DBA, so they've never employed a person whose job it was to administer the database.

43:26

This is something where Rails has just magically scaled up and the database has scaled up. Are you in the same situation? Have you never employed a DBA for your very large Rails database?

Speaker 2

43:41

Yeah, actually we are in the same situation. I think we were going to hire a DBA this year prior to the pandemic, and then I think there were some complications. But prior to that, the company has been operating for over eleven years, and I think until now, no DBA. We definitely have some DevOps that are a little bit, like, focused on making sure that the database is running and making sure that, you know, we've

44:06

got replication set up and proper statistics. But we kind of put the onus on everyone. Like, you don't have one person who is the guru of SQL, you have everyone, and so everyone tries to teach everyone these things, and we try to do our best to share that knowledge where we can to make everyone as expert as we can.

44:25

So we've managed to go, you know, eleven years with no DBA, and I think we're only getting to wanting one now because we're trying to do, like, really customized processes. How do we, you know, take this online schema migration stuff and make that completely automated, which is actually going to be a completely distinct system from the Rails system, because we're going to want to apply it to any of our projects? Or maybe some gotchas

44:49

around, like, upgrading MySQL. There's probably some things that

44:53

they might actually have really good insight into. But I think our general approach is, even in that situation, we're going to have one DBA and hundreds of developers, and we want to make sure that, you know, they may have knowledge and might be useful for talking through things and sharing things, but the work is still going to fall on the developers, and, you know, we mean to make sure that everyone is learning as much as they can and not just blindly hoping that the DBA

45:19

is going to handle it.

Speaker 3

45:20

Yeah, I mean, the way DHH presented it, it was kind of, this is a necessary evil, to have a database specialist, and instead Rails enables developers to kind of handle this themselves and not just kind of blame the database man or woman when the thing goes wrong. Surely as a company gets bigger, you have more specialized roles and not less specialized roles?

Speaker 2

45:52

Yeah, I would agree, and I think there are more specialized roles, but I think there are skills that apply to everyone. So you know, as the company grows, you may have more specialized roles that have more specific knowledge.

46:04

But I think with that specific knowledge comes the responsibility that they are not gatekeepers of that knowledge, right? They may be experts and they may be building content, but I would say part of their job is to make sure that that content is consumable by everyone. And you know, if they're answering the same questions over and over and over, they're not doing their job to educate people on how to self-serve and do it themselves.

46:30

And that's how we learn and grow as a community and get better is just by sharing this knowledge.

Speaker 3

46:36

It's really quite an interesting situation. I don't know what it means for the DBAs, but I think there's definitely more database work out there. But I think because Rails just makes it so easy to work with databases at scale, you kind of tend to hit that stage much, much later on.

Speaker 2

46:57

Yeah, I agree. You don't necessarily have to have everyone custom-building SQL. Like, Active Record does a pretty good job of being an ORM that lets developers just do the things they need to do. And you know, there's data notifications available to easily add tooling that you don't need the DBA for. But, you know, as things grow,

47:18

there's things that Rails doesn't yet have tooling for, and maybe that's something where, like, if you have a DBA who is well versed in Rails, maybe they can contribute back to the framework or write their own gems that can help everybody get better at working with databases. And you know, it doesn't necessarily invalidate their job, but their job becomes more of a knowledge producer, and they try to share that knowledge and make the community better.

Speaker 4

47:45

Yeah, we're in the same boat. We like to push that knowledge down as far as possible, but there certainly are opportunities, when you're deep in materialized views and windowing in Postgres or something like that, where you're just like, I really want to phone a DBA friend. And that's the con side, I would suppose.

Speaker 2

48:04

Yeah, and I think that's, like, the role of the specialists, the people who have the specialized knowledge: they're probably more consultants. And you know, you have someone who's like, I've got a really gnarly problem, I don't know what to do. Yeah, like, get them to sit down and help you. And that's a big asset, that they can help people, and you know, if that's a one-off, it's a one-off. But if they do this ten times in a week, maybe there's education there, or maybe there's

48:26

tooling. And I think it goes for pretty much any role where you ever feel like you're just throwing something over the fence. If you push that responsibility also to the developers, you can also end up with a much higher quality project.

Speaker 4

48:39

Teach them how to use includes and avoid some of those massive queries and N+1 problems.

Speaker 2

48:46

Or use some of the gems available and have the N+1 queries just automatically avoided for you. You bet.

Speaker 1

48:53

Yeah, I've had some includes statements which span fifty lines on some projects that I inherited. It's insane, the kind of data that they're trying to return. But yeah, it's crazy. Good advice.

49:07

Was there also anything else

49:09

that we want to talk about? I know we're getting to about that time.

Speaker 2

49:13

I'm just going to mention one thing about includes, because I think this is another gotcha of Rails: they don't really teach you what happens with includes, and includes actually does one of two things in the background. It either uses a preload or an eager load, and a preload splits it off into a different query entirely, where you do something like select star from table where ID is in this big list. But then there's eager load, which

49:37

tries to smoosh it into one big query. This is something where Rails always suggests using includes, because it'll handle that distinction for you. But that distinction actually makes a difference at scale, and when you're dealing with large tables, eager load is almost always significantly worse, and so almost all the time you actually want to use preload. Same interface, but it's just this interesting little gotcha that you don't really realize until it starts biting you.
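The difference between the two strategies can be made concrete with a small simulation. The sketch below uses plain Ruby arrays rather than Active Record, and the table names are hypothetical. It models what each strategy asks the database to send back for one account with two has-many associations. The speaker does not say why eager load is worse on large tables; the row multiplication shown here is one commonly cited cause and should be read as an assumption.

```ruby
# Hypothetical data: one account with 4 notes and 5 tasks.
account = { id: 1, name: "Acme" }
notes = (1..4).map { |i| { id: i, account_id: 1 } }
tasks = (1..5).map { |i| { id: i, account_id: 1 } }

# preload-style, roughly Account.preload(:notes, :tasks):
# one query per table, with the rows stitched together in Ruby.
preload_result_sets = [[account], notes, tasks]
preload_rows = preload_result_sets.sum(&:size)

# eager_load-style, roughly Account.eager_load(:notes, :tasks):
# a single query that LEFT OUTER JOINs both child tables, so every
# note row is paired with every task row for the same account.
joined_rows = notes.product(tasks).map do |note, task|
  { account: account, note: note, task: task }
end

puts "preload:    #{preload_result_sets.size} queries, #{preload_rows} rows"
puts "eager_load: 1 query, #{joined_rows.size} rows"
```

Here the single joined query returns 20 rows against 10 rows across three simple queries, and every joined row repeats the account columns. With thousands of child rows per parent the gap grows quickly, which is consistent with the advice above to reach for preload explicitly.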

Speaker 4

50:05

And you got to remember everything is just a tool, and you can either smash your finger with that hammer or you can build what you want to build with it.

Speaker 2

50:15

Exactly.

Speaker 1

50:17

Great. Kyle, if people want to follow you and some of the stuff that you're doing online, where should they go?

Speaker 2

50:22

I don't really have a huge online presence. I do have a GitHub account, but that's mostly for working on, like, public gems for the company. But what I'm trying to do is be a little bit more present in the community. So I do have some talks available from RailsConf, and my goal is to be pushing out a little bit more written content, which is available at, like, the blogs that Clio provides. So I can provide a link for that in the future, as well as a link to any of the talks

50:51

that I have. But unfortunately I'm not a super user on Twitter, but I can also provide my LinkedIn, where I sometimes post new information as well.

Speaker 1

51:00

Awesome. Well, I'm going to move us over to some picks. Luke, do you want to start us off?

Speaker 3

51:05

Yeah, listen, listen to this. Listen to this. Can you hear that?

Speaker 2

51:12

I can't hear anything.

Speaker 3

51:14

That is the sound of me signing up for DriftingRuby.com, which is quite an excellent series of Rails casts, including the excellent From jQuery to ES6 episode. I am a notorious jQuery user, almost an unrepentant one, but Drifting Ruby has let me see the light and I'm a newly reformed character. So my pick is DriftingRuby.com.

Speaker 1

51:43

I must say that's a great pick. So all right, hey, Matt, you want to chime in with some picks.

Speaker 4

51:50

Well, my pick comes out of this. I'd say that Danger JS is something that I really want to look into. We're significantly investing in CI/CD infrastructure and deploying those branches like you were talking about, Dave, and so that looks like a really great way to tie back to documentation and check the best practices that conform with the rest of the company. And that's my pick for today. I'll let you know what I discover.

Speaker 1

52:18

Awesome. I'll jump in with a couple of picks. One is from Google. It is a Titan Security Key. Other companies have similar products, like the YubiKey. It's a USB or NFC key that will do your authentication for you. So actually, I have a couple of these arriving in the mail today in preparation for another Drifting Ruby episode that I want to do on these things. So that should be a pretty interesting one. I don't think it's going to have too much popularity because I

52:49

never had one of these keys before later today. And the other is, I have now in front of me a little rack of Raspberry Pis with 8 GB of RAM that I'm building into a tiny Kubernetes cluster for, well, just because I can, really. So I love Raspberry Pis, and they just released their eight-gigabyte versions, which actually makes it nicer to run some heftier things on them now. Still slow, but still a lot of fun. All right, Kyle, do you want to join in with some picks?

Speaker 2

53:26

I didn't prepare anything, so I actually don't have anything that's off the top of my mind here for things to just call out.

Speaker 1

53:32

All right, fair enough, Well, it was great talking to you, Kyle, and I always like talking about technical debt because I am notorious for introducing it.

Speaker 2

53:41

I'm always happy to, like, build tools to fix these things so that we can make it better.

Speaker 1

53:46

All right. Well, that's a wrap for this episode. We appreciate you coming and talking with us. So it was a lot of fun.

Speaker 2

53:53

Yeah, it was wonderful. Thank you, Bye, take care.

Transcript source: Provided by creator in RSS feed: download file
