Hello and welcome to Postgres.FM, a weekly show about all things PostgreSQL. I am Michael, founder of pgMustard, and this is my co-host Nikolay, founder of Postgres.AI. Hey Nikolay, what are we talking about today?
Hi Michael, we are talking about some article which attracted my attention.
And I hope not just the article, because it was the article was about best practices, and I'm I'm hoping to discuss that.
Yeah I always like like 15 mistakes, 25 recipes, this kind of titles.
Clickbait.
Yeah and honestly I chose it like let's just choose this article, and it attracted my attention just by title. 7 crucial Postgres best practices. Good title, right?
Yeah, and we haven't had an episode called best practices. So it's an interesting, it's a interesting omission on our part.
Really? Yeah. Yeah, okay, best practices. We have something like mistakes to avoid maybe, right?
Oh, maybe, yeah.
From my experience being program committee of various conferences and just general conference experience, people love to hear about mistakes. They also love to hear pieces of advice, but mistakes we somehow love more because it's like things to avoid. And yeah, and best practices. What does it mean best practices? Do this, not that, right?
Yeah, I've been thinking about this. What's the difference between best practices and good practices? Like, I think it's like shorthand for if I'm new to this and I don't really know what I'm doing, what are some things I can do to not look stupid?
Or to, you know, to at least cover what people generally say is a good idea because in anything complex, especially databases, we know often it depends, like it's going to depend on exactly what you're doing as to whether you should do X or Y. So I think these lists and these suggestions are almost fraught with danger from the beginning, but I think they can be useful educationally. Like, they can be useful for people to say, in general, this is a good idea. In general, this is a bad idea.
You could be a great consultant, you know. Because you sound like one.
Yeah. Well, what do you think best practices mean?
For me, it's not about avoiding to be, to look stupid. It's about, I like the word shortcut, like to quickly find what other people already experienced and choose maybe not best, but something better than you would choose on your own, right? So.
That's a good point, actually. Not so much about looking cheaper, but maybe to avoid things going wrong. Like to avoid the most common traps. Right. The goal
is not about how we look. The goal is not to look bad when making decisions, engineering decisions around polygons, but the goal is to make database work better, more reliably, more like better performance, avoid data loss, avoid downtime, these kind of things. Actually 3, well, correctness is 1 of the goals, but naturally Postgres has very good, in most cases, if we don't touch the null topic, right? Null topic.
In this case, if we don't touch the null topic, most of the time Postgres works in a very correct way and predictable. I expect this, I got what expected. And so in general, 3 areas, big areas are good performance, reliability, and well, avoidance of data losses, big fear of database guys, right? And also no downtime or very short downtime.
Yeah, perfect.
Big 3 areas. Let's dive into this article. I know we both are not thinking it's super great material.
Yeah, sadly.
Let's be honest. Well, it's an attempt.
It looks to me like somebody has asked their favorite LLM for a list of Postgres best practices and almost copy and pasted it verbatim into an article. That's what it looks like to me without much review.
Well maybe it's written by LLM, yeah I'm not sure. The author name is Zak and I think it's good that the good attempt is made to bring this again. And actually it was on Hacker News Top, I guess. That's how it caught my attention. It's a good attempt, but I agree with you some parts. Maybe it's written purely manually. It's possible. But I would like to just reflect on the fields covered, 7 fields. From the beginning, I don't agree that we have 7 practices here.
We have 7 fields covered and inside each field we have some practices mentioned. And you also mentioned that you think there should be more fields covered if we try to come to have this like kind of digest of tips, right? But let's touch, my idea, let's touch every one of these seven, bring up to three our own tips we should, like, should be noticed, like, involved into consideration. And then in the end, I'm very curious what you think is missing in terms of the fields, the areas. Right?
So, deal. Okay. First, the first field is database design and discussion is about naming conventions and so on. What what you think? What's your number one take when we think about how to make database design good? What what makes you good?
Yes, I mean I'm surprised it starts with this. I actually saw some of the Hacker News comments and it's an odd choice of where to start but I do see a lot of people making the mistake of using spaces in their names for objects or using mixed case and things and technically if they followed this advice they wouldn't do that so I do see people making the opposite mistake so yeah I don't think it's bad advice and generally people do have conventions.
I actually follow a lot of the ones that are listed here. So yeah, not bad. How about you?
Well, first of all, I think it's not a mistake if you... My previous startup had camel case table names. It was inherited and I understand it very well that it's not super convenient. I left it as is because it was intentional. I also think one step back, database design being the first topic, and then performance will be the next topic and leaving reliability like backups and HA later.
It just shows a tendency that probably this article and this set of tips are targeting more backend engineers, developers, programmers, or those who create some systems rather than purely database folks. And it's totally fine. And also, like, if we take managed service, managed Postgres, like RDS or something, The important areas like backups HA are presumably solved already. So we have opportunity to focus on things we deal with every day, like database design, performance.
These things are closer to us. I can understand this choice of order. If we talk about database design, I think the main idea is to have some convention. It can be even camel case with spaces, but you understand you will be forced to use double quotes all the time, everywhere. But if it's choice, well, it's inconvenient. I know I was there. And I was there after a couple of other startups where I was, how to say, like, very, very, like, polishing every symbol, you know.
I think this is what matters here. Like, convention is important for yourself, for people who work with you also, right? And for LLMs who work with you as well.
But it's a good point. It doesn't really matter why is it number 1 in this list. Like, yeah. And there's also no mention in database design and schema design, there's no mention of normalization, which is a bit odd.
Well, let's shift to what we think matters. So you say convention and also we need to normalize the tables. I tend to agree. And also, if you think about convention, I love also to think not only about how we name database objects, but also how we write SQL. I'm a big fan of Mozilla SQL Style Guide, which is closer to normal programming languages and moving away from this central space colon, right? How to say like central space thing like in... We all...
You mean like an underscore?
No, not only, so SELECT space blah, blah, then you're from and you need to move from. So you maintain the central space vertical line.
Oh, alignment.
Alignment of spaces. Alignment, yes. I honestly, I hate it so much. I know it's old-fashioned approach and many books follow this rule, but it's so inconvenient, guys. Just take Mozilla SQL Style Guide, publish it inside your company wiki or documents, like any confluence or something, what you have. Adjust it for your flavor a little bit and that's it. And agree, okay, we are following this convention.
As for normalization, I guess let's mention the episode we had a couple of weeks ago, 3 weeks ago maybe, with Franck Pachot, about NoSQL versus SQL. And we talked about normalization, denormalization quite a lot there. I agree this is a very important topic to agree on and follow this. Well, also, for me, I think constraints are super important.
Like, I would say, I would put this as a big, like, it's a best practice to use database constraints, all kinds of constraints, understanding the power of them, and also understanding overhead as well.
Yeah, I think it's good default, like if in doubt, add the constraint, and you might want to remove it in future if you get to a certain scale, but I like the idea of having it by default.
For example, we discussed in the same episode with Franck, we discussed maybe we should have not NULL constraint by default. Not that, unfortunately, it's not so usually in all systems. But we can agree, let's put it by default and lift it only if we know NULLs are going to be useful. So yeah, that's it, I guess 3 tips, enough, right?
And what about schema namespacing do you ever do you use it oh
it's interesting I I'm I'm shifting back and forth and in my career use namespaces schemas or not use them depends on the project here I am like with you it depends if it's
Also it doesn't matter that much, right?
Well, it does, because it's about convenience as well. If you think about others, they can be spending a lot of time trying to find your table. Again, consistency here is the key, because if you chose to start using namespaces, schemas, suddenly after a few years of development, and the majority of your tables are in the main... And it happens all the time. Like, I see it all the time. Yeah, sure.
The majority of the tables are in public schema, but suddenly you start using some additional schemas and hide tables there basically.
It's maintainability as well, for example you can control permissions better and it also sometimes namespaces are serving some trick like you can have multiple tables with the same name sitting in different schemas and but this can go this can lead to some messy state and problems yeah so but I would say this is part of conventional agreements you how we use schemas so yeah sometimes people choose you know like choose this is good topic by the way for maybe 1 day we should discuss it.
Multi-tenant applications and architectures and how to create multi-tenancy with good properties in database. I just recently learned that if you go to Timescale cloud, you cannot run create database and you cannot have multiple logical databases in a single server. So in single Postgres, usually we are allowed to have multiple logical databases. And their recipe for multi-tenancy, they have an article about this, like, let's use schemas and just that's it.
But for me, schema is a very weak wall between data. Logical database is a big wall between data because you cannot read from another database being connected to 1 database. You cannot read from another database not using something like postgres_fdw or dblink. Because this is how Postgres works. Database name is a part of connection properties, the core connection properties. Unlike schema, schema is kind of... It's something...
Namespace.
Well, yeah. It's namespace, and you can revoke rights to use some schema, it's good, but still it's inside the same database. So if you messed up with permissions, you can end up having security issues. If you want strict segregation of data, separation of data for your tenants. Okay, This is a different topic. I think quite good, like we have at least let's move on to the performance topic, the second topic, right?
So, they talk about indexing and they talk about query optimization. I think, right, we've in the past talked about a couple of things, like micro optimizations and macro optimizations. And I think we could maybe, like, think in those terms instead. I do think it's obviously performance is a lot about indexing. And about individual query optimization. But I do think there's some best practices around macro optimization. Like in Postgres, by default, we don't have pg_stat_statements turned on.
That feels to me like a pretty much a best practice that we should probably have pg_stat_statements on or some or an extension for doing that analysis that's on the on the wider system level. Maybe some logging like I think there's some performance logging of slow queries.
Just choose just 1 tip right now. It's already too many.
Oh, A, best practice. All right. Yeah, best practice. I would say, well, I like their first 1, I guess. Use indexes.
Yeah, use indexes is obviously, but yeah, I think my take would be don't leave autovacuum untuned and don't postpone bloat control. This is how Postgres works right now, right? So we must be aware of what bloat is, what dead tuples are, and distinguish them actually. And...
They cover that in maintenance. Yeah.
Yeah. And this is what nobody do it well among managed Postgres platforms right now. Nobody. I mean, they don't give you good tools. Some of them have basic autovacuum tuning, right? And that's it. Basic autovacuum tuning. And that's it. Some of them provide pg_repack, and also have it indexed concurrently, but that's it. Obviously, I see future where more platforms will provide more, better tools to help you in this area.
But so, knowing that right now it's on your shoulders when you build some system, this should be not postponed, right? Index is obvious, like everyone knows index. Let's create indexes.
But 2 important things that aren't mentioned here are do it with concurrently and don't index everything. We've done a whole episode on over-indexing, but I feel like a whole beginner category of issues are caused by folks creating indexes everywhere
yeah I agree and well there's There are dozens of tips we can put there around indexing.
Yeah, you said only a couple, right?
Yeah, only a couple, but you already named more than 5. Pre-registered statements. I agree. While we are all sitting here and waiting until hackers decide to move this extension to core, we need to keep this tip active. And register statements must be installed in any database. Understanding some very edge case or maybe corner case overhead we also talked about in the past. Yeah. Good. Yeah, okay, good. And that's it maybe, right? Well, performance is a huge topic. We love it.
Yeah, we had many episodes about performance. But again, autovacuum tuning, learn indexes, many best practices. And the predecessor sentence must be installed. Agree on these 3 maybe, right? Good. Good. Good. So next field, again, the list is far from being complete. It's just like the tip of the iceberg. So next security, suddenly. I would say make sure you don't have Postgres open
to the world. That's the main mistake I see people making. Yeah
What else
but but obviously if you if you're using certain like managed services, that's on My default in a lot of cases so then it becomes well of having good
Right, they open it to public, but protected, maybe, right?
Yes, but...
But protected how?
Then it becomes about good password policies and things.
Right, right, right. But this started with Heroku, I think. They also have Postgres open to public and security by obscurity when hostname is a very large hash. Yeah. And username also. Everything is like large hash, but it's weak security. But the password is strong, and if it's not MD5, but Scram, it's something better already. But I agree, if you're a serious project, this should not be the case. A port should not be available to public.
I think it should be only like local IP address and available only internally in networks VP Yeah, so Good.
I'm saying Are you saying it depends?
I'm saying that I'm not against like I can understand people who have well platforms who have it open to public for because sometimes we want to start very quickly. And at the same time, I would not allow this in a serious project with data of customers, PII, and so on. I would not allow this at all, to let this port to be open, even if the password is absolutely difficult to find.
For example, we have 2 factors of authentication everywhere, but here we have just 1 password well what about if it's leaked right
yeah this is this is honestly the difficult thing about security best practices is because Best practices with security means do everything you possibly can to secure everything. And that's just often not practical. Like at what cost? At what, you know, what are you trading off? Often you do trade off things for better security, like as you were saying, like quickness of getting started or developer experience, like ease of use.
So there's like a, I do think it's a tricky topic to give best practices on because you don't ever want to say do something that's slightly less secure because it has these other benefits because it's it's not a it's not a best practice anymore.
Yeah it also makes me think we could we could have bad pieces of advice episode you know we could have it. Yeah we did like we
did go through there's a wiki.
Don't do this. My next advice, next tip for security is divide and conquer. So and unfortunately, many teams don't follow this. So use different users for different parts of workload. And for example, humans should have different usernames in database. So we can distinguish them quickly and probably limit differently and so on and parts of application also it's it's wise to to avoid using the single user name for everything yeah
I really like
yeah I
really like that about humans as well. I mean, we don't like to think about it, but if you were to need to remove somebody from a project, it's much easier to do that at the user level if they've got their own user than it is to roll out and, you know, to think about what did they have access to, which passwords did they know that were team-wide or something like that. So yeah.
I think I've heard of a similar view, in fact I think this article mentions it, but there's a general concept of principle of least privilege.
Yeah, yeah, yeah. Well, this is super important principle, of course. And it just, it should be a habit. It's hard to enforce it. It should be a habit of all the team members to follow this principle.
But it is also painful, right? Like, it is painful because it means anytime you add anything, you then need to add permissions to it, like from all of the things. I get that it's good for security.
Default privileges and Postgres, the concept of default privileges, sometimes you can define them as well.
Is that the least?
Yeah. Yeah, well, yeah. Another, maybe, let's do the last tip. I'm grabbing the ideas from article, but we're trying to unwrap them here. Password rotation.
Oh, I hate it. I've written yuck about that.
Well nobody does it actually. Well, some people do it but honestly, honestly, it's first of all it's a pain.
For me it's a relic of when people used to use the same passwords across multiple services. So I think the risk, when I say used to, obviously a lot of people still do. You think
it's a bad advice?
My advice would be make sure all your users are using Some password generator or password manager so that they have a unique password per service because then the risk of their password getting leaked by a different service that then they can use as an attack vector on your service is 0. So it's only if you leak the password that it becomes an attack vector. Which, you know, do you see where I'm coming from?
Yeah, yeah, yeah, I understand that. Also, like, yeah, I agree. I think actually Postgres could have improvements in this area. For example, you could say, if it would be possible to define password policies and so on, like, for example, to forbid short passwords, simple passwords, and Postgres would ensure it's not happening. Or also PgBouncer, I don't know, like Postgres first of all. And maybe password retention could be 1 of the policies if people choose to use it. Right, I don't know.
Like, I don't believe in the rules here established and followed at all. It's like I just see companies have this principle and nobody follows this principle and attempts are like hard. It should be Postgres who enforces this rule. To like avoid weak passwords, for example.
Yeah, right.
Yeah. Okay. Let's... Well, the article also mentions RLS, but I think it should be very connected to performance and to be used with big caution. So it's a controversial topic to just put it, oh, just let's use RLS. Well, yeah, let's check performance additionally. OK, next, backup and recovery. If you're on a managed service, well, supposedly it should work good. But it depends on the service, on the provider. Not everyone has good capabilities.
I would say also a best practice would be to have a version like to take I know we've talked about the difference between a backup and a logical backup or a dump, but if you're on a managed service I would still recommend having a copy on a period that you're comfortable with off the service. Yes,
Automatically created. So diversification of backup locations and some platform can go down or your account can be stolen. And you, you must have the backups, maybe not super frequently, but some backups additionally in secondary provider or location, like isolated. It's similar to having backups on the same Server where you work, but just a different layer. Same principle, different layer. If all backups are in the hands of RDS, well, and single account, well, it's not good.
Yeah. And unfortunately, if you want, if you are on RDS and you want to follow this principle, you need to deal with dumps. Logical backups.
And it's not unique to them, there are quite a lot of service providers that that's the only option. Yeah, yeah.
I agree. What else about backup and recovery? Testing sometimes, right?
Yeah, I guess that is the other big 1, isn't it? It's not a backup unless you know it could be restored.
Yeah, shorting the backup.
So some amount of testing that restores.
And I don't think platforms test all your backups. They test only some of them and who knows, right? So I think it's a good idea to test even if you trust them and don't hear bad stories, still have some testing.
Yeah, what's the age-old saying, like, Trust but verify, right? Trust them to do it, but verify it.
Final responsibility is yours. If you lose everything, well... Yeah, so what else? Anything else in this area? Backup and recovery? Understand RPO, RTO?
Yeah, I think discuss that. RPO, RTO, I think it's about discussions internally. Like have those discussions. I think RPO and RTO are really good tools and good things to define, but more importantly, they're good to enforce. You've even had the conversation with, you know, senior management or whoever it is, maybe co-founders. What are our trade-offs here? Like, are we willing to pay this much more for, like, a reduced amount?
Or are we willing to lose a tiny bit of data to bring our costs right down? I think those are difficult discussions to have, and not everybody has the same... Not everybody would make the same trade-offs there in different situations.
Let's move on and next, it was number 4, field number 4. I would jump to field number 7 here, high availability, because in my opinion backups and like DR, basically, that disaster recovery and HA, they should consider together as a foundation of good database system platform or something. So HA, right? HA, it's interesting. If My number 1 advice is avoid systems which are not HA systems but split-brain systems, like replication manager, avoid it. Just avoid it. And I don't know...
Well, if you're going to say a best practice, maybe we'll just say Patroni? Or what would you say?
Yeah, just Patroni. But you know, right now, CloudNativePG is gaining popularity, and it's not using Patroni. So I wrote on Twitter, anyone experienced already split brains with CloudNativePG? I'm just wondering, I don't know. So Patroni is definitely passed a huge path to achieve very good quality in terms of avoidance of split brains and maintaining very low downtime in case of primary disappears and so on. Yeah, this is number 1. Like I would just if possible choose Patroni.
This is my advice. What else?
Or managed service provider. I think that this is where the big, well all the cloud providers that offer this. I'm so glad they handle this for me.
In this case, like I can speak about this openly. Crunchy Bridge, please don't do backups from primary, physical backups. They still do. You need to fix this. If you follow us, like right now, full backup, if it's not snapshot, even if it's not snapshot. So it should be on from some replica, some physical Standby, because it's very I/O intensive. Even cloud snapshots can be I/O intensive. And of course, wall archiving should be on the primary to minimize the lag and minimize the RPO.
Improve RPO, right? But full backup, if you do it from primary, it's huge stress for the primary, and if it's heavily loaded project, it suffers. What we recently observed, it was not good. So advice to backup, if it's under your control, organize backups. So full backup or Delta backup or snapshot is not done, not on the primary. It's possible. pgBackRest backups, pg_basebackup, they support non-exclusive backups, right? So we can do it on replicas. On the...
Makes sense. And if we've got HA in place, the replica's not doing anything, right? It's not a read replica.
Well, yeah, it depends here, but I agree. It's good to do backup from replica, which won't be primary. So if we talk in terminology of Patroni, it should have no failover tag. Probably also no load balance tag, but definitely no failover tag. It's Patroni terminology. You can put the tag on the replica to say no failover, so if failover needs to happen, Patroni won't choose this replica.
Got it. Okay. I was actually thinking you did want it to be that 1, but yeah, sure.
And then you do backup from it. Well, there's another topic. If you do backups, only full backups, only from single node, what happens if it has some local corruption and other nodes are fine? So all your backups are corrupted, but it's an interesting, already deeper topic. Let's talk about HA, we talked about Patroni, what else? Lag control, right?
Lag control. Monitor lag, yeah.
Well, also not monitor, monitor, but also it can be tuned. Oh, sure, yeah. To have a very low lag, and also when you do load balancing it can be very smart logic. Many people already implemented it or else implemented it in core already. So it can be like if some write happened, this session deals with primary for some time. It can be even smarter, right, comparing other sense and so on.
So you have some, it's not strictly about HA, but it's about load balancing mostly and how we use our replicas. It's like a dressing topic. What else about HA? We don't have a lot of time. Unfortunately,
we haven't. We have a whole episode on it. Let's link to that.
Yeah, yeah, yeah, yeah. What else in this area? Any other tips? Test failovers, right? Or switchovers?
Yeah, true.
Yeah. I remember I chased RDS guys asking, you have this checkbox HA, which makes me pay double price for primary, but how can I trust you? I mean, I trust you in words, but in action, how can I test it? And they implemented it, you can test failover, which is great. So simulate primary failure and see how failover happens, how your application experiences it, for example. Important, another tip of advice here is to write applications so it has retry logic, not to lose writes.
Okay, let's move on, because this is like huge areas. Each of them are huge areas. What's left? Maintenance and monitoring. Wow, huge area. Development practices. And that's it. Only 2 areas. And I think it's not possible to pack it into a few minutes. Let's only maybe use 1 or 2 tips everywhere. For maintenance and monitoring, we already had a tip to always install pg_stat_statements. What else? Your choice.
Well, a big mistake I see people doing sometimes is turning off autovacuum. And this, the author of this doesn't seem aware of autovacuum so that that would be my main tip is keep autovacuum on and tune it I know you
mentioned my tips are 2 of them like bloat control and index maintenance and we had we had several episodes on these topics And just these must be implemented in any project which aims to grow even not to huge numbers, but some moderate numbers. Okay, and finally, development practices, and then your secret area as well. Development practices, what do you think? Version control, code organization, well?
Yeah, I mean, version control has been a best practice for years, but I still think it's harder than it should be. I know we did a whole episode on this. But some basic, however you choose to do it, some way of tracking what changes have you made to your database, some form of version control is so much better than not having anything.
Yeah, well, my advice is straightforward. Based on what we do last 5 years, use proper testing during development and before deployment. And this testing should involve a lot of data. So thin clones or database branching are great.
Or a staging server that has like just some not has some reasonable amount of data.
Well, if you're alone, yes. And also, even if you're alone, if you work on 2 tasks separately, it's still like single staging is not enough. You need to have, And what if you started doing something and made everything wrong and data is changed? How to restart? So database branching is the key.
Or best practice.
Yeah, so I recommend in development practices, grow your level of development methodologies, so database branching starts to be involved into all development and testing activities. It will be level up, huge level up. So this is what we do at Postgres AI. And okay, what's your secret topic, secret area?
Not secret, we've discussed it a bunch while we're talking, but I couldn't believe that config tuning wasn't in, Like, no mention of changing shared buffers even. Like, I don't understand how you can talk about best practices for running Postgres and not talk about changing settings.
Again, if you are a RDS user, I can imagine, like, we talked about HA, but we talked about backups, and it's very... In this article, it's very development-focused point of view, like, developers point of view. Like, things already solved, and configuration partially also solved, but we... We all remember, like, Crunchy Bridge tuned a random page cost after our complaint.
They did, but I don't think RDS have yet or most of the others. So random page cost is still the default on most major providers.
So many things are still default. I'm happy I have work doing consulting.
We help many people still. I know that could be covered, but it really just wasn't mentioned. And then the other big 1, I'm sure there are others that have it's hard to spot like the absence of things right when you go through something but I couldn't believe I didn't see a single mention until the conclusion of upgrades of staying even in security we didn't mention staying up-to-date with the latest minor versions and in performance we didn't talk about like upgrading to major versions.
That feels to me like a best practice should be to stay up to date.
Yeah, yeah I agree. It's part of our healthy Postgres vision. Definitely minor upgrades, major upgrades for sure. It's part of our health check we do.
Everything that's in your health check in my opinion should be your best bet. Yeah. If that makes sense.
And so also still I always have a feeling that it's not enough, we need to extend all the time. Some things go away, like upgrades, we will have, in 2018 they will have statistics, so this will, this tip will go away, but so many tips are still coming. Okay I think it's great like maybe you're not very well structured but I hope some some people heard some good pieces of advice today. Let us know in the comments please or on Twitter or LinkedIn somewhere.
Good. And we have document actually, maybe it's a good time to remind that we have a document with ideas and also if something is missing and we could dive into some topic.
Yeah, you mean, so we've got a document that lets people request or suggest episode ideas.
Right.
Yeah, topic suggestions.
Yeah. Good.
Nice.
Good.
Thanks, Nikolay. Catch you next week.
Thank you. Bye bye.
