Hello and welcome to Postgres.FM, a weekly show about all things PostgreSQL. I am Michael, founder of pgMustard, and I'm joined as usual by Nikolay from Postgres.AI. Hey Nikolay.
Hi Michael.
And today we are delighted to be joined by 2 excellent guests who have each contributed a lot to Postgres over many years now and who both recently published blog posts on the topic we're going to be discussing. Let me introduce you both quickly. First we have Gülçin Yıldırım Jelínek, who co-founded the Prague PostgreSQL Meetup and is a staff engineer at Xata. Welcome Gülçin.
Gülçin: Hello, thank you for having me.
We're delighted to.
And we're also honoured to be joined by Robert Haas, long-serving PostgreSQL major contributor and committer and VP Chief Architect Database Service at EDB. Welcome, Robert.
Hello, thank you for having me.
It's our pleasure as well. So to kick us off, I've prepared a couple of questions to ask each of you in turn, but I'd also like to encourage you to ask each other questions as we go along. Perhaps we can start with you, Gülçin. What are your high-level thoughts on the topic of "is pg_dump a backup tool", and why is it something you wanted to write about recently?
Gülçin: It is funny because I didn't actually want to write about pg_dump.
I had just joined my current employer Xata and it was my first week. And then I noticed something in the Discord channel that we have. Somebody was having an issue with pg_dump. I was like, oh, what's happening? And I saw some parameter in the error message that I didn't recognize, restrict_nonsystem_relation_kind. I was like, I don't know this configuration option. And then I noticed it had actually been introduced recently at that time. And I was like, oh, okay, why?
And then I checked it and it is related to a CVE. I remember the number 2024-734. It doesn't matter, but there's a blog post about it, so you can find it with this number. And in there, it explains what this vulnerability is and how people can use it to potentially compromise your database, because it affects pg_dump. So people can actually create a non-temporary object in the database.
And then just before pg_dump begins, they replace this object with a different thing, like a view or a foreign table, so people can inject SQL there. And then when pg_dump attempts to do the backup, it can run the injected SQL code. So why are we there? Because it affects it. And then I said, hey, this affects Postgres 12 to 16, upgrade the Postgres versions and test if your pg_dump scripts are working, review the user permissions, the standard recommendations when this kind of thing happens.
And then when we were sharing this blog post on Twitter, I think our marketing team wrote something like, okay, pg_dump, a tool to back up Postgres databases. It was the definition of that tool, basically. And then I read through the replies, and I noticed, oh, people are saying the usual thing you hear whenever you say something about pg_dump: it is not a backup tool. And I was like, okay. And then basically it kept going. So I had to write another blog post to say, is it really, or is it not?
Who first said this?
Gülçin: I don't know. I didn't know this because there were so many. And I know that because in the Postgres community, whenever the pg_dump topic opens up, somebody will say, you know, pg_dump is not a backup tool. But then actually a few days before this discussion happened, Peter Eisentraut committed a change, which will be in effect in PG18, that tries to remove the backup terminology and converts it to export, so that people are not considering it as a backup tool in a way.
So this, I think, made people more vocal, saying that, look, this was how it was before, but not anymore, and you should not say it. And then I felt like I should write something more about it to explain why we can or cannot consider pg_dump a backup tool. And in my opinion, it is a tool that can be used to back up a database. And it is a logical way of doing a backup. You can call it a dump.
Maybe you can define it. You know, Nikolay was saying, is it a backup tool, yes or no, or define backup. So it can be a backup. In my case, when I was working as a DBA for a long time, I was using it to back up databases. It basically depends on the context of how you use it and the nuances of how you can actually utilize this tool. So yeah, that's where I stand today. I don't agree that it is not a backup tool.
It can be a backup tool, but maybe later on in the discussion we can discuss what the drawbacks with it are and how regular backup tools that are outside of Postgres can help, like pgBackRest or something.
Can I jump straight away with a question?
Gülçin: Yeah, please.
Yeah, I also saw comments that maybe for very large databases, like many terabytes and more, it's not a good backup tool, but at least for small databases it's good, and it's also partial, you can export only 1 table. Imagine we have a tiny database, just like, I don't know, 100 rows, 1 table. And I SELECT * from this table in psql and just take a picture on my iPhone.
This is a backup, this picture. We can restore from it, right?
Gülçin: Well, maybe it's a snapshot, right? Yeah, why not?
Well, a dump is also a snapshot, right?
Gülçin: Yeah, and that's why I don't really see why it can't be called a backup.
Okay.
Gülçin: It is a moment and you can use that moment to do something with it.
I thought it was a really good blog post, Gülçin. I'll share it in the show notes as well for anybody that hasn't seen it. And speaking of good blog posts on the topic, I think Robert added a lot of good points as well. Both of your blog posts included a lot of technical details, like the technical aspects of why it could technically be considered a backup tool, but also the many drawbacks of it and why you might recommend using something else as a general purpose backup tool.
So Robert, how about yourself? How would you summarize your high-level thoughts on the topic and why it was something you wanted, or maybe you didn't want, to write about either?
Well, I think it just kind of got under my skin because, you know, Gülçin's blog post was not the first time that I've heard people using this "pg_dump is not a backup tool" line, and to me that kind of came across as shouting at people without necessarily giving, you know, a reasoning, right? The documentation said for literally 20 years that pg_dump could be used to make backups, or I don't remember exactly what the wording was.
When I looked into the history in Git, I actually found that the language that it's been changed to now with exporting the database is very similar to the original language that was used to describe pg_dump when that code was first added to PostgreSQL, but there was a 2 decade period in the middle when the documentation said, hey, you can use this to take backups. And from my point of view, it doesn't even matter whether that's true or whether you think that's true.
If the documentation said for 2 decades that X piece of software could be used to do Y, then nobody should get in trouble for saying that. Like, nobody should get called out for saying that. That just doesn't make any sense to me. Like, I mean, honestly, I think, you know, some of us, myself included, can be a little too eager to jump on people's case from time to time. And I don't think that's good for our community.
I think we wanna be the kind of community where when people show up we give them help, we give them good advice, and we don't come down on them like a ton of bricks. And Gülçin is not the only person I've seen who seemed to me to be kind of getting beaten up a little bit. And I was just like, why are we doing this? Like, clearly, pg_dump isn't right for every purpose. And there are lots of situations where it's probably not what you want.
But the tone is just baffling to me, because it seemed very hostile to me and I couldn't make any sense of why we should be that hostile about anything, but especially why we should be so hostile about that in particular.
Gülçin: And to that I actually have something to add, because after this discussion started to come up again, I was looking at the groups where people are actually using this "it's not a backup tool" rhetoric.
And I've seen a few users that are trying to get help from the Postgres communities that we have online, a lot of them. And I noted 2 of them for today. 1 of them is asking whether pg_dump can limit a backup by schema. I mean, they're using this sentence and somebody answers directly, and it's not related, but the answer is: pg_dump is not a backup. And then there's another user: can someone send me the command to take a backup of a partial database? Which pg_dump actually can do, right?
We can do the schema only, we can do just the data, whatever, or we can do a table, any type of object. And then the answer is like, there is no such command. The standard backup tools take backups of the entire database cluster. So basically, it doesn't consider it as a backup tool, even though there is a pg_dump command that can actually do what people are asking. So that's what I find very unhelpful, right?
We could just say to people, look, there is this pg_dump command with which you can actually take this table that you want to take, or do a selective restore, whatever you want to do with it, and help people in the direction that they're actually trying to go. Instead, they're just saying, there is no such command. It's not a backup tool anyway, because the standard backup tools take the entire database cluster. So that I don't find helpful, which is what Robert is saying. That is not helpful at all.
You might not agree that it's a good tool to use as a backup solution, and we can talk about where it could be improved or why people should actually prefer backup solutions. But this is still not helpful. There's a tool that we were all using for a long time and it can do all the things that these people were asking. So the nuance of the question matters, the context of it. And that's where I am, basically.
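For readers following along, here is a minimal sketch of the partial dumps described above (by schema, by table, schema only, data only). It only builds a pg_dump argument list and does not connect to anything. The helper name `build_pg_dump_args` and the database, schema and table names are hypothetical; the pg_dump options used (`--dbname`, `--schema`, `--table`, `--schema-only`, `--data-only`, `--file`) are standard ones.

```python
from typing import List, Optional


def build_pg_dump_args(
    dbname: str,
    *,
    schema: Optional[str] = None,
    table: Optional[str] = None,
    schema_only: bool = False,
    data_only: bool = False,
    output: Optional[str] = None,
) -> List[str]:
    """Build a pg_dump argument list for a full or partial logical dump.

    Hypothetical helper for illustration; it does not run pg_dump.
    """
    # pg_dump itself rejects --schema-only together with --data-only.
    if schema_only and data_only:
        raise ValueError("schema_only and data_only are mutually exclusive")

    args = ["pg_dump", f"--dbname={dbname}"]
    if schema is not None:
        args.append(f"--schema={schema}")   # limit the dump to one schema
    if table is not None:
        args.append(f"--table={table}")     # limit the dump to one table
    if schema_only:
        args.append("--schema-only")        # definitions, no rows
    if data_only:
        args.append("--data-only")          # rows, no definitions
    if output is not None:
        args.append(f"--file={output}")     # write to a file instead of stdout
    return args


if __name__ == "__main__":
    # "shop", "sales" and "public.orders" are made-up names.
    print(" ".join(build_pg_dump_args("shop", schema="sales")))
    print(" ".join(build_pg_dump_args("shop", table="public.orders", data_only=True)))
```

The resulting list can be passed to `subprocess.run` on a machine where pg_dump is installed and connection settings are configured.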
And you know, if somebody asks about how to use pg_dump, and you want to tell them, hey, here's how to do that thing with pg_dump, but maybe you want to consider some other alternatives instead, cool, I've got no problem with that. That can be helpful advice. But pretending that the thing that they're asking about doesn't exist when it does, I just don't understand that at all.
So yeah, I've definitely got some theories as to why people are behaving and speaking like that. But I do think...
1 of those people is right here. I can speak to him if you want.
I wanted to, yeah, I think you've got some really good language around this, Nikolay, around logical backups and physical backups, that really helps clarify. And I think if people used that language in those sentences it would immediately help with clarity and also with limitations. But I'd love to hear your high-level thoughts on the topic as well, and yeah, why it's something you say.
So this statement, that pg_dump is not a backup tool, is a reaction to the statement the documentation had for 20 years. And we saw so many disastrous situations in many companies who tried to rely on this as a backup tool while growing. So actually it was not my statement, right? I just picked it up, right? I think Franck Pachot also mentioned it. I'm not sure he was the first who reacted to Gülçin's article.
But I joined as well, and I'm sure that in many places, not only Discord or Slack or IRC, anywhere, many people are picking up this motto because it's painful to observe how many companies relied on dumps as backups, right? If we call dumps backups, okay, we can do that, but there are limitations. There is big power in this, not only partial dumps, where you can take specific tables. These days we have many managed Postgres offerings and they don't share backups with us.
If you want multi-cloud backup, you must use pg_dump. You cannot get data or copy or something. You cannot get physical backups out of RDS, for example, right? But this pain, observed for a couple of decades, caused me to join this movement saying that pg_dump is not a backup tool. At the same time, like I told Michael, there is a kind of professional shift in my mind here. Because when somebody says backups, I envision only physical backups.
Although there are logical backups, of course. And again, this is not my idea to introduce this language. I checked it in Oracle and MySQL documentation. I think maybe it's a good idea to borrow this concept and mention specifically SQL. There are 2 kinds of backups, physical and logical. They both have pros and cons. For example, logical backup: if you rely on pg_dump as a backup tool, for example for partial dumps and for escaping from RDS, those are good pros, right?
Speaking of cons, it's always a kind of snapshot. It puts pressure on your database in terms of the xmin horizon, affecting autovacuum behavior, which is unacceptable if you have 10-plus terabytes and heavy load. Also, one day, some bug or corruption might happen, and you simply cannot read your data at the logical level, while physical backups are not affected. They just copy files, right? And there are many pros and cons to compare, right?
And I like the idea of splitting the language between logical and physical. And for me personally, when somebody says backups without specification, I still see physical only by default. Right.
Gülçin: If we are considering corruption with logical backups, the corruption can also be at the physical level. Right.
I'm okay with that, but I have a backup, I can restore and deal with it, right?
Gülçin: Well, then actually you can maybe carry this corruption along in your physical backups if it's gone unnoticed. And then if you had the logical backup on top of it, maybe, you know, it could be another tool to fight this physical corruption that you have.
Yeah, what I'm trying to say is, if I have physical backups with corruption, I will deal with it and so on. But if I have corruption which prevents pg_dump from reading data, it will just fail and I don't have anything.
Yeah, so I'd just like to make a couple of comments here. I think 1 of the things that I find really interesting is that people who work for different companies that all support and use Postgres can have very different experiences of some of this stuff. And I've seen that before with other issues and I'm seeing it here too. Because my typical experience with pg_dump is not the 1 that you were describing at all.
In fact, since I've worked at EDB, which is the whole of my professional Postgres career, I've never had that situation happen. Like not once have I run into a situation where a customer should have been using something other than pg_dump and they were just using pg_dump and then they got into trouble.
What happens to me rather frequently is that someone has used some other kind of backup and things have gone really badly wrong for some reason and pg_dump becomes the way that we can help that customer to get out from under that problem. So just as your experience with the customers that you've worked with is informing the way that you view the issue, I have a different set of experiences, a very different set of experiences from what it sounds like.
And so this thing that to you feels like, ah, this is the catastrophe. We've got to steer clear of this. In my experience, that's never the problem. It's always the thing we reach for to get out from under the problem. And I really just want to highlight that because I'm not saying your experience isn't valid and I hope you'll return the same courtesy.
Gülçin: I actually partially understand what Nikolay is trying to say here, because before EDB, before working with Robert, I was working for 2ndQuadrant and we were building our own backup solution, Barman.
And now EDB owns it. And I know, because I was actually doing remote DBA work and there were a lot of customers with backup issues. They had their own home-cooked scripts. In the wrong hands, this can go wrong, because there are some things about pg_dump and restore that you have to know. How do you do the dump process? How do you do the restore? Do you actually test these things?
Do you copy the whole directory or do you consider it as just some logs that we can actually delete at some point and so on? So if people don't know how to put these things together, it is not really helpful for them, and things can go wrong. And I've seen that things actually went wrong.
That's why we were steering people, you know, if you just do regular backups and restores, use this tool that we have or any other tool that can be used for backups and you can keep the retention period, you can keep your backups for X days, you can restore them and test and you can have continuous backups that edit. So it's not like partial, you know, it can be just like a continuous thing that you don't need to worry and you can do point in time recovery and so on and so on.
So I understand this rhetoric and I was an advocate of it, but then I also feel like it went too far, saying, you know, this is not usable, and that I oppose, basically.
Yeah. It's like a pendulum. I agree. Yeah. The start of this pendulum is these 20 years of documentation. So you raised a very good point about restore. When I hear backup, full-fledged backup, it's not only physical to me, it's also verified. And if we have a physical backup which we test, that's great.
While with dumps, I'm very curious, Robert, whether you didn't see an inability of pg_dump to read some, I don't know, some database which is corrupted, where we cannot get a dump out of it. But there's a second question here. Okay.
That actually happens all the time. And one of the things that I often end up helping people do is fixing the database enough that we can use pg_dump to get the data out of it. Because if the database has incurred a lot of damage at a physical level for some reason, we're never going to be able to repair that well enough to give confidence that everything is the way that it should be.
So a dump and restore in my professional opinion is absolutely essential in that situation to get back to a clean state. Now you are 100% correct that the dump may also fail or the restore may also fail, but those are problems that we can understand and fix. We can look and say, ah, well you have a pg_class entry, but you're missing a pg_index entry, so we need to create the one or delete the other.
That's a problem where we can say, ah-ha, that's something that we as Postgres experts can look into and understand what needs to be done to bring this back to a state where pg_dump is going to run. But the blocks being messed up at a physical level or out of sync with each other because we've had some time travel of some kind or something like that, Those are problems we won't be able to get out from under that ever. Does that make sense?
Yeah, it makes total sense. And moreover, it's a very popular approach to use pg_dump to test physical backups, to see that we can read everything except indexes. For indexes, we use amcheck, but to test physical backups, we run pg_dump to /dev/null, for example, just to see that there is no corruption, that we can read it for sure. And second, you mentioned restore. I remember a couple of times I saw a dump that could not be restored because of a unique key violation, right?
Because of corruption of a uniqueness constraint. Because some duplicates happened and the unique key didn't save us due to some bugs or something. Maybe somebody disabled something, I don't know. Or foreign keys, foreign keys as well. If you disable triggers, you can corrupt your data easily, right? You disable triggers, you load something, you enable triggers, and Postgres won't check it. And then with pg_dump you can have the dump, but you cannot restore from it. Right?
So yeah, we see some mutual points definitely here. And the question is just about language I guess. That's it.
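The "pg_dump to /dev/null" readability check mentioned above can be sketched as follows. This is an illustrative assumption of how such a check might be wired up, not a description of any particular team's tooling: the function names are hypothetical, and the check is meant to run against a server that was restored from the physical backup under test. As noted in the conversation, it exercises table data only; indexes need a separate check such as amcheck.

```python
import subprocess
from typing import Callable, List


def readability_check_args(dbname: str) -> List[str]:
    """Arguments for a dump whose output is discarded.

    pg_dump still has to read every table row, so a read failure
    caused by corruption shows up as a non-zero exit code.
    """
    return ["pg_dump", f"--dbname={dbname}", "--file=/dev/null"]


def database_is_readable(
    dbname: str,
    run: Callable[[List[str]], int] = subprocess.call,
) -> bool:
    """Return True if pg_dump could read the whole database.

    `run` is injectable so the logic can be exercised without a
    PostgreSQL installation; by default it calls the real pg_dump.
    """
    return run(readability_check_args(dbname)) == 0
```

A call such as `database_is_readable("restored_copy")`, where `restored_copy` is a made-up database name, would then be one step in a backup verification job.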
Well, I think it's also about experience. Nikolay, you mentioned some disasters. Am I right in understanding that this is folks who have come to you with some issue and they've only... It's not just that they're using pg_dump as a backup tool, it's their only form of backup. And what kind of issues is that causing?
Remember the first managed Postgres service created, the first popular one at least? It was called Heroku. I think it still exists, but it's not being actively developed these days. And they offered backups as dumps. You can download them. That's great actually. If a managed Postgres service provider allows you to download backups, that's great. But it was just backups. And nobody does this. I mean, nobody among the very popular managed Postgres providers does this.
They rely on physical backups these days, right? And also on snapshots and so on. I mean, cloud snapshots, full disk snapshots. And this also shows the evolution of the backup concept in many people's minds, not only ours. So I think it would be great just to agree on the language and discuss. I'm okay to be alone in thinking that backup is just physical backup. Backup could include both logical and physical, and we could clarify documentation and language, articles and so on.
And I see it's a pendulum, right? Again, this is my point. For too long the documentation was claiming this is a backup tool. This language was super harsh. And I remember that at least a couple of times in my life I was trying to explain this to some customers with growing Postgres databases, exceeding terabytes and approaching 10 terabytes. I'm saying, don't rely on pg_dump as a backup solution, and they just showed me the documentation, saying, this is what they say.
Vendor is saying this, right?
Yeah. I mean, I think that there is maybe a difference between something that creates a backup and a backup tool. I mean, this does get down a little bit to what you think words mean, so it almost seems like a silly thing to argue about, right? But I think, you know, you asked Gülçin at the beginning, like, if I take a snapshot of all of my data on a cell phone, is that a backup?
And I think the answer is obviously yes, but equally obviously, that's a silly way to do a backup because your restore procedure is going to be very unpleasant, which is not what you want. I think sometimes when people talk about a backup or a tool that can take a backup or a backup tool, sometimes they mean like, can I get a copy of my data from which I could recover? Right, and that's 1 question. And pg_dump will give you that, right?
The other question: sometimes people may have some particular commercial product in mind that offers a certain feature set, and their question is, am I going to get this feature set, where for example my retention times will be managed and my actual process of orchestrating the backup and orchestrating the recovery will be managed? And then the answer is no, pg_dump is not going to do that for you.
And you probably do want those things in most cases. So I don't know, like, I think there's a lot of nuance that's possible in the language here. But for me, the important thing is to make sure that we're clearly able to explain what the benefits and drawbacks of the different approaches are rather than, you know, spending too much time fighting about the specific language, which for me, it gets a little bit silly.
I agree. Yeah.
I agree as well, Robert. In your blog post, you make a really good case for the tone of the statement being difficult, and I think you actually use some language that waters it down a little bit or explains a little bit more. It doesn't take many extra words to do so. But I also wanted to ask, do you see this problem in other statements in the Postgres community? Are there other things people are saying that remind you of the tone of this kind of statement as well?
I don't have specific examples in mind off the top of my head, but definitely yes. I mean, it's a chronic problem on Hackers. You know, I think I wrote a blog post about the sort of tone of dialogue in the Postgres community towards the end of last year. And it's always a problem because when you post your patch on Postgres Hackers, you're essentially soliciting review. And people are rarely going to write you a review where they're like, you know what, this patch is amazing and I love it.
I mean, it happens. People actually do get those kinds of reviews, and it's a great day when you do. But generally, when you're reviewing a patch, you're picking something that you actually like and would like to see go forward. And then you're saying the worst things about it that you can think of to say. You're like, so here's all the problems.
Here's all of the stuff that I think needs to be better in order for this to become part of the product, which I hope it will, but these things are the things that I think need to be fixed first. And so what I see is that actually for a lot of committers, in particular, people's mental health is not in a great place.
You know, I kinda thought my mental health was not in a great place around some of this stuff, and then I talked to some other people and found that they were feeling worse about it than I was feeling by like significant margins. And it's, in my opinion, it's rarely because of bad intent. I mean, obviously people get frustrated. People say things that they shouldn't have said or they don't say it in the right way or they're pissed off.
I mean, those things happen and I don't wanna pretend like they don't. But I think very, very often it's a case of the nature of the workflow and the nature of the process and the kind of engineering that we're doing. It's difficult and it's error-prone, and even the absolute smartest people in the community make all kinds of mistakes, you know, over and over again, right? Like we were doing a re-wrap of a scheduled minor release that happened last week.
We're doing that this week because somebody committed a fix for a bug and the fix contained another bug. And it doesn't matter who made the mistake or who didn't catch the mistake, that's not relevant. It happens all the time. And I think it's really challenging to people because we work in a very open environment where everybody sees every email we write, every patch we commit, every patch we thought about committing.
You know, it's out there constantly and you just realize that there are so many ways for you to screw up and every time you make a mistake, everybody sees it. So I think it's a struggle for everybody. As far as I can tell, every single person who works on Hackers encounters this problem of getting the tone right all the time. And I am certainly not going to sit here and pretend like I get it right more often than average.
I think a lot of people would say I am below average in that way, but I can tell you I'm very aware of the problem and I am trying to figure out how to do it better because at the end of the day, it's not enough for us to deliver great software. We need to deliver great software while also creating a community that people want to participate in.
And that applies for me, first of all, to the developer community because that's where I spend most of my time, but I think it also applies much more broadly to the user community. And I think that is part of the reason this issue set me off a little bit, because, you know, it's the sort of thing that I'm struggling, often in vain, to do right on a daily basis. But instead of being targeted at other developers who at least kind of know that the negative feedback is coming,
some of this felt to me like it was targeted toward users who don't realize that they're about to get jumped on for, you know, wading into a flame war about whether something is or isn't something. And I just don't want anybody to have that experience. I certainly don't want users to have that experience.
I personally think that, only from having you articulate that, I've thought of 1 that annoys me a little bit, and that's the correction of people pronouncing or spelling Postgres wrongly, or missing the S off. It sometimes happens if people are new to the community, and immediately they get jumped on. I think, oh, come on, they're clearly new. So yeah, I can definitely see that.
It also happens a lot with people based on their language of origin. Like the fact that we pronounce it PostgreSQL, I believe that's at least 1 of the canonical pronunciations, that is much more natural for somebody who learned to speak English in the United States than it is for somebody who learned to speak English in, for example, India, right? Like it is English, but the way that English is spoken in India, it's a distinct dialect.
It has its own ways that people say things, ways that people communicate, characteristic patterns of speech. And that's not the only place, certainly. I think actually there are probably other countries where the problem is even more acute, because English isn't even used as a common language of communication in many parts of the world. But even when it is, it's not necessarily the same as your English, and people aren't necessarily going to be, you know, starting from the same point, right?
If I read a word that is unfamiliar and my wife reads the same word, we're likely to pronounce it the same way in most cases. But if a colleague from halfway around the world reads the same word, their instinct may not be the same as mine. And that's not necessarily a question of me being right and them being wrong. That's the question of we went to different schools. We were taught different things.
Gülçin: Yeah, I think it also points out to the wider problem in many communities, like the longevity of the projects will depend on people. And if you are hostile to people or like, because we all come from different parts of the world, I didn't learn English until I was like, you know, an older kid. And that is always a problem when I give a talk or when I write an email.
It is still in the back of my mind that I try to correct myself, I use multiple tools, I try to present myself as well as I can, but there are limits. I still confuse the prepositions, I use "in" and "at" all around, randomly. I could never fix this. And that doesn't mean that I can't contribute to the project, and I could and I do.
And that's what I believe about these little statements. Maybe we took it in a philosophical direction, it's not about pg_dump being a backup or not, or, you know, about how you say Postgres, but we should do better in how we handle communication, because this is the way that people interact today and report issues.
And if you don't accept the problems, well, people will not report it or they will not actually use this and report back what they use so that you don't actually get the feedback from people. And because you cut these channels that people actually try to communicate to you, instead of opening all these channels that we should actually amplify, we should have more channels for people to bring stuff that they interact with Postgres or ecosystem in general.
So that's where I was really impressed by Robert's blog about how open he was about this. And I appreciate the efforts that are going on towards this, because when I started, I also felt scared, almost, reading some of the emails. I was like, I wouldn't want this reaction to come to me, for example. So it shouldn't be like that.
And I think it's not just an issue of dialect either. You know, like that is definitely part of it.
But one thing that I've noticed on Hackers is that clarity and extreme precision of expression is very, very highly valued, right? Like, someone can come along with a worse idea, and because they explain it extremely clearly and precisely, either it gets accepted or they get feedback on how it should be changed, or positive comments. Welcome to the community. Hey, great to have you, right? Somebody else writes a worse email about a better idea, and it actually gets a worse response.
And I do understand some of why that happens, right? We value people whose style of expression is similar to our own, where we feel like we can freely and easily communicate with those people, and everybody's busy, so you don't wanna spend a huge amount of time trying to understand email A if you could very quickly and easily understand email B. But it's obviously super off-putting to people when you may have proposed something that was actually great, and if somebody had given you 5 minutes of their time, they could have understood exactly what you were trying to say, but they just flip through the email really fast. And then they moved on because they're busy. And that's obviously going to be demoralizing to people.
To play devil's advocate a little bit, I personally err on the side of being polite and trying to be kind and trying to be welcoming, but I also think that approach doesn't always land; people don't always take the lesson from it, or learn from the statement, or realise what I'm really trying to say, or maybe I'm not being clear enough, that kind of thing.
And I do think, for example, with the comment that we started with, I feel like there's a certain amount of trying to save people from themselves, or trying to shock people, deliberately trying to be provocative in order to make people think, "oh, we shouldn't only be relying on this tool for this purpose", or, you know, "maybe I should be rethinking my thoughts". You know, it doesn't apply to all of these cases, like mispronouncing the project name, but I've seen this specific comment come mostly from consultants, some experienced consultants, some who are very kind and also involved in, like, diversity initiatives.
I've definitely seen this, the exact phrase "pg_dump is not a backup tool", from people that you wouldn't necessarily expect to be direct and unkind. So I think that's coming from a place of having seen people shoot themselves in the foot and wanting to save people from that and wanting to be quite direct to avoid it.
So I don't know for sure, but I believe their intent is good, but maybe they're deliberately choosing to be provocative or direct. I'm not sure. Maybe I'm putting words in their mouths, basically. Gülçin: I think it's like, we are not calling out people for just saying, you know, this is not a backup tool, because we understand where they come from, because we are in the same industry, working for ages. We know these people, we all had the customer stories and so on.
But I think the general idea here is that when somebody shares a blog post... let's say, we all wrote about it. I wrote 2 blog posts and Robert wrote 2 more. And we just got together and are talking about it. Let's say he's pointing out why pg_dump is good at dependency management. We take it for granted, which I wanted to bring up in today's call, to just actually showcase that there are things we should appreciate in this tool, why he says it is an amazing tool.
Then, in response to this, somebody writes, like, "but it is not a backup tool". Then I don't get it, because that's not what the discussion is about. We are trying to discuss that there are ways you can make this tool part of your tool set. It's not the only tool. There are professional solutions for backing up your database for disaster recovery, as we mentioned, with the retention and the whole orchestration of the database backups and recovery.
But when we are discussing this tool specifically, which I feel is important here because there's nuance to be discussed, just shutting down the discussion by saying "but it's not a backup tool", this is where I feel that this needs to be improved, because then you don't really contribute to the discussion. You need to say why it is not, in this case; why don't you agree with this? Let's say, is it that the dependency management thing is not for you, or how could it be improved?
You could say that pg_dump could be improved because, let's say, we could run vacuum after it, or we can, I don't know, do statistics better or something. I mean, to contribute where pg_dump might be improved, because I've seen people in the discussions struggle with mapping, let's say, pg_dump options to pg_restore options, because they assume the order will be the same and they don't get it and so on.
So there are things where maybe we could get input from why people complain about these things, and improve. That's where I go for issues. I see these comments in the forums and I'm like, oh, OK, this is a good idea. Maybe I can actually talk about this. But then when we are discussing this and someone comes in with, like, "okay, this is not a backup tool", it kind of brings us back to zero and doesn't really improve anything.
But your second article was basically agreeing that it's not a backup tool. Gülçin: No, in the sense that people say, as I'm saying, as a solution, if you want to orchestrate your backups, use a, I don't know, a tool that is like, you know, Barman, pgBackRest or something. But then another discussion we have is: why can't it be? Why pg_dump?
We are discussing that because in Robert's second blog post, for example, he gives, like, you know, why it could be a nice tool for the use cases that he lists. And there we get the comment again, "I don't agree", basically. Like, okay, maybe use a better solution if you are managing production databases in multiple environments that are giant databases, and you really don't need to deal with, you know, something home-grown.
But still, historically, it is a tool that we use; you know, it could be used for different cases.
What I hear is, you're saying that people come to comment on your first blog post, and I think it was Franck Pachot, and I'm joining him, still joining. And he said, pg_dump is not a backup tool. You think it, like, shuts down some discussion and so on.
But I just explained that this is pain from a lot of experience and we are just reacting, and what I hear is you still try to judge him, right? Let's just... Gülçin: No, no, that's definitely not it. I'm just saying we discussed that, but the second blog post was about, you know, there are backup tools, you should use them. But then when Robert was describing a part of why pg_dump is good, in my opinion, those were very valuable points. And there it was not even relevant.
We were not even discussing, should you use this tool or not? And I'm not targeting anybody. I'm not targeting anybody. So, to be clear about it.
Yeah. So the change happened only now. It's in Postgres 18. And recently I had a discussion like this, claiming, oh, it's not a backup tool. Somebody said, oh, what is this about then? And sent me a link to the pg_dump documentation.
So I think I would not judge people who are saying pg_dump is not a backup tool until we have this change in the documentation and start recovering from this stress we had for 20 years. This is my point. I stand on this point very strongly. And common ground is: let's start distinguishing physical and logical backups. We can clarify this in the documentation, as Oracle and MySQL did.
And there is already a part of the documentation speaking of backups; it describes dumps, I mean, pg_dump, and then file system snapshots, and then point-in-time recovery, full-fledged backups. And if we just clarify the documentation, I will stop seeing customers sending me this link saying, you're wrong, this is documentation saying you are wrong.
But like, I think, you know, I don't know, like, if you can't win an argument against a documentation link, I don't know, it feels like something's not right there, you know, like, I'm not trying to be harsh. And I just feel like, you know, if somebody hires you to give them good advice, and you give them advice that is actually good, and their response is...
Robert, let me interrupt you. Sorry. I'm just like, I feel judgment in your and Gülçin's words. Like, you tell me now how you want to be welcoming, and now you judge me, like I cannot win. I cannot win 2 things. pg_dump is a backup tool. Sometimes I cannot. They say they trust documentation more because there are many more minds behind it. And also statement_timeout: documentation says you cannot set it to a positive value, keep it 0 globally, because globally it's a bad idea.
Already, some customers I win, some customers I don't. I'm not a genius, right? But in both of you, I feel judgment. Why don't we stop judging people and sentiment and so on? I bring you, like, an improvement. Let's say there are 2 types of backups, logical and physical. And then we develop language from there. And this joins us. Right? When you judge people, saying they came to me with this statement, or you say you cannot win an argument with your customer, this splits us.
And I start fighting with you. I don't want to fight with you.
But I mean, that's also my complaint about the language that you were using. So I don't know how to have this discussion without having opinions about whether certain language is good or bad. And I don't think, I mean, you can't, right? Like, we have to be able to talk about what the language does and to what extent it helps or hurts. And yeah, of course, there's some judgment there. I don't know, like, I definitely have been in the situation of having a customer who wouldn't listen, and the frustration that you feel with that situation feels very genuine to me. Like, I can totally imagine that happening and being a bad experience, but I don't know. I'm not even saying it's a bad thing that we changed the language in the documentation.
I was only reacting against the sort of, like, conclusory statement, "pg_dump is not a backup tool, and now I don't want to talk about it anymore". I think we should always be talking about it more. I think we should be trying to, as you say, bring clarity to it and bring precision to it.
I agree with you totally. Like, I hear you now. Well, and I think we will stop saying this, actually, if the documentation will be... it's already fixed; I think it can be fixed even better if we say it's a logical backup tool, for example. Everyone will be happy, I think, right? And we will stop saying it's not a backup tool. We will start saying it's not a physical backup tool, which is obvious, right? And this will join everything and so on, right?
I agree with your reaction, actually, which says this statement, "it's not a backup tool", is, like, too far from balance, right? It's off balance. I agree with this. So it's not a good statement, actually, I admit. But again, it's a reaction to another not-good statement which we had in the documentation, which didn't say logical backup; it said just backup. Okay.
We're pretty much out of time. Okay, I wanted to thank you all for your thoughts on this. I think it is a difficult subject, and I think actually it's really nice to have 3 people that all care about educating folks and teaching people how to do things well, with different opinions on how to do so, or, you know, slightly different approaches on how to do so. But as Nikolay says, and as Gülçin pointed out in her blog post, the language around this has been changed in the documentation.
Robert, keep fighting the good fight on the tone of things on Hackers. Are there any last words anybody else wants to add? Let's start with Gülçin: did you want to say anything else at the end? Gülçin: No, I'm happy that we are discussing it and I don't take things personally. I mean, we are here just to discuss technically why this could be useful in some cases and why not. And yeah, that was anyway the summary of what I said in the blog post as well.
So if people like to read it and comment on it, I'm happy to discuss more. Thanks.
Wonderful. Well, we're looking forward to your future blog posts, whether you want to write them or not. Robert, any last words from you?
I just think, you know, on Nikolay's comment about making the documentation better, what I would encourage, and of course this is much longer than we can actually do in this forum, is, you know, let's get down beyond the headline, right? Like saying in the headline that it is a backup tool or that it's not a backup tool, it's an export, it's a dump, it's a lot. We got to get beyond that subject line and think about what we say down deeper.
I think one of the areas where the Postgres documentation is sometimes weak is it doesn't always do a good job listing pros and cons. Pros and cons very often don't get listed for things. So, you know, that's probably an area where we could grow as a community.
Big time.
Brilliant. Well thank you so much everybody and thanks Nikolay, catch you next week.
Thank you. Thank you for coming. Gülçin: Thanks. Bye-bye. Robert: Ciao. Bye. Gülçin: Ciao.
