The problem with these setups is that 'There Are Many Ways To Do It'(tm) and 'You Really Need To Test For Your Use-Case'(tm). I need to read and understand everything to decide what's best in my case, and then write a lot of scripts, do a lot of time-consuming testing, and document everything.
Scalable, Reliable PostgreSQL is not really there yet.
There are many ways to do it, yes, but most people just want one thing: the db failing over in case it goes down.
Scalable, reliable Postgres is absolutely possible. This comment is unfounded.
If you truly need reliability, of course you'll have to invest some time into setting it up. If this isn't something you're interested in, just go with a hosted postgres instance like Heroku.
As someone who's deployed a streaming replication master/slave pair with a backup regime designed to achieve a point-in-time recovery target of no more than one minute's data loss in the event of a complete failure requiring restoration from backups...
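For what it's worth, the WAL-archiving side of that kind of setup is mostly a handful of postgresql.conf settings. A minimal sketch, assuming a local /backup/wal archive directory (the paths are purely illustrative):

    # postgresql.conf on the primary -- minimal sketch, paths are placeholders
    wal_level = replica        # 'hot_standby' on pre-9.6 releases
    archive_mode = on
    archive_timeout = 60       # force a WAL segment switch at least every 60s
    archive_command = 'test ! -f /backup/wal/%f && cp %p /backup/wal/%f'
    max_wal_senders = 3        # let standbys / pg_basebackup stream WAL

The archive_timeout setting is what bounds the loss window: it forces a segment switch at least once a minute, so at worst roughly a minute of committed work is still sitting in an unarchived segment when the primary dies (assuming archive_command is keeping up).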
Everything; it all needs to be simplified. Yes, it's the "difficult" problem for a database, but it's also the one thing where I feel Postgres is lacking. What's needed is for the Postgres community to quit messing around and accept that clustering is no longer something they can fob off to other projects. They need to adopt either Postgres-XC, or preferably Postgres-XL, and push it forward with full force.
The current state of affairs is "pgpool, Slony, repmgr, Bucardo, and the rest: pick one, spend hours testing how well they fit your use case, then use that." That's fine when a project is new, but Postgres is about 20 years old. Clustering is not new, many of these solutions are old, and we finally have a very good clustering option, yet it wasn't even mentioned on half the official wiki pages about clustering last time I looked.

Postgres is a professional-grade tool, so it's fine not to endorse things that aren't considered ready yet. But Postgres-XL is in use in production deployments, and it's built out of Postgres itself, not via proxies or other layering over or inside vanilla Postgres. It's not a hack, and endorsing it as the definitive future of clustered Postgres, without necessarily endorsing it as "as good as regular Postgres", is long overdue. At least that's my opinion as a serious production Postgres user who, more often than not, decides to use other databases due to the lack of good clustering.
When you bring them up in matched pairs, ensuring that the data nodes are using streaming replication is a lot easier ... AND worth the trouble for the ability to do whole cluster "upgrades" without taking the cluster itself offline.
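To make the streaming-replication part concrete, seeding a standby is close to a one-liner with pg_basebackup. A sketch, where the hostname, role name, and data directory are made up for illustration:

    # run on the new standby: clone the primary's data directory and have
    # pg_basebackup write the settings needed to start up as a streaming replica
    pg_basebackup -h primary.example.com -U replicator \
        -D /var/lib/postgresql/data -X stream -R
    # -X stream also streams the WAL generated during the backup;
    # -R writes primary_conninfo into recovery.conf (or postgresql.auto.conf
    # plus standby.signal on Postgres 12 and later)

In a matched-pair setup, the rolling "upgrade" then roughly amounts to upgrading the standby side, failing over to it, and reseeding the old node the same way.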
Yes. And so good that the tools I use to deal with PITR are actually just wrappers over those 2 commands.
But how you orchestrate the recovery process, test it, and centralise your backup files for faster tar/upload to offsite storage: all of these things are "not simple" when all you have is the Postgres docs.
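As a rough illustration of what that orchestration has to produce in the end, a point-in-time restore is essentially: lay down the latest base backup, point restore_command at the WAL archive, and pick a recovery target. A sketch with placeholder path and timestamp (these settings live in recovery.conf on older releases, or in postgresql.conf plus a recovery.signal file on Postgres 12+):

    # after restoring the most recent base backup into the data directory:
    restore_command = 'cp /backup/wal/%f %p'        # pull archived WAL back
    recovery_target_time = '2015-06-01 12:00:00'    # placeholder: replay up to this point

Wrapping that in scripts, keeping the archive tidy, and actually rehearsing the restore is the part the docs leave to you.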
Postgres is more than just a database/database engine. It's a database ecosystem.