It is always easier to scale horizontally at the application layer (just adding...

Zanfa · on April 25, 2023

In my experience, more often than not, Postgres performance problems aren't really caused by the database itself, but by badly designed schemas or queries. For a lot of developers, the thinking goes that tens of millions of rows sounds a lot like big data, so they must start building microservices and distributed systems, use K/V stores, and scale horizontally a la Google, whereas their entire dataset could actually fit in the RAM of a single server, and they could avoid the majority of the complexity if they just scaled vertically, which is usually much, much easier.
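A quick back-of-the-envelope check makes the "fits in RAM" point concrete. This is only a sketch: the row count, average row size, overhead factor, and server RAM below are all hypothetical numbers picked for illustration, not measurements from any real system.

```python
def estimated_size_gib(row_count, avg_row_bytes, overhead_factor=1.5):
    """Rough dataset size in GiB.

    overhead_factor is an assumed fudge factor for indexes,
    row headers, and free space; it is a guess, not a Postgres constant.
    """
    return row_count * avg_row_bytes * overhead_factor / 2**30


# Hypothetical workload: 50 million rows at ~200 bytes each.
rows = 50_000_000
avg_row_bytes = 200
ram_gib = 64  # assumed RAM of a single mid-sized server

size = estimated_size_gib(rows, avg_row_bytes)
print(f"{size:.1f} GiB estimated, fits in {ram_gib} GiB of RAM: {size < ram_gib}")
```

Under these assumptions the estimate comes out to roughly 14 GiB, well within a single machine.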

Merad · on April 25, 2023

I think many people underestimate the capabilities of SQL databases by a couple of orders of magnitude. I once worked on a feature that integrated tightly with a third-party service. Their API didn't have any support for aggregate queries, and my company was small, without a real BI or data team, so I ended up writing a tool to dump our account data into a local Postgres db in order to do some data analysis. By the time I left the company that db was approaching 50 GB, the table holding the primary data had about 40 million rows, and a couple of the supporting tables were over 100 million rows. This was all on a 2018-era Dell dev laptop - a fairly powerful machine (6 cores/12 threads, 32 GB RAM, SSD), but certainly no server. It took about 90 seconds to refresh the materialized views that summarized the data I looked at most frequently. That was more than acceptable for my use case, and there was a lot of room for improvement in that schema (it was pretty much a straight dump of the API data).

smcleod · on April 25, 2023

Came here to say exactly this. Over the last ~12 years working with PostgreSQL I've dealt with quite a few performance-related issues - almost all were caused by poorly written queries.

bakuninsbart · on April 25, 2023

Can you point to some good resources on how to write better Postgres queries? Or give examples of common pitfalls?

bbojan · on April 25, 2023

Start with EXPLAIN ANALYZE, then work from there. There are tools where you paste its output and they show you the plan in an easier-to-understand format.
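For anyone who hasn't used it, here is a small sketch of what "start with EXPLAIN ANALYZE" looks like in practice. The helper name `explain_statement` and the `orders` table are made up for illustration; the ANALYZE, BUFFERS, and FORMAT options are standard PostgreSQL EXPLAIN options. Keep in mind that ANALYZE actually executes the statement, so wrap anything that modifies data in a transaction you roll back.

```python
def explain_statement(query, analyze=True, buffers=True, fmt="JSON"):
    """Wrap a SQL query in a PostgreSQL EXPLAIN statement.

    Hypothetical helper for illustration only. It just builds the SQL text;
    run the result with whatever client you already use (psql, a driver, ...).
    """
    options = []
    if analyze:
        options.append("ANALYZE")  # really runs the query and reports timings
        if buffers:
            # Kept conservative: BUFFERS is only requested alongside ANALYZE.
            options.append("BUFFERS")
    options.append(f"FORMAT {fmt}")
    body = query.strip().rstrip(";")
    return f"EXPLAIN ({', '.join(options)}) {body}"


# 'orders' and 'customer_id' are assumed names, not from any real schema.
stmt = explain_statement("SELECT * FROM orders WHERE customer_id = 42;")
print(stmt)
```

The printed statement is what you would run against the database; the plan it returns is the output you paste into a visualizer.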

vberg · on April 25, 2023

I am using https://explain.dalibo.com/ for that exact purpose, and it does a great job of highlighting the performance issues.

Zanfa · on April 25, 2023

I don't have any good recommendations other than learning how to read and use EXPLAIN [1]. It should typically be the first tool you reach for when you have a slow query.

[1] https://www.postgresql.org/docs/current/sql-explain.html

Ialdaboth · on April 25, 2023

An old classic, but too many indices can be harmful too: each one has to be maintained on every write.

branko_d · on April 26, 2023

https://use-the-index-luke.com/

lordnacho · on April 25, 2023

I would guess the most common pitfall is either not having indices or having the wrong kind of index for your query.
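To illustrate the missing-index pitfall, here is a self-contained sketch. Note the substitution: it uses SQLite (via Python's built-in sqlite3 module) and its EXPLAIN QUERY PLAN instead of Postgres, purely so it runs without a database server. The table and index names are invented. The same before/after comparison in Postgres would be done with EXPLAIN ANALYZE, and the plan output there looks different.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT total FROM orders WHERE customer_id = 42"

before = plan(query)  # no index on customer_id yet: full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)   # the planner can now search via the new index

print("before:", before)
print("after: ", after)
```

The first plan reports a scan of the whole table, the second a search using idx_orders_customer. The exact wording of the plan lines varies between SQLite versions.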

eastern · on April 26, 2023

100%. Exaggerating the bigness of one's own data is a common phenomenon. Sometimes one is talking to a group of developers who are all so impressed with this bigness, yet every one of them has a phone in their pocket that could fit their entire dataset.

ak39 · on April 25, 2023

You can horizontally scale the database directly.

beebmam · on April 25, 2023

Depends on the scenario