Yes. Our schema is fully denormalized, which is particularly important for perfo...

ams6110 · on Feb 26, 2014

So why are you using an RDBMS?

_pmf_ · on Feb 26, 2014

> So why are you using an RDBMS?

Because even a shitty RDBMS is more robust, more secure and faster than any NoSQL wankery for this kind of use case.

NathanOsullivan · on Feb 26, 2014

How many times do we see the opposite question getting asked?

Postgres is robust, feature-filled, and scalable. You don't have to want all three to extract value from it.

_gtly · on Feb 26, 2014

PostgreSQL has gradually and intelligently been incorporating NoSQL features, e.g. hstore, hstore2, json, jsonb. These come with powerful capabilities such as indexing.

There are even cases where performance has been found to be better than, e.g., Mongo. Source: http://obartunov.livejournal.com/175235.html

There is a lot of momentum behind the improvements/additions. We will be seeing much more NoSQL in 9.4, 9.5, and beyond.

twic · on Feb 26, 2014

> Our schema is fully denormalized, which is particularly important for performance

And yet appending a single event (you dedupe for each event, right?) takes half a second. That's an eternity!

drob · on Feb 26, 2014

Query performance is critical. Insert perf is a comparatively minor concern.

Even so, there are a few factors to consider:

- Half second dedupes are for users with 100k+ events, which is <<1% of them.

- We batch events for ~5s before adding them to the cluster, so we aren't deduping for every event -- only once per ~5s of events per user.

If this becomes an issue, we can remove the deduping from normal operation and only call it when we're backfilling / updating events. Even so, we still need this function to exist, and the 100x performance improvement is very helpful.
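
To make the batching point concrete, here is a rough Python sketch of the idea, not our actual code. The `EventBatcher` class, its in-memory `stored` dict (a stand-in for the cluster), and the tuple-shaped events are all made up for illustration. The point is only that dedupe runs once per user per flush window instead of once per event.

```python
from collections import defaultdict


class EventBatcher:
    """Hypothetical sketch: buffer events per user, dedupe once per flush."""

    def __init__(self, flush_interval=5.0):
        self.flush_interval = flush_interval  # seconds; ~5s as described above
        self.pending = defaultdict(list)      # user_id -> events not yet flushed
        self.stored = defaultdict(list)       # stand-in for the real cluster
        self.last_flush = None
        self.dedupe_calls = 0                 # counts how often dedupe runs

    def add(self, user_id, event, now):
        """Buffer an event; flush everything once the interval has elapsed."""
        if self.last_flush is None:
            self.last_flush = now
        self.pending[user_id].append(event)
        if now - self.last_flush >= self.flush_interval:
            self.flush(now)

    def flush(self, now):
        """Append each user's batch and dedupe that user's events once."""
        for user_id, events in self.pending.items():
            merged = self.stored[user_id] + events
            self.stored[user_id] = self._dedupe(merged)
        self.pending.clear()
        self.last_flush = now

    def _dedupe(self, events):
        """Order-preserving dedupe; events are assumed to be hashable."""
        self.dedupe_calls += 1
        seen = set()
        unique = []
        for event in events:
            if event not in seen:
                seen.add(event)
                unique.append(event)
        return unique


batcher = EventBatcher(flush_interval=5.0)
batcher.add("u1", ("click", 1), now=0.0)
batcher.add("u1", ("click", 1), now=1.0)  # duplicate, removed at flush time
batcher.add("u1", ("view", 2), now=2.0)
batcher.add("u2", ("click", 3), now=6.0)  # interval elapsed, triggers a flush
print(batcher.stored["u1"], batcher.dedupe_calls)
```

Four events come in, but dedupe only runs twice (once per user in the batch). Dropping dedupe from normal operation would amount to skipping the `_dedupe` call in `flush` and calling it only from a backfill path.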