100k TPS over a billion rows: the unreasonable effectiveness of SQLite

rich_sasha · 2025-12-03T05:31:13 1764739873

I concur that sqlite is quite amazing. That said, I was a heavy user and have grown some skepticism as well:

- it is not that hard to lock the db. Usually killing the process that caused the deadlock solves the issue - but you need to identify it / monitor for it. And yes, it happens with WAL too

- but when it does happen, it is quite scary. Simply, anything that touches your DB suddenly stops working - can't read, can't write.

- in some cases, WAL does not checkpoint. This leads to drastic growth in the size of the WAL file, and down the line in catastrophic slowdown of queries - things that take 10ms suddenly take 10 seconds. In my particular case, no tweaking of SQLite params fixed it. I had to monitor for it, and periodically force WAL file to be rolled into the main DB.

- all of this gets harder on Windows, where eg.you cannot just 'lsof' a file.

- the performance stats change somewhat for the worse in the cloud on drives that look local but actually aren't. Of course that is not sqlite's fault, but the blazing fast performance doesn't apply to all commonly encountered environments that look like real local drives.

I'm not dissing SQLite, I use it despite these shortcomings. Equally, I'm happy to reach for something like Postgres, which, well, hasn't burned me yet.

EDIT I should add that despite all this I never managed to corrupt the DB, or break any of the SQL promises - never messed up indices, never saw broken ACID compliance etc. And that's a massive endorsement, on reflection.

andersmurphy · 2025-12-03T07:04:17 1764745457

These are some really good points.

- WAL checkpointing is very important (litestream handles this well). As you said not checkpointing can cause massive query slow down.

- SQLITE_LOCK and SQLITE_BUSY can be avoided by ensuring your application only has a single write connection ideally behind an MPSC queue. After WAL this is probably one of the biggest SQLite quality of life improvements.

- 100% avoid cloud drives in this context you ideally want attached NVME.

- Postgres is great and there's nothing wrong with using it!

zbentley · 2025-12-03T12:25:00 1764764700

> a single write connection ideally behind an MPSC queue

That’s a pretty tall order. What if I want read-after-write consistency for code that issues a write? Did you mean some kind of a fair mutex around writes instead (in which case, how is this different from what SQLite already does?)? What if writes are coming from multiple uncoordinated processes? Do I then need to bring in a daemon or IPC?

andersmurphy · 2025-12-03T12:34:59 1764765299

It's really not. You have multiple read connections and a single write connection. You batch over that single write connections. The items of a batch is just a function with a sequence of queries and or application logic. That means these functions can read their own writes (as they can read using the write connection). This gives you read-after-write consistency. Because, all these functions are in a batch transaction they can read the writes of functions that have been run before them in the same batch.

Generally I find SQLite works best with a CQRS model, where you push view updates to your clients. So reads and writes are separate. In this model you only need read-after-write for transaction logic (i.e doing a read to check for something before doing an update).

Writes can come from multiple threads, they will just be put on a queue. Personally, I use an optimistic concurrency queue for batches, and there's only one "batcher" that has the write connection.

The key difference between this and what SQLite does is quite significant.

SQLite doesn't queue those writes instead they storm the place, taking turns at asking "can I go now" and sleeping for (tens, hundreds of) milliseconds at a time.

This only gets "worse" as computers get faster: imagine how many write transactions a serial writer could complete (WAL mode and normal synchronous mode) while all your writers are sleeping after the previous one left.

If you have a single limited pool, your readers will now be stuck waiting for an available connection taken by sleeping writers etc.

It's fairer and more efficient if your application handles the writer queue.

rich_sasha · 2025-12-03T12:44:20 1764765860

This may be a good approach, but it departs a little from "it's just a file with ACID semantics, abuse it all you like". If you have multiple reader/writer processes then, if I read your post correctly, you'd need a gatekeeper process that batches the writes. And it needs, I suppose, to support more than query+data. I remember running a transaction that inserts the data, queries last inserted rowid and returns that to the user. It's not super straightforward to implement via a separate process.

But in any case, by the time you do that, you need a monitored (hand-rolled) service and at least some of the allure of a db-in-a-file goes away.

Again I'm not being mean about SQLite, it's a great piece of technology. Just sharing my war stories with others who may want to push it hard one day too.

andersmurphy · 2025-12-03T12:57:00 1764766620

SQLite, for me at least, is an embedded database it should be relatively tightly coupled to your application if you want to get the most out of it in the context of web servers. After all it's part of your application.

Again, maybe it's because I'm using Clojure on the JVM (with both real and virtual threads as well as bunch of great concurrent data-structure). But, setting up a "process" (thread) to do that batching is not hard, it's also easy for the individual functions to return results to their call sites via the Java promise API (after the batch completes).

All of this runs in a single process. With a single deployable uberjar/artefact.

zbentley · 2025-12-04T12:10:34 1764850234

> setting up a "process" (thread) to do that batching is not hard

Processes and threads are different things with specific meanings.

How would this approach work in an application server that implements concurrency with forking/multiple child processes (Python, ruby, PHP, node cluster, and many many more)?

The biggest benefit of SQLite is that, as GP says, it can be used like a file. Many places I’ve worked have used it to coordinate between all sorts of processes—often different processes written in different languages!

andersmurphy · 2025-12-04T12:23:49 1764851029

>How would this approach work in an application server that implements concurrency with forking/multiple child processes (Python, ruby, PHP, node cluster, and many many more)?

Sadly it mostly doesn't without a lot of work. You need to use languages that are not single threaded (have native multithreading support): Clojure/Java/Go/C# etc.

It's why python/ruby/php/node etc are fundamentally constrained. I'd argue those languages are the ones that pushed the horizontal scaling trend as they struggle to get the most out of a single machine.

asa400 · 2025-12-03T06:06:32 1764741992

From your experience, would you call these behaviors bugs, or are they more known issues that result from SQLites specific implementation quirks? What kinds of workloads were you throwing at it when these types of issues happened? Asking as someone who really enjoys and respects SQLite but hasn't encountered these specific behaviors before.

rich_sasha · 2025-12-03T06:28:35 1764743315

I was pushing SQLite quite hard. My DB was at peak 25GB or so. Occasional queries of O(1e6) rows while simultaneously inserting etc. Many readers and a few writers too. Id expect some degradation, sure, but Id say it wasn't very graceful.

I think, however, I was well within the parameters that SQLite maximalists would describe as within th envelope of heavy but fine usage. YMMV.

I found a very small number of people online with the exact same issues. Enough to know I'm not hallucinating, but not enough to find good support for this :/ but, TLDR, forcing WAL truncation regularly fixed it all. But I had to do it from an external process on a heartbeat, etc etc

andersmurphy · 2025-12-03T07:07:54 1764745674

You don't need to truncate the WAL, you can checkpoint PASSIVE and the WAL will be overwritten (so your queries won't slow). Generally if you're using litestream for backups it will do checkpointing for you. If you aren't depending on the after each batch (always be batching!) works well too.

I'd say the hardest part of using SQLite is its defaults are rough, and a lot of drivers don't handle batching for you.

saberience · 2025-12-03T17:24:41 1764782681

I was in love with sqlite too, until it just started getting randomly corrupted/locked and I kept having to restore it, and I never worked out why it was happening.

I appreciate its "simplicity" but ultimately I hated not knowing why it occasionally just shit the bed and ended up in an unrecoverable state. I also didn't like having to roll my own recovery system for it. Now I just use Postgres for all my hobby projects and it "just works" and I've never had it lock-up or corrupt itself...

Your mileage may vary, but sqlite definitely isn't as stable as it makes it seem.

Scaevolus · 2025-12-03T06:16:28 1764742588

WAL2 might help with the checkpointing problem: https://sqlite.org/src/doc/wal2/doc/wal2.md

rich_sasha · 2025-12-03T06:31:52 1764743512

Iirc it was somewhat experimental when I last used SQLite like this. Is is not considered mainstream? Thanks for the top tip.

weitendorf · 2025-12-02T19:00:16 1764702016

I've been working on a hybrid protobuf ORM/generic CRUD server based on sqlite

(code at https://github.com/accretional/collector - forgive the documentation. I'm working on a container-based agent project and also trialling using agents heavily to write the individual features. It's working pretty well but the agents have been very zealous at documenting things lol).

This is my first real project using sqlite and we've hit some similarly cool benchmarks:

* 5-15ms downtime to backup a live sqlite db with a realistic amount of data for a crud db

* Capable of properly queueing hundreds of read/write operations when temporarily unavailable due to a backup

* e2e latency of basically 1ms for CRUD operations, including proto SerDe

* WAL lets us do continuous, streaming, chunked backups!

Previously I'd only worked with Postgres and Spanner. I absolutely love sqlite so far - would still use Spanner for some tasks with an infinite budget but once we get Collector to implement partitions I don't think I would ever use Postgres again.

bombela · 2025-12-02T20:22:51 1764706971

> * 5-15ms downtime to backup a live sqlite db with a realistic amount of data for a crud db

Did you consider using a filesystem with atomic snapshots? For example sqlite with WAL on BTRFS. As far as I can tell, this should have a decent mechanical sympathy.

edit: I didn't really explain myself. This is for zero downtime backups. Snapshot, backup at your own pace, delete the snapshot.

hedora · 2025-12-02T22:05:13 1764713113

If it’s at 5-15ms of downtime already, you’re in the space where the “zero” downtime FS might actually cause more downtime. In addition to pauses while the snapshot is taken, you’d need to carefully measure things like performance degradation while the snapshot exists (incurring COW costs) and while it’s being GCed in the background.

Also, the last time I checked the Linux scheduling quanta was about 10ms, so it’s not clear backups are going to even be the maximum duration downtime while the system is healthy.

bombela · 2025-12-03T04:02:07 1764734527

I am not so sure you know what you are talking about. Feel free to provide some reading material for my education.

Why would the scheduler tick frequency even matters for this discussion. Even on a single cpu/core/thread system. For what is worth, the default scheduler tick rate has been 2.5ms since 2005. Earlier this year somebody proposed switching back to 1ms.

https://btrfs.readthedocs.io/en/latest/dev/dev-btrfs-design.... https://docs.kernel.org/admin-guide/pm/cpuidle.html https://docs.redhat.com/en/documentation/red_hat_enterprise_... https://sqlite.org/wal.html#ckpt https://www.phoronix.com/news/Linux-2025-Proposal-1000Hz

hedora · 2025-12-04T03:44:57 1764819897

Well, I haven’t checked it for a while. Still, try measuring FS latencies during checkpoints, or just write a tight loop program that reads cached data and prints max latencies once an hour. Use the box for other stuff while it runs.

bombela · 2025-12-05T23:20:33 1764976833

https://github.com/accretional/collector/blob/main/pkg/colle...

There is no way btrfs can be slower than this in any shape or form.

If we are comparing something simpler. Like making a copy of the SQLite database periodically. It makes sense for a COW snapshot to be faster than copying the whole database. After reading the btrfs documentation, it seems reasonable to assume that the snapshot latency will stay constant, while a full copy would slow down as the single file database grows bigger.

And so it stands of reason that freezing the database during a full copy is worse than freezing it during a btrfs snapshot. And a full copy of the snapshot can then be performed, optionally with a lower IO priority for good measure.

It should be obvious that the less data is physically read or written on the hot path, the less impact there is on latency.

For what is worth, here is a benchmark comparing IO performance on a few linux filesystem. Including some sqlite tests. https://www.phoronix.com/review/linux-615-filesystems

matthewaveryusa · 2025-12-02T18:28:41 1764700121

The only caveat being this assumes all your data can fit on a single machine, and all your processing can fit on one machine. You can get a a u-24tb1.112xlarge with 448 vcores, 24TB RAM for 255/hour and attach 64TB of EBS -- that's a lot of runway.

tuhgdetzhh · 2025-12-02T18:35:49 1764700549

Or rent a bare-metal machine from hetzner with 2-3x performance per core and 90% less costs[1].

[1] Various HN posts regarding Hetzner vs AWS in terms of costs and perf.

tetha · 2025-12-02T19:09:44 1764702584

In my experience, a decently managed database scales very hard.

3x EX44 running Patroni + PostgreSQL would give you 64GB of working memory, at least 512 GB NVMe of dataset (configurable with more for a one-time fee) at HA + 1 maintenance node. Practically speaking, that would have carried the first 5 - 10 years of production at the company I work at with ease, for 120 Euros hardware cost/month + a decent sysadmin.

I also know quite a few companies who toss 3-4x 20k - 30k at DELL every few years to get a database cluster on-prem so that database performance ceases to be a problem (unless the application has bad queries).

cortesoft · 2025-12-03T00:30:45 1764721845

There are no Hetzner servers that have 24TBs of RAM

ethanwillis · 2025-12-02T19:57:31 1764705451

This might be true in terms of direct monetary costs.

I want to like Hetzner but the bureaucratic paper process of interacting with them and continuing to interact with them is just... awful.

Not that the other clouds don't also have their own insane bureaucracies so I guess it's a wash.

I'm just saying, I want a provider that leaves me alone and lets me just throw money at them to do so.

Otherwise, I think I'd rather simply deploy my own oversized server in a colo even with the insanely overpriced hardware prices currently.

edit: And shortly after writing this comment I see: "Microsoft won't let me pay a $24 bill, blocking thousands in Azure spending" https://news.ycombinator.com/item?id=46124930

rixed · 2025-12-03T08:37:05 1764751025

Yes there is some bureaucratic paper churn to deal with them, but it's a one time cost. I did it once probably more than 10 years ago. Since then, login to the website takes me <10s (with OTP) every couple of days and then finding what I'm looking for in the web UI or the API doc is usualy just 3 or 4 clicks away (their website is a bit messy).

Compare that with AWS, where login is slow and unreliable (anyone else got an error message after every login and has to refresh to get in?), the website is a giant mess collapsing under its own weight, and slow like it's still running websphere.

Over the last 10 years, I've certainly lost way more time working through aws paperless bureaucracy than complying with Hetzner paper bureaucracy. And I'm not even using aws for that long.

earthnail · 2025-12-02T20:11:18 1764706278

Can you elaborate on what the bureaucracy is you experienced? I'm a Hetzner customer since last month and so far I thoroughly enjoy it. Have not encountered any bureaucracy yet.

ethanwillis · 2025-12-02T22:15:31 1764713731

I think I was still being a bit too harsh even after throwing into my comment that other providers aren't perfect either.

But basically after the initial paperwork I had some issues with my account getting flagged even though I wasn't using it 99.999% of the time. It's not a huge deal for me because I wasn't trying them out for anything serious. I just questioned how often that might happen if I was actually using it seriously and what kind of headaches it could cause me while re-verifying everything with them.

From people I know if everything is going good then their service is great. Server performance is good, pricing is good, etc.

vachina · 2025-12-03T01:08:35 1764724115

You’re renting an entire infrastructure, I think a bit of KYC is reasonable.

I had more trouble onboarding AWS SES, with a process that felt more like me begging. With which I said fuck it and went with self hosting ever since (on a bare metal server no less)

wesammikhail · 2025-12-02T23:33:34 1764718414

I was asked for a passport photo when I tried to open an account. They literally asked for a passport photo immediately after the signup form. Like WHAT? I couldn't believe my eyes. The most insane shit I've ever seen.

rixed · 2025-12-03T08:46:03 1764751563

Quite commonly required by law in Europe; but often times not implemented very seriously by hosting providers, but Germany seems to be an exception.

I remember a time in France for instance, about 15years ago, it was mandatory to provide your ID when bying a mere prepaid sim card. No seller would actually check, and a coworker of mine who used to work for one of the largest french telcos at the time told me that once they ran some stats over the customer database and noticed that most names where from popular comics and TV show. They laughted and moved on. These days, the seller would at least ask for some ID.

aka circling the cattle.

sampullman · 2025-12-03T07:26:05 1764746765

If I was letting some random person rent one of my servers without oversight, I'd sure want to see some ID first.

aquariusDue · 2025-12-03T01:10:21 1764724221

It's weird seeing people on HN complain about this aspect regarding Hetzner because it's the complete opposite of my experience. Two years I've rented a dedicated server for around 40 euros monthly from Hetzner as a business customer and I had no issues whatsoever. They didn't ask for a business license or personal ID or anything really, I provided a VAT ID along with a business name and address but it wasn't anything extra compared to what I also provided Migadu or Porkbun for example.

I suppose they might have more KYC procedures for personal accounts based outside the EU otherwise I have no clue.

whatevermom3 · 2025-12-03T06:48:54 1764744534

Same, Hetzner has always been very flexible with me when it comes to practically anything. It's always been humans answering my queries, with of course various quality but overall quite good especially for the price. I gave them some VAT number to get reduced prices at some point and that was it :shrug:

vidarh · 2025-12-03T08:45:27 1764751527

Used them for more than 10 years. There was a one off, straightforward process of providing some details back then, and then nothing more.

gip · 2025-12-03T01:44:39 1764726279

I'm based in the US and I tried twice to create an account for Hetzner (a personal account as well as a company / startup account). They rejected all my attempts. I don't quite understand their business model :)

johnthescott · 2025-12-03T02:36:45 1764729405

similar experience, as well. not sure what's going on with hetzner.

superze · 2025-12-03T03:50:48 1764733848

I love their pricing and the simplicity, but they don't give the impression of being highly skilled. They have zero managed services, not even managed K8. Their s3 (very mature tech at this point) is utterly garbage even one year after their launch.

Then the bureaucracy you mention which is just a reflection how they work internally as well.

Nextgrid · 2025-12-02T20:05:51 1764705951

> I want a provider that leaves me alone and lets me just throw money at them to do so.

That’s been my experience with Hetzner.

A lot of people get butthurt that a business dares to verify who they’re dealing with as to filter out the worst of the worst (budget providers always attract those), but as long as you don’t mind the reasonable requirement to verify your ID/passport they’re hands-off beyond that.

ethanwillis · 2025-12-02T22:11:44 1764713504

That's fair and I don't have any major issues with that.

I guess my concern on the bureaucracy is if you are unlucky enough to get flagged as a false positive it can be an annoying experience. And I can't really blame them too hard for having to operate that way in an environment of bad actors.

You're definitely right that the budget providers do attract the types of people trying to do bad things/exploit them in some way.

closeparen · 2025-12-03T16:44:34 1764780274

It’s too bad that there does not seem to be a comparable provider with datacenters in North America.

adityaathalye · 2025-12-02T19:09:29 1764702569

Heh, the documentation calls out the limits. Maximum (theoretical) DB size is 281TB: https://sqlite.org/limits.html

> This particular upper bound is untested since the developers do not have access to hardware capable of reaching this limit.

> However, tests do verify that SQLite behaves correctly and sanely when a database reaches the maximum file size of the underlying filesystem (which is usually much less than the maximum theoretical database size) and when a database is unable to grow due to disk space exhaustion.

DenisM · 2025-12-02T18:38:26 1764700706

Scale-up solves a lot of problems for stable workloads. But elasticity is poor, so you either live with overprovisinoed capacity (multiples, not percentages) or fail under spiky load which often time is the most valuable moment (viral traffic, Black Friday, etc).

No one has solved this problem. Scale out is typically more elastic, at least for reads.

kragen · 2025-12-02T19:16:13 1764702973

That's a good point, but when one laptop can do 102545 transactions per second, overprovisioned capacity is kind of a more reasonable thing to use than back when you needed an Amdahl mainframe to hit 100 transactions per second.

DenisM · 2025-12-02T20:54:53 1764708893

As compute becomes cheaper your argument becomes more and more true.

But it only works if workloads remain fixed. If workloads grow at similar rates you’re back to the same problem.

kragen · 2025-12-02T21:10:21 1764709821

Well, it doesn't work for the newly added workloads. But for the most part we instead have the same workloads performed less efficiently.

CuriouslyC · 2025-12-02T21:14:39 1764710079

I love hetzner for internal resources because they're not spikey. For external stuff I like to do co-processing, you can load balance to cloudflare/aws/gcp services like containers/Run/App Runner/etc.

masterj · 2025-12-02T22:18:16 1764713896

I suspect that for a large number of orgs accepting over-provisioning would be significantly cheaper than the headcount required for a more sophisticated approach while allowing faster movement due to lower overall complexity

kiitos · 2025-12-02T20:28:10 1764707290

and that your application doesn't need to be resilient to host or network faults

embedding-shape · 2025-12-02T18:45:24 1764701124

> The only caveat being this assumes all your data can fit on a single machine

Does my data fit in RAM? https://yourdatafitsinram.net/

Not sure using EC2/AWS/Amazon is a good example here, if you're squeezing for large single-node performance you most certainly go for dedicated servers, or at least avoid vCPUs like a plague.

jandrese · 2025-12-02T19:00:10 1764702010

That site is a bit questionable. I entered "64TB" as the answer and it was very happy to show me a bunch of servers that maxed out at 6 or 8TB. Even the one server that listed 64TB of RAM might be questionable since it's not leaving room for the OS or your applications. That said 64 TB is a gargantuan amount of data, so I'm not too worked up over it not fitting in RAM. Lord help you if you have a power outage and have to reload the data from disk.

ErroneousBosh · 2025-12-02T20:15:00 1764706500

Getting on for ten year's worth of forum posts on https://rangerovers.pub/ comes to about 32MB of SQL dump.

So yeah, easily.

paulddraper · 2025-12-02T20:43:26 1764708206

How does 25 TiB fit in RAM when the max machine has 24 TB?

cryptonector · 2025-12-03T19:29:55 1764790195

That's true of PG as well.

diath · 2025-12-02T18:49:26 1764701366

> Hopefully, this post helps illustrate the unreasonable effectiveness of SQLite as well as the challenges you can run in with Amdahl's law and network databases like postgres.

No, it does not. This article first says that normally you would run an application and the database on separate servers and then starts measuring the performance of a locally embedded database. If you have to keep the initial requirement for your software, then SQLite is completely out of equation. If you can change the requirement, then you can achieve similar performance by tuning the local PGSQL instance -- and then it also becomes a valuation of features and not just raw throughput. I'm not saying SQLite is not an option either, but this article seems confusing in that it compares two different problems/solutions.

sethev · 2025-12-02T20:08:54 1764706134

Right - but SQLite handily beats the case where postgres is on the same box as well. And it's completely reasonable to test technology in the configuration in which it would actually run.

As an industry, we seem to have settled on patterns that actually are quite inefficient. There's no problem that requires the solution of doing things inefficiently just because someone said databases should run on a different host.

stickfigure · 2025-12-02T21:25:05 1764710705

If you're going to run on more than one piece of hardware, something is going to be remote to your single writer database.

As an industry, we've generally decided against "one big box", for reasons that aren't necessarily performance related.

janandonly · 2025-12-03T08:34:32 1764750872

I sometimes dream of a local-first world in which all software works with local DB and only writes to the cloud as an afterthought, maybe as a backup or a way to pick up work on another machine. It just boggles my mind that more software nowadays relies on an always on internet connection for no good reason other then the design itself.

andersmurphy · 2025-12-03T09:38:34 1764754714

I think people's reaction to cloud vendors is to go local first. But, there's a middle ground VPS, rented server, even self hosting.

My problem with local first is it's fine for solo apps with the occasional sync. But doesn't work for medium to large datasets and the stuff I work in is generally real-time and collaborative. To me multiplayer is one of the strengths of the web.

BugsJustFindMe · 2025-12-02T19:54:21 1764705261

> If you have to keep the initial requirement for your software, then SQLite is completely out of equation.

No it isn't? You can run a thin sqlite wrapping process on another server just fine. Ultimately all any DB service is, PostgreSQL included, is a request handler and a storage handler. SQLite is just a storage handler, but you can easily put it behind a request handler too.

Putting access to sqlite behind a serial request queue used to be the standard way of implementing multi-threaded writes. That's only spitting distance away from also putting it behind TCP.

gunnarmorling · 2025-12-02T22:04:30 1764713070

You could do that, but you'd run into exactly the same bottleneck the author describes with a remote Postgres instance. The workload exposes high contention on hot rows. If transactions are kept open for several milliseconds due to this being a remote network call between client and DB server, throughput will be equally limited also when using SQLite.

andersmurphy · 2025-12-03T06:55:21 1764744921

Yeah this is a very good point. Any sort of network drive will have similar issues with SQLite. You're very much wanting attached NVME.

Barathkanna · 2025-12-02T21:02:40 1764709360

Exactly. People forget that “SQLite can’t do X” often really means “SQLite doesn’t ship with X built in.” If you wrap it with a lightweight request handler or a queue, you essentially recreate the same pattern every other DB uses. The fact that PostgreSQL bundles its own coordinator doesn’t make SQLite fundamentally incapable. It just means you choose whether you want that layer integrated or external.

chasil · 2025-12-02T21:56:48 1764712608

As long as WAL mode is not enabled, connections over NFS/SMB or other file sharing protocols will work.

I'm not saying that this is a good idea, and it could fail in a spectacular manner, but it can be done. DML over this is just asking for trouble.

formerly_proven · 2025-12-02T20:02:20 1764705740

Well that's just dqlite/rqlite.

benjiro · 2025-12-02T23:40:13 1764718813

> Well that's just dqlite.

Far from it, as now your not just dealing with network but also with raft consensus... So each write is not just a network trip, its also 2x acknowledging. And your reads go over the leader, what can mean if somebody accessed node 1 app but node 2 is the leader, well, ...

Its slower on reads and writes, then just replications that PostgreSQL does. And i do not mean async but even sync PostgreSQL will be faster.

The reason dqlite exists is because canonical needed something to synchronize their virtualization cluster (lxd), and they needed a db with raft consensus, that is a lib (as not a full blown server install like postgres). Performance was not the focus and its usage is totally different then most people needs here.

BugsJustFindMe · 2025-12-02T20:07:07 1764706027

Dqlite and Rqlite are primarily for buildling fault-tolerant clusters. But if you just take the network access part, then ok sure, but also so what?

otoolep · 2025-12-02T20:35:37 1764707737

rqlite[1] creator here.

Nit: dqlite is a library, it is not a network-exposed database like rqlite is. Sure, it requires connecting to other nodes over the network, but local access is via in-process. In contrast one connects with rqlite over the network - HTTP specifically.

[1] https://rqlite.io

andersmurphy · 2025-12-03T09:23:42 1764753822

Massive fan rqlite. Awesome work!

andersmurphy · 2025-12-02T19:01:16 1764702076

Paradoxically, raw throughput matters a lot more if you are going to scale on a single box. SQLite is 10x PG on a single box in this example. Considering databases tend to be the bottle neck that can take you an order of magnitude further. PG on the same server will also be slower the more complex the transaction as unix sockets are still going to be considerably slower than a function call.

The other thing to point out is in this article is that the PG network example CANNOT scale horizontally due to the power law. You can throw a super cluster at the problem and still fundamentally do around 1000 TPS.

prisenco · 2025-12-02T20:36:06 1764707766

Also important is just how fast cheap hardware has gotten which means vertical scaling is extremely effective. People could get a lot farther with sqlite in wal mode on a single box with an nvme drive than they imagine. Feels like our intuition has not caught up with the material reality of current hardware.

And now that there are solid streaming backup systems, the only real issue is redundancy not scaling.

barfoure · 2025-12-02T19:58:00 1764705480

> Paradoxically, raw throughput matters a lot more if you are going to scale on a single box.

There’s absolutely nothing paradoxical about any of this.

embedding-shape · 2025-12-02T18:56:43 1764701803

> If you have to keep the initial requirement for your software, then SQLite is completely out of equation.

It'd be a very short article if so, don't you think? Full article would be something like: "Normally you'd have a remote connection to the database, and since we're supposed to test SQLite's performance, and SQLite is embedded, it doesn't compare. Fin"

stonemetal12 · 2025-12-02T19:19:22 1764703162

The table of data at the end of the article has 7 lines, only one has data for both DBs. What was the point of setting up the comparison if there is no comparison made?

andersmurphy · 2025-12-02T19:24:47 1764703487

Because it shows that a network RDBS database cannot get you out of this predicament.

fulafel · 2025-12-02T19:37:56 1764704276

What is says first is: "SQLite is for phones and mobile apps (and the occasional airliner)! For web servers use a proper database like Postgres!"

Though I'd say it's for a broader set of applications than that (embedded apps, desktop apps, low-concurrency server apps etc).

Phones and mobile apps installations of course outnumber web app deployments, and it doesn't say what you paraphrased about servers.

jaketoronto · 2025-12-02T20:43:41 1764708221

Are you limiting your # of connections to postgres to 8? Is this unnecessarily throttling your throughput? This seems like quite the bottleneck... connection pools are good when your app is overwhelming your db.. but in this case, you really should be trying to put more load on Postgres... I'm concerned that this whole experiment is tainted by this choke point. I would love to see this tested again with a much larger connection pool. (Also, might be nice for you to share what the CPU and thread usage on the Postgres side was during this). (Side note: I'm a big fan of sqlite.. but I've run into lots of performance bottlenecks caused by arbitrarily low connection pools and the like)

jaketoronto · 2025-12-02T20:51:10 1764708670

To further explain:

You mention setting the conn pool to 8 to match your # of cores. That would be fine if you didn't have any sleeps inside of your txns... But the moment you added the sleeps inside the txns, your limit of 8 kills through throughput... because no other thread can access the DB once 8 of them grab connections and start the 20ms of total sleep. Imagine instead if you had 64 connections... you would 8x your throughput... What if you were to go even higher? At some point you might start overloading the DB... at that point, you could consider tuning the db to accept more connections... or... maybe you've truly reached the DB's peak performance limit.

I just don't think that 8 connections represents that limit... you need to do everything you can to open up your client config until you reach PG's limitations.

andersmurphy · 2025-12-02T21:27:32 1764710852

A larger pool actually makes the number worse because it adds more contention. I tested it with 64 and all the results were worse. The last example which was 348 TPS drops to 164 TPS!

Tangentially I also highly recommend this article on pool sizing.

https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-...

andersmurphy · 2025-12-03T06:56:04 1764744964

Added an epilogue that covers larger connection pools and the results.

ch2026 · 2025-12-02T23:30:11 1764718211

Yeah I don’t buy this article. Something is off. I run vastly more TPS on a very barebones MySQL install over the network.

Edit: Keep downvoting. I hate this fucking site.

tomconnors · 2025-12-02T19:45:26 1764704726

Cool stuff as usual, Anders. One of the nice things about running a networked DB is that it makes redeploying the application a bit simpler. You can spin up a new EC2 instance or whatever and once it's online kill the old one. That gets 0 or close to 0 downtime. If the DB is on the same instance, replacing it requires loading up the DB onto the new instance, which seems more error prone than just restarting the app on the original instance, but in my experience that typically incurs downtime or some complicated handoff logic. Have you had to deal with anything like that running sqlite in prd?

andersmurphy · 2025-12-02T20:01:29 1764705689

Oh there are a bunch of considerations.

You're going to want persistent storage on your server, not ephemeral. You'll also want NVME. A lot of the time you're going to end up on bare metal running a single server anyway.

You're going to have down time for migrations unless you're very clever with your schema and/or replicas.

Litestream for me at least is what makes SQLite viable for a web app as prior to that there wasn't a good replication story.

With litestream it's much easier to have a backup on standby. That being said where I have used it in production some amount of downtime has been acceptable so mileage may vary.

benjiro · 2025-12-03T01:44:20 1764726260

> Litestream for me at least is what makes SQLite viable for a web app as prior to that there wasn't a good replication story.

Does Sqlite now not have a build in rsync for replicas?

Searches, yep ... https://sqlite.org/rsync.html

andersmurphy · 2025-12-03T12:24:46 1764764686

Pretty sure rsync was only added in 2024, litestream predates it by quite a bit.

What's cool is the newest version of rsync lets you replicate while in journal mode (which litestream doesn't support).

kiitos · 2025-12-02T21:08:13 1764709693

> You're going to have down time for migrations unless you're very clever with your schema and/or replicas.

probably worth stating these kinds of design considerations/assumptions up-front

i'm sure lots of applications are fine with "downtime for [database] migrations" but lots more are definitely not, especially those interested in synthetic metrics like TPS

andersmurphy · 2025-12-02T22:03:00 1764712980

I'd argue the opposite most applications are fine with an hour of downtime a month and arguably much more downtime then that. The recent AWS and Cloudflare outages have proven that.

You can achieve zero downtime with Sqlite if you really need to.

TPS is not a synthetic metric when you cap out at 100 TPS because of Amdahl's law and your users having a power distribution.

kiitos · 2025-12-03T00:31:55 1764721915

1h of downtime per month means you're delivering at best two 9s of availability. again that may be fine for lots of applications but it's trivial scale, and certainly a couple orders of magnitude below what aws and cloudflare provide

taking a step back, if your application's db requirements can be satisfied by sqlite [+replication] then that's great, but that set of requirements is much narrower, and much easier to solve, than what postgres is for

ngrilly · 2025-12-04T22:02:01 1764885721

How would you achieve zero downtime?

andersmurphy · 2025-12-05T10:25:06 1764930306

Personally, I'm a fan of event sourcing if you need zero downtime you probably need auditibility. You have one SQLite database that's an append only event log. You build projections from it into any number of SQLite databases. These databases pull from the log database and update themselves. They are completely expendable. So you never have to vacuum them or migrate them. You just build a new projection and then point to it. This is also how you can spread your sqlite over nodes (if that's your thing, with something like NATS).

There are other ways where you're just more disciplined with your schema and indexes. I.e jsonb, partial indexes and existence based programming (from data oriented design).

Edit: In the time it took me to write this cloudflare is down again https://news.ycombinator.com/item?id=46158200

ngrilly · 2025-12-05T13:25:36 1764941136

That's indeed a possible (and very resilient) solution. But that's a significant shift for many apps. I'm fan of event sourcing in theory, but I've always been afraid of it making things overly complex in practice, for relatively small apps. But I haven't tried hard enough to have a real opinion on this.

andersmurphy · 2025-12-05T14:03:56 1764943436

Oh for sure. That's why my default is if you don't need event sourcing and are ok with some down time a month if you need to add a really large index. Keep it simple.

Once you have a decent CQRS set up (which is a natural fit with SQLite) and event sourcing it can be quite easy to spin up on a new project.

I think people don't have an honest assesment of what their project requirements are in terms of uptime and scale. So they start with crazy microservice multinode architectures that are designed to scale to the moon and inevitably end up with more downtime due to complexity.

I'd still recommend people start with managed PG for most things. But, if you're comfortable using VPSs or bare metal servers SQLite can take you very far.

ngrilly · 2025-12-06T23:10:01 1765062601

I fully agree that most projects would work perfectly fine with a monolith and PostgreSQL. And yet they go with microservices, Redis, RabbitMQ, etc, making the system more complex and less available.

SQLite would be perfect if it would allow multiple concurrent writers, which would allow running database migrations without downtime. I’m looking forward to Turso for this.

cwillu · 2025-12-02T23:31:44 1764718304

Sqlite supports multi-process access, so as long as you've configured to permit that, you can do the same “start new process before retiring the old one” dance.

tomasol · 2025-12-02T21:15:03 1764710103

Author is setting PRAGMA synchronous="normal", meaning fsync is not issued as part of every write tx, but eventually. In order to make the comparison fair it should be set to "full".

andersmurphy · 2025-12-02T21:48:01 1764712081

PRAGMA synchronous="normal" is fine if you are in WAL mode. The database cannot be corrupted by power loss unlike in journal mode.

> The synchronous=NORMAL setting provides the best balance between performance and safety for most applications running in WAL mode. You lose durability across power lose with synchronous NORMAL in WAL mode, but that is not important for most applications. Transactions are still atomic, consistent, and isolated, which are the most important characteristics in most use cases.

tomasol · 2025-12-02T21:59:35 1764712775

fsync is the most expensive operation during a write. NORMAL mode means you don't care whether last ~100 ms of transactions before a process crash / VM restart are going to be persisted or not. My suggestion is either to use synchronous="full" or disable `synchronous_commit` on Postgres to avoid comparing apples to oranges.

Edit: Also, the example indicates financial transactions. Can you explain why you need serializability but not durability?

andersmurphy · 2025-12-03T06:50:29 1764744629

Very good point I've added an epilogue to the post with updated numbers

Really shows the power of dynamic batching.

cryptonector · 2025-12-03T19:34:20 1764790460

Thank you for doing this update!

cryptonector · 2025-12-03T19:31:53 1764790313

100%. TFA was NOT comparing apples to apples. Now that it's been updated the numbers do not look quite so fantastic. The version of PG used matters, too, since the latest adds async I/O support that greatly improves performance.

andersmurphy · 2025-12-03T20:38:03 1764794283

SQLit went from 102545 -> 100405

westurner · 2025-12-02T18:31:44 1764700304

That's a helpful TPS Report.

TIL `SAVEPOINT` can occur in a BEGIN ... END SQLite transaction, and that works with optimizing batch size on a particular node with a given load.

Is there a solution for SQLite WAL corruption?

From https://news.ycombinator.com/item?id=45133444 :

> "PSA: SQLite WAL checksums fail silently and may lose data" https://news.ycombinator.com/item?id=44672902

> sqlite-parquet-vtable, [...]

matharmin · 2025-12-02T18:39:17 1764700757

As mentioned in those threads, there is no SQLite WAL corruption if you have a working disk & file system. If you don't, then all bets are off - SQLite doesn't protect you against that, and most other databases won't either. And nested transactions (SAVEPOINT) won't have have any impact on this - all it does in this form is reduce the number of transactions you have.

westurner · 2025-12-02T19:13:55 1764702835

> working disk & file system

And a working ECC or non-ECC RAM bus, and [...].

How bad is recovery from WAL checksum / journal corruption [in SQLite] [with batching at 100k TPS]?

And should WAL checksums be used for distributed replication "bolted onto" SQLite?

>> (How) Should merkle hashes be added to sqlite for consistency? How would merkle hashes in sqlite differ from WAL checksums?

SQLite would probably still be faster over the network with proper Merkleization

alexwennerberg · 2025-12-02T18:47:13 1764701233

Does anyone have examples of organizations that have leveraged SQLite and written about their experience? I've read a lot of theory and benchmarks about it lately and it seems extremely impressive, but I'm wondering if anyone has written about pushing it to its limits "in production"

andersmurphy · 2025-12-02T18:49:31 1764701371

https://use.expensify.com/blog/scaling-sqlite-to-4m-qps-on-a...

Linked at the bottom of the article is the most extreme I've seen.

alexwennerberg · 2025-12-02T19:20:36 1764703236

Thank you! I read this a long time ago and could never remember it, thinking it was Shopify that wrote it.

yolo3000 · 2025-12-02T19:24:59 1764703499

Maybe they should have stuck with traditional tech and used their talent on something else. Stock is down 96%

janci · 2025-12-02T20:51:26 1764708686

How does SQLite handle HA setups? The minimum I want is reliable automatic failover in reasonable time for user-facing service. Ideally an active-active setup.

vhodges · 2025-12-02T21:43:36 1764711816

sqlite is just a library (in C)

A few projects:

  * https://github.com/rqlite/rqlite Distributed, fault tolerant cluster
  * https://litestream.io/ Replication to S3 (or compatible) - more disaster recovery than fail over
  * https://fly.io/docs/litefs/ Same Author as litestream).  Distributed replication.  Requires writes to be redirected to the primary.

I am debating Postgres vs sqlite (probably with litestream) for a project right now.

And other than HW redundancy, I can get pretty far by scaling vertically on a single box. And for my app, I could probably (and my users!) live with some occasional downtime (as long as the data is replicated/backed up).

If I get 20-50K users, it'll be a successful venture so I don't need much these days and it will be cheaper and easier to run as well.

otoolep · 2025-12-02T21:59:15 1764712755

rqlite creator here, happy to answer any questions.

andersmurphy · 2025-12-03T09:32:58 1764754378

Yeah if you're comfortable scaling vertically and potentially a little downtime. Sqlite massively simplifies your ops, backups litestream is fantastic.

It's also as you mentioned dirt cheap (VPS or a hetzner box).

nchmy · 2025-12-02T22:10:16 1764713416

there's various options now, but im most interested in Marmot - which is multi-master and just came out of a 2 year hibernation with a big overhaul that introduced a native gossip mechanism to replace NATS/Raft

https://github.com/maxpert/marmot/

busymom0 · 2025-12-02T19:20:53 1764703253

Previously, I had always used Postgres for database and Rust or NodeJS for my backend. For my new website (https://limereader.com/), I used Swift for my backend, SQLite for Database, Vapor for web server in the Swift app and am self-hosting the site on an old Mac mini.

A sqlite related issue I ran into had to do with accessing the SQLite database from multiple threads. Found out a solution easily: for multi-threading use, SQLite needed to be initialized with a `SQLITE_OPEN_FULLMUTEX` flag. Since then, the website has been running seamlessly for about 3 weeks now.

maxmcd · 2025-12-02T19:25:58 1764703558

This will block threads while waiting for other threads to write. That might work great for your threading model but I usually end up putting the writer in one thread and then other threads send writes to the writer thread.

busymom0 · 2025-12-02T19:32:32 1764703952

I do open 2 connections:

First one for writing with flags:

    SQLITE_OPEN_CREATE | SQLITE_OPEN_READWRITE | SQLITE_OPEN_FULLMUTEX

Second one for reading with flags:

    SQLITE_OPEN_READONLY | SQLITE_OPEN_FULLMUTEX

As you can note, I have SQLITE_OPEN_FULLMUTEX on both of them. Should I only have it for the writing one?

maxmcd · 2025-12-04T12:31:40 1764851500

Oh nice, yes I think your threads should be able to perform reads concurrently when the write lock is not held. Would make sure you are in WAL mode as well, since I think that will improve your concurrency.

wmanley · 2025-12-02T19:38:12 1764704292

Use a connection per-thread instead. By sharing a connection across threads you’ll be limiting concurrency - and transactions won’t work as you’d expect. SQLite connections are not heavy.

Also: use WAL mode and enable mmap.

andersmurphy · 2025-12-02T19:37:09 1764704229

You don't need fullmutex if you manage your connections correctly at the application level. I.e ensure each connection is only used from a single thread at a time. I also highly recommend having an MPSC queue for your batch/writes and make them go through a single connection so you don't have to deal with SQLITE_BUSY or SQLITE_LOCKED.

shadowgovt · 2025-12-02T21:26:19 1764710779

Good. I feel like people keep discovering this principle that "When the engine does far, far fewer things, it's faster," but I certainly won't bash on people learning this, however they get there.

ducktective · 2025-12-02T20:31:44 1764707504

Does anyone have rough numbers (max daily users etc) on viability of SQLite vs PostgreSQL for a typical user-facing webapp or e-commerce application?

I know due to some recent update, SQLite can support concurrent reads but still only a single writer. For which cases this would be a problem?

Some recommend it's better to start with postgres anyway if you have any remote thoughts of scaling in mind....

saberience · 2025-12-03T17:34:45 1764783285

Honestly, just use PostGres. It's easy enough and will scale with your business, also it won't randomly lock or corrupt your database (I've had sqlite do this to me several times).

dmezzetti · 2025-12-02T19:18:11 1764703091

I've used SQLite as the content storage engine for years with TxtAI. It works great. Also plenty of good add-ons for it such as sqlite-vec for storing vectors. It can take you pretty far and maybe it's all you need in many circumstances.

bob1029 · 2025-12-02T20:49:27 1764708567

This is great until you encounter a customer with a hard RPO requirement of 0. SQLite has a few replication options, but I would never trust this in high stakes domains over PGSQL/MSSQL/DB2/Oracle/etc.

BenjiWiebe · 2025-12-02T21:19:14 1764710354

I'm curious, is an RPO of 0 truly expected or needed? I can easily believe that some places would "require" it.

What kind of data is so critical that the data from a quarter second before catastrophic destruction must be saved?

I guess weapons testing, at least... But that wouldn't be streaming data of that importance for a very large % of time.

wbsun · 2025-12-03T00:31:36 1764721896

My experience: customers with $$$ will always believe they are very important, so important that losing a single bit is the end of world.

So you may not want to convince customers waving huge $$$ checks that their data are not that important. But instead, providing options to keep them once they realize that: their pockets are not that deep, and they are also totally ok losing some data.

NorwegianDude · 2025-12-03T03:45:13 1764733513

This is very misleading. The secure defaults for sqlite is changed, so commits are not actually written to the disk. Running sqlite like this will cause data loss on os crash or power loss.

andersmurphy · 2025-12-03T06:52:28 1764744748

I've added numbers for synchronous FULL to the article for those who are interested.

SJC_Hacker · 2025-12-03T00:01:31 1764720091

SQLite is cool. You can create an in memory DB ands it’s super fast. Better stick with read only though

With a few lines of code you can even copy an arbitrary DB from disk to memory.

pmdulaney · 2025-12-02T18:50:12 1764701412

Prefer "surprising" to "unreasonable".

jhoechtl · 2025-12-02T18:52:50 1764701570

But that wouldn't be cool

cryptonector · 2025-12-03T19:27:59 1764790079

My problem with SQLite3 is the poverty of types and the dynamic typing. I really want the wealth of types that PG brings. But I really like the SQLite3 implementation better than the PG implementation (granted, I'm talking about the core of the RDBMS, not anything to do with networking since SQLite3 lacks networking).

jcadam · 2025-12-02T21:02:06 1764709326

I've been getting some good results from sqlite-vec on my current project. I need to look at geospatial extensions next.

cluckindan · 2025-12-03T12:44:45 1764765885

The first example uses a pattern where application code is run in the middle of a db transaction. That is almost certainly an anti-pattern, unless the computation is very lightweight - and even in that case it could probably be done in SQL.

andersmurphy · 2025-12-03T13:03:22 1764767002

You can definitely mitigate the power/amdahl's law problems with pure SQL and/or triggers/stored procedures. I mention that in the article. However, sacrificing interactive transactions is rough and not always possible.

It's worth keeping in mind that the problem here is not heavyweight computation, but the network latency to the database. Your computations can be fast, but if your latency is still at best 1ms for each time you go to the database during that transaction. Every ms that high contention row is locked is kept your throughput degrades dramatically.

andersmurphy · 2025-12-03T07:00:27 1764745227

(author here) Thank you for the fantastic discussion so far. I've added an Epilogue to the article that covers:

- Pragma synchronous "FULL"

- Giving PG a bigger connection pool

gausswho · 2025-12-02T20:32:42 1764707562

SQLite db's do not reliably survive Syncthing across multiple devices. Until that's true I am always seeking a DB design that does.

Jyaif · 2025-12-03T17:17:13 1764782233

> If you [..] do not fully understand the nuances and limitations of SQLite DO NOT USE IT.

So, what are the limitations compared to Postgres?

saberience · 2025-12-03T17:33:11 1764783191

It doesn't scale out, only up, is a fairly big limitation.

So if you have a single DB server running sqlite and your server goes down, well, your shit is down and there is no failover. I.e. no built in replication or clustering.

It doesn't support multiple simulataneous writes (like PostGres and SqlServer etc).

No stored procedures or functions.

There is no real client/server architecture. i.e. if you have applications on multiple servers which need access to the DB then you're in a bad place. The database has to be embedded along with the application.

andersmurphy · 2025-12-03T18:58:56 1764788336

>It doesn't scale out, only up, is a fairly big limitation.

This is the main limitation. That being said you can scale out with projections if event sourcing is your thing.

>It doesn't support multiple simulataneous writes (like PostGres and SqlServer etc).

A process with a single writer tends to be faster because it reduces contention. You only need MVCC in postgres because of the network.

What's even better is you can query across multiple databases seamlessly with ATTACH (https://sqlite.org/lang_attach.html). So it's very easy to split databases (eg: session database, database per company etc). Each database can have its own writer and eliminating contention between data that doesn't need to have atomic transaction across databases.

>No stored procedures or functions.

It's an embedded database the whole thing is effectively a stored procedure. You can even extend SQLite with your own custom functions in your application programming language while it's running (https://sqlite.org/appfunc.html).

In terms of access by multiple applications etc, if it's read access you can create read replicas/projections with litestream etc.

CyberDildonics · 2025-12-02T23:45:56 1764719156

Those numbers are just called a database.

koakuma-chan · 2025-12-02T20:11:22 1764706282

Alternative title: SQLite is All You Need

badmonster · 2025-12-02T19:05:47 1764702347

The real insight here is recognizing when network latency is your bottleneck. For many workloads, even a mediocre local database beats a great remote one. The question isn't "which database is best" but "does my architecture need to cross network boundaries at all?"

andersmurphy · 2025-12-02T19:27:48 1764703668

(author here) yes 100% this. This was never mean't to be a SQLite vs Postgres article per say, more about the fundamental limitations of the network databases in some contexts. Admittedly, at times I felt I struggle to convey this in the article.

slashdave · 2025-12-02T19:45:27 1764704727

Sure. Now keep everything in memory and use redis or memcache. Easy to get performance if you change the rules.

koakuma-chan · 2025-12-02T20:10:06 1764706206

You can use SQLite for persistence and a hash map as cache. Or just go for Mongo since it's web scale.

Yodan2025 · 2025-12-02T22:14:59 1764713699

yep, then add an AWS worker in-between

SJC_Hacker · 2025-12-03T00:38:30 1764722310

SQLite can also do in memory

slashdave · 2025-12-03T01:36:59 1764725819

Yeah, very good point. It all comes down to requirements. If you require persistence, then we can start talking about redundancy and backup, and then suddenly this performance metric becomes far less relevant.

andersmurphy · 2025-12-03T06:53:10 1764744790

Backups are to the second with litestream.

runako · 2025-12-03T05:10:28 1764738628

So much this. My inner perf engineer shudders every time I see one of these "modern" architectures that involve databases sited hundreds of miles from the application servers.

andersmurphy · 2025-12-03T06:54:10 1764744850

This article is very much a reaction to that. The problem is the problem as Mike Acton would say.

andrewvc · 2025-12-02T21:26:06 1764710766

The HN SQLite worship posts have gotten out of hand. What’s next a post on how appending to files is faster than Kafka?

It’s great that some people have workloads that this is a fit for. What’s more common is the use case managed databases like RDS etc solves for. You have some quantity of data you want to always be there, be available over a network for whatever app(s) need it and want backups, upgrades, access control etc solved for you.

I love SQLite and reach for it for hobby projects, but as a product for general business apps it is quite niche. It has the qualities that make for any popular product on HN, a great getting started experience and a complex maintenance and operational experience.

abrookewood · 2025-12-02T23:58:54 1764719934

Honestly, I think it's a reaction to all of the over-optimisation that everyone gets caught up in - immediately starting on AWS, Kubernetes and micro-services. Most of the projects people work on will never reach the performance limits of SQLite and a single server.

I'm not saying that there aren't valid reasons to use AWS & clustered solutions etc, but we shouldn't always take that as our starting position.

Tractor8626 · 2025-12-02T21:00:01 1764709201

Sqlite is very cool. But what was the point of slowing postgress down?

> But, wait our transactions are not serialisable, which they need to be if we want consistent transaction processing

You either don't know what serializable does or trying to mislead the reader. There is zero reason to use searializable here.

> Let's say you have 5ms latency between your app server and your database.

5ms latency is unrealistic. Unless you use wifi or you database is in another datacenter.

> I'm talking about transactions per second, specifically interactive transactions that are common when building web applications

No they are not common at all. You probably invented them just to make pg look bad.

gunnarmorling · 2025-12-02T22:22:18 1764714138

> No they are not common at all. You probably invented them just to make pg look bad.

Interactive transactions are a well-defined concept in database technology, exactly with the semantics described by the author: transactions with multiple queries, with application logic in between, for instance taking a result from one query, processing it, and running a second query with the outcome of that processing as input.

That said, the example in the blog post feels a bit contrived to me, the question being whether that transaction could be structured in a why so that both updates run right after each other towards the end of the transaction, thus significantly reducing the duration of the row lock being held (which is not to say that the general problem described in the article doesn't exist, it does for sure).

andersmurphy · 2025-12-02T22:45:06 1764715506

I wanted the example to be simple and relatable. So I went with something close to the transaction example on the Postgres site (which to be honest also feels contrived).

In practice I'd never implement a ledger like this these days, I'd much prefer an append only model with some triggers. But, like you said interactive transactions are very much a thing.

Out of curiosity do you have a better/less contrived example in mind?

andersmurphy · 2025-12-02T21:15:00 1764710100

> You either don't know what serializable does or trying to mislead the reader. There is zero reason to use searializable here.

If you're processing financial transactions you want your isolation level to be serialisable. As the order in which the transactions are processed matters.

> 5ms latency is unrealistic. Unless you use wifi or you database is in another datacenter.

Even with 1ms latency. Amdahl's law will still make you cap out at a theoretical 1000 TPS if you have 100% row lock contention.

> No they are not common at all. You probably invented them just to make pg look bad.

I'm confused. I invented transactions? Are you saying you don't use transactions with rollback when you use PG?

Tractor8626 · 2025-12-02T22:52:26 1764715946

> If you're processing financial transactions you want your isolation level to be serialisable. As the order in which the transactions are processed matters.

So you don't know what serializable level is.

> Even with 1ms latency. Amdahl's law will still make you cap out at a theoretical 1000 TPS if you have 100% row lock contention.

So why use 5ms and 10ms for examples?

> I'm confused. I invented transactions?

"Interactive" transactions.