SQLite – The “server-process-edition” branch (sqlite.org)
236 points by yread on Aug 15, 2018 | 75 comments


It scales so perfectly that we are deploying it across 3 datacenters, each with four 384-core machines with 3 terabytes of RAM. I can't speak highly enough about sqlite and the team behind it.


I don't understand why you'd use SQLite for a deployment this heavy over PostgreSQL or MySQL. Yes, SQLite is fantastic, but why would you choose to do it this way?


Because SQLite is often the right tool for the job, more so than Postgres and MySQL. It scales insanely well and requires minimal configuration and near zero administration. I try to use SQLite for as long as is feasible in my projects purely because it "just works".

Don't let the "Lite" fool you. Depending on your needs, you can scale to tens of thousands of users using just SQLite. Or not! It is all about knowing your system and properly evaluating your options. I choose to use SQLite because it often fits the use case of small to medium projects the best.


Honest question: where do Postgres and MySQL fall down for such deployments? What is the administration required? Backups? Users and permissions? Indices?

FWIW I use sqlite for a personal wiki, i.e. a website with exactly one user :)


I think the idea is that if your service does more, it necessarily needs more work to maintain. From a simple point of view, there are just more moving parts to go wrong.


The "lite" doesn't refer to "light"; it refers to a mineral.

https://changelog.com/podcast/201


I was confused by this, so I read the transcript. In case anyone else reads this, the name is supposed to be read as:

SQL (S Q L or sequel, your call) - ite

So it does sound like the name of a mineral: bauxite, boehmite, hematite, etc. Heck, kryptonite :D


> Jerod Santo:

> Hm, like a mineral. Were you playing on the word "light", or were you just playing on mineral...?

> Richard Hipp:

> I was, I was.

It's pronounced as a mineral. It's still a pun on "light". And an obvious one, given what SQLite is compared to other RDBMSes.


If someone put a good, fast Raft replication engine in-process in front of sqlite, you'd have a pretty killer db for a lot of things.



Today is your lucky day: https://github.com/rqlite/rqlite


I think it is a tradeoff. If you want more data resilience on disk, SQLite will require all sorts of hacks to control what (and, more difficult, when) to dump to disk.

On the other hand, if you need every ounce of performance and do not care so much about data resiliency on disk, then the other SQL servers will be too much hassle to configure, because you cannot miss any of the many data-safety safeguards they ship with by default, safeguards which will kill your performance when you least expect it.


It's faster? Simpler?

Conversely, why would you choose one of those alternatives?

(Not that I think there aren't any reasons to do so, mind you, but I can well imagine that if you really KISS, sqlite makes sense even at impressive sizes.)


SQLite's typing is rather weak, is it not? That might be a tad scary for someone who relies on that part of PostgreSQL.


It sure is. After recently porting a large application from SQLite to Postgres, much of my time was spent casting types and ensuring code was updated to input the correct types into Postgres. All in all, the changes were for the better, as many bugs were discovered and resolved in the process.


Well, the historic one is, "because you need multiple concurrent data writers."

But, that's what this branch is supposed to be for. So now the reason is: "because you need multiple concurrent data writers from multiple processes." And it might also be "you need multiple concurrent data writers and care about your data enough to not risk loss in the event of a system crash or power failure" since it uses PRAGMA synchronous=OFF.
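
For reference, it's a one-line setting, which is part of why it's so tempting. A minimal sketch from Python's sqlite3 module (the file name is made up):

    import sqlite3

    conn = sqlite3.connect("app.db")  # hypothetical database file
    # Trade durability for speed: SQLite stops issuing fsync after writes,
    # so a power failure or kernel panic can corrupt the database.
    conn.execute("PRAGMA synchronous=OFF")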


Are you using this particular branch in production, deployed as you mention, or sqlite in general? Some more info about this use case would be very interesting if you can share!


What sort of volume is this handling? Is the dataset in memory, or if not, what size and type of I/O is backing this? It seems like a ton of CPU, whereas in my experience typically I/O is the primary bottleneck for database loads.


Note the second comment in the following thread:

https://news.ycombinator.com/item?id=12739771

quinthar posted in quite a lot of detail there as well, so it may provide some context.


The link says that it requires turning synchronous off, which means that you won't be waiting for real I/O on transactions, since no fsync calls are issued. Add that to a huge amount of RAM for cache, and it's very reasonable to be CPU bound... So long as you don't mind data loss or corruption on power loss or kernel panic...


Doesn't a BBU RAID controller largely address the power-loss concern? And I always thought BBU RAID controllers were just absolute common sense for any serious database (until everyone went cloud and suddenly basics like dual network cards, dual PSUs and RAID controllers didn't fit Amazon's desire to sell complexity).


Well, data will be lost if anything at the OS buffer cache level or below crashes (source: http://sqlite.1065341.n5.nabble.com/How-dangerous-is-PRAGMA-...).

The BBU protects from power loss of the HDs, but not power loss or general failure of the mainboard or any other important component.

Also, BBUs can run out of battery, so a flash-backed BBU is generally recommended.


Without fsync, the dirty page can sit in RAM and not even be sent to the disk, so a battery backup for the disk wouldn't solve data loss.


Right... but you leave fsync on in this case, no? This might be incompatible with the locking required by this server-process edition, though (I guess, no clue). But more generally speaking, fsync=on with a proper RAID controller gives a massive performance boost (like orders of magnitude) while being relatively resilient to power loss.


"in my experience typically I/O is the primary bottleneck for database loads"

Not when you have potentially nearly 3TB of memory cache.

Of course it all depends on the dataset.


So is there one SQLite process running on each of the 384 cores?



It may be a bit out-of-date, but BerkeleyDB provides a SQLite front-end that supports page-level locking -- and thus has better TPS when multiple threads are writing. It's basically SQLite with the btree ripped out and replaced with BerkeleyDB. More information here [0].

[0] http://www.oracle.com/technetwork/database/database-technolo...


Mods: there is a much more up-to-date version of the readme at this link: https://sqlite.org/src/artifact/b98409c486d6f028


If this requires turning off pragma synchronous, doesn't that disqualify it for anything but read only tasks? Power loss etc would corrupt the database...


You could have a master-slave architecture where you disable it for the slaves only. It really depends on the use case.


wait, sqlite can do master-slave in some way?


No, vanilla SQLite cannot.

But think of it as a library: you can build master-slave on top of it.

Which is basically what I did with https://redisql.com/ in the pro version exploiting Redis AOF


It's not a requirement, but it will run quicker with it off.


Install a UPS.


Doesn't protect against a kernel panic, or other kind of crash.


Expensify is powered entirely by sqlite (inside bedrockdb.com) and it is freaking amazing.


This was discussed a couple years ago at https://news.ycombinator.com/item?id=12739771


Good thing you don't need 64bit integers.

Seriously, as impressive as SQLite is for what it is, people shouldn't be given the impression that it doesn't come with some massive and often surprising caveats when viewed as a general purpose database.


Looks like I misremembered this (re 64bit integers) and it was one of the other slightly odd limitations SQLite has around rowids and foreign keys that caused me to abandon the last system I was designing with it...

What concerned me most when using SQLite was the ease with which it would happily let me write broken queries and not raise a fuss, e.g. referencing non-GROUP-BYed terms in an aggregation.


Hi ris,

Can you provide example(s) of the broken query with non-GROUP BY columns in an aggregation? (Or a reference link if this is well known - I've searched briefly but can't find anything.)


Not the parent poster but it's queries like these:

Select user.id, user.name, sum(txn.amount) from user inner join txn on (user.id = txn.uid) group by user.id

User.name is not in the group by list so it is picked arbitrarily from the rows in the group. It's harmless here but not in all cases.
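
Most other engines reject that query outright; the portable form lists every non-aggregated column: Select user.id, user.name, sum(txn.amount) from user inner join txn on (user.id = txn.uid) group by user.id, user.name.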


TBH I like the ability to do this. Sometimes you truly don't care which value you get, you just want one. Sometimes you have application-level constraints which mean the one it picks is guaranteed to be correct, but you can avoid the cost of enforcing that (at insertion time and at lookup time). It's a handy tool when it's appropriate.

But hell yes, tools should warn you about this loudly, as it's not generally correct.


Did you file a bug report?


SQLite has 64-bit integers. In fact, rowids have been 64-bit by default since version 3.


It has some quirks, but I'd be interested to know what you think are its "massive and surprising" issues.

Also, it has 64-bit integers - https://www.sqlite.org/datatype3.html
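
Easy to check, too (a quick sketch with Python's sqlite3 module and an in-memory database):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    # INTEGER values are stored as signed integers of up to 8 bytes
    print(conn.execute("SELECT 9223372036854775807").fetchone())
    # -> (9223372036854775807,)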


Think you have a few misplaced double negatives.


What I wasn't able to understand is whether this follows the main SQLite development effort or if it is just a frozen branch.

Is there any reason why it is not incorporated into the main branch and activated by compile time flags? Just complexity?


The branch

https://sqlite.org/src/timeline?r=server-process-edition

looks active.

I would hesitate to integrate such a feature into mainline prematurely. If you really need it you can use the branch.


We have been using this page-locking technology for quite a while now and it works incredibly well. Having both read and write concurrency is super powerful and scales fantastically.


This is good news! We are already storing over 7PB of data in sqlite, and have already needed many workarounds for readers/writers.


This sounds really interesting. Can you expand on this installation?


Thanks, and sure!

First off, it is not a single instance, but many sqlite databases, many of which have 20+ TiB of data.

For the last 10+ years I have worked at a backup company and we have gone through some iterations of storage backends, including an in-house system that fell on its face. So far the only thing that has been able to keep up with our demands has been Sqlite, although there have been some hurdles.

Because of how we want isolation between users, having any sort of shared database system really is not an option. And in the early days our users liked the idea of simply copying a user's dataset and sending it some place else.

In any case we tested, or I should say a guy named Brain on our team tested, many different systems and ended up picking Sqlite many years ago. It has stuck and served us very, very well. It was not until we wanted to access the data in different ways that we ran into concurrency issues and made a few weird decisions on how to access the data to avoid some of these problems.

These databases store the volume data for each volume we back up. And because it is a database, we can track the blocks in such a way that allows us to present a consistent view of each backup in time. It has been good enough to be able to export volumes as iSCSI LUNs (via a custom iSCSI server I wrote to interface with the backend), and in turn allow us to virtualize the systems we backed up on the fly, without moving data from one format to another.

Now that I think about it, ~7PB is only half of it; that is just the data managed on our own hardware. We have many other people storing just as much if not more on their systems.

Looking at my original comment it may seem like I implied that we have a single 7PB database, but no, it is 1000s of smaller ones, between 8 and 30TiB in most cases.

Have we had data loss? Yes, but it was mostly due to improperly using sqlite. Except one occasion where we were sure it was sqlite itself, which I am sure Mr. Hipp will refute :p, but I forget the details and a fix was applied quickly.

Sqlite is nice, and I would recommend it to anybody for just about any use, even uses you might not have considered for an embedded database.


Are you putting binary blobs in the sqlite databases directly?


Yes, the block data is directly inserted into the database.

This has worked fine, but I don't think it is optimal. I think we might get some performance increases if we just used sqlite for keeping block offsets into a flat file. But then there are a bunch of other problems to solve.

Data is compressed, so we would have to handle free-hole allocation for variable-sized data, or simply waste free space. Not that sqlite does any better with those things, just things we would have to solve in general.
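
The offset-index idea would look something like this (just a sketch, all names made up; the flat-file side is where the problems above live):

    import sqlite3

    conn = sqlite3.connect("blocks.db")  # hypothetical index database
    # SQLite only records where each compressed block lives in a separate
    # flat file; the block bytes themselves never enter the database.
    conn.execute("""CREATE TABLE IF NOT EXISTS block_index (
        block_id    INTEGER PRIMARY KEY,
        file_offset INTEGER NOT NULL,
        length      INTEGER NOT NULL
    )""")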


I don't understand why SQLite would be used instead of writing a data structure. To me going through SQL (& ORMs) is a hindrance.

Why use a database if you only have one process?


Transactions. It's relatively easy to atomically edit random data in a file on disk, if you're willing to overwrite the entire file with your new version. But as soon as you need to edit files that are too big for that approach, the problem becomes exceptionally complicated.

Indexing any columns you want in one line of code is also very nice.
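
Both of those are nearly free with SQLite. A minimal sketch via Python's sqlite3 module (schema and file name are made up):

    import sqlite3

    conn = sqlite3.connect("app.db")  # hypothetical file
    conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
    with conn:  # one transaction: commits on success, rolls back on exception
        conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", ("a", "1"))
        conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", ("b", "2"))
    # ...and the "index any column in one line" part:
    conn.execute("CREATE INDEX IF NOT EXISTS kv_by_v ON kv(v)")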


Comparable questions are why use a library when you can just write the code yourself, or why use a high level language when you can write it in assembly. SQLite gives you a lot of power, having already solved a lot of the problems and with a lot more testing, that you then don't have to. The price you pay for that is using their API, which, in this case, is SQL. Using an ORM is a separate choice. You may choose differently than others on these decisions, but using SQLite is certainly a reasonable decision that many people find compelling.

It's also useful to point out that SQLite can have many reader processes. The single-process limit is only on writes.


SQLite is a data structure. The process of finding things on disk and atomically updating them is not as simple as it sounds and is about 90% of the work of a database.


To add to the arguments already made: You don't have to figure out serialization. In particular, writing out structs in C and C++ can break even between compiler versions if padding changes.
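
You can see the padding problem without touching a compiler (a sketch using Python's struct module):

    import struct

    # Native layout ("@"): the char is followed by platform-dependent padding
    # so the int lands on its alignment boundary, typically 8 bytes in total.
    print(struct.calcsize("@ci"))
    # Standard little-endian layout ("<"): no padding, always 5 bytes.
    print(struct.calcsize("<ci"))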


That's fine when you don't need complex queries or your data is small. As complexity increases, you will eventually end up with a buggy, slow relational database.


And even then it's better to use a standard solution that other people will know too, if nothing else to make maintenance easier.


> The collision rate (percentage of attempted transactions that failed due to a page-level locking conflict) in all tests was between 1 and 2%. Failed transactions are not included in the TPS counts below.

I read this as: retry logic required. That's a bit onerous IMO. Wrapping a single writer in a mutex gives near rw=1 performance in the cases I tested.
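
For comparison, the retry loop you'd need looks roughly like this (a sketch; in Python's sqlite3 module a lock conflict surfaces as OperationalError, and the backoff numbers are invented):

    import random
    import sqlite3
    import time

    def run_with_retry(conn, fn, attempts=5):
        for i in range(attempts):
            try:
                with conn:  # BEGIN ... COMMIT, rolled back on exception
                    return fn(conn)
            except sqlite3.OperationalError:
                # Likely a page-lock collision: back off with jitter, retry.
                time.sleep(random.uniform(0, 0.01 * 2 ** i))
        raise RuntimeError("transaction failed after %d attempts" % attempts)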


Can someone explain a bit more?

(a) Does this mean that there will be an sqlite version that can be deployed as a process accessible over network in servers?

(b) Does this mean that sqlite will now have the capability to do concurrent reads/writes, but that it is up to users to implement a process that can take advantage of this and create something like (a)?


> (a) Does this mean that there will be an sqlite version that can be deployed as a process accessible over network in servers?

No.

> (b) Does this mean that sqlite will now have the capability to do concurrent reads/writes, but that it is up to users to implement a process that can take advantage of this and create something like (a)?

Yes... this allows a multi-threaded application, for example, to have multiple readers/writers without blocking each other.


... as long as those threads don't touch the same page as each other. Page locks <> row locks.


Depends on the width of your rows and the relation of access patterns to physical row allocation.


For some reason, I had the idea that this was always the case with SQLite, at least for the last 10 years or so.


In WAL mode, readers don't block writers and vice versa, but there can still only be one writer active at any given time.
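
Enabling it is a single pragma (a sketch; the file name is made up):

    import sqlite3

    conn = sqlite3.connect("app.db")
    # WAL mode: readers and the (single) writer no longer block each other.
    conn.execute("PRAGMA journal_mode=WAL")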


This provides (b), not (a).


Amazing!! Can't wait for this to be in the main branch.


How does this compare with https://github.com/bloomberg/comdb2, which adds stored procedures using the Lua scripting language, and decimal types?


In what ways should they be compared?

This appears to be a means to run Sqlite in a server environment with multiple processes hitting it.

What you linked is just some Lua-based extension that has little to do with Sqlite performance on certain types of infrastructure.


Comdb2 is a database system that uses SQLite as a query planner. It also allows Lua stored procedures. It has a different backend from SQLite with different capabilities.


"All clients are in the same process", so you only get multiple processes if you roll your own IPC (as I understand it).



