Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I don't really work with SQLite too much (I mainly work with mySQL and in-memory-structures, like hashes), but here are a few observations that kept me from using it:

1. It uses a lot of IO unnecessarily. (Actually, necessarily, due to their motto, but 99% odf developers aren't aware that this is not necessary) SQLite-based applications often make my hard-drive sqeak like it's on a last spin - SQLite does A LOT of hdd activity (Chrome, Firefox, Adwords editor are notorious for that). It's SQLites motto - 100% consistency first! journal everything!.. but... that's my HDD and I don't want it to be dead because SQLite thinks it's great idea to write every byte it receives to insure me against 0.01% chance of data loss.

2. Head-to-head runs of mySQL vs SQLite in real-world applications - well, SQLite loses badly.. and I mean badly. I use a lot of hash-style databases (i.e. id-value storage, 2 columns only) in some of my tests SQLite lost with 10 times worse performance against mySQL (again, if they didn't try to record everything, like mySQL does with MyISAM - they would've got the chance).

That being said - in-memory SQLite tables are very performant! Very! But... almost nobody uses that. As a result - Chrome/Firefox downloads phising update and my HDD lights like a christmas tree (download Filemon from SysInternals and watch how much Chrome or other SQLite-based engine uses your HDD for simple data that MUST be in-memory... or at least used by in-memory cache, like in MySQL... so your users' HDDs will eventually die because of this). I know it's none of developers troubles to think about the fact that their software unnecessarily uses much IO and leads to early hard drive replacement, but still, that's a big reason why I avoid using SQLite.

I mean, there are a lot of in-memory structures that can (and should) be used, but once the developer sees how convinient SQLite is, he jumps right on it. I mean.. 99% of times SQLite should be used like this - load data from disk to :memory: table (i.e. replicate it to memory), work with :memory: and eventually dump it to drive (like once in a couple of minutes) and dump on "Save" command. But instead every popular program uses disk-based tables. Why? Because it's a lazy way. Who wants to add in-memory pre-caching when you can just open a table from disk in less commands?

It's sad to see the current state of programming. Coming from an area where I had to write programs that would fit into 64KILObytes of memory and 3.5MHZ (that's 1000 times slower than modern single-core processor) - (ZX Spectrum)... and we've had whole huge games in that, working in real-time, with color graphics and almost unlimited galaxies (Elite)... it's just sad to see how the available resources are used these days. People had to use a lot of tricks to make things work with acceptable speeds... And now? I mean it's 3GHZ and gigabytes of data, yet I have to wait 10 seconds for Word (or Firefox) to start up?

My main points are: 1) SQLite is convinient; 2) Disk-based SQLite (which is used by 99% or more pieces of software) is nowhere near the possible performance of in-memory; 3) disk-based SQLite is putting a heck of a load on HDD (fire up FileMon during your Chrome session and be amazed); 4) SQLite is nowhere near one-fits-almost-all solution, like mp3 is; 5) even so it will gain more popularity because it's convenient for developers.

I have MySQL table with 72billion denormalized items (72million rows) and it's in realtime application... and it performs! I can't even imagine what would SQLite be like with that kind of data.

Still, please note that I'm not saying SQLite is bad, I'm saying the default route almost all devleopers take with it has some serious issues and yet nobody seems to care.



Because people find it useful, it seems its easy to forget it was called SQLite for a good reason.

The original purpose was to have lite embeddable sql for apps where it simply did not make sense to have to set up an additional piece of infrastructure (eg mysql). E.g. as a store for app config, as an "app file format", or even for lite qtys of relational data where setting up a mysql would be too much of a hassle.

Thinking SQLite should compete with eg MySQL on large data sets, complex queries, high concurrency etc (ie all the cases where one would normally expect to set up a separate database server), to me just seems way outside of SQLite's intended purpose.

As to HDD crunching; I venture if SQLite is used for the domains it is intended for (eg as app file format), this should not be an issue if developers think about how they're using it.

E.g. if one is in the habit of doing a tmp-write, fsync, atomic rename for every atomic change an app would usually make to its files, then one would expect significant HDD crunching anyway if one is doing this with extreme granularity.

Ie with SQLite used as app backing store, the question should be what the extent of a transaction is, before one calls commit()

For these intended domains, I'm not convinced that in-memory pre-caching for the sake of batching/deferring writes is the way to do it at all, but rather to think in terms of the commit() corresponding to the point at which the app otherwise would want to atomically update its file.


Many good points! I agree on most part.

However, from what I've seen - even while you do a transaction, SQLite seems to write a "journal" file (which is still I/O and still slow).

I'm also not saying mySQL should be used for app file format (that would be pretty dumb), but rather challenging the original article's premise that SQLite (as mp3) is one-fits-all solution. There are many solutions for many situations and SQLite is far from being universal solution.


I don't think anyone is saying that it is one size fits all. To extend the MP3 analogy, SQLite data is very portable, and serverless. MySQL is a server (in my experience), so you need to have that process always running, more analogous to a streaming audio server. I think that makes SQLite a very different kind of beast, more like a file format than a database server.


BTW, not many people realize this, but SQLite has "pragma" commands that allow you to dial back its HDD synchronization and transactional consistency. I have used "pragma synchronous = 0;" many times to greatly improve the performance of my apps.


I second the IO issue - it's alarming to have the HDD always churning when I have Chrome open. Assuming SQLite is the culprit, that's a good reason to think twice about embedding it into an application.


BTW, if you turn off phishing protection - that lowers HDD usage significantly.

(But, reasonably enough, by "turning off phishing protection" you have be more attentive to what you're doing online)


What version of Chrome are you running?


SQLite hits the disk on each transaction. It's up to the app developer to group their operations efficiently. Most don't. Did you make good use of transactions in your tests?

In this respect I feel like SQLite is also the OpenGL of data... not just the MP3. It's up to you to understand the correct methods of using it that result in acceptable performance.


If you have a database that's just hash style, why not stick it in a DBM-style database and get a major performance increase?


The most popular is BerkeleyDB and it's license isn't very friendly for commercial apps. Rest of what I tested are really slow.

Also, most of DBM-style databases suffer from same flaws as disk-based SQLite - they don't use memory cache. I prefer to have slight chance of losing my data, but having 10-100 times faster performance on most queries (RAM vs HDD is often 1:100), rather than 100% consistency and slowness.

Last, but not least, hash-style mySQL is easily distributed and cached.


Have you had a chance to test Tokyo Cabinet?


No, actually this is the first time I hear about it. Thanks for a pointer - I'll look into that.

Although the LGPL license seems like a downer - if I understand it correctly it means that either I should distribute my software as open source or compile that particular library into some kind of external .dll, right?


It's set up as an external system anyway. You'd have to do extra work to hack it in. Otherwise, that is true.


The lib seems impressive. Anything which Python's bindings can do 65K inserts in 110ms deserves a lot of attention!!

Now I just have to attempt to make it's Python bindings compile under Windows :( It's hacking time!

Thanks for the tip!


Download VirtualBox and have the best of both worlds!


sqlite has a page cache but its default settings are a bit weak.

I suspect that what you want to do is open the sqlite database in exclusive mode (so that it grabs and holds a lock over the db, and should be able to assume that cached data has not changed) and increase the size of the page cache from its default of 2000k. See http://sqlite.org/pragma.html (the cache_size and locking_mode pragmas).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: