"[I]t came down to storing lots of binary attachments that could be easily shared with other applications in the CMS. That was the main reason for the move."
Yeah, this is a non-story. A better headline would be "Why a small application dropped Oracle for CouchDB". Compared with the CERN experiments that generate petabytes per year, this is like publicizing Google's technology choice for its employee-list database. Not that ATLAS data goes into an Oracle database, but...
Where are the zealous HN title mods when you need them, eh?
And furthermore, as some other commenters have noted, this is only for a tiny administrative database, not the massive data that's being generated by the projects. If they stored that in CouchDB, then this would be news.
Important note that I'm not seeing in other comments here: they started with both Oracle and CouchDB. This was basically a cleanup in favor of CouchDB, since running two different kinds of DBs is best avoided if possible for simplicity reasons.
Yeah, I'm not going to defend Oracle (or their pricing model), but in this case it sounds like the person who did the migration was just not an Oracle expert. Data Pump is great for database dumps (and very fast), and SecureFiles is basically tailor-made for storing large amounts of binary data. There's also a feature for schema versioning that would have allowed them to seamlessly run multiple versions of the same schema exposed to different sets of clients.
Oracle is vastly more powerful than pretty much any of the open source technologies, but it's eventually going to go away because of scenarios like the one described in the article. Tech folks are starting their careers on OSS technologies, learning on them, getting comfortable, and then when they start working on something like Oracle, they aren't familiar with all of the capabilities under the covers, and migrate off of it to the more familiar platform.
As other comments have noted, direct costs were probably not the driving factor here. One place I worked benefited from great academic pricing on Oracle products, and I was happy to have Oracle as our go-to vendor. But people with the knowledge to leverage the capabilities you mentioned are expensive (which is fine), and don't always go beyond 'being a good DBA' to develop an intuitive feel for the domain (e.g. biology or physics).
> Oracle is vastly more powerful than pretty much any of the open source technologies
Does it have multi-master replication like CouchDB? What about append-only storage of data (so you can do live backup snapshots)? Or a REST-ful interface (so you can use it directly from web clients, via a simple proxy)? A web-based data browser and viewer like Futon?
Because those are very important features I like in Couch.
Now, I am not saying this knowing that Oracle doesn't have those features, as I don't know Oracle much (maybe it does). But it seems to me that your claim that Oracle is a strict superset of all the other database technologies out there is a bit hyperbolic.
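For anyone who hasn't used Couch: "multi-master" there is really just a pair of one-way replications, each started by POSTing a small JSON document to the `_replicate` endpoint. A rough sketch of what that looks like, where the two hosts and the database name are made up for illustration and the requests are only built, not sent:

```python
import json
from urllib.request import Request


def replication_request(server, source, target, continuous=True):
    """Build (but do not send) a POST to CouchDB's /_replicate endpoint."""
    body = json.dumps(
        {"source": source, "target": target, "continuous": continuous}
    )
    return Request(
        url=f"{server}/_replicate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def multi_master(url_a, url_b, db):
    """Two one-way replications, A -> B and B -> A, make a two-master pair."""
    a_db = f"{url_a}/{db}"
    b_db = f"{url_b}/{db}"
    return [
        replication_request(url_a, a_db, b_db),
        replication_request(url_b, b_db, a_db),
    ]


# Hypothetical hosts and database name, for illustration only.
reqs = multi_master(
    "http://couch-a.example:5984",
    "http://couch-b.example:5984",
    "attachments_db",
)
for r in reqs:
    print(r.get_method(), r.full_url, r.data.decode("utf-8"))
```

Passing each request to `urllib.request.urlopen` would actually start the replication; a real setup would also need credentials, which I've left out.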
First off, I'm not saying it's better. It's heavy, complicated, incredibly expensive, the really cool features are even more expensive, but feature-wise, it's an amazing example of what you can do inside of a relational database engine if you have years and years and billions of dollars to spend on R&D.
To answer your questions:
> Does it have multi-master replication like CouchDB?
Yes! It has several kinds of replication. Active/passive, active/passive with the standby available for reading, multi-master, cascading replication (master->standby->secondary standby). You can also do combinations of the above, like master<->master->standby<-master->standby->standby (if you want to get really crazy).
But it's a lot more than that. Synchronous? Sure. Asynchronous? Sure. Semi-synchronous? Yes, I can say that I want to allow the standby database to get up to X minutes out of sync with the primary before I switch over to synchronous and force clients to block until I'm caught up.
Hey, what about file-based replication for items that are not even technically managed by Oracle or inside the database? No problem.
What about failover? Well, Oracle can not only have its clients detect that a database has failed and handle the failover automatically, but you can actually have it fail over and automatically spin up another standby so you don't have a SPOF.
I could go on for pages just on replication scenarios that Oracle supports.
> What about append-only storage of data (so can do live backup snapshots)?
Yes, absolutely, but to be honest, you don't need append-only storage to do live backup snapshots in Oracle. You can do point-in-time consistent backups while the database is serving transactions (it works under the covers similarly to append-only, but the nuances are a little different). Not only can you run backups this way, but you can actually request that your session "see" a view of the database as it was an hour or a day ago. You can also simply say, "return the database to the state it was in as of transaction X or time Y", and the entire database can revert back to that state.
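To make the "as it was an hour ago" part concrete: the per-query form of this is a flashback query, which uses an `AS OF TIMESTAMP` clause. Here's a small sketch that only builds the SQL text. The table name is hypothetical, and how far back you can actually go depends on how much undo the database retains:

```python
import re

# Conservative check for an unquoted identifier, so the table name can be
# placed in the statement text (table names cannot be bind variables).
_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def flashback_query(table, minutes_ago):
    """Return a SELECT that reads `table` as it was `minutes_ago` minutes ago."""
    if not _IDENT.match(table):
        raise ValueError(f"not a simple identifier: {table!r}")
    minutes = int(minutes_ago)
    if minutes <= 0:
        raise ValueError("minutes_ago must be positive")
    return (
        f"SELECT * FROM {table} "
        f"AS OF TIMESTAMP (SYSTIMESTAMP - NUMTODSINTERVAL({minutes}, 'MINUTE'))"
    )


# "attachments" is a made-up table name, for illustration only.
print(flashback_query("attachments", 60))
```

You'd run the resulting statement through whatever Oracle client you already use. Rewinding the whole database is a separate, DBA-level operation and not something this sketch covers.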
> Or a REST-ful interface (so can directly use it from web clients, via a simple proxy)? A web based data browser and viewer like Futon?
Yes, Oracle APEX, which is basically an Oracle front-end application server, but lighter than a "real" J2EE stack, can do both of these things. If you want, there's a fairly simple markup language, similar to ERB, that you can use to write simple Rails-like CRUD applications on top of APEX, but you don't have to if you don't want to.
Again, I'm not saying that Oracle is better, because all of these features come at a complexity cost that is massive (and a technical debt that requires highly skilled specialty technologists). If I were building an application today, I would not base it on Oracle, because for what 90% of the world needs, there are open source databases or data stores that fill the need just fine. But you gotta hand it to Oracle for the sheer amount of technology they've crammed into an RDBMS.
Thanks for responding. I had no idea about these features. Well heck you can say it is better. That would be alright.
I have been living in the open source world and just never had to deal with commercial DBs at this point.
So I agree with your point. A lot of these features are there, but up-and-coming developers might not know about them and will always pick open source choices.
I guess PostgreSQL is the direct competitor at the moment, and I have only heard good things about it, while Oracle is hated by large numbers of developers (for reasons not necessarily related to technical features).
Assuming I understand this correctly (and I could be wrong), I don't think the Data Pump thing would work for this use case. I think it's used for other applications. In this scenario, there needs to be a DB next to each of these small agents, so spinning up an Oracle instance for each isn't really viable. Also, the agents aren't necessarily in labs that have the nice Oracle deal that CERN central has.
I think the title is a little inflammatory, no?