I see. From what I know about Cassandra, this is a much more expensive write than doing it as a new row.
To do this he has to be using dynamic columns, and those are stored as one serialized blob per row. So the more data you have in the row, the more expensive the deserialization/reserialization is with each column you add. For very large series this could be an issue.
But it sounds like this is tolerable for his app because the writes are distributed over time in a predictable fashion.
I am a little surprised though at the author's claim that fetching a single big row results in "huge IO efficiency" over a range of small rows. I'd expect a small amount of overhead, but isn't it more or less the same amount of data being retrieved? What am I missing?
EDIT: I see the author mentioned that it reduces disk seeks because it's all serialized together already. Sort of like you're defragging the series data on every write. I guess that makes sense.
Personally I would probably look at using SSDs and keep the schema more "sane" and have more scalable writes, but that's just me.
1.) You do not have to use dynamic columns for this. Unfortunately, I've found in my own experience that as Cassandra has matured over the last year, a lot of terminology has fallen in and out of fashion, and it's hard to recognize what is actually current. Dynamic columns in CQL3 have nothing to do with the behavior OP is talking about, and dynamic columns are a sort-of deprecated feature in Cassandra 1.2. In CQL3, OP's use pattern is actually hidden if you didn't know any better.
In short, there is no deserialization/reserialization. OP's writes are append-only. I have a similar use pattern to OP, and I haven't seen any performance issues with 100,000s of columns (on SSDs).
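For reference, here is a minimal sketch of what this pattern looks like in CQL3 (the table and column names are made up for illustration, not OP's actual schema). The "wide row" is just a compound primary key, and each write appends a new cell to the partition rather than rewriting it:

    -- One partition per series, one cell per data point.
    CREATE TABLE metrics (
        series_id  text,
        event_time timestamp,
        value      double,
        PRIMARY KEY (series_id, event_time)
    );

    -- Appends a new column to the 'sensor-42' partition; nothing
    -- existing is deserialized or rewritten on the write path.
    INSERT INTO metrics (series_id, event_time, value)
    VALUES ('sensor-42', '2013-04-01 12:00:00', 3.14);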
2.) The "huge IO efficiency" is similar to what you would see in any columnar data store. Wikipedia has a good walkthrough of it (http://en.wikipedia.org/wiki/Column-oriented_DBMS). The short story is now there is fewer meta data between his values.
--
In any case, it works out because Cassandra is far better suited to this type of use pattern than Mongo is. We migrated from MongoDB (on SSDs) to Cassandra for similar reasons. The perf killer on Mongo in this scenario is the write lock.