
I read: "We evaluated a lot of things: a custom mysql impl, voldemort, hbase, mongodb, memcachdb, hypertable, and others"

I wonder if Redis was among the others. If so (and if somebody at Twitter is listening), I would love to know what you think the biggest weaknesses in Redis are, and whether in the short term it is more important to have a fault-tolerant cluster implementation (the redis-cluster project) or virtual memory to support datasets bigger than RAM on a single instance. Thanks in advance for any reply.



Redis requires that you have enough RAM to hold your entire dataset in memory... which seems impossible for something as large as Twitter.


As I read Antirez's comment, he was asking about precisely this. (Specifically, how high a priority removing this limitation should be, relative to other important stuff he's presumably working on.)


Even if you overestimate ridiculously, it seems perfectly reasonable to keep the entire dataset in RAM: 1 TRILLION tweets at 500 bytes each is a dataset of less than half a petabyte. A quick search shows I can get 4 GB of RAM for $66 USD. Assuming no redundancy, no bulk discount, and all other hardware being free, that is a cost of $8M or so (about half of their last round of funding, IIRC).

Now consider that you don't need to keep non-recent tweets in RAM, that bulk buyers can get it significantly cheaper than individuals, and that the dataset is far smaller than that, and throwing hardware at the problem seems far less impossible. I'd imagine they could keep the last month in RAM trivially.
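
As a sanity check on the arithmetic above, here is a tiny sketch; the one trillion tweets, 500 bytes per tweet, and $66 per 4 GB of RAM are the assumptions from this comment, not measured figures:

    # Back-of-the-envelope check of the RAM-cost estimate above (Python).
    # All figures are the assumptions from the comment, not real measurements.
    tweets = 1_000_000_000_000      # 1 trillion tweets (a deliberate overestimate)
    bytes_per_tweet = 500           # assumed average size per tweet
    usd_per_4gb = 66                # assumed retail price of a 4 GB module

    total_gb = tweets * bytes_per_tweet / 1e9    # ~500,000 GB, i.e. ~0.5 PB
    cost_usd = (total_gb / 4) * usd_per_4gb      # ~$8.25M, ignoring redundancy

    print("dataset size: %.0f TB" % (total_gb / 1e3))
    print("RAM cost:     $%.2f million" % (cost_usd / 1e6))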


Does Redis really require the entire dataset to be in RAM, or does it just require enough virtual memory to hold the dataset? In other words, couldn't you just let it swap?


From their FAQ:

Do you plan to implement Virtual Memory in Redis? Why not just let the Operating System handle it for you?

Yes: in order to support datasets bigger than RAM, the plan is to implement transparent Virtual Memory in Redis, that is, the ability to transfer large values associated with rarely used keys to disk, and to reload them transparently into memory when these values are requested in some way.

So you may ask why we don't let the operating system's VM do the work for us. There are two main reasons. First, in Redis even a large value stored at a given key, for instance a list of 1 million elements, is not allocated in a contiguous piece of memory. It's actually very fragmented, since Redis uses quite aggressive object sharing and reuse of allocated Redis Object structures.

So you can imagine a memory layout composed of 4096-byte pages that actually contain different parts of different large values. Moreover, many values that are large enough for us to swap out to disk, like a 1024-byte value, are just one quarter the size of a memory page, and the same page likely also holds other values that are not rarely used. So such a value will never be swapped out by the operating system. This is the first reason for implementing application-level virtual memory in Redis.

There is another reason, as important as the first. A complex object in memory, like a list or a set, is something like 10 times bigger than the same object serialized on disk. You have probably already noticed how much smaller Redis snapshots on disk are compared to Redis's memory usage for the same objects. This happens because data in memory is full of pointers, reference counters and other metadata. Add to this malloc fragmentation and the need to return word-aligned chunks of memory, and you have a clear picture of what happens. Relying on OS swapping would therefore mean roughly 10 times more I/O between memory and disk than is otherwise needed.
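
To make that size gap concrete, here is a rough sketch (assuming a local Redis server and the redis-py client; the key name and element count are arbitrary) that compares Redis's reported memory usage for a large list against the raw size of its elements:

    # Rough illustration of the in-memory vs. serialized size gap described
    # in the FAQ answer above. Assumes a local Redis server and redis-py.
    import redis

    r = redis.Redis(host='localhost', port=6379)

    key = 'demo:biglist'   # hypothetical key, used only for this test
    n = 100000

    r.delete(key)
    before = r.info()['used_memory']

    pipe = r.pipeline(transaction=False)
    for i in range(n):
        pipe.rpush(key, str(i))
    pipe.execute()

    in_memory = r.info()['used_memory'] - before
    raw_bytes = sum(len(str(i)) for i in range(n))   # bare payload size only

    print('in-memory footprint: ~%d bytes' % in_memory)
    print('raw element bytes:   ~%d bytes' % raw_bytes)

The in-memory number typically comes out several times larger than the raw payload, for exactly the pointer/metadata/fragmentation reasons the FAQ describes.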


Maybe, but I did some math and the Twitter dataset should not be that big... The estimates I posted some time ago on Twitter were completely wrong. I hope to be able to post some more data later today.

I think a few big boxes would be enough to hold the whole Twitter dataset in memory, and if you ask me, given the continuous scalability problems Twitter has experienced throughout its history, maybe this is something they should consider seriously (Redis aside).


Cassandra runs on a cluster, so "the dataset" may be more than 10 TB, which could be more than the disk space on one machine.


If Google has multiple copies of the entire web in RAM (as has been reported in many places), Twitter should definitely be able to hold 140-character tweets in RAM.


I would probably say the large memory footprint of Redis, especially on 64-bit versus 32-bit builds.


We'd love to be able to store effectively unlimited amounts of data in Redis -- that would let us use it at Posterous in ways beyond how we use it now, which is mainly analytics and job queueing (via Resque).


Thanks for the comment, rantfoil. The more I talk to people about this issue, the more I think virtual memory is the first thing to do. I'll start working on VM this Xmas; I really hope to have it pretty solid within three months.


I'm not at Twitter, but what's the tuning parameter to specify that every x seconds data goes to disk?


Hello, it's documented in the comments of the default redis.conf; the parameter is called "save": http://github.com/antirez/redis/raw/master/redis.conf
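
For reference, the directive in that file looks like this (these are the thresholds shipped as defaults around this time; exact values may differ between Redis versions):

    # Snapshot the dataset to disk if at least <changes> keys changed
    # within <seconds> seconds:
    #   save <seconds> <changes>
    save 900 1
    save 300 10
    save 60 10000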



