Nice. But 10 million tweets? That's only a few days' worth. What's the point?

symptic · on Dec 22, 2008

The point is to be a sort of Google algorithm for Twitter. This is plenty of data to at least get a very solid idea of who the top Tweeters are based on their connections, influence, and popularity.

Also, keep in mind he scraped the TOP Twitter users (those with X+ followers). A lot of Twitter's tweets likely come from users under that threshold, so leaving them out saves time, storage space, and effort.

InfochimpsFlip · on Dec 23, 2008

There's another batch coming of tweets off the data mining feed. But yeah: the focus here was on the graph structure more than the text. We're also hoping someone pipes up with "oh gee I have 750m tweets archived do you think anyone else wants to look at them?"

petercooper · on Dec 23, 2008

I downloaded it, and the tweets date back to 2006. I guess users who only posted a few times have all their old tweets indexed, whereas those with many tweets only have the latest ones in there (i.e. a maximum of X tweets collected per user).