Gravatars: why publishing your email's hash is not a good idea

natrius · on Dec 16, 2009

Spam hasn't been a problem for me for quite some time. I stopped obfuscating my email address years ago.

Batsu · on Dec 16, 2009

I think that those of us who actually use email for communication and check it several times a day are simply numb to the pain.

At this point, I think I could spot a piece of spam tip-toeing through the brush at 300 yards.

gojomo · on Dec 16, 2009

What works for you? How often and how closely do you check your spam folder(s) for false positives?

dflock · on Dec 16, 2009

I use Gmail to aggregate all my mailboxes. Gmail's spam folder keeps things for 30 days before deleting them. I've currently got 2965 spam emails in the spam folder - which is a little on the low side of my average. So, I get about 3000 spam emails in 30 days, or about 100 per day. Gmail's spam filtering is almost 100% effective, for me. I see maybe one, occasionally two, spam emails in my inbox per month, and have had maybe 3 false positives over the last couple of years. I very occasionally glance at the contents of the first page of the spam folder, just to be on the safe side, but like I said, I almost never find anything mis-filtered. For me this is good enough - I'm effectively shielded from all spam and don't really notice it as an issue.

rimantas · on Dec 16, 2009

Exactly. I "outsourced" my spam management to GMail and I don't think I ever tried to obfuscate my email addresses; they are all over the net :) My position is that one has to deal with spam by means invisible to end users, and that rules out CAPTCHA and email obfuscation.

CrLf · on Dec 16, 2009

Never.

I used to quickly look through my spam folder in GMail and then flush it whenever it went above a certain amount of messages. Now I don't have that compulsion anymore, since there is an option to not display the unread count.

I just let the spam folder alone. If there is any false positive, and if it is important enough, the sender will most likely send another message and the probability of multiple false positives is lower.

novum · on Dec 16, 2009

I read over every single spam (or at least the subject and first few lines of the body, via gmail) before deleting.

Yes, it's a lot. And yes, as Batsu said, I'm numb to the pain.

nomoresecrets · on Dec 16, 2009

When I check my GMail spam folder, I get spams at about the rate of one per minute these days.

Filtering those manually became unworkable for me years ago.

I maybe see a spam in my inbox once a day. It goes up and down, as spammers find workarounds and then Google fix them.

The email address I use is 10+ years old though.

gojomo · on Dec 16, 2009

I would consider that level of effort (and numbness) a "problem"; I'm especially curious how "no problem" natrius deals.

wlievens · on Dec 16, 2009

How many do you get per day? I get about fifty.

jrockway · on Dec 16, 2009

So? Sending email is free; you can just send the email without knowing if the address is correct or not. Change the "| md5sum" to "| xargs mail" and you don't need the Gravatar hash anymore.

mseebach · on Dec 16, 2009

It's not for spam, it's for privacy. Entering your email in a field that says "will not be published", and then having the site generate an avatar from it, actually partly publishes your email address, which is bad for privacy.

Say you suspect that Alice on Stack Overflow, who bad-mouths software vendor "HAL" while claiming impartiality, is really Eve, who works for competing software vendor "Moon". You might be able to confirm that by using your knowledge of moon.com e-mail addresses and checking the hash of eve@moon.com, causing SO to breach the privacy Alice/Eve expected when signing up.
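The check described above can be sketched in a few lines of Python. This is only an illustration: it assumes the published hash is the MD5 hex digest of the trimmed, lowercased address (which is how Gravatar documents it), and the `published` value here is computed locally as a stand-in for a hash scraped from an avatar URL.

```python
import hashlib


def gravatar_hash(email: str) -> str:
    """MD5 hex digest of the trimmed, lowercased e-mail address."""
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def matches(published_hash: str, guess: str) -> bool:
    """True if the guessed address hashes to the published value."""
    return gravatar_hash(guess) == published_hash.lower()


# Stand-in for a hash taken from Alice's avatar URL (hypothetical data).
published = gravatar_hash("eve@moon.com")

print(matches(published, " Eve@Moon.com "))   # True
print(matches(published, "alice@moon.com"))   # False
```

A single guess costs one hash computation, so confirming a suspicion is far cheaper than discovering an unknown address.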

youngian · on Dec 16, 2009

Either Alice on stack overflow is tying her profile there to her real-world identity, or she shouldn't be signed up for Gravatar. The whole point of Gravatar is to persist a single identity across multiple sites, so I cannot imagine why you would tie a secret or fake identity to a real identity (your email) that you did not want associated with it.

Technically, yes, this article is correct. The Gravatar FAQ even discusses this issue, IIRC. But in practical usage, I can't imagine how this would prove important.

rogeriopvl · on Dec 16, 2009

If I know that someone is called John Doe, the result space is almost the same with or without Gravatar's hash. It's nothing a brute force attack can't handle.

sysk · on Dec 16, 2009

You beat me to it.

_l4lu · on Dec 16, 2009

An idea for a quick fix: use

  my.address+really_long_random_string@famousprovider.com

as your Gravatar e-mail. Will work if famousprovider==gmail at least.
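A rough sketch of generating such an address, assuming the provider supports "+" sub-addressing as Gmail does. The function name and the tag length are made up for illustration.

```python
import secrets


def tagged_address(local: str, domain: str, nbytes: int = 16) -> str:
    """Build local+<random hex tag>@domain; the tag is 2 * nbytes hex chars."""
    tag = secrets.token_hex(nbytes)
    return f"{local}+{tag}@{domain}"


address = tagged_address("my.address", "gmail.com")
print(address)
```

The tag has to be stored somewhere, since the same address must be reused on every site where the same avatar should appear.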

ChadB · on Dec 16, 2009

The big problem with that is that you'd have to use the same

  really_long_random_string

to register for every gravatar-enabled site you want to register for.

The only benefit I can see of this is that you could easily blacklist emails destined for that address.

pyre · on Dec 16, 2009

Doesn't have to be a really long and random string. Just has to be something that can't be easily deduced from your name and/or username. This might not defeat rainbow tables, but I think there are probably easier ways to harvest email addresses than to generate rainbow tables and troll for gravatar URLs.

blasdel · on Dec 16, 2009

Does that actually work with Gravatar and the blogapp plugins that call it?

A ton of people validate addresses using Regular Expressions but don't have the fortitude to copy-paste the monster regex needed to do it correctly.

westi · on Dec 16, 2009

This will work fine with gravatar.

You just have to associate the + address with your account - you can even give it a different image so you could have a different identity on every site without needing multiple email addresses.

ars · on Dec 16, 2009

It would also have to be your email on every service you register with, since they use your email automatically to get a gravatar.

pyre · on Dec 16, 2009

Why is that an issue? It defeats this attack if it is 'username+12345acbdefg@gmail.com', right? Unless we really think that people are going to get into rainbow tables just to extract some email addresses.

Aschwin · on Dec 16, 2009

I don't get it. Why doesn't Gravatar use a salt to generate the checksum? The salt would only be known to Gravatar, and the hash could not be brute-forced by anyone who doesn't know the salt.

- Unomi -

petsos · on Dec 16, 2009

Because then the sites would not be able to generate the hash themselves.

mattwdelong · on Dec 16, 2009

A simple solution would be to assign each website its own unique ID, then based on that UID assign a private/secure salt known only to the website + Gravatar. Then when the websites generate hashes, they can append the salt and generate the hash. Simple solution and seemingly easy to implement... though it's probably not. (I suggest this in the event that it really is a big issue. However, I am of the opinion that it's not.)
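As a sketch of that proposal only: this is not how Gravatar works, and the function name and salt values below are hypothetical. The salt is appended to the normalized address before hashing, as the comment suggests.

```python
import hashlib


def salted_avatar_hash(email: str, site_salt: str) -> str:
    """Hash the normalized address with a per-site secret salt appended.

    Hypothetical scheme: the salt would be shared only by the site and
    the avatar service.
    """
    normalized = email.strip().lower()
    return hashlib.md5((normalized + site_salt).encode("utf-8")).hexdigest()


# Made-up salts for two different sites.
hash_on_site_a = salted_avatar_hash("eve@moon.com", "salt-for-site-a")
hash_on_site_b = salted_avatar_hash("eve@moon.com", "salt-for-site-b")

print(hash_on_site_a != hash_on_site_b)  # True
```

Under this scheme the same address yields a different hash on each site, so a guess cannot be verified without the salt, and hashes cannot be compared across sites. The avatar service would also need to know which site generated each hash.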

lincolnq · on Dec 16, 2009

That would require a bit more infrastructure on Gravatar's part (a UI for registering a site), and would significantly slow its adoption because site owners would have to go register their site before they could generate gravatar URLs. I used Gravatar for my site because it was so easy to implement, and I probably wouldn't have bothered if it was an annoying process like this.

mbreese · on Dec 16, 2009

Ding Ding Ding! We have a winner!

When I first read the article, I thought the same thing. If each site had its own salt key (probably based upon an id), then this problem goes away.

It could get even easier if the domain name captured via the http referrer was used as the salt, then Gravatar.com wouldn't even need a UI to let sites sign up. This might make it a little more difficult for a site operator though, so ideally this would be a configurable option.

nostrademons · on Dec 16, 2009

This is basically a dictionary attack on your e-mail address. There's an easy solution: don't use an easily guessable e-mail address.
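A minimal sketch of such a dictionary attack, assuming the attacker knows the person's name and tries a handful of common local-part patterns against a list of providers. The patterns and the provider list are illustrative assumptions, not taken from the article.

```python
import hashlib
from itertools import product


def gravatar_hash(email: str) -> str:
    """MD5 hex digest of the trimmed, lowercased e-mail address."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def candidate_addresses(first: str, last: str, providers: list[str]):
    """Yield guessable addresses built from a first and last name."""
    first, last = first.lower(), last.lower()
    local_parts = [
        f"{first}{last}",
        f"{first}.{last}",
        f"{first}_{last}",
        f"{first[0]}{last}",
        f"{last}{first}",
        first,
        last,
    ]
    for local, provider in product(local_parts, providers):
        yield f"{local}@{provider}"


def recover(target_hash: str, first: str, last: str, providers: list[str]):
    """Return the first candidate whose hash matches, or None."""
    for guess in candidate_addresses(first, last, providers):
        if gravatar_hash(guess) == target_hash.lower():
            return guess
    return None


# Hypothetical target: a hash that would normally come from an avatar URL.
target = gravatar_hash("john.doe@example.com")
print(recover(target, "John", "Doe", ["example.org", "example.com"]))
```

An address that follows none of the guessed patterns is simply not found, which is the point of the advice above.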

If you do use an easily guessable e-mail address, Google has been pretty darn good at finding e-mail addresses for years. I could Google various permutations of a username and famous e-mail providers in far less time than it takes to write a Haskell program. (It'd be even easier if Google indexed the @ token...hmm, I wonder if I should be codesearching for e-mail addresses instead.)

And if all you want is a list of e-mail addresses, you could just run a crawler and a few regexps, and you'll find way more than 8500 in a few hours.

tjogin · on Dec 16, 2009

Nothing will stop people from simply emailing that address without checking/guessing if it exists first. The whole subject of this post is an exercise in futility.

IgorPartola · on Dec 16, 2009

The goal here is not to find out if an e-mail address is valid. It's to find out if the guy that leaked inside information on an Apple fansite is an Apple employee. It's an issue of privacy, not spam.

ryanelkins · on Dec 16, 2009

Maybe if the apple employee doesn't want to get caught he shouldn't use his apple.com email address that has his real photo as his gravatar.

"Who leaked this info?" "I don't know but it's someone who looks just like Bill! They even use the same gravatar image he uses everywhere else! Devious bastards! Let's see if we can reverse engineer this email hash to find out who this rogue might be!"

IgorPartola · on Dec 16, 2009

Well of course, my example is an exaggeration, but the idea is that this exposes the poster without them knowing that they are exposed. At the same time, so do a million other things like your IP address in Google's logs on the way to the comment you made on that blog. Let's face it: there's no privacy on the Internet. There's just varying degrees of completely exposed to sort of in the shadows.

ryanelkins · on Dec 16, 2009

Well my point was that this isn't really a problem caused by hashed email addresses as much as it is a problem with using email addresses in general. Even if Gravatar used something to completely hide your email address - if you use the same email address in different places your gravatar will be the same and you've compromised yourself, regardless of how they stored them.

tjogin · on Dec 16, 2009

Oh, right.

blasdel · on Dec 16, 2009

No matter how inept your accidental data leakage, you can usually rely on Dave Winer to have done a worse job earlier: http://www.metafilter.com/8140/

warp · on Dec 16, 2009

I also did some experiments and wrote about this back in april, for anyone interested:

http://320x200.org/post.py/2009/gravatar.txt

walesmd · on Dec 16, 2009

So, the argument is because the hash is displayed it can be reversed by subsequently hashing every known combination of characters with every known mail provider?

I'll give a shiny nickel to the first person that can reverse this hash: 9b60573dba6b13029b245bbdf7d01323

Moral: Not everyone uses some variation on their name nor do they use the most common email providers.

bartl · on Dec 16, 2009

Is this guy overly paranoid, or am I overly lax, that I don't care?

My email address is all over the internet. So what.

jwr · on Dec 16, 2009

I don't see what the problem is. My E-mail address is a public piece of information.

warp · on Dec 16, 2009

It also allows you to match an e-mail address to a particular account. So, let's say a dating site uses gravatars, now anyone who can see those can match my dating profile to my stackoverflow account (and any other site using gravatars).

I know this of course, and will use a different e-mail address if I need to keep these identities separate, but many people do not realize that any gravatar-enabled user profile can be linked to any other gravatar-enabled user profile with only a tiny bit of effort in harvesting.
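To illustrate how little effort that linking takes, here is a sketch that groups harvested profiles by avatar hash. The site names, usernames, and hashes are invented sample data, and the harvesting step itself is left out.

```python
from collections import defaultdict


def link_profiles(scraped):
    """Group (site, username, avatar_hash) records by hash.

    Returns only hashes that appear on more than one distinct site.
    """
    by_hash = defaultdict(list)
    for site, username, avatar_hash in scraped:
        by_hash[avatar_hash.lower()].append((site, username))
    return {
        h: profiles
        for h, profiles in by_hash.items()
        if len({site for site, _ in profiles}) > 1
    }


# Invented records standing in for harvested avatar URLs.
records = [
    ("qa-site.example", "alice_dev", "0123456789abcdef0123456789abcdef"),
    ("dating.example", "sunny42", "0123456789ABCDEF0123456789ABCDEF"),
    ("qa-site.example", "bob", "ffffffffffffffffffffffffffffffff"),
]

linked = link_profiles(records)
print(linked)
```

No hash needs to be reversed for this: two profiles sharing a hash share an e-mail address, whatever that address is.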

ryanelkins · on Dec 16, 2009

Uh, can't they also link those accounts by noticing that they are the exact same picture? I mean, I thought that was partially the point.

warp · on Dec 17, 2009

The problem is they can match these profiles even if you haven't signed up for gravatar at all. Your email hash is always sent to gravatar.com.

IgorPartola · on Dec 16, 2009

Just host your own avatar and OpenID service. That's what I do. Of course a bunch of my info is listed in the whois for the domain so this is moot anyways.

Gmo · on Dec 16, 2009

Off-topic, but what solution do you use for your OpenID service?

IgorPartola · on Dec 16, 2009

http://siege.org/projects/phpMyID. Simple setup, works nicely and like I said, it supports pavatars.

Gmo · on Dec 17, 2009

Thanks

andrewcooke · on Dec 16, 2009

it's like reading a post from a mirror world. in my world i make it easy to find my email address by googling my name - it's a link on the top result. why? because then people can send email. as a result they do things like offer me jobs and pay me. why would i want to stop people knowing who i am or what my email address is?

(fwiw i currently use gmail for spam filtering, but have also had success with spamassassin)