
I should preface this by saying that I love Project Euler--I spent a ton of time there while learning to program. I also am impressed by anyone who volunteers to create something for the community, and invests effort in maintaining it.

However, not storing emails, and thereby giving up account recovery with the explanation that it's about security is a shit sandwich.

My email is <myfirstname>.<mylastname>@gmail.com, a pattern I share with millions of people. This is public information. I could spray paint my email address on local bridges without in any way making my email less secure (cops might complain, though).

I understand that some people have reasons to have private email addresses that they don't want released (they'll give them to family, but not the general public). They should never sign up for anything with those email addresses, because the moment you sign up for things, you will almost certainly be entered in a database somewhere, and eventually be spammed or subjected to whatever other bad consequences you're concerned about.

Account recovery is a basic feature of a website (except those that contain data too sensitive to have account recovery), and they're giving it up for phantom security.



It's one thing to decide not to store emails (sure, why not?) but account recovery shouldn't even require one to store email addresses.

Check the email provided by the user via the recovery form against a hash of the email saved during registration; if it matches, send the reset link. This way, when data is breached, figuring out what the original email was should be hard (if not impossibly hard, depending on how they hash it).
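
A minimal Python sketch of that idea, assuming a bare SHA-256 digest purely for illustration. The function names and the lowercase normalization are hypothetical, not anything Project Euler is known to do:

```python
import hashlib
import hmac


def email_digest(email: str) -> str:
    """Return a hex SHA-256 digest of the normalized email address."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def may_send_reset(submitted_email: str, stored_digest: str) -> bool:
    """True if the submitted address matches the digest saved at registration."""
    return hmac.compare_digest(email_digest(submitted_email), stored_digest)


# At registration only the digest is stored, never the address itself.
stored = email_digest("foo@example.com")

# At recovery the user types the address again; it is used once to send the link.
print(may_send_reset("Foo@Example.com ", stored))  # True
print(may_send_reset("bar@example.com", stored))   # False
```

A bare digest like this can be attacked by hashing lists of known addresses, so it only shows the data flow, not a recommended construction.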

Am I missing something here?


You should not simply use a hash, but at least a salted hash, or even harder stuff like bcrypt. In other words, treat emails like passwords.

Apart from that, I don't see any issues with that approach. Not sure why Project Euler doesn't use it.


A salted hash would completely eliminate any ability to look up accounts by email address, since you would have to hash the email against the salt for every account in the database until you landed on the correct one.


You could have one of two approaches. 1. Don't use a per user salt and just use a global salt. You can counteract this decrease (a bit) in security by increasing the key stretching part of your hashing algorithm. or 2. Require the user to submit their email address AND username and store the salt in the user's record.
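
A rough Python sketch of approach 2, assuming PBKDF2-HMAC-SHA256 as the stretching function and an in-memory dict standing in for the user table. The iteration count, the `users` structure, and the function names are all illustrative assumptions:

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def stretch(email: str, salt: bytes) -> bytes:
    """PBKDF2-HMAC-SHA256 over the normalized email with the given salt."""
    normalized = email.strip().lower().encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", normalized, salt, ITERATIONS)


# Hypothetical "users table": username -> (salt, email_hash).
users = {}


def register(username: str, email: str) -> None:
    salt = os.urandom(16)  # per-user random salt, stored in the user's record
    users[username] = (salt, stretch(email, salt))


def verify_recovery(username: str, email: str) -> bool:
    """The username locates the salt, so only one hash is computed per request."""
    record = users.get(username)
    if record is None:
        return False
    salt, stored = record
    return hmac.compare_digest(stretch(email, salt), stored)


register("Arnor", "foo@example.com")
print(verify_recovery("Arnor", "foo@example.com"))    # True
print(verify_recovery("Arnor", "other@example.com"))  # False
```

The trade-off is that the user must remember the username; the email alone can no longer locate the account.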

I agree with someone up there that email address != password. It's refreshing to see someone that gives a crap about my privacy though.


I had a visceral reaction to "global salt" (I've always heard this called: "pepper") because it's so insecure for passwords, but I guess it's not as bad for email addresses. In the case of passwords, we find that a global salt is fairly ineffective because too many people use common (stupid) passwords like "password." If 10% of the hashes are the same, you can probably figure out what that hash means pretty quickly. We don't have this overlap problem with emails so it's less scary.

Still, simply storing the create time or a randomly generated salt right on the user table is more secure than using a global salt.


A global salt is not a thing. Nor is a pepper. The actual cryptographic construct is an HMAC and a secret HMAC key.


Right but in this case you would essentially have to use every salt from every user to hash the submitted email and say "aha user 123,456 has a salt and hashed email which matches the submitted email of foo@example.com" which is why I suggested method 2. I am user "Arnor" and my email is "foo@example.com". Ok, your salt is #@$%@#$%DGDFfdgdawer.

If your hashing algorithm is appropriately "expensive", the "scan all user salts" approach would not work.


A global salt is called 'pepper'.


This thread is dishearteningly full of cargo-culted cryptography. You would think HN would do better. A global salt is not a thing. It's a misapplication of a cryptographic component. The appropriate tool when you think you need something like this is an HMAC and a secret HMAC key.
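
For what it's worth, a small Python sketch of what the HMAC suggestion could look like for email lookup. The key literal, the `accounts_by_tag` index, and the normalization are placeholders assumed for illustration:

```python
import hashlib
import hmac

# Assumption: in practice the key would be loaded from a secrets store kept
# outside the database; this literal is only a placeholder.
SECRET_KEY = b"replace-with-32-random-bytes-kept-out-of-the-db"


def email_tag(email: str) -> str:
    """Keyed, deterministic tag for an email address (HMAC-SHA256)."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(SECRET_KEY, normalized, hashlib.sha256).hexdigest()


# Hypothetical index: tag -> username. Because the tag is deterministic,
# a lookup by email is a single dictionary (or indexed column) access.
accounts_by_tag = {email_tag("foo@example.com"): "Arnor"}

print(accounts_by_tag.get(email_tag("foo@example.com")))  # Arnor
print(accounts_by_tag.get(email_tag("bar@example.com")))  # None
```

The protection rests on the key staying secret: someone who dumps only the database cannot test guessed addresses without it.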


Once again, we are talking about having an email retrieval function while not storing the email address in plaintext. A "global salt" or "pepper" (as we are apparently calling it now) just prevents enormous pre-generated rainbow tables, but admittedly, in today's GPU-dominated cracking environment, it probably doesn't get you much.

Personally I think not having a password retrieval function while simultaneously forcing all of your users to reset their password is a pretty user unfriendly tactic for the protection of an ostensibly public piece of information.


I wasn't saying it was a good idea, just that the majority of people using one refer to it as a pepper.


.. can't you just use a few bits from hashing the original email address (via a second algorithm) as the salt?


Now you just have a different, slightly more complicated hashing function.


Ah, indeed.


Hello? Let's say you have 5 billion accounts, in salted sha1. According to openssl speed sha1, you'd take less than 38 minutes on my ancient desktop to look up an email, on average half that. If the shoe fits, send the email, if not don't?

Sure having another field to match on (eg: username) for locating the correct salt would be good -- but it's certainly not infeasible to do a brute force search (probably want to queue up password reset requests, though). Now, if you went with bcrypt or scrypt -- things would, by design, break down a bit. I still think you'd be able to send a reset mail within 24 hours for most reasonable configurations and number of users...
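
The brute-force search could look roughly like the sketch below, assuming records of the form (username, salt, hash) and the salted SHA-1 mentioned above. The record layout and function names are made up for illustration, and SHA-1 is used only because the comment uses it:

```python
import hashlib
import hmac
import os


def salted_sha1(email: str, salt: bytes) -> bytes:
    """Salted SHA-1, as in the comment above; not a recommendation."""
    return hashlib.sha1(salt + email.strip().lower().encode("utf-8")).digest()


def make_record(username: str, email: str):
    salt = os.urandom(16)
    return (username, salt, salted_sha1(email, salt))


def find_by_email(records, email: str):
    """Linear scan: rehash the submitted email under every stored salt."""
    for username, salt, stored in records:
        if hmac.compare_digest(salted_sha1(email, salt), stored):
            return username
    return None  # a miss always costs a full pass over the table


records = [make_record(f"user{i}", f"user{i}@example.com") for i in range(1000)]
print(find_by_email(records, "user742@example.com"))  # user742
print(find_by_email(records, "nobody@example.com"))   # None
```

The cost is one hash per stored account, so it grows linearly with the table, and an address that is not in the table is always the worst case.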


Taking unnecessarily long to handle a lookup request might leave the server very vulnerable to DDoS attacks leveraging this "account recovery" option, I think.

Even worse, an invalid email would take the longest possible time, every time.

And since this is only an email address we are talking about, a global salt + more stretching (like runamok mentioned above) could be secure enough while still providing faster lookups.


Of course, you could protect from the DDoS by maintaining a secondary application server which connects to a slave database. Then the requests for account recovery wouldn't impact the rest of the system. :)


That's why I suggested a queue, so you'd only ever need to have a maximum of <total number of accounts> pending. I missed the part about using this for login as well as recovery though (but also, note that numbers are for 5 billion accounts, and scales linearly with accounts -- so divide by 2000 for half a million accounts).


38 minutes to log into your account seems excessive. Remember, this isn't just for password resets — it's to look up an account in the database by email address.


Ah, fair point. So we agree they need a (possibly not unique) user name too. [edit: note this is for 5 billion accounts, on a 10 year old cpu. So for 500.000 accounts, it would be ~1 second (average 0.5). Still probably too long for log-in (or particularly log-in failure feedback).]


If you have a username and emails aren't used for contacting users, why bother storing the email address to begin with?


To verify that a username matches a given email address, so that you can provide password resets?


A noob question I suppose, but couldn't the salt be generated deterministically from the email and still serve its purpose?


Then it's just a more complicated hash function.


Ah of course. Thanks.


Right. Thanks for pointing this out. So at least an additional "lookup helper" such as a username would be needed.


Presumably they know their username, so a look up of that nature isn't needed unless they forgot that as well.


The salt for the email could be a CRC, or a Fisher-Yates, or any of a dozen other novel one-way transformations.


You do not appear to understand the purpose of a salt. A salt should never be derived from the data it is to be used with. The entire purpose of a salt is to cause two identical inputs to a hash function to produce distinct outputs.
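
A quick Python illustration of the difference, assuming SHA-256 with the salt prepended and a CRC-32 of the email as the "derived" salt (both choices are arbitrary, picked only to make the point):

```python
import hashlib
import os
import zlib


def salted_hash(data: bytes, salt: bytes) -> str:
    return hashlib.sha256(salt + data).hexdigest()


email = b"foo@example.com"

# Salt derived from the input: every record holding this email gets the same
# salt and therefore the same digest, so it is just a fancier hash function.
derived_salt = zlib.crc32(email).to_bytes(4, "big")
derived_a = salted_hash(email, derived_salt)
derived_b = salted_hash(email, zlib.crc32(email).to_bytes(4, "big"))
print(derived_a == derived_b)  # True

# Random per-record salt: identical inputs produce distinct digests.
random_a = salted_hash(email, os.urandom(16))
random_b = salted_hash(email, os.urandom(16))
print(random_a == random_b)  # False
```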


Right, which is why I said

> figuring out what the original email should be hard (if not impossibly hard, depending on how they hash it)

I mean, passwords are way more sensitive than emails, especially given that many people re-use them. So, how you hash passwords is more critical than how you hash emails (which is rarely done, I guess).

On the other hand, there is no reason to not have the same level of protection for emails, if you are already following best practices for passwords anyway (PBKDF2, bcrypt, scrypt etc.).


The irony here is that it's a site largely about computational complexity.


A little off topic, but is there any reason to still be talking about salted hashes when we have bcrypt and scrypt these days? Seems like an anachronism.


An attacker can pre-compute hashes of common passwords for common settings of bcrypt/scrypt. With a salt, they have to start from scratch every time.


bcrypt and scrypt are always salted (it's part of their algorithms - there is no such thing as unsalted bcrypt/scrypt)
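
As a small illustration using Python's standard library, `hashlib.scrypt` takes the salt as an explicit argument alongside the cost parameters. The parameter values below are small illustrative choices, not a recommendation, and the example assumes a Python build linked against an OpenSSL version that provides scrypt:

```python
import hashlib
import os

password = b"password"

# Same password, same cost parameters, two fresh random salts.
h1 = hashlib.scrypt(password, salt=os.urandom(16), n=2**10, r=8, p=1, dklen=32)
h2 = hashlib.scrypt(password, salt=os.urandom(16), n=2**10, r=8, p=1, dklen=32)

# The outputs differ, so a table precomputed for one salt is useless for another.
print(h1 == h2)  # False
```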


No. No, they cannot.


Of course you're right, and I can't believe I didn't realize that. I think my point still stands that it's a bit silly to worry about storing emails, but you're right that you can even avoid that risk by encrypting them if desired.

(And "hash" is a bit misleading: http://codahale.com/how-to-safely-store-a-password/).


The combination of email address + password (even hashed in some way) isn't quite as public anymore.

Not having any personally identifying information doesn't protect your Project Euler account, it protects your other assets.


I think this is the key point. Security for relatively obscure and unimportant sites isn't really about those sites, it's about other sites. People reuse passwords a lot. They shouldn't, of course, but you can shout that from the rooftops all week and it won't change the fact that a lot of people do. If you suffer a significant breach, then a decent percentage of users will have their bank accounts put at risk from it. You can simply put the blame on the dumb users who reused passwords, but it's reasonable to want to do more.


This is a good point, and a reason to do the right thing with regard to emails--which is to store a safe version of them (bcrypt).

Because while an email and a password is not public information, a username and a password isn't public information either. If you don't trust yourself to store the former, you shouldn't trust yourself to store the latter much either.


Using bcrypt on email addresses is pants-on-head retarded. Please stop cargo-culting cryptography.

How do you propose to look up accounts by email address if they use a salted hash? You would have to bcrypt the email against every row in the database until you found the correct one. If you use a username to do the lookup instead, why store the email address at all? You can't use it for anything.


You're right and wrong. Right because it's a crazy idea.

Wrong, because it's the logical conclusion of the belief that emails must be treated with as much care as passwords. If you really think that, then you need to encrypt them, and therefore you have to give up the ability to look up user accounts by email address. All you could do is verify that a user-submitted email is associated with a user-submitted account. That's where you end up when you have that sort of paranoia about email addresses.

But that conclusion is, like you said, absurd, and I never should've implied otherwise. I wasn't thinking when I wrote it.


This, so much this. I'm super confused as to why we are having a discussion about hashing emails?


Particularly, the same kind of threat to other accounts belonging to the same person exists with username/password combinations as with email/password combinations; after all, people reuse username/password combos as much as email/password combos.

So, the same general class of people who you endanger by not storing email/password securely are endangered if you stop storing email and just have username/password.

And a lot of that class will have emails that can be quickly guessed by appending one of "outlook.com", "gmail.com" or some other popular free-webmail provider to the username, because if they reuse usernames and passwords, it's quite likely they do it on their mail site and that they have a webmail provider. So while what Euler has done clearly has a significant convenience impact, it has negligible security impact.


Hm... Couldn't you just sign in by using only your email without any password or any other extra stuff? I mean, it's not like there's any sensitive information there. Could lead to some trolling, but I think trolls and Project Euler don't have much overlap. In some cases I think it's valid to ask "why security?".


Euler has forums. Most likely you'd be looking at huge spam/abuse problems. For the core functionality, though, it might be okay.


> "Account recovery will no longer be possible."

A site that purports to teach is incapable of learning how to strike a balance between securing confidential information and making it possible to recover an account. This is a solved problem. If my bank can have a password recovery system, a site about numbers can have one too.

> "With respect to this issue it is quite possible that some members will have genuinely forgotten passwords."

Who hasn't "genuinely" lost a password?


To be fair, your bank likely has a bit more money to throw at this problem.

I would think in this case the entire point is not so much to help them secure stuff, but an attempt to remove them as a target for hacking in the first place.


> but an attempt to remove them as a target for hacking in the first place.

This is very short-sighted. As long as you have a popular site, you're a target for defacement. And the convenience expense is enormous. As others have mentioned, OAuth or a Twitter or Facebook login alternative would have been a sane choice. What they've decided wasn't sane; it's embarrassing for them and frustrating for users who trusted the site.

Inconveniencing users to this degree is probably causing the hackers to laugh. This is in effect a huge win for them that they can now go brag about, in addition to accessing sensitive information.


I think you are at least slightly overstating how inconvenient this is. I mean, yes, I could wish it was easier. No, this isn't going to stop me from getting back on the site.


And how many answers did you lose? Because I lost a bunch. I'm not overstating anything, I'm honestly frustrated and dispirited because of a high degree of incompetence and bad judgment.


So, you

1) had solved a bunch of Project Euler problems, but fewer than 200 (account recovery is still available for those folks), 2) lost/forgot your signon information, and 3) lost/deleted all the code you used to find the answers?

You, sir, are in a very small boat. A frustrating boat, to be sure, but I suspect that virtually none of their users share your fate.


I'm in pretty much the same boat. The actual problems I don't really mind (I have the code to some, and it wouldn't hurt to revisit the rest), but I'd very much like to have my username back.

OK, so it's an extremely minor issue, but given that the reason for it is so silly, it's still kinda irritating.


You mean you didn't save all your results? Don't a lot of the problems build on previous results? Why would one lose anything? FYI I was interested but did not start down the projecteuler rabbit hole myself, so perhaps I'm missing something.


See a reply to a sibling post of yours. You're at least losing your handle/username for good.

(I did, cannot recall that password so far and it's not in lastpass for some reason - maybe too long ago/before I got into that habit)


I only have about 50 anyway. So, not that big of a deal just to redo.


> To be fair, your bank likely has a bit more money to throw at this problem.

I suppose they could charge 1 USD for (lifetime) membership and store the last four digits of your credit card in lieu of a username, so that they could easily look up the salt with which they've hashed your email... ;-)

(Would require that you could supply the last digits of your possibly expired credit card, when you lost the password ten years hence ...)


The last four digits of your credit card should not be considered secure information. It's printed on all of your receipts. You carry it on your person in plain text. Many of your online accounts will display it in your account settings without an additional login. It's probably in both your mail and your email. Once someone has it, they can use it for years for recovery on any service that accepts it, and I know some will allow full account recovery using it alone.


True enough. On the other hand, all this is to avoid storing a username, which also isn't secure information.


A bank also has a lot more law enforcement to throw at this problem.


> [...] because the moment you sign up for things, you will almost certainly be entered in a database somewhere, [...]

Oh, you've lost the game long before that. Grandma's email chain? Welcome to the database as soon as anyone on that list gets their email compromised. Apologies to all the grandmothers out there who know how to use the BCC field.



