Do you have our loyalty card?

September 17, 2012

Do you collect miles? Are you a member of our saver's club? What's your frequent flyer number? Is it just me, or is it getting really annoying that you get asked for one every time you shop?

If I wanted to join all these programs, I'd probably have 20 or more plastic cards in my wallet. And of course, to make use of the collected points, miles or whatever 'currency' they came up with, you need a PIN. Or a password. I have no idea how I am supposed to remember them all, let alone pick a unique one per card and service. Or how anyone else is supposed to manage that.

I'd say that makes password-based authentication the most frequently implemented feature on the web. One would assume that since developers all over the world implement it over and over again, it's hardly more than the proverbial finger exercise. Given the number of (Open Source) frameworks for PHP, there should be many ready-to-use, and secure, solutions for all types of verification. Yet we see compromised services and leaked user data on a weekly basis.

LinkedIn lost approximately 6.5 million password hashes the other day, and Last.fm managed to do the same with 2.5 million hashes, just to name the more prominent websites. As embarrassing as it might be to have to admit that an attacker managed to break in and was even able to take the user data along, for any site it's only a matter of time. There is no such thing as 100% security, and as soon as any server is up and running it's bound to be attacked.

Knowing that it's more likely than not that an experienced attacker will gain access to the user data at some point, it becomes even more important to think about how to store this information securely. And this is where things get really embarrassing: Many sites still store passwords in clear text in their database! A security nightmare that gets justified with the need for the administrative staff or support team to be able to impersonate a user. Or to be able to mail the password to the user in case he or she requests it again. Both use cases are bogus. Let's have a look at the first one, the need to impersonate a user. It might be of arguable value to be able to do this, but being in control of the whole system, why would I need an actual login with the user's password to do it? Simply injecting a matching session does the trick just fine.

That leaves mailing out the current password: Usually sent as plain text email without any encryption, for everybody to read. That's about as secure as sending a postcard. Or printing the "secure code" on the back of your credit card, along with your name and about everything else you'd need to make use of the card, and then handing it over for payment to someone you don't know. It's actually amazing that the amount of abuse hasn't outgrown regular use yet. But back to the topic. The only sane way to implement a lost password recovery process is to send out neither the original nor a newly generated password, but merely a token that lets the user pick a new password by him or herself, with said token serving as verification before the fresh password is accepted.
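To make the token idea concrete, here is a minimal sketch in Python. It is not taken from any particular framework: the function names, the one-hour lifetime and the in-memory dictionary standing in for a database table are all assumptions made for illustration. Only a digest of the token is kept on the server side, so a leaked table does not hand out usable tokens either.

```python
import hashlib
import secrets
import time

# Assumed validity window for a reset token; pick whatever fits your site.
TOKEN_LIFETIME_SECONDS = 3600

# Hypothetical stand-in for a database table: token digest -> (user_id, expiry).
_reset_tokens = {}


def issue_reset_token(user_id, now=None):
    """Create a random single-use token; this is what gets mailed to the user."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    _reset_tokens[digest] = (user_id, now + TOKEN_LIFETIME_SECONDS)
    return token


def redeem_reset_token(token, now=None):
    """Return the user id if the token is known and not expired, else None."""
    now = time.time() if now is None else now
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    # pop() removes the entry, so a token can never be redeemed twice.
    entry = _reset_tokens.pop(digest, None)
    if entry is None:
        return None
    user_id, expires_at = entry
    if now > expires_at:
        return None
    return user_id
```

Only after `redeem_reset_token` returns a user id would the site accept and store the new password the user picked. The password itself never travels by email.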

Having no need to access a cleartext version of the password, we can refrain from keeping it and store a hash instead. Using a hash is a good option since we neither need to know nor recover the original password. A hash, basically a fancy checksum of the given password, is a vital ingredient of many cryptographic schemes, yet it is not encryption itself: it cannot be reversed to recover the input. Checksums, and thus hashes, were originally only used to verify that data didn't change after being transferred, be it due to deliberate modification or due to transport errors. Given that use case, almost all hash algorithms are optimized for speed. And so calculating hashes is very fast: an average laptop CPU can calculate about 500,000 hashes per second for a payload of, say, 8-character strings.

What is good from a performance point of view turns out to be a problem in this context: Simply calculating the hashes for all possible combinations of letters and digits for passwords of usual lengths does not take very long. After that, "reversing" a hash to its value is merely a lookup. Such precomputed tables, the so-called rainbow tables, can be found all over the net, rendering a simple hash useless in terms of security.
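The lookup can be illustrated with a few lines of Python. This is only a sketch: SHA-1 is an assumed example algorithm, the alphabet and maximum length are kept tiny so it runs instantly, and a plain dictionary is used here, whereas a real rainbow table uses hash chains to trade computation time for storage space.

```python
import hashlib
import itertools
import string


def unsalted_hash(password):
    """Plain, unsalted SHA-1 hex digest (assumed example algorithm)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def build_lookup_table(alphabet, max_length):
    """Precompute hash -> password for every string up to max_length."""
    table = {}
    for length in range(1, max_length + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            table[unsalted_hash(candidate)] = candidate
    return table


# Computed once, reusable against every leaked unsalted hash.
table = build_lookup_table(string.ascii_lowercase, 3)

# "Reversing" a leaked hash is now a dictionary lookup.
leaked = unsalted_hash("cat")
recovered = table.get(leaked)
print(recovered)
```

The expensive part is building the table, and that has to be done only once for all users of all sites that hash the same way.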

The fix for this is quite simple though: Add salt. By adding a unique value per user as well as a site-wide token, an attacker has to recalculate the aforementioned rainbow table anew for every single hash/password. And he needs access to the site-wide token, too. All that doesn't make it bulletproof of course, but it gives you and your users time to change the passwords, so that by the time the attacker has finally recalculated all the hashes, they are outdated.
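A minimal sketch of salting in Python could look like this. The names and the value of `SITE_TOKEN` are hypothetical, and the site-wide token would live in configuration outside the database. Note one addition that goes beyond the paragraph above: instead of a single fast hash, the sketch runs the result through PBKDF2 with an assumed iteration count, which deliberately slows down every guess.

```python
import hashlib
import hmac
import secrets

# Hypothetical site-wide token; store it outside the database (e.g. in config).
SITE_TOKEN = b"example-site-wide-secret"

# Assumed iteration count; higher means slower hashing for you and the attacker.
ITERATIONS = 100_000


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random per-user salt is made if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    # Mix in the site-wide token by using it as the HMAC key.
    keyed = hmac.new(SITE_TOKEN, password.encode("utf-8"), hashlib.sha256).digest()
    # Stretch the result together with the per-user salt.
    digest = hashlib.pbkdf2_hmac("sha256", keyed, salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Both the salt and the digest are stored per user. Two users with the same password end up with different digests, so one precomputed table no longer fits them all.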

But why do I even need that many accounts, logins and passwords? Seeing that people do not use individual passwords, that websites leak the hashes and that users of course forget their credentials, I wonder why we still rely on site-specific logins. Concepts like OpenID or BrowserID do exist, lifting the burden of creating logins and coming up with secure yet memorable passwords from the users. They also end the need for sites like LinkedIn or Last.fm to store a password and handle authentication themselves.

Almost like one generic membership bonus card that I can use for all the programs I choose to join, deliberately selecting which one should be active at any given time. Life could be so simple.