My generation grew up with Hotmail. Back in the late 1990s, when I was in junior high, an @hotmail.com email address was way cooler than an AOL account (since that was what most teenagers’ parents used). Hotmail was great: fast, easy, and generous with space (for that time, anyway). It was my first venture into being independent in the technology world. Opening my Hotmail account was a much bigger deal for me than getting my first cell phone.
Fast forward to today. I log in to Hotmail for the first time in 14 months, and am greeted with this absurdity:
Apparently, Hotmail has introduced a new SmartScreen junk filter that doesn’t filter phishing emails pretending to be from Windows Live asking me directly for my password.
This is sad. It’s sad not just because Hotmail has become, more or less, the domain of choice for spammers and phishers. Not just because most real users’ Hotmail accounts have been hacked, perhaps more due to their poor choices of passwords (around 90% of Hotmail users haven’t changed their passwords in the last 5 years) than any particular security failure on Microsoft’s part. It’s sad because this particular email would have been so easy to filter out, but Windows Live couldn’t even handle that.
Think about it: how difficult is it to filter junk mail nowadays? Gmail obviously does a fantastic job at it, and so does Microsoft Exchange (ironic, right?). Even Yahoo and AOL have fairly good algorithms to detect spam. You can build a reliable junk email filter from any characteristics of spam you can think of:
- high frequency of misspelled words
- sentences that aren’t grammatically correct (as in, spam messages auto-generated by Markov models)
- email accounts that share no words with the sender’s name (such as ‘Bank of America Customer Service’ sending an email from firstname.lastname@example.org: seems sort of phishy.)
- direct requests for personal information, like passwords, credit card numbers, social security numbers
- pictures and other attachments known to appear in spam, like the classic ‘Take Our IQ Test!’
- words, phrases, and sentences known to frequently appear in spam (such as, well, ‘need your password’) – you could build this in less than a day with easy off-the-shelf Naive Bayes algorithms or Support Vector Machines.
- emails from domains and addresses known from other analyses (or, possibly, from your own users) to be spam generators
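To give a sense of how little code the word-frequency idea needs, here is a minimal sketch of a Naive Bayes filter in plain Python. It is only an illustration, not how Hotmail or any real provider does it: the class name, the tokenizer, and the four training messages are all made up for this example, and it assumes at least one message of each kind has been trained before scoring.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSpamFilter:
    """Toy multinomial Naive Bayes classifier (hypothetical, for illustration)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labeled message; label is 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_log_odds(self, text):
        """Return log P(spam | text) - log P(ham | text), up to a shared constant."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        scores = {}
        for label in ("spam", "ham"):
            # Start from the log prior: how common this class was in training.
            score = math.log(self.message_counts[label] / total_messages)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in vocab:
                    continue  # words never seen in training carry no evidence
                # Laplace (add-one) smoothing avoids zero probabilities.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
            scores[label] = score
        return scores["spam"] - scores["ham"]

    def is_spam(self, text):
        return self.spam_log_odds(text) > 0


# Made-up training data; a real filter would learn from many thousands of messages.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("We need your password to verify your account", "spam")
spam_filter.train("Verify your account now or it will be closed", "spam")
spam_filter.train("Lunch tomorrow at noon?", "ham")
spam_filter.train("Here are the notes from class today", "ham")

print(spam_filter.is_spam("Windows Live needs your password to verify your account"))
print(spam_filter.is_spam("Are the notes from class ready?"))
```

With a realistic amount of training mail, the same structure picks up on phrases like ‘need your password’ without anyone having to write a rule for them.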
There are probably many, many more of these methods, and every spam filter uses a collection of them. What’s frustrating to me is that, based just on the methods above, I can think of several ways the Windows Live spam I received could easily have been detected:
- Keep track of Windows Live administrator email accounts. If a sender says he’s ‘Windows Live’ but it’s not one of the actual admin accounts, the message is probably a phishing attempt.
- If, for whatever reason, a password is requested in the email, it’s probably a phishing attempt.
- I have a hard time believing I’m the first person ever who has received this message and flagged it as ‘junk email’. If you know that a certain email with some exact text has been flagged by 10,000 users as junk, then it’s probably also junk to the other several million users who haven’t opened it yet.
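The three checks above could be sketched roughly as follows. Everything named here is an assumption made up for illustration: the allow-list of official addresses, the `example` domains, the regular expression used as a crude stand-in for ‘asks for a password’, and the in-memory counter of junk flags. Only the 10,000 threshold comes from the text above.

```python
import re
from collections import Counter

# Hypothetical allow-list of genuine administrator addresses.
OFFICIAL_SENDERS = {"admin@service.example"}
# Display names that only official senders should be using.
PROTECTED_NAMES = ("windows live",)
# Crude proxy for a password request: a request verb followed, within the
# same sentence, by the word "password".
PASSWORD_REQUEST = re.compile(
    r"\b(?:send|confirm|verify|provide|enter|reply with)\b[^.?!]{0,60}\bpassword\b",
    re.IGNORECASE,
)
FLAG_THRESHOLD = 10_000

# Normalized message body -> number of users who marked it as junk.
junk_flags = Counter()


def normalize(body):
    """Lowercase and collapse whitespace so identical texts compare equal."""
    return " ".join(body.lower().split())


def record_junk_flag(body):
    """Call whenever a user marks a message with this body as junk."""
    junk_flags[normalize(body)] += 1


def phishing_reasons(display_name, address, body):
    """Return the list of checks this message fails (empty if none)."""
    reasons = []
    claims_official = any(name in display_name.lower() for name in PROTECTED_NAMES)
    if claims_official and address.lower() not in OFFICIAL_SENDERS:
        reasons.append("impersonates an official sender")
    if PASSWORD_REQUEST.search(body):
        reasons.append("asks for a password")
    if junk_flags[normalize(body)] >= FLAG_THRESHOLD:
        reasons.append("flagged as junk by many users")
    return reasons


message = "Please confirm your password within 24 hours."
print(phishing_reasons("Windows Live Team", "winlive.alert@mail.example", message))
```

A real system would weigh signals like these together rather than treat any single one as proof, but even this much would have caught the message I received on two counts.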
I could keep going. But it’s out of my hands. Hotmail has fallen. I’ve already shut down my own @hotmail.com account. My friend Camille told me today that she has a filter that sends all email from @hotmail.com addresses directly to her junk folder. I may do the same before too long.