Top Alexa Websites and Email Authentication, Part 1

The more popular a website is, the more likely the average consumer is to trust a fraudulent message that impersonates that website’s Internet domain. One might expect that the websites at the top of the list employ email authentication to protect against that possibility, but what about sites further down the list? This piece will be the first in a series looking at this situation.

The Inspiration and The Question

Recently a colleague sent me a link to a Cyveillance blog post from last October about using Alexa website rankings to determine if a given URL was part of a phishing attack. Examining a sample of the phishing URLs Cyveillance had identified over the 12 months prior to the study, they determined that the better the Alexa ranking of the URL’s base domain, the less likely it was to be a phishing URL. Not exactly a ground-breaking revelation but as they wrote, “we suspected that before, but now the numbers confirm it.”

However, we can look at this in a different light – these are the websites and domains people are most likely to recognize and trust, because they visit them every day. So it seems reasonable to assume that somebody receiving a fraudulent email message that uses the domain of one of the better-ranked websites in Alexa’s list would be more likely to trust the message and click on a link it contains. And all too often, one click on a phishing message is all it takes for the user’s computer to be compromised and their personal information exposed.

In light of that greater risk, I wondered how well these popular websites are protecting themselves from impersonation via email. The greatest threat to end-users would come from the most recognizable sites, basically those with the best ranking – those at the very bottom of Alexa’s multi-million site rankings wouldn’t be as well-known, and therefore wouldn’t pose the same level of threat. But comparing the set of top-ranked sites to samples further down the list might provide some interesting information.

Assumptions and Methodology

The expectation was that the sites at the very top of the list would be fairly well protected, but that results for those a little further down the list might be more variable. So this first examination looks at the email authentication protections of the Alexa top 1,000 sites, then compares these figures to the set of websites ranked from 9,001 to 10,000 (referred to as the “10th 1,000 sites” below).

It is straightforward to determine whether or not a given domain has published a DMARC, Sender-ID, or SPF record, as these will always appear at a well-known address in the Domain Name System (DNS). Determining whether a domain is using DKIM is more difficult, as there is no well-known address involved. If you don’t have a way to ask the domain operator whether they use DKIM, all you can do is look for messages from that domain and see if any of them have DKIM signatures. So while data on the use of DMARC, Sender-ID and SPF will be reliable, any evidence collected about DKIM usage can only be regarded as having a wider margin of error at best, and merely anecdotal at worst. With all that stated, let’s continue.
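To make the "well-known address" point concrete, here is a minimal sketch in Python. It is not the tooling used for this survey, and the function names are purely illustrative. It assumes the TXT records for a name have already been fetched by some DNS client, and only shows where each record type is expected to live and how it can be recognized by its leading version tag: SPF and Sender-ID at the domain itself, DMARC at the `_dmarc` label beneath it. DKIM keys, by contrast, live under `<selector>._domainkey.<domain>`, and the selector is chosen by the sender, so there is no single name to query.

```python
from typing import Optional


def spf_query_name(domain: str) -> str:
    """SPF and Sender-ID records are TXT records at the domain name itself."""
    return domain.rstrip(".").lower()


def dmarc_query_name(domain: str) -> str:
    """DMARC records are TXT records at the fixed `_dmarc` label under the domain."""
    return "_dmarc." + domain.rstrip(".").lower()


def classify_txt(record: str) -> Optional[str]:
    """Label one TXT record string by its leading version tag.

    Matching is case-insensitive here, which is a lenient simplification
    made for this sketch rather than a statement about the specifications.
    """
    text = record.strip().strip('"').lower()
    # "v=spf1" must be the whole record or be followed by a space,
    # so that a tag such as "v=spf10" is not mistaken for SPF.
    if text == "v=spf1" or text.startswith("v=spf1 "):
        return "spf"
    if text.startswith("spf2.0/"):
        return "sender-id"
    if text.startswith("v=dmarc1"):
        return "dmarc"
    return None
```

For example, `dmarc_query_name("example.com")` gives `_dmarc.example.com`, and a TXT record found there that begins with `v=DMARC1` would be classified as `dmarc`.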

Findings

There are stark differences in the adoption of email authentication between the two groups of samples examined so far.

Of the top 1,000 sites as ranked by Alexa on March 15th, 75% had published SPF records and 22% had DMARC records – and a little over 19% had both. As a proponent of email authentication, I found these results a bit disappointing, given that these organizations should have the resources to invest in protecting their users better. Unfortunately, I didn’t have access to the kind of data necessary to check DKIM usage across the sample domains, so the 25% that appeared not to use email authentication is probably an overstatement – by how much, I couldn’t determine.

Turning to the 10th 1,000 sites (again, those ranked between 9,001 and 10,000 by Alexa), only 54% had published SPF records and 4% had DMARC records, while only 3.5% had both. The share of sites using only SPF was very similar between the two samples: 50% for the top 1,000 and 48% for the 10th 1,000. And in the latter sample, a full 45% of the sites appeared not to use email authentication at all (with the same caveat for DKIM).
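For readers who want to see how categories like "both", "only SPF" and "neither" relate to one another, here is a small Python sketch of the tally. It is not the script behind the figures above, the domain names and flags in `sample` are made up purely for illustration, and "neither" here means only that no SPF or DMARC record was found (DKIM is not considered).

```python
from collections import Counter
from typing import Dict, Tuple

CATEGORIES = ("both", "spf_only", "dmarc_only", "neither")


def categorize(has_spf: bool, has_dmarc: bool) -> str:
    """Place one domain into exactly one of the four categories."""
    if has_spf and has_dmarc:
        return "both"
    if has_spf:
        return "spf_only"
    if has_dmarc:
        return "dmarc_only"
    return "neither"


def breakdown(results: Dict[str, Tuple[bool, bool]]) -> Dict[str, float]:
    """Return the percentage of domains in each category.

    `results` maps a domain name to a (has_spf, has_dmarc) pair.
    """
    total = len(results)
    if total == 0:
        return {cat: 0.0 for cat in CATEGORIES}
    counts = Counter(categorize(spf, dmarc) for spf, dmarc in results.values())
    return {cat: 100.0 * counts[cat] / total for cat in CATEGORIES}


# Hypothetical sample data, invented for this example only.
sample = {
    "a.example": (True, True),
    "b.example": (True, False),
    "c.example": (True, False),
    "d.example": (False, False),
}
shares = breakdown(sample)
```

Because each domain falls into exactly one category, the four percentages always add up to 100, and the overall SPF figure is the sum of "both" and "spf_only".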

The 45% figure is again probably overstated, but it caused me to wonder what familiar websites might be included in this category. There were quite a few very recognizable names – and by focusing on just the dozen most familiar of these sites, it was feasible to carry out some manual checks to eliminate those that were observed to use DKIM. That left several household names which, to the best of my ability to check, appeared not to be using email authentication at all – from a major consumer electronics manufacturer, to a “new media” news website, to a picture sharing site used by millions, if not hundreds of millions, of people.

This is a diverse enough set of domains and brands that the odds are good that many readers will have visited one of them, received or shared a link to them, or encountered one of their products offline. And unfortunately none of them appear to be protecting consumers from phishing attacks that might be fraudulently using their domains.

Next Steps

This exploration will continue, both by comparing different sets of records within the Alexa rankings and by seeing how these results might change over the coming months. Allowances may need to be made for the fact that these rankings change to a small degree daily, and more significantly on a monthly basis. But on the whole it should be an interesting dataset to explore from an email authentication perspective.