Alexa – mumbo jumbo or reliable data?

Johannes Beus
Alexa's data seems to have been experiencing a renaissance for some time now. After years of relative quiet around these highly controversial user counts, I find them in blog posts more frequently, and some people have apparently even dared to optimize for this metric. High time to take a look at what this is all about.

What exactly does Alexa do?
Alexa was founded in 1996 and has been part of Amazon since 1999. As the provider of the Alexa browser toolbar, the company collects information about the websites that users of the tool visit. Alexa uses this to calculate a ranking, the Alexa traffic rank. The site with the most visitors is at position one; the fewer visitors a site has, the higher its rank number will be.

How accurate is this data?
In the past, however, the question of how reliable this information is has come up again and again. Peter Norvig, for example, took the Alexa figures as well as the real user counts for a few sites for which he had both and compared them. He came to the conclusion that these numbers should be taken with a grain of salt. The idea of comparing real and Alexa numbers is a good one, and since a few of the larger counter services thankfully operate public top lists, I have done exactly that for about 1000 assorted German sites. The following diagram shows the average Alexa traffic rank for intervals of real visitors, as counted by the counter services.



As you can see, the traffic rank looks more like the output of a random function than something built on a reliable basis, and it shows no connection to the actual numbers. Only for really well-visited sites with more than 40,000 visitors a day does the trend of the traffic rank seem to go in the right direction.

Garbage in, garbage out?
Alexa's problem is a systematic error that is already made while the data is collected. Anyone who chooses to can install the Alexa toolbar and thereby contribute to the Alexa data, without any check of whether the contributing toolbars are a realistic reflection of internet users as a whole. The past has shown that the toolbar is installed especially by SEOs and web developers, who thereby strongly distort the statistics. Abakus is at position 1,250 at the moment, this blog at around 50,000. Neither position correctly represents reality. Does this mean we can completely forget about the Alexa data? That would be the easiest solution, but sadly no other free service offers better data. If you know about the data collection problems and factor them into the evaluation accordingly, you can nevertheless often draw conclusions from the Alexa numbers. The graph on the right is a comparison of two price comparison portals over the last year. Both should have about the same mix of visitors and should therefore be at least roughly comparable. The development of visitor numbers for a single site can also often be read from Alexa.
Johannes Beus - on Thu (08/09/2007) at 10:19 AM
