Less Is More – How Too Many Indexed Pages Can Damage Your Domain

An ancient Indian minister, Sissa ibn Dahir (Sessa), invented the board game of chess in order to direct the attention of his ruler, Shihram, to the problems in his country. The ruler expressed his enthusiasm for the game, and Sessa was allowed to decide how he wanted to be compensated.

Sessa wanted just rice, but in the following distribution: 1 grain of rice on the first square on the board, 2 grains on the second square, 4 grains on the third, 8 on the fourth, and so on until the square number 64. The ruler laughed it off as a small prize for a brilliant invention.

What the ruler did not consider was that in the end this would add up to more rice than exists world-wide (even today): 18,446,744,073,709,551,615.
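The total is easy to verify. The following is a minimal sketch that doubles the grains from square to square and adds them up; the variable names are illustrative only.

```python
# Grains on square n (1-based) double each time: 1, 2, 4, ..., 2**63.
grains_per_square = [2 ** (square - 1) for square in range(1, 65)]

total = sum(grains_per_square)

# A doubling series over 64 squares sums to 2**64 - 1.
assert total == 2 ** 64 - 1

print(f"{total:,}")  # 18,446,744,073,709,551,615
```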

This exercise can be used to demonstrate how quickly exponential sequences grow. Something very similar happens if you use filter parameters within the URLs on your domains.

If you have 1 product in 10 different colours, in 10 different sizes and with 10 different prices, you can suddenly have 1,000 new URLs, for the same product, with no additional value. If you allow Google to index URLs with low quality content, they will negatively affect your rankings for the entire domain.
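To make the multiplication concrete, here is a small sketch that enumerates every filter combination for a single product. The domain `shop.example`, the parameter names and the filter values are all hypothetical and only serve as an illustration; real shops will build their filter URLs differently.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical filter values and URL pattern, for illustration only.
colours = [f"colour-{i}" for i in range(1, 11)]
sizes = [f"size-{i}" for i in range(1, 11)]
prices = [f"price-{i}" for i in range(1, 11)]

BASE_URL = "https://shop.example/product/drill"


def filter_urls():
    """Return one URL per combination of colour, size and price."""
    return [
        BASE_URL + "?" + urlencode({"colour": c, "size": s, "price": p})
        for c, s, p in product(colours, sizes, prices)
    ]


urls = filter_urls()
print(len(urls))  # 1000
print(urls[0])
```

Every additional filter with 10 values multiplies the count by 10 again, which is why the number of crawlable URLs gets out of hand so quickly.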

Let’s look at an example.

Showing Google More Irrelevant Pages Decreases a Domain’s Importance

Comparison between Screwfix.com, Homebase.co.uk and Diy.com (B&Q)

As you can see, the home improvement website Homebase.co.uk has a visibility score of 102 points, which is higher than that of its competitors Screwfix.com (97 points) and Diy.com (76 points). The interesting part is that Google only has 189,000 URLs from Homebase.co.uk in its index.

Screwfix.com has 2.5 times as many URLs indexed (478,000), and Diy.com has an incredible 36 times as many URLs in Google’s index (6,770,000).

If we compare the number of keywords that those domains rank for with their own content, we can see that Diy.com needs 6.7 million URLs to rank for 340,986 keywords, and only 79,584 of those keywords manage to reach the top 10 of the search results.

Homebase.co.uk has 36 times fewer URLs than Diy.com but ranks for more keywords (342,071) and has 103,735 of those keywords within the top 10 of the search results.

Homebase.co.uk not only has fewer URLs and more keywords than Diy.com; those keywords also rank much better.
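One way to put these figures side by side is to work out how many ranking keywords each indexed URL produces and what share of the keywords reaches the top 10. The sketch below uses only the numbers quoted above; the two ratios are my own illustrative metrics, not official scores.

```python
# Figures quoted in the comparison above.
domains = {
    "Homebase.co.uk": {"indexed_urls": 189_000, "keywords": 342_071, "top10": 103_735},
    "Diy.com": {"indexed_urls": 6_770_000, "keywords": 340_986, "top10": 79_584},
}


def summarise(stats):
    """Derive two simple ratios from the raw counts."""
    return {
        "keywords_per_url": stats["keywords"] / stats["indexed_urls"],
        "top10_share": stats["top10"] / stats["keywords"],
    }


for name, stats in domains.items():
    s = summarise(stats)
    print(
        f"{name}: {s['keywords_per_url']:.2f} keywords per indexed URL, "
        f"{s['top10_share']:.1%} of keywords in the top 10"
    )
```

Homebase.co.uk comes out at roughly 1.8 keywords per indexed URL, Diy.com at about 0.05, and Homebase also gets a larger share of its keywords into the top 10 (around 30% versus around 23%).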

How URLs on Diy.com get indexed

I’m using these domains because a tool like a “drill” is, after all, just a “drill”. That makes the following examples easier to understand. On fashion websites, where filter problems are also very common, a normal t-shirt might be listed as a golf shirt, a polo shirt, a v-neck, a blouse, and much more. Let’s stick with drills.

So, I found 26 “Combi Drills” on Diy.com. You can filter these 26 products by Availability, Price, Rating, Brand, Voltage, Batteries, Watts, or even by Weight (if you really want to).

Filtering these 26 “Combi Drills” by “18V”, I get 22 products on Diy.com, but Google has 232 indexed URLs that use this filter:

Drills filtered by “Voltage” on Diy.com

If 232 indexed URLs for 22 products are not enough for you, just take a look at the number of indexed URLs for “Weight”. You will find 1,220 URLs:

The following search result sums up the main problem for Diy.com. This URL combines the filters price, voltage, weight and cordless:

Example for a search result using more than one filter

One of the biggest challenges for big websites is getting the most current and relevant content indexed by Google. Google has an individual limit when it comes to crawling and indexing pages on a domain: How many URLs should be crawled a day and how many of them deserve a place on Google’s search results? This is the reason why it is so important to use your resources as intelligently and productively as possible.

Please also keep in mind that the better a URL fits the search request, the more Google will trust this source. Many different URLs for one and the same product, without any new value, make life really hard for Google, because it has to decide which of them is the most relevant and should show up in the rankings. This will cause Google to lose trust in the website, which will then cause the rankings to go down.
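To get a feeling for how many URL variants point at one and the same page, you can group a list of indexed URLs by their base page. The following is only a sketch: the URLs are hypothetical, and it assumes that filters live in the query string, which may not match how a given shop (including Diy.com) actually builds its URLs.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical crawl export; a real list would come from a crawler or log file.
indexed_urls = [
    "https://shop.example/drills/combi-drills",
    "https://shop.example/drills/combi-drills?voltage=18v",
    "https://shop.example/drills/combi-drills?voltage=18v&weight=2kg",
    "https://shop.example/drills/combi-drills?price=50-100&voltage=18v&cordless=yes",
    "https://shop.example/drills/hammer-drills",
]


def base_page(url):
    """Drop the query string so every filter variant maps to one base page."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


variants_per_page = Counter(base_page(url) for url in indexed_urls)

# Pages with the most filter variants come first.
for page, count in variants_per_page.most_common():
    print(count, page)
```

Pages with a high variant count are the ones where filter URLs are most likely competing with each other for the same search requests.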

On the other hand, having an increase in indexed pages can be viewed in a positive light, if this increase is accompanied by an increase in the number of keywords and good rankings.

Conclusion

18,446,744,073,709,551,615 grains of rice are a lot. Quite a lot.

Enough to give 100 tons of rice to every single human on Planet Earth. That’s 1 kg of rice per day per human for 275 long years and, economically speaking, more than a millennium’s worth of global rice production (Source: http://www.dedoimedo.com/life/rice.html).
