Johannes Beus
The number of pages indexed in Google is especially important for projects that work mainly in the long tail. More searchable pages, and therefore more content, mean a better chance of being found through a multitude of different search queries. While in the old days Google indexed every page that the crawler could find, it now has a rather efficient algorithm in place which sets the maximum number of indexable pages based on an array of diverse signals. I have taken a look at how many sites I can come up with off the top of my head that have masses of pages in the index and made a chart of them:
| # | Domain | Pages |
|---|---|---|
| 1 | yahoo.com | 339.000.000 |
| 2 | yahoo.co.jp | 171.000.000 |
| 3 | myspace.com | 136.000.000 |
| 4 | blogspot.com | 120.000.000 |
| 5 | ebay.com | 111.000.000 |
| 6 | youtube.com | 105.000.000 |
| 7 | msn.com | 86.200.000 |
| 8 | wikipatents.com (fluctuating strongly?) | 67.000.000 |
| 9 | amazon.com | 53.300.000 |
| 10 | amazon.de | 52.600.000 |
| 11 | ebay.de | 51.700.000 |
| 12 | flickr.com | 50.200.000 |
| 13 | alibaba.com | 49.200.000 |
| 14 | wordpress.com | 46.600.000 |
| 15 | live.com | 45.700.000 |
| 16 | aol.com | 45.100.000 |
| 17 | livejournal.com | 44.600.000 |
| 18 | rootsweb.com | 41.900.000 |
| 19 | meetup.com | 41.600.000 |
| 20 | amazon.ca | 41.400.000 |
| 43 | google.com | 25.600.000 |
| 46 | chefkoch.de | 23.300.000 |
| 81 | meinestadt.de | 13.000.000 |
| 99 | yatego.com | 10.900.000 |
| 114 | cylex.de | 9.740.000 |
Besides the first 20, there are 5 more entries that are pretty interesting from a German point of view. The data comes from “site:domain.tld” queries that I ran just now. If you know more domains that fit this list, you are welcome to post them in the comments.
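If you want to work with these numbers yourself, the following is a minimal sketch of how the query strings and counts could be handled. The helper names `site_query` and `parse_count` are hypothetical, and the sketch assumes the counts use dots as thousands separators, as in the table. It does not contact Google; it only builds the query text and converts the counts to integers.

```python
def site_query(domain: str) -> str:
    """Build the "site:" query string for one domain (hypothetical helper)."""
    return f"site:{domain}"


def parse_count(text: str) -> int:
    """Convert a count written with dots as thousands separators to an int.

    Assumption: the value contains only digits and dots, e.g. "9.740.000".
    """
    return int(text.replace(".", ""))


# A few rows copied from the table above.
rows = [
    ("yahoo.com", "339.000.000"),
    ("chefkoch.de", "23.300.000"),
    ("cylex.de", "9.740.000"),
]

# Sort by page count, largest first, and show the query used for each domain.
for domain, pages in sorted(rows, key=lambda r: parse_count(r[1]), reverse=True):
    print(site_query(domain), parse_count(pages))
```

Storing the counts as integers makes it easy to sort the list or compare it with a later run of the same queries.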