Internationalization of websites from an SEO point of view

johannes beus
Even though the Internet is global, you still have to cater to the needs of regional target audiences. The most important step is probably translating the content into the respective language. However, there are some pitfalls in implementing this, which I want to point out in the following article:

one ccTLD for every language
Every country-specific website gets the ccTLD (country-code top-level domain) belonging to that country: Germany gets example.de, France gets example.fr and so on. For English content a .com domain should be preferred. I would not experiment with “exotic” domains or endings such as .info or .biz. Should this approach be impossible, because the (fictitious) Poker Investments Limited from Paraguay is holding on to the domain under the ccTLD you need and you have no realistic claim to it, then you can use subdomains like de.example.org or fr.example.org. I would advise against using directories for the localized versions of the website.
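The rule above (ccTLD first, .com for English, language subdomain as a fallback) can be written down as a small lookup. This is only a minimal sketch: the function name `localized_host`, the mapping `CCTLD_BY_LANGUAGE` and the set of owned TLDs are hypothetical and exist purely for illustration.

```python
# Hypothetical mapping from content language to the preferred ccTLD.
CCTLD_BY_LANGUAGE = {"de": "de", "fr": "fr", "pl": "pl"}


def localized_host(language, owned_tlds, name="example", fallback_domain="example.org"):
    """Pick the hostname for one localized version of the site.

    owned_tlds is the set of ccTLDs under which the name could be registered.
    """
    if language == "en":
        # English content goes on the .com domain.
        return f"{name}.com"
    tld = CCTLD_BY_LANGUAGE.get(language)
    if tld in owned_tlds:
        # Preferred case: the matching ccTLD is available.
        return f"{name}.{tld}"
    # Fallback: a language subdomain instead of a directory.
    return f"{language}.{fallback_domain}"


for lang in ("de", "fr", "en"):
    print(lang, localized_host(lang, owned_tlds={"de"}))
```

With only the .de domain registered, the German version ends up on example.de, the French one falls back to fr.example.org, and English stays on example.com.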

unmistakable header
In theory this is quite simple: the web server sends the Content-Language HTTP header, and the HTML document that follows is in the language designated therein. In practice the problem is that most webmasters have never even heard of this header, their providers configure wrong defaults or none at all, and on top of that there are language meta tags which may completely contradict the HTTP header. As a result, search engines had to start figuring out the language of a site themselves from a number of characteristics, and you should try your best not to put any obstacles in their way. Therefore, you should use an HTTP header viewer to check whether a language header was sent and which one, whether it matches the actual language of the document, and whether the meta tag is correct. Search engines will also try to detect the language from the text itself. The occurrence of certain words like “und” (and) or “neben” (besides) can act as an indicator, which makes it extremely important to use only one language per URL.
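If you do not have a header viewer at hand, the check described above can be sketched in a few lines of Python. This is an illustration under assumptions, not a finished tool: the names `MetaLanguageParser`, `primary` and `check_language` are made up for this example, and it assumes that the header and the meta tag each declare a single language.

```python
from html.parser import HTMLParser


class MetaLanguageParser(HTMLParser):
    """Picks up <meta http-equiv="Content-Language" content="..."> if present."""

    def __init__(self):
        super().__init__()
        self.meta_lang = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("http-equiv") or "").lower() == "content-language":
            self.meta_lang = attrs.get("content")


def primary(tag):
    """Reduce a tag such as 'de-DE' to its primary subtag 'de'."""
    return tag.strip().lower().split("-")[0]


def check_language(headers, html, expected):
    """Return a list of mismatches; an empty list means everything agrees."""
    parser = MetaLanguageParser()
    parser.feed(html)
    # HTTP header names are case-insensitive, so compare them in lower case.
    header = next((v for k, v in headers.items() if k.lower() == "content-language"), None)
    problems = []
    for source, value in (("HTTP header", header), ("meta tag", parser.meta_lang)):
        if value is None:
            problems.append(f"{source}: missing")
        elif primary(value) != primary(expected):
            problems.append(f"{source}: {value!r} does not match {expected!r}")
    return problems


page = '<html><head><meta http-equiv="Content-Language" content="de"></head></html>'
print(check_language({"Content-Language": "en"}, page, "de"))
```

In this example the server announces English while the meta tag says German, so the function reports the HTTP header as the contradicting source. For a live page, the headers and the HTML could be taken from a response fetched with `urllib.request.urlopen`.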

links from same-language sites
Another clue Google uses to figure out what language a site is in is, as so often, incoming links. Usually this is not a problem, but if the ratio between English and German sites linking to your own site is unfavorable, your site may end up in the English index. You can detect this when the [translate this page] link shows up beside the result for German search queries.

local IP-addresses
Search engines like Google will not only try to figure out the language of a site but also its origin. This shows in the ability to search not only for “sites in German” but also for “sites from Germany”. An important indicator that Google uses to pinpoint the origin of a project is the IP address on which the project is hosted. For this domain it would look like this, for example:

hades:~# host www.sistrix.com
www.sistrix.com A 62.93.205.128
hades:~# whois 62.93.205.128
% This is the RIPE Whois query server #1.
[…]
country: DE


This means that projects with enough budget should make sure that the IP address is associated with the respective country. There are many ways to achieve this, and talking to the provider is usually helpful.
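The manual lookup shown above can also be scripted if you want to check several domains. The following is a rough sketch that assumes a `whois` command-line client is installed and that the registry prints a `country:` line the way RIPE does; the function names `country_from_whois` and `hosting_country` are invented for this example.

```python
import re
import socket
import subprocess


def country_from_whois(whois_text):
    """Return the first two-letter 'country:' value in whois output, or None."""
    match = re.search(
        r"^country:\s*([A-Za-z]{2})\s*$",
        whois_text,
        re.MULTILINE | re.IGNORECASE,
    )
    return match.group(1).upper() if match else None


def hosting_country(hostname):
    """Resolve the hostname and ask whois which country its IP address is registered in."""
    ip = socket.gethostbyname(hostname)
    # Assumption: a 'whois' binary is on the PATH; output formats differ between registries.
    result = subprocess.run(["whois", ip], capture_output=True, text=True, check=False)
    return country_from_whois(result.stdout)


sample = "% This is the RIPE Whois query server #1.\ncountry: DE\n"
print(country_from_whois(sample))
```

Only the parsing step is exercised here; `hosting_country` needs network access and may return None for registries that label the field differently.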

no automated translations
Even though it has no direct influence on search engine optimization, I can only recommend using capable native speakers to translate the content. Text translated by a program might be good for some short-lived entertainment, but nobody will take it seriously or be able to understand it. There is also the problem that great content in the source language is completely useless if it is not understood in the target language and, here comes the SEO part, is therefore not linked to.
johannes beus - on Thu (06/14/2007) at 14:09 PM
