Multi-regional and multilingual sites & SEO – Google Guidelines
A multilingual website is any website that offers content in more than one language. Examples of multilingual websites might include a Canadian business with an English and a French version of its site, or a blog on Latin American soccer available in both Spanish and Portuguese.
A multi-regional website is one that explicitly targets users in different countries. Some sites are both multi-regional and multilingual (for example, a site might have different versions for the USA and for Canada, and both French and English versions of the Canadian content).
Expanding a website to cover multiple countries and/or languages can be challenging. For example, if you have a French website, machine translations can be quite accurate, but you might still require professional translators to help traduire votre site web (translate your website) from French to German, English, or other languages. Also, because you have multiple versions of your site, any issues will be multiplied, so test your original site as thoroughly as possible and make sure you have the appropriate infrastructure to handle these sites. The following are some guidelines and best practices for creating multilingual and/or multi-regional sites.
Managing multilingual versions of your site
Here are some tips for making sure that your localized content appears in search results for the appropriate language.
Make sure the page language is obvious
Google uses only the visible content of your page to determine its language. We don’t use any code-level language information such as lang attributes. You can help Google determine the language correctly by using a single language for content and navigation on each page, and by avoiding side-by-side translations. Translating only the boilerplate text of your pages while keeping the bulk of your content in a single language (as often happens on pages featuring user-generated content) can create a bad user experience if the same content appears multiple times in search results with various boilerplate languages.
Use robots.txt to block search engines from crawling automatically translated pages on your site. Automated translations don’t always make sense and could be viewed as spam. More importantly, a poor or artificial-sounding translation can harm your site’s perception.
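As a minimal sketch of the robots.txt advice above, the following Python snippet checks that a rule blocks crawling of machine-translated pages while leaving other pages crawlable. The /auto-translated/ directory and the example.com URLs are assumptions made for illustration; they are not part of the guidelines, and your own site may organize translated pages differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: assumes all machine-translated pages
# live under the /auto-translated/ directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /auto-translated/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Machine-translated page: expected to be blocked.
    print(is_crawlable(ROBOTS_TXT, "http://example.com/auto-translated/de/page.html"))
    # Human-translated page: expected to stay crawlable.
    print(is_crawlable(ROBOTS_TXT, "http://example.com/fr/page.html"))
```

Keeping automatic translations under one path prefix, as assumed here, means a single Disallow rule covers them all.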
Make sure each language version is easily discoverable
Avoid automatic redirection based on the user’s perceived language. These redirections could prevent users (and search engines) from viewing all the versions of your site.
Carefully consider your choice of URL
Google uses the content of the page to determine its language, but the URL itself provides human users with useful clues about the page’s content. For example, the following .ca URLs use fr as a subdomain or subdirectory to clearly indicate French content:
http://example.ca/fr/vélo-de-montagne.html
http://fr.example.ca/vélo-de-montagne.html
Signaling the language in the URL may also help you to discover issues with multilingual content on your site.
It’s fine to translate words in the URL, or to use an Internationalized Domain Name (IDN). Make sure to use UTF-8 encoding in the URL (in fact, we recommend using UTF-8 wherever possible) and remember to escape the URLs properly when linking to them. (Lots of URL encoders are available online.)
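To illustrate what escaping a URL properly can look like, here is a small Python sketch that percent-encodes the path of a URL as UTF-8. The helper name escape_url is hypothetical. It assumes the input path is not already escaped, and it leaves the host name untouched, so an Internationalized Domain Name would need separate handling.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escape_url(url: str) -> str:
    """Percent-encode the path of a URL as UTF-8.

    Assumes the path is not already escaped; an existing '%' would be
    encoded again. The scheme, host, query, and fragment are left as-is.
    """
    parts = urlsplit(url)
    escaped_path = quote(parts.path, safe="/", encoding="utf-8")
    return urlunsplit(
        (parts.scheme, parts.netloc, escaped_path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(escape_url("http://example.ca/fr/vélo-de-montagne.html"))
    # prints http://example.ca/fr/v%C3%A9lo-de-montagne.html
```

The escaped form is what belongs in the href of a link; browsers typically still display the readable "vélo" form to users.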
Targeting site content to a specific country
If Google is aware of the country targeted by a site, we can use this information to improve the quality of our search results in different countries. Google generally uses the following elements to determine a website’s targeted country:
- ccTLDs (country-code top-level domain names). These are tied to a specific country (for example .de for Germany, .cn for China), and therefore are a strong signal to both users and search engines that your site is explicitly intended for a certain country. (Some countries have restrictions on who can use ccTLDs, so be sure to do your research first.)
- Geotargeting settings. You can use the geotargeting tool in Webmaster Tools to indicate to Google that your site is targeted at a specific country. Do this only if your site has a gTLD (generic top-level domain name). However, don’t use this tool if your site targets more than a single country. For example, it would make sense to set a target of Canada for a site about restaurants in Montreal; but it would not make sense to set the same target for a site that targets French speakers in France, Canada, and Mali. Note: Because regional TLDs such as .eu or .asia are not specific to a single country, Google treats them as gTLDs.
- Server location (through the IP address of the server). The server location is often physically near your users and can be a signal about your site’s intended audience. Some websites use distributed content delivery networks (CDNs) or are hosted in a country with better webserver infrastructure, so it is not a definitive signal.
- Other signals. Other sources of clues as to the intended audience of your site can include local addresses and phone numbers on the pages, the use of local language and currency, links from other local sites, and/or the use of Google Places (where available).
Google does not use locational meta tags (like distribution) or HTML attributes for geotargeting.
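The rules in the list above about when to use the geotargeting setting can be sketched as a small decision helper. This is a rough illustration, not how Google classifies domains: it assumes that any two-letter ASCII TLD is a ccTLD unless it appears in a short, non-exhaustive set of regional TLDs, and the function names are hypothetical.

```python
# Regional TLDs that the text says are treated as gTLDs.
# Illustrative only; this set is not exhaustive.
REGIONAL_TLDS = {"eu", "asia"}


def tld_kind(hostname: str) -> str:
    """Classify a hostname's TLD as 'ccTLD' or 'gTLD' (heuristic)."""
    tld = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    if tld in REGIONAL_TLDS:
        return "gTLD"
    # Assumption: two-letter ASCII TLDs are country codes.
    if len(tld) == 2 and tld.isascii() and tld.isalpha():
        return "ccTLD"
    return "gTLD"


def should_set_geotarget(hostname: str, target_countries: list[str]) -> bool:
    """Geotargeting applies only to gTLD sites aimed at exactly one country."""
    return tld_kind(hostname) == "gTLD" and len(target_countries) == 1


if __name__ == "__main__":
    print(should_set_geotarget("example.com", ["CA"]))              # True
    print(should_set_geotarget("example.com", ["FR", "CA", "ML"]))  # False
    print(should_set_geotarget("example.de", ["DE"]))               # False
```

The last case returns False because a ccTLD such as .de is already a strong country signal, so the setting is not needed there.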
Consider using a URL structure that makes it easy to geotarget parts of your site to different regions. The following table outlines your options:
|URL structure|Example|
|---|---|
|ccTLDs|example.ie|
|Subdomains with gTLDs|de.example.com|
|Subdirectories with gTLDs|example.com/de/|
|URL parameters|site.com?loc=de|
Geotargeting isn’t an exact science, so it’s important to consider users who land on the “wrong” version of your site. One way to do this could be to show links on all pages for users to select their region and/or language of choice.
Duplicate content and international sites
Websites that provide content for different regions and in different languages sometimes create content that is the same or similar but available on different URLs. This is generally not a problem as long as the content is for different users in different countries. While we strongly recommend that you provide unique content for each different group of users, we understand that this may not always be possible. There is generally no need to “hide” the duplicates by disallowing crawling in a robots.txt file or by using a “noindex” robots meta tag. However, if you’re providing the same content to the same users on different URLs (for instance, if example.com/de/ and another URL both show German language content for users in Germany), you should pick a preferred version and redirect (or use the rel=canonical link element) appropriately.
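As a rough sketch of pointing duplicates at a preferred version, the Python snippet below maps a duplicate URL to its preferred URL and builds the matching rel=canonical link element. The choice of http://example.com/de/ as the preferred version and http://de.example.com/ as its duplicate is an assumption for illustration, as are the function names.

```python
from html import escape

# Hypothetical mapping from duplicate URLs to the preferred version.
PREFERRED_VERSION = {
    "http://de.example.com/": "http://example.com/de/",
}


def canonical_url(url: str) -> str:
    """Return the preferred URL for a page, or the URL itself if it has no duplicate entry."""
    return PREFERRED_VERSION.get(url, url)


def canonical_link(url: str) -> str:
    """Build the rel=canonical link element for a page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    print(canonical_link("http://de.example.com/"))
    # prints <link rel="canonical" href="http://example.com/de/">
```

The same mapping could drive server-side redirects instead, if you prefer redirecting users to the preferred version.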
Courtesy: Google Webmaster Central