Is the use of machine translation evil for SEO?
In terms of global website content translation or localization, the best practice is to have content localized professionally by a native speaker.
However, just like everything else, there’s a best practice, and there’s the reality of conducting business.
So, what is the reality of running a global website?
How does the best practice apply – or not apply – especially when it comes to user-generated content (UGC)?
The Challenge of Content Localization for Global Websites
One of the real-life situations that businesses deal with is the challenge of increasing user engagement without negatively impacting SEO performance.
Site owners agonize over following best practices for the fixed content on their websites, but the time and cost of professional translation often prevent them from applying the same practice to UGC.
As a result, I often see global websites leave UGC in English, or whatever the source language is, on their local sites in an attempt to follow the SEO best practice.
I understand that website owners are concerned about the SEO implications of machine translation.
However, when content is not translated into the local language, it won’t help site visitors or website owners.
Let’s go through this challenge step by step to see if we can find some middle ground.
Selecting Content for Machine Translation
Before we dive into the topic, I’d like to clarify that this article is specific to user-generated content, and not the entire website.
Fixed content should always be translated and localized professionally by humans without exception.
Page headers and commonly used text, such as column labels, should also be localized and checked by humans.
If you don’t need UGC to rank well in the search results, or even to be indexed by search engines, it is the safest place to implement machine translation.
User comments, feedback, reviews, and similar content that is not the main content of the page can easily be handled with machine translation.
Even if the translation is not perfect, it still gives site visitors helpful information in their own language.
When the UGC is on the pages you wish to be indexed by the search engines and perform well in the organic search results, you need to determine the best translation solution.
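The selection rules above can be written down as a small routing function. This is only a minimal sketch: the `ContentBlock` type, its `kind` labels, and the returned method names are hypothetical and would need to match however your own CMS classifies content.

```python
from dataclasses import dataclass


@dataclass
class ContentBlock:
    kind: str          # hypothetical labels: "fixed", "ui_label", or "ugc"
    should_rank: bool  # True if this content is meant to be indexed and rank


def choose_translation_method(block: ContentBlock) -> str:
    """Map a content block to a translation approach, following the rules above."""
    # Fixed content, page headers, and labels always go to human translators.
    if block.kind in {"fixed", "ui_label"}:
        return "human"
    # UGC that does not need to rank is the safest place for machine translation.
    if block.kind == "ugc" and not block.should_rank:
        return "machine"
    # UGC on pages that should rank needs a case-by-case decision.
    return "evaluate"
```

For example, `choose_translation_method(ContentBlock("ugc", False))` returns `"machine"`, while the same review on a page you want to rank returns `"evaluate"`.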
Crowdsourced translation is not machine translation, but it is another option that some websites use to localize their content. It usually relies on a database of words, which participants access to add translations in other languages.
It’s a low-cost solution when you have volunteers to do the translation work. Wikipedia probably is the largest global website using this solution.
Because it depends on crowd participation, it comes with some concerns.
- It is difficult to maintain the quality of the translation.
- Some languages may take much longer to generate a large enough database to translate content. This becomes a bigger issue when the source language is not one of the more widely spoken/read languages.
Some machine translation tools let you build a glossary database from words and phrases contributed through crowdsourced translation.
One example is a clearly wrong word that showed up in Google’s translation tool.
When a Japanese word for “mischievous” was entered, it returned an incorrect English translation. (The translation has since been corrected.)
In order to control the quality of the translation and minimize problems, I suggest that you control who can contribute to the translation project by giving tool access only to trusted editors.
The Advancement of Machine Translation with AI
As machine translation technology has advanced with AI, some websites, including large global websites such as Facebook, are implementing neural machine translation (NMT).
On the Facebook site, you can see this real-time, text-to-text translation working on posts and comments. On their “code.fb.com” site, they state:
“We have just started being able to use more context for translations. Neural networks open up many future development paths related to adding further context, such as a photo accompanying the text of a post, to create better translations.
We are also starting to explore multilingual models that can translate many different language directions. This will help solve the challenge of fine-tuning each system relating to a specific language pair, and may also bring quality gains from some directions through the sharing of training data.”
Other companies, including Google and Microsoft, also offer NMT solutions for websites and other translation needs.
In addition to text translation, Microsoft developed automatic speech recognition (ASR) for speech translation, which is currently used in Skype.
Improve the Quality of the Translation
Even with these advancements, machine translation is not yet perfect.
That said, machine translation quality has improved significantly, especially for Western languages.
The following are some things you can do to ensure the quality of the translation:
- Create a list of commonly used words (e.g., categories, tags, product names, other keywords). Get them translated professionally or even in-house. Upload the list to the translation engine.
- Spot check the translation from time to time to ensure the quality of translated content.
- Add an online dictionary through its API.
- Consider industry-specific machine translation for B2B content; it handles industry jargon and terminology better.
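The first two items in the list above, a professionally translated term list and periodic spot checks, can be sketched in a few lines. Many translation engines have their own glossary upload feature, so treat this as an illustration of the idea rather than any vendor’s API. The `translate` argument is a hypothetical stand-in for whatever engine call you use, and the sketch assumes the engine leaves placeholder tokens such as `[[0]]` untouched, which you should verify for your engine.

```python
import random
import re
from typing import Callable, Dict, List


def translate_with_glossary(
    text: str,
    glossary: Dict[str, str],
    translate: Callable[[str], str],
) -> str:
    """Protect glossary terms with placeholders, translate, then restore them."""
    placeholders: Dict[str, str] = {}
    protected = text
    # Handle longer terms first so "red wine" wins over "wine".
    for i, term in enumerate(sorted(glossary, key=len, reverse=True)):
        token = f"[[{i}]]"
        # Match the term only when it is not part of a longer word.
        pattern = re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE)
        if pattern.search(protected):
            protected = pattern.sub(token, protected)
            placeholders[token] = glossary[term]
    translated = translate(protected)  # hypothetical engine call
    # Swap each placeholder for the professionally approved translation.
    for token, approved in placeholders.items():
        translated = translated.replace(token, approved)
    return translated


def sample_for_review(items: List[str], k: int, seed=None) -> List[str]:
    """Pick up to k translated items at random for a human spot check."""
    rng = random.Random(seed)
    return rng.sample(items, min(k, len(items)))
```

Running `sample_for_review` on each day’s or week’s translated reviews gives an editor a small, unbiased batch to check, and any recurring errors they find are good candidates for new glossary entries.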
Optimize the Machine Translation Engine
- Integrate the machine translation engine with your translation management system (TMS) environment.
- Customize the machine translation engine for the content type.
- Create training data for AI and machine learning.
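Training data for a translation engine is typically a set of source sentences paired with approved translations. The sketch below shows one way to clean such pairs and export them; the tab-separated output is an assumption, since each engine documents its own accepted format, and the function names are illustrative only.

```python
import csv
import io
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]


def build_parallel_corpus(pairs: Iterable[Pair]) -> List[Pair]:
    """Normalize whitespace, drop empty pairs, and remove exact duplicates."""
    seen = set()
    cleaned: List[Pair] = []
    for source, target in pairs:
        source = " ".join(source.split())
        target = " ".join(target.split())
        if not source or not target:
            continue  # a pair with a missing side is useless for training
        if (source, target) in seen:
            continue
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned


def to_tsv(pairs: Iterable[Pair]) -> str:
    """Serialize pairs as one 'source<TAB>target' row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(pairs)
    return buffer.getvalue()
```

Human-corrected machine translations of your own UGC are a natural source for these pairs, because they reflect the vocabulary your visitors actually use.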
Still concerned about using machine translation in terms of SEO?
Here’s a comment on machine-translated content by Google’s John Mueller:
“I think the kind of the improvements that are happening with regards to automatically translated content… It could also be used by sites that are legitimately providing translations on a website and they just start with like the auto translated version and then they improve those translations over time.
So that’s something where I wouldn’t necessarily say that using translated content like that (spamming content) would be completely problematic but it’s more a matter of the intent and kind of the bigger picture what they’re doing.”
Many websites already use machine translation for their global sites. Their content is indexed and can perform well when it provides quality content for local audiences.
Indeed, it comes back to the “intent” Mueller spoke about.
Translating UGC to provide informative content to your local audience falls under “a good intent.”
Machine translation could be a great solution for some global websites, specifically for handling large volumes of user-generated content.
Making reviews and comments available in different languages can significantly increase visitor satisfaction, engagement, and (most importantly) sales.
Don’t let broad standards keep you from serving your consumers. Review the following and make the best decision for your business.
- Determine the content on your site that is appropriate for machine translation.
- Select the translation solution that works best for your website content.
- Optimize the machine translation engine by adding industry-specific terms, keywords, etc.
- Create training data for AI.
- Monitor the quality of the translation.
All screenshots taken by author, November 2018