Evolution of User Generated Content and Localization

Today, global businesses are only beginning to localize user generated content (UGC). The field is so new that the first question is how to think about it at all. How do we localize UGC in a way that adds value to our overall global business activities, engages audiences and succeeds?

Content is more diverse. Traditionally, we have dealt with relatively simple content like marketing brochures, articles, manuals, support and learning documentation, and presentations. Now, new content types have emerged in the form of network generated content or, as it is typically known, UGC. On one hand, we have bite-sized pieces of information with little context, like social media posts or Internet of Things (IoT) data. On the other hand, we have complex multimedia content like photographs, interactive graphics and videos that contain metadata.

There is more content. Every year, the volume of published content grows exponentially. In the past two years, we have generated more digital content than in the entire prior history of humankind.

What does this mean for localization?

As content volumes grow, many businesses face limited resources and subpar technology that cannot cope with emerging content types and volumes. How do globalization and localization professionals survive such scaling challenges? We must prioritize content. A good approach is to weigh the longevity of content (the relevancy period) against how useful the content will be to the user (utility).
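
As a rough illustration of weighing longevity against utility, the sketch below ranks content by a simple combined score. The `ContentItem` fields, the one-year horizon, the multiplication of the two factors and the sample values are all assumptions made for this example, not an established method.

```python
from dataclasses import dataclass


@dataclass
class ContentItem:
    name: str
    relevancy_days: int  # assumed: how long the content stays relevant
    utility: float       # assumed: estimated usefulness to the user, 0.0 to 1.0


def priority_score(item: ContentItem, horizon_days: int = 365) -> float:
    """Combine longevity and utility into one score between 0 and 1."""
    # Cap longevity at the horizon so evergreen content does not dominate.
    longevity = min(item.relevancy_days, horizon_days) / horizon_days
    return longevity * item.utility


def prioritize(items: list[ContentItem]) -> list[ContentItem]:
    """Return items ordered from highest to lowest localization priority."""
    return sorted(items, key=priority_score, reverse=True)


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    backlog = [
        ContentItem("social media post", relevancy_days=2, utility=0.3),
        ContentItem("support article", relevancy_days=365, utility=0.9),
        ContentItem("product review", relevancy_days=90, utility=0.7),
    ]
    for item in prioritize(backlog):
        print(f"{item.name}: {priority_score(item):.3f}")
```

Multiplying the two factors means that content scoring low on either one drops down the list. A business that values freshness over longevity could weight the factors differently.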

What if everything is of high priority? How do we ensure that the quality is still good?

  1. Set the right quality expectations with the user. Different content types require different quality levels. UGC is typically written quickly, in short bursts, and can therefore contain abbreviations, grammatical errors and typos. Translating this type of content with 100% linguistic accuracy is not always necessary. Preserving the key message, or “gist,” is often enough. This is where translation automation comes into play. Machine translation (MT), sometimes with post-editing, can translate huge volumes of UGC quickly enough to meet the needs of both the business and the user. Other techniques for multimedia, like text-to-speech, can be used for content types like video when high production value is not needed.
  2. Invest in MT up front to make it as reliable as possible. The sooner you integrate MT into the overall translation process, the more the system can learn over time, making output more accurate and more in line with the overall company brand and tone of voice. Machine learning-based technologies let you scale your human translation efforts.
  3. Scale your quality monitoring. One way to do this is through sampling; however, there is another, more scalable way. Instead of monitoring translation quality, monitor the impact on users. For example, if you have just translated thousands of product reviews to attract more potential customers, monitor your conversion rates instead of reviewing every translation. If you have just translated a massive documentation site for your software product, check whether user engagement has increased. Apply data and analytics to user activity rather than focusing only on translation quality.
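
To make the conversion-rate idea in the third point concrete, here is a minimal sketch that compares conversion before and after translated reviews go live. The function names and the visitor and conversion counts are hypothetical. The two-proportion z-statistic is one common way to judge whether a change is larger than random noise; it is used here as an assumption, not as a method the article prescribes.

```python
import math


def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who converted."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def relative_lift(rate_before: float, rate_after: float) -> float:
    """Relative change in conversion rate, e.g. 0.2 means 20% higher."""
    if rate_before <= 0:
        raise ValueError("rate_before must be positive")
    return (rate_after - rate_before) / rate_before


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """z-statistic for the difference between two conversion rates."""
    p_a = conversion_rate(conv_a, n_a)
    p_b = conversion_rate(conv_b, n_b)
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0  # no variation in the data, so no measurable difference
    return (p_b - p_a) / std_err


if __name__ == "__main__":
    # Hypothetical counts for one market, before and after translation.
    before = (50, 1000)  # (conversions, visitors)
    after = (80, 1000)
    lift = relative_lift(conversion_rate(*before), conversion_rate(*after))
    z = two_proportion_z(*before, *after)
    print(f"relative lift: {lift:.1%}, z-statistic: {z:.2f}")
```

An absolute z-statistic above roughly 1.96 suggests the difference is unlikely to be chance at the usual 5% level. Other changes made in the same period, such as pricing or seasonality, can also move conversion, so a result like this is a signal to investigate rather than proof of translation quality.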

New content types provide new business opportunities. When you handle large amounts of content for your users, you have a unique opportunity to add value to that content by making it multilingual or even language-neutral. Translating UGC turns content into a greater asset for reaching and engaging more users. As the technology and processes in the globalization and localization industry advance, many global brands have more channels to communicate with users in multiple countries.

Hanna Kanabiajeuskaja is Product Manager (Localization) at Box.

Hanna took part as a special guest panelist in the discussion “Quality Validation for Network Generated Content” at the Welocalize LocLeaders Forum 2016 in Montreal.