What is the best database design for multi-language data?

Cemal Can Akgül
KoçSistem
Published in
4 min readSep 9, 2021

--

In this article, a couple of approaches will be discussed to design the data model for localization to be able to manage your content in different languages.

Approaches

1. Column Approach — 1

This solution is the simplest. The main purpose it created an additional column for each text that needs to be translated.

Column Approach — 1
Column Approach — 1

Advantages

  • Simplicity — Easy to implement
  • Easy querying — no JOINs required
  • No duplicates — don’t have duplicate content (there is only one row for each record and only the language columns are duplicated)

Disadvantages

  • Hard to maintain — works in an easy way for 2–3 languages, but it becomes really hard when you have a lot of columns or a lot of languages
  • Hard to add a new language — adding a new language requires schema changes (and special access rights for DB users) for each table with multilanguage content
  • Store space — if not all translations are required (e.g. at some places default language should always be used) it may cause redundant data or empty DB fields
  • Need to build the watch — what column you are working with depending on the language

2. Column Approach — 2

This solution is the simplest. The main purpose it created an additional column for each text that needs to be translated.

Column Approach — 2

Advantages

  • Simplicity — Easy to implement
  • Easy querying — no JOINs required
  • No duplicates — don’t have duplicate content (there is only one row for each record and only the language columns are duplicated)
  • Proper normalization — seems like a clean, relational approach

Disadvantages

  • Hard to maintain — works in an easy way for 2–3 languages, but it becomes really hard when you have a lot of columns or a lot of languages
  • Hard to add a new language — adding new language requires schema changes (and special access rights for DB user) for each table with multilanguage content
  • Hard to queries — difficult to write queries will be reflected in development time and can also cost maintenance.

3. Multirow Approach

This solution is similar to the one above, but instead of duplicating the content in columns, it does it in rows.

Multirow Approach

Advantages

  • Simplicity — Easy to implement
  • Easy querying — no JOINs required

Disadvantages

  • Hard to maintain — every column that is not translated must be changed in all rows for each language. e.g changing the price for a single product requires repeating of this operation for all languages
  • Hard to add a new language — requires repeating insertion operation for each language (cloning the record for default language)
  • Duplicate content — you will have a lot of duplicate content for all the columns that are not translated

4. Single Translation Table Approach

This solution seems to be the cleanest one from the database structure perspective. You store all texts that need to be translated into a single translation table. It is more suited for dynamic websites and which have a large number of languages or which intend to add a new language in the future and want to do it with ease.

Single Translation Table Approach

Advantages

  • Proper normalization — seems like a clean, relational approach
  • Ease in adding a new language — doesn’t require schema changes
  • All translations in one place — readable/maintainable database

Disadvantages

  • Complex querying — multiple joins required to retrieve correct product description
  • Hard to maintain — overcomplicated querying on all operations: insertion, removing, and updating
  • All translations in one place — one missing table leads to global problems

5. Additional Translation Table Approach

This is a variation of the above approach and it seems to be easier to maintain and work with. For each table that stores information that may need to be translated an additional table is created. The original table stores only language insensitive data and the new one all translated info.

Additional Translation Table Approach

Advantages

  • Proper normalization — seems like a clean, relational approach
  • Ease in adding a new language — doesn’t require schema changes
  • Columns keep their names — doesn’t require “_lang” suffixes or something else
  • Easy to query — relatively simple querying (only one JOIN is required)

Disadvantages

  • May double the number of tables — You have to create translation tables for all your tables that have columns that need to be translated

Summary

I can’t say this all types of approaches are good or bad. The choice must include your time, project, fast development needs or etc. But If I must choose one, I’ll choose an additional translation table approach. Because I believe this approach is more flexible than others.

--

--

Cemal Can Akgül
KoçSistem

Senior Software Architect @KocSistem ▪ Full-stack ▪ Digital Technology Transformation — Architecture Chapter Leader