Localization/subpage

From Wikiversity

Language

For the purpose of language planning and defining language sets for new or existing products, a language or language variant should be defined as the outcome of localization work for a product following a certain set of conventions:

  • Engineering conventions: Being enabled in the localization infrastructure (for example, the translation management system, engineering system, and build system) and marked with a unique language identifier (for example, en-US)
  • Translation conventions: Following a common orthography in a certain writing system and, quite commonly in the translation industry, the rules of a style guide and a terminology set
  • UI conventions: Following a certain set of conventions around user interface adjustment, like mirroring or font sizes
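As an illustration only, the three sets of conventions above could be recorded per language in a small data structure. The sketch below is a hypothetical model, not part of any particular localization system: the class name, field names, style guide name, and sample entries are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LanguageDefinition:
    """One language or language variant as a localization target (hypothetical model)."""
    identifier: str             # engineering convention: unique identifier, e.g. "en-US"
    writing_system: str         # translation convention: script used by the orthography
    style_guide: Optional[str]  # translation convention: style guide name, if one exists
    mirrored_ui: bool           # UI convention: whether the layout is mirrored


def find_duplicate_identifiers(languages):
    """Return identifiers that occur more than once; each one is meant to be unique."""
    seen, duplicates = set(), set()
    for lang in languages:
        if lang.identifier in seen:
            duplicates.add(lang.identifier)
        seen.add(lang.identifier)
    return sorted(duplicates)


# Sample language set; the entries are illustrative assumptions.
language_set = [
    LanguageDefinition("en-US", "Latin", "house-style-en", False),
    LanguageDefinition("ar-SA", "Arabic", None, True),
]
```

A check such as find_duplicate_identifiers(language_set) could run whenever a language is added to the set, so that two variants never share one identifier.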

The written language of an application or software product might differ from the language of its spoken components (for example, French (Canada) for speech and French (France) for written text).

Common confusions pertaining to language in terms of localization

There are a few sources of confusion when speaking of languages in terms of localization, most prominently:

  • Confusion about writing systems: some languages may use two (or more) official writing systems and/or orthographies. For instance, Serbian is written in both the Cyrillic and Latin alphabets. In the context of localization, then, Serbian Cyrillic and Serbian Latin count as two distinct languages.
  • Confusion about “localization infrastructure languages” and written languages: in some cases there is a discrepancy between the “engineering” code of a language and the actual language. For example, a translation tool might enforce the use of a language with the code es-ES, but in fact the released products are supposed to be in an “international” version of Spanish.
  • Confusion about written and spoken languages: in terms of reach, the numbers cited for language use are most often those for the use of a spoken language (for example, from Ethnologue), while the definition of language sets mostly focuses on written language. For some languages the distinction between written and spoken is highly relevant; prominent examples include Arabic and Chinese.
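The first two points can be made concrete with a short sketch. It assumes that languages are identified with BCP 47-style tags, in which sr-Cyrl and sr-Latn distinguish the two Serbian scripts. The parser is deliberately simplified (it handles only a language, an optional script, and an optional region), and the mapping from an enforced code to the variant actually shipped is a hypothetical table invented for this example.

```python
def split_language_tag(tag):
    """Split a tag such as "sr-Cyrl" into (language, script, region).

    Simplified sketch: handles language[-script][-region] only,
    not the full BCP 47 grammar.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    script = None
    region = None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()   # four letters: script subtag
        elif len(part) == 2 and part.isalpha():
            region = part.upper()   # two letters: region subtag
        elif len(part) == 3 and part.isdigit():
            region = part           # three digits: numeric region subtag
    return language, script, region


# Hypothetical table: code enforced by a tool -> variant actually released.
SHIPPED_VARIANT = {
    "es-ES": "Spanish (international)",
}


def shipped_variant(infrastructure_code):
    """Return the released variant, or the code itself when no override exists."""
    return SHIPPED_VARIANT.get(infrastructure_code, infrastructure_code)
```

Keeping the override table separate from the tag itself makes the discrepancy explicit instead of leaving it as undocumented team knowledge.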

When a product is said to be released in a certain language, what does that mean?

Language means

  1. the language that the user interface, documentation, or a website is written in
  2. and/or written language for text input by the user
  3. and/or spoken language in language output (for example, in videos)
  4. and/or recognized spoken language for language input via voice recognition
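Because "released in a language" can refer to any of these four senses, it may help to track them separately. The following is a minimal sketch under stated assumptions: the Modality names, the sample product, and its language lists are all hypothetical and chosen only to show that one product can support different languages in different senses.

```python
from enum import Enum


class Modality(Enum):
    """The four senses of 'language' listed above (names are assumptions)."""
    UI_TEXT = "user interface, documentation, or website text"
    TEXT_INPUT = "written text input by the user"
    SPEECH_OUTPUT = "spoken language output"
    SPEECH_INPUT = "voice recognition input"


# Hypothetical product: language identifiers supported per modality.
PRODUCT_LANGUAGES = {
    Modality.UI_TEXT: {"fr-FR", "en-US"},
    Modality.TEXT_INPUT: {"fr-FR", "fr-CA", "en-US"},
    Modality.SPEECH_OUTPUT: {"fr-CA", "en-US"},
    Modality.SPEECH_INPUT: {"en-US"},
}


def modalities_for(language_id, product=PRODUCT_LANGUAGES):
    """List the modalities in which the product supports the given language."""
    return [m for m in Modality if language_id in product.get(m, set())]
```

With this sample data, modalities_for("fr-CA") yields text input and spoken output only, which is the kind of detail a bare statement like "released in French" hides.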

Usage means: a user can use a language when

  1. They can understand the UI and find it appealing enough to use it in the language. For example, a highly academic Hindi translation might be neither understood nor appealing and therefore go unused, with users choosing English instead.
  2. They can enter text where text input is required in a spelling system they are familiar with, most often through education, sometimes through the use of printed material.
  3. They can understand spoken language wherever spoken language is used. In most cases this will be a standard variant of the language (if such a standard exists) or one that the user is familiar with through television or radio.
  4. They can use spoken language where spoken language input is required and be understood. Dialects or accents might impede that understanding.