Localization and multilingual content
Localization[1] and support for multilingual environments[2] in Semantic MediaWiki were improved in Semantic MediaWiki 2.4.0 (released on 9 July 2016, compatible with MW 1.19.0 - 1.27.x) and more or less finalized in Semantic MediaWiki 2.5.0 (released on 14 March 2017, compatible with MW 1.23.0 - 1.29.x).[3]
Preface
Support for multilingual content in Semantic MediaWiki rests on the assumption that each content page relates to one specific content language, and that pages with similar content in different languages link to each other by means of a semantic relation rather than a simple literal declaration.
Whether this conceptual framework is an acceptable practice is left to users to decide, but Semantic MediaWiki can support multilingual content scenarios to the degree outlined here.
Technical exposition
Global assumptions about the user and content language were removed from the internal data providers so that values can be formatted for an explicit language, which can be either the user language or the content language.
The datatype "Monolingual text", which holds a text value that associates the annotation with a specific language code, allows Semantic MediaWiki to store text segments together with a specific language. It is used by:
- Property descriptions, which are annotations that provide a localized description of a property so that users have a clear idea of what the property represents and can apply it correctly. For example, dwc:kingdom shows its description in the selected user language, if available.
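The idea of storing a text together with a language code, and showing it only when it matches the selected user language, can be sketched as follows. This is a minimal illustration in Python, not Semantic MediaWiki's implementation: the function name `description_for` and the sample descriptions are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical model of "Monolingual text" values: each entry pairs a text
# with a language code. The texts below are made-up sample data.
descriptions: List[Tuple[str, str]] = [
    ("Area of a city", "en"),
    ("Fläche einer Stadt", "de"),
]


def description_for(values: List[Tuple[str, str]], user_language: str) -> Optional[str]:
    """Return the text stored for user_language, or None if none is available."""
    for text, language_code in values:
        if language_code == user_language:
            return text
    return None


print(description_for(descriptions, "de"))  # Fläche einer Stadt
print(description_for(descriptions, "fr"))  # None
```

Returning `None` mirrors the "if available" condition above; what a real wiki shows when no description exists in the user language is not covered by this sketch.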
Multilingual content
In the past, the only way to achieve a multilingual wiki was to split it on a per-language basis and loosely interconnect the content via interwiki links. While this may work for large installations, it seems unreasonable to expect maintainers of smaller sites to apply the same concept.
A general obstacle to providing localisable content is, and was, that the content language (or site language) is global and determines the rules by which content is interpreted editorially (e.g. the separators "." vs. "," in numbers, fonts, ltr vs. rtl etc.).
The extension "Semantic Interlanguage Links", which allows creating and managing interlanguage links with semantic annotations, can help mitigate this limitation when used together with Semantic MediaWiki. It allows pages to link to each other "semantically", not only by means of so-called interwiki links, and also to declare a specific page content language. Because the global content language no longer takes precedence over the content of a particular page (the page content language can be entirely different from the global content language), a page has more editorial leverage to apply different rules.
The concept of an individual page content language is important because each page (and thereby its content) can declare a dependency on a selected language and its rules. For example, a page declared to be in French can create annotations using French rules, so users keep the writing style of the denoted content language.
The sandbox demonstrates this concept more clearly: with the site language set to French (which would require all numeric annotations to carry "," as decimal separator), pages that denote their own page content language (e.g. Berlin) are no longer restricted to a "French" content interpretation.
Interlanguagelinks and content language annotations
When the extension "Semantic Interlanguage Links" is used to interlink pages (e.g. with {{interlanguagelink:en|Berlin}}, which links pages that refer to the same Berlin as special property "Interlanguage reference") and en is explicitly denoted as the language, the content is given expository freedom over the editorial preference.
"Interlanguage reference" is a predefined property that contains an arbitrary reference and is provided by the extension "Semantic Interlanguage Links". Entities with the same reference are interlinked and expected to represent similar content in different languages. This property is pre-deployed (also known as a special property) and comes with additional administrative privileges, but can be used just like any other user-defined property.
As of Semantic MediaWiki 2.4.0, Semantic MediaWiki understands that the page content language takes precedence over the global content language. In the earlier example, "." is now identified as the numeric decimal separator, and annotations such as [[Has area::891.85 km²]] are interpreted in the denoted language instead of the global one (which is French).
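The precedence rule can be illustrated with a small sketch. It is a deliberately strict toy parser written in Python under stated assumptions, not Semantic MediaWiki's actual parsing code: the names `effective_language` and `parse_number` are hypothetical, only the two languages from the example are covered, and signs, units and digit grouping are ignored.

```python
# Assumed decimal separators for the two languages used in the example.
DECIMAL_SEPARATOR = {"en": ".", "fr": ","}


def effective_language(site_language, page_language=None):
    """A declared page content language wins over the global site language."""
    return page_language or site_language


def parse_number(text, site_language, page_language=None):
    """Parse an unsigned decimal string using the effective language's separator."""
    separator = DECIMAL_SEPARATOR[effective_language(site_language, page_language)]
    whole, _, fraction = text.partition(separator)
    if not whole.isdigit() or (fraction and not fraction.isdigit()):
        raise ValueError(f"{text!r} does not use {separator!r} as decimal separator")
    return float(f"{whole}.{fraction or 0}")


# Site language French, page content language declared as English.
print(parse_number("891.85", site_language="fr", page_language="en"))  # 891.85
# Site language French, no page content language declared.
print(parse_number("891,85", site_language="fr"))  # 891.85
```

With the site language set to French and no page content language declared, `parse_number("891.85", site_language="fr")` raises a `ValueError` in this sketch, which corresponds to the restriction that the page content language lifts.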
Localization
Supporting multilingual content from an editorial perspective is important, but localization is another significant factor in providing a better reading experience. Improvements were made here as well, so that when a user chooses a specific user language, the formatting of query results is available in a localized version (where allowed and possible).
- The #LOCL display formatter was added as an output format option to signal to a value that its display properties are to be in a localized context.
- Special page "Browse" (shows all properties and their values annotated to a page) and special page "SearchByProperty" (allows searching the wiki by properties or property value combinations) have been made aware of the user context and, if available, will show localized values.
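What "localized context" means for a displayed value can be sketched as below. This Python snippet is only an illustration: the function `display`, its `localized` flag (standing in for a request such as #LOCL), the boolean labels and the fallback to the content language when no localization is requested are all assumptions of this sketch, not documented Semantic MediaWiki behaviour.

```python
# Assumed formatting data for two languages; labels are illustrative only.
DECIMAL_SEPARATOR = {"en": ".", "fr": ","}
BOOLEAN_LABELS = {"en": ("true", "false"), "fr": ("vrai", "faux")}


def display(value, user_language, content_language, localized=False):
    """Format a value for the user language only when localization is requested."""
    language = user_language if localized else content_language
    # bool must be checked first because it is a subclass of int in Python.
    if isinstance(value, bool):
        true_label, false_label = BOOLEAN_LABELS[language]
        return true_label if value else false_label
    return f"{value:.2f}".replace(".", DECIMAL_SEPARATOR[language])


# A French-speaking user reading a page whose content language is English.
print(display(891.85, "fr", "en", localized=True))  # 891,85
print(display(891.85, "fr", "en"))                  # 891.85
print(display(True, "fr", "en", localized=True))    # vrai
```

Keeping the localization request explicit, rather than always formatting for the user language, reflects the "where allowed and possible" qualification above.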
Examples
- Localization of numeric and quantity values[4]
- Localization of boolean values[5]
- Localization of date values[6]
- Hint property usage with descriptions in different languages using special property "Has property description", which adds localizable context help to properties[7]
- Berlin[8] showcases the influence of the page content language, specifically on the parsing of numbers
- Importing vocabulary and 导入词汇 demonstrate how similar content in different languages can be linked together using the extension "Semantic Interlanguage Links" (contains additional examples).
See also
- Info page on Multilingual Semantic MediaWiki
- Help page on content, page and user language
- Help page on extension "Semantic Interlanguage Links", which links content of different languages together and automatically sets the page content language as the preferred content language
- Help page on datatype "Monolingual text"
- Help page on special property "Has property description"
- Help page on special property "Has preferred property label" (adds localizable labels to a property)
- Help page on special property "Language code" (handles BCP47-conformant language codes specifying the language of the annotated text)
- Semantic MediaWiki: GitHub issue #3391, comment 491514739 – Multilingual ask queries involving string modifiers, i.e. "intro=", "outro=", "mainlabel=" and "searchlabel="
References
- [1] ...the process of translating a product into different languages or adapting a product for a specific country or region
- [2] Semantic MediaWiki: GitHub issue #594
- [3] Semantic MediaWiki: GitHub issue #594, comment 264713467
- [4] Semantic MediaWiki: GitHub issue #1591 example - Localization of numeric and quantity values
- [5] Semantic MediaWiki: GitHub issue #1580 example - Localization of boolean values
- [6] Semantic MediaWiki: GitHub issue #1533 example - Localization of date values
- [7] Semantic MediaWiki: GitHub issue #1533 example - Localization of property descriptions in a user context
- [8] Berlin to demonstrate the difference between page content and global content language.