Multilingual

Language Tags

Understand BCP 47 language tags and use the LANG() function to filter, compare, and work with labels in specific languages.

Common Language Codes

Wikidata uses BCP 47 language tags. The most common are two-letter ISO 639-1 codes.

Code  Language    Code  Language
en    English     zh    Chinese
fr    French      ja    Japanese
de    German      ko    Korean
es    Spanish     ar    Arabic
pt    Portuguese  ru    Russian
it    Italian     hi    Hindi

The LANG() Function

Use LANG(?variable) to get the language tag of a literal value; for a literal that has no language tag, it returns an empty string. Combined with FILTER, this lets you select labels in specific languages.
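
Before filtering on a tag, it can help to see which tags an item's labels actually carry. The following is a minimal sketch, not a definitive recipe: it lists every label of wd:Q2005 (JavaScript, the item used later on this page) next to the value LANG() returns for it. The PREFIX lines are included only so the query is self-contained; the Wikidata Query Service is assumed to predefine them, as the other queries here rely on.

```sparql
PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# List each label of one item together with its language tag
SELECT ?label (LANG(?label) AS ?tag)
WHERE {
  wd:Q2005 rdfs:label ?label .   # Q2005 = JavaScript
}
ORDER BY ?tag
LIMIT 50
```

The ?tag column shows the exact strings that a FILTER(LANG(?label) = "...") comparison would have to match.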

Get Labels in a Specific Language
SELECT ?lang ?frenchLabel
WHERE {
  ?lang wdt:P31 wd:Q9143 .
  ?lang rdfs:label ?frenchLabel .

  # Filter to French labels only
  FILTER(LANG(?frenchLabel) = "fr")
}
LIMIT 20

Multiple Language Labels

Compare Labels Across Languages
SELECT ?lang ?en ?fr ?de ?es
WHERE {
  VALUES ?lang {
    wd:Q2005    # JavaScript
    wd:Q28865   # Python
    wd:Q161053  # Ruby
  }

  OPTIONAL { ?lang rdfs:label ?en . FILTER(LANG(?en) = "en") }
  OPTIONAL { ?lang rdfs:label ?fr . FILTER(LANG(?fr) = "fr") }
  OPTIONAL { ?lang rdfs:label ?de . FILTER(LANG(?de) = "de") }
  OPTIONAL { ?lang rdfs:label ?es . FILTER(LANG(?es) = "es") }
}

Regional Language Variants

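Some languages have variants, indicated by subtags after the base language code. A subtag can name a region (pt-br, en-gb) or a script (zh-hans, zh-hant). Wikidata writes these tags in lowercase.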

Code Variant
zh-hans Simplified Chinese
zh-hant Traditional Chinese
pt-br Brazilian Portuguese
en-gb British English

Chinese Variants
SELECT ?city ?simplified ?traditional
WHERE {
  VALUES ?city { wd:Q956 wd:Q8686 wd:Q1490 }  # Beijing, Shanghai, Tokyo

  OPTIONAL { ?city rdfs:label ?simplified .
             FILTER(LANG(?simplified) = "zh-hans") }
  OPTIONAL { ?city rdfs:label ?traditional .
             FILTER(LANG(?traditional) = "zh-hant") }
}
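
Not every item has a label under each variant tag, which is why the query above wraps each lookup in OPTIONAL. One possible way to fall back from a variant to the generic tag is COALESCE, which returns the first argument that is bound. This is a minimal sketch under the assumption that a plain "zh" label is an acceptable substitute when no "zh-hans" label exists; the variable names ?generic and ?label are illustrative.

```sparql
PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Prefer the Simplified Chinese label; fall back to the generic "zh" label
SELECT ?city (COALESCE(?simplified, ?generic) AS ?label)
WHERE {
  VALUES ?city { wd:Q956 wd:Q8686 wd:Q1490 }  # Beijing, Shanghai, Tokyo

  OPTIONAL { ?city rdfs:label ?simplified .
             FILTER(LANG(?simplified) = "zh-hans") }
  OPTIONAL { ?city rdfs:label ?generic .
             FILTER(LANG(?generic) = "zh") }
}
```

If neither label exists for an item, ?label is simply left unbound for that row.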

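The LANGMATCHES() Function

Use LANGMATCHES() to compare a language tag against a language range rather than an exact string. A range such as "zh" matches the tag "zh" itself and every tag that starts with "zh-", regardless of case. The special range "*" matches any non-empty language tag.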

Match Any Chinese Variant
SELECT ?item ?label (LANG(?label) AS ?langTag)
WHERE {
  ?item wdt:P31 wd:Q515 ;  # city
        wdt:P17 wd:Q148 ;    # China
        rdfs:label ?label .

  # Match any Chinese variant (zh, zh-hans, zh-hant, etc.)
  FILTER(LANGMATCHES(LANG(?label), "zh"))
}
LIMIT 30
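
To make the difference from an exact comparison visible, the sketch below evaluates both tests on the labels of a single item, wd:Q956 (Beijing, used earlier on this page). It is an illustration only; the column names ?exact and ?ranged are hypothetical and chosen for this example.

```sparql
PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Compare an exact tag test with a language-range test on the same labels
SELECT ?label ?tag ?exact ?ranged
WHERE {
  wd:Q956 rdfs:label ?label .               # Q956 = Beijing
  BIND(LANG(?label) AS ?tag)
  BIND(?tag = "zh" AS ?exact)               # true only for the tag "zh"
  BIND(LANGMATCHES(?tag, "zh") AS ?ranged)  # true for "zh" and any "zh-..." tag
  FILTER(?ranged)                           # keep only the Chinese labels
}
```

Rows where ?exact is false but ?ranged is true are the variant-tagged labels that a plain = comparison would have skipped.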