From Open Collaboration to Customized Control

Transitioning from Wikidata to Wikibase

John Samuel


FOSDEM 2025
Track: Collaboration and Content Management
February 1, 2025
Website: https://johnsamuel.info

Creative Commons License

Goals

  • Key Themes: Open collaboration, self-hosted data management, flexibility, and integration
  • Why Wikibase?
    • Wikidata: Open, multilingual, and collaborative
    • Wikibase: Tailored control for specific data needs
  • Objective: Explore how Wikibase balances independence with interconnectivity in the open data landscape

John SAMUEL

  • Associate Professor, Software Design and Big Data, CPE Lyon
  • Active contributor to Wikidata and Wikimedia Commons
  • Research Interests and Themes: Knowledge Representation, Semantic Web, Web Services, Data Integration, Data Warehouse, Distributed Systems, Geographic Information Systems
  • Courses: C Programming, Algorithms in C, Data Mining and Machine Learning, Artificial Intelligence and Deep Learning, Operating Systems and Concurrent Programming, Web Languages
  • Thesis: Integration of data from web services

Wikidata

Launched in 2012, Wikidata is a free, open, collaborative, and multilingual knowledge base of structured, linked data.

Evolution of Wikipedia sites: From multilingual Wikipedia sites with multiple subdomains to a multilingual Wikidata site with a single domain.

Wikipedia and Wikidata

Wikipedia articles (one per language) and the corresponding Wikidata item:

  • Cat: https://www.wikidata.org/wiki/Q146
    • https://fr.wikipedia.org/wiki/Chat
    • https://en.wikipedia.org/wiki/Cat
    • https://es.wikipedia.org/wiki/Gato
    • https://pt.wikipedia.org/wiki/Gato
  • Example: https://www.wikidata.org/wiki/Q14944328
    • https://fr.wikipedia.org/wiki/Example
    • https://en.wikipedia.org/wiki/Example
    • https://es.wikipedia.org/wiki/Ejemplo
    • https://pt.wikipedia.org/wiki/Exemplo
  • Lyon: https://www.wikidata.org/wiki/Q456
    • https://fr.wikipedia.org/wiki/Lyon
    • https://en.wikipedia.org/wiki/Lyon
    • https://es.wikipedia.org/wiki/Lyon
    • https://pt.wikipedia.org/wiki/Lyon

Wikipedia: Multilingual Articles

The Infobox is a key component of Wikipedia articles. It provides a structured and concise summary of essential information about a topic.

Wikidata: Potential issues

Wikidata: Labels, Descriptions, and Aliases

Wikidata items carry labels, descriptions, and aliases. A label is the primary name of an item in a given language. A description briefly states what the item is, and aliases are synonyms or alternative names that make the item easier to find in search.

Example: FOSDEM (Q475430)
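
A minimal sketch of how labels, descriptions, and aliases can be kept per language and used for search. The nesting follows the general shape of the JSON that Wikibase exposes for an item's terms. The description and alias strings are illustrative values, not copied from the live item, and `matches` is a hypothetical helper.

```python
# Terms of an item, keyed by language code (illustrative values).
fosdem_terms = {
    "id": "Q475430",
    "labels": {"en": {"language": "en", "value": "FOSDEM"}},
    "descriptions": {
        "en": {"language": "en", "value": "annual free software conference"}
    },
    "aliases": {
        "en": [
            {
                "language": "en",
                "value": "Free and Open Source Software Developers' European Meeting",
            }
        ]
    },
}


def matches(terms: dict, query: str, lang: str = "en") -> bool:
    """Return True if the query equals the label or any alias in `lang`."""
    wanted = query.casefold()
    candidates = []
    label = terms["labels"].get(lang)
    if label is not None:
        candidates.append(label["value"])
    candidates.extend(alias["value"] for alias in terms["aliases"].get(lang, []))
    return any(wanted == candidate.casefold() for candidate in candidates)


print(matches(fosdem_terms, "fosdem"))
print(matches(fosdem_terms, "annual free software conference"))
```

In this sketch only labels and aliases take part in matching; the description is kept for display and disambiguation.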

Wikidata Item: Labels and Properties

Properties define the characteristics or relationships of items, for example date of birth, place of birth, or gender. They enable precise structuring of the information attached to an item.

Example: FOSDEM (Q475430)

Properties

FOSDEM - Wikidata Properties

Property        Value
instance of     annual conference
                convention series
                recurring event
inception       3 February 2001
location        Université Libre de Bruxelles
continent       Europe
country         Belgium

FOSDEM - Wikidata Properties

Property            Value
instance of (P31)   annual conference
                    convention series
                    recurring event
inception (P571)    3 February 2001
location (P276)     Université Libre de Bruxelles
continent (P30)     Europe
country (P17)       Belgium

FOSDEM (Q475430) - Properties of Wikidata

Property            Value
instance of (P31)   annual conference (Q56220509)
                    convention series (Q15900647)
                    recurring event (Q15275719)
inception (P571)    3 February 2001
location (P276)     Université Libre de Bruxelles (Q574606)
continent (P30)     Europe (Q46)
country (P17)       Belgium (Q31)
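
The table above can be read as a mapping from property IDs to lists of values, since a property such as P31 may hold several values. The following is a rough sketch under that reading. `values_of` and `is_instance_of` are hypothetical helpers, and the inception date is simplified to an ISO string.

```python
# The statement table as a mapping: property ID -> list of value IDs.
# Simplification (assumption): the inception date is kept as an ISO string,
# whereas a real Wikibase time value also carries precision and calendar.
FOSDEM_STATEMENTS = {
    "P31": ["Q56220509", "Q15900647", "Q15275719"],  # instance of
    "P571": ["2001-02-03"],  # inception
    "P276": ["Q574606"],  # location
    "P30": ["Q46"],  # continent
    "P17": ["Q31"],  # country
}


def values_of(statements: dict, property_id: str) -> list:
    """Return all values recorded for a property (empty list if none)."""
    return list(statements.get(property_id, []))


def is_instance_of(statements: dict, class_id: str) -> bool:
    """Check whether any 'instance of' (P31) value equals class_id."""
    return class_id in values_of(statements, "P31")


print(values_of(FOSDEM_STATEMENTS, "P31"))
print(is_instance_of(FOSDEM_STATEMENTS, "Q56220509"))
```

Using IDs rather than labels keeps the data independent of language: the same statements can be displayed with labels in any language.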

Properties, Qualifiers and References

Wikidata: External Identifiers

External Identifiers: FOSDEM (Q475430)

Linked Open Data

Linked Open Data (LOD) is an approach to publishing data that connects heterogeneous datasets openly, making information easier to discover and reuse.

Key Principles

  • Unique Identifiers (URIs): Each resource is uniquely identified using URIs.
  • RDF Model (Resource Description Framework): Structuring data in the form of triples (subject-predicate-object) to represent relationships.
  • SPARQL: A standardized query language and protocol for RDF data.
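
To make the first two principles concrete, here is a small sketch that writes two FOSDEM statements as subject-predicate-object triples with full URIs and serializes them as N-Triples lines. The two namespace URIs are the ones Wikidata uses for entities and direct property claims; `to_ntriples` is a hypothetical helper, not part of any library.

```python
# Namespaces used by Wikidata for entities and direct ("truthy") claims.
WD = "http://www.wikidata.org/entity/"
WDT = "http://www.wikidata.org/prop/direct/"

# (subject, predicate, object) triples: every part is a URI.
triples = [
    (WD + "Q475430", WDT + "P31", WD + "Q56220509"),  # FOSDEM instance of annual conference
    (WD + "Q475430", WDT + "P17", WD + "Q31"),  # FOSDEM country Belgium
]


def to_ntriples(rows: list) -> str:
    """Serialize URI-only triples, one '<s> <p> <o> .' line per triple."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in rows)


print(to_ntriples(triples))
```

Because every part of a triple is a URI, another dataset can point at the same entity simply by reusing its URI.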

Objectives

  • Data Interconnection: Facilitate the linking of different data sources, providing a global and coherent view.
  • Accessibility and Openness: Encourage the public availability of data with open licenses promoting their use.

Linked Open Data: Representation of relations (2009) [1]
Linked Open Data: Representation of relations (2010) [2]
  1. https://commons.wikimedia.org/wiki/File:Lod-datasets_2009-07-14.svg
  2. https://commons.wikimedia.org/wiki/File:Lod-datasets_2010-09-22_colored.png

Linked Open Data: Representation of relations (October 2024)

SPARQL queries

Web Interface

SPARQL Queries

Identifiers of annual conferences.

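# Items that are an instance of (P31) annual conference (Q56220509)
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?annualconference WHERE {
  ?annualconference wdt:P31 wd:Q56220509.
}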
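
Outside the web interface, the same query can be sent to the Wikidata Query Service over HTTP. The sketch below only builds the request URL, so it runs offline. It assumes the public endpoint `https://query.wikidata.org/sparql` with its `query` and `format` parameters, and relies on the `wd:` and `wdt:` prefixes that this service predefines. `build_query_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?annualconference WHERE {
  ?annualconference wdt:P31 wd:Q56220509.
}
"""


def build_query_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    params = {"query": query.strip(), "format": "json"}
    return f"{endpoint}?{urlencode(params)}"


url = build_query_url(QUERY)
print(url)
```

Fetching the URL with any HTTP client should return the results as a JSON document; the request itself is left out here.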

Wikibase

What if we could choose just a selection of Wikidata's features? The answer is Wikibase, the open-source software that powers Wikidata.

Examples: Creative Commons (CC)

FOSDEM - Relevant Properties

Property            Value
instance of (P31)   annual conference (Q56220509)
                    convention series (Q15900647)
                    recurring event (Q15275719)
inception (P571)    3 February 2001
location (P276)     Université Libre de Bruxelles (Q574606)
continent (P30)     Europe (Q46)
country (P17)       Belgium (Q31)
video               Link to FOSDEM website

Personal Articles

Web Interface

Course Pages

Web Interface

Property

Web Interface

SPARQL Queries on Wikibase

Web Interface

SPARQL queries

Complete list of my properties.

PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?property ?label WHERE {
  ?property rdf:type wikibase:Property;
    rdfs:label ?label.
  FILTER((LANG(?label)) = "en")
}
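
A SPARQL endpoint typically returns the result of such a query in the standard SPARQL JSON results format. The sketch below flattens a response of that shape into plain rows. The sample response is hypothetical (an `example.org` property with a made-up label), and `rows` is a hypothetical helper.

```python
import json

# Hypothetical response in the SPARQL JSON results format.
SAMPLE_RESPONSE = json.loads("""
{
  "head": {"vars": ["property", "label"]},
  "results": {"bindings": [
    {"property": {"type": "uri", "value": "https://example.org/entity/P1"},
     "label": {"type": "literal", "xml:lang": "en", "value": "example property"}}
  ]}
}
""")


def rows(response: dict) -> list:
    """Flatten bindings into dicts that map variable name -> plain value."""
    variables = response["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in response["results"]["bindings"]
    ]


for row in rows(SAMPLE_RESPONSE):
    print(row["property"], row["label"])
```

Variables left unbound in a solution are simply absent from that row, which is why the helper checks `var in binding`.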

SPARQL Queries

List of courses sorted by academic year

PREFIX wd: <https://jsamwrites.wikibase.cloud/entity/>
PREFIX wdt: <https://jsamwrites.wikibase.cloud/prop/direct/>
SELECT DISTINCT ?item ?title ?url ?year WHERE {
  ?item wdt:P3 ?url;
    wdt:P27 ?title;
    wdt:P10 ?time;
    wdt:P29 wd:Q1043.
  BIND(YEAR(?time) AS ?year)
}
ORDER BY DESC (?year)

Federated Query towards Wikidata

Web Interface
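
A federated query combines local Wikibase data with Wikidata through the SPARQL 1.1 `SERVICE` keyword. The sketch below only assembles such a query as text. It assumes a hypothetical local property (`P999` in the usage line) whose value is the URI of the matching Wikidata entity, and it assumes the local query service is allowed to contact the Wikidata endpoint, which depends on the instance's configuration. `federated_label_query` is a hypothetical helper.

```python
# Local namespace as used in the queries of this Wikibase instance.
LOCAL_DIRECT = "https://jsamwrites.wikibase.cloud/prop/direct/"
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def federated_label_query(link_property: str, lang: str = "en") -> str:
    """Build a query that fetches Wikidata labels for locally linked items.

    link_property is a hypothetical local property holding the URI of the
    corresponding Wikidata entity.
    """
    return f"""PREFIX wdt: <{LOCAL_DIRECT}>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?item ?wikidataItem ?label WHERE {{
  ?item wdt:{link_property} ?wikidataItem.
  SERVICE <{WIKIDATA_ENDPOINT}> {{
    ?wikidataItem rdfs:label ?label.
    FILTER(LANG(?label) = "{lang}")
  }}
}}"""


print(federated_label_query("P999"))
```

The part inside `SERVICE` is evaluated by Wikidata, so the local instance does not need to copy labels or other data that Wikidata already maintains.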

Wikibase Cloud

Wikibase Cloud - Discovery

Goals

  • Key Takeaways:
    • Wikibase offers structured, self-hosted data management
    • Combines Wikidata’s strengths with customization and control
    • Seamless integration with global entities and linked open data
    • Wikibase for personal or institutional data needs
  • Next Steps:
    • Explore integration possibilities with other sources
    • Leverage open data while maintaining independence

Thank You!
