Identifying Linked Open Vocabularies for Personal Website: John Samuel

This article is part of a series on Knowledge Representation and Semantic Web.

Integrating Linked Data into websites enhances their semantic expressiveness, making content more interoperable and machine-readable. While Schema.org is among the most widely adopted vocabularies for this purpose, it is by no means the only one available. The Semantic Web encourages the reuse of existing vocabularies, tailored to specific domains such as academic publishing, social interactions, or institutional archiving.

A key principle of Linked Data is vocabulary reuse. Rather than defining new schemas in isolation, publishers are encouraged to adopt established vocabularies to ensure consistency, interoperability, and long-term sustainability. This approach also facilitates data integration across domains and improves the discoverability of web content by semantic agents.

A central resource supporting this effort is the Linked Open Vocabularies (LOV) registry [1]. LOV indexes a wide range of well-established vocabularies, providing information about their usage, interdependencies, and adoption status. It is particularly useful when selecting vocabularies that best align with the purpose and context of your content.

Among the many vocabularies available, Schema.org is one of the most commonly used for enriching web content. It is widely supported by major search engines and offers a broad hierarchy of types and properties suitable for general-purpose content.

For blog-related content, Schema.org provides specific types that are particularly relevant:

schema:BlogPosting – describes individual blog posts, including headline, author, content body, and date metadata.
schema:Article – a general-purpose type for written works such as essays, reports, or editorials.
schema:NewsArticle – tailored to news content with additional metadata like dateline and print section.
schema:CreativeWork – a superclass encompassing all forms of creative content, useful when no specific subtype applies.
schema:Person – used to describe the author or contributor, including name, URL, and identifiers.
schema:Organization – describes the publisher, affiliation, or associated institution.
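
The types listed above are arranged in a hierarchy, so a consumer that understands schema:Article can still process a schema:BlogPosting. The Python sketch below hard-codes the relevant slice of the published Schema.org hierarchy to make this concrete. The names `PARENT`, `ancestors`, and `is_subtype_of` are illustrative helpers, not part of any Schema.org tooling.

```python
# Direct parent of each type, following the published Schema.org hierarchy.
# Only the types discussed in this article are included.
PARENT = {
    "BlogPosting": "SocialMediaPosting",
    "SocialMediaPosting": "Article",
    "NewsArticle": "Article",
    "Article": "CreativeWork",
    "CreativeWork": "Thing",
    "Person": "Thing",
    "Organization": "Thing",
}


def ancestors(type_name: str) -> list:
    """Return the chain of supertypes, from the nearest parent up to Thing."""
    chain = []
    while type_name in PARENT:
        type_name = PARENT[type_name]
        chain.append(type_name)
    return chain


def is_subtype_of(type_name: str, candidate: str) -> bool:
    """True if type_name is the candidate itself or one of its subtypes."""
    return type_name == candidate or candidate in ancestors(type_name)


if __name__ == "__main__":
    print(ancestors("BlogPosting"))
    print(is_subtype_of("BlogPosting", "CreativeWork"))
    print(is_subtype_of("Person", "CreativeWork"))
```

In practice, choosing the most specific type that fits (here schema:BlogPosting) keeps the annotation usable by tools that only know the more general types.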

Schema.org does not define dedicated types for headings or sections. Page regions can be typed as schema:WebPageElement (for example schema:WPHeader or schema:WPFooter), but within an article this structure is more commonly carried by properties such as articleSection and articleBody.

The following example illustrates a JSON-LD snippet using the schema:BlogPosting type to annotate a blog article. This includes key metadata such as title, author, publisher, publication date, and keywords.

{
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": "Integrating Linked Data into a Blog",
    "alternativeHeadline": "Enhancing Semantic Structure with Schema.org and LOV",
    "author": {
        "@type": "Person",
        "name": "John Samuel",
        "url": "https://johnsamuel.info",
        "sameAs": [
            "https://orcid.org/...",
            "https://www.wikidata.org/wiki/..."
        ]
    },
    "publisher": {
        "@type": "Organization",
        "name": "John Samuel Publications",
        "logo": {
            "@type": "ImageObject",
            "url": "https://johnsamuel.info/images/logo/favicon.png",
            "width": 60,
            "height": 60
        }
    },
    "datePublished": "2020-05-03T19:04:28Z",
    "dateModified": "2024-06-01T12:30:00Z",
    "description": "An overview of how ... semantically enrich a blog.",
    "articleBody": "In recent years, I have begun integrating ...",
    "mainEntityOfPage": "https://johnsamuel.info/blog/linked-data-integration",
    "keywords": [
        "Linked Data",
        "Semantic Web",
        "Schema.org",
        "BlogPosting",
        "Linked Open Vocabularies"
    ],
    "image": "https://johnsamuel.info/images/blog/linkeddata-integration.jpg"
}
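
To be read by crawlers, a JSON-LD description like the one above is usually embedded in the page inside a `<script type="application/ld+json">` element. The following Python sketch shows one way to generate that element with the standard library only. The function name `to_jsonld_script` and the trimmed-down `post` dictionary are illustrative assumptions, not an existing API.

```python
import json


def to_jsonld_script(data: dict) -> str:
    """Serialize a JSON-LD dictionary into an embeddable <script> element."""
    payload = json.dumps(data, indent=4, ensure_ascii=False)
    # Escape "</" so that a string value can never close the <script>
    # element early; "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# A trimmed-down version of the BlogPosting description shown above.
post = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": "Integrating Linked Data into a Blog",
    "author": {
        "@type": "Person",
        "name": "John Samuel",
        "url": "https://johnsamuel.info",
    },
    "datePublished": "2020-05-03T19:04:28Z",
}

if __name__ == "__main__":
    print(to_jsonld_script(post))
```

The resulting string can be placed in the `<head>` or `<body>` of the blog post's HTML page, for instance by a static site generator template.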

Alternative vocabularies such as Dublin Core Terms, FOAF, and SIOC offer well-established and interoperable models for describing web content. These vocabularies are widely used in the Linked Data ecosystem and are registered in the Linked Open Vocabularies (LOV) catalog.

Below is a brief overview of these vocabularies:

Prefix	Vocabulary	Purpose
`dct`	Dublin Core Terms	General-purpose metadata: `title`, `creator`, `date`, `subject`, etc.
`foaf`	FOAF (Friend of a Friend)	Describes people, their profiles, online accounts, and social relationships.
`sioc`	SIOC (Semantically-Interlinked Online Communities)	Models online discussion forums, blogs, posts, and user interactions.
`xsd`	XML Schema Datatypes	Provides datatype specifications such as `xsd:dateTime`.

These vocabularies are especially valuable when content is intended to be indexed in academic repositories or knowledge graphs. They offer semantic precision, open governance, and widespread compatibility with RDF tools.

The following JSON-LD example demonstrates how to annotate a blog post using these open vocabularies. It models the post as a sioc:Post, provides general metadata via Dublin Core, and uses FOAF to describe the author.

{
    "@context": {
        "dct": "http://purl.org/dc/terms/",
        "foaf": "http://xmlns.com/foaf/0.1/",
        "sioc": "http://rdfs.org/sioc/ns#",
        "xsd": "http://www.w3.org/2001/XMLSchema#"
    },
    "@type": "sioc:Post",
    "dct:title": "Integrating Linked Data Using Open Vocabularies",
    "dct:description": "A conceptual overview of ...",
    "dct:created": {
        "@value": "2020-05-03T19:04:28Z",
        "@type": "xsd:dateTime"
    },
    "dct:modified": {
        "@value": "2024-06-01T12:30:00Z",
        "@type": "xsd:dateTime"
    },
    "dct:creator": {
        "@type": "foaf:Person",
        "foaf:name": "John Samuel",
        "foaf:homepage": {
            "@id": "https://johnsamuel.info"
        },
        "foaf:img": {
            "@id": "https://johnsamuel.info/images/portrait.jpg"
        }
    },
    "sioc:has_container": {
        "@type": "http://rdfs.org/sioc/types#Weblog",
        "dct:title": "John Samuel's Blog",
        "foaf:homepage": {
            "@id": "https://johnsamuel.info/blog"
        }
    },
    "sioc:content": "This blog post explores ...",
    "dct:subject": [
        "Linked Data",
        "Dublin Core",
        "FOAF",
        "SIOC",
        "Semantic Web"
    ]
}
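
The prefixes declared in the `@context` above are shorthand: each compact name such as `dct:title` stands for a full IRI obtained by concatenating the namespace and the local name. The Python sketch below illustrates that single step. It is a deliberate simplification, not a JSON-LD processor, and the helper name `expand_term` is hypothetical.

```python
# Prefix-to-namespace mapping copied from the "@context" of the example above.
CONTEXT = {
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand_term(term: str, context: dict) -> str:
    """Expand a compact IRI such as "dct:title" into a full IRI.

    Terms without a declared prefix (keywords like "@type", or absolute
    IRIs such as "https://...") are returned unchanged.
    """
    prefix, separator, local_name = term.partition(":")
    if separator and prefix in context:
        return context[prefix] + local_name
    return term


if __name__ == "__main__":
    for term in ("sioc:Post", "dct:title", "foaf:name", "xsd:dateTime", "@type"):
        print(f"{term:14} -> {expand_term(term, CONTEXT)}")
```

The expanded IRIs are what RDF tools actually compare, which is why two documents using different prefix labels for the same namespace remain interoperable.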

Using these vocabularies allows for a more modular and standards-compliant representation of metadata, and supports integration into datasets beyond commercial search engines. Their use is common in institutional repositories, digital preservation frameworks, and community-driven publishing platforms.
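
Since the two examples in this article describe the same kind of post, one description can be derived from the other. The Python sketch below re-expresses a Schema.org BlogPosting dictionary with Dublin Core, FOAF, and SIOC terms. The property pairing simply mirrors the two examples and is an assumption, not an official alignment between the vocabularies. For instance, `datePublished` is paired with `dct:created` only because the examples do so, and `dct:issued` may be a closer match for a publication date. The function name is hypothetical.

```python
import json

CONTEXT = {
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def blogposting_to_open_vocabularies(post: dict) -> dict:
    """Re-express a Schema.org BlogPosting using Dublin Core, FOAF and SIOC."""
    result = {"@context": CONTEXT, "@type": "sioc:Post"}

    # Plain values: Schema.org property -> open-vocabulary property.
    for source_key, target_key in (
        ("headline", "dct:title"),
        ("description", "dct:description"),
        ("articleBody", "sioc:content"),
        ("keywords", "dct:subject"),
    ):
        if source_key in post:
            result[target_key] = post[source_key]

    # Dates are given an explicit xsd:dateTime datatype.
    for source_key, target_key in (
        ("datePublished", "dct:created"),
        ("dateModified", "dct:modified"),
    ):
        if source_key in post:
            result[target_key] = {
                "@value": post[source_key],
                "@type": "xsd:dateTime",
            }

    # The author becomes a foaf:Person attached through dct:creator.
    author = post.get("author")
    if isinstance(author, dict):
        creator = {"@type": "foaf:Person"}
        if "name" in author:
            creator["foaf:name"] = author["name"]
        if "url" in author:
            creator["foaf:homepage"] = {"@id": author["url"]}
        result["dct:creator"] = creator

    return result


if __name__ == "__main__":
    example = {
        "@type": "BlogPosting",
        "headline": "Integrating Linked Data into a Blog",
        "datePublished": "2020-05-03T19:04:28Z",
        "author": {"@type": "Person", "name": "John Samuel",
                   "url": "https://johnsamuel.info"},
    }
    print(json.dumps(blogposting_to_open_vocabularies(example), indent=4))
```

Properties without a counterpart in the second example, such as `publisher` or `image`, are left out here rather than guessed.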

By incorporating semantic annotations into blog content, web publishers can significantly enhance both the discoverability and machine interpretability of their material. Search engines, aggregators, and digital libraries increasingly rely on structured data to index and understand content beyond traditional keyword analysis.

More broadly, integrating Linked Data into personal or organizational websites aligns with the foundational vision of the Semantic Web: a globally connected web of data that can be queried, reused, and reasoned about across domains. Whether for scholarly communication, decentralized publishing, or digital preservation, reusing existing vocabularies enables your content to become part of a larger, interoperable ecosystem.

While Schema.org remains a pragmatic and widely supported choice, especially for general-purpose and SEO-driven use cases, it is important to recognize the strengths of established open vocabularies such as Dublin Core, FOAF, and SIOC. These provide richer models for representing authorship, context, and relationships within and across documents.

For developers and content creators seeking to enrich their websites with Linked Data, the Linked Open Vocabularies (LOV) portal serves as a curated gateway to well-maintained and reusable vocabularies. Using LOV, one can explore how vocabularies relate to each other, track their adoption, and make informed decisions about which ones to apply in a given context.

References

[1] Linked Open Vocabularies (LOV)
[2] LOV: Managing Vocabulary Dependencies
[3] Vandenbussche, Pierre-Yves, et al. “Linked Open Vocabularies (LOV): A Gateway to Reusable Semantic Vocabularies on the Web.” Semantic Web, vol. 8, no. 3, Jan. 2017, pp. 437–52. https://doi.org/10.3233/SW-160213