This article is part of a series on Knowledge Representation and Semantic Web.

Integrating Linked Data into websites enhances their semantic expressiveness, making content more interoperable and machine-readable. While Schema.org is among the most widely adopted vocabularies for this purpose, it is by no means the only one available. The Semantic Web encourages the reuse of existing vocabularies, tailored to specific domains such as academic publishing, social interactions, or institutional archiving.

A key principle of Linked Data is vocabulary reuse. Rather than defining new schemas in isolation, publishers are encouraged to adopt established vocabularies to ensure consistency, interoperability, and long-term sustainability. This approach also facilitates data integration across domains and improves the discoverability of web content by semantic agents.

A central resource supporting this effort is the Linked Open Vocabularies (LOV) registry[1]. LOV indexes a wide range of well-established vocabularies, providing information about their usage, interdependencies, and adoption status. It is particularly useful when selecting vocabularies that best align with the purpose and context of your content.

Among the many vocabularies available, Schema.org is one of the most commonly used for enriching web content. It is widely supported by major search engines and offers a broad hierarchy of types and properties suitable for general-purpose content.

For blog-related content, Schema.org provides specific types that are particularly relevant, such as schema:Blog for the blog as a whole and schema:BlogPosting for individual posts.

Schema.org does not define dedicated types for headings or sections. Such structural elements are usually left to the HTML markup, with the text itself carried by properties such as articleBody; schema:WebPageElement and its subtypes cover page-level parts such as headers, footers, and sidebars.

The following example illustrates a JSON-LD snippet using the schema:BlogPosting type to annotate a blog article. This includes key metadata such as title, author, publisher, publication date, and keywords.

{
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": "Integrating Linked Data into a Blog",
    "alternativeHeadline": "Enhancing Semantic Structure with Schema.org and LOV",
    "author": {
        "@type": "Person",
        "name": "John Samuel",
        "url": "https://johnsamuel.info",
        "sameAs": [
            "https://orcid.org/...",
            "https://www.wikidata.org/wiki/..."
        ]
    },
    "publisher": {
        "@type": "Organization",
        "name": "John Samuel Publications",
        "logo": {
            "@type": "ImageObject",
            "url": "https://johnsamuel.info/images/logo/favicon.png",
            "width": 60,
            "height": 60
        }
    },
    "datePublished": "2020-05-03T19:04:28Z",
    "dateModified": "2024-06-01T12:30:00Z",
    "description": "An overview of how ... semantically enrich a blog.",
    "articleBody": "In recent years, I have begun integrating ...",
    "mainEntityOfPage": "https://johnsamuel.info/blog/linked-data-integration",
    "keywords": [
        "Linked Data",
        "Semantic Web",
        "Schema.org",
        "BlogPosting",
        "Linked Open Vocabularies"
    ],
    "image": "https://johnsamuel.info/images/blog/linkeddata-integration.jpg"
}
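
To be picked up by crawlers, a JSON-LD document like the one above is typically embedded in the page inside a script element of type application/ld+json. The following is a minimal Python sketch of that step, not a description of how this site is actually generated: the helper name jsonld_script_tag is hypothetical, and the post dictionary is a trimmed-down copy of the example.

```python
import json

def jsonld_script_tag(data: dict) -> str:
    """Serialize a JSON-LD document into an HTML <script> element (hypothetical helper)."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so that a string value cannot close the script element early.
    # "\/" is a valid JSON escape for "/", so the data is unchanged after parsing.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'

# Trimmed-down version of the BlogPosting example (only a few fields kept).
post = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": "Integrating Linked Data into a Blog",
    "datePublished": "2020-05-03T19:04:28Z",
}

print(jsonld_script_tag(post))
```

The resulting element can be placed in the head or body of the page; a static site generator or template engine would usually take care of this step.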
                

Alternative vocabularies such as Dublin Core Terms, FOAF, and SIOC offer well-established and interoperable models for describing web content. These vocabularies are widely used in the Linked Data ecosystem and are registered in the Linked Open Vocabularies (LOV) catalog.

Below is a brief overview of these vocabularies:

| Prefix | Vocabulary | Purpose |
|--------|------------|---------|
| dct | Dublin Core Terms | General-purpose metadata: title, creator, date, subject, etc. |
| foaf | FOAF (Friend of a Friend) | Describes people, their profiles, online accounts, and social relationships. |
| sioc | SIOC (Semantically-Interlinked Online Communities) | Models online discussion forums, blogs, posts, and user interactions. |
| xsd | XML Schema Datatypes | Provides datatype specifications such as xsd:dateTime. |

These vocabularies are especially valuable when content is intended to be indexed in academic repositories or knowledge graphs. They offer semantic precision, open governance, and widespread compatibility with RDF tools.

The following JSON-LD example demonstrates how to annotate a blog post using these open vocabularies. It models the post as a sioc:Post, provides general metadata via Dublin Core, and uses FOAF to describe the author.

{
    "@context": {
        "dct": "http://purl.org/dc/terms/",
        "foaf": "http://xmlns.com/foaf/0.1/",
        "sioc": "http://rdfs.org/sioc/ns#",
        "xsd": "http://www.w3.org/2001/XMLSchema#"
    },
    "@type": "sioc:Post",
    "dct:title": "Integrating Linked Data Using Open Vocabularies",
    "dct:description": "A conceptual overview of ...",
    "dct:created": {
        "@value": "2020-05-03T19:04:28Z",
        "@type": "xsd:dateTime"
    },
    "dct:modified": {
        "@value": "2024-06-01T12:30:00Z",
        "@type": "xsd:dateTime"
    },
    "dct:creator": {
        "@type": "foaf:Person",
        "foaf:name": "John Samuel",
        "foaf:homepage": {
            "@id": "https://johnsamuel.info"
        },
        "foaf:img": {
            "@id": "https://johnsamuel.info/images/portrait.jpg"
        }
    },
    "sioc:has_container": {
        "@type": "http://rdfs.org/sioc/types#Weblog",
        "dct:title": "John Samuel's Blog",
        "foaf:homepage": {
            "@id": "https://johnsamuel.info/blog"
        }
    },
    "sioc:content": "This blog post explores ...",
    "dct:subject": [
        "Linked Data",
        "Dublin Core",
        "FOAF",
        "SIOC",
        "Semantic Web"
    ]
}
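
The prefixes declared in the @context are shorthand: a consumer reads dct:title as the full IRI http://purl.org/dc/terms/title. The Python sketch below illustrates that idea on a trimmed-down copy of the example. It is a simplification under stated assumptions, not the JSON-LD expansion algorithm: the functions expand_term and expand_keys are hypothetical, and only simple prefix maps are handled.

```python
def expand_term(term: str, context: dict) -> str:
    """Expand a compact IRI such as 'dct:title' using a prefix map (simplified)."""
    prefix, sep, local = term.partition(":")
    # Values like 'https://...' are already absolute IRIs and are left alone.
    if sep and prefix in context and not local.startswith("//"):
        return context[prefix] + local
    return term  # keywords, absolute IRIs, and unknown prefixes pass through

def expand_keys(node, context: dict):
    """Recursively expand property names and @type values; drops @context."""
    if isinstance(node, list):
        return [expand_keys(item, context) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key == "@context":
            continue
        if key == "@type" and isinstance(value, str):
            out[key] = expand_term(value, context)
        else:
            out[expand_term(key, context)] = expand_keys(value, context)
    return out

context = {
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}

# Trimmed-down version of the sioc:Post example.
post = {
    "@type": "sioc:Post",
    "dct:title": "Integrating Linked Data Using Open Vocabularies",
    "dct:created": {"@value": "2020-05-03T19:04:28Z", "@type": "xsd:dateTime"},
    "dct:creator": {"@type": "foaf:Person", "foaf:name": "John Samuel"},
}

expanded = expand_keys(post, context)
print(expanded["@type"])  # http://rdfs.org/sioc/ns#Post
```

For real data, a conforming JSON-LD processor should be preferred, since full expansion also covers term definitions, @vocab, remote contexts, and value coercion.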
                

Using these vocabularies allows for a more modular and standards-compliant representation of metadata, and supports integration into datasets beyond commercial search engines. Their use is common in institutional repositories, digital preservation frameworks, and community-driven publishing platforms.

By incorporating semantic annotations into blog content, web publishers can significantly enhance both the discoverability and machine interpretability of their material. Search engines, aggregators, and digital libraries increasingly rely on structured data to index and understand content beyond traditional keyword analysis.
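
To give a sense of how a consumer might read such annotations, here is a minimal Python sketch that pulls embedded JSON-LD out of an HTML page using only the standard library. It assumes the data sits in script elements of type application/ld+json; the class JsonLdExtractor, the function extract_jsonld, and the sample page are hypothetical, and real crawlers are considerably more tolerant of malformed input.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collects the parsed content of every application/ld+json script element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so it is buffered.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False

def extract_jsonld(html: str) -> list:
    """Return all JSON-LD documents embedded in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents

# Hypothetical sample page with a single embedded annotation.
page = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "BlogPosting",
 "headline": "Integrating Linked Data into a Blog"}
</script>
</head><body><p>Post content</p></body></html>"""

for doc in extract_jsonld(page):
    print(doc["@type"], "-", doc["headline"])
```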

More broadly, integrating Linked Data into personal or organizational websites aligns with the foundational vision of the Semantic Web: a globally connected web of data that can be queried, reused, and reasoned about across domains. Whether for scholarly communication, decentralized publishing, or digital preservation, reusing existing vocabularies enables your content to become part of a larger, interoperable ecosystem.

While Schema.org remains a pragmatic and widely supported choice, especially for general-purpose and SEO-driven use cases, it is important to recognize the strengths of established open vocabularies such as Dublin Core, FOAF, and SIOC. These provide richer models for representing authorship, context, and relationships within and across documents.
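
The two approaches are not mutually exclusive. JSON-LD allows @context to be an array, so a Schema.org annotation can declare additional prefixes and carry properties from other vocabularies alongside its Schema.org ones. The Python sketch below illustrates this; the helper add_prefixes is hypothetical, and how any particular search engine treats the extra properties is not something this example establishes.

```python
import json

def add_prefixes(doc: dict, prefixes: dict) -> dict:
    """Return a copy of doc whose @context also declares the given prefixes."""
    ctx = doc.get("@context", [])
    ctx_list = list(ctx) if isinstance(ctx, list) else [ctx]
    return {**doc, "@context": ctx_list + [prefixes]}

# Trimmed-down Schema.org annotation.
post = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": "Integrating Linked Data into a Blog",
}

# Declare the Dublin Core Terms prefix, then use one of its properties.
merged = add_prefixes(post, {"dct": "http://purl.org/dc/terms/"})
merged["dct:created"] = "2020-05-03T19:04:28Z"

print(json.dumps(merged, indent=2))
```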

For developers and content creators seeking to enrich their websites with Linked Data, the Linked Open Vocabularies (LOV) portal serves as a curated gateway to well-maintained and reusable vocabularies. Using LOV, one can explore how vocabularies relate to each other, track their adoption, and make informed decisions about which ones to apply in a given context.

References

  1. Linked Open Vocabularies (LOV)
  2. LOV: Managing Vocabulary Dependencies
  3. Vandenbussche, Pierre-Yves, et al. “Linked Open Vocabularies (LOV): A Gateway to Reusable Semantic Vocabularies on the Web.” Semantic Web, vol. 8, no. 3, Jan. 2017, pp. 437–52. https://doi.org/10.3233/SW-160213