This article is part of a series on Knowledge Representation and Semantic Web.

For a long time, articles, book chapters, and blog posts have been the sources most often cited. However, with rapid technological advancements, photographs and videos have become increasingly common, even in the scientific community, especially for demonstrating how a proposed system or experiment works. These media are often shared alongside related research articles, so it is becoming essential to develop standardized ways of citing them. In this article, I explore how multimedia is being incorporated into scientific literature, current practices for citing such sources, and how citation metadata can be further enhanced to support these formats.

Note: The abstract of this article was initially written as a proposal for WikiCite 2017.

Photographs and videos are playing an increasingly important role in scientific research, particularly in disciplines like computer science and engineering. They are often used to visually demonstrate how a system functions, providing insights that text alone cannot convey.

However, citing these media sources is not as straightforward as citing traditional, text-based resources such as journal articles, book chapters, or blog posts. This article examines the current practices for citing photographs and videos in scientific literature, the challenges involved, and the need for improved metadata standards.

Current State of Citation for Photographs and Videos

Currently, photographs and videos are cited in scientific works, but practices vary widely. Some researchers simply mention the media in the text, while others provide a more complete reference, including details such as the creator, date of creation, title, and source.

For example, a typical citation for a photograph might look like:

Smith, J. (2020). Title of Photograph. Retrieved from URL

And a citation for a video might be:

Doe, J. (2021). Title of Video. Retrieved from URL

However, these formats often lack consistency and standardization, making it difficult to accurately retrieve or verify the media. This inconsistency is particularly problematic in scientific contexts where traceability and reproducibility are critical.
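One way to reduce this inconsistency is to keep the citation as structured data and generate the text form from it. The following is a minimal sketch in Python, assuming a hypothetical MediaCitation record whose fields are the ones named above (creator, date of creation, title, and source). The class, its field names, and the example URL are illustrative and not part of any citation standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaCitation:
    """Hypothetical record holding the citation details named in the text."""

    creator: str
    year: int
    title: str
    source_url: str

    def render(self) -> str:
        # Mirrors the "Creator (Year). Title. Retrieved from URL" pattern above.
        return (
            f"{self.creator} ({self.year}). {self.title}. "
            f"Retrieved from {self.source_url}"
        )


# Placeholder values; example.org is not a real media repository.
photo = MediaCitation("Smith, J.", 2020, "Title of Photograph", "https://example.org/photo")
print(photo.render())
```

Because every citation is rendered from the same fields, a missing creator or date shows up as a missing value rather than as a silently shorter reference.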

Challenges in Citing Photographs and Videos

One major challenge is the absence of standardized citation formats for multimedia. While traditional text-based resources follow well-established styles (e.g., APA, MLA, Chicago), multimedia sources do not benefit from such uniformity.

Furthermore, the dynamic and ephemeral nature of online media adds complexity. URLs may change or become obsolete, making it difficult to retrieve the original content. This undermines the reproducibility and long-term verifiability of scientific work.

Enhancing Citation Metadata for Photographs and Videos

To overcome these challenges, enhanced citation metadata is needed for photographs and videos. Such metadata could include standardized fields that identify not only the media item itself but also the exact part of it being cited.

Consider a use case where a researcher wants to refer to a specific part of a photograph, for example a particular region of a plant. In such cases, using a Figure field along with a spatial annotation helps identify the exact portion being referenced.

Unfortunately, HTML lacks a native mechanism for such spatial targeting. The International Image Interoperability Framework (IIIF), however, offers a solution. Its Image API lets a URL request a rectangular region of an image, and its Presentation API relies on selectors from the W3C Web Annotation Data Model to identify specific regions within an image. This approach enables precise and reproducible referencing of visual content.
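The region-based referencing described above can be sketched in Python. The example builds an IIIF Image API URL for a rectangular region and the matching Web Annotation fragment selector. The server address and image identifier are hypothetical, the helper names are my own, and the "max" size keyword assumes an Image API 3.0 server (2.x servers use "full" instead).

```python
from urllib.parse import quote

# URI that identifies the W3C Media Fragments syntax used in selector values.
MEDIA_FRAGS = "http://www.w3.org/TR/media-frags/"


def iiif_region_url(server: str, identifier: str, x: int, y: int, w: int, h: int) -> str:
    """Build an IIIF Image API URL that returns only the given pixel region.

    URL pattern: {server}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    if x < 0 or y < 0 or w <= 0 or h <= 0:
        raise ValueError("region needs a non-negative origin and a positive size")
    region = f"{x},{y},{w},{h}"
    # Assumption: Image API 3.0, where "max" requests the largest available size.
    return f"{server.rstrip('/')}/{quote(identifier, safe='')}/{region}/max/0/default.jpg"


def region_selector(x: int, y: int, w: int, h: int) -> dict:
    """Describe the same region as a Web Annotation FragmentSelector."""
    return {
        "type": "FragmentSelector",
        "conformsTo": MEDIA_FRAGS,
        "value": f"xywh={x},{y},{w},{h}",
    }


# Hypothetical server and identifier, used only for illustration.
url = iiif_region_url("https://iiif.example.org/image", "plant-001", 120, 80, 400, 300)
selector = region_selector(120, 80, 400, 300)
print(url)
print(selector["value"])
```

The URL form is convenient when a reader should be able to open the cited region directly, while the selector form keeps the region as metadata next to the rest of the citation.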

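Similarly, for videos, a researcher may wish to reference a specific segment, for example a few seconds showing a reaction or interaction. HTML5 provides elements such as audio, video, and source, and browsers can start playback at an offset given as a temporal media fragment on the URL (for example #t=30,45), but HTML itself offers no way to record a citation of a specific timecode or sequence.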

IIIF again provides a promising solution: the same selector mechanism can reference a specific time segment within audiovisual content. This facilitates the creation of detailed citations for video sequences, just as for image regions.
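A time segment can be expressed in the same way. The sketch below, again in Python, formats a temporal fragment in the W3C Media Fragments syntax and wraps it in a Web Annotation fragment selector. The helper names and the video URL are hypothetical, and times are assumed to be given in seconds.

```python
# URI that identifies the W3C Media Fragments syntax used in selector values.
MEDIA_FRAGS = "http://www.w3.org/TR/media-frags/"


def _seconds(value: float) -> str:
    """Format seconds with up to millisecond precision and no trailing zeros."""
    return f"{value:.3f}".rstrip("0").rstrip(".")


def time_fragment(start: float, end: float) -> str:
    """Return a temporal media fragment such as 't=30,45.5'."""
    if start < 0 or end <= start:
        raise ValueError("segment needs 0 <= start < end")
    return f"t={_seconds(start)},{_seconds(end)}"


def segment_selector(start: float, end: float) -> dict:
    """Describe the segment as a Web Annotation FragmentSelector."""
    return {
        "type": "FragmentSelector",
        "conformsTo": MEDIA_FRAGS,
        "value": time_fragment(start, end),
    }


def segment_url(video_url: str, start: float, end: float) -> str:
    """Append the temporal fragment to a media URL."""
    return f"{video_url}#{time_fragment(start, end)}"


# Hypothetical video location, used only for illustration.
print(segment_selector(30, 45.5)["value"])
print(segment_url("https://example.org/experiment.mp4", 30, 45.5))
```

Keeping the start and end times as numbers until the last step makes it easy to validate a segment before it ends up in a citation.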

Standardizing citation practices for such granular referencing would not only improve the quality and precision of citations, but also support better discoverability, reuse, and reproducibility in scientific research.

Conclusion

As the use of photographs and videos continues to expand in scientific communication, it is vital to develop standardized and robust citation methods for these media types. Enhancing metadata and adopting interoperable standards like IIIF can address current limitations and ensure that multimedia references are precise, persistent, and reproducible.
