Crossref Metadata Schema

Name	Crossref Metadata Schema
Established	2000s
Owner	Crossref
Type	Metadata standard
Domain	Scholarly publishing, digital object identifiers

Contents

Overview
History and Development
Schema Structure and Elements
Metadata Registration and Submission Processes
Persistent Identifiers and Related Standards
Validation, Quality Control, and Best Practices
Use Cases and Applications

The Crossref Metadata Schema is a formal metadata specification used in scholarly publishing to describe publications, their contributors, and the relationships among them, and to register those descriptions against persistent identifiers. It enables interoperability among publishers, libraries, repositories, and indexing services by defining structured fields for titles, authors, affiliations, funding, licenses, and related works. The schema interacts with international standards and infrastructures that support citation linking, discovery, and preservation.

Overview

The schema provides a machine-readable model that maps publishing entities to persistent identifiers and controlled vocabularies, facilitating DOI-based linking across platforms such as PubMed Central, Scopus, Web of Science, and the sites of Crossref members. Implementations often interface with registries and infrastructures like ORCID, DataCite, Portico, CLOCKSS, and LOCKSS to coordinate persistent access, preservation, and author attribution. Publishers and aggregators adopt the schema to ensure compatibility with services including Google Scholar, Dimensions, JSTOR, and library systems from vendors such as ProQuest and EBSCO.

History and Development

Development began alongside the rise of digital identifiers, in efforts led by organizations tied to scholarly communication. These included collaborations among the International DOI Foundation, the Publishers International Linking Association (PILA, the legal entity behind Crossref), and major publishing groups like Elsevier, Springer Nature, Wiley, and Taylor & Francis. Iterations of the schema have been informed by standards from bodies such as NISO and ISO, and by projects like Crossmark, FundRef (now the Crossref Funder Registry), and COUNTER. Amendments responded to technological shifts, exemplified by initiatives from ORCID, the introduction of linked-data practices from W3C, and preservation collaborations with LOCKSS/CLOCKSS partners.

Schema Structure and Elements

Core elements cover bibliographic metadata fields for works (titles, subtitles), contributor records (authors, editors), affiliations (institutions, research centers), dates (publication, deposit), and identifiers (DOI, ISBN). Relationship elements express citation links, corrections, retractions, and versions, connecting to records in PubMed, arXiv, Zenodo, Figshare, and Dryad. Funding metadata maps to registries like the Crossref Funder Registry and aligns with grant identifiers used by agencies such as NIH, ERC, and the Wellcome Trust. Rights and license elements reference standards from organizations including Creative Commons and legal repositories like the Legal Information Institute.
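The grouping of elements described above can be pictured as a small data model. The following is a minimal sketch only: the class and field names (Work, Contributor, Relation, funder_ids, and so on) are illustrative assumptions and are not the schema's actual element names, and the DOI and funder identifier are placeholder values.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Contributor:
    """A person attached to a work (hypothetical field names)."""
    given_name: str
    surname: str
    role: str = "author"                 # e.g. "author" or "editor"
    orcid: Optional[str] = None
    affiliations: List[str] = field(default_factory=list)


@dataclass
class Relation:
    """A typed link from one work to another record."""
    kind: str                            # e.g. "cites", "is-corrected-by"
    target_id: str                       # identifier of the related record


@dataclass
class Work:
    """Bibliographic core of a registered item."""
    doi: str
    title: str
    subtitle: Optional[str] = None
    contributors: List[Contributor] = field(default_factory=list)
    publication_date: Optional[str] = None   # ISO 8601 date string
    funder_ids: List[str] = field(default_factory=list)
    license_url: Optional[str] = None
    relations: List[Relation] = field(default_factory=list)

    def authors(self) -> List[Contributor]:
        """Return only contributors whose role is 'author'."""
        return [c for c in self.contributors if c.role == "author"]


# Example record using placeholder values.
work = Work(
    doi="10.5555/example.1",
    title="An Example Article",
    contributors=[
        Contributor("Josiah", "Carberry", orcid="0000-0002-1825-0097"),
        Contributor("Ada", "Editor", role="editor"),
    ],
    publication_date="2020-01-15",
    funder_ids=["placeholder-funder-id"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
    relations=[Relation("is-corrected-by", "10.5555/example.2")],
)
```

Keeping relations as typed links rather than free text is what lets corrections, retractions, and versions be followed programmatically.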

Metadata Registration and Submission Processes

Publishers register metadata during DOI assignment workflows using APIs and batch deposits, often coordinated through platforms maintained by Crossref members and vendors like Atypon and HighWire. Submissions are structured payloads that conform to a specific schema version (XML for deposits, with JSON commonly used when metadata is retrieved through APIs), and they pass through validation endpoints and submission portals analogous to the services offered by DataCite and to ORCID API integrations. Workflow partners include production houses, service providers, and publishers such as Cambridge University Press (CUP) and Oxford University Press (OUP), which automate metadata feeds into discovery systems including WorldCat and institutional repositories managed with DSpace or EPrints.
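To make the idea of a deposit payload concrete, the sketch below assembles a heavily simplified XML document with Python's standard library. The element names follow the general shape of a Crossref journal-article deposit as an assumption. The fragment omits the namespace, schema version, depositor details, and other elements that a real deposit needs, so it would not pass validation against the published XSD. The function name build_deposit and all values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_deposit(batch_id: str, doi: str, url: str, title: str,
                  given: str, surname: str, year: int) -> str:
    """Return a simplified, illustrative deposit document as a string."""
    root = ET.Element("doi_batch")

    # Batch-level header (a real deposit carries more fields here).
    head = ET.SubElement(root, "head")
    ET.SubElement(head, "doi_batch_id").text = batch_id

    # One journal article inside the body.
    body = ET.SubElement(root, "body")
    journal = ET.SubElement(body, "journal")
    article = ET.SubElement(journal, "journal_article")

    titles = ET.SubElement(article, "titles")
    ET.SubElement(titles, "title").text = title

    contributors = ET.SubElement(article, "contributors")
    person = ET.SubElement(contributors, "person_name",
                           sequence="first", contributor_role="author")
    ET.SubElement(person, "given_name").text = given
    ET.SubElement(person, "surname").text = surname

    pub_date = ET.SubElement(article, "publication_date")
    ET.SubElement(pub_date, "year").text = str(year)

    # The DOI and the landing-page URL it should resolve to.
    doi_data = ET.SubElement(article, "doi_data")
    ET.SubElement(doi_data, "doi").text = doi
    ET.SubElement(doi_data, "resource").text = url

    return ET.tostring(root, encoding="unicode")


payload = build_deposit(
    batch_id="batch-0001",
    doi="10.5555/example.1",
    url="https://example.org/articles/1",
    title="An Example Article",
    given="Josiah",
    surname="Carberry",
    year=2020,
)
```

Building the tree with an XML library rather than string concatenation means special characters in titles and names are escaped automatically.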

Persistent Identifiers and Related Standards

The schema is tightly coupled with persistent identifier systems: DOIs governed by the International DOI Foundation and issued through registration agencies, researcher identifiers from ORCID, and dataset identifiers coordinated with DataCite. It interoperates with identifier schemes and semantic standards promoted by W3C, and with metadata vocabularies such as Schema.org and the Dublin Core subsets used by consortia like SPARC. Citation and linking practices align with initiatives such as Crossmark and FundRef, and with persistent access programs like Portico.
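Systems that exchange these identifiers usually normalize and sanity-check them first. The sketch below shows one possible approach: it strips common resolver prefixes from a DOI and lowercases it (a common convention, since DOIs are case-insensitive), and it verifies the final character of an ORCID iD using the ISO 7064 MOD 11-2 checksum that ORCID documents. The regular expressions are simplifying assumptions and will not match every DOI ever registered. The function names are hypothetical.

```python
import re

# Simplified pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
DOI_PREFIX = re.compile(r"^(https?://(dx\.)?doi\.org/|doi:)", re.IGNORECASE)


def normalize_doi(raw: str) -> str:
    """Strip resolver prefixes and lowercase; raise ValueError if malformed."""
    doi = DOI_PREFIX.sub("", raw.strip()).lower()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognizable DOI: {raw!r}")
    return doi


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the format and checksum of an ORCID iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "").upper()
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]
```

For example, normalize_doi("https://doi.org/10.5555/ABC.123") yields "10.5555/abc.123", and is_valid_orcid("0000-0002-1825-0097") returns True, while changing the last digit makes the check fail.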

Validation, Quality Control, and Best Practices

Quality assurance combines automated validation tools, human editorial checks, and community guidelines published by stakeholder organizations including NISO, COPE, the STM Association, and major publishers like Nature Publishing Group. Best practices emphasize ORCID integration for author disambiguation, accurate affiliation data drawn from institution registries like the Research Organization Registry (ROR), comprehensive funding metadata, and timely updates for corrections and retractions tracked by services such as Retraction Watch. Preservation and redundancy strategies mirror recommendations from CLOCKSS and Portico for persistent access.
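An automated completeness check along these lines might look like the sketch below. It is an illustration under assumptions: the record is a plain dictionary, and the keys (title, contributors, orcid, ror_ids, funders, license_url) are hypothetical names chosen for this example rather than fields defined by the schema.

```python
from typing import Dict, List


def quality_issues(record: Dict) -> List[str]:
    """Return human-readable warnings for missing best-practice metadata."""
    issues = []
    if not record.get("title"):
        issues.append("missing title")
    for i, person in enumerate(record.get("contributors", [])):
        if not person.get("orcid"):
            issues.append(f"contributor {i}: no ORCID iD")
        if not person.get("ror_ids"):
            issues.append(f"contributor {i}: no ROR-identified affiliation")
    if not record.get("funders"):
        issues.append("no funding metadata")
    if not record.get("license_url"):
        issues.append("no license information")
    return issues


# Placeholder record: the author has an ORCID iD but no ROR-identified
# affiliation, and no funders are listed.
record = {
    "title": "An Example Article",
    "contributors": [
        {"surname": "Carberry", "orcid": "0000-0002-1825-0097", "ror_ids": []},
    ],
    "funders": [],
    "license_url": "https://creativecommons.org/licenses/by/4.0/",
}
issues = quality_issues(record)
```

Returning a list of warnings instead of raising on the first problem lets an editorial team see every gap in a record at once.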

Use Cases and Applications

Applications span discovery in indexing services like Google Scholar and Scopus, bibliometric analysis by analytics providers such as Clarivate, compliance reporting for funders including NIH and Horizon Europe, and integration with institutional repositories at universities like Harvard University and the University of Oxford. The schema supports scholarly infrastructure projects, including citation graph initiatives from OpenCitations and data-sharing platforms such as Figshare and Zenodo, and underpins manuscript-processing services on platforms like Editorial Manager and ScholarOne.
