CrossRef Metadata Search

Name	CrossRef Metadata Search
Developer	CrossRef
Released	2000s
Programming language	Various
Platform	Web, REST API
License	Mixed

Contents

Overview
Functionality and Features
Data Sources and Coverage
API and Querying Methods
Use Cases and Applications
Limitations and Data Quality
History and Development

CrossRef Metadata Search is a bibliographic discovery service developed by a consortium of scholarly publishers to index and retrieve metadata for academic content identified by Digital Object Identifiers (DOIs). It provides searchable records that link to works published by organizations such as Elsevier, Springer Nature, Wiley-Blackwell, Taylor & Francis, and Oxford University Press. Researchers, librarians, and technologists at institutions like Harvard University, University of Oxford, MIT, Stanford University, and University of Cambridge use it alongside services such as PubMed, Scopus, Web of Science, Google Scholar, and arXiv.

Overview

CrossRef Metadata Search aggregates metadata for digital objects that have been assigned DOIs through CrossRef membership, enabling discovery across publishers including IEEE, American Chemical Society, Royal Society of Chemistry, PLOS, SAGE Publications, Cambridge University Press, and BMJ Group. The interface and APIs interoperate with standards and infrastructures like ORCID, DataCite, DOAJ, LOCKSS, and Project MUSE. Organizations such as the National Institutes of Health, European Research Council, Wellcome Trust, Apache Software Foundation, and Creative Commons rely on or reference metadata returned by the service.

Functionality and Features

Records include metadata fields for titles, authors, publication dates, journal and book titles, ISSNs, ISBNs, and funder identifiers, and can carry ORCID iDs and CrossMark status information. The service supports reference linking used by platforms like JSTOR, SSRN, EBSCO, and ResearchGate. Search handles keyword, DOI, and author queries and returns machine-readable formats that can be imported into tools like Zotero, EndNote, Mendeley, and RefWorks. Additional utilities integrate with repositories and infrastructures including Figshare, Zenodo, Dryad, HathiTrust, and Internet Archive.

Data Sources and Coverage

Metadata originates from deposits made by member publishers such as Elsevier, Springer Nature, Wiley-Blackwell, Taylor & Francis, Oxford University Press, IEEE, American Physical Society, Society for Neuroscience, Nature Publishing Group, Cell Press, John Wiley & Sons, and Karger. Coverage spans journals, conference proceedings, books, datasets, and preprints, overlapping with content indexed by PubMed Central, bioRxiv, medRxiv, and institutional repositories like Harvard DASH and MIT DSpace. National libraries and organizations such as the Library of Congress, British Library, National Library of Medicine, and Europeana may use CrossRef metadata for cataloguing and aggregation.

API and Querying Methods

Programmatic access is provided through RESTful endpoints and OAI-PMH-style feeds, which developers integrate into services hosted on platforms such as GitHub, AWS, Google Cloud Platform, and Kubernetes-backed deployments, as well as into ORCID. Responses are available as JSON and XML and are consumed by Python libraries, R packages, Node.js applications, and tools used at institutions like Los Alamos National Laboratory, CERN, and NASA. The public REST API can be queried without registration, but rate limits apply: requests that identify themselves with a contact email are routed to a "polite" pool, and a paid tier for members and subscribers uses an API token, broadly comparable to the key-based access offered by PubMed, Scopus, and Dimensions.
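The keyword, author, and DOI queries described above can be sketched in Python as follows. This is a minimal illustration, not the service's own client: the helper names (`build_search_url`, `build_doi_url`, `summarize_items`) are hypothetical, and the endpoint, the `query`, `query.author`, `rows`, and `mailto` parameters, and the `message.items` response layout are taken from the public CrossRef REST API as commonly documented, so they should be checked against the current documentation before use. The sketch only builds URLs and parses an already-decoded JSON response; it performs no network requests.

```python
from urllib.parse import quote, urlencode

# Assumed base URL of the public REST API; verify against current docs.
API_ROOT = "https://api.crossref.org"


def build_search_url(keywords=None, author=None, rows=5, mailto=None):
    """Build a /works search URL for a keyword and/or author query."""
    params = {}
    if keywords:
        params["query"] = keywords          # free-text keyword query
    if author:
        params["query.author"] = author     # query restricted to author names
    params["rows"] = rows                   # number of results to return
    if mailto:
        params["mailto"] = mailto           # contact email for the "polite" pool
    return f"{API_ROOT}/works?{urlencode(params)}"


def build_doi_url(doi):
    """Build the URL for looking up a single record by DOI."""
    # Keep the slash between prefix and suffix; percent-encode anything unsafe.
    return f"{API_ROOT}/works/{quote(doi, safe='/')}"


def summarize_items(payload):
    """Reduce a decoded JSON search response to DOI, title, and author names."""
    items = payload.get("message", {}).get("items", [])
    summaries = []
    for item in items:
        titles = item.get("title") or [""]  # titles arrive as a list
        authors = [
            " ".join(part for part in (a.get("given"), a.get("family")) if part)
            for a in item.get("author", [])
        ]
        summaries.append(
            {"doi": item.get("DOI"), "title": titles[0], "authors": authors}
        )
    return summaries
```

A caller would fetch the URL with any HTTP client, decode the JSON body, and pass the resulting dictionary to `summarize_items`. Keeping URL construction and parsing separate from the network call makes both easy to test offline.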

Use Cases and Applications

Common applications include citation validation for publishers such as Elsevier and Springer, discovery services in library systems like Ex Libris and Koha, bibliometric analysis by organizations like Clarivate and Altmetric, and integration into research workflows at universities including Yale University and Princeton University. Developers build tools for automated reference linking in manuscript submission systems such as Editorial Manager and ScholarOne and for metadata enrichment in scholarly platforms like ScienceDirect, Wiley Online Library, and Nature.com. Grant funders such as the Wellcome Trust, National Science Foundation, and European Commission use the metadata to track the outputs of funded research.

Limitations and Data Quality

Data quality depends on what publishers deposit. Metadata from some members is inconsistent; for example, weak author name disambiguation can distort affiliation records for institutions like Columbia University and University of California, Berkeley. Coverage gaps exist for small presses and non-English publishers, similar to challenges faced by DOAJ and Scopus. Duplicate records, missing DOIs, incomplete funding information, and variable reference parsing accuracy can affect bibliometric analyses conducted by groups like CWTS and OpenAIRE. Interoperability issues arise when mapping records to identifiers such as ISNI and ROR and to legacy catalog systems at national libraries.
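Before running an analysis, one way to handle the duplicate records and missing DOIs mentioned above is to normalize DOIs and split a batch of records into three groups. The sketch below is an assumption-laden illustration: the function names are hypothetical, records are assumed to be dictionaries with an optional `DOI` key, and matching relies only on the fact that DOIs are case-insensitive and are often written with a resolver or `doi:` prefix. It does not detect duplicates that were registered under different DOIs.

```python
# Common ways a DOI is prefixed in the wild; stripped before comparison.
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw):
    """Return a lowercase, prefix-free DOI, or None if no DOI is present."""
    if not raw:
        return None
    doi = raw.strip().lower()  # DOIs are case-insensitive
    for prefix in _DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi or None


def split_records(records):
    """Split records into (unique, duplicates, missing_doi), keeping first-seen order."""
    unique = {}
    duplicates = []
    missing = []
    for rec in records:
        key = normalize_doi(rec.get("DOI"))
        if key is None:
            missing.append(rec)      # no usable DOI: needs manual matching
        elif key in unique:
            duplicates.append(rec)   # same DOI already seen
        else:
            unique[key] = rec
    return list(unique.values()), duplicates, missing
```

Records that end up in the missing group would still need title- or author-based matching, which is where the name disambiguation problems described above come into play.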

History and Development

The service evolved alongside CrossRef initiatives launched by publisher consortia, including the Association of American Publishers, and institutional partners such as Cornell University and the British Library. Milestones parallel developments in identifier systems like DOI, the adoption of ORCID in the 2010s, and open metadata movements endorsed by funders and funding programmes such as the Wellcome Trust and Horizon 2020. Collaborations and technical contributions have involved organizations such as DataCite, Jisc, OpenAIRE, Digital Science, and academic groups at University College London and Imperial College London.

Category:Bibliographic databases