PID (persistent identifier)

Name	PID
Caption	Persistent identifier ecosystem
Introduced	1990s
Example	DOI, Handle, ARK, PURL, ORCID

Contents

Definition and Purpose
Types and Schemes
Governance and Standards
Implementation and Resolution Mechanisms
Use Cases and Applications
Advantages and Limitations
Interoperability and Persistence Policies

A persistent identifier (PID) is a long-lasting, unique reference used to identify digital objects, organizations, researchers, datasets, and cultural artifacts across platforms and over time. It supports stable access to and citation of scholarly outputs, archival materials, and administrative records by decoupling the identifier from the resource's physical location and from transient metadata services.

Definition and Purpose

Persistent identifiers serve as durable names that remain actionable despite changes in infrastructure, hosting, or organizational ownership. Organizations such as the International DOI Foundation, Crossref, DataCite, ORCID, and the Internet Archive advocate for PIDs to support reproducibility, citation, and discovery in scholarly communication and cultural heritage. Standards bodies including ISO and W3C inform technical expectations, while repositories like Zenodo, Dryad Digital Repository, Figshare, and UK Data Service implement PID strategies to preserve access to research outputs. Major funders and institutions such as the Wellcome Trust, the National Institutes of Health, the European Commission (through programmes such as Horizon 2020), and the National Science Foundation often require PID use in grant reporting and data management plans.

Types and Schemes

Common PID schemes include the Digital Object Identifier (DOI) system administered by the International DOI Foundation, the Handle System developed by the Corporation for National Research Initiatives (CNRI), the Archival Resource Key (ARK) promoted by California Digital Library, the Persistent Uniform Resource Locator (PURL), which originated at OCLC and is now hosted by the Internet Archive, and the International Standard Name Identifier (ISNI), an ISO standard (ISO 27729) supported by institutions such as the British Library. Person-level schemes include ORCID and ResearcherID from Clarivate. Institutional and organizational identifiers include ROR and, historically, GRID. Datasets are commonly identified with DataCite DOIs, while software can be identified through Software Heritage identifiers or through DOIs that Zenodo assigns to archived GitHub repositories. National libraries like the Library of Congress and Bibliothèque nationale de France assign identifiers for bibliographic control, while cultural heritage initiatives like Europeana and Digital Public Library of America integrate multiple PID schemes.
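Because each scheme has its own syntax, a bare identifier string can often be assigned to a scheme by pattern alone. The following is a minimal sketch of that idea, not a reference implementation: the regular expressions are deliberately simplified assumptions, the function names `classify` and `orcid_check_char` are hypothetical, and Handles are omitted because their prefix/suffix form overlaps with DOIs. The ORCID branch also verifies the ISO 7064 MOD 11-2 check character that ORCID iDs carry.

```python
import re
from typing import Optional

# Simplified, illustrative patterns; each scheme's own specification is authoritative.
PATTERNS = {
    "doi": re.compile(r"^10\.\d{4,9}/\S+$"),
    "ark": re.compile(r"^ark:/?\d{5,}/\S+$"),
    "orcid": re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$"),
}


def orcid_check_char(first_15_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character used by ORCID iDs."""
    total = 0
    for digit in first_15_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def classify(pid: str) -> Optional[str]:
    """Return a scheme name for a bare identifier string, or None if unrecognized."""
    candidate = pid.strip()
    for scheme, pattern in PATTERNS.items():
        if not pattern.match(candidate):
            continue
        if scheme == "orcid":
            digits = candidate.replace("-", "")
            if orcid_check_char(digits[:15]) != digits[15]:
                return None  # well-formed but fails the checksum
        return scheme
    return None
```

For example, `classify("10.1000/182")` returns `"doi"`, and `classify("0000-0002-1825-0097")` (the sample iD used in ORCID's own documentation) returns `"orcid"`, whereas changing its last digit to 8 makes the checksum fail and yields `None`.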

Governance and Standards

PID governance spans non-profits, consortia, and standards organizations. The International DOI Foundation and DataCite define policy and metadata norms for DOI assignment; the Handle System relies on CNRI and regional registrars; ORCID operates under a non-profit membership model with institutional subscribers; and ARK-style archival authorities embed governance within libraries like the National Library of Australia. Standards and specifications from ISO, W3C, IETF, and NISO guide syntax, metadata, and resolution behavior. National and regional policy actors such as the European Research Council, Research Councils UK, Australian Research Council, and Japan Science and Technology Agency shape adoption through mandates and interoperability requirements.

Implementation and Resolution Mechanisms

Resolution maps a PID to the current location of a resource through a resolver service. DOIs resolve through the doi.org proxy, which is built on the Handle System; registration agencies such as Crossref and DataCite maintain the records that point to publisher platforms like Elsevier, Springer Nature, Wiley, and IEEE Xplore. Handle resolution employs regional resolver networks and institutions including CNRI and Internet2. ARK resolvers are implemented by organizations such as California Digital Library, HathiTrust, and National Library of Medicine. Resolver and directory services include PURL, N2T (Name-to-Thing), and Handle.Net, and some deployments run on commercial cloud infrastructure such as Amazon Web Services. Discovery systems integrate PIDs through library platforms like Ex Libris and OCLC and through indexing services like Web of Science, Scopus, and Google Scholar.
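In practice, the first step of resolution is usually to turn a bare identifier into an actionable HTTPS URL at a public resolver, which then redirects to the current location. The sketch below illustrates only that first step and makes no network request. The resolver base URLs are the widely used public ones; the `RESOLVERS` table and the `resolver_url` helper are hypothetical names introduced here for illustration.

```python
from urllib.parse import quote

# Public resolver base URLs; the table itself is an illustrative assumption.
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "https://hdl.handle.net/",
    "ark": "https://n2t.net/",
    "orcid": "https://orcid.org/",
}


def resolver_url(scheme: str, identifier: str) -> str:
    """Build the URL at which a resolver would be asked to redirect a PID."""
    try:
        base = RESOLVERS[scheme.lower()]
    except KeyError:
        raise ValueError(f"no resolver configured for scheme {scheme!r}") from None
    # Percent-encode characters that are unsafe in a URL path, while keeping
    # the '/' and ':' separators that are part of the identifier syntax.
    return base + quote(identifier.strip(), safe="/:")
```

Calling `resolver_url("doi", "10.1000/182")` gives `https://doi.org/10.1000/182`. Following the redirect with an HTTP client would complete the resolution; that step is left out so the sketch stays independent of any particular resolver's behavior.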

Use Cases and Applications

PIDs underpin citation practices in scholarly publishing at publishers such as Elsevier, Springer Nature, Taylor & Francis, and Oxford University Press, and support data citation in repositories like PANGAEA and Figshare. Through ORCID integrations, they enable researcher identity management at institutions such as Harvard University, Massachusetts Institute of Technology, University of Oxford, and Stanford University. Cultural heritage institutions such as the Metropolitan Museum of Art and the British Museum use PIDs to link artifacts across catalogs, while archives like the National Archives of the United Kingdom and the National Archives and Records Administration employ identifiers for records. Governmental agencies including NASA, the European Space Agency, and USGS assign persistent identifiers to datasets and mission products. Publishers, funders, aggregators, and indexing services also use PIDs to compute metrics, drawing on providers like Altmetric and Clarivate.

Advantages and Limitations

Advantages include stable citation, provenance tracking, machine-actionable metadata, and support for reproducible research across tools and platforms such as Jupyter Notebook and GitHub. PIDs reduce link rot and support aggregation by services such as ORCID and DataCite Commons. Limitations include governance fragmentation, cost barriers for small organizations, dependency on resolver infrastructure operated by entities like Crossref and CNRI, and uneven metadata quality that complicates interoperability with aggregators like CORE and OpenAIRE. Legal and jurisdictional risks arise when registries are subject to national laws, which can affect dependents such as European Commission-funded projects or national libraries.

Interoperability and Persistence Policies

Achieving interoperability requires shared metadata standards, trust frameworks, and durable business models. Crosswalks between schemes are coordinated by organizations such as DataCite, Crossref, ORCID, and ROR, while metadata standards such as Dublin Core, Schema.org, and MARC support mapping across library and web ecosystems. Persistence policies, adopted by institutions like Harvard Library, British Library, and National Library of Australia, define responsibilities for identifier minting, redirection, metadata stewardship, and succession planning to mitigate organizational change and ensure continuity of access. Long-term preservation partnerships often include consortia such as LOCKSS and CLOCKSS to complement PID strategies.
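A metadata crosswalk of the kind mentioned above can be pictured as a field-by-field mapping between two vocabularies. The sketch below maps a handful of Dublin Core elements to Schema.org properties. The mapping table is a simplified assumption made for illustration rather than an official crosswalk, and `crosswalk` is a hypothetical helper; real crosswalks also handle repeated fields, typed values, and lossy mappings (for example, Dublin Core `date` does not always mean a publication date).

```python
from typing import Any, Dict, Tuple

# Illustrative subset only: Dublin Core element -> Schema.org property.
DC_TO_SCHEMA_ORG = {
    "title": "name",
    "creator": "creator",
    "date": "datePublished",
    "identifier": "identifier",
    "publisher": "publisher",
    "description": "description",
}


def crosswalk(dc_record: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Map a flat Dublin Core record to a Schema.org-style dictionary.

    Returns the mapped record plus any fields that had no mapping, so that
    nothing is silently discarded.
    """
    mapped: Dict[str, Any] = {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
    }
    unmapped: Dict[str, Any] = {}
    for element, value in dc_record.items():
        target = DC_TO_SCHEMA_ORG.get(element)
        if target is None:
            unmapped[element] = value
        else:
            mapped[target] = value
    return mapped, unmapped
```

Returning the unmapped fields separately is a deliberate choice in this sketch: it makes gaps in the crosswalk visible to whoever is responsible for metadata stewardship instead of dropping them.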
