Identifiers.org — LLMpedia

Identifiers.org
Name	Identifiers.org
Formation	2011
Type	Service
Headquarters	EMBL-EBI
Language	English

Contents

Overview
Registry and Identifiers
Resolver Services and Infrastructure
Data Model and Standards Compliance
Governance and Community Usage
Applications and Integrations

Identifiers.org is a service providing persistent, resolvable Web identifiers for life sciences resources, databases, and cross-references. It offers a curated registry of namespace prefixes and a resolver that maps compact identifiers of the form prefix:accession (for example, taxonomy:9606) to provider-specific URLs, enabling stable citation and programmatic access across communities such as proteomics, genomics, metabolomics, and systems biology. The project works with multiple institutions and standards bodies to promote interoperability among repositories, knowledgebases, and scholarly infrastructures.

Overview

The project grew out of efforts by the European Molecular Biology Laboratory, the European Bioinformatics Institute, the Centre for Genomic Regulation, the University of Cambridge, and other research centres to tackle the identifier fragmentation encountered by resources like UniProt, Ensembl, PubMed, Gene Ontology, and the Protein Data Bank. It complements persistent-identifier infrastructures such as the Digital Object Identifier system, the Handle System, and the Crossref ecosystem while aligning with recommendations from bodies such as the Global Biodata Coalition and the Research Data Alliance. Identifiers.org supports citation practices in journals such as Nature, Science, and PLOS Biology and is used by projects including Reactome, BioModels, and ChEMBL.

Registry and Identifiers

The registry catalogs namespace prefixes corresponding to resources and providers such as NCBI, the European Nucleotide Archive, GenBank, the ZINC database, and KEGG. Each entry records the recommended URI pattern, authoritative providers like Swiss-Prot or RefSeq, and metadata about licensing and update cadence used by aggregators such as ArrayExpress and Expression Atlas. The compact identifier format integrates with references produced by services like ORCID disambiguation, Wikidata cross-linking, and citation indices managed by Scopus and Web of Science. By curating mappings for well-known resources such as Reactome pathways, SABIO-RK kinetic entries, and BioSamples accessions, the registry reduces the risk of broken links for repositories maintained by organizations like EMBL, the Wellcome Sanger Institute, and the Broad Institute.
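A registry entry of the kind described above can be sketched in Python as follows. This is a minimal illustration, not the service's actual schema: the field names, the "{id}" placeholder, the example.org URLs, the simplified accession patterns, and the lower-casing of prefixes are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespaceEntry:
    """One registry record (field names are illustrative, not the official schema)."""
    prefix: str       # namespace prefix, e.g. "taxonomy"
    id_pattern: str   # regular expression a local accession must match
    url_pattern: str  # provider URL template with an "{id}" placeholder


# Hypothetical, simplified entries; real patterns and URLs live in the registry.
REGISTRY = {
    "taxonomy": NamespaceEntry("taxonomy", r"^\d+$", "https://example.org/taxonomy/{id}"),
    "pdb": NamespaceEntry("pdb", r"^[0-9][A-Za-z0-9]{3}$", "https://example.org/pdb/{id}"),
}


def parse_compact_id(compact_id: str) -> tuple[str, str]:
    """Split "prefix:accession" at the first colon; the prefix is lower-cased (assumption)."""
    prefix, sep, accession = compact_id.partition(":")
    if not sep or not prefix or not accession:
        raise ValueError(f"not a compact identifier: {compact_id!r}")
    return prefix.lower(), accession


def provider_url(compact_id: str, registry=REGISTRY) -> str:
    """Validate the accession against its namespace and fill in the URL template."""
    prefix, accession = parse_compact_id(compact_id)
    entry = registry.get(prefix)
    if entry is None:
        raise KeyError(f"unknown prefix: {prefix}")
    if re.fullmatch(entry.id_pattern, accession) is None:
        raise ValueError(f"{accession!r} does not match the {prefix} pattern")
    return entry.url_pattern.replace("{id}", accession)
```

Validating the accession before expanding the template means a malformed identifier fails at lookup time instead of producing a dead provider link.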

Resolver Services and Infrastructure

The resolver infrastructure provides HTTP redirection from compact identifiers to provider-specific landing pages operated by entities including the European Bioinformatics Institute, the National Center for Biotechnology Information, the Protein Data Bank in Europe, and the Swiss Institute of Bioinformatics. It employs high-availability components and mirrors to ensure reliability comparable to services such as GitHub Pages, Zenodo, and Figshare. Load balancing, caching, and monitoring integrate with platforms like Prometheus, Grafana, and Kubernetes clusters used by major data centers including EMBL-EBI and Cold Spring Harbor Laboratory. The resolver service also works alongside scholarly metadata aggregators like DataCite and identifier systems such as the Digital Object Identifier to improve discoverability across portals like Europe PMC and bioRxiv.
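The redirection step with mirror fallback might look roughly like the sketch below. It is a simplified model under stated assumptions, not the production resolver: the provider table, the availability flag, the example.org hosts, and the choice of status codes (302, 404, 503) are all hypothetical.

```python
from typing import NamedTuple


class Provider(NamedTuple):
    url_template: str  # "{id}" marks where the accession goes
    available: bool    # e.g. fed by a health check; hypothetical field


# Hypothetical provider table: the first entry per prefix is the primary,
# later entries are mirrors tried in order.
PROVIDERS = {
    "taxonomy": [
        Provider("https://primary.example.org/tax/{id}", available=False),
        Provider("https://mirror.example.org/tax/{id}", available=True),
    ],
}


def resolve_path(path: str, providers=PROVIDERS) -> tuple[int, dict[str, str]]:
    """Turn a request path such as "/taxonomy:9606" into (status, headers)."""
    prefix, sep, accession = path.lstrip("/").partition(":")
    candidates = providers.get(prefix.lower(), [])
    if not sep or not accession or not candidates:
        return 404, {}  # malformed identifier or unknown namespace
    for provider in candidates:
        if provider.available:
            location = provider.url_template.replace("{id}", accession)
            return 302, {"Location": location}
    return 503, {}  # namespace known, but no provider is currently reachable
```

With the table above, `resolve_path("/taxonomy:9606")` skips the unavailable primary and redirects to the mirror.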

Data Model and Standards Compliance

The data model follows best practices articulated by standards organizations such as the World Wide Web Consortium and the International Organization for Standardization, and by infrastructures such as OpenAIRE. Schema elements reference vocabularies such as Dublin Core and Schema.org, serialized as JSON-LD, to enable interoperability with knowledge graphs maintained in Wikidata and linked-data platforms such as BioPortal. Identifiers.org aligns with persistent identifier principles advocated by FORCE11 and uses canonicalization approaches comparable to the Linked Data Platform strategies used by the Gene Ontology Consortium and the UniProt Consortium. The registry captures provenance metadata compatible with PROV-O and supports content negotiation patterns employed by repositories like Europe PMC and PubChem.
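To make the vocabulary reuse concrete, the sketch below builds a small JSON-LD description of a namespace using Schema.org and Dublin Core terms. The choice of properties, the `schema:Dataset` type, and the function name are illustrative assumptions; the metadata actually served by the registry may be structured differently.

```python
import json


def namespace_jsonld(prefix: str, name: str, description: str) -> dict:
    """Describe a namespace with Schema.org and Dublin Core terms (illustrative mapping)."""
    return {
        "@context": {
            "schema": "https://schema.org/",
            "dcterms": "http://purl.org/dc/terms/",
        },
        "@id": f"https://identifiers.org/{prefix}",
        "@type": "schema:Dataset",  # assumed type, chosen for illustration
        "schema:name": name,
        "dcterms:description": description,
        "schema:identifier": prefix,
    }


doc = namespace_jsonld("taxonomy", "Taxonomy", "Organism classification namespace")
print(json.dumps(doc, indent=2))
```

Because the context maps each prefix to a public vocabulary IRI, a JSON-LD processor can expand the keys into full IRIs and merge the record into a knowledge graph.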

Governance and Community Usage

Governance and curation involve collaborations among stakeholders including staff from the European Molecular Biology Laboratory, the Swiss Institute of Bioinformatics, and the Wellcome Sanger Institute, and funding agencies such as the European Research Council and the National Institutes of Health. Community processes accept submissions and updates from maintainers of databases like IntAct, STRING, and BioGRID and follow advisory input from consortia such as ELIXIR and the Global Alliance for Genomics and Health. Use cases span academic groups at institutions like Stanford University, the University of Oxford, and the Massachusetts Institute of Technology as well as industry partners, including Novartis and GlaxoSmithKline, that integrate identifiers into pipelines for drug discovery and translational research.

Applications and Integrations

Practical integrations include citation resolution in manuscript submission systems used by Elsevier, Springer Nature, and Wiley, cross-references in pathway resources such as KEGG PATHWAY and MetaCyc, and programmatic links consumed by workflow managers like Nextflow and Galaxy. Tooling support exists in Bioconductor libraries, PyPI packages, and GitHub repositories that automate conversion between local accession numbers and resolvable URIs for projects such as ENCODE, the 1000 Genomes Project, and the Human Cell Atlas. The registry’s mappings enable enrichment workflows in data integration efforts by ELIXIR Nodes, National Center for Biotechnology Information pipelines, and commercial knowledgebases managed by Clarivate.
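The conversion between local accession numbers and resolvable URIs can be sketched as a pair of helper functions. The function names are hypothetical and do not come from any particular package, and the sketch does not handle percent-encoding or namespaces whose accessions embed their own prefix.

```python
from urllib.parse import urlsplit

RESOLVER_BASE = "https://identifiers.org/"


def to_uri(prefix: str, accession: str) -> str:
    """Build a resolvable URI from a namespace prefix and a local accession."""
    return f"{RESOLVER_BASE}{prefix.lower()}:{accession}"


def from_uri(uri: str) -> tuple[str, str]:
    """Recover (prefix, accession) from a URI produced by to_uri."""
    parts = urlsplit(uri)
    if parts.netloc != "identifiers.org":
        raise ValueError(f"not an Identifiers.org URI: {uri!r}")
    prefix, sep, accession = parts.path.lstrip("/").partition(":")
    if not sep or not prefix or not accession:
        raise ValueError(f"no compact identifier in: {uri!r}")
    return prefix, accession


# Example: convert a small table of local accessions and round-trip it.
rows = [("taxonomy", "9606"), ("pdb", "2gc4")]
uris = [to_uri(prefix, accession) for prefix, accession in rows]
```

Keeping both directions side by side lets a pipeline store compact local accessions internally and emit resolvable URIs only at export time.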

Category:Biological databases