RCSB PDB — LLMpedia

RCSB PDB
Name	RCSB PDB
Formation	1971
Founder	Walter Hamilton; Edgar Meyer; Helen Berman
Type	Database; nonprofit
Headquarters	Rutgers University; University of California, San Diego
Region served	Global

Contents

Overview
History and Development
Data Content and Structure
Access, Tools, and Services
Governance and Funding
Impact and Applications

RCSB PDB is the United States-based resource for three-dimensional structural data of biological macromolecules and operates as the US data center of the Worldwide Protein Data Bank (wwPDB). It provides access, curation, and tools for researchers in biochemistry, molecular biology, structural biology, and related fields. It serves scientists, educators, clinicians, and industry by distributing atomic coordinates, experimental data, and annotations for proteins, nucleic acids, and their complexes, supporting studies that connect structure to function.

Overview

The resource aggregates macromolecular structures deposited by the worldwide structural biology community and makes them available through an integrated platform that links each entry to its experimental method, chemical components, literature, and biological annotations. Users can browse entries by technique, including X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy; view connections to projects and institutions such as the Human Genome Project, Protein Data Bank Japan, the European Bioinformatics Institute, and EMDataBank; and follow cross-references to databases including UniProt, Chemical Abstracts Service, PubMed, Gene Ontology, and Ensembl.

History and Development

The archive traces its origins to community efforts in the early 1970s, when structural data sharing was championed by figures including Walter Hamilton, Edgar Meyer, and Helen Berman, and by initiatives associated with research centers such as Brookhaven National Laboratory, Cold Spring Harbor Laboratory, and the National Institutes of Health. Over the following decades, governance and technical stewardship evolved under the Research Collaboratory for Structural Bioinformatics, in collaboration with the Protein Data Bank in Europe and Protein Data Bank Japan, and through consortia involving Rutgers University, the University of California, San Diego, Rutgers Biomedical and Health Sciences, and national facilities such as Lawrence Berkeley National Laboratory. Landmark developments paralleled advances reported in journals such as Nature, Science, and Cell, and engaged Nobel Prize laureates and institutions including the Howard Hughes Medical Institute.

Data Content and Structure

The collection contains atomic coordinate files, electron density maps, cryo-EM density maps, restraint data from NMR spectroscopy, and derived models. Entries are organized by accession identifiers (PDB IDs), distributed in formats such as mmCIF, and linked to literature in PubMed Central, to experimental details tied to facilities such as the Advanced Photon Source and the European Synchrotron Radiation Facility, and to metadata cross-referenced with resources such as ChEMBL, DrugBank, PDBsum, and the Worldwide Protein Data Bank. Entries cover proteins, nucleic acids, viruses, ribosomes, and large assemblies, with annotations pointing to organisms cataloged in NCBI Taxonomy, pathways described in KEGG, enzyme classifications given as Enzyme Commission numbers, and disease associations curated against OMIM, ClinVar, and clinical resources such as Food and Drug Administration filings. Chemical ligands are indexed with identifiers following International Union of Pure and Applied Chemistry conventions and linked to small-molecule repositories such as ZINC and PubChem.
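Because every cross-reference above hangs off an entry's accession identifier, client code typically normalizes that identifier before using it. The following is a minimal Python sketch, not an official validator. It assumes the classic four-character form (a digit from 1 to 9 followed by three letters or digits); the function name `normalize_pdb_id` is hypothetical, and newer extended identifiers are deliberately not handled.

```python
import re

# Assumption: a classic PDB ID is four characters, a digit 1-9 followed by
# three letters or digits. Extended wwPDB identifiers are out of scope here.
_CLASSIC_PDB_ID = re.compile(r"[1-9][A-Za-z0-9]{3}")


def normalize_pdb_id(raw: str) -> str:
    """Return the upper-case form of a classic PDB ID, or raise ValueError."""
    candidate = raw.strip()
    if not _CLASSIC_PDB_ID.fullmatch(candidate):
        raise ValueError(f"not a classic four-character PDB ID: {raw!r}")
    return candidate.upper()
```

For example, `normalize_pdb_id(" 4hhb ")` returns `"4HHB"`, while a string such as `"4HH"` raises `ValueError`.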

Access, Tools, and Services

The platform provides web portals, programmatic APIs, and visualization tools that interoperate with software and services including PyMOL, UCSF Chimera, Jmol, Mol*, BLAST, and structural analysis packages developed by groups at Stanford University, the Massachusetts Institute of Technology, and the European Molecular Biology Laboratory. Educational materials and workshops have been created with partners such as Cold Spring Harbor Laboratory, Gordon Research Conferences, the Howard Hughes Medical Institute, and European Bioinformatics Institute outreach programs. Data deposition pipelines coordinate with submitters from institutions such as Harvard University, Yale University, the University of Cambridge, and the Max Planck Society, and with measurement facilities including Diamond Light Source, Stanford Synchrotron Radiation Lightsource, and National Synchrotron Light Source II.
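As an illustration of programmatic access, the sketch below builds request URLs for one entry and optionally fetches its metadata as JSON using only the Python standard library. The two URL patterns are assumptions based on the publicly documented RCSB PDB services at the time of writing and should be checked against the current documentation; the function names are hypothetical.

```python
import json
import urllib.request

# Assumed URL patterns; verify against the current RCSB PDB documentation.
DATA_API_ENTRY = "https://data.rcsb.org/rest/v1/core/entry/{pdb_id}"
FILE_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.cif"


def entry_metadata_url(pdb_id: str) -> str:
    """Build the (assumed) Data API URL for one entry's metadata."""
    return DATA_API_ENTRY.format(pdb_id=pdb_id.strip().upper())


def structure_file_url(pdb_id: str) -> str:
    """Build the (assumed) download URL for one entry's mmCIF file."""
    return FILE_DOWNLOAD.format(pdb_id=pdb_id.strip().upper())


def fetch_entry_metadata(pdb_id: str, timeout: float = 10.0) -> dict:
    """Fetch entry metadata as a dict. Requires network access."""
    with urllib.request.urlopen(entry_metadata_url(pdb_id), timeout=timeout) as response:
        return json.load(response)
```

Keeping URL construction separate from the network call lets the addresses be inspected or logged without contacting the service. The returned JSON structure is not described here and should be explored against a live response.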

Governance and Funding

Organizational oversight has been provided through collaborations among academic institutions, federal agencies, and philanthropic organizations. Funding streams and partnerships have involved the National Science Foundation, the National Institutes of Health, the U.S. Department of Energy, private foundations such as the Gordon and Betty Moore Foundation, and university research offices at Rutgers University and the University of California, San Diego. Advisory boards and scientific committees include representatives from leading research centers, from professional societies including the American Society for Biochemistry and Molecular Biology, the International Union of Crystallography, and the American Chemical Society, and from the wwPDB member organizations, keeping policy aligned with community standards and with the FAIR data principles promoted by groups such as the Research Data Alliance.

Impact and Applications

The archive underpins discoveries and translational applications across molecular medicine, biotechnology, and structural virology. It supports work at pharmaceutical companies such as Pfizer, Moderna, and GlaxoSmithKline, and research on pathogens at institutions such as the Centers for Disease Control and Prevention, the World Health Organization, Scripps Research, and Rockefeller University. Structural entries have enabled vaccine design by teams at Imperial College London, the University of Oxford, and the National Institute for Biological Standards and Control; informed enzyme engineering in collaborations with Genentech and Novartis; and supported computational methods developed by groups at Google DeepMind, IBM Research, Carnegie Mellon University, and ETH Zurich. The resource is cited broadly in journals including Nature Communications, Proceedings of the National Academy of Sciences, Journal of Biological Chemistry, and Structure, and contributes to education, public health responses, and innovation across academia, industry, and government.
