Protein Data Bank

Contents

Introduction
History
Structure and Content
File Format and Annotations
Access and Utilization
Impact and Applications

The Protein Data Bank (PDB) is a repository of three-dimensional structures of proteins, nucleic acids, and complexes of these biomolecules. The archive is managed by the Worldwide Protein Data Bank (wwPDB). Its United States member, the Research Collaboratory for Structural Bioinformatics (RCSB) PDB, is headquartered at Rutgers University and receives funding from agencies that include the National Institutes of Health (NIH), among them the National Institute of General Medical Sciences (NIGMS). The database is widely used by researchers in structural biology, biochemistry, and molecular biology, including scientists at Harvard University, Stanford University, and the University of Cambridge. It complements sequence-oriented resources such as the Protein Information Resource (PIR) and the UniProt database, which provide protein sequences and functional annotations.

Introduction

The Protein Data Bank is an essential resource for understanding the three-dimensional structure of biological macromolecules, including enzymes, receptors, and hormones, which are central to biological processes such as metabolism, signaling, and gene regulation. Researchers at institutions like the Massachusetts Institute of Technology (MIT), the University of California, Berkeley, and the European Molecular Biology Laboratory (EMBL) rely on the Protein Data Bank to study the structure-function relationships of these molecules. The archive contains structures determined by X-ray crystallography, NMR spectroscopy, electron microscopy, and other biophysical techniques. It is managed by the Worldwide Protein Data Bank (wwPDB), whose members include the RCSB PDB and the Protein Data Bank Japan (PDBj).

History

The Protein Data Bank was established in 1971 at the Brookhaven National Laboratory under the leadership of Walter Hamilton, with the goal of creating a centralized repository for three-dimensional structures of biological macromolecules. The archive was initially managed by the Brookhaven National Laboratory, and its management was transferred to the Research Collaboratory for Structural Bioinformatics (RCSB) in 1998. It has changed and expanded considerably since then, including through the introduction of new file formats and annotation standards, and has been supported by organizations like the National Science Foundation (NSF), the Department of Energy (DOE), and the Wellcome Trust. The archive also reflects a longer tradition of structure determination, which includes the work of James Watson, Francis Crick, and Rosalind Franklin on the structure of DNA.

Structure and Content

The Protein Data Bank contains a large collection of three-dimensional structures of biological macromolecules, including proteins, nucleic acids, and complexes of these molecules. Entries are annotated with classification keywords, such as enzyme, receptor, or hormone, and can be cross-referenced to external classification schemes such as Enzyme Commission (EC) numbers and Gene Ontology (GO) terms. Each entry is identified by a unique PDB ID, which gives access to the atomic coordinates of the structure. Geometric quantities such as bond lengths and bond angles are not listed separately but can be derived from these coordinates. Researchers at institutions like the University of Oxford, the University of California, Los Angeles (UCLA), and the German Cancer Research Center (DKFZ) use the Protein Data Bank to study the structure-function relationships of these molecules.
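
Quantities such as bond lengths and bond angles can be computed directly from atomic coordinates. The following is a minimal sketch in Python, assuming coordinates are given as (x, y, z) tuples in ångströms. The function names and the sample coordinates are illustrative and are not part of any PDB software.

```python
import math


def bond_length(a, b):
    """Return the Euclidean distance between two atoms."""
    return math.dist(a, b)


def bond_angle(a, b, c):
    """Return the angle a-b-c in degrees, with b as the central atom."""
    # Vectors pointing from the central atom b to each neighbour.
    ba = [p - q for p, q in zip(a, b)]
    bc = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(ba, bc))
    norm = math.hypot(*ba) * math.hypot(*bc)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


# Hypothetical coordinates, chosen so the results are easy to check by hand.
atom_n = (1.0, 0.0, 0.0)
atom_ca = (0.0, 0.0, 0.0)
atom_c = (0.0, 1.5, 0.0)

print(bond_length(atom_n, atom_ca))
print(bond_angle(atom_n, atom_ca, atom_c))
```

The sketch needs only the standard library (Python 3.8 or later for math.dist). For whole structures, an array library would normally be a better fit.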

File Format and Annotations

The Protein Data Bank uses standardized file formats to represent the three-dimensional structures of biological macromolecules. The legacy PDB file format is a fixed-column text format in which ATOM and HETATM records store the coordinates of each atom, alongside annotations such as secondary structure (HELIX and SHEET records) and ligand binding sites (SITE records). Bond lengths and bond angles are not stored explicitly but follow from the coordinates. The PDBx/mmCIF format is the current archival standard. Tools and resources for data annotation and validation are also provided, including the validation reports issued for entries, and are used by researchers at institutions like the University of Chicago, the University of Michigan, and the European Bioinformatics Institute (EMBL-EBI).
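
In the legacy PDB format, each atom is described by one fixed-column ATOM or HETATM record. The following is a minimal sketch of reading such a record in Python, assuming the column positions of the published PDB format. The Atom type, the function name, and the sample record are hypothetical and only illustrate the layout. The sketch reads only a subset of the fields.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_record(line: str) -> Atom:
    """Parse one ATOM or HETATM record using its fixed column positions."""
    if line[0:6] not in ("ATOM  ", "HETATM"):
        raise ValueError("not an ATOM or HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# Hypothetical record, assembled field by field to keep the columns explicit.
SAMPLE_LINE = (
    "ATOM  "      # record name, columns 1-6
    "    1"       # atom serial number, columns 7-11
    " "           # column 12, unused
    " N  "        # atom name, columns 13-16
    " "           # alternate location indicator, column 17
    "MET"         # residue name, columns 18-20
    " "           # column 21, unused
    "A"           # chain identifier, column 22
    "   1"        # residue sequence number, columns 23-26
    "    "        # insertion code and padding, columns 27-30
    "  27.340"    # x coordinate, columns 31-38
    "  24.430"    # y coordinate, columns 39-46
    "   2.614"    # z coordinate, columns 47-54
    "  1.00"      # occupancy, columns 55-60
    "  9.67"      # temperature factor, columns 61-66
    "          "  # columns 67-76, unused
    " N"          # element symbol, columns 77-78
)

print(parse_atom_record(SAMPLE_LINE))
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in real files can run together without a separating space.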

Access and Utilization

The Protein Data Bank is freely accessible to the public and provides tools and resources for searching, visualizing, and analyzing the three-dimensional structures of biological macromolecules. Researchers can access the archive through the RCSB PDB website, which offers an interface for searching and retrieving structures as well as tools for molecular visualization and structure analysis. The database is widely used in academia and industry, including by scientists at Pfizer, GlaxoSmithKline, and the National Institutes of Health (NIH), and has been supported by organizations like the Bill and Melinda Gates Foundation and the Howard Hughes Medical Institute (HHMI).
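
Entries can also be retrieved programmatically. The following is a hedged sketch in Python, assuming that the RCSB PDB file server exposes entries at URLs of the form https://files.rcsb.org/download/<ID>.<format> and that classic PDB IDs are four alphanumeric characters beginning with a digit. Both points should be checked against current RCSB PDB documentation. The function names entry_url and fetch_entry are hypothetical.

```python
import re
import urllib.request

# Assumed base URL of the RCSB PDB file download service.
DOWNLOAD_BASE = "https://files.rcsb.org/download"

# Assumed shape of a classic PDB ID: one digit, then three alphanumerics.
_PDB_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")


def entry_url(pdb_id: str, fmt: str = "cif") -> str:
    """Build the download URL for an entry in mmCIF or legacy PDB format."""
    if not _PDB_ID.fullmatch(pdb_id):
        raise ValueError(f"unexpected PDB ID: {pdb_id!r}")
    if fmt not in ("cif", "pdb"):
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{DOWNLOAD_BASE}/{pdb_id.upper()}.{fmt}"


def fetch_entry(pdb_id: str, fmt: str = "cif", timeout: float = 30.0) -> str:
    """Download an entry and return its contents as text (needs network)."""
    with urllib.request.urlopen(entry_url(pdb_id, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(entry_url("1crn"))
```

Under these assumptions, calling fetch_entry("1crn") would return the text of entry 1CRN in mmCIF format. The sketch does no caching or retrying, which a tool that downloads many entries would need to add.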

Impact and Applications

The Protein Data Bank has had a significant impact on the understanding of structure-function relationships in biological macromolecules and has enabled major advances in structural biology, biochemistry, and molecular biology. The database has been used to study the mechanisms of diseases, including cancer, HIV/AIDS, and neurodegenerative disorders, and has facilitated the development of new therapeutic strategies and drugs, including enzyme inhibitors and receptor antagonists. Researchers at institutions like the University of California, San Francisco (UCSF), the University of Pennsylvania, and the Scripps Research Institute have used the Protein Data Bank to study the molecules involved in these diseases. The database has also been used in biotechnology and pharmaceutical applications, including the development of new biomaterials and biosensors, and has been supported by organizations like the National Institute of Biomedical Imaging and Bioengineering (NIBIB) and the Food and Drug Administration (FDA).