SCOP — LLMpedia

SCOP
Name	SCOP
Type	Database / Classification Resource
Founded	1994
Founder	Alexey G. Murzin et al.
Headquarters	Cambridge, United Kingdom
Fields	Bioinformatics, Structural Biology

Contents

Overview
History and Development
Structure and Classification
Methods and Algorithms
Applications and Use Cases
Limitations and Criticisms

SCOP

SCOP (Structural Classification of Proteins) is a manually curated classification system and database of protein structural domains that organizes them into folds, superfamilies, and families based on structural and evolutionary relationships. It serves as a reference resource for researchers in Cambridge and at Stanford University, the European Bioinformatics Institute, the Max Planck Society, and Rutherford Appleton Laboratory, and it interlinks with resources such as Protein Data Bank, UniProt, Pfam, CATH (protein structure classification), and InterPro. The resource underpins comparative studies across organisms including Escherichia coli, Saccharomyces cerevisiae, Homo sapiens, Mus musculus, and viral lineages like Influenza A virus and Human immunodeficiency virus 1.

Overview

SCOP provides a hierarchical taxonomy that groups protein domains into classes, folds, superfamilies, and families by combining manual curation with structural comparison. Curators assessed structures from the Protein Data Bank and annotated relationships relevant to projects at institutions such as European Molecular Biology Laboratory, Wellcome Trust Sanger Institute, University of Oxford, University of Cambridge, and University College London. The classification has informed computational tools developed in laboratories at Massachusetts Institute of Technology, Harvard University, California Institute of Technology, ETH Zurich, and University of Tokyo.

History and Development

SCOP originated in the mid-1990s through work by Alexey G. Murzin, Steven E. Brenner, Tim Hubbard, and Cyrus Chothia at the MRC Laboratory of Molecular Biology and the Centre for Protein Engineering in Cambridge. Early versions mapped structures deposited in the Protein Data Bank, then hosted at Brookhaven National Laboratory, into a coherent hierarchy, paralleling contemporaneous efforts such as CATH (protein structure classification) and database projects at Swiss Institute of Bioinformatics and National Center for Biotechnology Information. Over successive releases, SCOP incorporated structures reported by groups at Max Planck Institute for Biophysical Chemistry, Cold Spring Harbor Laboratory, National Institutes of Health, and others, covering proteins such as hemoglobin, lysozyme, and kinases described in publications in Nature, Science, and Cell.

Structure and Classification

The SCOP hierarchy arranges domains into major classes informed by recurrent architectures observed in proteins studied by teams at University of California, San Francisco and Yale University. Classes include all-alpha, all-beta, alpha/beta (helices and strands largely interspersed), and alpha+beta (helices and strands largely segregated), reflecting observations from canonical proteins such as those described by researchers at Johns Hopkins University and McGill University. Within each class, SCOP defines folds, then groups related proteins into superfamilies and, within those, families; these groupings align with evolutionary inferences discussed in work from European Molecular Biology Laboratory and Cold Spring Harbor Laboratory. SCOP entries cross-reference sequences cataloged in UniProt and structural entries in Protein Data Bank, enabling integration with protein family databases such as Pfam, domain models from SMART (database), and pathway resources at KEGG.
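The four-level hierarchy described above can be sketched as a small data structure. The example below assumes the dotted identifier style used in SCOP releases (for example `a.1.1.2`, read as class.fold.superfamily.family) and the usual letters a to d for the four classes named in the text; `ScopPath`, `parse_sccs`, and `CLASS_NAMES` are hypothetical names introduced only for illustration, and the letter mapping should be checked against the release in use.

```python
from dataclasses import dataclass

# Assumption: class letters a-d map to the four classes named in the text.
CLASS_NAMES = {
    "a": "all-alpha",
    "b": "all-beta",
    "c": "alpha/beta",
    "d": "alpha+beta",
}


@dataclass(frozen=True)
class ScopPath:
    """Position of one domain: class > fold > superfamily > family."""
    cls: str
    fold: int
    superfamily: int
    family: int

    @property
    def class_name(self) -> str:
        # Classes outside the four listed above are reported as "other".
        return CLASS_NAMES.get(self.cls, "other")

    def fold_id(self) -> str:
        return f"{self.cls}.{self.fold}"

    def superfamily_id(self) -> str:
        return f"{self.cls}.{self.fold}.{self.superfamily}"

    def family_id(self) -> str:
        return f"{self.cls}.{self.fold}.{self.superfamily}.{self.family}"


def parse_sccs(sccs: str) -> ScopPath:
    """Parse a dotted identifier such as 'a.1.1.2' into its four levels."""
    parts = sccs.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected class.fold.superfamily.family, got {sccs!r}")
    cls, fold, superfamily, family = parts
    return ScopPath(cls, int(fold), int(superfamily), int(family))


if __name__ == "__main__":
    path = parse_sccs("a.1.1.2")
    print(path.class_name, path.fold_id(), path.superfamily_id(), path.family_id())
```

Keeping each level as a separate field makes it straightforward to group domains by fold or superfamily when joining against cross-referenced resources.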

Methods and Algorithms

SCOP classification combined expert manual curation with computational structure comparison, drawing on algorithms developed by academic groups at University of California, San Diego, University of Washington, and Princeton University. Tools used alongside SCOP include structure alignment programs such as DALI, sequence comparison tools such as BLAST and MUSCLE, and profile-based techniques employed in projects at European Bioinformatics Institute and Center for Genomic Regulation. Manual assignment relied on visual inspection, secondary-structure analysis, and literature synthesis, drawing on work from laboratories like Max Planck Institute for Biochemistry and on methodology papers in journals such as Journal of Molecular Biology.

Applications and Use Cases

Researchers used SCOP for homology modeling, fold recognition, and evolutionary studies in investigations led by groups at Stanford University School of Medicine, University of California, Berkeley, University of Toronto, and Seoul National University. It underpinned benchmark datasets for algorithm development in structural bioinformatics, including projects at Google DeepMind, and for methods evaluated in the Critical Assessment of protein Structure Prediction (CASP) experiments. SCOP-informed annotations aided functional inference in proteome projects for Arabidopsis thaliana, Drosophila melanogaster, Caenorhabditis elegans, and in studies of the pathogens Mycobacterium tuberculosis and Plasmodium falciparum.
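One way such benchmark datasets can be built is to label pairs of domains by the deepest hierarchy level they share. The sketch below assumes dotted identifiers of the form class.fold.superfamily.family and one commonly used labeling convention (same superfamily counts as a positive, same fold only is set aside as ambiguous, anything else is a negative); `shared_level` and `benchmark_label` are hypothetical helpers, not part of any SCOP distribution.

```python
LEVELS = ("class", "fold", "superfamily", "family")


def shared_level(sccs_a: str, sccs_b: str) -> str:
    """Return the deepest level two domains share, or 'none'."""
    a = sccs_a.strip().split(".")[:4]
    b = sccs_b.strip().split(".")[:4]
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return LEVELS[depth - 1] if depth else "none"


def benchmark_label(sccs_a: str, sccs_b: str) -> str:
    """Label a domain pair for a homology-detection benchmark.

    Assumed convention: pairs in the same superfamily are positives,
    pairs sharing only a fold are ambiguous, all others are negatives.
    """
    level = shared_level(sccs_a, sccs_b)
    if level in ("superfamily", "family"):
        return "positive"
    if level == "fold":
        return "ambiguous"
    return "negative"


if __name__ == "__main__":
    pairs = [
        ("a.1.1.2", "a.1.1.3"),
        ("a.1.1.2", "a.1.2.1"),
        ("a.1.1.2", "b.1.1.2"),
    ]
    for a, b in pairs:
        print(a, b, shared_level(a, b), benchmark_label(a, b))
```

Setting aside same-fold pairs is a design choice: such domains may or may not be evolutionarily related, so counting them as either positives or negatives can distort an evaluation.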

Limitations and Criticisms

Critics noted that SCOP's reliance on manual curation could limit scalability relative to automated approaches developed at European Bioinformatics Institute and by the groups behind CATH (protein structure classification) and Pfam, especially as structural data from consortia such as Structural Genomics Consortium and other high-throughput initiatives expanded. Concerns about subjectivity in fold delineation were raised at conferences hosted by International Society for Computational Biology and in articles in Bioinformatics. Additionally, integration challenges with sequence-centric resources maintained by National Center for Biotechnology Information and with automated pipelines at UniProt highlighted the need for synchronized updates and consistent cross-references.
