BioGRID — LLMpedia

BioGRID
Name	BioGRID
Type	Biological interaction database
Founder	Michael Tyers
Established	2003
Focus	Protein–protein interactions, genetic interactions, chemical interactions, post-translational modifications
Languages	English

Contents

Overview
Data Curation and Content
Database Structure and Access
Tools and Integration
Use Cases and Impact
History and Development

BioGRID (Biological General Repository for Interaction Datasets) is a curated repository of biological interactions that aggregates experimental data on protein–protein interactions, genetic interactions, chemical interactions, and post-translational modifications. It serves researchers in molecular biology, genomics, proteomics, and systems biology by integrating datasets from model organisms and human studies. BioGRID also distributes its data to community resources and supports interoperability with computational platforms.

Overview

BioGRID compiles interaction data from high-throughput studies and focused publications into a consolidated resource covering organisms such as Saccharomyces cerevisiae, Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans, Mus musculus, Arabidopsis thaliana, Schizosaccharomyces pombe, Escherichia coli, Zea mays, Danio rerio, Xenopus laevis, Gallus gallus, Bos taurus, Rattus norvegicus, Plasmodium falciparum, Mycobacterium tuberculosis, and Candida albicans, as well as viruses such as HIV-1 and other taxa. The resource provides standardized interaction types and experimental evidence terms that align with the ontologies and identifiers used by community resources and organizations such as UniProt, Gene Ontology, the National Center for Biotechnology Information, the European Molecular Biology Laboratory, the European Bioinformatics Institute, and Ensembl. BioGRID's scope complements projects such as the Human Protein Atlas, STRING, IntAct, MINT, Reactome, KEGG, PDB, and the GWAS Catalog.

Data Curation and Content

Curation in BioGRID is performed by expert curators who annotate interactions described in the literature indexed by services such as PubMed, MEDLINE, Web of Science, Scopus, and CrossRef. Source journals include Nature, Science, Cell, EMBO Journal, PNAS, PLoS Biology, Genome Research, Nature Genetics, Nature Communications, The Lancet, Journal of Biological Chemistry, Molecular Cell, Genes & Development, Developmental Cell, Nature Methods, Nucleic Acids Research, and Bioinformatics. To ensure consistency, curators map gene and protein identifiers to authorities including RefSeq, Entrez Gene, UniProtKB, HGNC, MGI, FlyBase, WormBase, SGD, PomBase, TAIR, and KEGG Genes. Each interaction annotation records the experimental method, such as affinity purification, yeast two-hybrid, co-immunoprecipitation, mass spectrometry, genetic suppression, synthetic lethality, chromatin immunoprecipitation, biochemical assays, or fluorescence microscopy, and references standards promoted by organizations such as HUPO and IMEx.
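The identifier mapping and method annotation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the alias table, the gene identifiers, the publication identifiers, and the two-way split into "genetic" and "physical" evidence are hypothetical and do not reproduce BioGRID's actual vocabulary or curation pipeline.

```python
from dataclasses import dataclass

# Hypothetical alias table mapping synonyms to one canonical identifier.
# A real pipeline would draw on authorities such as Entrez Gene or UniProtKB.
ALIAS_TO_CANONICAL = {
    "geneA": "GENE_A",
    "gA": "GENE_A",
    "geneB": "GENE_B",
    "gene-b": "GENE_B",
}

# Illustrative grouping of methods named in the text. Anything not listed
# here is treated as physical evidence, which is a simplification.
GENETIC_METHODS = {"genetic suppression", "synthetic lethality"}


@dataclass(frozen=True)
class Interaction:
    interactor_a: str
    interactor_b: str
    method: str
    evidence_class: str  # "genetic" or "physical"
    publication: str


def normalize(symbol: str) -> str:
    """Return the canonical identifier, or the symbol unchanged if unknown."""
    return ALIAS_TO_CANONICAL.get(symbol, symbol)


def annotate(a: str, b: str, method: str, publication: str) -> Interaction:
    """Build one record with canonical, order-independent interactors."""
    first, second = sorted((normalize(a), normalize(b)))
    method = method.strip().lower()
    evidence = "genetic" if method in GENETIC_METHODS else "physical"
    return Interaction(first, second, method, evidence, publication)


# Placeholder rows; the publication identifiers are not real PubMed IDs.
raw = [
    ("geneA", "geneB", "Yeast two-hybrid", "PUB-1"),
    ("gene-b", "gA", "yeast two-hybrid", "PUB-1"),  # same record under aliases
    ("gA", "geneB", "synthetic lethality", "PUB-2"),
]

# Frozen dataclasses are hashable, so a set removes duplicate records.
curated = sorted({annotate(*row) for row in raw}, key=lambda i: i.publication)
```

Sorting the two interactors before building the record means that A–B and B–A collapse into one entry, so the three raw rows reduce to two curated records here.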

Database Structure and Access

The BioGRID schema organizes interactions into entities for proteins, genes, chemicals, and post-translational modifications, together with metadata linking each record to its publication. Data exports are provided in formats that can be loaded into tools such as Cytoscape and Gephi, processed in environments and languages such as R, Bioconductor, Python, Perl, and Java, stored in SQL databases, or represented in graph formats such as GraphML. Web services and REST APIs enable programmatic queries and are interoperable with resources like GitHub, Zenodo, Figshare, Google Cloud Platform, Amazon Web Services, and the National Institutes of Health. Authentication and licensing follow community norms used by Creative Commons and by institutional repositories of initiatives funded by the Wellcome Trust and the National Science Foundation.
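As a rough illustration of working with a tabular export, the sketch below reads a tab-delimited table of interactions with Python's standard `csv` module. The column names and rows are hypothetical placeholders, not the header of an actual BioGRID download, which defines its own columns.

```python
import csv
import io
from collections import Counter

# Hypothetical header and rows standing in for a tab-delimited export.
SAMPLE_EXPORT = (
    "interactor_a\tinteractor_b\tinteraction_type\texperimental_system\tpublication\n"
    "GENE_A\tGENE_B\tphysical\taffinity purification\tPUB-1\n"
    "GENE_A\tGENE_C\tgenetic\tsynthetic lethality\tPUB-2\n"
    "GENE_B\tGENE_C\tphysical\tyeast two-hybrid\tPUB-3\n"
)


def read_interactions(handle):
    """Yield one dict per interaction row from a tab-delimited file object."""
    yield from csv.DictReader(handle, delimiter="\t")


def filter_by_type(rows, interaction_type):
    """Keep only rows whose interaction_type column matches."""
    return [row for row in rows if row["interaction_type"] == interaction_type]


rows = list(read_interactions(io.StringIO(SAMPLE_EXPORT)))
physical = filter_by_type(rows, "physical")
counts = Counter(row["interaction_type"] for row in rows)
```

To read a file on disk instead of the inline sample, the same function can be given `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded table.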

Tools and Integration

BioGRID integrates with visualization and analysis tools including Cytoscape, NetworkX, Gephi, Pathway Commons, ReactomePA, and STRING-db, and with workflow and environment tools such as Galaxy, Nextflow, Snakemake, Docker, Singularity, and Conda. Its data supports enrichment analyses using resources like DAVID, Enrichr, Metascape, and g:Profiler, and it connects to structural databases such as the Protein Data Bank, AlphaFold, and ModBase for mapping interactions to 3D context. Collaborative projects and portals that consume BioGRID data include the UniProt Consortium, NCBI Gene, Ensembl Genomes, the FlyBase Consortium, the WormBase Consortium, the SGD Consortium, the PomBase Consortium, and disease databases like ClinVar, OMIM, DisGeNET, and PharmGKB.

Use Cases and Impact

Researchers use BioGRID for network reconstruction, identification of protein complexes, discovery of genetic interaction landscapes, drug target prioritization, and modeling of signaling pathways implicated in diseases such as Alzheimer's disease, Parkinson's disease, breast cancer, colorectal cancer, acute myeloid leukemia, HIV/AIDS, tuberculosis, malaria, and COVID-19. BioGRID data underpins studies that employ algorithms developed in computational biology labs at institutions like the Broad Institute, the Wellcome Sanger Institute, Cold Spring Harbor Laboratory, the European Bioinformatics Institute, MIT, Harvard University, Stanford University, the University of California, San Francisco, and the University of Cambridge. The database has been cited in thousands of publications and is used in educational resources at universities and in workshops hosted by organizations including EMBO, Gordon Research Conferences, Cold Spring Harbor Laboratory, and the International Society for Computational Biology.
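Network reconstruction from pairwise interactions can be sketched in plain Python as follows. The edge list and protein identifiers are hypothetical, and connected components are used here only as a crude stand-in for candidate complexes; published complex-detection methods are considerably more sophisticated.

```python
from collections import defaultdict, deque

# Hypothetical physical interactions; identifiers are placeholders.
EDGES = [
    ("P1", "P2"), ("P2", "P3"), ("P1", "P3"),  # a densely linked trio
    ("P4", "P5"),                              # a separate pair
]


def build_network(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Group nodes into connected components using breadth-first search."""
    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in sorted(adjacency[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return components


network = build_network(EDGES)
components = connected_components(network)
# Degree (number of distinct partners) is a simple first measure of hubs.
degrees = {node: len(neighbors) for node, neighbors in network.items()}
```

The same adjacency map could be handed to a graph library such as NetworkX or exported to Cytoscape for visualization; the standard-library version is used here only to keep the sketch self-contained.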

History and Development

BioGRID originated from initiatives in the early 2000s spearheaded by researchers affiliated with institutions such as the University of Toronto and the Samuel Lunenfeld Research Institute, in collaboration with community efforts such as the IMEx consortium and the PSI-MI standard. Over time, development has incorporated community standards and guidelines from HUPO, GA4GH, and the FAIR Data Principles, along with toolchains used by ELIXIR. Funding and partnerships have involved agencies and organizations like the Canadian Institutes of Health Research, the National Institutes of Health, the Wellcome Trust, the European Commission, and private foundations supporting open data. Successive development cycles have added support for chemical interactions, post-translational modification curation, and enhanced API services to meet the needs of diverse research communities.