BioCyc — LLMpedia

BioCyc
Name	BioCyc
Developer	SRI International
Released	1998
Programming language	Common Lisp (Pathway Tools), with Java and Python APIs
Operating system	Cross-platform
License	Free for academic use; commercial licenses available

Contents

Overview
Content and Databases
Data Curation and Sources
Software and Tools
Applications and Use Cases
Access and Licensing

BioCyc is a curated collection of Pathway/Genome Databases and a suite of bioinformatics tools that integrate genomic, metabolic, and regulatory information for sequenced organisms. Its resources support comparative analysis, pathway visualization, and metabolic modeling, and are used by researchers at institutions such as Stanford University, Harvard University, Massachusetts Institute of Technology, University of California, Berkeley, and SRI International. The project cross-references databases and initiatives including UniProt, NCBI, EMBL-EBI, Ensembl, and KEGG to provide linked biological knowledge.

Overview

BioCyc began as a project to catalog the metabolic pathways and enzymes of model organisms such as Escherichia coli and expanded into a collection covering bacteria, archaea, and eukaryotes, including Saccharomyces cerevisiae and Arabidopsis thaliana. It links genome annotations from repositories such as GenBank, RefSeq, TAIR, and WormBase with pathway data and with the enzyme nomenclature maintained by the International Union of Biochemistry and Molecular Biology (IUBMB). The platform supports users at research centers such as Cold Spring Harbor Laboratory, Johns Hopkins University, Max Planck Society, and European Molecular Biology Laboratory.

Content and Databases

BioCyc comprises multiple component databases: organism-specific Pathway/Genome Databases (PGDBs) such as EcoCyc, together with the multi-organism reference database MetaCyc. These databases contain curated information on metabolic pathways, enzymatic reactions, genes, and regulatory interactions. The collection interoperates with ontologies and standards such as Gene Ontology and the Systems Biology Markup Language (SBML, including Level 3), and it cross-references resources such as PubMed, Protein Data Bank, Reactome, and MetaNetX. It includes curated pathway collections for microbes studied at Lawrence Berkeley National Laboratory, Los Alamos National Laboratory, and research consortia associated with the Wellcome Trust and the Gates Foundation.
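The way a PGDB ties pathways to reactions and reactions to genes can be illustrated with a small in-memory model. The following is only a sketch under stated assumptions: the class names (`Reaction`, `Pathway`, `ToyPGDB`), the method `genes_in_pathway`, and every identifier and EC number in it are hypothetical placeholders, and they do not reflect the actual BioCyc or Pathway Tools schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Reaction:
    reaction_id: str
    ec_number: str  # placeholder Enzyme Commission number, not a real assignment
    gene_ids: List[str] = field(default_factory=list)


@dataclass
class Pathway:
    pathway_id: str
    name: str
    reaction_ids: List[str] = field(default_factory=list)


@dataclass
class ToyPGDB:
    """Hypothetical stand-in for one organism-specific database."""
    organism: str
    reactions: Dict[str, Reaction] = field(default_factory=dict)
    pathways: Dict[str, Pathway] = field(default_factory=dict)

    def genes_in_pathway(self, pathway_id: str) -> List[str]:
        """Return the sorted, de-duplicated genes behind a pathway's reactions."""
        pathway = self.pathways[pathway_id]
        genes = set()
        for reaction_id in pathway.reaction_ids:
            genes.update(self.reactions[reaction_id].gene_ids)
        return sorted(genes)


# Placeholder records; none of these come from a real database.
toy_db = ToyPGDB(
    organism="Example organism",
    reactions={
        "RXN-A": Reaction("RXN-A", "0.0.0.1", ["geneA", "geneB"]),
        "RXN-B": Reaction("RXN-B", "0.0.0.2", ["geneB", "geneC"]),
    },
    pathways={
        "PWY-X": Pathway("PWY-X", "example pathway", ["RXN-A", "RXN-B"]),
    },
)

print(toy_db.genes_in_pathway("PWY-X"))
```

Keeping reactions and pathways in separate dictionaries keyed by identifier means a reaction shared by several pathways is stored once and referenced by ID, which is one plausible way to model the cross-linking described above.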

Data Curation and Sources

Curation in BioCyc draws on primary literature indexed in PubMed Central and on curated annotation efforts from groups at University of Cambridge, ETH Zurich, University of Oxford, and University of Chicago. Genome annotations are integrated from sequencing centers such as the Broad Institute and from national resources such as the European Nucleotide Archive and the DNA Data Bank of Japan. Functional assignments reference Enzyme Commission (EC) numbers from IUBMB, and metabolic reconstructions follow community standards such as the Systems Biology Markup Language, as practiced at Institut Pasteur and CNRS laboratories. Community curation workflows mirror collaborative models used by Wikidata and by curated portals such as UniProt and STRING.

Software and Tools

BioCyc provides software for visualization and analysis, including pathway maps, genome browsers, and metabolic modeling tools, comparable in purpose to Cytoscape, COBRA Toolbox, Jupyter, Bioconductor, and Galaxy. It offers programmatic access via APIs and batch tools used by teams at Google Research, Microsoft Research, Amazon Web Services, and academic groups at Princeton University, Yale University, University of Toronto, and UCSF. Visualization components draw on techniques popularized by platforms such as RStudio and on libraries used in projects at Imperial College London and Karolinska Institutet.
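A batch-style comparative analysis of the kind mentioned above might look like the following minimal sketch. It assumes that pathway inventories for two organisms have already been exported as tab-delimited text with a `pathway_id` column. That column name, the helper functions `read_pathway_ids` and `compare_inventories`, and the sample rows are all hypothetical, and no real BioCyc API call or export format is shown.

```python
import csv
import io
from typing import Dict, List, Set


def read_pathway_ids(tsv_text: str) -> Set[str]:
    """Collect pathway identifiers from tab-delimited text with a header row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["pathway_id"] for row in reader}


def compare_inventories(first: Set[str], second: Set[str]) -> Dict[str, List[str]]:
    """Split two pathway inventories into shared and organism-specific sets."""
    return {
        "shared": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


# Placeholder exports; the identifiers are invented for illustration.
ORGANISM_1_TSV = (
    "pathway_id\tname\n"
    "PWY-X\texample pathway X\n"
    "PWY-Y\texample pathway Y\n"
)
ORGANISM_2_TSV = (
    "pathway_id\tname\n"
    "PWY-Y\texample pathway Y\n"
    "PWY-Z\texample pathway Z\n"
)

result = compare_inventories(
    read_pathway_ids(ORGANISM_1_TSV),
    read_pathway_ids(ORGANISM_2_TSV),
)
print(result)
```

Set operations keep the comparison independent of row order, and the same two functions would extend to more than two organisms by comparing each pair in turn.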

Applications and Use Cases

Researchers apply BioCyc to comparative genomics studies in Broad Institute projects, to metabolic engineering at companies and labs such as Genentech, Novartis, Pfizer, and Roche, and to synthetic biology initiatives affiliated with the MIT Media Lab and the Harvard Wyss Institute. Agrigenomics groups at USDA and Syngenta use its plant databases for trait mapping in CIMMYT collaborations, while microbiome researchers at European Molecular Biology Laboratory and Max Planck Institute for Developmental Biology integrate PGDBs with metagenomic pipelines from MG-RAST and the Human Microbiome Project. Clinical and pharmaceutical studies reference BioCyc-curated pathways when investigating targets studied by NIH, FDA, and translational centers such as Dana-Farber Cancer Institute.

Access and Licensing

BioCyc offers tiered access: a free tier for academic users and subscription or commercial licensing for enterprise use, similar to licensing models at Elsevier, Springer Nature, and Clarivate. Institutional subscribers at universities such as Columbia University, University of Washington, and University of Michigan obtain advanced features and bulk download options, and collaborations with infrastructure providers such as Amazon Web Services and Google Cloud support large-scale analyses. Data exchange formats and licensing terms follow community norms exemplified by Creative Commons and by the data-sharing policies of funding bodies such as the National Science Foundation and the Wellcome Trust.

Category:Bioinformatics