| Galaxy (bioinformatics platform) | |
|---|---|
| Name | Galaxy |
| Developer | Pennsylvania State University; Johns Hopkins University; University of Pennsylvania; Emory University |
| Released | 2005 |
| Programming language | Python |
| Platform | Web application |
| License | Academic Free License |
Galaxy is an open, web-based platform for data-intensive biomedical research that enables reproducible analyses, interactive data visualization, and sharing of computational workflows. It supports a broad array of genomics and proteomics tools and integrates with public repositories and infrastructure to serve researchers across institutions and projects. Galaxy fosters collaboration among bioinformatics groups, academic centers, and research consortia while prioritizing transparency and reproducibility in computational research.
Galaxy provides a browser-accessible workbench that connects tools, data, and computational resources so that researchers can perform analyses without command-line expertise. It interoperates with resources maintained by organizations such as the National Institutes of Health, the European Bioinformatics Institute, the Wellcome Sanger Institute, the Broad Institute, and the National Center for Biotechnology Information, enabling import from and export to repositories like the Sequence Read Archive and Ensembl. The platform supports workflows compatible with standards and initiatives involving the Global Alliance for Genomics and Health, ELIXIR, the National Human Genome Research Institute, The Cancer Genome Atlas, and Human Cell Atlas collaborators. Galaxy’s web UI, API, and tool integration model connect to services and infrastructure including Amazon Web Services, Google Cloud Platform, Microsoft Azure, and high-performance computing centers at institutions such as Argonne National Laboratory and Lawrence Berkeley National Laboratory.
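As an illustration of programmatic access, the following minimal sketch builds an authenticated request against a Galaxy server's REST API using only the Python standard library. It assumes the `/api/histories` endpoint and the `x-api-key` header for authentication; the server URL, the API key, and the helper name `build_histories_request` are placeholders introduced for this example. The request is constructed but not sent.

```python
from urllib.parse import urljoin
from urllib.request import Request


def build_histories_request(base_url: str, api_key: str) -> Request:
    """Build (but do not send) a GET request that lists a user's histories."""
    # Normalize to exactly one trailing slash so urljoin keeps any path prefix.
    url = urljoin(base_url.rstrip("/") + "/", "api/histories")
    # The API key is passed in the "x-api-key" HTTP header.
    return Request(url, headers={"x-api-key": api_key}, method="GET")


# Placeholder server URL and key; substitute real values before sending.
request = build_histories_request("https://galaxy.example.org/", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call against a real server. Client libraries such as BioBlend wrap the same API at a higher level.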
The platform originated in the mid-2000s from collaborative efforts among researchers at Pennsylvania State University and the University of Pennsylvania, with early adopters at Johns Hopkins University and Emory University. Influential milestones align with public projects and funding from agencies like the National Science Foundation and the National Institutes of Health, and from philanthropic organizations such as the Gordon and Betty Moore Foundation and the Wellcome Trust. Contributions from groups at the University of Cambridge, the European Molecular Biology Laboratory, Cold Spring Harbor Laboratory, the Max Planck Society, and Karolinska Institutet expanded tool sets and infrastructure. Galaxy’s evolution paralleled large-scale efforts such as the 1000 Genomes Project, the ENCODE Project, the International Cancer Genome Consortium, and the Human Microbiome Project, driving features for reproducibility, provenance, and sharing across consortia including BD2K and GA4GH working groups.
The architecture separates the web application layer, the job execution layer, and data management. It is built primarily in Python and integrates container technologies such as Docker and Singularity. Core components include the web-based user interface, a Tool Shed for sharing tools (analogous to GitHub), an API layer supporting automation and integration with workflow standards and engines such as the Common Workflow Language and Nextflow, and a metadata and provenance store that supports reproducibility practices consistent with the FAIR data principles. Authentication and identity integration use protocols and services like OAuth 2.0, CILogon, and institutional identity providers in federations such as InCommon. Job runners interface with resource managers including SLURM and PBS Professional, and with cloud orchestration tools used at institutions like Oak Ridge National Laboratory and the National Energy Research Scientific Computing Center.
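The separation between the web application and the job execution layer can be pictured as a routing decision made for each job. The sketch below is a simplified illustration, not Galaxy's actual job configuration: the destination names, the set of heavy tools, the 1 GiB threshold, and the function `choose_destination` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Destination:
    """A place where jobs run; all values here are illustrative."""
    name: str
    runner: str  # e.g. "local" or "slurm"
    cores: int


# Hypothetical destinations: a local runner and a SLURM-backed cluster queue.
LOCAL = Destination("local_small", runner="local", cores=1)
CLUSTER = Destination("slurm_large", runner="slurm", cores=16)

# Hypothetical identifiers for tools treated as compute-heavy.
HEAVY_TOOLS = {"bwa_mem", "star", "gatk_haplotypecaller"}

ONE_GIB = 1024 ** 3


def choose_destination(tool_id: str, input_size_bytes: int) -> Destination:
    """Send heavy tools or inputs larger than 1 GiB to the cluster."""
    if tool_id in HEAVY_TOOLS or input_size_bytes > ONE_GIB:
        return CLUSTER
    return LOCAL


print(choose_destination("bwa_mem", 10_000).name)
print(choose_destination("cut_columns", 10_000).name)
```

Keeping the routing rule in one small function means the web layer only needs to know a tool identifier and an input size, while the details of each resource manager stay behind the destination definition.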
Galaxy includes an extensible tool ecosystem for sequence alignment, variant calling, differential expression, and metagenomics, relying on established tools and databases such as BWA, Bowtie, BLAST, HISAT2, STAR, GATK, SAMtools, BCFtools, DESeq2, edgeR, and QIIME. Workflows are constructed visually and can be published, versioned, and cited, facilitating reproducible pipelines for projects such as analyses of The Cancer Genome Atlas data or metagenomics surveys. Visualization components integrate with genome browsers such as the Integrative Genomics Viewer and the UCSC Genome Browser, and support interactive plotting libraries used by projects at the European Bioinformatics Institute and the Broad Institute. Data provenance, history tracking, and sharing features align with policies from funders such as the National Institutes of Health and journals like Nature and Science.
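A workflow of this kind can be modeled as a directed acyclic graph in which each step depends on the outputs of earlier steps. The following sketch, which assumes Python 3.9 or later for `graphlib`, derives a valid execution order for a hypothetical variant-calling workflow. The step names and the helper `execution_order` are illustrative and do not correspond to a published Galaxy workflow or to Galaxy's internal scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
workflow = {
    "quality_control": set(),
    "align_reads": {"quality_control"},
    "sort_alignments": {"align_reads"},
    "call_variants": {"sort_alignments"},
    "qc_report": {"quality_control"},
}


def execution_order(steps: dict) -> list:
    """Return one ordering in which every step follows its dependencies."""
    return list(TopologicalSorter(steps).static_order())


order = execution_order(workflow)
print(order)
```

Several orderings can be valid when steps are independent, as with `qc_report` and `align_reads` here. A cycle in the graph raises `graphlib.CycleError`, which is one way to reject a malformed workflow before any job runs.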
Galaxy can be deployed as a public service, an institutional instance, or a local appliance. Automated deployments use tools common in major research infrastructures, including Terraform, Kubernetes, and Ansible, as well as cloud marketplace offerings on Amazon Web Services and Google Cloud Platform. Scalability strategies address workloads ranging from single labs to national services provided by organizations like ELIXIR members and regional clouds such as NeCTAR. Performance tuning and elastic scaling practices incorporate technologies and operational models developed at centers including the European Molecular Biology Laboratory's European Bioinformatics Institute, the Wellcome Sanger Institute, and the Johns Hopkins University Bioinformatics Core.
The Galaxy ecosystem is sustained by an international community and steering structures that involve academic nodes at Penn State Hershey Medical Center, Emory University, Johns Hopkins University, and the University of Pennsylvania, along with partners across Europe, Asia, and Australia. Governance and coordination rely on community workshops, training programs, and annual meetings such as the Galaxy Community Conference, run in conjunction with organizations such as The Carpentries, FAIRsharing, and ELIXIR, and with funders like the NIH Office of Data Science Strategy. Training materials, MOOCs, and tutorials have been developed in collaboration with educators from Cold Spring Harbor Laboratory, the European Bioinformatics Institute, The Jackson Laboratory, and Fred Hutchinson Cancer Research Center to support reproducible research and capacity building across consortia including H3Africa and the Global Alliance for Genomics and Health.