Generated by GPT-5-mini

| Galaxy (platform) | |
|---|---|
| Name | Galaxy (platform) |
| Developer | Pennsylvania State University; Galaxy Project |
| Released | 2005 |
| Programming language | Python, JavaScript |
| Operating system | Linux, macOS (Unix-like systems) |
| Genre | Scientific workflow management system |
| License | Academic Free License |
Galaxy is an open, web-based scientific workflow platform designed for accessible, reproducible, and transparent analysis of large-scale biomedical data. It provides a graphical user interface and programmatic interfaces for integrating tools, managing datasets, and sharing workflows across research communities, institutions, and infrastructure projects. Galaxy emphasizes provenance, collaboration, and interoperability with resources from organizations such as the National Institutes of Health and the European Bioinformatics Institute, and with cloud providers such as Amazon Web Services.
Galaxy originated in 2005 from work at Pennsylvania State University and early collaborations with groups at the Center for Genome Research and Biocomputing and the University of Pennsylvania. Influenced by projects such as Bioconductor, Taverna, and Cytoscape, Galaxy evolved through contributions from initiatives such as the National Institutes of Health Big Data to Knowledge program and the European Molecular Biology Laboratory. Key milestones include the introduction of the web-based workflow editor, the establishment of the Galaxy Project organization, partnerships with the European Bioinformatics Institute, and funding from agencies including the National Science Foundation and the Wellcome Trust.
Galaxy's architecture combines a web application front end, an application server, a job management layer, and tool integration components. The server side is implemented in Python and the browser client in JavaScript, with RESTful APIs patterned after specifications from the OpenAPI Initiative and interoperability efforts such as those of the Global Alliance for Genomics and Health. The job execution model integrates with batch systems such as the Slurm Workload Manager and Sun Grid Engine, and with cloud services from Amazon Web Services and Google Cloud Platform. Core components include the Tool Shed for sharing tool wrappers, the workflow engine, the dataset provenance database, and the visualization framework used by projects such as the ENCODE Project and the 1000 Genomes Project.
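As a rough illustration of the RESTful interface described above, the sketch below builds (but does not send) a request that would list a user's histories. It assumes the `/api/histories` endpoint and the `x-api-key` header as commonly described in Galaxy's API documentation; the host name, the key value, and the helper name `build_histories_request` are placeholders invented for this example.

```python
from urllib.parse import urljoin
from urllib.request import Request


def build_histories_request(base_url: str, api_key: str) -> Request:
    """Build (but do not send) a GET request for the histories collection.

    Hypothetical helper: the endpoint path and header name are assumptions
    based on Galaxy's public API documentation, not taken from this article.
    """
    # Normalise to exactly one trailing slash so urljoin keeps any path prefix.
    url = urljoin(base_url.rstrip("/") + "/", "api/histories")
    request = Request(url, method="GET")
    request.add_header("x-api-key", api_key)  # assumed authentication header
    request.add_header("Accept", "application/json")
    return request


# Placeholder server and key; replace with a real instance to actually call it.
example = build_histories_request("https://galaxy.example.org", "HYPOTHETICAL_KEY")
print(example.get_method(), example.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call against a real server; it is left out here so the sketch runs without network access.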
Galaxy provides features for tool integration, workflow composition, data provenance, and sharing. The tool integration system uses XML-based tool descriptors inspired by conventions from the Bioinformatics Open Source Conference community and supports containerization technologies such as Docker and Singularity. Workflow composition is done in a drag-and-drop editor and supports formats interoperable with the Common Workflow Language and the Workflow Definition Language. Data provenance and reproducibility rely on metadata models aligned with persistent-identifier practices such as the Digital Object Identifier system and with archives such as the Sequence Read Archive. Authentication and authorization can be federated with providers such as ORCID and ELIXIR.
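To make the XML-based tool descriptor idea concrete, the following minimal sketch embeds a small descriptor and extracts its declared inputs, outputs, and container requirement. The element names (`tool`, `requirements`, `container`, `command`, `inputs`, `param`, `outputs`, `data`) follow the general shape of Galaxy tool XML as far as I am aware; the tool itself (`example_line_count`), the container image, and the `summarize_tool` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative descriptor only: the tool id, command, and image are invented.
TOOL_XML = """
<tool id="example_line_count" name="Line count (example)" version="0.1.0">
    <requirements>
        <container type="docker">busybox:1.36</container>
    </requirements>
    <command><![CDATA[wc -l < '$input' > '$output']]></command>
    <inputs>
        <param name="input" type="data" format="txt" label="Text file"/>
    </inputs>
    <outputs>
        <data name="output" format="txt"/>
    </outputs>
</tool>
"""


def summarize_tool(xml_text: str) -> dict:
    """Return the id, version, command, inputs, outputs, and containers of a descriptor."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.get("id"),
        "version": root.get("version"),
        "command": root.findtext("command", default="").strip(),
        "inputs": [p.get("name") for p in root.findall("./inputs/param")],
        "outputs": [d.get("name") for d in root.findall("./outputs/data")],
        "containers": [c.text.strip() for c in root.findall("./requirements/container")],
    }


summary = summarize_tool(TOOL_XML)
print(summary["id"], summary["inputs"], summary["outputs"], summary["containers"])
```

The point of the descriptor is that the command line, its typed inputs and outputs, and its runtime environment are all declared in one file, which is what lets a platform render a form for the tool and record its parameters for provenance.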
Galaxy can be deployed on single servers, high-performance computing clusters, and cloud platforms. Reference deployment patterns include community instances such as those hosted by the European Bioinformatics Institute and institutional deployments at universities such as Johns Hopkins University. Scalability strategies use job runners for the Slurm Workload Manager, container orchestration with Kubernetes, and object storage backends compatible with Amazon S3 and OpenStack Swift. Performance tuning frequently draws on best practices from the National Institute of Standards and Technology and on case studies conducted by groups affiliated with the Wellcome Sanger Institute.
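The idea of routing jobs to different back ends can be sketched as a small decision function. This is a conceptual illustration only: the `JobRequest` class, the `choose_destination` function, the destination labels, and the thresholds are all hypothetical and do not reproduce Galaxy's actual job configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRequest:
    """Hypothetical description of the resources one tool run asks for."""
    tool_id: str
    cores: int
    memory_gb: float
    containerized: bool = False


def choose_destination(job: JobRequest) -> str:
    """Pick a destination label using simple, made-up thresholds."""
    if job.containerized:
        # Container-based tools go to the orchestrator in this sketch.
        return "kubernetes"
    if job.cores > 1 or job.memory_gb > 4:
        # Anything beyond a small single-core job goes to the batch scheduler.
        return "slurm"
    # Small jobs run on the server that hosts the web application.
    return "local"


jobs = [
    JobRequest("text_filter", cores=1, memory_gb=1),
    JobRequest("read_aligner", cores=16, memory_gb=64),
    JobRequest("variant_caller", cores=4, memory_gb=8, containerized=True),
]
for job in jobs:
    print(job.tool_id, "->", choose_destination(job))
```

Keeping the routing decision in one place means a deployment can start on a single server and later add a cluster or cloud back end without changing how tools are described.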
Development is coordinated by the Galaxy Project with contributions from diverse institutions including Pennsylvania State University, the European Bioinformatics Institute, and Johns Hopkins University, and from community hubs such as UseGalaxy.org. The open-source codebase is hosted on GitHub, with governance practices akin to those of the Apache Software Foundation. Community activities include annual events such as the Galaxy Community Conference and collaborations with consortia such as ELIXIR and the Global Alliance for Genomics and Health. Training and outreach draw on materials from the Carpentries and on workshops at conferences such as that of the American Society of Human Genetics.
Galaxy is widely used in genomics, transcriptomics, metagenomics, and proteomics research. Applications span projects such as the ENCODE Project, the 1000 Genomes Project, and The Cancer Genome Atlas, as well as pathogen surveillance efforts coordinated with the Centers for Disease Control and Prevention. Clinical and translational pipelines employ Galaxy in workflows for variant calling, RNA-seq analysis, and microbial genomics, integrated with standards from the Clinical Laboratory Improvement Amendments and reporting frameworks used by the National Health Service in England. Educational deployments support teaching at institutions such as the University of Cambridge and the University of California, San Diego.
Category:Bioinformatics
Category:Scientific workflows
Category:Open-source software