| Globus (software) | |
|---|---|
| Name | Globus |
| Developer | University of Chicago; Argonne National Laboratory; Globus Alliance |
| Released | 2001 |
| Programming language | Java; Python |
| Operating system | Linux, Windows, macOS |
| Platform | Grid computing; Cloud computing |
| Genre | File transfer; Data management; Identity federation |
Globus is a research-oriented data management and file transfer platform originally developed to support large-scale scientific collaboration across institutions such as Argonne National Laboratory, Lawrence Berkeley National Laboratory, and the University of Chicago. It integrates identity federation, high-performance transfer, and data sharing capabilities for projects in domains like high-energy physics, astronomy, bioinformatics, and earth science. The project emerged from the grid computing movement and has been used in collaborations involving facilities such as Oak Ridge National Laboratory, CERN, and the National Energy Research Scientific Computing Center.
Globus provides services for reliable, high-performance data movement, remote data access, and controlled data sharing across institutional boundaries. It builds on protocols and software developed in the GridFTP ecosystem and interacts with middleware from initiatives such as the Open Grid Forum and the European Grid Infrastructure. The platform emphasizes interoperability with identity federations such as InCommon and eduGAIN and with OAuth 2.0 deployments, and it integrates with storage systems including Network File System, Amazon S3, and Ceph clusters.
The software traces its origins to early 2000s efforts to enable distributed computing for projects funded by agencies like the U.S. Department of Energy and the National Science Foundation. Initial components grew out of the activities of the Globus Alliance and collaborations among laboratories such as Argonne National Laboratory and Los Alamos National Laboratory. Over time, the codebase and service offerings evolved alongside milestones in grid computing and cloud computing, contributing to deployments at facilities like Fermilab and multinational efforts at CERN. The governance and operational model shifted as partnerships with academic institutions and commercial entities matured.
Globus is architected as a set of loosely coupled services and client software, enabling integration with diverse computing environments such as those managed by XSEDE, PRACE, and regional research infrastructures. Core components include a managed transfer service that leverages GridFTP servers, an identity and access management layer compatible with SAML and OAuth 2.0, and a web-based sharing service that implements access control and provenance metadata. The system exposes APIs consumable by applications written in languages like Python and Java, as well as by frameworks used in projects at Lawrence Livermore National Laboratory and Sandia National Laboratories. Deployment patterns range from head-node agents on HPC clusters to connectors for object stores such as Amazon S3 and block storage on platforms like OpenStack.
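The access control performed by a sharing service can be illustrated with a small path-prefix rule check. The following is a minimal sketch, not Globus's actual data model or API: the `AccessRule` class, the `is_allowed` function, the identity strings, and the paths are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class AccessRule:
    """Hypothetical sharing rule: grants permissions on a directory subtree."""
    principal: str    # identity the rule applies to (illustrative format)
    path: str         # directory prefix covered by the rule
    permissions: str  # "r" for read-only, "rw" for read-write


def is_allowed(rules, principal, target, needed):
    """Return True if some rule grants `needed` ("r" or "w") on `target`."""
    target_path = PurePosixPath(target)
    for rule in rules:
        if rule.principal != principal:
            continue
        prefix = PurePosixPath(rule.path)
        # A rule covers the prefix directory itself and everything beneath it.
        covered = target_path == prefix or prefix in target_path.parents
        if covered and needed in rule.permissions:
            return True
    return False


# Illustrative rules; the identities and paths are made up for this sketch.
rules = [
    AccessRule("alice@example.edu", "/projects/run42/", "rw"),
    AccessRule("bob@example.edu", "/projects/run42/public/", "r"),
]

print(is_allowed(rules, "alice@example.edu", "/projects/run42/data.h5", "w"))
print(is_allowed(rules, "bob@example.edu", "/projects/run42/data.h5", "r"))
```

In this sketch access is denied by default and granted only when a rule for the same identity covers the requested path. A production system would also need to normalize paths (for example, rejecting `..` segments) before matching.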
Globus has been used to move petabyte-scale datasets supporting experiments in high-energy physics collaborations at CERN and analysis workflows at Oak Ridge National Laboratory. It has enabled data sharing and reproducible research in genomics studies run by institutions like the Broad Institute and facilitated earth observation data distribution for projects at the National Oceanic and Atmospheric Administration. Research consortia such as those convened by the National Institutes of Health and the European Commission have adopted the platform to streamline cross-institutional workflows, while observatories collaborating with the Space Telescope Science Institute have used it for archival transfers. Commercial partners and technology providers integrate Globus-style transfer capabilities into services offered by Amazon Web Services, Google Cloud Platform, and vendors supplying research computing stacks.
Security in Globus centers on federated identity, encrypted transfer channels, and fine-grained access control. The platform interoperates with identity providers in InCommon and supports standards such as SAML and OAuth 2.0 to authenticate users from universities and laboratories including MIT, Stanford University, and the University of California, Berkeley. Data movement uses TLS-protected channels and mechanisms derived from GridFTP to ensure integrity and confidentiality during transfer between endpoints like cluster file systems at Argonne National Laboratory and cloud object stores. Institutional deployments map onto compliance regimes relevant to agencies such as the U.S. Department of Energy and to health-focused initiatives under the National Institutes of Health for controlled-access datasets.
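The idea of integrity checking during data movement can be illustrated by comparing checksums of the source and destination after a copy. This is a generic sketch and not the mechanism Globus or GridFTP implements: the choice of SHA-256, the local file copy standing in for a network transfer, and the function names `sha256_of` and `copy_and_verify` are all assumptions made for illustration.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy a file (a local stand-in for a transfer) and compare checksums."""
    shutil.copyfile(source, destination)
    return sha256_of(source) == sha256_of(destination)


# Small demonstration using temporary files.
with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "source.bin")
    dst = os.path.join(workdir, "destination.bin")
    with open(src, "wb") as handle:
        handle.write(os.urandom(4096))
    print(copy_and_verify(src, dst))
```

A checksum comparison of this kind detects corruption of the stored copy; confidentiality and in-flight protection come from the encrypted channel rather than from the checksum.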
Development historically involved partners from national laboratories, universities, and research consortia, including the Globus Alliance and academic groups at Carnegie Mellon University and the University of Southern California. The community encompasses researchers, system administrators at sites like Fermilab and NERSC, and software engineers collaborating through issue trackers and project repositories influenced by practices from the Apache Software Foundation and other open-source communities. Workshops and training events co-located with conferences such as the Supercomputing conference, along with meetings organized by XSEDE and the Open Grid Forum, foster adoption and feedback.
Historically, component licensing has included open-source licenses for client libraries and protocol implementations, while managed services have been offered under subscription or institutional agreements. Deployment options range from the hosted, managed service adopted by several universities and laboratories to on-premises installations integrated with local authentication and storage, similar to patterns used by the National Center for Supercomputing Applications, and to cloud deployments on Amazon Web Services or OpenStack infrastructures. Institutions negotiate service agreements that reflect the operational support levels required by projects funded through agencies like the National Science Foundation and the U.S. Department of Energy.