LLMpedia: The first transparent, open encyclopedia generated by LLMs

Simple API for Grid Applications

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Article Genealogy
Parent: Canarie Hop 5
Expansion funnel: Raw 52 → Dedup 0 → NER 0 → Enqueued 0
1. Extracted: 52
2. After dedup: 0 (None)
3. After NER: 0
4. Enqueued: 0
Simple API for Grid Applications
Name: Simple API for Grid Applications
Developer: Open Grid Forum
Released: 2002
Programming language: C, C++
Operating system: Unix-like, Microsoft Windows
License: BSD-style

Simple API for Grid Applications is an application programming interface designed to provide a uniform, portable interface for distributed resource management, data movement, and job submission across heterogeneous grid and cluster environments. It was developed to bridge middleware and application layers, enabling interoperability among projects and institutions that operate disparate infrastructures, such as research laboratories, supercomputing centers, and international collaborations. The API influenced subsequent middleware efforts and was referenced by standards bodies and consortia in the early 2000s.

Overview

The API emerged from collaboration among academic groups and standards organizations aiming to harmonize interactions with resource managers such as Portable Batch System, Sun Grid Engine, Slurm Workload Manager, Torque (software), and platforms from vendors like IBM and Intel. Stakeholders included members of the Open Grid Forum, European Grid Infrastructure, National Science Foundation, Lawrence Livermore National Laboratory, and CERN. The specification focused on a minimal, extensible surface for job lifecycle control, file staging, and event notification, reflecting lessons from projects such as Globus Toolkit, Condor (software), UNICORE, and early cluster initiatives at Los Alamos National Laboratory.

Architecture and Components

The architecture defines a small set of abstractions that map to resources managed by systems like PBS Professional, Microsoft Windows HPC Server, and proprietary scheduler solutions used at institutions such as Argonne National Laboratory and Oak Ridge National Laboratory. Core components include:

- Job objects corresponding to tasks submitted to supercomputer queues at centers like NERSC and Jülich Research Centre.
- File staging primitives that interact with transfer services from projects like GridFTP, FTS (File Transfer Service), and storage systems maintained by EMBL-EBI or Large Hadron Collider collaborations.
- Event and notification hooks compatible with messaging frameworks used in Message Passing Interface deployments and orchestration stacks developed at Los Alamos National Laboratory.
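The core abstractions above can be sketched as a job object that walks a lifecycle state machine and fires notification hooks on each transition. This is an illustrative model, not the specified API: the class and method names are invented for the sketch, and the state set is modeled loosely on the job states in the SAGA specification.

```python
from enum import Enum


class JobState(Enum):
    """Lifecycle states, modeled loosely on the SAGA job state set."""
    NEW = "New"
    RUNNING = "Running"
    DONE = "Done"
    FAILED = "Failed"
    CANCELED = "Canceled"
    SUSPENDED = "Suspended"


class Job:
    """Illustrative job object: holds a description, enforces legal
    state transitions, and notifies registered callbacks (the 'event
    and notification hooks' component)."""

    _transitions = {
        JobState.NEW: {JobState.RUNNING, JobState.CANCELED},
        JobState.RUNNING: {JobState.DONE, JobState.FAILED,
                           JobState.CANCELED, JobState.SUSPENDED},
        JobState.SUSPENDED: {JobState.RUNNING, JobState.CANCELED},
    }

    def __init__(self, description):
        self.description = description      # e.g. executable, arguments
        self.state = JobState.NEW
        self._callbacks = []

    def add_callback(self, fn):
        # Event hook: fn(job, new_state) is invoked on every transition.
        self._callbacks.append(fn)

    def _set_state(self, new_state):
        allowed = self._transitions.get(self.state, set())
        if new_state not in allowed:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        for fn in self._callbacks:
            fn(self, new_state)

    def run(self):
        self._set_state(JobState.RUNNING)

    def cancel(self):
        self._set_state(JobState.CANCELED)
```

In this sketch, terminal states (Done, Failed, Canceled) have no outgoing transitions, so attempting to rerun a canceled job raises an error, mirroring the kind of lifecycle discipline a scheduler adapter must enforce.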

Interfaces were intentionally lightweight to allow binding implementations for languages and toolkits common in research computing, enabling integration with software stacks from Apple, Microsoft, and open-source ecosystems driven by communities around GitHub and SourceForge.

Programming Model and Interfaces

The programming model exposes synchronous and asynchronous operations for submission, control, and monitoring of jobs, borrowing patterns familiar to developers of middleware such as Globus Toolkit and schedulers like Slurm Workload Manager. APIs are specified in C with conventions that ease wrapping for Python (programming language), Java (programming language), and environments used at institutions like MIT and Stanford University. Typical interfaces include:

- Job description templates compatible with job scripts used at Lawrence Berkeley National Laboratory and Fermilab.
- Callbacks and poll-based monitors that interoperate with event libraries popularized by projects at Carnegie Mellon University.
- File transfer primitives that can be adapted to secure-channel protocols endorsed by Internet Engineering Task Force working groups.
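The synchronous/asynchronous pairing described above can be illustrated with a minimal submission service: the same operation is offered once as a blocking call and once as a call that returns immediately with a handle the caller can poll or wait on. All names here are invented for the sketch; it stands in for a binding over a real scheduler.

```python
import concurrent.futures
import time


class JobService:
    """Illustrative submission endpoint offering synchronous and
    asynchronous variants of the same operation, as the programming
    model pairs them. Method names are invented for this sketch."""

    def __init__(self, max_workers=4):
        self._pool = concurrent.futures.ThreadPoolExecutor(max_workers)

    def run_job(self, command):
        # Synchronous variant: block until the task completes.
        return self._execute(command)

    def run_job_async(self, command):
        # Asynchronous variant: return a handle immediately; the caller
        # may poll handle.done() or block on handle.result().
        return self._pool.submit(self._execute, command)

    @staticmethod
    def _execute(command):
        # Stand-in for handing the job to a batch scheduler.
        time.sleep(0.01)
        return f"ran: {command}"


svc = JobService()
print(svc.run_job("echo hello"))          # blocks until done
handle = svc.run_job_async("echo world")  # returns a future-like handle
print(handle.result())                    # wait for completion
```

The handle doubles as a poll-based monitor: a portal or workflow engine can check `handle.done()` periodically instead of blocking, which is the pattern the callback and polling interfaces above generalize.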

Bindings and adapters were developed to integrate with portal frameworks deployed in collaborations such as Open Science Grid and services operated by European Commission funded infrastructures.

Use Cases and Implementations

Implementations addressed scenarios in high-energy physics, bioinformatics, climate modeling, and computational chemistry at sites including CERN, European Molecular Biology Laboratory, NASA Ames Research Center, and NOAA. Use cases included batch submission pipelines used in experiments at the Large Hadron Collider, ensemble workflows run on systems at Argonne National Laboratory, and federated resource sharing across consortia such as Open Grid Forum members. Several research projects produced adapters to bridge the API to middleware like Globus Toolkit, Condor (software), and site-specific schedulers at national labs.

Performance and Scalability

Design decisions emphasized low overhead and scalable control paths to support the large job volumes characteristic of workflows executed at Oak Ridge National Laboratory and national supercomputing centers such as the National Energy Research Scientific Computing Center. Performance tuning focused on reducing RPC latency in high-throughput environments such as experiments at CERN and grid deployments coordinated by European Grid Infrastructure. Scalability testing often used benchmarks and workloads derived from collaborations involving NASA, NOAA, and climate consortia to validate broker and scheduler integrations.

Security and Authentication

Security considerations aligned with practices advocated by the Internet Engineering Task Force and federated identity initiatives such as Shibboleth and InCommon. Implementations commonly leveraged transport-layer security, X.509 credentials used in Grid computing deployments, and site access policies enforced at facilities like Lawrence Livermore National Laboratory and Oak Ridge National Laboratory. The API accommodated integration with credential delegation mechanisms and token-based systems promoted by projects at European Commission research programs.
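The credential-integration idea above can be sketched as a session that aggregates typed security contexts (an X.509 proxy, an SSH key, a bearer token) and selects whichever one a remote endpoint requires. The class and attribute names are invented for this illustration; delegation in a real deployment would start from the selected credential.

```python
class Context:
    """Illustrative credential holder, e.g. an X.509 proxy or a token.
    Attribute names are invented for this sketch."""

    def __init__(self, ctx_type, **attrs):
        self.type = ctx_type   # e.g. "x509", "ssh", "token"
        self.attrs = attrs     # e.g. user_proxy="/tmp/x509up_u1000"


class Session:
    """A session aggregates contexts; each operation picks the
    credential type its target endpoint requires."""

    def __init__(self):
        self.contexts = []

    def add_context(self, ctx):
        self.contexts.append(ctx)

    def credential_for(self, required_type):
        # Select the first context matching the endpoint's requirement;
        # credential delegation would begin from the returned context.
        for ctx in self.contexts:
            if ctx.type == required_type:
                return ctx
        raise LookupError(f"no credential of type {required_type!r}")
```

Keeping credentials in per-session contexts, rather than global state, lets one application hold an X.509 proxy for a grid site and a token for a federated identity provider at the same time, which matches the mixed environments described above.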

Adoption and Standards Integration

Adoption occurred through academic consortia, national laboratory projects, and standardization dialogues at the Open Grid Forum and related bodies. The API influenced later specifications and bindings in middleware stacks and informed discussions at conferences attended by practitioners from CERN, IBM, Microsoft, and major research universities. While superseded in part by integrated cloud-native orchestration platforms championed by organizations like Cloud Native Computing Foundation and commercial cloud providers such as Amazon Web Services, the API remains a notable stepping stone in the evolution of interoperable grid middleware.

Category:Grid computing software