| Graham (supercomputer) | |
|---|---|
| Name | Graham |
| Location | University of Waterloo |
| Operator | Compute Ontario and University of Waterloo |
| Manufacturer | Dell Technologies and NVIDIA |
| Architecture | x86-64 (Intel Xeon) and GPUs |
| Memory | 1.5 PB (aggregate) |
| Storage | 20 PB (parallel) |
| Speed | 1.65 PFLOPS (peak) |
| Operating system | Linux (custom) |
| Purpose | Research, high-performance computing |
| Deployment | 2017 |
**Graham** is a high-performance computing (HPC) system deployed at the University of Waterloo to serve research communities across Ontario and Canada. Funded and operated through partnerships among Compute Ontario, University of Waterloo research initiatives, and provincial infrastructure programs, Graham supports workloads ranging from computational chemistry to machine learning. The system links to regional and national infrastructures such as Compute Canada and serves collaborators including CERN experiments, the Perimeter Institute for Theoretical Physics, MaRS Discovery District partners, and numerous university research groups.
Graham was commissioned to provide mid-scale petascale capacity for academic users affiliated with the University of Waterloo, McMaster University, Western University, Queen's University, and other Ontario institutions, integrating with national platforms such as Compute Canada and with provincial strategies led by the Ontario Ministry of Research and Innovation. The project involved partnerships with industry vendors including Dell Technologies and NVIDIA, storage specialists associated with IBM and Intel, and governance ties to organizations such as CANARIE and Tri-Council research consortia. Graham aimed to accelerate research in domains represented by groups at the Perimeter Institute for Theoretical Physics, the Institute for Quantum Computing, and the Waterloo Institute for Nanotechnology, as well as clinical research hubs linked to Sunnybrook Health Sciences Centre and Princess Margaret Cancer Centre.
Graham's compute architecture combined x86-64 nodes based on Intel Xeon processors with accelerator nodes using NVIDIA Tesla GPUs, organized into a high-performance fabric built on InfiniBand-class interconnect technology and network topologies comparable to those of systems at Los Alamos National Laboratory and Lawrence Livermore National Laboratory. Storage subsystems used parallel file systems in the Lustre family, alongside object-storage approaches promoted by vendors such as Dell EMC and NetApp, providing tens of petabytes of capacity for workflows ranging from bioinformatics at The Hospital for Sick Children to climate modeling with Environment and Climate Change Canada. Cooling and power infrastructure followed best practices from large data-center operators such as Google and Amazon Web Services, while system management relied on tooling comparable to that of XSEDE and PRACE installations.
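Aggregate figures like those in the infobox arise by summing per-node resources over a heterogeneous mix of CPU-only, large-memory, and GPU node types. A minimal sketch of that accounting is below; every node count and per-node figure is a hypothetical placeholder for illustration, not Graham's real inventory.

```python
# Sketch of aggregating capacity over a heterogeneous HPC node inventory.
# All counts and per-node sizes below are hypothetical, chosen only to
# illustrate the arithmetic behind cluster-wide totals.
from dataclasses import dataclass

@dataclass
class NodeType:
    name: str
    count: int      # number of nodes of this type (hypothetical)
    cores: int      # CPU cores per node
    mem_gib: int    # RAM per node, GiB
    gpus: int = 0   # accelerators per node

node_types = [
    NodeType("base",      800, 32, 128),
    NodeType("large-mem",  50, 32, 512),
    NodeType("gpu",       150, 32, 128, gpus=2),
]

total_cores = sum(n.count * n.cores for n in node_types)
total_gpus = sum(n.count * n.gpus for n in node_types)
total_mem_tib = sum(n.count * n.mem_gib for n in node_types) / 1024

print(f"cores={total_cores}, gpus={total_gpus}, mem={total_mem_tib:.2f} TiB")
```

The same pattern extends to storage and interconnect ports; site documentation typically publishes the real per-type breakdown that such totals are derived from.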
Graham delivered peak single- and double-precision performance in the petaflop range, with sustained throughput evaluated using benchmarks such as High Performance Linpack (HPL), the STREAM memory-bandwidth test, and application kernels drawn from projects at the National Aeronautics and Space Administration and the European Centre for Medium-Range Weather Forecasts. Comparative benchmarking positioned Graham alongside regional commodity clusters and complementary national systems within Compute Canada, as well as international peers hosted at DiRAC and EPCC (Edinburgh Parallel Computing Centre). Performance characterization informed allocation policies modeled on those of NSF supercomputing centers and influenced procurement cycles comparable to acquisitions at Argonne National Laboratory.
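The STREAM benchmark measures sustainable memory bandwidth with simple vector kernels; its "triad" kernel can be sketched in NumPy as a toy analogue of the real tool (which is written in C and carefully controls compiler and threading behavior, so the figure printed here will understate what an HPC node achieves):

```python
# Toy STREAM-style "triad" memory-bandwidth probe -- an illustrative
# analogue of the STREAM benchmark, not the official implementation.
import time
import numpy as np

N = 10_000_000                 # ~80 MB per double-precision array
a = np.zeros(N)
b = np.random.rand(N)
c = np.random.rand(N)
scalar = 3.0

t0 = time.perf_counter()
a[:] = b + scalar * c          # triad kernel: a = b + s*c
elapsed = time.perf_counter() - t0

bytes_moved = 3 * N * 8        # read b, read c, write a (8 bytes each)
print(f"triad bandwidth ~ {bytes_moved / elapsed / 1e9:.1f} GB/s")
```

HPL, by contrast, stresses floating-point throughput via dense LU factorization; together the two benchmarks bracket the compute-bound and memory-bound extremes that application kernels fall between.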
Graham ran a Linux-based environment with resource management and scheduling via the Slurm Workload Manager, and software stacks exposed through module systems such as Environment Modules and Lmod. Users accessed MPI implementations including Open MPI and Intel MPI, math libraries such as FFTW and Intel MKL, and machine-learning frameworks including TensorFlow and PyTorch, together with NVIDIA GPU toolchains built on CUDA. Deployed scientific applications spanned computational chemistry packages related to work with Royal Society of Chemistry partners, astrophysics codes used by groups collaborating with the Canadian Institute for Theoretical Astrophysics, and genomics pipelines in use on Genome Canada–funded projects.
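At their core, module systems like Environment Modules and Lmod work by editing environment variables so that a chosen software stack is found first on the search paths. A conceptual sketch of that mechanism is below; the module names and install paths are hypothetical, and real module files on a system like Graham are maintained by site staff with much richer logic (conflicts, dependencies, unloading).

```python
# Conceptual sketch of what a "module load" does: prepend a software
# stack's directories to the relevant search-path variables.
# Module names and paths are hypothetical illustrations.
import os

MODULES = {
    "openmpi/4.1": {"PATH": "/opt/openmpi/4.1/bin",
                    "LD_LIBRARY_PATH": "/opt/openmpi/4.1/lib"},
    "cuda/11.8":   {"PATH": "/opt/cuda/11.8/bin",
                    "LD_LIBRARY_PATH": "/opt/cuda/11.8/lib64"},
}

def module_load(name, env):
    """Prepend the named module's directories to each search-path variable."""
    for var, path in MODULES[name].items():
        old = env.get(var, "")
        env[var] = path + (os.pathsep + old if old else "")
    return env

env = {"PATH": "/usr/bin"}
module_load("openmpi/4.1", env)
module_load("cuda/11.8", env)
print(env["PATH"])  # most recently loaded module's bin directory comes first
```

Loading order matters: the most recently loaded module shadows earlier ones on the path, which is why sites document recommended load sequences for MPI and GPU toolchains.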
Graham was installed and commissioned in 2017, with operational governance by University of Waterloo computing services, support from regional consortia such as Compute Ontario, and alignment with national resource-allocation frameworks like Compute Canada's. Operational updates, maintenance windows, and capacity expansions were coordinated with vendor support from Dell Technologies and NVIDIA and with input from academic stakeholders, including faculty from the University of Waterloo's Faculty of Mathematics and engineering departments linked to the Perimeter Institute. Over its operational lifetime, Graham received firmware and software-stack upgrades following procedures similar to those at other academic HPC centres such as the University of British Columbia and McGill University.
Graham enabled research across disciplines: machine-learning projects connected to Vector Institute collaborations; computational chemistry and materials-science investigations involving researchers affiliated with the Canadian Light Source and the Waterloo Institute for Nanotechnology; climate and hydrodynamics simulations in cooperation with Environment and Climate Change Canada; and genomics and bioinformatics studies tied to the Ontario Institute for Cancer Research and Genome Canada. The platform also supported collaborations with international science programs, including CERN experiments and theoretical physics at the Perimeter Institute for Theoretical Physics, as well as multidisciplinary work bridging industry partners such as BlackBerry spin-outs and startups supported by Communitech and the MaRS Discovery District.