LLMpedia
The first transparent, open encyclopedia generated by LLMs

Parallel computing

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Article Genealogy
Parent: Parallels (Hop 5)
Expansion Funnel: Raw 94 → Dedup 0 → NER 0 → Enqueued 0
1. Extracted: 94
2. After dedup: 0 (None)
3. After NER: 0
4. Enqueued: 0
Parallel computing
Image: Argonne National Laboratory's Flickr page · CC BY-SA 2.0
Name: Parallel computing
Type: Computational paradigm
Introduced: Mid-20th century
Proponents: John von Neumann, Gene Amdahl, Seymour Cray
Relevant: Supercomputer, Distributed computing, Multicore processor

Parallel computing is a computational paradigm that executes multiple calculations or processes simultaneously by distributing work across multiple processing elements. It accelerates problem solving for compute-intensive tasks on systems such as Cray Research supercomputers, IBM mainframes, and modern Intel and AMD multicore processors. Parallel computing underpins major scientific projects at institutions like Lawrence Livermore National Laboratory, Los Alamos National Laboratory, CERN, and NASA centers.

Overview

Parallel computing leverages concurrency to reduce elapsed time for tasks by mapping subtasks to multiple processing units, as practiced in Seymour Cray's designs, Donald Knuth's analyses, and industrial efforts by Hewlett-Packard, Siemens, Fujitsu, Microsoft Research, and Google. It spans tightly coupled shared-memory systems exemplified by Apollo Computer workstations and loosely coupled distributed clusters such as those built with Beowulf approaches and Amazon Web Services clusters. Programming frameworks include models drawn from work at Bell Labs, Xerox PARC, MIT, Stanford University, and UC Berkeley.

History

Early concepts trace to mechanical devices and theoretical foundations developed by figures like Alan Turing, John von Neumann, and Alonzo Church, later implemented by projects at Los Alamos National Laboratory and commercialized by Seymour Cray at Cray Research. Milestones include the development of pipeline and vector architectures on systems such as the CDC 7600, innovations in multiprocessing at IBM during the Cold War, and the emergence of distributed computing in projects at DARPA and ARPANET. The rise of commodity clusters in the 1990s was influenced by initiatives at NASA Ames Research Center and the National Science Foundation. Modern multicore and manycore eras were driven by companies like Intel, AMD, and NVIDIA alongside academic programs at ETH Zurich, University of Illinois Urbana-Champaign, and Carnegie Mellon University.

Models and Architectures

Architectural models include shared-memory symmetric multiprocessing (SMP) found in Sun Microsystems servers, distributed-memory message-passing clusters as in Oak Ridge National Laboratory installations, and hybrid NUMA configurations used by Cray Inc. and Hewlett Packard Enterprise. Programming and machine models draw from the PRAM abstraction, Bulk Synchronous Parallel (BSP) inspired by Leslie Valiant's work, and actor models popularized in systems from Erlang development at Ericsson and research at INRIA. Accelerator-based heterogeneous architectures incorporate GPUs developed by NVIDIA and co-processors from Intel's Xeon Phi projects, while FPGA-based solutions are explored by teams at Xilinx and Altera.

Parallel Algorithms and Programming Paradigms

Design of parallel algorithms references foundational contributions from John Backus and algorithmic theory refined at Bell Labs, MIT, and Princeton University. Paradigms include data parallelism used in Cray Research vector codes, task parallelism applied in HP Labs research, pipeline parallelism in IBM mainframes, and map-reduce formulations popularized by Google and formalized in academic work at UC Berkeley. Programming systems include message-passing MPI, created through collaboration among Argonne National Laboratory, the University of Tennessee, and industry partners; shared-memory APIs like OpenMP, developed by consortia involving Intel; and higher-level languages and libraries from Microsoft Research, Sun Microsystems, and Apple Inc.
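The map-reduce formulation mentioned above can be sketched in a few lines: a "map" phase emits key-value pairs, a "shuffle" groups them by key, and a "reduce" phase folds each group. The sketch below runs serially for clarity; real systems distribute all three phases across machines. The word-count task and function names are illustrative.

```python
# Map-reduce sketch: word count over a tiny corpus.
from collections import defaultdict
from functools import reduce

def map_phase(doc: str):
    # Emit (word, 1) for every word occurrence.
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    # Group all emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Fold each group of values into a single count.
    return {key: reduce(lambda a, b: a + b, vals) for key, vals in groups.items()}

docs = ["to be or not to be", "to compute is to parallelize"]
pairs = [p for d in docs for p in map_phase(d)]
counts = reduce_phase(shuffle(pairs))
print(counts["to"])  # 4
```

Because map calls touch disjoint documents and reduce calls touch disjoint keys, both phases parallelize naturally, which is the property the paradigm exploits.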

Performance Metrics and Scalability

Key metrics include speedup, efficiency, throughput, latency, and scalability curves studied in publications from the ACM and IEEE. Amdahl's Law, developed in contexts influenced by Gene Amdahl at IBM, and Gustafson's Law from John L. Gustafson provide analytic bounds on parallel performance and guide capacity planning at facilities such as Oak Ridge National Laboratory and Argonne National Laboratory. Benchmarks such as LINPACK used in the TOP500 list, the NAS Parallel Benchmarks developed at NASA Ames Research Center, and SPEC benchmarks from the Standard Performance Evaluation Corporation are central for system comparison.
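The two laws named above can be made concrete with a worked example, where p is the parallel fraction of the work and n the number of processors; the formulas follow the standard statements of each law, and the particular values of p and n are chosen for illustration.

```python
# Amdahl's Law: fixed problem size, so the serial fraction (1 - p)
# bounds speedup at 1 / (1 - p) no matter how many processors are added.
def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

# Gustafson's Law: the problem grows with n (scaled speedup), so the
# bound relaxes and speedup approaches linear for large p.
def gustafson_speedup(p: float, n: int) -> float:
    return (1.0 - p) + p * n

p, n = 0.95, 64
print(round(amdahl_speedup(p, n), 2))     # ~15.42, capped near 1/(1-p) = 20
print(round(gustafson_speedup(p, n), 2))  # 60.85, near-linear scaled speedup
```

The gap between the two numbers is why capacity planners distinguish fixed-size from scaled workloads: the same 95%-parallel code looks processor-bound under one law and nearly linear under the other.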

Hardware and System Implementations

Hardware implementation ranges from early vector processors at Control Data Corporation to contemporary exascale systems at Oak Ridge National Laboratory and Lawrence Livermore National Laboratory, supported by vendors like Cray Inc., HPE, IBM, Fujitsu, and NVIDIA. Interconnect technologies include designs from the InfiniBand Trade Association, proprietary networks developed by Cray Research, and Ethernet-based fabrics used in Google and Facebook data centers. Memory hierarchies and cache coherence protocols were advanced in research at Stanford University and MIT and deployed by Intel Corporation and ARM Holdings.

Applications and Use Cases

Parallel computing enables large-scale simulations in climate science at NOAA and ECMWF, cosmology at CERN and Caltech, materials modeling at Argonne National Laboratory, computational chemistry used at Lawrence Berkeley National Laboratory, and genomics workflows in projects at Broad Institute. Industry applications include high-frequency trading systems at Goldman Sachs subsidiaries, real-time rendering in Pixar pipelines, machine learning training at OpenAI and DeepMind, and big data analytics in platforms from Hadoop ecosystems driven by companies like Cloudera.

Challenges and Future Directions

Challenges include programmability, energy efficiency emphasized in International Energy Agency studies, fault tolerance pursued at Sandia National Laboratories, and security concerns investigated by the NSA and academic teams at UC Berkeley. Future directions point to quantum-inspired hybrid systems explored at IBM Research and Google Quantum AI, neuromorphic accelerators studied at Intel Labs and the IBM Watson Research Center, and exascale and post-exascale architectures coordinated through international collaborations such as projects funded by the European Commission and the U.S. Department of Energy.

Category:Computer science