| Distributed computing | |
|---|---|
| Name | Distributed computing |
| Field | Computer science |
| Introduced | 1970s |
| Subdiscipline | Distributed systems, Parallel computing |
Distributed computing is a branch of computer science concerned with systems in which components located on networked computers communicate and coordinate their actions by passing messages. It studies the design, analysis, and implementation of algorithms and systems that run across multiple autonomous nodes, such as clusters, grids, and cloud platforms. Practitioners draw on foundations laid by Alan Turing, John von Neumann, Leslie Lamport, and Edsger W. Dijkstra, and on research from institutions such as the Massachusetts Institute of Technology, Stanford University, and the University of California, Berkeley.
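The message-passing model can be made concrete with a small sketch. The example below, a minimal illustration rather than any production pattern, uses Python's multiprocessing queues to stand in for a network: the two processes share no state and coordinate purely by exchanging messages (the message fields and values are illustrative).

```python
from multiprocessing import Process, Queue

def worker(inbox, outbox):
    # Receive one request, act on it, and reply with a new message;
    # no memory is shared with the sender.
    request = inbox.get()
    outbox.put({"reply_to": request["id"], "result": request["value"] * 2})

if __name__ == "__main__":
    to_worker, from_worker = Queue(), Queue()
    p = Process(target=worker, args=(to_worker, from_worker))
    p.start()
    to_worker.put({"id": 1, "value": 21})  # the coordinator sends a request
    print(from_worker.get())               # {'reply_to': 1, 'result': 42}
    p.join()
```

Real systems replace the queues with network transports, but the essential property is the same: all coordination is explicit in the messages.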
Distributed computing addresses computation that is partitioned across distinct machines, such as servers in a data center, workstations in a Beowulf cluster, or devices in an Internet of Things deployment. Key concerns include coordination among processes, a topic advanced at Bell Labs and IBM Research; consistency guarantees, exhibited by systems like Apache Cassandra and Google Spanner; and middleware, with designs originating in projects at Sun Microsystems and Microsoft Research. Influential venues include the ACM Symposium on Operating Systems Principles, the USENIX Annual Technical Conference, and the International Conference on Distributed Computing Systems.
Early ideas emerged from theoretical work by figures associated with Princeton University and the University of Cambridge and from practical systems like the ARPANET and Multics. The 1970s and 1980s saw advances at Xerox PARC and DEC that led to the remote procedure call concepts later used in NFS and CORBA. The late 1990s introduced peer-to-peer experiments such as Napster, and the 2000s brought industrial-scale deployments such as the Google File System and Amazon Web Services. The 2000s and 2010s moved toward cloud-native designs exemplified by MapReduce, Hadoop, and Kubernetes, with research driven by labs at Carnegie Mellon University and ETH Zurich.
Architectural models include client–server topologies popularized by Netscape and Microsoft, peer-to-peer overlays seen in BitTorrent and Gnutella, and publish–subscribe messaging embodied by MQTT and Apache Kafka. Computational models span shared-memory emulations, such as those studied at Los Alamos National Laboratory, and message-passing models formalized by researchers at MIT and the University of Edinburgh. Deployment targets range from edge computing nodes in projects by Cisco Systems to high-performance computing centers like Oak Ridge National Laboratory.
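The decoupling at the heart of publish–subscribe can be shown with a toy in-process broker. This is a sketch of the pattern only, not the API of MQTT or Kafka; the class, topic names, and handlers below are all illustrative.

```python
from collections import defaultdict
from typing import Callable

class Broker:
    """Toy topic-based broker: publishers and subscribers never meet."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        # Deliver to every subscriber of the topic; the publisher does
        # not know who (if anyone) is listening.
        for handler in self._subscribers[topic]:
            handler(message)

broker = Broker()
broker.subscribe("sensors/temp", lambda m: print("logger got:", m))
broker.subscribe("sensors/temp", lambda m: print("alerter got:", m))
broker.publish("sensors/temp", "23.5C")
```

Production brokers add persistence, network transport, and delivery guarantees, but the topology is the same: producers and consumers are coupled only through topics.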
Core algorithms and protocols include consensus algorithms such as Paxos and Raft, leader election algorithms such as ring-based election, clock synchronization approaches like the Network Time Protocol, and routing algorithms originating in Stanford networking research. Distributed hash tables were popularized by the teams behind Chord and Kademlia, while replication and quorum techniques trace to early studies at IBM Research and Xerox PARC. Transaction protocols include two-phase commit, used in Oracle Corporation systems, and its three-phase commit variants studied in the database research literature.
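Quorum techniques can be illustrated with a short sketch: with N replicas, choosing a read quorum R and a write quorum W such that R + W > N forces every read quorum to overlap the most recent write quorum. The code below is a simplified model under that assumption; the version counters stand in for the timestamps real systems attach to values.

```python
import random

N, W, R = 5, 3, 3  # R + W > N, so any read quorum meets the last write quorum
replicas = [{"version": 0, "value": None} for _ in range(N)]

def write(value, version):
    # A write succeeds once W replicas have acknowledged it.
    for replica in random.sample(replicas, W):
        replica["version"], replica["value"] = version, value

def read():
    # Contact R replicas and return the highest-versioned value; the
    # quorum intersection guarantees one of them saw the latest write.
    contacted = random.sample(replicas, R)
    return max(contacted, key=lambda r: r["version"])["value"]

write("v1", version=1)
write("v2", version=2)
print(read())  # always "v2": at least one contacted replica holds version 2
```

Tuning R and W trades read latency against write latency while preserving the overlap property.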
Fault tolerance mechanisms rely on redundancy strategies, employed by Facebook and Google, to survive node failures and network partitions, failure modes analyzed in foundational work on asynchronous systems, including Leslie Lamport's. Byzantine fault tolerance protocols, rooted in the Byzantine generals problem formulated by Lamport, Shostak, and Pease and adopted in blockchain projects like Bitcoin and Ethereum, address malicious behaviors. Checkpointing and rollback recovery techniques were advanced by teams at Lawrence Livermore National Laboratory and applied in distributed simulations at CERN.
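Checkpointing can be sketched in a few lines: persist the process state periodically so a restarted node resumes from the last checkpoint rather than from scratch. The file name, state shape, and atomic-rename step below are illustrative choices, not a description of any particular system.

```python
import os
import pickle

CHECKPOINT = "checkpoint.pkl"  # illustrative path

def save_checkpoint(state):
    # Write to a temp file, then rename over the old checkpoint, so a
    # crash mid-write never corrupts the last good checkpoint.
    with open(CHECKPOINT + ".tmp", "wb") as f:
        pickle.dump(state, f)
    os.replace(CHECKPOINT + ".tmp", CHECKPOINT)

def restore_checkpoint():
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0}  # cold start

state = restore_checkpoint()
for step in range(state["step"], 10):  # resume where the last run stopped
    state = {"step": step + 1, "total": state["total"] + step}
    save_checkpoint(state)
print(state)
```

Distributed variants must also coordinate checkpoints across nodes so that the saved states form a consistent global snapshot.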
Performance engineering in distributed systems examines latency, throughput, and load balancing, issues tackled by projects at Netflix and Twitter. Scalability patterns include sharding, used by MongoDB and Amazon DynamoDB; autoscaling, pioneered in Amazon EC2 offerings; and caching strategies informed by computer science research at Princeton University. Benchmarks and measurement infrastructures derive from initiatives at SPEC and academic efforts at the University of Illinois at Urbana–Champaign.
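Hash-based sharding, the simplest of these partitioning schemes, can be sketched as follows: a stable hash of the key selects the shard, so any node can route a request without consulting a directory. The shard count and keys are illustrative, and this is not the partitioning code of MongoDB or DynamoDB.

```python
import hashlib

NUM_SHARDS = 4  # illustrative

def shard_for(key: str) -> int:
    # Use a stable hash (not Python's per-process randomized hash())
    # so every node maps the same key to the same shard.
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

for user in ["alice", "bob", "carol"]:
    print(user, "-> shard", shard_for(user))
```

A known weakness of plain modulo hashing is that changing the shard count remaps most keys, which is what motivates consistent hashing in practice.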
Distributed computing underpins large-scale services such as search engines by Google and social networks by Meta Platforms, Inc.; scientific applications include distributed simulations at NASA and data analysis at CERN. Other domains include financial trading platforms used by firms on New York Stock Exchange, content delivery networks run by Akamai Technologies, and decentralized ledgers developed by consortia including Hyperledger Project.
Security concerns are addressed through authentication protocols like Kerberos and TLS implementations influenced by standards bodies such as the Internet Engineering Task Force. Privacy-preserving techniques include secure multi-party computation, researched at IBM Research, and differential privacy, promoted by teams at Google and Apple Inc. Threat models include denial-of-service attacks studied by CERT and adversarial actions analyzed in research from MIT CSAIL.
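The core mechanism of differential privacy can be shown with a short sketch of the Laplace mechanism: noise scaled to sensitivity/ε is added to a query answer so that no single record can shift the released output by much. The dataset, predicate, and ε below are illustrative, and the inverse-CDF sampler is a textbook construction rather than any library's API.

```python
import math
import random

def laplace_noise(scale):
    # Inverse-CDF sampling of Laplace(0, scale) from one uniform draw.
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(values, predicate, epsilon):
    # A counting query has sensitivity 1 (adding or removing one record
    # changes the count by at most 1), so the noise scale is 1/epsilon.
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon)

ages = [23, 35, 41, 52, 29, 61]
print(private_count(ages, lambda a: a >= 40, epsilon=0.5))  # noisy count near 3
```

Smaller ε means stronger privacy and noisier answers; deployed systems also track the cumulative privacy budget across queries.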