| Latency (computer science) | |
|---|---|
| Name | Latency (computer science) |
| Field | Computer science |
Latency (computer science) is the time delay between a request and the corresponding response within a computing system, measured across hardware, software, and network boundaries. It affects user experience, system throughput, and real-time control in fields such as operating systems, databases, and telecommunications. Latency has been studied and engineered extensively in the hardware and services of Intel, ARM, Google, Microsoft, Apple, and Amazon.
Latency denotes temporal delay and is commonly distinguished from throughput and bandwidth in system design. Common types include hardware-level CPU latency in Intel and AMD microarchitectures, memory latency in Samsung Electronics DRAM and SK Hynix modules, storage latency in Seagate Technology and Western Digital disks and Samsung solid-state drives, and network latency through equipment from Cisco Systems and Juniper Networks. Application-level latency appears in database systems such as Oracle, PostgreSQL, and MySQL and in middleware from Apache projects. Real-time latency constraints arise in embedded systems built on Texas Instruments and NVIDIA hardware for applications such as the autonomous vehicles developed by Waymo and Tesla.
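The distinction between latency and throughput can be made concrete: adding concurrency raises throughput while leaving per-request latency unchanged. The following is a minimal sketch, with the 100 ms service time, worker count, and request count all assumed for illustration:

```python
import time
from concurrent.futures import ThreadPoolExecutor

SERVICE_TIME = 0.1  # assumed per-request latency: 100 ms

def handle_request(_):
    time.sleep(SERVICE_TIME)  # stand-in for real work; latency stays ~100 ms

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as pool:
    list(pool.map(handle_request, range(100)))
elapsed = time.perf_counter() - start

# Ten workers push throughput toward 100 requests/second even though
# each individual request still takes ~100 ms.
print(f"throughput ~ {100 / elapsed:.0f} req/s; "
      f"per-request latency ~ {SERVICE_TIME * 1000:.0f} ms")
```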
Measurements use metrics such as round-trip time (RTT), one-way delay, jitter, and tail latency percentiles (e.g., 95th, 99th). Protocols standardized by the IETF and tools such as ping, iperf, and Wireshark provide RTT measurement and packet analysis; instrumentation frameworks from Google and Facebook produce telemetry for latency heatmaps and histograms. Benchmark suites from SPEC and TPC capture latencies for CPU, memory, and storage subsystems, while observability platforms such as Prometheus and Grafana plot latency distributions and the service-level indicators used by Netflix and LinkedIn.
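As a sketch of how such metrics are derived, the following computes 95th- and 99th-percentile tail latency and a simple jitter estimate (mean absolute difference between consecutive samples) from a list of RTT measurements; the samples here are synthetic, standing in for output from tools like ping or application telemetry:

```python
import random
import statistics

# Synthetic RTT samples in milliseconds (log-normal, a common shape
# for latency distributions); real data would come from measurement.
rtts = [random.lognormvariate(3.0, 0.5) for _ in range(10_000)]

def percentile(samples, p):
    """Nearest-rank percentile over a sorted copy of the samples."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

p95 = percentile(rtts, 95)
p99 = percentile(rtts, 99)
# One common jitter definition: mean absolute delta of consecutive RTTs.
jitter = statistics.mean(abs(a - b) for a, b in zip(rtts, rtts[1:]))

print(f"p95={p95:.1f} ms  p99={p99:.1f} ms  jitter={jitter:.1f} ms")
```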
Latency arises from propagation delay in physical media, whose signaling is defined by standards bodies such as the IEEE and ITU; serialization delays in interfaces specified by PCI-SIG and USB-IF; queuing delays in routers and switches from Cisco Systems and Arista Networks; and processing delays in operating system kernels such as Linux and Microsoft Windows. Software-layer causes include the concurrency models of languages such as Erlang and Go, garbage collection pauses in virtual machines such as the Java HotSpot VM, and lock contention in frameworks such as OpenMP and POSIX Threads. External sources include regulatory events, such as FCC decisions that affect network deployment, and disasters that disrupt supply chains involving Foxconn.
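These components combine roughly additively along a path. A back-of-the-envelope sketch, with every link parameter assumed for illustration:

```python
# One-way delay ~ propagation + serialization + queuing + processing.
# All figures below are illustrative assumptions, not measurements.

PROP_SPEED = 2e8        # signal speed in optical fiber, about 2/3 c (m/s)
DISTANCE = 1_000_000    # 1,000 km path (m)
LINK_RATE = 10e9        # 10 Gbit/s link
PACKET_BITS = 1500 * 8  # one 1500-byte Ethernet frame

propagation = DISTANCE / PROP_SPEED      # ~5 ms, fixed by physics
serialization = PACKET_BITS / LINK_RATE  # ~1.2 us to clock the frame out
queuing = 200e-6                         # assumed 200 us in router buffers
processing = 50e-6                       # assumed 50 us in kernels and NICs

total = propagation + serialization + queuing + processing
print(f"one-way delay ~ {total * 1e3:.3f} ms "
      f"(propagation alone is {propagation * 1e3:.1f} ms)")
```

On long-haul paths, propagation usually dominates; on congested links, queuing delay can dwarf everything else.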
High latency degrades interactivity in human-facing services operated by Twitter and Instagram, reduces scientific throughput in high-performance computing centers run by Oak Ridge National Laboratory and CERN, and inflates tail latency in cloud services run by Amazon Web Services and Microsoft Azure. In financial trading, latency matters to firms such as Goldman Sachs and Citadel LLC, where microsecond advantages change market outcomes. In aerospace control systems built by Boeing and Airbus, latency can affect stability; similarly, low-latency requirements drive design choices in projects at SpaceX and Lockheed Martin.
Techniques include caching strategies used by Cloudflare and Akamai Technologies, content delivery networks operated by Akamai Technologies and Fastly, protocol optimizations advanced by the IETF (e.g., QUIC), offloading to accelerators from NVIDIA and Intel and to FPGAs from vendors such as Xilinx, and edge computing promoted by EdgeX Foundry and by initiatives at Google and Microsoft. Architectural patterns such as the microservices employed by Netflix and Uber bound critical-path latency through circuit breakers and backpressure mechanisms, as described in writings by Martin Fowler. System-level optimizations include NUMA-aware scheduling in kernels such as Linux and kernel-bypass techniques such as the Data Plane Development Kit (DPDK).
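As an illustration of the caching idea, the sketch below wraps a slow lookup in a time-to-live (TTL) cache so repeated requests skip the slow path; the origin fetch, its 50 ms delay, and the TTL are all hypothetical:

```python
import time

def make_ttl_cache(fetch, ttl=60.0):
    """Wrap a slow fetch function with a TTL cache (sketch; not thread-safe)."""
    store = {}  # key -> (expiry timestamp, value)

    def cached(key):
        entry = store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]  # fast path: served from memory
        value = fetch(key)   # slow path: go to the origin
        store[key] = (time.monotonic() + ttl, value)
        return value

    return cached

def fetch_from_origin(key):
    """Hypothetical origin fetch standing in for a database or HTTP call."""
    time.sleep(0.05)  # simulate 50 ms of origin latency
    return f"value-for-{key}"

get = make_ttl_cache(fetch_from_origin, ttl=1.0)
get("home")  # ~50 ms: cache miss goes to the origin
get("home")  # ~0 ms: cache hit until the TTL expires
```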
In wide-area networks managed by carriers such as AT&T and Verizon Communications, latency includes propagation and routing delays studied in research from MIT and Stanford University. Distributed databases such as Apache Cassandra, Google Spanner, and MongoDB must balance consistency against latency, invoking results from Leslie Lamport and the CAP theorem. Consensus algorithms such as Paxos and Raft impose round-trip costs on commit times, and blockchain platforms such as Bitcoin and Ethereum exhibit latency trade-offs in block propagation and finality. Routing policy expressed through the Border Gateway Protocol, standardized by IETF working groups, directly influences inter-domain latency.
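To make the consensus trade-off concrete: a Raft or Paxos leader can commit once a majority of the cluster acknowledges, so commit latency tracks a middling replica RTT rather than the slowest one. A minimal sketch with invented per-follower RTTs:

```python
def quorum_commit_latency(follower_rtts_ms):
    """Sketch: commit latency once a majority quorum has acknowledged.

    The leader votes for itself, so with n followers it needs
    ceil(n / 2) follower acks to reach a cluster majority; commit
    latency is the RTT of the slowest of those required acks.
    """
    if not follower_rtts_ms:
        return 0.0
    acks_needed = (len(follower_rtts_ms) + 1) // 2
    return sorted(follower_rtts_ms)[acks_needed - 1]

# Invented RTTs (ms) from the four followers of a 5-node cluster.
follower_rtts = [12.0, 15.0, 48.0, 110.0]
print(quorum_commit_latency(follower_rtts))  # 15.0: slow WAN replicas do not gate commits
```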
Latency considerations date to early computing at institutions such as Bell Labs and IBM, where memory hierarchies and instruction pipelines were first analyzed. Semiconductor scaling, observed by Gordon Moore and driven by pioneers such as Robert Noyce, steadily improved transistor performance, while networking milestones at ARPA and the development of the ARPANET shaped packet-delay characteristics. Recent trends emphasize edge computing from companies such as Cloudflare and Amazon and protocol evolution such as QUIC and HTTP/3 to meet the rising demands of applications at Google and Facebook.