LLMpedia: The first transparent, open encyclopedia generated by LLMs

Graphics processing unit

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Article Genealogy
Parent: Integrated circuit (hop 4)
Expansion funnel: 65 terms extracted → 1 after deduplication → 0 after NER filtering (1 rejected as not a named entity) → 0 enqueued
Graphics processing unit
Image: ScotXW · CC0 · source
Name: Graphics processing unit
Caption: A modern graphics processing unit (GPU)
Invented: 1990s
Designer: NVIDIA; AMD; Intel
Type: Semiconductor; Processor
Used in: Personal computers; Workstations; Data centers; Game consoles


A graphics processing unit (GPU) is a specialized semiconductor processor optimized for parallel computation and the manipulation of images, videos, and large numerical datasets. Initially developed to accelerate rasterization and real-time rendering for video games and graphical user interfaces, GPUs have evolved into general-purpose accelerators widely used in scientific computing, machine learning, and media production. Leading companies and institutions such as NVIDIA, AMD, Intel, Sony, Microsoft, and Apple have driven GPU adoption across consumer, professional, and cloud markets.

History

GPU development traces back to graphics cards for personal computers in the 1980s and 1990s, when firms such as ATI Technologies and Matrox produced fixed-function accelerators for 2D and 3D graphics. The transition to programmable shading in the early 2000s, driven by products from NVIDIA implementing early Shader Models and by standards maintained by the Khronos Group through OpenGL and later Vulkan, transformed GPUs into flexible pipelines suitable for shaders and general compute tasks. Landmark events include the introduction of unified shader architectures in the mid-2000s, the popularization of general-purpose GPU computing (GPGPU) through frameworks from NVIDIA and academic work at institutions such as Stanford University and the University of Illinois Urbana-Champaign, and the rise of GPU-accelerated data centers at providers such as Google, Amazon Web Services, and Microsoft Azure.

Architecture and components

Modern GPU architecture centers on massively parallel execution units organized into shader cores, compute units, or streaming multiprocessors, as in designs from NVIDIA, AMD, and Intel. Key components include high-bandwidth memory subsystems (e.g., GDDR6, HBM) manufactured by firms such as Micron Technology and Samsung Electronics; command processors and rasterizers shaped by standards from the Khronos Group; texture mapping units and render output pipelines that trace their lineage to early designs from 3dfx Interactive and S3 Graphics; and interconnects such as PCI Express and proprietary links like NVLink. Control logic for scheduling, cache hierarchies, and memory management units coordinates thousands of threads for SIMD/SIMT execution, while thermal and power-delivery systems are engineered by board partners such as ASUS, Gigabyte Technology, and MSI.
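The SIMD/SIMT model described above can be sketched in plain Python: a "warp" of threads executes the same instruction in lockstep, and an active mask disables lanes whose branch condition is false (branch divergence). The warp size and the toy kernel below are illustrative assumptions for this sketch, not any vendor's actual scheduler.

```python
WARP_SIZE = 32  # illustrative; NVIDIA warps are 32 lanes, AMD wavefronts 32 or 64

def simt_execute(kernel_steps, n_threads):
    """Run each kernel step across all threads in lockstep, masking off
    lanes whose predicate is false (a simplified model of divergence)."""
    registers = [{"tid": t, "acc": 0} for t in range(n_threads)]
    for step, predicate in kernel_steps:
        for regs in registers:
            if predicate(regs):  # active mask: only some lanes execute this step
                step(regs)
    return registers

# Toy divergent kernel: even-numbered lanes add 1, odd-numbered lanes add 2.
# Both branches are issued; each lane is active in only one of them.
steps = [
    (lambda r: r.update(acc=r["acc"] + 1), lambda r: r["tid"] % 2 == 0),
    (lambda r: r.update(acc=r["acc"] + 2), lambda r: r["tid"] % 2 == 1),
]
result = simt_execute(steps, WARP_SIZE)
```

Because both sides of the branch occupy issue slots, divergent warps waste throughput, which is why real GPU code avoids data-dependent branching within a warp where possible.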

Performance and benchmarking

GPU performance is measured across workloads using metrics such as floating-point throughput (FLOPS), memory bandwidth, texture fill rate, and ray-tracing performance, the last exemplified by NVIDIA's RTX series and dedicated accelerators. Organizations such as SPEC and publications like AnandTech and Tom's Hardware publish comparative results using real-world applications, including games from id Software and engines such as Unreal Engine and Unity. Data-center and HPC benchmarking relies on suites like LINPACK and MLPerf, with notable deployments at research labs such as Lawrence Livermore National Laboratory and GPU-accelerated supercomputers such as Summit.
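Headline figures like peak FLOPS and memory bandwidth follow from simple arithmetic: core count × clock × floating-point operations per cycle, and bus width × transfer rate. A minimal sketch, with hardware numbers that are hypothetical and chosen only for illustration:

```python
def peak_tflops(cores, clock_ghz, flops_per_core_per_cycle):
    """Theoretical peak throughput in TFLOPS. A fused multiply-add (FMA)
    counts as 2 floating-point operations per cycle."""
    return cores * clock_ghz * 1e9 * flops_per_core_per_cycle / 1e12

def memory_bandwidth_gbs(bus_width_bits, data_rate_gtps):
    """Peak memory bandwidth in GB/s: bus width in bytes x transfers/s."""
    return bus_width_bits / 8 * data_rate_gtps

# Hypothetical GPU: 4096 cores at 1.5 GHz, 2 FLOPs/cycle via FMA
tflops = peak_tflops(4096, 1.5, 2)        # 12.288 TFLOPS
# Hypothetical 256-bit GDDR6 bus at 14 GT/s
bw = memory_bandwidth_gbs(256, 14)        # 448.0 GB/s
```

Real sustained performance is usually well below these peaks, since it also depends on occupancy, memory access patterns, and the arithmetic intensity of the workload.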

Programming and APIs

GPU programming evolved from fixed-function pipelines to shader languages and compute APIs. Early shading used languages and systems built around OpenGL and proprietary SDKs; later models include the shader stages of Microsoft's DirectX, NVIDIA's CUDA, and open standards such as OpenCL and Vulkan managed by the Khronos Group. High-level machine-learning frameworks such as TensorFlow (from Google) and PyTorch (from Meta Platforms and contributors), along with libraries like cuDNN, rely on vendor drivers and runtimes for kernel execution, memory management, and multi-GPU scaling. Compiler and runtime research at institutions like MIT and open-source projects such as LLVM influences optimization passes, just-in-time compilation, and heterogeneous scheduling.
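The compute APIs above share a common launch model: a grid of thread blocks, where each thread derives its element index from its block and thread IDs. The pure-Python sketch below simulates that indexing scheme for a vector add; the naming follows CUDA conventions (block index, thread index, block dimension), but the nested loop is only a sequential stand-in for what the hardware runs in parallel.

```python
def vector_add_kernel(a, b, out, block_idx, thread_idx, block_dim):
    """One 'thread' of a CUDA-style vector-add kernel."""
    i = block_idx * block_dim + thread_idx   # global element index
    if i < len(a):                           # guard: grid may overshoot the data
        out[i] = a[i] + b[i]

def launch(kernel, grid_dim, block_dim, *args):
    """Simulate a kernel launch by visiting every (block, thread) pair
    sequentially; a real GPU schedules these threads in parallel."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(*args, block_idx, thread_idx, block_dim)

a = list(range(10))
b = [10 * x for x in a]
out = [0] * len(a)
# 3 blocks of 4 threads cover 10 elements (the last 2 threads hit the guard)
launch(vector_add_kernel, 3, 4, a, b, out)
```

The bounds guard is idiomatic in real kernels too, since the grid size is rounded up to a whole number of blocks and the trailing threads must not touch memory past the end of the arrays.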

Applications

GPUs power a broad spectrum of applications: real-time rendering in games from studios such as Electronic Arts and Activision Blizzard; professional content creation in tools from Adobe and Autodesk; scientific simulation and visualization in projects at NASA and CERN; machine-learning workloads driving research at OpenAI and industrial AI at Baidu and Alibaba Group; cryptocurrency mining, which has affected consumer markets and supply chains involving companies such as Bitmain; and media encoding and decoding through software such as FFmpeg. Specialized uses include ray tracing in film production at houses such as Industrial Light & Magic and neural rendering in research groups at Stanford University.

Market and industry

The GPU market is dominated by firms such as NVIDIA, AMD, and integrated offerings from Intel, with manufacturing largely outsourced to foundries like TSMC and Samsung Electronics. OEMs and board partners such as ASUS, EVGA, and MSI produce consumer and workstation cards, while cloud providers including Amazon Web Services, Google Cloud, and Microsoft Azure offer GPU instances. Market dynamics have been shaped by product cycles, supply constraints during global events affecting firms like Foxconn, competition from console integrations by Sony and Microsoft, antitrust scrutiny in multiple jurisdictions, and strategic alliances with research institutions and automotive firms such as Tesla for autonomous driving stacks.

Future directions

Research and industry roadmaps point to continued scaling of on-chip parallelism; integration of dedicated accelerators for ray tracing, tensor operations, and AI inference; and heterogeneous system architectures combining CPUs and GPUs from x86 and Arm-ecosystem vendors. Emerging directions include photonic interconnects investigated at Caltech and national labs, compute-in-memory concepts explored at IBM Research, and software models for distributed training and inference studied at Carnegie Mellon University and ETH Zurich. Standards efforts by the Khronos Group and work on cross-vendor runtimes aim to improve portability and power efficiency as GPUs become central components of scientific infrastructure, consumer devices, and cloud platforms.

Category:Computer hardware