GPU — LLMpedia

GPU
Name	Graphics Processing Unit
Caption	A modern add-in board featuring a dedicated GPU.
Inventor	Intel, NVIDIA, AMD
First production	Late 1990s
Related components	Central processing unit, Video random access memory, PCI Express

Contents

History
Architecture
Types and applications
Performance and metrics
Market and manufacturers

GPU. A graphics processing unit is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. Initially developed to handle the computationally intensive tasks of rendering 3D computer graphics for video games and computer-aided design, its highly parallel structure has made it effective for a broader range of complex algorithms. This evolution has transformed the GPU from a fixed-function graphics accelerator into a general-purpose parallel processor pivotal to modern computing.

History

The conceptual origins of dedicated graphics hardware can be traced to arcade system boards like those used in Namco's Galaxian and the early home computer Texas Instruments TMS9918. The term "GPU" was popularized by NVIDIA in 1999 with the launch of its GeForce 256, marketed as the world's first such unit. Early competitors included 3dfx Interactive's Voodoo Graphics and products from ATI Technologies, which later became AMD. The shift from fixed-function pipelines, handling tasks like transform and lighting, began with the introduction of programmable shader units by Microsoft's DirectX 8.0 and OpenGL standards. This programmability laid the groundwork for researchers, including those at Stanford University, to repurpose these processors for non-graphics tasks, a field that became known as GPGPU.

Architecture

Fundamentally, a GPU architecture is built around a massive array of smaller, efficient cores designed for parallel processing, contrasting with the fewer, more complex cores of a central processing unit. These cores are organized into streaming multiprocessors or similar blocks that execute threads in a SIMD or SIMT fashion. Key architectural components include the shader units for vertex, geometry, and pixel processing, texture mapping units, and render output units. Memory architecture is critical, featuring high-bandwidth interfaces like GDDR6 or HBM2 and dedicated caches. Modern designs from NVIDIA (Ampere), AMD (RDNA), and Intel (Xe) also incorporate dedicated hardware for ray tracing acceleration and tensor operations for artificial intelligence workloads.

Types and applications

GPUs are primarily categorized as integrated, where the processor is part of the central processing unit die, as seen in Intel Iris Xe or AMD Ryzen with Radeon Graphics, and discrete, a separate card connected via PCI Express. Discrete GPUs, produced by NVIDIA and AMD, dominate performance segments. Beyond rendering for Microsoft Windows and macOS interfaces and games like Cyberpunk 2077, their parallel prowess is harnessed in scientific domains. They accelerate molecular dynamics simulations, climate modeling, and computational fluid dynamics. In artificial intelligence, they are essential for training deep learning models at companies like OpenAI and Google Brain. The cryptocurrency mining boom for Bitcoin and Ethereum further demonstrated their computational utility, while professional markets use NVIDIA Quadro and AMD Radeon Pro cards for computer-aided design and visual effects in studios like Pixar and Industrial Light & Magic.

Performance and metrics

GPU performance is measured by several key metrics, often highlighted in reviews by publications like AnandTech and Tom's Hardware. FLOPS, particularly for single-precision (FP32) and half-precision (FP16) calculations, indicate raw computational throughput. For graphics, frames per second in benchmarks such as 3DMark or games like Shadow of the Tomb Raider is a primary measure. Memory bandwidth, determined by the interface (e.g., GDDR6X) and bus width, is crucial for feeding the cores. Thermal design power indicates power consumption and cooling requirements. Specialized performance includes ray tracing performance measured in ray-triangle intersections per second and AI inference performance in operations like TOPS for tensor cores. Real-world performance is also governed by driver software from NVIDIA Game Ready Drivers or AMD Software: Adrenalin Edition.

Market and manufacturers

The discrete GPU market is dominated by NVIDIA and AMD, who design chips fabricated by Taiwan Semiconductor Manufacturing Company and Samsung Electronics. Intel has re-entered the discrete market with its Intel Arc series. These companies supply chips to add-in board partners like ASUS, Gigabyte Technology, and MSI who manufacture final retail products. The integrated GPU market is led by Intel within its Core processors, followed by AMD with its APUs. The professional and datacenter market is fiercely contested, with NVIDIA's Tesla and AMD's Instinct lines powering supercomputers like Fugaku and Summit. The market is heavily influenced by trends in cryptocurrency mining, artificial intelligence research at Meta and Microsoft, and the console gaming sectors supplied by custom chips for the PlayStation 5 and Xbox Series X.

Category:Computer hardware Category:Graphics hardware Category:Parallel computing