LLMpedia: The first transparent, open encyclopedia generated by LLMs

Volta (microarchitecture)

Generated by DeepSeek V3.2
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Article Genealogy
Parent: Tensor Core (Hop 4)
Expansion Funnel: Raw 53 → Dedup 0 → NER 0 → Enqueued 0
1. Extracted: 53
2. After dedup: 0 (None)
3. After NER: 0
4. Enqueued: 0
Volta (microarchitecture)
Name: Volta
Designer: Nvidia
Model: GPU
Succeeded by: Turing
Preceded by: Pascal
Released: 2017
Fab: TSMC
Process: 12 nm FinFET
Codename: GV10x
Compute capability: 7.0
Type: Parallel computing

Volta is a GPU microarchitecture developed by Nvidia, succeeding the Pascal architecture and preceding Turing. First introduced in 2017, it was designed primarily for high-performance computing and artificial intelligence workloads, marking a significant shift from purely graphics-focused designs. Its most notable innovation was the integration of dedicated Tensor Cores, specialized hardware for accelerating deep learning operations.

Overview

Announced at the GPU Technology Conference in 2017, the Volta architecture represented Nvidia's strategic pivot toward the burgeoning fields of AI and scientific computation. The flagship product, the Tesla V100, was quickly adopted by major cloud providers such as Amazon Web Services and Google Cloud Platform. This generation was fabricated by TSMC using its 12 nm FinFET process, which allowed for increased transistor density and power efficiency compared to the 16 nm process used for Pascal.

Architecture

The core architectural breakthrough of Volta was the introduction of Tensor Cores, specialized execution units that perform the mixed-precision matrix multiply-accumulate operations fundamental to deep learning frameworks such as TensorFlow and PyTorch. Volta also introduced a redesigned streaming multiprocessor with independent thread scheduling, which relaxes the traditional SIMT (Single Instruction, Multiple Threads) execution model by giving each thread its own program counter and call stack, allowing threads within a warp to diverge and reconverge at fine granularity. The architecture supported NVLink 2.0 for high-speed interconnect between multiple GPUs and HBM2 (High Bandwidth Memory 2), raising memory bandwidth to roughly 900 GB/s on the Tesla V100, up from the 720 GB/s of the HBM2 used on the Pascal-based Tesla P100 (consumer Pascal cards used GDDR5X or GDDR5).
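The operation a Tensor Core accelerates is a small tiled matrix multiply-accumulate, D = A × B + C, with half-precision inputs and higher-precision accumulation. A minimal pure-Python sketch of that arithmetic on a 4×4×4 tile (illustrative only; it ignores the FP16/FP32 precision split and the actual hardware datapath) might look like:

```python
# Sketch of the matrix multiply-accumulate (MMA) a Tensor Core performs:
# D = A @ B + C on small tiles. Tile size and precision handling here are
# illustrative, not the exact hardware behavior.

def mma_4x4(a, b, c):
    """Return D = A @ B + C for 4x4 tiles given as lists of lists."""
    n = 4
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = c[i][j]                    # start from the accumulator tile
            for k in range(n):
                acc += a[i][k] * b[k][j]     # fused multiply-accumulate
            d[i][j] = acc
    return d

identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ones = [[1.0] * 4 for _ in range(4)]
zeros = [[0.0] * 4 for _ in range(4)]
print(mma_4x4(identity, ones, zeros))  # I @ ones + 0 == ones
```

On the actual hardware, many such tile operations execute per clock across a warp, which is where the throughput advantage over scalar FP32 units comes from.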

Products

The primary market for the Volta architecture was the data center and research sector; no consumer-grade GeForce cards based on it were ever released, though the prosumer Titan V, built on the same GV100 chip with 12 GB of HBM2, brought Volta to desktop workstations. The flagship product was the Tesla V100, available in configurations with either 16 GB or 32 GB of HBM2 memory. This chip was also integrated into Nvidia's own DGX-1 and DGX-2 supercomputers. For professional visualization, the architecture powered the Quadro GV100 graphics card, which found use in studios such as Industrial Light & Magic for complex rendering tasks.

Performance

In benchmarks for AI training, the Tesla V100 with its Tensor Cores delivered a dramatic performance leap, with Nvidia citing up to a 12x increase over the previous Pascal-based Tesla P100 for mixed-precision deep learning workloads. In high-performance computing, Volta-based systems such as Oak Ridge National Laboratory's Summit supercomputer powered large-scale scientific simulations. Performance in traditional graphics rendering, while competent, was not the architecture's focus, and gaming was soon better served by the subsequent Turing architecture's GeForce RTX 20 series.
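The headline throughput figure behind these results can be reproduced with back-of-the-envelope arithmetic. Using the commonly published V100 (SXM2) specifications, assumed here rather than stated in this article (80 SMs with 8 Tensor Cores each, a 4×4×4 MMA per core per clock, ~1530 MHz boost clock):

```python
# Back-of-the-envelope peak Tensor Core throughput for the Tesla V100 (SXM2).
# Assumed figures: 80 SMs x 8 Tensor Cores = 640 cores, each performing a
# 4x4x4 MMA (64 fused multiply-adds = 128 floating-point ops) per clock,
# at a boost clock of ~1530 MHz. Clocks vary by SKU, so treat as approximate.

tensor_cores = 80 * 8                 # 640 Tensor Cores
fma_per_clock = 4 * 4 * 4             # 64 multiply-adds per core per cycle
flops_per_clock = fma_per_clock * 2   # each FMA counts as 2 FLOPs
boost_clock_hz = 1.53e9

peak_tflops = tensor_cores * flops_per_clock * boost_clock_hz / 1e12
print(f"{peak_tflops:.0f} TFLOPS")  # ~125 TFLOPS, Nvidia's headline figure
```

The ~125 TFLOPS result is roughly an order of magnitude above the V100's ~15.7 TFLOPS FP32 rate, which is the gap the 12x training claim draws on.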

Software and ecosystem

Volta's capabilities were unlocked through Nvidia's comprehensive software stack, primarily the CUDA parallel computing platform and associated libraries like cuDNN and TensorRT. Support for the architecture was rapidly integrated into major deep learning frameworks, including Microsoft's Cognitive Toolkit and Facebook AI Research's PyTorch. The NVLink interconnect technology was crucial for scaling performance in systems like the DGX Station, enabling faster data exchange between multiple Tesla V100 accelerators.
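A key software technique that framework integrations rely on to exploit Tensor Cores is loss scaling, which keeps small FP16 gradients from underflowing during mixed-precision training. The following is an illustrative pure-Python sketch of dynamic loss scaling; the class and parameter names are hypothetical, not an actual Nvidia or framework API:

```python
# Illustrative sketch of dynamic loss scaling for mixed-precision training.
# The loss is multiplied by `scale` before backpropagation and gradients are
# divided by it afterward; the scale adapts to avoid FP16 overflow/underflow.
# Names here are hypothetical, not a real framework API.

class DynamicLossScaler:
    def __init__(self, scale=2.0 ** 15, growth_interval=2000):
        self.scale = scale
        self.growth_interval = growth_interval
        self._good_steps = 0

    def update(self, found_overflow):
        """Halve the scale on overflow; double it after a run of clean steps."""
        if found_overflow:
            self.scale /= 2.0
            self._good_steps = 0
        else:
            self._good_steps += 1
            if self._good_steps >= self.growth_interval:
                self.scale *= 2.0
                self._good_steps = 0

scaler = DynamicLossScaler(scale=8.0, growth_interval=2)
scaler.update(found_overflow=True)   # overflow: scale halves to 4.0
scaler.update(found_overflow=False)
scaler.update(found_overflow=False)  # two clean steps: scale doubles to 8.0
print(scaler.scale)
```

Production libraries bundle this bookkeeping with the cast management, so user code typically only wraps its forward pass and optimizer step.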

Reception and impact

The launch of Volta was met with acclaim from the scientific and AI research communities, with institutions like Stanford University and Massachusetts Institute of Technology utilizing the technology for advanced research. It cemented Nvidia's dominance in the AI accelerator market, directly competing with offerings from Intel and Google's TPU. The architecture's commercial success in data centers influenced the design of its successor, Turing, which brought Tensor Cores and dedicated RT Cores for ray tracing to the consumer market with the GeForce RTX 20 series.

Category:Nvidia microarchitectures
Category:Graphics processing units
Category:2017 in computing