| CXL (interface) | |
|---|---|
| Name | CXL |
| Other names | Compute Express Link |
| Inventor | Intel Corporation |
| Based on | PCI Express |
Compute Express Link (CXL) is an open industry-standard interconnect designed for high-bandwidth, low-latency communication between a central processing unit and devices like accelerators, memory expansion cards, and smart NICs. It builds upon the physical and electrical layers of the PCI Express bus but introduces new protocols for cache coherency and memory semantics. Developed by a consortium including Intel Corporation, Advanced Micro Devices, and Arm Holdings, it aims to address performance bottlenecks in heterogeneous computing and data-centric workloads.
The creation of CXL was driven by the growing demand for efficient data movement in systems utilizing diverse processing elements beyond traditional CPUs, such as GPUs, FPGAs, and other accelerators. It is managed by the Compute Express Link Consortium, which includes key industry players like Google, Microsoft, and Meta Platforms. The protocol maintains compatibility with the widely adopted PCI Express infrastructure, allowing it to leverage existing ecosystem tools and form factors. Its primary innovation is enabling a unified, cache-coherent memory space between the host processor and connected devices, which is critical for workloads in artificial intelligence, high-performance computing, and cloud computing.
CXL operates over the PCI Express 5.0 or later physical layer, reusing the same electricals, connectors, and form factors as PCI Express devices, such as standard add-in cards and EDSFF modules. The specification defines three distinct protocol types: CXL.io, essentially the PCI Express protocol used for initialization, link-up, device discovery, and I/O operations; CXL.cache, which allows a device to coherently cache host memory; and CXL.mem, which permits the host processor to access device-attached memory. This layered approach enables low-latency, cache-coherent access to memory pools, facilitating efficient sharing of data structures between host processors such as Intel Xeon chips and attached devices such as Nvidia accelerators without unnecessary data replication.
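The protocol combination a device implements determines its device type in the specification: Type 1 devices (for example, coherent smart NICs) implement CXL.io and CXL.cache, Type 2 devices (accelerators with their own local memory) implement all three protocols, and Type 3 devices (memory expanders) implement CXL.io and CXL.mem. The short Python sketch below merely encodes that mapping to make the classification concrete; the class and function names are illustrative and not part of any CXL software interface.

```python
from enum import Flag, auto

class CxlProtocol(Flag):
    """The three protocols multiplexed over a single CXL link."""
    IO = auto()      # CXL.io: PCIe-style discovery, configuration, and I/O
    CACHE = auto()   # CXL.cache: device coherently caches host memory
    MEM = auto()     # CXL.mem: host accesses device-attached memory

# Protocol combinations the specification associates with each device type.
DEVICE_TYPES = {
    "Type 1 (caching device, e.g. smart NIC)": CxlProtocol.IO | CxlProtocol.CACHE,
    "Type 2 (accelerator with local memory)": CxlProtocol.IO | CxlProtocol.CACHE | CxlProtocol.MEM,
    "Type 3 (memory expander)": CxlProtocol.IO | CxlProtocol.MEM,
}

def classify(protocols: CxlProtocol) -> str:
    """Return the CXL device type matching a set of supported protocols."""
    for name, combo in DEVICE_TYPES.items():
        if protocols == combo:
            return name
    return "unrecognized protocol combination"

print(classify(CxlProtocol.IO | CxlProtocol.MEM))  # -> Type 3 (memory expander)
```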
The CXL specification has evolved through several versions, with CXL 1.1 establishing the foundational protocols and CXL 2.0 introducing support for memory pooling and switching, allowing multiple hosts to share a pool of DRAM or persistent memory. CXL 3.0, announced in 2022, significantly enhanced capabilities by supporting fabric-attached devices and coherency across multiple domains, and by doubling the per-lane signaling rate to 64 GT/s, aligning with PCI Express 6.0. Each iteration is developed and ratified by the Compute Express Link Consortium, with contributions from members such as Samsung Electronics, SK Hynix, and Hewlett Packard Enterprise.
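As a back-of-the-envelope illustration of what those signaling rates imply, the snippet below (a rough estimate, not a figure from the specification) converts per-lane transfer rates into raw per-direction bandwidth for a x16 link, ignoring encoding, FLIT, and protocol overhead.

```python
def raw_bandwidth_gb_per_s(transfer_rate_gt_s: float, lanes: int = 16) -> float:
    """Rough raw per-direction bandwidth in GB/s for a CXL/PCIe link.

    Each transfer carries one bit per lane, so GT/s * lanes gives Gb/s;
    dividing by 8 converts to GB/s. Real throughput is lower once
    encoding and protocol overhead are accounted for.
    """
    return transfer_rate_gt_s * lanes / 8

print(raw_bandwidth_gb_per_s(32))  # PCIe 5.0 / CXL 1.x-2.0 x16: ~64 GB/s per direction
print(raw_bandwidth_gb_per_s(64))  # PCIe 6.0 / CXL 3.x x16:    ~128 GB/s per direction
```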
Major applications for CXL are found in data centers and high-performance computing environments, where it enables efficient memory expansion and pooling for workloads like machine learning training on Nvidia GPUs or in-memory database processing. It is pivotal for artificial intelligence inference, allowing FPGAs from Xilinx or Intel to share memory coherently with AMD EPYC or Intel Xeon hosts. In cloud computing platforms like Amazon Web Services or Microsoft Azure, CXL facilitates resource disaggregation, letting memory and accelerator resources be dynamically allocated to virtual machines, improving utilization and reducing total cost of ownership.
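In practice, CXL memory expanders onlined as system RAM on Linux are typically surfaced as CPU-less NUMA nodes, which is how operating systems and applications target them with standard NUMA tooling. The sketch below is a minimal diagnostic illustration under that assumption; it simply scans the standard sysfs node directory and is not part of any CXL management API (other memory-only nodes, such as persistent memory configured as system RAM, would match the same heuristic).

```python
import os

SYSFS_NODES = "/sys/devices/system/node"  # standard Linux sysfs location for NUMA nodes

def cpuless_numa_nodes(root: str = SYSFS_NODES) -> list[str]:
    """Return NUMA nodes that expose no CPUs.

    Memory-only nodes are how CXL-attached (and other hot-plugged) memory
    is commonly exposed; an empty cpulist marks a node as CPU-less.
    """
    nodes = []
    for entry in sorted(os.listdir(root)):
        if entry.startswith("node") and entry[4:].isdigit():
            with open(os.path.join(root, entry, "cpulist")) as f:
                if f.read().strip() == "":
                    nodes.append(entry)
    return nodes

print("CPU-less NUMA nodes (candidate CXL memory):", cpuless_numa_nodes())
```

Memory on such a node can then be placed explicitly with ordinary NUMA mechanisms, for example `numactl --membind` at the command line or `mbind()` from within an application.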
Initial CXL implementations have been integrated into server processors, with Intel incorporating it into its Sapphire Rapids Xeon Scalable processors and AMD supporting it in its EPYC 9004 series (codenamed Genoa). Major memory and device manufacturers, including Micron Technology, Samsung Electronics, and SK Hynix, have demonstrated CXL-based memory expansion modules. System vendors like Dell Technologies, Hewlett Packard Enterprise, and Lenovo have announced server platforms supporting the technology, while Google and Meta Platforms are actively testing it for their large-scale data center deployments.
CXL is often compared to other high-speed interconnects like NVLink, developed by Nvidia for tight coupling between its GPUs and CPUs, and InfiniBand, used primarily as a network fabric in high-performance computing clusters. While NVLink offers higher bandwidth within Nvidia-specific ecosystems, CXL is an open standard focused on cache coherency and memory semantics across heterogeneous devices. Compared to PCI Express, which is a foundational I/O protocol, CXL adds critical coherency and memory-pooling features without replacing it, instead operating as a complementary protocol set that runs over the same physical layer to serve advanced use cases in modern data centers.