RAID — LLMpedia

Contents

Overview
Standard levels
Nested (hybrid) RAID
Implementation
Non-standard levels
Applications

RAID (redundant array of inexpensive disks, later also expanded as redundant array of independent disks) is a data storage technology that combines multiple physical disk drives into one or more logical units for data redundancy, performance improvement, or both. It is a foundational concept in enterprise storage, server environments, and high-performance computing. The term was first defined by David Patterson, Garth Gibson, and Randy Katz at the University of California, Berkeley, in a 1988 paper that laid the groundwork for standardized implementations.

Overview

The core principle is to distribute or replicate data across an array of independent disks and present them to the operating system as a single storage device. The approach was developed to address the performance limits and reliability concerns of the single large expensive disk (SLED) model. Because an array of many drives fails more often than a single drive, redundancy is what allows it to keep operating, and to avoid data loss, when a drive fails. The original taxonomy from the Berkeley paper, now often called "standard levels," was later popularized by industry vendors such as IBM, Hewlett-Packard, and Adaptec.
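
As a minimal sketch of the two basic placement strategies described above, the following Python maps a logical block number to physical locations. The function names stripe_location and mirror_locations are hypothetical, and the stripe unit is assumed to be a single block; real implementations use larger chunks and keep additional metadata.

```python
def stripe_location(logical_block: int, n_disks: int) -> tuple[int, int]:
    """Return (disk index, block offset on that disk) for round-robin striping."""
    if n_disks < 1:
        raise ValueError("need at least one disk")
    # Consecutive logical blocks land on consecutive disks.
    return logical_block % n_disks, logical_block // n_disks


def mirror_locations(logical_block: int, n_disks: int) -> list[tuple[int, int]]:
    """Return every (disk index, block offset) that holds a copy under mirroring."""
    if n_disks < 2:
        raise ValueError("mirroring needs at least two disks")
    # Every disk stores the same block at the same offset.
    return [(disk, logical_block) for disk in range(n_disks)]
```

With three striped disks, logical block 5 falls on disk 2 at offset 1, whereas a two-disk mirror stores block 5 at offset 5 on both disks.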

Standard levels

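The most common schemes, standardized by the Storage Networking Industry Association in its Common RAID Disk Data Format (DDF) specification, are numbered from 0 to 6. RAID 0 uses striping to spread data blocks across drives, offering high performance but no redundancy: the loss of any one drive destroys the array. RAID 1 employs mirroring, writing identical data to two or more drives. RAID 5 stripes data together with distributed parity, requires a minimum of three disks, and survives the failure of any single drive. RAID 6 extends this concept with dual parity, requires a minimum of four disks, and survives two simultaneous drive failures. The remaining levels are RAID 2 (bit-level striping with Hamming-code error correction), RAID 3 (byte-level striping with a dedicated parity drive), and RAID 4 (block-level striping with a dedicated parity drive), which are now largely obsolete in practice.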
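
The single-parity idea behind RAID 5 can be illustrated with XOR. The sketch below is only an illustration for one stripe: the helper name xor_blocks and the sample data are hypothetical, parity rotation across drives is not modeled, and the second parity of RAID 6 (which needs a different calculation) is not covered.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, byte position by byte position."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("blocks must have equal length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a three-disk array: two data blocks plus one parity block.
data_a = b"RAID"
data_b = b"demo"
parity = xor_blocks([data_a, data_b])

# If the disk holding data_b fails, XOR of the survivors rebuilds its contents.
rebuilt_b = xor_blocks([data_a, parity])
assert rebuilt_b == data_b
```

Because XOR is its own inverse, the same function computes the parity and reconstructs any one missing block from the remaining blocks in the stripe.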

Nested (hybrid) RAID

These configurations combine two standard levels to achieve specific benefits and are denoted by joining the level numbers, as in 10 or 50. RAID 10 (or 1+0) creates a striped set of mirrored pairs, offering both high performance and robust fault tolerance, and is widely supported by controllers from companies like LSI Logic and Promise Technology. Conversely, RAID 01 (0+1) is a mirror of striped sets; the loss of one drive takes an entire stripe set offline, so it tolerates fewer combinations of failures than RAID 10. Similarly, RAID 50 and RAID 60 stripe data (RAID 0) across several RAID 5 or RAID 6 groups, and are often used in large-scale storage systems from vendors like NetApp and Dell EMC.
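
The capacity cost of the nested levels above can be worked out with simple arithmetic. The following is a rough sketch assuming equal-size drives and equal-size inner groups; the function name nested_usable_disks and its parameters are hypothetical and do not correspond to any controller or tool.

```python
def nested_usable_disks(level: str, n_disks: int, group_size: int) -> int:
    """Return how many disks' worth of capacity is usable in a nested array.

    group_size is the number of disks in each inner array
    (a mirror set for RAID 10, a parity group for RAID 50 or 60).
    """
    parity_per_group = {"50": 1, "60": 2}
    if group_size < 1 or n_disks % group_size != 0:
        raise ValueError("n_disks must be a positive multiple of group_size")
    groups = n_disks // group_size
    if groups < 2:
        raise ValueError("striping needs at least two inner groups")
    if level == "10":
        if group_size < 2:
            raise ValueError("a mirror set needs at least two disks")
        # Each mirror set holds one copy's worth of data, whatever its size.
        return groups
    if level in parity_per_group:
        parity = parity_per_group[level]
        if group_size < parity + 2:
            raise ValueError("inner parity group is too small")
        # Each group gives up one (RAID 5) or two (RAID 6) disks to parity.
        return groups * (group_size - parity)
    raise ValueError(f"unsupported level: {level}")
```

For example, four drives in RAID 10 with two-way mirrors yield two drives of usable space, six drives in RAID 50 with three-drive groups yield four, and eight drives in RAID 60 with four-drive groups also yield four.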

Implementation

Deployment can be handled in software or by dedicated hardware. Software implementations use the host system's central processing unit to manage the array and compute parity, and are common in operating systems like Linux (the md driver, administered with mdadm), Windows Server, and FreeBSD. Hardware implementations employ a dedicated RAID controller card, which houses a specialized processor, such as an I/O processor, and often includes cache memory protected by a battery backup unit. The choice affects performance, the ability to boot from the array, and compatibility with complex configurations like those used in Oracle Database or VMware vSphere environments.

Non-standard levels

Beyond the standard taxonomy, various proprietary and non-standard schemes exist. The Linux MD RAID 10 driver offers near, far, and offset layouts that also work with an odd number of drives, giving results comparable to RAID 1E, an enhanced mirrored layout. Some hardware controllers, including IBM ServeRAID models, implement RAID 1E and RAID 5E. Storage Computer Corporation developed RAID 7, a cached level built around a real-time operating system. Other examples include EMC's RAID S, used in its Symmetrix arrays, and RAID-Z, which is part of the ZFS file system originally developed by Sun Microsystems for Solaris and used in products from iXsystems.

Applications

The technology is widely used in scenarios demanding high availability and performance. It is a critical component in database server infrastructure, web server farms, and video editing workstations. In enterprise settings, it forms the basis for storage area network and network-attached storage solutions from companies like Hitachi Data Systems and QNAP Systems. It is also commonly recommended as part of availability and disaster recovery planning, where it protects against drive failure but complements rather than replaces backups.
