LLMpedia: The first transparent, open encyclopedia generated by LLMs

RADOS bench

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Article Genealogy
Parent: Ceph (Hop 4)
Expansion Funnel: Raw 63 → Dedup 0 → NER 0 → Enqueued 0
1. Extracted: 63
2. After dedup: 0
3. After NER: 0
4. Enqueued: 0
RADOS bench
Name: RADOS bench
Developer: Ceph project
Released: 2010s
Platform: Linux
Genre: Storage benchmarking
License: LGPL

RADOS bench

RADOS bench is a microbenchmark utility bundled with Ceph, designed to measure raw object-storage performance against a Ceph OSD cluster. It provides simple, repeatable throughput and latency tests at the RADOS layer, enabling evaluation of cluster scaling, tuning, and hardware effects on distributed storage performance, including clusters that back OpenStack deployments. Administrators and researchers use it alongside tools such as fio, IOzone, and Bonnie++ when assessing designs for cloud-computing and high-performance-computing infrastructures.

Overview

RADOS bench exercises the object layer of the Ceph ecosystem by writing and reading objects directly through the librados API, bypassing higher-level abstractions such as the RADOS Gateway and CephFS. It operates against a specified pool in a running Ceph cluster composed of monitors, OSDs, and optional metadata servers. The utility helps isolate performance characteristics attributable to hardware, such as NVMe and SATA SSDs or HDDs, or to software components such as the BlueStore backend and the CRUSH placement algorithm originally designed by Sage Weil. Operators commonly compare RADOS bench outputs when tuning kernels on Red Hat Enterprise Linux, Ubuntu, or CentOS deployments.
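A typical smoke test writes for a fixed duration, reads the same objects back, and then deletes them. The sketch below assembles the three command lines, assuming the standard `rados bench` flags (`-p` for pool, `-b` for object size, `-t` for concurrent operations, `--no-cleanup` to retain objects for the read phase); the pool name `bench-test` is an arbitrary example.

```python
def rados_bench_cmds(pool="bench-test", seconds=60,
                     obj_size=4 * 1024 * 1024, concurrency=16):
    """Build write/read/cleanup command lines for a basic RADOS bench run."""
    write_cmd = ["rados", "bench", "-p", pool, str(seconds), "write",
                 "-b", str(obj_size), "-t", str(concurrency),
                 "--no-cleanup"]  # keep objects so the read phase has data
    seq_read_cmd = ["rados", "bench", "-p", pool, str(seconds), "seq",
                    "-t", str(concurrency)]
    cleanup_cmd = ["rados", "-p", pool, "cleanup"]  # remove benchmark objects
    return write_cmd, seq_read_cmd, cleanup_cmd

if __name__ == "__main__":
    for cmd in rados_bench_cmds():
        print(" ".join(cmd))
        # subprocess.run(cmd, check=True)  # needs `import subprocess`,
        # a reachable cluster, and a valid cephx keyring
```

Running the read phase against the objects left by `--no-cleanup` is what makes the write and read measurements comparable; the final `cleanup` removes the benchmark objects from the pool.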

Architecture and Operation

RADOS bench uses the librados client library to create, write, read, and delete objects in a target pool. It relies on the cluster map provided by the Ceph Monitor daemons to locate the appropriate placement groups and Object Storage Devices. The underlying communication uses the same protocol layers employed by the RADOS Gateway and librbd, so results reflect real network and OSD behavior influenced by components such as BlueStore, FileStore, RADOS Gateway caches, and Ceph Manager modules. The tool can be invoked from any machine with network access to the cluster and the appropriate keyrings, authenticating through cephx.
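Placement itself is deterministic: Ceph hashes an object name into a placement group, and CRUSH then maps that PG to a set of OSDs. The sketch below illustrates only the first step, using a generic stable hash as a stand-in; Ceph's real implementation uses its rjenkins hash and a "stable mod", so this helper is illustrative, not Ceph's code.

```python
import hashlib

def object_to_pg(obj_name: str, pg_num: int) -> int:
    """Illustrative stand-in for Ceph's object->PG hashing
    (not the real rjenkins hash or stable mod)."""
    digest = hashlib.md5(obj_name.encode()).digest()
    h = int.from_bytes(digest[:4], "little")
    return h % pg_num  # Ceph's stable mod lets pg_num grow without remapping everything
```

Because the mapping depends only on the object name and the pool's PG count, every client computes the same PG without a central lookup; CRUSH then turns that PG into an ordered OSD set using the cluster map, which is why RADOS bench traffic spreads across OSDs without coordination.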

Benchmarking Modes and Workloads

RADOS bench supports three operational modes: write, sequential read (seq), and random read (rand), with configurable object sizes and concurrency. Typical workloads emulate patterns seen in OpenStack Swift object stores, Kubernetes persistent volumes, and GlusterFS comparisons. Tests may vary object size from small 4 KiB objects, reflective of metadata-heavy services such as those built around the Hadoop Distributed File System, to multi-megabyte objects common in video streaming and media archives operated by organizations like Netflix or YouTube. Users contrast single-client runs with multi-client scenarios resembling aggregate traffic from Apache HTTP Server, NGINX, or large-scale Ceph RBD consumers.
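At a fixed bandwidth, object size directly determines the achievable operation rate, which is why small-object runs look IOPS-bound while large-object runs look bandwidth-bound. The conversion is plain arithmetic, with no Ceph specifics:

```python
def iops_from_bandwidth(bandwidth_mib_s: float, object_size_bytes: int) -> float:
    """Operations per second implied by a given bandwidth and object size."""
    return bandwidth_mib_s * 1024 * 1024 / object_size_bytes

# 100 MiB/s of 4 MiB objects is only 25 ops/s ...
large = iops_from_bandwidth(100, 4 * 1024 * 1024)
# ... while the same bandwidth in 4 KiB objects requires 25,600 ops/s.
small = iops_from_bandwidth(100, 4 * 1024)
print(large, small)
```

This is why a cluster that saturates its network on multi-megabyte objects can still appear slow on 4 KiB workloads: the per-operation overheads, not the raw bandwidth, become the limit.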

Configuration and Parameters

Key parameters include object size, run duration, concurrency level (the number of in-flight operations rather than OS threads), and pool selection. The tool accepts flags for the write/read mode, an object-name prefix, and whether benchmark objects are retained after a run. Tuning may involve altering CRUSH map rules, the pool replication factor, or erasure-coding profiles using erasure-code plugins developed by contributors from Red Hat and other vendors. Network-related settings such as MTU, jumbo frames, and TCP offload features influence outcomes, as do kernel parameters and I/O schedulers in distributions like Debian or SUSE Linux Enterprise Server.
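The effect of the data-protection scheme on client-visible throughput can be estimated directly: three-way replication writes every byte three times, while an erasure-coded pool with k data and m coding chunks writes (k+m)/k times the payload. A small sketch of that arithmetic (illustrative only, not a Ceph API; the 300 MB/s backend figure is a made-up example):

```python
def write_amplification(replicas: int = 0, k: int = 0, m: int = 0) -> float:
    """Bytes physically written per byte of client payload."""
    if replicas:
        return float(replicas)  # replicated pool: one full copy per replica
    return (k + m) / k          # erasure-coded pool: k data + m coding chunks

# A cluster with roughly 300 MB/s of raw backend write capacity would show:
backend_mb_s = 300
replicated = backend_mb_s / write_amplification(replicas=3)  # ~100 MB/s to the client
erasure = backend_mb_s / write_amplification(k=4, m=2)       # ~200 MB/s to the client
print(replicated, erasure)
```

This is why identical hardware benchmarked with RADOS bench can report very different write numbers depending only on the pool's replication or erasure-coding profile.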

Performance Metrics and Interpretation

RADOS bench reports bandwidth (in MB/s) and IOPS alongside average, minimum, maximum, and standard-deviation latencies. Interpreting results requires context: replication or erasure coding adds write amplification that reduces client-visible throughput, and network saturation can bottleneck tests even when local storage remains underutilized. Comparative analysis often correlates RADOS bench output with cluster-level metrics from Ceph Manager modules and OSD perf counters, gathered through Prometheus exporters and Grafana dashboards. Performance curves can reveal behaviors such as write stabilization over time, backfilling effects after an OSD failure, and client-side queueing symptomatic of slow journal devices in legacy FileStore setups.
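For automated comparisons it is common to scrape the end-of-run summary into structured data. The sample text below imitates the "label: value" style of a rados bench report; the exact labels and spacing vary across Ceph releases, so both the sample and the regex are illustrative.

```python
import re

# Illustrative summary lines in the style of a rados bench report;
# exact labels differ between Ceph releases.
SAMPLE = """\
Total time run:         60.1
Total writes made:      1230
Bandwidth (MB/sec):     81.9
Average IOPS:           20
Average Latency(s):     0.781
"""

def parse_summary(text: str) -> dict:
    """Pull numeric fields out of a 'Label:   value' style report."""
    out = {}
    for line in text.splitlines():
        m = re.match(r"(.+?):\s+([\d.]+)\s*$", line)
        if m:
            out[m.group(1).strip()] = float(m.group(2))
    return out

stats = parse_summary(SAMPLE)
print(stats["Bandwidth (MB/sec)"], stats["Average Latency(s)"])
```

Feeding parsed numbers into a time-series store alongside Ceph Manager metrics makes it straightforward to correlate a benchmark run with cluster-side behavior such as backfilling or recovery.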

Use Cases and Practical Applications

Administrators use RADOS bench for capacity planning, hardware selection, and validating cluster tuning before deployment in projects involving OpenStack Nova, Ceph RBD-backed virtual machines, and Kubernetes persistent storage for stateful workloads. Researchers and vendors run it to substantiate performance claims in evaluations presented at conferences such as KubeCon or in vendor testing for Red Hat Summit. It also serves as a regression-testing diagnostic for developers working on Ceph features such as BlueStore improvements or PG autoscaler enhancements.

Limitations and Best Practices

RADOS bench is a synthetic microbenchmark and does not replicate workload semantics found in production multi-tenant environments like those run by LinkedIn or Dropbox. It bypasses gateways and filesystem layers, so results should not be used as sole predictors for application-level performance in systems using RADOS Gateway or CephFS. Best practices include aligning test parameters with expected production object sizes, running tests across multiple client nodes to capture network effects, and combining RADOS bench with end-to-end tools such as fio and real-application benchmarks. When interpreting results, account for cluster activities like recovery after OSD down events, and document software stack versions including Ceph Pacific or Ceph Quincy releases.
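When a run is spread across several client nodes, the per-client reports have to be combined: bandwidths add, while latencies should be averaged weighted by each client's operation count. A small aggregation helper (illustrative; the per-client numbers are invented):

```python
def aggregate(results):
    """Combine per-client (bandwidth_mb_s, ops, avg_latency_s) tuples."""
    total_bw = sum(bw for bw, _, _ in results)
    total_ops = sum(ops for _, ops, _ in results)
    # operation-weighted mean latency, so busier clients count proportionally
    mean_lat = sum(ops * lat for _, ops, lat in results) / total_ops
    return total_bw, mean_lat

# Two hypothetical client nodes from the same test window:
bw, lat = aggregate([(80.0, 1200, 0.8), (120.0, 1800, 0.6)])
print(bw, lat)
```

A simple arithmetic mean of the latencies would overweight the slower, less busy client; weighting by operations keeps the aggregate consistent with what a single logical client would have observed.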

Category:Ceph