
RADOS Gateway

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Name: RADOS Gateway
Developer: Red Hat
Released: 2012
Programming language: C++, Python
Operating system: Linux
License: LGPL


RADOS Gateway is an object storage front end for the Ceph distributed storage system that exposes S3-compatible and Swift-compatible RESTful interfaces. It provides multi-tenant HTTP access to data stored in a Ceph RADOS cluster, integrates with identity systems and load balancers, and supports features required by cloud, archival, and big data platforms. Major adopters and integration partners include OpenStack, Kubernetes, Red Hat, and community projects.

Overview

RADOS Gateway implements object storage access for Ceph RADOS, enabling applications that expect Amazon S3 or OpenStack Swift semantics to run against on-premises and hybrid cloud installations. The project originated in the Ceph community and later became a component maintained by Red Hat, aligning with distributions that target enterprise storage markets such as Red Hat Enterprise Linux and with OpenStack services like Keystone (OpenStack) and Glance (OpenStack Image Service). As a networked gateway, it interoperates with load balancers and proxies such as HAProxy, Nginx, and Apache HTTP Server, and can be deployed alongside orchestration frameworks like Kubernetes, OpenShift, and Docker Swarm. The gateway is used by research institutions such as CERN, cloud providers such as OVH, and enterprises running SUSE or Canonical-based infrastructures.

Architecture

The gateway serves as a stateless HTTP front end that maps RESTful requests to Ceph RADOS objects stored on OSDs, with cluster state tracked by the Ceph Monitors (MONs) and data placement computed by the CRUSH algorithm. It typically runs as multiple daemon instances behind reverse proxies or load balancers, relying on the cluster maps provided by the MONs and on the Ceph Manager for dynamic configuration. For durability it relies on placement groups (PGs) and on RADOS features such as replication, erasure coding, and recovery. In multi-site setups it can participate in asynchronous replication between zones grouped into zonegroups, with metadata changes coordinated through the master zone. The gateway can authenticate users against LDAP or Active Directory, and can reach SAML or OAuth 2.0 identity providers when integrated via Keystone (OpenStack) or custom middleware.
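
The two-step mapping from an object name to a placement group and then to a set of OSDs can be illustrated with a toy model. The sketch below is not Ceph's algorithm: it assumes SHA-256 in place of Ceph's own hash function and uses rendezvous (highest-random-weight) hashing as a stand-in for CRUSH, and the function names and the sample object name are hypothetical. It only shows why any gateway instance can compute a placement deterministically without a central lookup table.

```python
import hashlib


def _score(*parts: str) -> int:
    """Stable 64-bit score derived from the given strings (toy hash, not Ceph's)."""
    digest = hashlib.sha256("/".join(parts).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def object_to_pg(object_name: str, pg_num: int) -> int:
    """Step 1: hash the object name into one of pg_num placement groups."""
    return _score("obj", object_name) % pg_num


def pg_to_osds(pg_id: int, osd_ids: list[int], replicas: int = 3) -> list[int]:
    """Step 2: pick `replicas` distinct OSDs for a placement group.

    Rendezvous hashing ranks every OSD by a per-(PG, OSD) score and keeps
    the top entries. CRUSH additionally honours weights and failure domains.
    """
    ranked = sorted(
        osd_ids,
        key=lambda osd: _score("pg", str(pg_id), str(osd)),
        reverse=True,
    )
    return ranked[:replicas]


def locate(object_name: str, pg_num: int, osd_ids: list[int], replicas: int = 3):
    """Return (placement group id, list of OSD ids) for an object name."""
    pg_id = object_to_pg(object_name, pg_num)
    return pg_id, pg_to_osds(pg_id, osd_ids, replicas)


if __name__ == "__main__":
    # Hypothetical RADOS object name and a 12-OSD cluster with 128 PGs.
    print(locate("bucketmarker_photos/cat.jpg", 128, list(range(12))))
```

Because the result depends only on the object name, the PG count, and the OSD list, every gateway instance that sees the same cluster map arrives at the same answer, which is what allows the front end to stay stateless.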

Features and Protocol Support

RADOS Gateway offers a broad feature set: object lifecycle management, versioning, multipart upload, bucket policies, and server-side encryption backed by external key management services (KMS). It supports S3 semantics including presigned URLs and ACLs, authenticated with AWS Signature Version 4, and provides Swift API compatibility for applications built against OpenStack Swift. Extended features include tuning of eventual-consistency behavior, bucket notifications for event-driven systems like Apache Kafka, and object lock semantics for WORM use cases in regulated workflows. For authentication and authorization it supports S3-style credentials and Keystone tokens, integrates with secrets management platforms such as HashiCorp Vault for encryption keys, and can federate with OpenID Connect identity providers such as Keycloak.
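
As an illustration of how a presigned URL is built under AWS Signature Version 4, the following minimal sketch uses only the Python standard library. It assumes path-style addressing (bucket in the URL path), a GET request with only the host header signed, and an unsigned payload. The host name, credentials, and default region are hypothetical placeholders, and the region string a given gateway expects depends on its configuration.

```python
import hashlib
import hmac
from datetime import datetime, timezone
from urllib.parse import quote


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_get(host, bucket, key, access_key, secret_key, *,
                region="us-east-1", expires=3600, now=None, scheme="https"):
    """Build a SigV4 presigned GET URL for a path-style S3 endpoint."""
    now = now or datetime.now(timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    scope = f"{date_stamp}/{region}/s3/aws4_request"

    # Path-style canonical URI: /<bucket>/<key>, percent-encoded except "/".
    canonical_uri = "/" + quote(bucket, safe="") + "/" + quote(key, safe="/")

    # Query parameters must be sorted by name and fully percent-encoded.
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    canonical_query = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}"
        for k, v in sorted(params.items())
    )

    canonical_request = "\n".join([
        "GET",
        canonical_uri,
        canonical_query,
        f"host:{host}\n",    # canonical headers, each terminated by a newline
        "host",              # signed headers
        "UNSIGNED-PAYLOAD",  # body is not covered by a presigned URL
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key by chaining HMACs over date, region, and service.
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, "s3")
    k_signing = _hmac_sha256(k_service, "aws4_request")
    signature = hmac.new(
        k_signing, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()

    return f"{scheme}://{host}{canonical_uri}?{canonical_query}&X-Amz-Signature={signature}"


if __name__ == "__main__":
    # Hypothetical endpoint and credentials, for illustration only.
    print(presign_get("rgw.example.internal:8080", "demo", "dir/report.pdf",
                      "EXAMPLEACCESSKEY", "EXAMPLESECRETKEY"))
```

In practice an S3 SDK would normally generate such URLs; the sketch is meant only to make the signing steps visible. The `host` argument must match the Host header the client will send, including any non-default port.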

Deployment and Scaling

Deployments range from single-zone lab setups to global multi-site fabrics used by service providers. To scale horizontally, administrators add gateway instances and Ceph OSDs, using tools like Ansible, SaltStack, and Terraform to manage infrastructure as code. Containerized deployments use Kubernetes Operators and Helm charts for lifecycle management, and can be integrated with the storage classes and CSI drivers used by Kubernetes and OpenShift. For geographic distribution, multi-site replication organizes gateways into zonegroups and zones, serving a purpose comparable to the geo-replication found in systems like GlusterFS and in enterprise products from vendors such as NetApp and Dell EMC.

Security and Access Control

Security involves transport encryption, credential management, and policy enforcement. Administrators enable TLS with certificates issued by authorities such as Let's Encrypt or an internal Microsoft Active Directory Certificate Services deployment, terminate TLS at edge proxies like HAProxy or Traefik, and use firewall appliances from vendors like Palo Alto Networks or Fortinet for perimeter control. Access control supports S3 ACLs and bucket policies, role-based access via Keystone (OpenStack), and federated identity using SAML or OpenID Connect. Server-side encryption can be integrated with KMIP-compliant key managers and hardware security modules from Thales Group or IBM for enterprise key custody. Audit and compliance workflows align with frameworks such as ISO/IEC 27001, with regulations issued by bodies such as the European Commission, and with guidance published by the National Institute of Standards and Technology.
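
A bucket policy is a JSON document in the same general shape that Amazon S3 uses. The sketch below builds a read-only policy in Python. The helper name, bucket name, and user names are hypothetical, and the principal ARN form with an empty tenant field is an assumption about a non-tenanted deployment; the supported actions and principal formats should be checked against the Ceph release in use.

```python
import json


def read_only_bucket_policy(bucket: str, users: list[str]) -> dict:
    """Return an S3-style policy granting list and read access to `users`."""
    # Assumed principal form for users outside any tenant.
    principals = [f"arn:aws:iam:::user/{user}" for user in users]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadOnly",
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                # ListBucket applies to the bucket, GetObject to its objects.
                "Action": ["s3:ListBucket", "s3:GetObject"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    policy = read_only_bucket_policy("reports", ["alice", "bob"])
    # The serialized JSON is what an S3 client would upload as the policy.
    print(json.dumps(policy, indent=2))
```

Listing both the bucket ARN and the `/*` object ARN matters because bucket-level and object-level actions are matched against different resources.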

Performance and Monitoring

Performance tuning spans backend OSD configuration, placement group count, and network topology, including high-throughput fabrics from vendors like Mellanox Technologies (now part of NVIDIA). Observability relies on telemetry exported to Prometheus, visualized in Grafana, and on logs collected through Elasticsearch, Logstash, and Kibana (the ELK stack). Benchmarking and load testing employ tools like cosbench and s3bench, along with application-level tests driven from distributed compute systems such as Apache Hadoop and Apache Spark. The Ceph Manager's dashboard and Ceph's native telemetry expose per-pool and per-OSD metrics, and capacity planning draws on the same data, following practices similar to those of large cloud providers such as Google, Microsoft Azure, and Amazon Web Services.
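
For the placement group count, a long-standing rule of thumb is to aim for roughly 100 PGs per OSD, divide by the pool's replica count, and round up to a power of two. The sketch below encodes that heuristic; the function name and the default target are assumptions for illustration, and recent Ceph releases can adjust PG counts automatically, so the result is a starting estimate rather than a recommendation.

```python
import math


def suggested_pg_count(num_osds: int, pool_size: int,
                       target_pgs_per_osd: int = 100) -> int:
    """Estimate a total PG count: (OSDs * target) / pool size, rounded up
    to the next power of two.

    pool_size is the replica count for replicated pools; for erasure-coded
    pools the analogous value is k + m.
    """
    if num_osds <= 0 or pool_size <= 0 or target_pgs_per_osd <= 0:
        raise ValueError("all arguments must be positive")
    raw = math.ceil(num_osds * target_pgs_per_osd / pool_size)
    # (raw - 1).bit_length() gives the exponent of the next power of two.
    return 1 << (raw - 1).bit_length()


if __name__ == "__main__":
    # 200 OSDs with 3 replicas: 200 * 100 / 3 is about 6667, rounded up to 8192.
    print(suggested_pg_count(200, 3))
```

When a cluster hosts several pools, this total is shared among them, so pools that hold most of the data (for a gateway, typically the bucket data pool) receive the larger share.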

Integration and Use Cases

Common integrations include object storage back ends for OpenStack Glance, archival stores for DICOM data in healthcare, media asset management for broadcasters like the BBC, and big data pipelines coupling Apache Hadoop and Apache Spark with object storage. Backup and archive appliances, backup software from vendors such as Veeam and Commvault, and cloud-native platforms use the gateway as an S3-compatible target. Scientific collaborations at institutions like NASA and the European Space Agency, and research labs such as Los Alamos National Laboratory, use Ceph with the gateway for petascale data. Content delivery workflows interoperate with CDN providers including Cloudflare and Akamai via origin pulls, while CI/CD pipelines in projects using GitLab or Jenkins store artifacts in S3-compatible buckets served by the gateway.
