NDB Cluster

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Name: NDB Cluster
Developer: Oracle Corporation
Initial release: 2001
Latest release: 8.0 (example)
Written in: C++
Operating system: Linux, Solaris, Windows
Genre: Distributed database
License: Proprietary / GPL (components)

NDB Cluster is a distributed, shared-nothing data management system that originated as a high-availability storage engine designed for telecom and web-scale environments. It provides synchronous replication, automatic failover, and online scaling, and is best known as the storage engine behind MySQL Cluster (MySQL NDB Cluster). Key adopters include telecommunication carriers, cloud providers, and web platforms that need low-latency, fault-tolerant transactional storage.

Overview

NDB Cluster was created to meet the requirements of carrier-grade systems and real-time services. The NDB engine originated at Ericsson for telecom workloads, and its design was shaped by the needs of service platforms such as those run by Telefonica. Its architecture reflects ideas from distributed systems research exemplified by Leslie Lamport's consensus work and Google's distributed databases. Commercialization passed to MySQL AB, which acquired the technology in 2003, and later to Oracle Corporation, which gained MySQL through its acquisition of Sun Microsystems. It competes and interoperates in a landscape that includes PostgreSQL, Cassandra, MongoDB, and Redis.

Architecture

NDB Cluster implements a multi-process topology with distinct roles inspired by distributed architectures used by Sun Microsystems and HP fault-tolerant systems. Primary components include management nodes, data nodes, and SQL/API nodes; designs echo principles in Berkeley DB clustering and Amazon service partitioning. Communication uses a packet protocol over TCP/IP or RDMA, similar to approaches in Intel scalable fabrics. High-availability features reference concepts used in Veritas clustering and Hewlett-Packard NonStop systems.
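The three node roles can be made concrete with a small sketch. The Python below models a hypothetical four-host layout and renders it as text in the style of a cluster `config.ini`. The section names (`[ndbd default]`, `[ndb_mgmd]`, `[ndbd]`, `[mysqld]`) and the `NoOfReplicas`, `NodeId`, and `HostName` parameters follow the conventions of MySQL NDB Cluster configuration files; the host names, node IDs, and the `Node` and `render_config` helpers are illustrative assumptions, and a real deployment needs further parameters.

```python
from dataclasses import dataclass


@dataclass
class Node:
    role: str      # "ndb_mgmd" (management), "ndbd" (data), or "mysqld" (SQL/API)
    node_id: int
    host: str


def render_config(nodes, replicas=2):
    """Render minimal config.ini-style text for the given nodes."""
    # Defaults shared by all data nodes come first.
    lines = ["[ndbd default]", f"NoOfReplicas={replicas}", ""]
    # One section per node; sections with the same role name may repeat.
    for n in nodes:
        lines += [f"[{n.role}]", f"NodeId={n.node_id}", f"HostName={n.host}", ""]
    return "\n".join(lines)


# Hypothetical layout: one management node, two data nodes, one SQL node.
nodes = [
    Node("ndb_mgmd", 1, "mgm.example.test"),
    Node("ndbd", 2, "data1.example.test"),
    Node("ndbd", 3, "data2.example.test"),
    Node("mysqld", 4, "sql1.example.test"),
]
config_text = render_config(nodes)
print(config_text)
```

The text is built by hand rather than with `configparser` because the file format repeats section names (one `[ndbd]` section per data node), which `configparser` rejects by default.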

Data Storage and Replication

Data storage in NDB Cluster uses distributed in-memory primary copies augmented by on-disk persistence, reminiscent of hybrid strategies in SAP HANA and Oracle Database In-Memory options. Replication is synchronous within node groups and supports configurable replicas per partition, paralleling replication models in Microsoft SQL Server Always On and IBM Db2 pureScale. Partitioning strategies follow key-based sharding patterns used by Facebook's infrastructure and Twitter storage layers. Checkpointing and redo logging for durability are comparable to mechanisms in Ingres and Informix.
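A toy model may help illustrate how key-based partitioning and synchronous replication within node groups fit together. The sketch below is a simplification written for this article, not NDB's implementation: the `ToyCluster` class, the CRC32 placement rule, and the node IDs are all assumptions, and real clusters add partitions per group, distributed commit, and disk persistence.

```python
import zlib


def node_groups(data_node_ids, replicas):
    """Split data nodes into groups of `replicas` nodes each."""
    if len(data_node_ids) % replicas != 0:
        raise ValueError("data node count must be a multiple of the replica count")
    return [data_node_ids[i:i + replicas]
            for i in range(0, len(data_node_ids), replicas)]


def group_for_key(key, groups):
    """Pick a node group from a hash of the key (illustrative placement rule)."""
    return zlib.crc32(key.encode("utf-8")) % len(groups)


class ToyCluster:
    def __init__(self, data_node_ids, replicas=2):
        self.groups = node_groups(data_node_ids, replicas)
        self.store = {node_id: {} for node_id in data_node_ids}

    def write(self, key, value):
        """Apply the write to every replica in the owning group before returning."""
        group = self.groups[group_for_key(key, self.groups)]
        for node_id in group:
            self.store[node_id][key] = value
        return group

    def read(self, key, failed=()):
        """Read from the first surviving replica in the owning group."""
        group = self.groups[group_for_key(key, self.groups)]
        for node_id in group:
            if node_id not in failed:
                return self.store[node_id][key]
        raise RuntimeError("all replicas in the node group are down")


cluster = ToyCluster([2, 3, 4, 5], replicas=2)
owners = cluster.write("subscriber:1001", {"plan": "prepaid"})
# The row stays readable while one replica of its node group is down.
print(cluster.read("subscriber:1001", failed={owners[0]}))
```

Because every write reaches all replicas of the owning group before it returns, losing a single node in a group does not lose data; losing every node in one group makes that group's share of the keys unavailable.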

Transaction Management and Recovery

Transaction processing adopts two-phase commit variants and lock-based concurrency control, influenced by work from Jim Gray and research at MIT and Stanford. Recovery leverages distributed checkpoints, binary logs, and local replay similar to approaches in Oracle RAC and MySQL binlog replication. Node failure handling and data rebalancing draw on algorithms seen in Google Bigtable and consensus ideas from Raft and Paxos literature. Administrative recovery utilities mirror tools provided by Sun Microsystems and Red Hat enterprise offerings.
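The general shape of two-phase commit can be sketched in a few lines. The Python below is a textbook-style illustration under simplifying assumptions (no timeouts, no coordinator failure, no logging) and does not reproduce the specific variant NDB Cluster uses; `Participant` and `two_phase_commit` are hypothetical names.

```python
class Participant:
    """Toy participant holding staged and committed rows."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.staged = {}
        self.committed = {}

    def prepare(self, txn_id, changes):
        """Phase 1: stage the changes and vote yes, or vote no."""
        if not self.can_prepare:
            return False
        self.staged[txn_id] = dict(changes)
        return True

    def commit(self, txn_id):
        """Phase 2: make the staged changes visible."""
        self.committed.update(self.staged.pop(txn_id))

    def abort(self, txn_id):
        """Discard staged changes, if any."""
        self.staged.pop(txn_id, None)


def two_phase_commit(txn_id, changes, participants):
    """Return True if every participant prepared and the commit was applied."""
    prepared = []
    for p in participants:
        if p.prepare(txn_id, changes):
            prepared.append(p)
        else:
            # One "no" vote aborts the transaction everywhere.
            for q in prepared:
                q.abort(txn_id)
            return False
    for p in participants:
        p.commit(txn_id)
    return True


a, b = Participant("a"), Participant("b")
ok = two_phase_commit(1, {"k": "v"}, [a, b])

c = Participant("c", can_prepare=False)
rejected = two_phase_commit(2, {"k": "w"}, [a, c])
print(ok, rejected, a.committed)
```

The second transaction leaves `a.committed` unchanged: because `c` votes no during the prepare phase, the change staged on `a` is discarded rather than applied.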

Administration and Monitoring

Management is centralized through management nodes that coordinate configuration, startup, and cluster membership, a pattern also used by systems such as ZooKeeper and Consul. Monitoring integrates with metrics and tracing ecosystems such as Prometheus and Grafana and can be incorporated into observability platforms used by Netflix and LinkedIn. Tools for backup, restore, and rolling upgrades reflect operational practices from Oracle Corporation enterprise products and cloud orchestration paradigms employed by Kubernetes and OpenStack.

Performance and Scalability

NDB Cluster targets low-latency, high-throughput transaction processing through in-memory operation and partitioned parallelism, strategies similar to those of VoltDB and Tarantool. Scalability is achieved by adding data nodes and SQL/API nodes, paralleling scale-out models in Cassandra and Elasticsearch. Benchmarks often reference OLTP workloads like those popularized by TPC standards and practical deployments at companies such as Telefonica and Booking.com that require horizontal elasticity. Network topology, CPU affinity, and NUMA tuning influence performance, as with high-performance systems developed at Intel and AMD.
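A rough capacity estimate shows what adding data nodes buys. Assuming each node group holds one slice of the data and every node in the group keeps a full copy of that slice, usable in-memory capacity grows with the number of node groups rather than the number of nodes. The function below is a back-of-the-envelope sketch; its name and parameters are illustrative, and it ignores index, overhead, and disk-based columns.

```python
def usable_capacity_gib(data_nodes, data_memory_gib, replicas=2):
    """Estimate usable in-memory capacity for a replicated, partitioned cluster."""
    if data_nodes % replicas != 0:
        raise ValueError("data node count must be a multiple of the replica count")
    groups = data_nodes // replicas
    # Each group stores a distinct slice; replicas within a group are copies.
    return groups * data_memory_gib


for n in (4, 8):
    print(n, "data nodes ->", usable_capacity_gib(n, 64), "GiB usable")
```

With two replicas and 64 GiB of data memory per node, going from four to eight data nodes doubles the estimate from 128 GiB to 256 GiB, while the total installed memory is twice that in both cases.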

Use Cases and Deployments

Typical use cases include telecommunications subscriber stores, session state for web platforms, real-time bidding in advertising stacks, and gaming leaderboards, application spaces also served by Cassandra, Redis, and Aerospike. Production deployments have appeared at carriers and large web properties with design needs like those seen at Akamai, Yahoo!, and eBay. Integration scenarios extend to hybrid transactional/analytical processing when coupled with analytics engines like Apache Spark or data warehouses such as Snowflake.

Category:Distributed databases
Category:Oracle software