Generated by GPT-5-mini

| Database management systems | |
|---|---|
| Name | Database management systems |
Database management systems

Database management systems (DBMSs) provide software platforms for creating, querying, updating, and administering structured data stores. They underpin applications developed by organizations such as IBM, Oracle Corporation, Microsoft, SAP SE, and Amazon, support research at institutions such as the Massachusetts Institute of Technology and Stanford University, and enable services across industries at companies such as Goldman Sachs, Walmart, Pfizer, Boeing, and Netflix. Early milestones included IBM's System R, the Ingres project at the University of California, Berkeley, and the CODASYL consortium's network data model, all of which influenced later standards from the American National Standards Institute and organizations such as the Institute of Electrical and Electronics Engineers.
The evolution of relational systems followed theoretical advances by Edgar F. Codd and practical systems from IBM and the University of California, Berkeley, which led to commercial products from Oracle Corporation and Informix. Parallel developments in transaction processing and distributed databases were driven by projects at Bell Labs, embedded libraries such as Berkeley DB, and research groups at Carnegie Mellon University and the University of California, Santa Barbara. The rise of internet-scale services at Google, Facebook, Twitter, and Amazon accelerated interest in NoSQL alternatives such as Apache Cassandra, MongoDB, and Apache HBase, while standards bodies including the World Wide Web Consortium and ISO/IEC JTC 1 responded with interoperable languages and protocols.
Core components include a query processor, storage engine, transaction manager, and catalogue (metadata manager), found in systems ranging from Oracle Database to PostgreSQL and Microsoft SQL Server. High-availability architectures rely on replication techniques implemented in Percona, MariaDB, and MySQL clusters, and orchestration often uses tools such as Kubernetes and Docker. In distributed deployments, consensus algorithms such as Paxos and Raft appear alongside coordination services such as Apache ZooKeeper and etcd to manage cluster membership and leader election. Backup and recovery strategies integrate with storage arrays from vendors such as NetApp and Dell EMC.
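The leader-election idea mentioned above can be illustrated with a small sketch. It models only the vote-granting rule used in Raft-style elections (one vote per node per term, strict majority wins) and deliberately omits log comparison, timeouts, and networking. The `Node` class and `run_election` function are hypothetical names introduced for illustration, not part of any real coordination service.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One cluster member; tracks the latest term seen and the vote it cast."""
    name: str
    current_term: int = 0
    voted_for: Optional[str] = None

    def request_vote(self, candidate: str, term: int) -> bool:
        # A newer term resets any vote cast in an older term.
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None
        # Requests from stale terms are rejected.
        if term < self.current_term:
            return False
        # Grant at most one vote per term (repeat requests from the same
        # candidate are answered consistently).
        if self.voted_for in (None, candidate):
            self.voted_for = candidate
            return True
        return False


def run_election(candidate: Node, peers: List[Node], term: int) -> bool:
    """Return True if the candidate collects votes from a strict majority."""
    votes = 1 if candidate.request_vote(candidate.name, term) else 0
    votes += sum(peer.request_vote(candidate.name, term) for peer in peers)
    cluster_size = len(peers) + 1
    return votes > cluster_size // 2
```

In a three-node cluster, a second candidate standing in the same term cannot win because every node has already voted, which is the property that prevents two leaders in one term. A real implementation would also compare log freshness before granting a vote.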
The relational model traces to Edgar F. Codd and is queried through Structured Query Language (SQL), standardized by ANSI and ISO. Alternative models include document stores exemplified by MongoDB, key–value stores such as Redis, wide-column stores such as Apache Cassandra, and graph databases including Neo4j and Amazon Neptune. Query languages include SQL, SPARQL, and Gremlin, and programmatic interfaces are often built on remote procedure call frameworks such as Apache Thrift and gRPC. Semantic data and ontology work relies on World Wide Web Consortium standards, notably the Resource Description Framework and the Web Ontology Language, which are used in graph-centric systems.
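To make the contrast between data models concrete, the following sketch stores the same information relationally (two normalized tables joined with SQL), then reshapes it as a single nested document and as a key–value entry. It uses Python's built-in `sqlite3` and `json` modules; the table names, sample rows, and the `author:1` key format are invented for illustration.

```python
import json
import sqlite3

# Relational form: normalized tables linked by a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    author_id INTEGER NOT NULL REFERENCES author(id)
);
INSERT INTO author VALUES (1, 'Ada');
INSERT INTO book VALUES (10, 'Notes A', 1), (11, 'Notes B', 1);
""")

# A join reassembles the related rows at query time.
rows = conn.execute(
    "SELECT a.name, b.title "
    "FROM author a JOIN book b ON b.author_id = a.id "
    "ORDER BY b.id"
).fetchall()

# Document form: the same data nested in one self-contained record.
document = {"name": "Ada", "books": [title for _, title in rows]}

# Key-value form: an opaque value looked up by a single key.
kv_store = {"author:1": json.dumps(document)}
```

The relational form avoids duplication and lets the engine answer ad hoc joins, while the document and key–value forms trade that flexibility for retrieving a whole record in one lookup.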
Atomicity, consistency, isolation, and durability (ACID) concepts grew out of transaction-processing research at IBM and were implemented in early systems such as System R and Ingres. Concurrency control techniques include two-phase locking, used by Microsoft SQL Server under its default lock-based isolation; optimistic concurrency control, which validates transactions at commit time instead of locking up front and is used by systems such as FoundationDB; and multiversion concurrency control (MVCC), present in PostgreSQL, Oracle Database, and SAP HANA. Google Spanner combines two-phase locking for read-write transactions with multiversion, timestamp-based reads. Distributed transactions and global consistency are addressed by protocols such as two-phase commit and by consensus mechanisms such as Paxos, used by Google's Chubby lock service, and the Zab protocol used by Apache ZooKeeper.
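The two-phase commit protocol mentioned above can be sketched in a few lines. This is a minimal in-memory illustration, assuming reliable participants and no coordinator failure; real implementations add durable logging, timeouts, and recovery. The `Participant` class and `two_phase_commit` function are hypothetical names, not an API of any particular database.

```python
from typing import List


class Participant:
    """A resource manager that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, name: str, can_commit: bool = True):
        self.name = name
        self.can_commit = can_commit  # simulates whether local work succeeded
        self.state = "init"

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the local changes can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> str:
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit only on a unanimous yes, otherwise abort all.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "aborted"
```

A single "no" vote aborts the transaction everywhere, which is how the protocol preserves atomicity across nodes. Its main weakness, not modeled here, is that prepared participants block if the coordinator fails before announcing the decision.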
Physical storage strategies include page-based designs and log-structured designs, the latter seen in RocksDB and LevelDB; compression and tiering schemes take advantage of hardware from vendors such as Intel Corporation and Western Digital. Index structures include B-trees, used in SQLite and MySQL InnoDB; LSM-trees, implemented by Cassandra and HBase; and specialized indexes such as R-trees for spatial data, applied in Esri products and PostGIS. Query optimizers draw on cost models developed in academia, including work at Princeton University and the University of Washington, and execution engines incorporate vectorized processing, as used in MonetDB and ClickHouse.
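The log-structured merge (LSM) idea can be shown with a toy in-memory model: writes go to a small mutable memtable, full memtables are flushed as immutable sorted runs, reads check the newest data first, and compaction merges runs. This is a teaching sketch only; the `TinyLSM` class and its `memtable_limit` parameter are hypothetical and do not reflect how RocksDB, Cassandra, or HBase are actually implemented (no disk I/O, write-ahead log, tombstones, or Bloom filters).

```python
import bisect
from typing import Dict, List, Optional, Tuple


class TinyLSM:
    def __init__(self, memtable_limit: int = 2):
        self.memtable_limit = memtable_limit
        self.memtable: Dict[str, str] = {}
        # Immutable sorted runs, oldest first and newest last.
        self.runs: List[List[Tuple[str, str]]] = []

    def put(self, key: str, value: str) -> None:
        # Writes touch only the in-memory table, so they are cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        # Persist the memtable as a sorted, immutable run.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        # Newest data wins: memtable first, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self) -> None:
        # Merge all runs into one; later (newer) runs overwrite older values.
        merged: Dict[str, str] = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

The design trades read cost for write throughput: a lookup may have to consult several runs, which is why real LSM engines compact in the background. A B-tree, by contrast, updates pages in place and keeps a single sorted structure.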
Access control models such as role-based access control appear in Oracle and Microsoft products and integrate with identity systems such as Active Directory and Okta. Encryption at rest and in transit relies on standards from the National Institute of Standards and Technology and on protocols defined by the Internet Engineering Task Force. Auditing, data masking, and provenance tracking support compliance with regulatory frameworks, including the General Data Protection Regulation in the European Union and the Health Insurance Portability and Accountability Act in the United States. Data integrity mechanisms employ checksums and constraints, as implemented in PostgreSQL and in enterprise platforms from SAP SE.
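Role-based access control can be reduced to two mappings: users to roles, and roles to privileges. The sketch below shows that core check in plain Python; the role names, user names, and `is_allowed` function are made up for illustration and do not correspond to any vendor's catalog or API.

```python
from typing import Dict, Set

# Hypothetical role definitions: each role bundles a set of privileges.
ROLE_PRIVILEGES: Dict[str, Set[str]] = {
    "analyst": {"SELECT"},
    "app_writer": {"SELECT", "INSERT", "UPDATE"},
    "dba": {"SELECT", "INSERT", "UPDATE", "DELETE", "ALTER"},
}

# Hypothetical user-to-role assignments; users never hold privileges directly.
USER_ROLES: Dict[str, Set[str]] = {
    "alice": {"analyst"},
    "bob": {"analyst", "app_writer"},
}


def is_allowed(user: str, privilege: str) -> bool:
    """Return True if any role assigned to the user grants the privilege."""
    roles = USER_ROLES.get(user, set())
    return any(privilege in ROLE_PRIVILEGES.get(role, set()) for role in roles)
```

Because privileges attach to roles, changing what an analyst may do is a single edit to the role rather than one edit per user. Unknown users and unknown roles fall through to a deny, which is the safer default.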
Systems range from embedded engines such as SQLite, used in Android and iOS apps, to enterprise appliances from Oracle Corporation and cloud-native services including Amazon Aurora, Google Cloud Spanner, and Azure SQL Database. NoSQL families include document, key–value, column-family, and graph databases from vendors such as MongoDB, Inc., Redis Labs, DataStax, and Neo4j, Inc. Hybrid transactional/analytical processing (HTAP) and NewSQL products emerged from companies such as VoltDB and NuoDB, while data lakehouse architectures combine processing engines such as Apache Spark with storage layers such as Delta Lake, as offered by Databricks.
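An embedded engine runs inside the application process rather than as a separate server. The sketch below uses Python's built-in `sqlite3` module with an in-memory database to show that such an engine still provides transactional guarantees: a transfer that would violate a `CHECK` constraint is rolled back as a whole. The `account` table, the sample balances, and the `transfer` helper are assumptions made for this example.

```python
import sqlite3

# No server to start: the database lives inside this process.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 50)])
conn.commit()


def transfer(db: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move funds atomically; return False if the transfer was rolled back."""
    try:
        # The connection context manager commits on success and rolls back
        # if an exception escapes the block.
        with db:
            db.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            db.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False


def balances(db: sqlite3.Connection) -> dict:
    return dict(db.execute("SELECT name, balance FROM account"))
```

The credit is applied before the debit on purpose, so a failed debit demonstrates that the earlier statement in the same transaction is also undone. Client-server and cloud systems expose the same transactional semantics over a network connection instead of a function call.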