WiredTiger — LLMpedia

WiredTiger
Name	WiredTiger
Developer	MongoDB, Inc.; original authors WiredTiger, Inc.
Written in	C, C++
Initial release	2012
Latest release	(see project)
Repository	GitHub
License	GNU GPL v2 or v3 (MongoDB Server, which embeds WiredTiger, is distributed under the Server Side Public License)
Website	WiredTiger project pages

Contents

History
Architecture and Design
Storage Engines and Data Structures
Concurrency and Transactions
Performance and Benchmarks
Integration and Usage in Databases
Licensing and Development Community

WiredTiger is a high-performance storage engine originally developed by WiredTiger, Inc., which was acquired by MongoDB, Inc. in 2014. It was designed as a modern alternative to older embedded designs such as Berkeley DB, is best known as the default storage engine of MongoDB, and has been evaluated in projects associated with MySQL and PostgreSQL research. The project draws on the open-source ecosystem around the Linux Foundation, on corporate contributors such as Amazon Web Services, and on academic work cited by researchers at MIT, Stanford University, and the University of California, Berkeley.

History

WiredTiger emerged from a startup founded by Keith Bostic and Michael Cahill, veterans of Sleepycat Software, the company behind Berkeley DB. Its design drew on Berkeley DB, Google Bigtable, and LevelDB. Early releases in 2012 targeted scalable workloads of the kind seen at Facebook, Twitter, and LinkedIn, and adoption accelerated after MongoDB, Inc. acquired the company in December 2014. The engine’s evolution reflects integration efforts with enterprise adopters including Red Hat, Canonical, and Microsoft Azure teams, and it has been discussed at conferences such as USENIX FAST, ACM SIGMOD, and VLDB. Key contributors and maintainers have included engineers formerly associated with Oracle Corporation, Intel Corporation, and HP Labs.

Architecture and Design

WiredTiger’s architecture is modular. It borrows concepts from the client libraries of Google File System and Hadoop Distributed File System, yet it remains a local, embeddable storage engine suited to systems like MongoDB and to embedded-database experiments such as those around Redis. Core components include an in-memory cache whose eviction policy is derived from research at Carnegie Mellon University, and a logging layer informed by the ARIES recovery work that originated at IBM Research. The design supports pluggable compression codecs, similar in concept to those in ZFS and the bzip2 ecosystem, and uses a threading model comparable to those of Apache Cassandra and RocksDB to maximize parallel I/O throughput on platforms such as Linux, FreeBSD, and Windows Server.
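The interplay between a bounded cache and a pluggable compressor can be illustrated with a minimal sketch. It is not WiredTiger's implementation: the class `PageCache`, its `compress`/`decompress` hooks, and the plain least-recently-used policy are all assumptions chosen for brevity, and the real engine's eviction logic is considerably more elaborate.

```python
import zlib
from collections import OrderedDict


class PageCache:
    """Toy page cache (hypothetical): LRU eviction, evicted pages are
    compressed before being written to a stand-in for disk."""

    def __init__(self, capacity, compress=zlib.compress, decompress=zlib.decompress):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity        # max uncompressed pages held in memory
        self.compress = compress        # pluggable codec hooks (assumed names)
        self.decompress = decompress
        self.pages = OrderedDict()      # page_id -> bytes, least recent first
        self.disk = {}                  # page_id -> compressed bytes

    def put(self, page_id, data):
        self.pages[page_id] = data
        self.pages.move_to_end(page_id)  # mark as most recently used
        self._evict_if_needed()

    def get(self, page_id):
        if page_id not in self.pages:
            # Cache miss: read the compressed block and decompress it.
            self.pages[page_id] = self.decompress(self.disk[page_id])
        self.pages.move_to_end(page_id)
        data = self.pages[page_id]
        self._evict_if_needed()
        return data

    def _evict_if_needed(self):
        while len(self.pages) > self.capacity:
            victim, data = self.pages.popitem(last=False)  # least recently used
            self.disk[victim] = self.compress(data)


cache = PageCache(capacity=2)
for page_id, fill in ((1, b"a"), (2, b"b"), (3, b"c")):
    cache.put(page_id, fill * 100)
# Page 1 was the least recently used, so it now lives only in compressed form.
restored = cache.get(1)
```

Swapping the codec only requires passing a different pair of functions, for example `bz2.compress` and `bz2.decompress` from the standard library, which is the sense in which the compressor is "pluggable" in this sketch.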

Storage Engines and Data Structures

WiredTiger implements multiple storage strategies, notably a B-tree variant related to structures used by IBM DB2 and a log-structured merge (LSM) tree conceptually similar to LevelDB and RocksDB. It supports append-only logging and checkpointing, mechanisms that resemble the write-ahead log in PostgreSQL and the redo logs in Oracle Database. Data is laid out in pages that can be compressed with codecs such as Snappy and zlib, which are also used in systems like Apache Parquet and Hadoop. Indexing and cursor semantics are comparable to those described in the Ingres and Sybase literature, while the on-disk formats reflect trade-offs discussed in papers from SIGMOD and VLDB.
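How append-only logging and checkpointing combine for crash recovery can be shown with a small sketch. The class `ToyStore` and its method names are hypothetical and the "durable" structures are ordinary in-memory objects; the point is only the general technique of restoring the last checkpoint and then replaying the log written after it, not WiredTiger's actual file formats or recovery code.

```python
class ToyStore:
    """Toy key-value store (hypothetical) with a log and checkpoints."""

    def __init__(self):
        self.checkpoint = {}  # stand-in for the last durable snapshot
        self.log = []         # stand-in for the durable append-only log
        self.memory = {}      # volatile state, lost on a crash

    def put(self, key, value):
        self.log.append(("put", key, value))  # log first, then apply
        self.memory[key] = value

    def delete(self, key):
        self.log.append(("del", key, None))
        self.memory.pop(key, None)

    def take_checkpoint(self):
        # Persist a consistent snapshot; earlier log records are no longer needed.
        self.checkpoint = dict(self.memory)
        self.log.clear()

    def crash(self):
        self.memory = {}      # simulate losing everything held in memory

    def recover(self):
        # Start from the checkpoint, then replay the log in order.
        self.memory = dict(self.checkpoint)
        for op, key, value in self.log:
            if op == "put":
                self.memory[key] = value
            else:
                self.memory.pop(key, None)


store = ToyStore()
store.put("a", 1)
store.take_checkpoint()
store.put("b", 2)
store.delete("a")
store.crash()
store.recover()
# After recovery the store reflects every logged operation: only "b" remains.
```

Taking checkpoints more often shortens the log that must be replayed after a crash, at the cost of more snapshot writes during normal operation.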

Concurrency and Transactions

WiredTiger provides multi-version concurrency control (MVCC), influenced by earlier work on Berkeley DB and by algorithmic patterns discussed in ACM journals. MVCC lets readers proceed without blocking writers and provides snapshot isolation comparable to the transactional semantics of PostgreSQL and Oracle Database. Transaction support includes atomic commit and rollback, in line with two-phase commit discussions in IEEE transaction research, and lock management borrows ideas found in Linux kernel synchronization primitives and Intel CPU memory-ordering models. The concurrency design has been evaluated alongside systems like MySQL InnoDB, RocksDB, and SQLite in community benchmarking at events such as NoSQL Matters and MongoDB World.
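The idea of snapshot isolation under MVCC can be sketched in a few lines. The classes `MVCCStore` and `Transaction` below are hypothetical and single-threaded, and the sketch detects write conflicts at commit time (first committer wins) purely for simplicity; it should not be read as a description of how WiredTiger itself tracks versions or reports conflicts.

```python
import itertools


class MVCCStore:
    """Toy multi-version store (hypothetical): each key keeps a version chain."""

    def __init__(self):
        self._clock = itertools.count(1)  # source of commit timestamps
        self.versions = {}                # key -> [(commit_ts, value), ...] ascending
        self.last_commit = 0

    def begin(self):
        # A transaction sees only versions committed at or before its snapshot.
        return Transaction(self, snapshot_ts=self.last_commit)


class Transaction:
    def __init__(self, store, snapshot_ts):
        self.store = store
        self.snapshot_ts = snapshot_ts
        self.writes = {}                  # private, uncommitted updates

    def get(self, key):
        if key in self.writes:            # read your own writes first
            return self.writes[key]
        for ts, value in reversed(self.store.versions.get(key, [])):
            if ts <= self.snapshot_ts:    # newest version visible to the snapshot
                return value
        return None

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First committer wins: abort if a key changed after our snapshot.
        for key in self.writes:
            chain = self.store.versions.get(key, [])
            if chain and chain[-1][0] > self.snapshot_ts:
                raise RuntimeError("write conflict on %r" % (key,))
        ts = next(self.store._clock)
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((ts, value))
        self.store.last_commit = ts

    def rollback(self):
        self.writes.clear()               # discard uncommitted updates


db = MVCCStore()
setup = db.begin()
setup.put("k", "v1")
setup.commit()

reader = db.begin()                       # snapshot taken before the next write
writer = db.begin()
writer.put("k", "v2")
writer.commit()

old_view = reader.get("k")                # still the value from reader's snapshot
new_view = db.begin().get("k")            # a fresh snapshot sees the new value
```

Because the reader never waits on the writer and the writer never waits on the reader, both make progress; the cost is that old versions must be retained until no snapshot can still see them.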

Performance and Benchmarks

Benchmarks comparing WiredTiger to engines such as InnoDB, RocksDB, LevelDB, and Berkeley DB have been presented at USENIX, ACM EuroSys, and industry forums hosted by Amazon and Microsoft Research. Reported results emphasize throughput for mixed read/write workloads, as measured with YCSB, and latency improvements observed in deployments at eBay, Cisco, and Adobe Systems. Compression and cache tuning follow practices familiar to memcached and Redis operators, and profiling tools such as perf, DTrace, and Valgrind have been used in community-driven analyses by contributors from Intel Labs and Google.

Integration and Usage in Databases

The most prominent integration of WiredTiger is in MongoDB, where it became available as an optional storage engine in MongoDB 3.0 (2015) and the default in MongoDB 3.2, replacing the older MMAPv1 engine. It has also been used experimentally in connectors and forks within MySQL-adjacent projects and in academic prototypes at MIT CSAIL and UC Berkeley RISELab. Cloud providers including Amazon Web Services (via Amazon DocumentDB comparisons), Google Cloud Platform, and Microsoft Azure have documented performance trade-offs for document-store and key-value workloads on WiredTiger-backed deployments. Contributions and ports have been discussed by teams at Red Hat and in the context of Kubernetes operators that manage StatefulSets.

Licensing and Development Community

WiredTiger itself is distributed under the GNU General Public License (version 2 or 3). Its licensing is often discussed together with the policies of MongoDB, Inc., which in 2018 moved the MongoDB server from the AGPL to the Server Side Public License, a change frequently mentioned alongside licensing changes made by Elastic and other open-source vendors. The development community has included corporate contributors from MongoDB, Inc., Amazon, and Intel, as well as independent maintainers, who coordinate via GitHub and present at FOSDEM, All Things Open, and Open Source Summit events. Governance and contributor models draw comparisons to projects under the Apache Software Foundation and Linux Foundation umbrellas, and ecosystem tooling often intersects with continuous integration platforms such as Travis CI and Jenkins.
