Boehm–Demers–Weiser garbage collector

Name	Boehm–Demers–Weiser garbage collector
Developer	Hans Boehm, Alan Demers, Mark Weiser
Released	1988
Programming language	C, C++
Operating system	Cross-platform
License	MIT-style permissive license

Contents

Overview
Design and Implementation
Features and Usage
Performance and Evaluation
Compatibility and Language Bindings
History and Development
Criticisms and Limitations

The Boehm–Demers–Weiser garbage collector (often called the Boehm GC) is a conservative garbage collector for C, C++, and other languages that otherwise rely on manual memory management. It was developed to provide automatic memory reclamation in existing codebases. The project was created by Hans Boehm, Alan Demers, and Mark Weiser, and has been used alongside systems and libraries in environments influenced by work at institutions such as Digital Equipment Corporation, IBM, MIT, Carnegie Mellon University, and the University of Massachusetts Amherst. It both influenced and drew on memory-management research at organizations such as Bell Labs, the University of Washington, and Stanford University.

Overview

The collector takes a conservative, stop-the-world approach. Instead of relying on compiler-supplied root or layout information, it scans the stack, registers, and static data for values that could be pointers into the heap, treats every such value as a reference, and reclaims the objects that are never reached. The approach draws on prior research at Xerox PARC, Bell Labs, and Sun Microsystems, and was designed to integrate with large existing UNIX-based codebases and toolchains from vendors including Sun Microsystems, Silicon Graphics, and Intel Corporation. It provides automatic reclamation while aiming to minimize changes to programs built with compilers such as GCC and Clang/LLVM, and has been deployed in projects associated with the GNU Project, FreeBSD, and NetBSD. Contributors and adopters have included researchers and engineers affiliated with Microsoft Research, Google, Apple Inc., and various academic laboratories.

Design and Implementation

The design finds roots conservatively by scanning the registers and thread stacks of code built with compilers such as GCC, Clang/LLVM, and legacy toolchains from Borland, tracing object reachability with techniques similar to those described in papers at ACM conferences and USENIX symposia. The implementation, written in C, relies on platform-specific virtual memory primitives on POSIX systems and on integration hooks for runtime environments such as those from Sun Microsystems and IBM. The collector uses a mark-and-sweep algorithm: a mark phase follows every candidate pointer from the roots, and a sweep phase returns unmarked blocks to the free lists, coalescing adjacent free blocks with algorithms related to earlier work at MIT. Because a value that only looks like a pointer cannot safely be updated, objects are never moved. The collector also supports incremental and parallel marking strategies influenced by research at Carnegie Mellon University and Stanford University. It handles thread suspension and root collection under threading models such as Pthreads and under runtime conventions used by vendors including Intel Corporation and ARM Holdings.
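The mark-and-sweep cycle with conservative root scanning can be illustrated with a small toy model. The sketch below is written in Python for brevity and is not the collector's C implementation. The names `Block`, `find_block`, `mark`, `sweep`, and `collect` are hypothetical, the heap is a plain list of address ranges, and the model assumes that a word pointing anywhere inside a block (an interior pointer) keeps that block alive.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One allocated heap block in the toy model (hypothetical type)."""
    start: int                                  # first address of the block
    size: int                                   # size in bytes
    words: list = field(default_factory=list)   # raw machine words stored inside
    marked: bool = False


def find_block(heap, word):
    """Return the block whose address range contains `word`, or None.

    This is the conservative step: any word that falls inside an allocated
    block is treated as a pointer to it, whether or not it really is one.
    """
    for block in heap:
        if block.start <= word < block.start + block.size:
            return block
    return None


def mark(heap, root_words):
    """Mark every block reachable from the ambiguous root words."""
    worklist = list(root_words)
    while worklist:
        word = worklist.pop()
        block = find_block(heap, word)
        if block is not None and not block.marked:
            block.marked = True
            worklist.extend(block.words)  # the block's contents are scanned too


def sweep(heap):
    """Drop unmarked blocks and reset marks; return (survivors, freed_bytes)."""
    survivors = [b for b in heap if b.marked]
    freed = sum(b.size for b in heap if not b.marked)
    for b in survivors:
        b.marked = False
    return survivors, freed


def collect(heap, root_words):
    """Run one full stop-the-world cycle over the toy heap."""
    mark(heap, root_words)
    return sweep(heap)


heap = [
    Block(start=0x1000, size=32, words=[0x2008]),  # A holds a pointer into B
    Block(start=0x2000, size=64, words=[7]),       # B holds a plain integer
    Block(start=0x3000, size=16),                  # C is referenced by nothing
]
survivors, freed = collect(heap, root_words=[0x1000, 42])
print([hex(b.start) for b in survivors], freed)    # ['0x1000', '0x2000'] 16
```

The root word 42 and the stored integer 7 match no block and are ignored, while 0x2008 keeps block B alive even though it does not point at the start of B. Surviving blocks keep their addresses, which is why this style of collector does not compact the heap.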

Features and Usage

Features include automatic reclamation for legacy C and C++ applications, conservative stack scanning for code compiled with GCC and Clang/LLVM, and configurable heuristics suitable for server software from organizations such as the Apache Software Foundation, the Mozilla Foundation, and NGINX, Inc. The collector exposes APIs for integration into projects developed at organizations such as Mozilla, Red Hat, and Canonical, and supports tuning parameters for allocation thresholds, informed by benchmarking work from SPEC and Phoronix Test Suite contributors. It has been used in runtime systems and virtual machines associated with Erlang/OTP research groups, embedded systems from ARM Holdings partners, and large-scale services run by companies such as Google and Facebook.
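To make the idea of an allocation threshold concrete, the following Python sketch models one possible policy: start a collection once the bytes allocated since the last cycle reach the heap size divided by a tunable divisor. The class `AllocationTrigger`, the helper `count_collections`, and the exact rule are assumptions made for illustration. The library documents a free-space divisor setting in a similar spirit, but its actual heuristics are not reproduced here.

```python
class AllocationTrigger:
    """Hypothetical policy: collect after enough bytes have been allocated."""

    def __init__(self, heap_size, divisor=3):
        if divisor <= 0:
            raise ValueError("divisor must be positive")
        self.heap_size = heap_size
        self.divisor = divisor
        self.allocated_since_gc = 0
        self.collections = 0

    def threshold(self):
        """Bytes that may be allocated between two collections."""
        return self.heap_size // self.divisor

    def allocate(self, nbytes):
        """Record an allocation; return True if it triggered a collection."""
        self.allocated_since_gc += nbytes
        if self.allocated_since_gc >= self.threshold():
            self.collections += 1
            self.allocated_since_gc = 0
            return True
        return False


def count_collections(divisor, heap_size=1200, nbytes=100, allocations=12):
    """Replay a fixed allocation pattern and count the collections it causes."""
    trigger = AllocationTrigger(heap_size, divisor)
    for _ in range(allocations):
        trigger.allocate(nbytes)
    return trigger.collections


print(count_collections(3), count_collections(6))  # 3 6
```

In this model a larger divisor lowers the threshold, so collections run more often and the heap stays smaller at the cost of more collector work. A smaller divisor makes the opposite trade.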

Performance and Evaluation

Performance analyses compare the collector with precise collectors and other tracing collectors developed in academic settings at MIT, Harvard University, and Princeton University, and with industrial implementations from Oracle Corporation and IBM. Benchmarks on hardware from Intel Corporation and AMD show trade-offs in pause time and throughput relative to the moving collectors used in projects at Sun Microsystems and in OpenJDK, and evaluations published in venues such as ACM SIGPLAN and USENIX report overheads that depend on the workload. Tuning for multicore servers from Dell Technologies and Hewlett Packard Enterprise can reduce the impact on latency-sensitive services such as those operated by Amazon and Microsoft Azure.

Compatibility and Language Bindings

Bindings and ports have been created for languages and runtimes from Eclipse Foundation projects, interpreters hosted by the Python Software Foundation, and research languages from University of Cambridge groups, enabling use with Python, Ruby, and experimental systems researched at ETH Zurich. The collector interoperates with compilers and toolchains from the GNU Project and Clang/LLVM, and has been packaged for distributions maintained by Debian, the Fedora Project, and openSUSE. Community ports have interfaced with virtual machines and language ecosystems from Eclipse and the Mono Project, and with academic runtime work at the University of Illinois at Urbana–Champaign.

History and Development

Development began in the late 1980s and early 1990s with academic contributions from researchers affiliated with the University of Massachusetts Amherst and Carnegie Mellon University, culminating in releases influenced by industrial partners such as Digital Equipment Corporation and Sun Microsystems. The collector evolved through collaboration among academics and engineers at IBM Research and Microsoft Research and in open source communities associated with the GNU Project and FreeBSD. Over time, enhancements added multiprocessor support, incremental collection, and greater configurability, driven by needs identified in deployments at Google and the Mozilla Foundation.

Criticisms and Limitations

Criticisms center on three issues. First, conservative root identification can cause false retention: a non-pointer value that happens to look like a heap address keeps an otherwise dead object alive, as noted in studies at ACM conferences and by engineers at Oracle Corporation and IBM. Second, because objects are never moved, fragmentation can build up, as observed on systems from Intel Corporation and ARM Holdings. Third, the collector is difficult to integrate with runtimes that expect precise GC, such as those developed at Sun Microsystems and in OpenJDK. The approach can also complicate the real-time guarantees demanded by systems used in projects at NASA and on embedded platforms from ARM Holdings partners, and its stop-the-world phases make it less suitable for low-latency services such as those maintained by Amazon and Microsoft Azure.
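A back-of-the-envelope calculation shows why false retention depends on how much of the address space the heap occupies. The Python sketch below is a rough model, not a measurement: `looks_like_pointer` and `misidentification_chance` are hypothetical helpers, and the model assumes that non-pointer words are uniformly random, which real program data is not.

```python
def looks_like_pointer(word, heap_start, heap_size):
    """Conservative test: does this word fall inside the heap's address range?"""
    return heap_start <= word < heap_start + heap_size


def misidentification_chance(heap_size, address_bits):
    """Chance that a uniformly random word lands inside the heap range."""
    return heap_size / 2 ** address_bits


# An integer such as a hash value or a packed bit field can pass the test.
print(looks_like_pointer(0x2010, heap_start=0x2000, heap_size=0x1000))  # True

# A 256 MiB heap covers 1/16 of a 32-bit address space ...
print(misidentification_chance(2 ** 28, 32))   # 0.0625
# ... but only a vanishing fraction of a 64-bit one.
print(misidentification_chance(2 ** 28, 64))
```

Under these assumptions a misidentified word is far more likely in a 32-bit process with a large heap than in a 64-bit process, which matches the general observation that false retention is mainly a concern when the heap fills a sizeable share of the address space.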
