LLMpedia: The first transparent, open encyclopedia generated by LLMs

Garbage collection (computer science)

Generated by DeepSeek V3.2
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Garbage collection (computer science)
Name: Garbage collection
Paradigm: Memory management
Designer: John McCarthy
First appeared: Lisp, 1959
Influenced by: Automatic memory management
Influenced: Java, C#, Python, JavaScript

In computer science, garbage collection is a form of automatic memory management that reclaims memory occupied by objects a program no longer uses. It relieves the programmer of manual memory management and helps prevent problems such as memory leaks and dangling pointers. John McCarthy introduced the technique for the Lisp programming language in 1959, and it has since become a cornerstone of many modern languages, including Java, C#, and Python.

Overview

A garbage collector operates as a subsystem of a program's runtime system, periodically identifying and deallocating memory that is no longer reachable from the program's root set. This automation contrasts with manual memory management in languages like C and C++, where the programmer allocates and releases memory explicitly through calls such as `malloc` and `free`. The primary benefits are greater software reliability and developer productivity; the main cost is runtime overhead, including pauses of unpredictable length when a stop-the-world collection cycle suspends the program. Major implementations are found in the Java virtual machine, the Common Language Runtime of the .NET Framework, and the V8 engine for JavaScript.

Basic concepts

The fundamental principle is to distinguish live objects from garbage by reachability from a set of root references, such as global variables and the local variables in stack frames. The objects and the pointer references between them form an object graph; anything that cannot be reached by following references from the roots is garbage. Two key terms are the mutator, the application program that alters the object graph, and the collector, which performs the reclamation. A collector must never reclaim live data, and it should ideally reclaim all garbage, including groups of objects that reference one another cyclically, which some algorithms (notably plain reference counting) cannot reclaim on their own.
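The reachability rule can be illustrated with a small sketch. The object names, the `reachable` helper, and the dictionary encoding of the object graph are all hypothetical choices made for this example; a real collector works on raw memory rather than on named entries.

```python
def reachable(roots, edges):
    """Return the set of object names reachable from the given roots.

    `edges` maps each object name to the names it references.
    """
    live = set()
    worklist = list(roots)
    while worklist:
        name = worklist.pop()
        if name in live:
            continue  # already visited; also guards against cycles
        live.add(name)
        worklist.extend(edges.get(name, []))
    return live


# A toy object graph: "x" and "y" reference each other,
# but nothing reachable from a root references either of them.
heap = {
    "config": ["cache"],
    "cache": ["entry"],
    "entry": [],
    "frame_local": ["entry"],
    "x": ["y"],
    "y": ["x"],
}
roots = ["config", "frame_local"]  # e.g. a global variable and a stack slot

live = reachable(roots, heap)
garbage = set(heap) - live
print(sorted(garbage))
```

Here `x` and `y` form a cycle that is unreachable from the roots, so a reachability-based collector treats both as garbage even though each is still referenced by the other.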

Tracing garbage collection

Tracing collectors, the most common variety, traverse the object graph from the roots and mark every reachable object; whatever remains unmarked is garbage. The classic algorithm is mark-and-sweep, developed for Lisp, which marks live objects and then sweeps through memory to reclaim unmarked space. Copying garbage collection, used in environments like the Smalltalk system, divides memory into two spaces and copies live objects from one to the other, compacting them in the process. Cheney's algorithm is a well-known breadth-first copying method. The tri-color marking abstraction, which classifies objects as white (not yet visited), gray (visited, but with references still to be scanned), or black (fully scanned), is a common model for reasoning about these tracing processes.
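As an illustration of the copying approach, the following is a minimal sketch in the style of Cheney's algorithm. The `Cell` class and `cheney_collect` function are hypothetical names, and a Python list stands in for the to-space; a real implementation copies raw memory between two fixed regions and uses address arithmetic for the scan and free pointers.

```python
class Cell:
    """A heap cell whose fields are plain values or references to other cells."""

    def __init__(self, *fields):
        self.fields = list(fields)
        self.forward = None  # forwarding pointer, set once the cell has been copied


def cheney_collect(roots):
    """Copy everything reachable from `roots` into a fresh to-space.

    Returns (new_roots, to_space). Cells left behind in from-space are garbage.
    """
    to_space = []  # appending to this list plays the role of the "free" pointer

    def copy(cell):
        if cell.forward is None:
            clone = Cell(*cell.fields)  # fields still point into from-space
            cell.forward = clone
            to_space.append(clone)
        return cell.forward

    new_roots = [copy(root) for root in roots]

    scan = 0  # cells before `scan` are fully processed
    while scan < len(to_space):
        cell = to_space[scan]
        # Redirect each reference to the to-space copy, copying on first visit.
        cell.fields = [copy(f) if isinstance(f, Cell) else f for f in cell.fields]
        scan += 1
    return new_roots, to_space


# Build a small graph with a cycle (leaf -> root) and one unreachable cell.
left, right, leaf = Cell("left"), Cell("right"), Cell("leaf")
root = Cell("root", left, right)
left.fields.append(leaf)
leaf.fields.append(root)
junk = Cell("junk", root)  # references root, but nothing references junk

new_roots, to_space = cheney_collect([root])
print([cell.fields[0] for cell in to_space])
```

The to-space list doubles as the work queue, which is why the traversal is breadth-first and needs no separate mark stack. The forwarding pointer ensures that a cell reachable along several paths, or through a cycle, is copied only once.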

Reference counting

Reference counting is an alternative approach in which each object stores a count of the references to it. The compiler or runtime system maintains the count, incrementing it when a reference is created and decrementing it when one is destroyed; an object whose count drops to zero is reclaimed immediately. The technique was described by George E. Collins in 1960; it is used in Apple's Cocoa API (through Automatic Reference Counting) and is the primary memory management mechanism of the CPython interpreter. Its main drawback is that it cannot reclaim cyclic references without an auxiliary mechanism, such as the supplementary cycle-detecting collector in CPython.
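The cycle problem can be observed directly in CPython. The sketch below assumes the CPython interpreter specifically (other Python implementations manage memory differently); the `Node` class and `refcount_cycle_demo` function are hypothetical names, while `gc.disable`, `gc.enable`, `gc.collect`, and `weakref.ref` are standard-library calls.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.other = None


def refcount_cycle_demo():
    """Show that reference counting alone frees acyclic objects but not cycles."""
    gc.disable()  # switch off the automatic cycle collector for the experiment
    try:
        # Acyclic case: dropping the last reference frees the object at once.
        a = Node("a")
        probe_a = weakref.ref(a)  # a weak reference does not keep `a` alive
        del a
        acyclic_freed = probe_a() is None

        # Cyclic case: each node keeps the other's count above zero.
        x, y = Node("x"), Node("y")
        x.other, y.other = y, x
        probe_x = weakref.ref(x)
        del x, y
        cycle_alive = probe_x() is not None

        gc.collect()  # an explicit run of the cycle collector
        cycle_freed = probe_x() is None
    finally:
        gc.enable()
    return {
        "acyclic_freed_on_del": acyclic_freed,
        "cycle_alive_after_del": cycle_alive,
        "cycle_freed_after_collect": cycle_freed,
    }


print(refcount_cycle_demo())
```

Under CPython all three values are expected to be true: the lone object disappears as soon as its count reaches zero, while the two-node cycle survives until the auxiliary collector runs.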

Generational garbage collection

Generational collection builds on the weak generational hypothesis, the empirical observation that most objects die young. Memory is divided into generations, typically a young generation (which in HotSpot contains the Eden space) and an old, or tenured, generation. The Java virtual machine and the Common Language Runtime of the .NET Framework employ this strategy. A minor collection frequently scans only the young generation and promotes surviving objects to an older generation; a less frequent major collection scans the entire heap. Because most collections touch only a small part of the heap, this approach shortens typical pause times and improves throughput, as seen in the HotSpot JVM. The Boehm–Demers–Weiser garbage collector, primarily a conservative mark-and-sweep collector, also offers an optional generational mode.
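The interplay of minor and major collections can be sketched with a toy two-generation heap. Everything here is an illustrative assumption: the `Obj` and `GenerationalHeap` classes, the promotion age of 2, and above all the treatment of every old object as a root during a minor collection. Production collectors instead track old-to-young references with a write barrier and a remembered set.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.age = 0  # number of minor collections survived


class GenerationalHeap:
    PROMOTE_AGE = 2  # hypothetical threshold for promotion to the old generation

    def __init__(self):
        self.young, self.old, self.roots = [], [], []

    def alloc(self, name):
        obj = Obj(name)
        self.young.append(obj)  # new objects always start in the young generation
        return obj

    def minor_collect(self):
        """Collect only the young generation; return the number of objects reclaimed."""
        young = set(self.young)
        live = set()
        worklist = list(self.roots)
        for old_obj in self.old:  # conservative: every old object acts as a root
            worklist.extend(old_obj.refs)
        while worklist:
            obj = worklist.pop()
            if obj in young and obj not in live:  # never trace into the old generation
                live.add(obj)
                worklist.extend(obj.refs)
        reclaimed = len(self.young) - len(live)
        survivors = [o for o in self.young if o in live]
        self.young = []
        for obj in survivors:
            obj.age += 1
            target = self.old if obj.age >= self.PROMOTE_AGE else self.young
            target.append(obj)
        return reclaimed

    def major_collect(self):
        """Trace the whole heap from the roots; return the number of objects reclaimed."""
        live = set()
        worklist = list(self.roots)
        while worklist:
            obj = worklist.pop()
            if obj not in live:
                live.add(obj)
                worklist.extend(obj.refs)
        before = len(self.young) + len(self.old)
        self.young = [o for o in self.young if o in live]
        self.old = [o for o in self.old if o in live]
        return before - len(self.young) - len(self.old)
```

In this sketch a long-lived object is promoted after surviving two minor collections, and short-lived temporaries are reclaimed without the old generation being traced. The cost of the shortcut is that an unreachable old object keeps its young referents alive until the next major collection.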

Implementation considerations

Implementing a garbage collector requires careful design to limit its impact on latency and throughput. Incremental garbage collection splits the collector's work into small steps interleaved with the mutator, while concurrent garbage collection, used for example by the Garbage-First (G1) collector of the Java virtual machine for its marking phase, runs collector work at the same time as the mutator. Real-time computing systems, such as those built on the Real-Time Specification for Java, require real-time garbage collection with bounded pause times. The choice of algorithm also affects memory fragmentation, which compacting collectors such as the one in the .NET Framework help to mitigate. Performance tuning often involves parameters such as heap size, and collector design remains an active area of research at institutions like Microsoft Research and in projects like OpenJDK.
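When marking is interleaved with the mutator, the collector must not lose track of references the mutator creates along the way. The following sketch shows one common way to handle this, assuming tri-color marking with a bounded work budget per step and an insertion-style write barrier. The `Obj` and `IncrementalMarker` classes and their methods are hypothetical and do not describe how any particular virtual machine implements marking.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"


class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.color = WHITE  # not yet visited by the collector


class IncrementalMarker:
    def __init__(self, roots):
        self.gray = []  # objects visited but not yet scanned
        for root in roots:
            self._shade(root)

    def _shade(self, obj):
        if obj.color == WHITE:
            obj.color = GRAY
            self.gray.append(obj)

    def step(self, budget):
        """Scan at most `budget` gray objects; return True once marking is complete."""
        while budget > 0 and self.gray:
            obj = self.gray.pop()
            for child in obj.refs:
                self._shade(child)
            obj.color = BLACK  # fully scanned
            budget -= 1
        return not self.gray

    def write(self, src, dst):
        """Mutator store `src.refs += [dst]`, guarded by a write barrier."""
        src.refs.append(dst)
        if src.color == BLACK:
            # A black object will not be rescanned, so shade the target now.
            self._shade(dst)


a, b = Obj("a"), Obj("b")
a.refs.append(b)
marker = IncrementalMarker([a])

marker.step(1)         # collector: `a` becomes black, `b` becomes gray
c = Obj("c")           # mutator allocates while marking is in progress
marker.write(a, c)     # barrier shades `c` because `a` is already black
done = marker.step(10)
print(done, a.color, b.color, c.color)
```

Without the barrier, `c` would remain white although a black object references it, and a subsequent sweep would reclaim a live object. The budget passed to `step` is what bounds the length of each collector pause in this model.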

Categories: Memory management; Programming language implementation