Eric Bruno is a contributing editor for Dr. Dobb's. He can contacted at [email protected]
At JavaOne 2009, Sun released Java SE 6 Update 14, which included a version of the much-anticipated Garbage First (G1) garbage collector. G1 is a low-pause, low-latency, sometimes soft real-time, collector that allows you to set max pause time goals and collection intervals through suggestions on the Java VM command line. Although it cannot guarantee it, G1 will attempt to meet your goals, and hence introduce as little latency as possible into your application. This in turn may also make the VM run more predictably as it attempts to meet the pause time goals you provide.
What Is Garbage Collection?
Many dynamic languages, such as C, C++, Pascal, and so on, require you to manage memory explicitly. This includes memory allocation, de-allocation, and all of the accounting that occurs in between. In this time frame, you must be sure to not lose track of the memory (thereby failing to ever free it), or the result will be a memory leak. Just as dangerous is the attempt to use an object (or access memory) after it has been de-allocated, through what is called a dangling pointer. Either one of these situations can result in undefined behavior, the accidental overwriting of other data, a security hole, or an abrupt crash.
Automatic memory management (garbage collection) removes the likelihood that these issues will occur since it's no longer left up to you to account for memory allocations. In C++, the concept of smart pointers is one solution, and in other languages, such as Lisp, SmallTalk, and Java, a full-featured garbage collector tracks the lifetimes of all objects in a running program. The history of garbage collection can be traced back to John McCarthy, who invented the concept as part of the Lisp programming language [McCarthy58].
In short, a garbage collector works to reclaim areas of memory within an application that will never be accessed again. At the most fundamental level, garbage collection involves two deceivingly simple steps:
Determine which objects can no longer be referenced by an application. This is done either via object reference counting, or object graphs (tracing). Reclaim the memory occupied by dead objects (the garbage).
Until recently, Java SE came with two main collectors: the parallel collector, and the concurrent-mark-sweep (CMS) collector -- see the sidebar Parallelism and Concurrency. As of the latest Java SE 6 update release, the G1 collector is another option. The plan is for G1 to eventually replace CMS as a low-pause, soft real-time collector. Let's take a look at how it works.
Parallelism and Concurrency
When speaking about garbage collection algorithms, parallelism describes the collector's ability to perform its work across multiple threads of execution. Concurrency describes its ability to do work while application threads are still running. Hence, a collector can be parallel but not concurrent, concurrent but not parallel, or both parallel and concurrent.
The Java parallel collector (the default) is parallel but not concurrent as it pauses application threads to do its work. The CMS collector is parallel and partially concurrent as it pauses application threads at many points (but not all) to do its work. The G1 collector is fully parallel and mostly concurrent, meaning that it does pause applications threads momentarily, but only during certain phases of collection. For more information on garbage collection, and common algorithms used, read my latest book entitled Real-Time Java Programming with Java RTS, available from Pearson Publishing.