*** Work in progress ***

This page gives a brief outline of the major control flows in the execution of a garbage collector in MMTk.  For simplicity, we focus on the MarkSweep collector, although much of the discussion will be relevant to other collectors. 

This page assumes a basic knowledge of garbage collection. Readers without it should consult one of the standard texts, such as The Garbage Collection Handbook.

Structure of a Plan

An MMTk Plan is required to provide five classes.  Their names must share a common prefix, followed by a suffix that indicates which class each one inherits from.  In the case of the MarkSweep plan, the prefix is "MS".

  • MS - this is a singleton class that is a subclass of org.mmtk.plan.Plan.   This class encapsulates data structures that are shared among multiple threads.
  • MSMutator - subclass of org.mmtk.plan.MutatorContext.  This class encapsulates data structures that are local to a single mutator thread.  In the case of Jikes RVM, a Thread is actually a subclass of this class for efficiency reasons.
  • MSCollector - subclass of org.mmtk.plan.CollectorContext.  This provides thread-local data structures specific to a garbage collector thread.
  • MSConstraints - subclass of org.mmtk.plan.PlanConstraints.  This provides configuration information that the host virtual machine might need.  It is separated out from the Plan class in order to prevent circular class loading dependencies.
  • MSTraceLocal - subclass of org.mmtk.plan.TraceLocal.  This provides thread-local data structures specific to a particular way of traversing the heap.  In a simple collector like MarkSweep, there is only one of these classes, but in more complex collectors there may be several.  For example, in a generational collector, there will be one TraceLocal class for a nursery collection, and another for a full-heap collection.
The basic architecture of MMTk is that virtual address space is divided into chunks (of 4MB in a 32-bit memory model) that are managed according to a specific policy.  A policy is implemented by an instance of the Space class, and it is in the policy class that the mechanics of a particular mechanism (like mark-sweep) are implemented.  The task of a Plan is to create the policy (Space) objects that manage the heap, and to integrate them into the MMTk framework.
MMTk exposes some of this memory management policy to the host VM by allowing the VM to specify an allocator (represented by a small integer) when allocating space.  The interface exposed to the VM allows it to choose, for example, whether an object will move during collection and whether the object is large enough to require special handling.  The MMTk plan is free (within the semantic guarantees exposed to the VM) to direct each of these allocators to a particular policy.
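The mapping from allocator numbers to policies can be pictured with a minimal sketch. Everything named here (the class, the constants and their values, the policy labels) is an assumption made for illustration and is not taken from the MMTk source; the point is only that the VM sees small integers while the plan privately decides which policy backs each one.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: class, constant, and policy names are assumptions, not MMTk source.
class AllocatorRoutingSketch {
  // Small integers the VM passes in to describe the kind of allocation it wants.
  static final int ALLOC_DEFAULT = 0;
  static final int ALLOC_NON_MOVING = 1;
  static final int ALLOC_LOS = 2; // objects large enough to need special handling

  // The plan's private choice of which policy serves each allocator.
  private final Map<Integer, String> routes = new HashMap<>();

  AllocatorRoutingSketch() {
    routes.put(ALLOC_DEFAULT, "mark-sweep");
    // A non-moving policy already satisfies the "will not move" guarantee.
    routes.put(ALLOC_NON_MOVING, "mark-sweep");
    routes.put(ALLOC_LOS, "large-object");
  }

  // Look up the policy for an allocator number, rejecting unknown numbers.
  String policyFor(int allocator) {
    String policy = routes.get(allocator);
    if (policy == null) {
      throw new IllegalArgumentException("unknown allocator " + allocator);
    }
    return policy;
  }
}
```

In this sketch two allocator numbers share one policy, which is the kind of freedom the paragraph above describes: the plan may route allocators however it likes as long as the guarantees made to the VM still hold.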

Policies

A policy describes how a range of virtual address space is managed.  The base class of all policies is org.mmtk.policy.Space, and a particular instance of a policy is known generically as a space.  The static initializers of a Plan and its subclasses define the spaces that make up an MMTk plan.

 

MS.java

In this code fragment, we see the MS plan defined.  Note that we generally also define a static final space descriptor.  This is an optimization that allows some rapid operations on spaces.
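As a rough illustration of the idea (not the MS.java listing itself), the sketch below defines a space in a static initializer and keeps a static final integer descriptor alongside it. All class and field names are hypothetical, and the descriptor here is simply an index into a registry; the real descriptor encoding in MMTk may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a policy base class; not the real org.mmtk.policy.Space.
abstract class SpaceSketch {
  private static final List<SpaceSketch> REGISTRY = new ArrayList<>();
  final String name;
  final int descriptor; // compact integer handle, here just the registry index

  SpaceSketch(String name) {
    this.name = name;
    this.descriptor = REGISTRY.size();
    REGISTRY.add(this);
  }

  // Recover the space from its descriptor without holding an object reference.
  static SpaceSketch fromDescriptor(int descriptor) {
    return REGISTRY.get(descriptor);
  }
}

class MarkSweepSpaceSketch extends SpaceSketch {
  MarkSweepSpaceSketch(String name) {
    super(name);
  }
}

// Hypothetical plan class: the space and its descriptor are created once,
// when the class is initialized, and shared by every thread.
class MSSketch {
  static final SpaceSketch msSpace = new MarkSweepSpaceSketch("ms");
  static final int MARK_SWEEP = msSpace.descriptor;
}
```

Comparing or passing a primitive int is cheaper than working with an object reference, which is one plausible reading of why a descriptor allows "rapid operations on spaces".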

Space is a global object, shared among multiple mutator threads.  Each policy will also have one or more thread-local classes which provide unsynchronized allocation.  These classes are subclasses of org.mmtk.utility.alloc.Allocator, and in the case of MarkSweep, the class is called MarkSweepLocal.  Instances of MarkSweepLocal are created as part of a mutator context, like this:

MSMutator.java

The design pattern is that the local Allocator will allocate space from a thread-local buffer, and when that is exhausted it will allocate a new buffer from the global Space, performing appropriate locking.  The constructor of the MarkSweepLocal specifies the space from which the allocator will allocate global memory.
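The local/global split can be sketched as follows. This is a simplified model under stated assumptions: the class names are hypothetical, addresses are plain long offsets, and the local allocator uses a bump pointer for brevity, whereas MarkSweepLocal manages free lists. What it does show is the pattern described above: unsynchronized allocation from a thread-local buffer, with a synchronized call to the shared space only when the buffer runs out.

```java
// Hypothetical names; addresses are modelled as plain long offsets.
class GlobalSpaceSketch {
  static final int BUFFER_BYTES = 4096;
  private long cursor = 0;
  private final long limit;

  GlobalSpaceSketch(long totalBytes) {
    this.limit = totalBytes;
  }

  // Synchronized because every mutator thread shares this object.
  synchronized long acquireBuffer() {
    if (cursor + BUFFER_BYTES > limit) {
      return -1; // the space is exhausted
    }
    long start = cursor;
    cursor += BUFFER_BYTES;
    return start;
  }
}

// One instance per mutator thread, so no locking is needed here.
class LocalAllocatorSketch {
  private final GlobalSpaceSketch space;
  private long next = 0;
  private long end = 0; // the buffer starts out empty

  // The constructor names the global space this allocator refills from.
  LocalAllocatorSketch(GlobalSpaceSketch space) {
    this.space = space;
  }

  long alloc(int bytes) {
    if (bytes <= 0 || bytes > GlobalSpaceSketch.BUFFER_BYTES) {
      throw new IllegalArgumentException("unsupported size " + bytes);
    }
    if (next + bytes > end) {
      // Buffer exhausted: fetch a fresh one (the old tail is abandoned in this toy).
      long start = space.acquireBuffer();
      if (start < 0) {
        return -1;
      }
      next = start;
      end = start + GlobalSpaceSketch.BUFFER_BYTES;
    }
    long result = next;
    next += bytes;
    return result;
  }
}
```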

Allocation

MMTk provides two methods for allocating an object.  These are provided by the MSMutator class, which gives each plan the opportunity to use fast, unsynchronized thread-local allocation before falling back to a synchronized slow path.

The version implemented in MarkSweep looks like this:

MSMutator.java

The basic structure of this method is common to all MMTk plans.  First they decide whether the operation applies to this level of abstraction (if (allocator == MS.ALLOC_DEFAULT)), and if so, delegate to the appropriate place, otherwise pass it up the chain to the super-class.  In the case of MarkSweep, MSMutator delegates the allocation to its thread-local MarkSweepLocal object ms.
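The delegation chain described above can be illustrated with a small sketch. The class names, the constant values, and the strings returned are all hypothetical; the real method returns an address and takes more parameters. Only the shape is the point: handle the allocators this level knows about, and pass everything else to the superclass.

```java
// Hypothetical classes modelled on the pattern described above; not MMTk source.
class BaseMutatorSketch {
  static final int ALLOC_LOS = 1;

  // Top of the chain: handles what it knows and rejects the rest.
  String alloc(int bytes, int allocator) {
    if (allocator == ALLOC_LOS) {
      return "large-object allocator";
    }
    throw new IllegalArgumentException("no such allocator: " + allocator);
  }
}

class MSMutatorSketch extends BaseMutatorSketch {
  static final int ALLOC_DEFAULT = 0;

  @Override
  String alloc(int bytes, int allocator) {
    if (allocator == ALLOC_DEFAULT) {
      // This level owns the default allocator, so it handles the request itself.
      return "mark-sweep local allocator";
    }
    // Not ours: pass the request up the chain.
    return super.alloc(bytes, allocator);
  }
}
```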

The alloc method of MarkSweepLocal is inherited from SegregatedFreeListLocal (mark-sweep is not the only way of managing free-list allocation), and looks like this:

SegregatedFreeListLocal.java (simplified)

This is a standard pattern for thread-local allocation: first we look in the thread-local space (line 3), and if successful return the result (lines 4-8).  If unsuccessful, we request space from the global policy via the method Allocator.allocSlow.  This is the common interface that all Allocators use to request space from the global policy.  This will eventually call the allocator-specific allocSlowOnce method.  The workings of the allocSlowOnce method are very policy-specific and are not covered at this stage, but eventually all policies will attempt to acquire fresh virtual memory via the Space.acquire method.
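For readers who want something concrete, here is a toy segregated free list with the same fast-path/slow-path shape. It is a sketch under assumptions, not the SegregatedFreeListLocal listing: the size classes, the cells-per-block figure, and all names are invented, and the slow path simply carves a new block instead of calling into a global policy.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model with invented parameters; addresses are plain long offsets.
class SegregatedFreeListSketch {
  static final int SIZE_CLASSES = 4;    // cells of 16, 32, 48 and 64 bytes
  static final int CELLS_PER_BLOCK = 4;

  private final List<Deque<Long>> freeLists = new ArrayList<>();
  private long nextBlock = 0;
  int slowPathCalls = 0;

  SegregatedFreeListSketch() {
    for (int i = 0; i < SIZE_CLASSES; i++) {
      freeLists.add(new ArrayDeque<>());
    }
  }

  // Round the request up to the nearest multiple of 16 and index from zero.
  static int sizeClass(int bytes) {
    return (bytes + 15) / 16 - 1;
  }

  long alloc(int bytes) {
    if (bytes <= 0 || bytes > 16 * SIZE_CLASSES) {
      throw new IllegalArgumentException("unsupported size " + bytes);
    }
    int sc = sizeClass(bytes);
    Long cell = freeLists.get(sc).pollFirst(); // fast path: pop a free cell
    if (cell != null) {
      return cell;
    }
    return allocSlow(sc); // free list empty: take the slow path
  }

  // Slow path: carve a fresh block into cells, then hand out the first one.
  private long allocSlow(int sc) {
    slowPathCalls++;
    int cellBytes = (sc + 1) * 16;
    Deque<Long> list = freeLists.get(sc);
    for (int i = 0; i < CELLS_PER_BLOCK; i++) {
      list.addLast(nextBlock + (long) i * cellBytes);
    }
    nextBlock += (long) CELLS_PER_BLOCK * cellBytes;
    return list.pollFirst();
  }
}
```

After the first allocation in a size class, the next few requests of that class are served from the free list without touching the slow path, which is the property that makes the fast path worth separating out.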

Space.acquire is the only correct way for a policy to allocate new virtual memory for its own use.  

Space.java (simplified)

The logic of Space.acquire is:

  • First, poll the plan to find out whether the heap is full.  This logic is performed by the plan, because it has knowledge of copy reserves etc.
  • The 'poll' method will request a GC if required, and return true if it has done so.
  • Then we wait for GC if required.  'poll' can't wait, because it is called in circumstances that aren't GC safe.
  • If Plan.poll(...) returns false (we are within the allowed heap size), we call pr.getNewPages to allocate virtual memory.  At this stage we may find that we have run out of virtual memory, and if so, we force a GC.
  • If a GC is performed, we return Address.zero(), rather than retrying locally.  In many plans, the next allocation request will be satisfied by re-using space in a page that already belongs to a policy, so the post-GC allocation must be performed further up in the call stack.  The retry logic is handled in Allocator.allocSlowInline.
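The steps above can be traced in a small model. This is not the Space.java listing: the class, its fields, and the page accounting are hypothetical, the "collection" simply frees everything, and page numbers start at 1 so that zero can stand in for Address.zero().

```java
// Hypothetical model of the acquire logic; names and accounting are assumptions.
class AcquireSketch {
  static final long ZERO = 0L; // stands in for Address.zero()

  private final int heapLimitPages; // budget enforced by the plan
  private final int availablePages; // what the page resource can actually supply
  private int reservedPages = 0;
  int gcRequests = 0;

  AcquireSketch(int heapLimitPages, int availablePages) {
    this.heapLimitPages = heapLimitPages;
    this.availablePages = availablePages;
  }

  // Plan-side check: request a GC if the heap budget would be exceeded.
  boolean poll(int pages) {
    if (reservedPages + pages > heapLimitPages) {
      gcRequests++;
      return true;
    }
    return false;
  }

  // Toy collection: pretend every page is reclaimed.
  void waitForGC() {
    reservedPages = 0;
  }

  long acquire(int pages) {
    if (poll(pages)) {
      waitForGC();
      return ZERO; // a GC ran: the caller must retry further up the stack
    }
    if (reservedPages + pages > availablePages) {
      // Within budget but out of memory: force a GC.
      gcRequests++;
      waitForGC();
      return ZERO;
    }
    long firstPage = 1 + reservedPages; // never zero, so zero can mean "retry"
    reservedPages += pages;
    return firstPage;
  }
}
```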

 

Allocator.java (simplified)

This code fragment shows the retry logic in the allocator.  We try allocating using allocSlowOnce, which may recycle partially-used blocks and eventually call Space.acquire.  If a GC occurred, we try again.  Eventually the plan will request an emergency collection which will (for example) cause soft references to be dropped.  If this fails we throw an OutOfMemoryError.
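A minimal sketch of such a retry loop follows. It is an illustration under assumptions rather than the Allocator.java code: the attempt limit, the class name, and the use of a LongSupplier to stand in for allocSlowOnce are all invented, and the emergency collection is represented only by the final attempt.

```java
import java.util.function.LongSupplier;

// Hypothetical retry loop; the attempt limit is an invented parameter.
class RetrySketch {
  static final int MAX_ATTEMPTS = 3;

  // allocSlowOnce returns zero when a GC occurred instead of an allocation.
  static long allocSlowInline(LongSupplier allocSlowOnce) {
    for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
      long result = allocSlowOnce.getAsLong();
      if (result != 0) {
        return result; // allocation succeeded
      }
      // Zero means a collection ran: loop and try again.
      // In this toy, the last attempt plays the role of the retry
      // that follows an emergency collection.
    }
    throw new OutOfMemoryError("allocation failed after repeated collections");
  }
}
```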

Collection

Scheduling

In a stop-the-world garbage collector like MarkSweep, the mutator threads run until memory is exhausted, then all mutator threads are suspended, the collector threads are activated, and they perform a garbage collection.  After the GC is complete, the collector threads are suspended and the mutator threads resume.  MMTk also has some support for concurrent collectors, in which one or more collector threads can be scheduled to run alongside the mutator, either exclusively or in addition to (hopefully briefer) stop-the-world phases. 

Thread scheduling in MMTk is handled by a GC controller thread, implemented in the singleton class org.mmtk.plan.ControllerCollectorContext, which is held in the static field Plan.controlCollectorContext.  Whenever a collection is initiated, it is done by calling methods on this object.
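The stop-the-world sequence can be modelled without real threads. The sketch below is an assumption-laden toy, not the ControllerCollectorContext implementation: the class and method names are hypothetical, and the suspend/collect/resume steps are recorded as log entries instead of being performed. It shows two things: repeated requests before a collection starts collapse into one, and each serviced request runs the three steps in order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, single-threaded model of a GC controller.
class ControllerSketch {
  private boolean requestPending = false;
  final List<String> events = new ArrayList<>();

  // Called by whoever decides a GC is needed; duplicate requests are ignored.
  synchronized void request() {
    if (requestPending) {
      return;
    }
    requestPending = true;
    events.add("gc requested");
  }

  // One iteration of the controller loop: returns false if nothing was pending.
  synchronized boolean serviceRequest() {
    if (!requestPending) {
      return false;
    }
    events.add("mutators stopped");
    events.add("collectors run");
    events.add("mutators resumed");
    requestPending = false;
    return true;
  }
}
```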

Initiating

As mentioned above, every attempt to allocate fresh virtual memory calls the current plan's poll(...) method.  This initiates a GC by calling controlCollectorContext.request(), which in a stop-the-world collector like MarkSweep pauses the mutator threads and then wakes the collector threads.  The main loop of the garbage collector is simply the run() method of ParallelCollector, shown below.

ParallelCollector

The collect() method is specific to the type of collector, and in StopTheWorldCollector it looks like this:

StopTheWorldCollector

Collector Phases

Every garbage collection consists of a series of steps.  Leaving aside concurrent collectors for the moment, these 
