One of the services that MMTk expects a virtual machine to perform on its behalf is the scanning of objects, i.e. identifying and processing the pointer fields of the live objects it encounters during collection. In principle the implementation of this interface is simple, but there are two moderately complex optimizations layered on top of this.
From MMTk's point of view, each time an object requires scanning, MMTk passes it to the VM along with a TransitiveClosure object. The VM is expected to identify the pointer fields in the object and invoke the closure's processEdge method on each of them. The rationale for the current object scanning scheme is presented in this paper.
JikesRVM to MMTk Interface
MMTk requires its host virtual machine to provide an implementation of the class org.mmtk.vm.Scanning as its interface to scanning objects. JikesRVM's implementation of this class is found under the source tree MMTk/ext/vm/jikesrvm, in the class org.jikesrvm.mm.mmtk.Scanning. The methods we are interested in are scanObject(TransitiveClosure, ObjectReference) and specializedScanObject(int, TransitiveClosure, ObjectReference).
In MMTk, each plan defines one or more TransitiveClosure operations. Simple full-heap collectors like MarkSweep define only one TransitiveClosure, but complex plans like GenImmix or the RefCount plans define several. MMTk allows the plan to request specialized scanning on a closure-by-closure basis: closures that are specialized call specializedScanObject, while unspecialized ones call scanObject. Specialization is covered in more detail below.
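The closure-by-closure choice can be pictured with a toy model. The sketch below is not MMTk code: ToyScanning, ToyClosure and the specializedId field are hypothetical names invented for illustration, and the real Scanning methods take TransitiveClosure and ObjectReference arguments rather than strings. It only shows the shape of the decision, with a negative id assumed to mean "not specialized".

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for the VM-side scanning interface; it records which entry point was used. */
class ToyScanning {
    final List<String> log = new ArrayList<>();

    void scanObject(ToyClosure closure, String object) {
        log.add("scanObject:" + closure.name + ":" + object);
    }

    void specializedScanObject(int id, ToyClosure closure, String object) {
        log.add("specializedScanObject#" + id + ":" + closure.name + ":" + object);
    }
}

/** Toy closure. Assumption of this sketch: a negative id means "no specialization requested". */
class ToyClosure {
    final String name;
    final int specializedId;

    ToyClosure(String name, int specializedId) {
        this.name = name;
        this.specializedId = specializedId;
    }

    /** Specialized closures take the specialized entry point; all others take the plain one. */
    void scan(ToyScanning vm, String object) {
        if (specializedId >= 0) {
            vm.specializedScanObject(specializedId, this, object);
        } else {
            vm.scanObject(this, object);
        }
    }
}

class ClosureDispatchDemo {
    static List<String> run() {
        ToyScanning vm = new ToyScanning();
        new ToyClosure("nursery", 0).scan(vm, "objA");  // specialized closure
        new ToyClosure("sanity", -1).scan(vm, "objA");  // unspecialized closure
        return vm.log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The closure names "nursery" and "sanity" are placeholders; which closures a real plan chooses to specialize is decided by that plan.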
In the absence of hand-inlined scanning, or if specialization is globally disabled, scanning reverts to the fallback method in org.jikesrvm.mm.mminterface.SpecializedScanMethod. This method can be regarded as the basic underlying mechanism, and is worth understanding in detail.
The method first fetches the array of offsets that JikesRVM uses to identify the pointer fields in the object. This array is constructed by the classloader when a class is resolved.
One distinguished value (actually null) is used in place of the offset array to identify arrays of references. For any other value the object is a scalar, and the method scans it by tracing each of the fields at the offsets given by the offset array.
The remaining case is reference arrays, for which the method fetches the array length and scans each of the elements.
The internals of trace.processEdge vary by collector and by collection type (e.g. nursery/full-heap in a generational collector), and the details need not concern us here.
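The control flow just described can be sketched in a self-contained form. This is a minimal model, not the JikesRVM source: ToyObject, ToyTrace and the 8-byte word size are assumptions made for the example, and offsets are plain integers rather than real addresses.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback standing in for TransitiveClosure.processEdge. */
interface ToyTrace {
    void processEdge(ToyObject source, int slotOffset);
}

/** Toy object: the offset array of its type plus, for arrays, a length. */
class ToyObject {
    static final int BYTES_IN_ADDRESS = 8;  // assumed word size for this sketch

    final int[] referenceOffsets;  // null marks an array of references
    final int arrayLength;         // only meaningful for arrays

    ToyObject(int[] referenceOffsets, int arrayLength) {
        this.referenceOffsets = referenceOffsets;
        this.arrayLength = arrayLength;
    }
}

class FallbackScanSketch {
    /** Scan one object by interpreting its offset array, or its length if it is a reference array. */
    static void scan(ToyObject object, ToyTrace trace) {
        int[] offsets = object.referenceOffsets;
        if (offsets != null) {
            // Scalar object: trace the field at each recorded offset.
            for (int offset : offsets) {
                trace.processEdge(object, offset);
            }
        } else {
            // Reference array: every element is a pointer slot.
            for (int i = 0; i < object.arrayLength; i++) {
                trace.processEdge(object, i * ToyObject.BYTES_IN_ADDRESS);
            }
        }
    }

    /** Helper for experimenting: collect the offsets that a scan visits. */
    static List<Integer> visitedOffsets(ToyObject object) {
        List<Integer> seen = new ArrayList<>();
        scan(object, (source, offset) -> seen.add(offset));
        return seen;
    }
}
```

For example, a toy scalar with reference fields at offsets 8 and 24 yields exactly those two edges, while a toy reference array of length 3 yields one edge per element.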
Hand inlining was introduced in February 2011, and uses a cute technique to encode 3 bits of metadata into the TIB pointer in an object's header. The 7 most frequent object patterns are encoded into these bits, and then special-case code is written for each of them.
Hand inlining produces an average-case speedup slightly better than specialization, but performs poorly on some benchmarks. This is why we use it in combination with specialization.
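The general idea of packing a small pattern code into a pointer can be sketched as follows. The text above does not say which bits of the TIB pointer are used or how the patterns are numbered, so this example makes two explicit assumptions: the TIB address is 8-byte aligned, leaving its three low-order bits free, and pattern 0 is reserved to mean "no special case, use the general scan". TibPatternBits and its methods are hypothetical names.

```java
class TibPatternBits {
    static final int PATTERN_BITS = 3;
    static final long PATTERN_MASK = (1L << PATTERN_BITS) - 1;  // binary 111

    /** Pack a pattern code (0..7) into the low bits of an aligned TIB address. */
    static long encode(long alignedTibAddress, int pattern) {
        if ((alignedTibAddress & PATTERN_MASK) != 0) {
            throw new IllegalArgumentException("TIB address must be 8-byte aligned");
        }
        if (pattern < 0 || pattern > PATTERN_MASK) {
            throw new IllegalArgumentException("pattern must fit in 3 bits");
        }
        return alignedTibAddress | pattern;
    }

    /** Recover the pattern code from a header word. */
    static int pattern(long headerWord) {
        return (int) (headerWord & PATTERN_MASK);
    }

    /** Recover the real TIB address by masking the pattern bits off. */
    static long tibAddress(long headerWord) {
        return headerWord & ~PATTERN_MASK;
    }

    /** Assumed convention: 0 selects the general scan, 1..7 select special-case code. */
    static boolean hasSpecialCase(long headerWord) {
        return pattern(headerWord) != 0;
    }
}
```

Three bits give eight codes, which is consistent with seven special-cased patterns plus one value left over for everything else; that reading is an inference, not something the text states.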
Specialized Scanning was introduced in September 2007. It speeds up GC by eliminating the work of fetching and interpreting the offset array that describes each object; instead, the collector jumps directly to a hard-coded method for scanning objects with a particular pattern.
The departure point from "standard" Java into the specialized scanning method is SpecializedScanMethod.invoke(...).
The @SpecializedMethodInvoke annotation signals to the compiler that it should dispatch to one of the specialized method slots in the TIB.
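What such a dispatch amounts to can be illustrated with a toy model. This is not how the JikesRVM compiler implements it: ToyTib, SpecializedScanner, EdgeVisitor and the two example patterns are all hypothetical, and the slot table is an ordinary Java array rather than part of a real TIB. The point of the sketch is that each slot holds scanning code with the field offsets hard-coded, so no offset array is read at scan time.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a closure's processEdge callback. */
interface EdgeVisitor {
    void processEdge(int slotOffset);
}

/** Hypothetical hard-coded scanning routine for one object pattern. */
interface SpecializedScanner {
    void scan(EdgeVisitor trace);
}

/** Toy per-type table standing in for the specialized method slots in a TIB. */
class ToyTib {
    final SpecializedScanner[] specializedSlots;

    ToyTib(SpecializedScanner... slots) {
        this.specializedSlots = slots;
    }
}

class SpecializedDispatchSketch {
    /** Hard-coded scanner for an invented pattern: reference fields at offsets 0 and 16. */
    static final SpecializedScanner REFS_AT_0_AND_16 = trace -> {
        trace.processEdge(0);
        trace.processEdge(16);
    };

    /** Hard-coded scanner for objects with no reference fields. */
    static final SpecializedScanner NO_REFS = trace -> { };

    /** In this toy, dispatch is an index into the type's slot table by specialized method id. */
    static void invoke(int id, ToyTib tib, EdgeVisitor trace) {
        tib.specializedSlots[id].scan(trace);
    }

    /** Helper for experimenting: collect the offsets a specialized scan visits. */
    static List<Integer> edgesFor(ToyTib tib, int id) {
        List<Integer> seen = new ArrayList<>();
        invoke(id, tib, offset -> seen.add(offset));
        return seen;
    }
}
```

Compared with interpreting an offset array, the per-object work here is one table lookup followed by straight-line code.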
Creation of specialized methods is handled by the class