Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migrated to Confluence 5.3

Outcomes

This year we had four Summer of Code projects. We are very grateful to Google for their support and are pleased that all four students successfully passed. As part of the GSoC, Google also provided Jikes RVM with 4 x $500 which we will use to support the regression testing kit at ANU. Below are short summaries of the outcomes of the four projects. The source code for all these project can be found at http://hg.code.sourceforge.net/p/jikesrvm/soc2011.

1) Tools for debugging the Jikes RVM runtime

My project sought to deliver tools for debugging the Jikes RVM runtime. We decided that implementing the JVM Tools Interface (JVMTI) would be a good way of doing this, allowing both application-level debugging (through a JDWP agent) and low-level access to VM internals. The end result of the project was a partially working implementaion of JVMTI, with most of the infrastructure in place to complete an implementation.

JVMTI allows us to write a native (C/C++) agent to interact with the VM at runtime, responding to events and modifying VM state. This allows us to do things at runtime such as redefine bytecode for classes, hook method entry and exit and field access/modification, track heap allocations, etc - essentially, instrument the VM in fairly arbitrary ways. It can also allow us to look at lower level VM areas, inspecting the stack and in some instances being able to access VM state.

The major stumbling blocks related to threading and monitors and the JVMTI implementations thereof. This, combined with the sheer size of the specification (even after reducing it to simpler test cases) mean only a partially operating implementation is available. I intend to revisit these later this month when coursework dies down again, with the goal of getting it to a state where it can be committed to the head. This does not necessarily mean it will be a fully functional and compliant implementation of the specification, but the goal is to have most of Harmony's implementation of a JDWP agent using JVMTI operational. This will allow some form of interactive debugging at the application level (using e.g. the Eclipse debugger), but the parts of JVMTI implemented to make this happen will also allow other, lower level tasks to be completed. For now, my work is on a branch in our soc2011 repository at http://hg.code.sourceforge.net/p/jikesrvm/soc2011, together with a readme describing the future directions further.

The implementation itself, in the org.jikesrvm.runtime.jvmti package, consists of:

  • Infrastructure to allow arbitrary native agents (as shared libraries, the same as JNI) to be loaded and managed (JVMTI, JVMTIAgent)
  • Plumbing for the VM to send events to agents through a small JNI trampoline (JVMTIEventTrampoline)
  • Three events are actually implemented - VMStart (during boot), VMInit (immediately before application main), ThreadStart. Others are mostly a copy/paste job, then inserting the call in the appropriate place
  • Missing: proper handling of events a normal JVM would fire that we initialise instead in the boot image
  • Roughly 65 of the 155 JVMTI functions implemented, including most relating to methods, classes, fields (JVMTIFunctions)
  • Key parts missing: breakpoints, set/get local vars, early returns, stackframe-related stuff
  • Ability for agents to spawn threads running arbitrary native code (not fully functional)

Also included, in the test-agents directory, are some simple JVMTI agents used to test various parts of the implementation for functionality.

I've greatly appreciated the opportunity to work on Jikes RVM, especially because I came to Summer of Code having already used it for my own research work. In particular, I'm grateful to Richard for managing our involvement in the program, and to Steve, Daniel and Dave for their advice and willingness to answer my silly questions.

James Bornholt

2) An Efficient Mark & Sweep Concurrent Snapshot-Based And On The Fly Sliding-Views based Garbage Collectors For Jikes RVM

The project's code is in branch RVM-923-Snapshot. The project is an implementation of a concurrent garbage collector, based on the Snapshot algorithm for garbage collection, and it is intended as a first step towards the on-the-fly Sliding-views based algorithm. 

Both algorithms are described in the paper "An on-the-fly mark & sweep garbage collector based on sliding-views" by Aztachi, Levanoni, Paz & Petrank, OOPSLA 2003. The idea behind the implemented (snapshot) algorithm is to take a 'snapshot' of the heap, and trace objects within it. During the mark phase, if an attempt is made to modify an object before it has been traced, we first take a replica of its pointers and save them in an auxiliary buffer (this is done only once per object).  The algorithm uses two mark bits for every object, enabling four possible colors. This is essential to support the color-toggle mechanism, that swaps the black and white colors before every collection cycle. 

The new collector is implemented in the org.mmtk.concurrent.marksweep folder, but naturally many other classes were added and or edited in the MMtk outside this folder. Note that this folder also exists in the trunk, but it contains a different algorithm. The algorithm in the trunk, when attempting to modify a pointer, first traces through the target object (the object that a pointer towards it is going to be overwritten), instead of replicating the source object (the object that contains the pointer going go be overwritten), like the "new" snapshot algorithm. This "older" implementation is technically simpler, but has several drawbacks, mainly in locality (which leads to a less efficient write-barrier) and that statistically the barrier has to scan more objects, according to most benchmarks (which is again, less efficient). Also this "older" algorithm doesn't fit as well as a basis for the on-the-fly sliding-views algorithm. 

The new collector was not yet thoroughly tested, and is not yet ready for practical usage. It was tested so far only in the MMTk.harness environment. The algorithm successfully passes all the  MMTk.harness tests. There is, however, a race condition that occurs in about 1 in 1000 GC cycles, that results in a deadlock. This is currently the only known bug, but other bugs are likely to exists as well, due to insufficient testing.

The implemented collector currently works with concurrent tracing, but the sweeping is still done in the "Stop The World" mode. This should be modified, to support full concurrency. Some efficiency issues should also be addressed, as the implementation has not been optimized yet. Most notably, a better mechanism should be applied to decide when to reclaim an entire block of memory. The regular (not concurrent) mark-sweep algorithm, does this by using a mark bit for each block. This solution currently can't be used simultaneously with the 'preserveFreeLists' option, since the BlockAllocator uses the same Word field both for the mark of the block and for the free-list head. Currently, the concurrent mark-sweep algorithm determines if a block can be reclaimed by checking every one of its cell, but this is clearly not efficient. The best solution is probably to have the BlockAllocator use the least significant bit of the Word only as a mark bit for the block, and accordingly change its code a little to allow it to simultaneously support both the preserveFreeList and mark bit for blocks.

Shahar Timnat

3) New young generation architecture for the generational garbage collector

The project is accessible via Mercurial at http://hg.code.sourceforge.net/p/jikesrvm/soc2011 under the branch RVM-924-GARY46.

The goal of the project was to compare different young generation models in one virtual machine. The young generation organisations compared were Steps (e.g. as used by the GHC Haskell system), Eden and Survivor spaces (as used by HotSpotO and the default “en masse” promotion model (e.g. only one space in the nursery).

In order to build new young generation models, the young generation has been redesigned.  The objective is to allow the use of new nursery without duplicating the old generations for each model. Garbage collectors in JikesRVM are built using hierarchical dependencies which simplifies the creation of new collectors but increase the complexity when tweaking with the middle components. The project modified the young generation to use delegate classes to manage the nursery’s spaces. This is similar to the use of delegates for the tracing (e.g. MatureTraceLocal, NurseryTraceLocal). Each class of the young generation uses a delegate because they are all affected by modifications to the nursery. This has a cost and the default “en masse” promotion model runs 20% slower than the original design.  Results from the Steps and Eden + Survivor models, show that they are 20% slower than de “en masse” promotion, therefore 40% slower than the original design. In the current implementation, the increase of space usage also triggers an error with small amount of memory. This error is also present in the original design but is triggered more often with the new young generations. The OutOfMemoryException is not always thrown, Exceptions from the Main Thread or the benchmark running can be expected. This is not a normal behavior. Hopefully, the work on the young generation will help prevent this bug from happening. The project is at a stage where it needs to be greatly optimized, 20 to 40 % speed difference with the original design is excessive. For example, remsets might be replaced by card tables and it might be possible for the delegates to take more advantage of the optimizing compiler.

Gary Rampaud

4) Multi-Method Compilation

Multi-Method is a technique where one or more methods are put on a thread to be compiled in parallel. The granularity is coarse when compared to single method. Currently Pre Compile [1], Hot Method Recompilation [2], and Boot [3] have been modified to utilize Multi-Method . The parallel Multi-Method is the most stable of all three and the parallel Boot and Pre-Compile need to be improved.

Why Multi-Thread PreCompile [1,15], Hot Method Recompilation [2,4], and Boot [3,14] ?

Hot Method Recompilation [2] – The original design of recompilation of the hot methods in the current Jikes RVM head made it very easy to multi-thread due to the fact that recompilation of hot methods was already controlled by one daemon thread running in the background constantly waiting for hot methods [6] and recompiling them in the background. All that needed to be done to utilize parallel processing was just spawn more Compilation Threads [5] and add a command line argument to specify the number of threads the user desires. It is important to remember that multi-threaded recompilation of hot methods will only be effective when the user's software or benchmark produces a certain amount of hot methods to be recompiled, otherwise there will be no speedup due to thread overhead or even worse a slow down. One can tune Jikes RVM recompilation heuristics in the following source code files CompilerDNA.java [16,17],AnalyticModel.java [6,18], to increase the number of hot methods in the recompilation queue.

Command to run with N threads: ./rvm -X:aos:num_threads=2 -cp scimark2lib.jar jnt.scimark2.commandline (eg with scimark2lib.jar)

Pre Compile [1] – Pre Compile's core method that controls compilation is embarrassingly parallel because it uses a For loop to iterate over each Method. Currently due to time constraints of Google Summer the multi-threaded PreCompile method is in an non-optimized version 0 state and is currently seeing overhead but the overhead is minimal because it is causing neither a speed up nor a slow down in other words the running time is the same. However, that can be easily fixed by sending groups of data onto a thread at once preferably 1/2 methods for thread one and 1/2 methods for thread two on a two thread system which is an optimization that was used to help improve multi-threade Single-Method [8,9,10,11,12,13].

Command to run and generate advice file: ./rvm -X:aos:enable_advice_generation=true -cp dacapo-2006-10-MR2.jar Harness xalan

Command to run and precompile advice file with N threads: ./rvm -X:aos:enable_precompile=true -X:aos:pre_compile_threads=2 -X:aos:cafi=aosadvice.ca -cp dacapo-2006-10-MR2.jar Harness xalan

Boot [3] – The main advantage to multi-threading the boot [3] source file is because in the Jikes RVM the Boot code compiles around 100 methods every time. The beauty in this multi-threaded optimization is the speedup achieved by this optimization is independent of the benchmark or program the user is running. Unfortunately this optimization is still in a pre version 0 phase, however I hope to polish it up in the near future as continued post GSOC work. The ordering of the class initialization becomes non-deterministic due to arbitrary thread inter-leavings. This causes problems when Class A is required before Class B and Class A and Class B are on two different threads.

Single-Method Compilation [7]

Single-Method parallel compilation is a technique where each method is optimized using multi-threaded compiler optimization. Occasionally when the compiler needs to optimize and compile large methods (around 14,000 Instructions on Core i5) it is better to dedicate all the available threads to one method. For this project we focused on flow insensitive optimization's [7] in the jikes rvm as a baseline implementation. In our experiments we multi-threaded insruction level and register level optimizations and the next logical step for future work is to experiment with block level optimization's and flow sensitive optimization's.

Current Optimizations that are Mutli-threaded for Single-Method

  • Sort Commutative Register Uses[8,12]
  • Copy Propagation[9]
  • Array Propagation [10]
  • Type Propagation [11]
  • RecomputeSSA [13]

Single-Method [7-13] Future Work

  • Determine if any form of compiler optimization in the jikes rvm can benefit greatly from Single-Method
  • Possibly use the power of the GPU or the Cell processor to accelerate Single Method

Multi-Method [1,2,3] Future Work

  • Tune the compiler DNA to allow for more efficient multi-threaded recompilation of hot methods
  • Evaluate if any other locations in the Source Code can benefit from multi-method compilation

References

[1] http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/e3397f4b5006
[2] http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/fed20f02f806
[3] http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/1c6019198603
[4] http://parallelcompiler.blogspot.com/2011/07/new-design-mutlimethod.html
[5] http://hg.code.sourceforge.net/p/jikesrvm/soc2011/file/fed20f02f806/rvm/src/org/jikesrvm/adaptive/recompilation/CompilationThread.java
[6] ftp://ftp.cs.man.ac.uk/pub/apt/theses/ChristosKotselidis_MSc.pdf
[7] http://hg.code.sourceforge.net/p/jikesrvm/soc2011/file/b3a96880313a/rvm/src/org/jikesrvm/compilers/opt/Simple.java
[8] http://parallelcompiler.blogspot.com/2011/07/version-1-sortcommutative-registeruses.html
[9] http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/a5e56291dbf0
[10] http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/362d399161e5
[11] http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/84029a70dce4
[12] http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/066d79055dbc
[13] http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/ab1c9ed8e2dd
[14] http://parallelcompiler.blogspot.com/2011/08/parallel-boot.html
[15] http://parallelcompiler.blogspot.com/2011/08/parallel-precompile.html
[16] http://docs.codehaus.org/display/RVM/Compiler+DNA
[17] http://hg.code.sourceforge.net/p/jikesrvm/jikesrvm/file/c6bfce68e447/rvm/src/org/jikesrvm/adaptive/recompilation/CompilerDNA.java\
[18] http://hg.code.sourceforge.net/p/jikesrvm/jikesrvm/file/c6bfce68e447/rvm/src/org/jikesrvm/adaptive/controller/AnalyticModel.java
http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/ab1c9ed8e2dd

http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/ab1c9ed8e2dd

http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/ab1c9ed8e2dd

http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/ab1c9ed8e2dd

http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/ab1c9ed8e2dd

http://hg.code.sourceforge.net/p/jikesrvm/soc2011/rev/ab1c9ed8e2dd

Michael Taylor

Call

The mentoring organization application deadline for the Google Summer of Code 2011 is March 11. We have had some really excellent contributions to JikesRVM come out of previous Summers of Code, and we intend to apply again. For this we need a strong set of proposals, so please take the time to look at the ones below and offer some more.

FAQs etc

Timeline

Student application template

Previous years

Proposals
Anchor
proposals2011
proposals2011

The projects are organised into the major components of Jikes RVM that they effect: Infrastructure, Garbage Collector (MMTk), Compilers and the Runtime System. We also have other projects that relate to Jikes RVM.

Jikes RVM infrastructure proposals

Enable Jikes RVM to use OpenJDK class libraries

We believe the largest limitation in the set of programs that can be run on Jikes RVM is that Jikes RVM primarily relies on the GNU Classpath class libraries. The most viable library plan for Jikes RVM is to switch to using Open JDK libraries. The work in switching is primarily in the VM/library interface and in the "core" classes in java.lang. This project for be of significant value to the research community because it will widen the set of benchmarks that can be used for Jikes RVM research.

Interested mentors: Daniel Frampton

Get the new DaCapo benchmarks (dacapo-9.12-bach) running on Jikes RVM.

A number of the new DaCapo benchmarks don't run on Jikes RVM. This project would be to figure out why each failing program is failing and get make fixes to Jikes RVM and/or its class libraries to make the programs work. This would be of significant value to the Jikes RVM research community.

Interested mentors: Steve Blackburn

Garbage Collector (MMTk) proposals

Concurrent Garbage Collectors

MMTk has a basic concurrent garbage collector and Laurence Hellyer added Sapphire as part of GSoC 2010. However, we would like to have more, for example, Concurrent immix, the Doliguez-Leroy-Gonthier algorithm (POPL93, POPL94, ISMM00), Azatchi et al (OOPSLA03), or mostly concurrent (Boehm91, Barabash et al 2005).

Interested mentors: Tony Hosking, Erez Petrank, Daniel Frampton, Eliot Moss, Richard Jones

Generational Collector Organisation

Currently, the generational collectors in Jikes RVM use en masse promotion, i.e. all surviving objects in the nursery are promoted to the mature generation. Other VMs have adopted more complex generational organisations that require objects to survive more nursery collections before they are promoted. For example, HotSpot divides the young generation into an Eden (where objects are created) and Survivor semispaces. Implementations for functional languages have typically organised generations into steps. This project would implement some of these organisations.

Interested mentors: Richard Jones, Eliot Moss

Garbage Collection Visualisation

The GCspy framework has become slightly out-of-date and needs improving [RVM-388].

Interested mentors: Richard Jones

The Compressor Mark-Compact Garbage Collector

This project would add the Compressor Mark-Compact algorithm to MMTK's suite of collectors, starting with a stop-the-world implementation
[RVM-401].

Interested mentors: Richard Jones, Daniel Frampton

The MMTk test harness tools

There are several possible projects here:{

  • Implement a weak memory model, and/or tools for checking data races in
    the simulated memory.
  • Implement a mechanism for running Java applications through the harness.
    The most tractable approach might be to use bytecode rewriting to manage
    the simulated MMTk heap as a shadow of the Java heap.
  • New invariants/checks/tests to detect errors in GC implementations. This could be motivated by a GC implementation project (eg MC^2).

Interested mentors: Robin Garner, Daniel Frampton

Separate Heap For VM objects

In most JVMs there is no confusion between memory allocated for the application and memory allocated by the running of the VM itself (for example a call to malloc() within the JIT). However, in Jikes RVM, the VM and the application are both written in Java. Moreover, they currently share the same heap. It would be very desirable to improve this situation and separately allocate VM and application objects. Aside from cleaner accounting and behaving more like a production JVM, there may be opportunities for performance optimizations since the lifetimes of objects created by the JIT will typically be bounded by the invocation of a single compilation, as an example.

This project would start by identifying all transitions from the application into the VM proper and channeling all such transitions through a zero-cost "trap", which simply serves as a marker. The trap can be viewed as analogous to a kernel trap in the OS setting. The project would also involve writing a simple checking routine which would walk the stack and determine whether execution was currently within the VM or application context. The combination of these mechanisms could then be used to identify and verify all application<->VM transitions.

Interested mentors: Andreas Sewe

Compiler proposals

Multi-core parallel compilation services

This project would add support to Jikes RVM runtime compilation to enable parallel compilation of methods in two different modes. Mode 1 (as at present) - individual methods are compiled sequentially in a single thread. Multiple methods could be compiled in parallel in separate threads. Mode 2 (new feature) - an individual method could be compiled using parallel data flow analysis in multiple threads. This would benefit compilation for large methods. There is an interesting research project in working out the trade-off between these two modes of parallel compilation. The GSoC project would involve generating the infrastructure to carry out this research project.

Interested mentors: Jeremy Singer

Port of the opt compiler to x86_64

The opt compiler is completely ported to ppc64, so this would require a similar piece of work for the x86 code generator. If the baseline compiler is working on x86_64, then the assembler work has already been done, so porting the opt compiler actually should be entirely doable by someone who is comfortable with lowlevel x86/x86_64 debugging.

Interested mentors: Tony Hosking, Eliot Moss

Improved register allocator for x86 and x86_64

The current linear scan allocator works best with register rich architectures. This project would look at providing a better allocator for architectures that provide fewer registers.

Interested mentors: Tony Hosking, Eliot Moss

GIST

The CoGenT project at UMass aims to make it easier to generate compiler components given descriptions of compiler IRs and of target machines. One of its tools is a generator of instruction selectors; we call it GIST. We have GIST somewhat working to generate the target dependent part of the Jikes RVM Baseline compiler.

  • Rounding out automatic generation of the instruction selector ("code generator") for the
    Baseline compiler.
  • Developing automatic generation of BURS rules (which are quite similar to the patterns that
    GIST finds anyway) for the Jikes RVM Opt compiler.

{We would first create code generators for existing targets (x86 and PowerPC) to validate the work, and could then proceed to new target (64-bit x86, and look at new platforms, though a complete platform port has somewhat broader scope).

  • Generating peephole optimizers using GIST is also a possibility, though at this stage perhaps more challenging.
  • Creating a quality "generic" register allocator for the Opt compiler that a GIST-related tool could target would be an excellent project. This could work along the lines of "A Generalized Algorithm for Graph-Coloring Register Allocation" by Smith, Ramsey, and Holloway (PLDI, 2004), possibly with coalescing as well.
  • It would also be attractive to follow a similar plan for generating linear-scan register allocators for the Jikes RVM Opt compiler from machine descriptions.
  • Finally, GIST, augmented with suitable cost models for each target, could be used to generate
    instruction schedulers automatically.

Any of these projects would advance both Jikes RVM and the CoGenT tools.

Interested mentors: Eliot Moss and Chip Weems

Object splitting

This project involves a profiling infrastructure to measure field hotness, changes to the JIT compilers to deal with accesses to cold fields, and changes to the garbage collector(s) to allocate cold fields out of the way of hot objects. Given such an infrastructure, future work could then study alternative implementations, or ideas such as the lazy allocation of the cold fields.

Interested mentor: Matthias Hauswirth

Runtime Services

Extending VM Magic

Jikes RVM uses an extended variant on Java, which includes special types and intrinsic methods that are essential to the internals of the VM but that cannot be expressed in regular Java. For example, we implement an Address type, which corresponds roughly to C's void*. This type has a magical size (the size of an architectural word), and magic semantics (for example you can de reference it and even prefetch on it). None of these semantics could be implemented with regular Java.

An important feature of VM Magic which has not been properly implemented is compound types (roughly equivalent to C's struct types). Such types would be very useful within Jikes RVM because they represent compound types without the overhead and restriction of an object header. This project will implement compound types for Jikes RVM.

A useful reference is the 2009 paper describing VM Magic, including some of these unimplemented features.

Interested mentors: Steve Blackburn, Daniel Frampton, Eliot Moss

Support (a subset of) the JVM Tool Interface (JVMTI)

The JVMTI is large. This project would concentrate on providing support for GC-related JVMTI functionality.

Suggested by: Suriya Subramanian
Interested mentors: Steve Blackburn, Richard Jones, Daniel Frampton, Eliot Moss

Support for jmap, jhat, jstack

This project would provide support for JVM debugging facilities such as jmap which dumps an application's heap, jhat which examines the dumped heap, and jstack which dumps an application's stack.

Suggested by: Suriya Subramanian
Interested mentors: Steve Blackburn, Daniel Frampton

Integration with Maxine VM's Inspector

The Maxine Inspector plays many roles:

  • object, class, and method browser;
  • code views include disassembled machine code, disassembled bytecode, and source;
  • low-level debugger (imagine gdb or dbx) with visual displays of threads, registers, stacks, * stack frames, thread local values, breakpoints, memory watchpoints, etc.;
  • intermediate representation debugger; etc.

This project would provide an interface to the Inspector.

Suggested by: Suriya Subramanian
Interested mentors: Steve Blackburn, Daniel Frampton

Probes for DTrace

DTrace allows user-space code to add so-called USDT (User-level Statically Defined Tracing) probes (see example below). It would be useful to be able to use DTrace to correlate system behavior (e.g. scheduling, thread migration, ...) with behavior of the Jikes RVM (e.g. locking, GC, ...).

Interested mentors: Matthias Hauswirth