Skip to end of metadata
Go to start of metadata

The GCspy Heap Visualisation Framework

GCspy is a visualisation framework that allows developers to observe the behaviour of the heap and related data structures. For details of the GCspy model, see GCspy: An adaptable heap visualisation frameworkby Tony Printezis and Richard Jones, OOPSLA'02. The framework comprises two components that communicate across a socket: a client and a serverincorporated into the virtual machine of the system being visualised. The client is usually a visualiser (written in Java) but the framework also provides other tools (for example, to store traces in a compressed file). The GCspy server implementation for JikesRVM was contributed by Richard Jones of the University of Kent.

GCspy is designed to be independent of the target system. Instead, it requires the GC developer to describe their system in terms of four GCspy abstractions, spaces, streams, tiles and events. This description is transmitted to the visualiser when it connects to the server.

A space is an abstraction of a component of the system; it may represent a memory region, a free-list, a remembered-set or whatever. Each space is divided into a number of blocks which are represented by the visualiser as tiles. Each space will have a number of attributes -- streams -- such as the amount of space used, the number of objects it contains, the length of a free-list and so on.

In order to instrument a Jikes RVM collector with GCspy:

  1. Provide a startGCspyServer method in that collector's plan. That method initialises the GCspy server with the port on which to communicate and a list of event names, instantiates drivers for each space, and then starts the server.
  2. Gather data from each space for the tiles of each stream (e.g. before, during and after each collection).
  3. Provide a driver for each space.

Space drivers handle communication between collectors and the GCspy infrastructure by mapping information collected by the memory manager to the space's streams. A typical space driver will:

  • Create a GCspy space.
  • Create a stream for each attribute of the space.
  • Update the tile statistics as the memory manager passes it information.
  • Send the tile data along with any summary or control information to the visualiser.

The Jikes RVM SSGCspy plan gives an example of how to instrument a collector. It provides GCspy spaces, streams and drivers for the semi-spaces, the immortal space and the large object space, and also illustrates how performance may be traded for the gathering of more detailed information.

Installation of GCspy with Jikes RVM

System Requirements

The GCspy C server code needs a pthread (created in gcspyStartserver() in sys.C) in order to run. So, GCspy will only work on a system where you've build Jikes RVM with config.single.virtual.processor set to 0. The build process will fail if you try to configure such a build.

Building GCSpy

The GCspy client code makes use of the Java Advanced Imaging (JAI) API. The build system will attempt to download and install the JAI component when required but this is only supported on the ia32-linux platform. The build system will also attempt to download and install the GCSpy server when required.

Building Jikes RVM to use GCspy

To build the Jikes RVM with GCSpy support the configuration parameter config.include.gcspy must be set to 1 such as in the BaseBaseSemiSpaceGCspyconfiguration. You can also have the Jikes RVM build process create a script to start the GCSpy client tool if GCSpy was built with support for client component. To achieve this the configuration parameter config.include.gcspy-client must be set to 1.

The following steps build the Jikes RVM with support for GCSpy on linux-ia32 platform.
$ cd $RVM_ROOT
$ ant -Dhost.name=ia32-linux -Dconfig.name=BaseBaseSemiSpaceGCspy -Dconfig.include.gcspy-client=1

It is also possible to build the Jikes RVM with GCSpy support but link it against a fake stub implementation rather than the real GCSpy implementation. This is achieved by setting the configuration parameter config.include.gcspy-stub to 1. This is used in the nightly testing process.

Running Jikes RVM with GCspy

To start Jikes RVM with GCSpy enabled you need to specify the port the GCSpy server will listen on.
$ cd $RVM_ROOT/dist/BaseBaseSemiSpaceGCspy_ia32-linux
$ ./rvm -Xms20m -X:gc:gcspyPort=3000 -X:gc:gcspyWait=true &

Then you need to start the GCspy visualiser client.
$ cd $RVM_ROOT/dist/BaseBaseSemiSpaceGCspy_ia32-linux
$ ./tools/gcspy/gcspy

After this you can specify the port and host to connect to (i.e. localhost:3000) and click the "Connect" button in the bottom right-hand corner of the visualiser.

Command line arguments

Additional GCspy-related arguments to the rvm command:

  • -X:gc:gcspyPort=<port>
    The number of the port on which to connect to the visualiser. The default is port 0, which signifies no connection.
  • -X:gc:gcspyWait=<true|false>
    Whether Jikes RVM should wait for a visualiser to connect.
  • -X:gc:gcspyTilesize=<size>
    How many KB are represented by one tile. The default value is 128.

Writing GCspy drivers

To instrument a new collector with GCspy, you will probably want to subclass your collector and to write new drivers for it. The following sections explain the modifications you need to make and how to write a driver. You may use org.mmtk.plan.semispace.gcspy and its drivers as an example.

The recommended way to instrument a Jikes RVM collector with GCspy is to create a gcspy subdirectory in the directory of the collector being instrumented, e.g. MMTk/src/org/mmtk/plan/semispace/gcspy. In that directory, we need 5 classes:

  • SSGCspy,
  • SSGCspyCollector,
  • SSGCspyConstraints
  • SSGCspyMutator and
  • SSGCspyTraceLocal.

SSGCspy is the plan for the instrumented collector. It is a subclass of SS.

SSGCspyConstraints extends SSConstraints to provide methodsboolean needsLinearScan() and boolean withGCspy(), both of which return true.

SSGCspyTraceLocal extends SSTraceLocal to override methodstraceObject and willNotMoveto ensure that tracing deals properly with GCspy objects: the GCspyTraceLocal file will be similar for any instrumented collector.

The instrumented collector, SSGCspyCollector, extends SSCollector. It needs to override collectionPhase.

Similarly, SSGCspyMutator extends SSMutator and must also override its parent's methodscollectionPhase, to allow the allocators to collect data; and its alloc and postAlloc methods to allocate GCspy objects in GCspy's heap space.

The Plan

SSGCspy.startGCspyServer is called immediately before the "main" method is loaded and run. It initialises the GCspy server with the port on which to communicate, adds event names, instantiates a driver for each space, and then starts the server, forcing the VM to wait for a GCspy to connect if necessary. This method has the following responsibilities.

  1. Initialise the GCspy server: server.init(name, portNumber, verbose);
  2. Add each event to the ServerInterpreter (`server' for short) server.addEvent(eventID, eventName);
  3. Set some general information about the server (e.g. name of the collector, build, etc) server.setGeneralInfo(info);
  4. Create new drivers for each component to be visualised myDriver = new MyDriver(server, args...);

Drivers extend AbstractDriver and register their space with the ServerInterpreter. In addition to the server, drivers will take as arguments the name of the space, the MMTk space, the tilesize, and whether this space is to be the main space in the visualiser.

The Collector and Mutator

Instrumenters will typically want to add data collection points before, during and after a collection by overriding collectionPhase in SSGCspyCollector and SSGCspyMutator.

SSGCspyCollector deals with the data in the semi-spaces that has been allocated there (copied) by the collector. It only does any real work at the end of the collector's last tracing phase, FORWARD_FINALIZABLE.

SSGCspyMutator is more complex: as well as gathering data for objects that it allocated in From-space at the start of the PREPARE_MUTATOR phase, it also deals with the immortal and large object spaces.

At a collection point, the collector or mutator will typically

  1. Return if the GCspy port number is 0 (as no client can be connected).
  2. Check whether the server is connected at this event. If so, the compensation timer (which discounts the time taken by GCspy to ather the data) should be started before gathering data and stopped after it.
  3. After gathering the data, have each driver call its transmit method.
  4. SSGCspyCollector does not call the GCspy server's serverSafepoint method, as the collector phase is usually followed by a mutator phase. Instead, serverSafepoint can be called by SSGCspyMutator to indicate that this is a point at which the server can pause, play one event, etc.

Gathering data will vary from MMTk space to space. It will typically be necessary to resize a space before gathering data. For a space,

  1. We may need to reset the GCspy driver's data depending on the collection phase.
  2. We will pass the driver as a call-back to the allocator. The allocator will typically ask the driver to set the range of addresses from which we want to gather data, using the driver's setRange method. The driver should then iterate through its MMTk space, passing a reference to each object found to the driver's scan method.

The Driver

GCspy space drivers extend AbstractDriver. This class creates a new GCspy ServerSpace and initializes the control values for each tile in the space. Control values indicate whether a tile is used, unused, a background, a separator or a link. The constructor for a typical space driver will:

  1. Create a GCspy Stream for each attribute of a space.
  2. Initialise the tile statistics in each stream.

Some drivers may also create a LinearScan object to handle call-backs from the VM as it sweeps the heap (see above).

The chief roles of a driver are to accumulate tile statistics, and to transmit the summary and control data and the data for all of their streams. Their data gathering interface is the scan method (to which an object reference or address is passed).

When the collector or mutator has finished gathering data, it calls the transmit of the driver for each space that needs to send its data. Streams may send values of types byte, short or int, implemented through classes ByteStream, ShortStream or IntStream. A driver's transmit method will typically:

  1. Determine whether a GCspy client is connected and interested in this event, e.g. server.isConnected(event)
  2. Setup the summaries for each stream, e.g. stream.setSummary(values...);
  3. Setup the control information for each tile. e.g. controlValues(CONTROL_USED, start, numBlocks);
    controlValues(CONTROL_UNUSED, end, remainingBlocks);
  4. Set up the space information, e.g. setSpace(info);
  5. Send the data for all streams, e.g. send(event, numTiles);

Note that AbstractDriver.send takes care of sending the information for all streams (including control data).

Subspaces

Subspace provides a useful abstraction of a contiguous region of a heap, recording its start and end address, the index of its first block, the size of blocks in this space and the number of blocks in the region. In particular, Subspace provides methods to:

  • Determine whether an address falls within a subspace;
  • Determine the block index of the address;
  • Calculate how much space remains in a block after a given address;
  • No labels