The GCspy Heap Visualisation Framework
GCspy is a visualisation framework that allows developers to observe the behaviour of the heap and related data structures. For details of the GCspy model, see GCspy: An adaptable heap visualisation framework by Tony Printezis and Richard Jones, OOPSLA'02. The framework comprises two components that communicate across a socket: a client and a server incorporated into the virtual machine of the system being visualised. The client is usually a visualiser (written in Java) but the framework also provides other tools (for example, to store traces in a compressed file). The GCspy server implementation for Jikes RVM was contributed by Richard Jones of the University of Kent.
GCspy is designed to be independent of the target system. Rather than building in knowledge of any particular system, it requires the GC developer to describe their system in terms of four GCspy abstractions: spaces, streams, tiles and events. This description is transmitted to the visualiser when it connects to the server.
A space is an abstraction of a component of the system; it may represent a memory region, a free-list, a remembered-set or any other structure of interest. Each space is divided into a number of blocks, which the visualiser represents as tiles. Each space has a number of attributes, called streams, such as the amount of space used, the number of objects it contains, the length of a free-list and so on.
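The relationship between spaces, tiles and streams can be pictured with a small data model. This is only a sketch: `SketchSpace` and `SketchStream` are hypothetical names invented for illustration and are not the GCspy classes. It assumes that every stream holds one integer value per tile of its space.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not the GCspy classes.
final class SketchStream {
    final String name;      // e.g. "Used space"
    final int[] tileValues; // one value per tile of the owning space

    SketchStream(String name, int tileCount) {
        this.name = name;
        this.tileValues = new int[tileCount];
    }
}

final class SketchSpace {
    final String name;
    final int tileCount; // number of blocks, each shown as one tile
    final List<SketchStream> streams = new ArrayList<>();

    SketchSpace(String name, int tileCount) {
        this.name = name;
        this.tileCount = tileCount;
    }

    // Every stream of a space covers the same set of tiles.
    SketchStream addStream(String streamName) {
        SketchStream stream = new SketchStream(streamName, tileCount);
        streams.add(stream);
        return stream;
    }
}

final class SpaceModelDemo {
    public static void main(String[] args) {
        SketchSpace space = new SketchSpace("Semispace 0", 4);
        SketchStream used = space.addStream("Used space");
        SketchStream objects = space.addStream("Objects");
        used.tileValues[0] = 4096; // bytes used in the first tile
        objects.tileValues[0] = 12; // objects found in the first tile
        System.out.println(space.name + ": " + space.streams.size()
            + " streams over " + space.tileCount + " tiles");
    }
}
```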
In order to instrument a Jikes RVM collector with GCspy:
- Provide a startGCspyServer method in that collector's plan. That method initialises the GCspy server with the port on which to communicate and a list of event names, instantiates drivers for each space, and then starts the server.
- Gather data from each space for the tiles of each stream (e.g. before, during and after each collection).
- Provide a driver for each space.
Space drivers handle communication between collectors and the GCspy infrastructure by mapping information collected by the memory manager to the space's streams. A typical space driver will:
- Create a GCspy space.
- Create a stream for each attribute of the space.
- Update the tile statistics as the memory manager passes it information.
- Send the tile data along with any summary or control information to the visualiser.
The Jikes RVM SSGCspy plan gives an example of how to instrument a collector. It provides GCspy spaces, streams and drivers for the semi-spaces, the immortal space and the large object space, and also illustrates how performance may be traded for the gathering of more detailed information.
Installation of GCspy with Jikes RVM
The GCspy C server code needs a pthread (created in sys.C) in order to run. So, GCspy will only work on a system where you've built Jikes RVM with config.single.virtual.processor set to 0. The build process will fail if you try to configure such a build.
The GCspy client code makes use of the Java Advanced Imaging (JAI) API. The build system will attempt to download and install the JAI component when required, but this is only supported on the ia32-linux platform. The build system will also attempt to download and install the GCspy server when required.
Building Jikes RVM to use GCspy
To build the Jikes RVM with GCspy support, the configuration parameter config.include.gcspy must be set to 1, as in the BaseBaseSemiSpaceGCspy configuration. You can also have the Jikes RVM build process create a script to start the GCspy client tool if GCspy was built with support for the client component. To achieve this the configuration parameter
config.include.gcspy-client must be set to
The following steps build the Jikes RVM with support for GCspy on the ia32-linux platform.
$ cd $RVM_ROOT
$ ant -Dhost.name=ia32-linux -Dconfig.name=BaseBaseSemiSpaceGCspy -Dconfig.include.gcspy-client=1
It is also possible to build the Jikes RVM with GCSpy support but link it against a fake stub implementation rather than the real GCSpy implementation. This is achieved by setting the configuration parameter
1. This is used in the nightly testing process.
Running Jikes RVM with GCspy
To start Jikes RVM with GCspy enabled, you need to specify the port on which the GCspy server will listen.
$ cd $RVM_ROOT/dist/BaseBaseSemiSpaceGCspy_ia32-linux
$ ./rvm -Xms20m -X:gc:gcspyPort=3000 -X:gc:gcspyWait=true &
Then you need to start the GCspy visualiser client.
$ cd $RVM_ROOT/dist/BaseBaseSemiSpaceGCspy_ia32-linux
After this you can specify the port and host to connect to (i.e. localhost:3000) and click the "Connect" button in the bottom right-hand corner of the visualiser.
Command line arguments
Additional GCspy-related arguments to the
The number of the port on which to connect to the visualiser. The default is port
0, which signifies no connection.
Whether Jikes RVM should wait for a visualiser to connect.
How many KB are represented by one tile. The default value is 128.
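The tile size determines how many tiles the visualiser draws for a region. As a rough illustration, the arithmetic can be sketched as below. `TileCount` and `tilesFor` are hypothetical names, and the sketch assumes that a partly filled final tile still counts as a tile.

```java
final class TileCount {
    /** Number of tiles needed to cover regionBytes when one tile represents tileKB kilobytes. */
    static long tilesFor(long regionBytes, int tileKB) {
        long tileBytes = tileKB * 1024L;
        return (regionBytes + tileBytes - 1) / tileBytes; // round up
    }

    public static void main(String[] args) {
        // A 20 MB region with the default 128 KB tiles.
        System.out.println(tilesFor(20L * 1024 * 1024, 128)); // prints 160
    }
}
```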
Writing GCspy drivers
To instrument a new collector with GCspy, you will probably want to subclass your collector and write new drivers for it. The following sections explain the modifications you need to make and how to write a driver. You may use org.mmtk.plan.semispace.gcspy and its drivers as an example.
The recommended way to instrument a Jikes RVM collector with GCspy is to create a gcspy subdirectory in the directory of the collector being instrumented, e.g. MMTk/src/org/mmtk/plan/semispace/gcspy. In that directory, we need 5 classes:
SSGCspy is the plan for the instrumented collector. It is a subclass of
SSConstraints to provide methods
boolean needsLinearScan() and
boolean withGCspy(), both of which return true.
SSTraceLocal to override methods
willNotMove to ensure that tracing deals properly with GCspy objects: the GCspyTraceLocal file will be similar for any instrumented collector.
The instrumented collector,
SSCollector. It needs to override
SSMutator and must also override its parent's methods
collectionPhase, to allow the allocators to collect data; and its
postAlloc methods to allocate GCspy objects in GCspy's heap space.
SSGCspy.startGCspyServer is called immediately before the "main" method is loaded and run. It initialises the GCspy server with the port on which to communicate, adds event names, instantiates a driver for each space, and then starts the server, forcing the VM to wait for a GCspy client to connect if necessary. This method has the following responsibilities.
- Initialise the GCspy server: server.init(name, portNumber, verbose);
- Add each event to the ServerInterpreter (`server' for short): server.addEvent(eventID, eventName);
- Set some general information about the server (e.g. name of the collector, build, etc.): server.setGeneralInfo(info);
- Create new drivers for each component to be visualised: myDriver = new MyDriver(server, args...);
AbstractDriver and register their space with the ServerInterpreter. In addition to the server, drivers will take as arguments the name of the space, the MMTk space, the tile size, and whether this space is to be the main space in the visualiser.
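The start-up sequence described above can be sketched as follows. This is an illustration under stated assumptions, not the SSGCspy source: `LoggingServer` and `StartupDriver` are hypothetical stand-ins, the event names are invented, and `start(...)` is an assumed name for whatever call actually starts the server. Only `init`, `addEvent` and `setGeneralInfo` follow the calls named in the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the GCspy server. init, addEvent and setGeneralInfo
// follow the calls named in the text; start(...) and the call log are assumptions.
final class LoggingServer {
    final List<String> calls = new ArrayList<>();

    void init(String name, int portNumber, boolean verbose) {
        calls.add("init " + name + " " + portNumber);
    }
    void addEvent(int eventID, String eventName) {
        calls.add("addEvent " + eventID + " " + eventName);
    }
    void setGeneralInfo(String info) {
        calls.add("setGeneralInfo");
    }
    void start(boolean waitForClient) {
        calls.add("start wait=" + waitForClient);
    }
}

// Hypothetical driver; a real driver would extend AbstractDriver.
final class StartupDriver {
    final String spaceName;
    final int tileSize;
    final boolean mainSpace;

    StartupDriver(LoggingServer server, String spaceName, int tileSize, boolean mainSpace) {
        this.spaceName = spaceName;
        this.tileSize = tileSize;
        this.mainSpace = mainSpace;
        server.calls.add("driver " + spaceName); // stands in for registering the space
    }
}

final class StartupSketch {
    static final String[] EVENTS = {"Before collection", "After collection"}; // illustrative names

    static List<StartupDriver> startGCspyServer(LoggingServer server, int port, boolean wait) {
        List<StartupDriver> drivers = new ArrayList<>();
        if (port == 0) {
            return drivers; // port 0 signifies no connection
        }
        server.init("SemiSpace sketch", port, false);
        for (int id = 0; id < EVENTS.length; id++) {
            server.addEvent(id, EVENTS[id]);
        }
        server.setGeneralInfo("Illustrative collector description");
        int tileSize = 128 * 1024; // 128 KB per tile
        drivers.add(new StartupDriver(server, "Semispace 0", tileSize, true));
        drivers.add(new StartupDriver(server, "Semispace 1", tileSize, false));
        server.start(wait); // may block until a client connects
        return drivers;
    }

    public static void main(String[] args) {
        LoggingServer server = new LoggingServer();
        startGCspyServer(server, 3000, true);
        server.calls.forEach(System.out::println);
    }
}
```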
The Collector and Mutator
Instrumenters will typically want to add data collection points before, during and after a collection by overriding
SSGCspyCollector deals with the data in the semi-spaces that has been allocated there (copied) by the collector. It only does any real work at the end of the collector's last tracing phase,
SSGCspyMutator is more complex: as well as gathering data for objects that it allocated in From-space at the start of the PREPARE_MUTATOR phase, it also deals with the immortal and large object spaces.
At a collection point, the collector or mutator will typically
- Return if the GCspy port number is 0 (as no client can be connected).
- Check whether the server is connected at this event. If so, the compensation timer (which discounts the time taken by GCspy to gather the data) should be started before gathering data and stopped after it.
- After gathering the data, have each driver call its
SSGCspyCollector does not call the GCspy server's serverSafepoint method, as the collector phase is usually followed by a mutator phase. Instead, serverSafepoint can be called by SSGCspyMutator to indicate that this is a point at which the server can pause, play one event, etc.
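The order of operations at a collection point can be sketched as below. The `Server` and `Driver` interfaces are hypothetical reductions made for this example, and `gather`, `startCompensationTimer` and `stopCompensationTimer` are assumed names. Whether the safepoint call belongs inside the connection check is also an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class CollectionPointSketch {
    // Hypothetical views of the server and a driver, reduced to what this sketch needs.
    interface Server {
        boolean isConnected(int event);
        void startCompensationTimer();
        void stopCompensationTimer();
        void serverSafepoint(int event);
    }
    interface Driver {
        void gather();
        void transmit(int event);
    }

    static void collectionPoint(int port, int event, Server server,
                                List<Driver> drivers, boolean callSafepoint) {
        if (port == 0) return;                  // no client can be connected
        if (!server.isConnected(event)) return; // nobody is interested in this event
        server.startCompensationTimer();        // discount the time spent gathering
        for (Driver d : drivers) d.gather();
        server.stopCompensationTimer();
        for (Driver d : drivers) d.transmit(event);
        if (callSafepoint) server.serverSafepoint(event); // mutator side only
    }

    /** Records the order of calls so that the sequence can be inspected. */
    static List<String> trace(int port, boolean connected, boolean callSafepoint) {
        List<String> log = new ArrayList<>();
        Server server = new Server() {
            public boolean isConnected(int event) { return connected; }
            public void startCompensationTimer() { log.add("timer start"); }
            public void stopCompensationTimer() { log.add("timer stop"); }
            public void serverSafepoint(int event) { log.add("safepoint"); }
        };
        Driver driver = new Driver() {
            public void gather() { log.add("gather"); }
            public void transmit(int event) { log.add("transmit"); }
        };
        collectionPoint(port, 0, server, Collections.singletonList(driver), callSafepoint);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace(3000, true, true));
    }
}
```

Passing `callSafepoint` as a flag is only a convenience for the sketch; it models the collector skipping the safepoint and the mutator making the call.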
Gathering data will vary from MMTk space to space. It will typically be necessary to resize a space before gathering data. For a space,
- We may need to reset the GCspy driver's data depending on the collection phase.
- We will pass the driver as a call-back to the allocator. The allocator will typically ask the driver to set the range of addresses from which we want to gather data, using the driver's setRange method. The driver should then iterate through its MMTk space, passing a reference to each object found to the driver's scan method.
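A minimal sketch of how a driver might turn scan call-backs into per-tile statistics is given below. It makes several assumptions: addresses are plain `long` values rather than MMTk references, the object size is passed in explicitly, `setRange` also resets the statistics, and an object is counted in the tile that contains its first byte. `ScanSketchDriver` is a hypothetical class, not a GCspy driver.

```java
final class ScanSketchDriver {
    private final long blockSize; // bytes represented by one tile
    private long start;
    private long end;
    private int[] usedBytes = new int[0];
    private int[] objectCount = new int[0];

    ScanSketchDriver(long blockSize) {
        this.blockSize = blockSize;
    }

    /** Sets the address range [start, end) to gather from and resets the tile statistics. */
    void setRange(long start, long end) {
        this.start = start;
        this.end = end;
        int tiles = (int) ((end - start + blockSize - 1) / blockSize);
        usedBytes = new int[tiles];
        objectCount = new int[tiles];
    }

    /** Accounts for one object, spreading its bytes over every tile it overlaps. */
    void scan(long address, int size) {
        if (address < start || address >= end) return; // outside the range of interest
        objectCount[tileOf(address)]++;
        long cursor = address;
        long limit = Math.min(address + size, end);
        while (cursor < limit) {
            int tile = tileOf(cursor);
            long tileEnd = start + (tile + 1L) * blockSize;
            long chunk = Math.min(limit, tileEnd) - cursor;
            usedBytes[tile] += (int) chunk;
            cursor += chunk;
        }
    }

    int tileOf(long address) {
        return (int) ((address - start) / blockSize);
    }

    int usedBytes(int tile) { return usedBytes[tile]; }
    int objectCount(int tile) { return objectCount[tile]; }
    int tileCount() { return usedBytes.length; }

    public static void main(String[] args) {
        ScanSketchDriver driver = new ScanSketchDriver(100);
        driver.setRange(1000, 1300);
        driver.scan(1050, 80); // straddles tiles 0 and 1
        System.out.println(driver.usedBytes(0) + " " + driver.usedBytes(1)); // prints 50 30
    }
}
```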
GCspy space drivers extend AbstractDriver. This class creates a new GCspy ServerSpace and initialises the control values for each tile in the space. Control values indicate whether a tile is used, unused, a background, a separator or a link. The constructor for a typical space driver will:
- Create a GCspy Stream for each attribute of a space.
- Initialise the tile statistics in each stream.
Some drivers may also create a LinearScan object to handle call-backs from the VM as it sweeps the heap (see above).
The chief roles of a driver are to accumulate tile statistics and to transmit the summary data, the control data and the data for all of its streams. Its data gathering interface is the scan method (to which an object reference or address is passed).
When the collector or mutator has finished gathering data, it calls the transmit method of the driver for each space that needs to send its data. Streams may send values of types byte, short or int, implemented through classes
IntStream. A driver's transmit method will typically:
- Determine whether a GCspy client is connected and interested in this event, e.g.
- Set up the summaries for each stream, e.g.
- Set up the control information for each tile, e.g.
controlValues(CONTROL_USED, start, numBlocks);
controlValues(CONTROL_UNUSED, end, remainingBlocks);
- Set up the space information, e.g.
- Send the data for all streams, e.g.
AbstractDriver.send takes care of sending the information for all streams (including control data).
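The two controlValues calls shown above mark a run of used tiles followed by a run of unused ones. A sketch of that idea follows. It assumes one control byte per tile that is simply overwritten; the constant values and the `ControlSketch` class are hypothetical, and the real AbstractDriver may encode control values differently.

```java
import java.util.Arrays;

final class ControlSketch {
    // Hypothetical tag values chosen for this example only.
    static final byte CONTROL_USED = 1;
    static final byte CONTROL_UNUSED = 2;

    private final byte[] control; // one control value per tile

    ControlSketch(int tiles) {
        control = new byte[tiles];
    }

    /** Tags len consecutive tiles, beginning at tile index start. */
    void controlValues(byte tag, int start, int len) {
        Arrays.fill(control, start, start + len, tag);
    }

    byte[] snapshot() {
        return control.clone();
    }

    public static void main(String[] args) {
        ControlSketch sketch = new ControlSketch(5);
        int numBlocks = 3; // tiles currently holding data
        sketch.controlValues(CONTROL_USED, 0, numBlocks);
        sketch.controlValues(CONTROL_UNUSED, numBlocks, 5 - numBlocks);
        System.out.println(Arrays.toString(sketch.snapshot())); // prints [1, 1, 1, 2, 2]
    }
}
```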
Subspace provides a useful abstraction of a contiguous region of a heap, recording its start and end address, the index of its first block, the size of blocks in this space and the number of blocks in the region. In particular, Subspace provides methods to:
- Determine whether an address falls within a subspace;
- Determine the block index of the address;
- Calculate how much space remains in a block after a given address;
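The address arithmetic behind these operations can be sketched as below. `SubspaceSketch` and its method names are hypothetical and chosen for illustration; they are not the Subspace API. Addresses are modelled as `long` values, and the region is taken to be the half-open range from start (inclusive) to end (exclusive).

```java
final class SubspaceSketch {
    private final long start;     // first address of the region
    private final long end;       // one past the last address of the region
    private final int firstIndex; // index of the region's first block
    private final int blockSize;  // bytes per block
    private final int blockNum;   // number of blocks in the region

    SubspaceSketch(long start, long end, int firstIndex, int blockSize) {
        this.start = start;
        this.end = end;
        this.firstIndex = firstIndex;
        this.blockSize = blockSize;
        this.blockNum = (int) ((end - start + blockSize - 1) / blockSize);
    }

    /** Does the address fall within this subspace? */
    boolean contains(long addr) {
        return addr >= start && addr < end;
    }

    /** Index of the block that holds the address. */
    int indexOf(long addr) {
        return firstIndex + (int) ((addr - start) / blockSize);
    }

    /** Bytes left in the address's block, counting from the address itself. */
    int remainingInBlock(long addr) {
        long blockEnd = start + ((addr - start) / blockSize + 1) * blockSize;
        return (int) (Math.min(blockEnd, end) - addr); // the last block may be partial
    }

    int blockCount() {
        return blockNum;
    }

    public static void main(String[] args) {
        SubspaceSketch sub = new SubspaceSketch(1000, 1250, 2, 100);
        System.out.println(sub.indexOf(1150) + " " + sub.remainingInBlock(1150)); // prints 3 50
    }
}
```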