This section provides some tips on collecting performance numbers with Jikes RVM.
To make a long story short, the best-performing configuration of Jikes RVM will almost always be production. Unless you really know what you are doing, don't use any other configuration to do a performance evaluation of Jikes RVM.
Any boot image you use for performance evaluation must have the following characteristics for the results to be meaningful:
For best performance we recommend the following:
-X:processors=all: By default, Jikes™ RVM uses only one processor for garbage collection. Setting this option tells the garbage collection system to utilize all available processors.
The compiler-replay methodology is deterministic and eliminates memory allocation and mutator variations due to non-deterministic application of the adaptive compiler. We need this methodology because the non-determinism of the adaptive compilation system makes it a difficult platform for detailed performance studies. For example, we cannot determine whether a variation is due to the system change being studied or just to a different application of the adaptive compiler. The information we record and use is hot method and hot block information. We also record the dynamic call graph, with the calling frequency on each edge, for inlining decisions.
Note that in December 2011, compiler replay was significantly improved. The notes below apply to the post-December 2011 version of replay.
Here is how to use it:
There are three kinds of advice used by the replay system, each of which is workload-specific (i.e. you should generate advice files for each benchmark):
One way to gather advice is to execute the benchmark multiple times under controlled settings, producing profiles at each execution. Then establish the fastest execution among the set of runs, and choose the profiles associated with that execution as the advice files. A common methodology is to invoke each benchmark 20 times (i.e. take the best invocation from a set of 20 trials) and, in each invocation, run 10 iterations of the benchmark (so that the advice captures the warmed-up, steady state of the benchmark).
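The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Invocation record, its field names, the directory names and the timings are all hypothetical, and "fastest execution" is taken here to mean the fastest final (warmed-up) iteration, which the text does not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Invocation:
    """One benchmark invocation and where its profiles were written (hypothetical record)."""
    index: int
    final_iteration_ms: float  # time of the last, warmed-up iteration (assumed metric)
    advice_dir: str            # hypothetical location of this invocation's advice files


def pick_best_invocation(invocations: List[Invocation]) -> Invocation:
    """Return the invocation with the fastest final iteration.

    Ties go to the earliest invocation, because min() keeps the first minimum.
    """
    if not invocations:
        raise ValueError("need at least one invocation")
    return min(invocations, key=lambda inv: inv.final_iteration_ms)


# Made-up timings for three of the trials.
trials = [
    Invocation(0, 1520.0, "advice/run00"),
    Invocation(1, 1475.5, "advice/run01"),
    Invocation(2, 1498.2, "advice/run02"),
]
best = pick_best_invocation(trials)
print(best.advice_dir)  # prints advice/run01
```

The profiles in the chosen directory would then be kept as the advice files for that benchmark, and the others discarded.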
When generating the advice, you will need to use the following command line arguments (typically use all six arguments, so that all three advice files are generated at each invocation):
The basic model is simple. At a nominated time in the execution of a program, all methods specified in the .ca advice file will be (re)compiled with the compiler and optimization level nominated in the advice file. Broadly, there are two ways of initiating bulk compilation: a) by calling the method …ompileAllMethods() during execution, and b) by using the -X:aos:enable_precompile=true flag at the command line to trigger bulk compilation at boot time. A standard methodology is to use a benchmark harness callback mechanism to call …Methods() at the end of the first iteration of the benchmark. At the time of writing, this 'warmup' approach gave performance roughly 2% faster than the 10th iteration of regular adaptive compilation. Because precompilation occurs early, the compiler has less information about the classes, and in consequence the performance of precompilation is about 9% slower than the 10th iteration of adaptive compilation.
For 'warmup' replay (where …hods() is called at the end of the first iteration):
For precompile replay (where bulk compilation occurs at boot time):
You can alter the verbosity of the replay behavior with the flag -X:aos:bulk_compilation_verbosity, which by default (0) is silent, but will produce more information about the recompilation with values of 1 or 2.
MMTk includes a statistics subsystem and a harness mechanism for measuring its performance. If you are using the DaCapo benchmarks, the MMTk harness can be invoked using the '-c MMTkCallback' command line option, but for other benchmarks you will need to invoke the harness by calling the static methods
at the appropriate places. Other command line switches that affect the collection of statistics are
Print statistics for each mutator/gc phase during the run
Print statistics in an XML format (as opposed to human-readable format)
This is incompatible with MMTk's statistics system.
Disable dynamic resizing of the heap
Unless you are specifically researching flexible heap sizes, it is best to run benchmarks in a fixed size heap, using a range of heap sizes to produce a curve that reflects the space-time tradeoff. Using replay compilation and measuring the second iteration of a benchmark is a good way to produce results with low noise.
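As a rough illustration of choosing a range of heap sizes, the following Python sketch scales a benchmark's minimum heap size by a set of multipliers. The minimum heap size and the multipliers are hypothetical values chosen for the example, not recommendations from this guide. Each resulting size would then be used as the fixed heap for one set of runs.

```python
import math
from typing import List


def heap_sizes_mb(min_heap_mb: int, multipliers: List[float]) -> List[int]:
    """Scale a benchmark's minimum heap size by each multiplier.

    Results are rounded up to whole megabytes so that no size falls
    below the requested multiple of the minimum heap.
    """
    if min_heap_mb <= 0:
        raise ValueError("min_heap_mb must be positive")
    return [math.ceil(min_heap_mb * m) for m in multipliers]


# Hypothetical benchmark that needs at least 40 MB to run to completion.
sizes = heap_sizes_mb(40, [1.5, 2.0, 3.0, 6.0])
print(sizes)  # prints [60, 80, 120, 240]
```

Plotting the measured time at each heap size gives one point per size on the space-time curve.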
There is an active debate among memory management and VM researchers about how best to measure performance, and this section is not meant to dictate or advocate any particular position, simply to describe one particular methodology.
Perhaps you are not seeing stellar Jikes™ RVM performance. If Jikes RVM as described above is not competitive with product JVMs, we recommend you test your installation with the DaCapo benchmarks. We expect Jikes RVM performance to be very close to Sun's HotSpot 1.5 server running the DaCapo benchmarks. Of course, running DaCapo well does not guarantee that Jikes RVM runs all codes well.
Some kinds of code will not run fast on Jikes RVM. Known issues include:
The Jikes RVM developers wish to ensure that Jikes RVM delivers competitive performance. If you can isolate reproducible performance problems, please let us know.
Jikes RVM is not as stable as commercial JVMs such as HotSpot or J9. Design your evaluation systems (e.g. scripts) so that they can deal with crashes and deadlocks/livelocks. The latter can be dealt with by running Jikes RVM with a time limit. For example, if you are using Linux and shell scripts, you can use the timelimit program to terminate Jikes RVM after a set time.
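If your evaluation scripts are written in Python rather than shell, the same idea can be sketched with the standard subprocess module. This is a minimal sketch, not part of Jikes RVM: the function name and the three status strings are made up for illustration, and the demonstration launches small Python child processes as stand-ins for a real benchmark command.

```python
import subprocess
import sys
from typing import List


def run_with_timelimit(cmd: List[str], limit_s: float) -> str:
    """Run cmd and classify the outcome as 'ok', 'crash' or 'timeout'.

    subprocess.run() kills the child process and raises TimeoutExpired
    when the limit is exceeded, which covers deadlocks and livelocks.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=limit_s,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    # A non-zero exit status is treated as a crash.
    return "ok" if result.returncode == 0 else "crash"


# Stand-in commands; a real script would pass its own benchmark command line.
print(run_with_timelimit([sys.executable, "-c", "pass"], 30))
print(run_with_timelimit([sys.executable, "-c", "import sys; sys.exit(3)"], 30))
print(run_with_timelimit([sys.executable, "-c", "import time; time.sleep(30)"], 1))
```

Recording the status for each invocation lets the script retry or skip failed runs instead of stalling the whole experiment.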