The Jikes RVM project contains several test runs, each with a different purpose. This document attempts to capture the purpose of each test run.
Red test run
This test run MUST be run prior to committing code. It is relatively short and is designed to capture as many potential bugs as possible in the shortest possible time. It is expected that the red test run will take 15-20 minutes on a modern Intel architecture.
Green test run
There is a set of workloads we consider important (i.e. dacapo and SPEC*). There is a set of build configurations we consider important (i.e. prototype, development, production). We as a group wish to guarantee that all important workloads will run correctly on all important build configurations (i.e. we should NEVER regress). The green test run is designed to identify as early as possible any failures in this matrix of build configuration x workload. It is run continuously, 24 hours a day (or at least every time a change is made). It is expected that the green test run will take 2-6 hours to complete, depending on the environment.
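As a rough illustration of the build configuration x workload matrix described above, the following Java sketch enumerates every cell that a green run would need to pass. The class name GreenMatrix and the method cells are hypothetical and are not part of the Jikes RVM test harness; the configuration and workload names are simply the ones mentioned in the text.

```java
import java.util.ArrayList;
import java.util.List;

public class GreenMatrix {
    // Build configurations named in the text above.
    static final List<String> CONFIGS = List.of("prototype", "development", "production");

    // Workload names taken from the text; "SPEC*" stands for the SPEC family
    // and is kept as a placeholder rather than expanded into specific suites.
    static final List<String> WORKLOADS = List.of("dacapo", "SPEC*");

    // Returns one entry per (build configuration, workload) pair.
    static List<String> cells() {
        List<String> cells = new ArrayList<>();
        for (String config : CONFIGS) {
            for (String workload : WORKLOADS) {
                cells.add(config + " x " + workload);
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        // Every printed cell must pass for the green run to count as passing.
        cells().forEach(System.out::println);
    }
}
```

The point of the sketch is only that the matrix grows as the product of the two sets, which is one reason the green run takes hours rather than minutes.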
The best way to identify the failures is to stress test the system by forcing frequent garbage collections and compilation at specific optimization levels (and perhaps frequent thread switching and frequent OSR events in the future). It is critical that we have a stable research base, so intermittent failures are NOT acceptable. If we cannot pass a stress test, then there is no guarantee that we have a stable research base.
Blue test run
The blue test run covers a larger number of build configurations and workloads. It may not always pass, and it may test many of the less frequently used configurations (gctrace, gcspy, and individual stress tests) and less important workloads. Performance tests are also included in this test run. We use it to gauge the health of the project as a whole and to track regressions. It is run once a day on major platforms. The time to complete can vary but is expected to be several hours at least.
Rainbow test runs
This is not a single test run but a set of test runs that are used for testing specific aspects of the system, such as performance, gcmap bug finding, and IO hammering. There may also be a set of personal/site-specific test runs included in this set that are not checked into the Mercurial repository.
We must NEVER regress in the green test run. The red test run attempts to ensure there are no green regressions while keeping running time reasonable. The blue test run gives us an overall picture of the health of the code base, while the rainbow test runs are used at different times for different purposes.