Ensuring System Performance
When building enterprise software, it is important to be able to test performance and scalability. As more and more components for enterprise software move into the Open Source realm, there is a real need for benchmarks that compare the different projects with one another. Thinking about this problem, I came to believe that building a set of Open Source performance benchmarks would fix many of the problems that current benchmarks run into. As I see it, there are three major problems with benchmarks, all of which I feel can be fixed by "Open Sourcing" the process:
- Benchmarks can be irrelevant. Benchmarks are often built by either vendors or committees that don't include enough customers. I propose that by being more collaborative about building the set of benchmarks, and by being able to change them quickly when we find a shortcoming, we can build a set of tests that better describes the real-world performance of different open source projects.
- Benchmarks can be hard to get real information from. Currently, benchmarks are mainly run by vendors. This results in a couple of problems with transparency. Firstly, vendors can release only the results they want to. Secondly, benchmarks often try to distill the performance of a very complex scenario into one number. This creates a very black-and-white way for vendors to compete, but does not give the customer a good way to find out where each product is strong. For example, the TPC-C suite runs a large number of selects, updates, joins, etc. In the end, you get only two numbers to judge how the products stood up: throughput (tpmC) and price/performance ($/tpmC). You don't get any good information on whether, say, product 1 wins on updates while product 2 wins on queries. By adding more visibility into the process and making tools that allow customers to run the benchmarks in their own environments, we can make benchmarks a better way to understand the trade-offs between competing projects.
- Vendors can add benchmark-specific performance enhancements. It's not unheard of for vendors to add performance fixes to their products that work really well for a specific scenario in a benchmark, but will practically never matter in a real scenario. In the Open Source world, this becomes untenable. Firstly, an open source project would not add this type of code, because the entire world would be able to see that it is cheating. Secondly, for traditional vendors who get involved, the fast-moving nature of open source projects would make it less worthwhile to add benchmark-specific fixes. As the benchmark keeps getting refined, and as more customer feedback is folded into it, benchmark-specific fixes are likely to become irrelevant quickly.
Project Bluegrass is going to help address these problems by building the tools and infrastructure necessary to create and run system-wide performance tests.
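To make the second problem above more concrete, here is a minimal Java sketch of a recorder that keeps timings per operation type instead of collapsing everything into one score. It is only an illustration: the class and method names (`OperationTimings`, `record`, `meanNanos`, `report`) are hypothetical and are not part of Project Bluegrass or of any existing benchmark suite.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical recorder: keeps samples per operation type rather than one aggregate score. */
final class OperationTimings {
    // Insertion-ordered so the report lists operations in the order they were first seen.
    private final Map<String, List<Long>> nanosByOperation = new LinkedHashMap<>();

    /** Records one measured duration, in nanoseconds, for the named operation. */
    void record(String operation, long nanos) {
        nanosByOperation.computeIfAbsent(operation, k -> new ArrayList<>()).add(nanos);
    }

    /** Returns how many samples were recorded for the operation (0 if none). */
    int count(String operation) {
        List<Long> samples = nanosByOperation.get(operation);
        return samples == null ? 0 : samples.size();
    }

    /** Returns the mean duration in nanoseconds for the operation. */
    double meanNanos(String operation) {
        List<Long> samples = nanosByOperation.get(operation);
        if (samples == null || samples.isEmpty()) {
            throw new IllegalArgumentException("no samples for " + operation);
        }
        long total = 0;
        for (long sample : samples) {
            total += sample;
        }
        return (double) total / samples.size();
    }

    /** Builds one line per operation, so results can be compared operation by operation. */
    String report() {
        StringBuilder out = new StringBuilder();
        for (String operation : nanosByOperation.keySet()) {
            out.append(String.format(Locale.ROOT, "%s: samples=%d mean=%.1f ns%n",
                    operation, count(operation), meanNanos(operation)));
        }
        return out.toString();
    }
}
```

With made-up samples of 1000 ns and 3000 ns recorded under "update", `meanNanos("update")` is 2000.0, and the report keeps a separate line for "select". The same idea would let a customer see that one product wins on updates while another wins on queries.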
The Approach
I feel that it is quite dangerous to start building a framework like this without a concrete problem to solve along the way. So rather than building a framework first and then trying to build tests on top of it, we want to start by solving the problem of creating good benchmarks for Open Source JMS providers, and to solve it in a sufficiently pluggable manner that we can reuse the same infrastructure to run tests against any facet of a J2EE server (or, in fact, any other application server).
I decided that JMS was a good place to start after reading a thread on The Server Side (http://www.theserverside.com/news/thread.tss?thread_id=28728). I talked with James a bit about this topic, and we both came away with the idea that the hard part of running system performance suites is often not figuring out which APIs to call and measure. It has more to do with laying out a system and being able to pound on the interesting aspects, and with being able to easily change parameters such as the number of clients hitting the JMS server or the number of machines processing the messages.
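As a rough sketch of what "pluggable" and "easy to change parameters" could look like, the Java below separates the system under test (a `Workload`) from the load parameters (a `LoadPlan`). All of these names, including `LoadRunner` and `LoadResult`, are assumptions made for illustration; they do not describe the actual Bluegrass design. Only the number of clients is modelled here, and distributing load across machines is out of scope.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical plug-in point: one implementation per system under test. */
interface Workload {
    /** Performs a single unit of work, for example sending one message. */
    void runOnce() throws Exception;
}

/** Hypothetical load parameters that can be changed without touching the workload. */
final class LoadPlan {
    final int clients;
    final int operationsPerClient;

    LoadPlan(int clients, int operationsPerClient) {
        if (clients < 1 || operationsPerClient < 0) {
            throw new IllegalArgumentException("need clients >= 1 and operationsPerClient >= 0");
        }
        this.clients = clients;
        this.operationsPerClient = operationsPerClient;
    }
}

/** Outcome of one run: how many operations completed and how long the run took. */
final class LoadResult {
    final long operations;
    final long elapsedNanos;

    LoadResult(long operations, long elapsedNanos) {
        this.operations = operations;
        this.elapsedNanos = elapsedNanos;
    }

    /** Throughput over the whole run; infinite if the elapsed time measured as zero. */
    double operationsPerSecond() {
        return operations / (elapsedNanos / 1e9);
    }
}

/** Runs the workload from several concurrent clients, one thread per client. */
final class LoadRunner {
    static LoadResult run(LoadPlan plan, Workload workload) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(plan.clients);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int c = 0; c < plan.clients; c++) {
                tasks.add(() -> {
                    for (int i = 0; i < plan.operationsPerClient; i++) {
                        workload.runOnce();
                    }
                    return plan.operationsPerClient;
                });
            }
            long start = System.nanoTime();
            long completed = 0;
            // invokeAll blocks until every client has finished; get() rethrows any failure.
            for (Future<Integer> client : pool.invokeAll(tasks)) {
                completed += client.get();
            }
            return new LoadResult(completed, System.nanoTime() - start);
        } finally {
            pool.shutdown();
        }
    }
}
```

Under this layout, a JMS-specific `Workload` would wrap a call such as `MessageProducer.send`, while going from 4 to 40 clients only means constructing a different `LoadPlan`. The same runner could then be pointed at a workload for some other part of an application server.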
