Benchmarking Infrastructure for the JVM

When performance is measured, performance improves. When performance is measured and reported, the rate of improvement accelerates. — Thomas Monson

There is a tool called JMH, and it is one of the JVM’s best friends when it comes to performance tracking and analysis. In this post, I will…

  • Describe the pieces of infrastructure I developed around JMH and how you can use them for your own endeavors…
  • Tell the long story of the why... ;)

How and Why — Chapter I

A few years ago - working at Datameer, a big data platform - some customers were concerned about performance. “Why is this and that job taking only five hours in Hive but eight in Datameer?” Variations of that, in different shades. So a small team set out to investigate and address the complaints before they could turn into a real problem. The first thing they did was to craft a benchmark suite which ran data processing jobs on Hadoop, once with Hive and once with Datameer. With that, they were able to compare both products, figure out how to improve Datameer, and we could finally meet the customers’ expectations in terms of performance.

A few months later, I asked, “Hey, where is our performance suite running? And where can I see the report?” The response was something like, “Well, you have to prepare an environment, then trigger the benchmarks manually, and then you can find the performance numbers printed to the logs!”

“Phew, if that suite doesn’t run regularly, and if it doesn’t alert or cheer anyone with some fancy graphs… that thing will rot soon…” Maybe I was just thinking it, maybe I articulated my concern in more polite terms. Anyhow…

I can’t prove that proper reporting and automation would have saved the life of this piece of infrastructure, but eventually it lay dead for a while and got removed from the codebase years later.

How and Why — Chapter II

Two years back - still working at Datameer - we started to develop a new major feature. The feature’s main selling point was performance. It didn’t really bring new functionality to the table, but through its sheer speed it was meant to enable new use cases.

This time i was more involved and, building on the former experience, i had these thoughts:

  • If performance is that crucial, we have to measure the de-facto state of each feature and track losses and gains with each step of development.
  • I want good reporting & proper automation!!

Pretty obvious... There should be proper software out there, I thought…

It didn’t take me long to figure out that JMH is kind of the de-facto tool used by most (and more and more) people to write micro benchmarks on the JVM. While JMH’s strength surely lies in micro benchmarking, I found it good for “integration benchmarking” as well. The pluggable profilers it can run are pretty cool too. They help to find explanations for the observed performance, such as object allocation, JIT work, etc...
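To give an idea of what working with JMH looks like, here is a minimal sketch of a benchmark class (all names are made up for illustration; it needs the JMH dependency on the classpath and is executed by the JMH harness, not standalone). Running it with the GC profiler (`-prof gc`) additionally collects allocation metrics:

```java
package org.example;

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class StringConcatBenchmark {

    private String a;
    private String b;

    @Setup
    public void setup() {
        a = "Hello, ";
        b = "JMH!";
    }

    @Benchmark
    public String concat() {
        // Returning the result prevents the JIT from dead-code-eliminating it.
        return a + b;
    }
}
```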

But in terms of presenting the benchmark results… most of the tooling around JMH lacked what I was wishing for. Gradle plugins were there, but only for mere execution. The existing Jenkins plugins were of great help, but missed depth. An online visualizer did exist… but it was pretty good at hiding, and I didn’t find it at that time. And the printout / result file JMH produces at the end of an execution… well, it’s pretty good if you work on a small set of benchmarks, but once you grow your suite… good luck keeping an overview!

That’s why I set out and developed my own set of tools…

JMH Visualizer — Online Visualizer

Starting with a Gradle plugin that generated HTML and JavaScript based on the benchmark results, I soon figured out that this wouldn’t scale, and that developing a regular webapp with standard tools would bring me much farther much sooner.

So here it is, a mostly JavaScript app, the JMH Visualizer:

Code is at . Assuming you know how to execute JMH benchmarks, you can instruct JMH to output its results as JSON (e.g. java -jar benchmarks.jar -rf json). Simply upload the produced JSON at the JMH Visualizer page and you will get a graphical representation of the results like this:

one benchmark run
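For reference, a typical invocation that also writes the results to a named file might look like this (`benchmarks.jar` stands in for your own JMH uber-jar):

```shell
# Report results as JSON (-rf) and write them to results.json (-rff)
java -jar benchmarks.jar -rf json -rff results.json
```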

Execute the benchmarks a second time and upload both of the JSON files and you will get a compare view:

two benchmark runs

Upload three or more JSON files and you get a line chart:

three or more benchmark runs

Here are a few additional points about the online visualizer:

  • It’s online, but your data stays offline (once the HTML and JavaScript are loaded, you can safely disconnect from the internet and start using the visualizer)
  • You usually write your benchmarks in a class. JMH Visualizer usually bundles the benchmarks from one class into one chart.
  • It tries to make use of most of the provided data, so in addition to the score it visualizes, it gives insights into the score error, iteration results, and secondary metrics like ·gc.alloc.rate.norm.

Loading your reports from Gists or external URLs

If you have posted your benchmark results (in JSON form) anywhere on the web, you can now reference them (instead of uploading). JMH Visualizer understands these 4 parameters: source, sources, gist, gists:
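For instance (assuming the visualizer’s publicly hosted instance at jmh.morethan.io; the gist ID and URL are placeholders), a shareable link could look like this:

```
https://jmh.morethan.io/?gist=<id-of-a-gist-containing-your-results-json>
https://jmh.morethan.io/?source=<direct-url-to-a-results-json>
```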

This is especially helpful if you want to share your results with others, e.g. in an issue tracker!

Gradle JMH Report

Once the online visualizer was built, it was straightforward to wrap it in a Gradle plugin which could then produce a local report for executed benchmarks. You find it here: , with a getting-started guide included in
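As a sketch of how such a report plugin is typically wired into a build (the plugin id is io.morethan.jmhreport; the version and property names below are from memory and may differ, so double-check the getting-started guide):

```groovy
plugins {
    id 'io.morethan.jmhreport' version '0.9.0'
}

jmhReport {
    // Where your benchmark run wrote the JMH JSON result file
    jmhResultPath = project.file('build/reports/jmh/result.json')
    // Where the generated HTML report should go
    jmhReportOutput = project.file('build/reports/jmh')
}
```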

Note: This plugin only generates a visual report based on a benchmark result. It does NOT execute your benchmark suite for you (I’ve written about how to set up execution in JMH with Gradle — From Easy to Simple).

Jenkins Plugin

Finally… automation… I hooked up the online visualizer in a Jenkins plugin as well:

This allows you to run your benchmark suite regularly and spot regressions or improvements easily (even in a big suite).

Above and Beyond

JMH is quite a beast. Writing meaningful benchmarks isn’t as easy as writing good unit tests (and even that is challenging at times). Understanding the inner mechanics of the JVM is often necessary to understand some of the “awkwardnesses” of JMH and to avoid the common pitfalls inherent in micro benchmarking.

I found that visualizing the data JMH produces helped me a great deal to understand JMH and to write acceptable benchmarks. I would love to build on that. It’s probably easy to analyze the benchmarks and yield warnings or tips for the user (like “This benchmark has a high variation, try…”) and thus help other JMH starters... However, due to lack of free time, respectively due to an abundance of ideas for creating software for a better world… I’m not sure how much I’m going to drive this stack forward… So participation is appreciated; contact me through !!

Also, you might be interested in a list of other JMH tools I put together.