The Realities of Java Performance Profiling
Professional Java Performance Profiling In Practice
When instrumenting, measuring, and collecting data on software performance behavior, it is critical to keep in mind that a performance benchmark test and the resulting profile can yield many possible realities. Some are more relevant, useful, and accurate than others.
When we instrument, there is a trade-off between coverage and overhead cost. The greater the coverage, the greater the overhead, and the less relevant and accurate the collected performance model. Each degree of code coverage yields a new reality, sometimes completely replacing and revising a previously perceived, collected model of reality.
Many application performance monitoring products, including AppDynamics, dynaTrace and NewRelic, tackle this problem by being very selective in their instrumentation, limiting it to well-defined methods within a framework.
Selective instrumentation is another form of reality distortion. In some cases, it can be the worst form as it forces a decision on instrumentation to be made prior to any degree of knowledge acquisition and understanding.
Ideally, a monitoring runtime needs to be able to remove instrumentation on the fly, based on intelligent and learned analysis of the software's execution behavior. When that's not possible, or practical, the next best thing is for the measurement layer, e.g. the metering engine in Satoris, to disable measurement when instrumentation hooks fire. Various policies, or strategies, can be employed to decide whether to measure a particular instrumentation firing, taking into account factors such as cost, value, and context, e.g. the path of execution, a flow, through some code.
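As a rough illustration of a cost-based disablement policy, here is a minimal sketch, assuming a hypothetical `AdaptiveProbe` class (not the Satoris API): after a warm-up number of firings, the probe disables its own measurement when the method's average measured cost stays below an assumed threshold.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a self-disabling measurement hook (not the Satoris API).
// After a warm-up of firings, the probe stops measuring a method whose average
// measured cost falls below a configured, assumed threshold.
final class AdaptiveProbe {
    private static final long COST_THRESHOLD_NANOS = 10_000; // assumed policy value
    private static final int  WARMUP_FIRINGS       = 1_000;  // assumed policy value

    private final AtomicBoolean enabled = new AtomicBoolean(true);
    private long totalNanos;
    private int  firings;

    // Called by the instrumentation hook on method entry; 0 means "not measuring".
    long begin() {
        return enabled.get() ? System.nanoTime() : 0L;
    }

    // Called on method exit; accumulates cost and may disable further measurement.
    synchronized void end(long startNanos) {
        if (startNanos == 0L) return;        // measurement already disabled
        totalNanos += System.nanoTime() - startNanos;
        if (++firings >= WARMUP_FIRINGS
                && totalNanos / firings < COST_THRESHOLD_NANOS) {
            enabled.set(false);              // cheap method: stop measuring it
        }
    }

    boolean isEnabled() {
        return enabled.get();
    }
}
```

A real metering engine would of course fold many more factors into the decision, but the shape is the same: the hook itself becomes the cheapest possible no-op once the measurement has been judged not worth its cost.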
Finally, when a measurement has indeed taken place and an event has been created, the monitoring runtime needs to be able to inspect and assess the value of the measurement event, and then intelligently decide whether additional contextual information should be collected for that event. Looking at the measurement value, should the runtime create and update some additional aggregation structure? Does the value have value? For the performance engineer, the task should be to steer the system to make the most appropriate decision at runtime — as it happens.
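One way to picture this value-gated collection is the following sketch, assuming a hypothetical `ValueGatedCollector` (illustrative only, not the Satoris API): costlier context, here a stack trace, is collected only when the event's value stands out against a running baseline.

```java
// Hypothetical sketch (not the Satoris API): after a measurement fires, the
// runtime inspects the value and only collects costlier context (here a stack
// trace) when the event looks interesting relative to a running mean.
final class ValueGatedCollector {
    private static final double INTEREST_FACTOR = 2.0; // assumed: 2x the running mean
    private static final long   MIN_SAMPLES     = 100; // assumed warm-up count

    private double meanNanos;
    private long   count;

    /** Returns extra context only when the event's value warrants the cost. */
    synchronized StackTraceElement[] onMeasurement(long elapsedNanos) {
        count++;
        meanNanos += (elapsedNanos - meanNanos) / count; // incremental running mean
        boolean interesting = count > MIN_SAMPLES
                && elapsedNanos > meanNanos * INTEREST_FACTOR;
        return interesting ? Thread.currentThread().getStackTrace() : null;
    }
}
```

The point is not the particular heuristic but the placement of the decision: the judgment about whether an event deserves further expenditure is made at runtime, per event, rather than fixed up front.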
Back in 2008, when I created the metering engine underlying Satoris, Stenos, Simz, and Sentris, I envisaged replacing the human entirely. But I soon realized that there would always be a need for a performance engineer, to direct effort as well as to manage the change that follows from such work. The goal for tooling should be to extend, to augment, the capabilities of the user: to play to their strengths, allowing more time for thinking and experimenting with effective data!
After many years I’ve formulated a number of simple rules, habits, and techniques to reduce the many possible observable performance realities. Here are a few of the more important ones when using a tool like Satoris:
- Focus on instrumenting code that can be changed — ignore libraries
- Repeatedly execute a set of long-running performance benchmark tests
- Employ one or more adaptive performance measurement extensions
- Steer the measurement system with a few, mostly cost-based, thresholds
- Assess the stability of a benchmark over multiple unchanged executions
- Investigate, understand and possibly minimize benchmark variability
- Refine code coverage based on previous and ongoing hotspot analysis
- Resist the urge to capture additional data until there’s stability in the reality
- Experiment with sets of the cost-based thresholds — from high to low
- Attentively watch the execution of certain probes during benchmark run
- Prioritize code inspection using the exclusive time to rank probe hotspots
- Introduce delay into detected hotspots to determine the potential impact
- Track system performance over changes in code, usage, and configuration
- Identify the method hotspots before looking to understand the call paths
- Tune the top few hotspots and repeat before tackling all the others
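The delay-injection technique above can be sketched as a small experiment. This is a minimal illustration with hypothetical names (`DelayExperiment`, `injectDelay`, `impactRatio` are illustrative, not a Satoris API): run the benchmark as-is, then again with an artificial delay injected at the suspected hotspot. If the end-to-end time grows by roughly the injected amount, the hotspot sits on the critical path; if it barely moves, tuning it will not pay off.

```java
// Hypothetical sketch of the "introduce delay into detected hotspots" technique.
final class DelayExperiment {

    // Busy-wait so the injected delay is not rounded up by scheduler granularity.
    static void injectDelay(long delayNanos) {
        long end = System.nanoTime() + delayNanos;
        while (System.nanoTime() < end) { /* spin */ }
    }

    // Times a single run of the workload in nanoseconds.
    static long time(Runnable workload) {
        long start = System.nanoTime();
        workload.run();
        return System.nanoTime() - start;
    }

    // Ratio of delayed to baseline run time; a value near 1.0 means the
    // suspected hotspot contributes little to the end-to-end result.
    static double impactRatio(Runnable baseline, Runnable delayed) {
        return (double) time(delayed) / time(baseline);
    }
}
```

In practice the delayed variant would be produced by the instrumentation itself rather than by hand-editing code, and the run should be repeated enough times for the ratio to stabilize before drawing any conclusion.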