Monday, June 25, 2012

On Stack Replacement in HotSpot JVM

In JVM, JIT Compiler is used to convert bytecode at runtime prior to executing it natively.  Since the JIT does not have time to compile every single method in an Java application, all code starts out initially running in the interpreter, and once it becomes hot enough it gets scheduled for compilation.

The HotSpot VM can perform special compiles called On Stack Replacement compiles, or OSRs.  These are used when Java code contains a long-running loop (see the example here) that started executing in the interpreter.

As described in [1], it is important to know about OSR if you want to benchmark Java programs and have put everything in a loop to be run in a method such as main.  You may run into some pitfalls caused by OSR on your benchmark results.  Without further ado, I'll refer you to read [1] for more details.

On Stack Replacement (OSR)[2]

Normally the way Java code ends up in compiled code is that when invoking a method the interpreter detects that there's compiled code for it, and it dispatches to that instead of staying in the interpreter.  This does not help long-running loops that started in the interpreter since they are not being invoked again.

When a long-running loop is detected at runtime, HotSpot VM requests a compile that starts its execution at the first bytecode of loop instead of starting at the first bytecode in the method.  The resulting generated code takes an interpreter frame as its input and uses that state to begin its execution.  In this way, long-running loops are able to take advantage of compiled code.  The act of the generated code taking an interpreter frame as input to be execution is called On Stack Replace.

How to Detect If OSR Happened?

The diagnostic options -XX:+LogCompilation can emit a structured XML log of compilation related activity during a run of the virtual machine.  Because this is a diagnostic option, to enable it, you need to specify  -XX:+UnlockDiagnosticVMOptions first.  For example, to generate compilation log file, you can specify:

  • -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation -XX:LogFile=/<path to log>/logs/<name of log>.log 
From the log file, you can search "osr" to see if any loop in your application has been compiled:

<task_queued compile_id='1' compile_kind='osr' 
  method='java/util/jar/JarFile hasClassPathAttribute ()Z' bytes='275' 
  count='223' backedge_count='57584' iicount='223' osr_bci='157' stamp='0.233' 
  comment='tiered' hot_count='57584'/>
<nmethod compile_id='1' compile_kind='osr' compiler='C2' level='4' 
  entry='0x00002aaaab4d4d40' size='1576' address='0x00002aaaab4d4bd0' 
  relocation_offset='288' insts_offset='368' stub_offset='1072' 
  scopes_data_offset='1120' scopes_pcs_offset='1352' dependencies_offset='1512' 
  nul_chk_table_offset='1520' oops_offset='1096'
  method='java/util/jar/JarFile hasClassPathAttribute ()Z' bytes='275' 
  count='241' backedge_count='62159' iicount='241' stamp='0.237'/>


Every compile is assigned a compile id by the system at the point is enqueued and that's recorded as the compile_id attribute. The above example shows an on stack replacement (OSR) compile where the code is going to be used to replace an already existing activation. It is tagged by the compile_kind attribute as:
  •  compile_kind='osr'
The 'method' attribute is a string version of the VM name of the method with spaces separating the class, method name signature.

The 'bytes' attribute is number of bytecodes in the method.

'count' is the invocation count as recorded by the method invocation counters. Note that these counters are mainly used for triggering compiles and are not guaranteed to be an accurate reflection of the number of times a method has actually executed. Multiple threads may be updated these counters and sometimes the VM will reset a counter to a lower value to delay retriggering of compiles.

'iicount' is the interpreter invocation count. This is a separate copy of the invocation count which is maintained by the profiling support. Again it's not guaranteed to be accurate since multiple threads may update it but it's never reset so it's reasonably accurate.

The 'backedge_count' attribute is used to detect methods that contain loops and to cause them to get compiled earlier than they would with just an invocation counter. Whenever this counter is incremented by the interpreter it checks it against a threshold, and if it crosses this threshold the interpreter requests a compile of that loop.

'stamp' gives a timestamp for the start time.  Many elements contain time stamps as the end which can be used to order them relative to events in other threads and the measure elapsed time. The time stamp is in seconds since the start of the VM and the start time of the VM is recorded in the hotspot_log element in the time_ms attribute.

References

  1. Robust Java benchmarking, Part 1: Issues 
  2. Java Performance by Charlie Hunt and Binu John
  3. LogCompilation Overview

No comments: