Analyzer produces incorrect lvt types for jsr return targets when jsr frame changes twice during analysis
I'm using ASM to perform some analyses that are very sensitive to type
information in the local variable table. Tracking down a problem with the
results of one of the analyses led me to find a rare bug in ASM.
The bug happens when running BasicInterpreter (and thus Analyzer) on the class
org.eclipse.core.runtime.internal.adaptor.BundleStopper. The problem happens
when you have a JSR instruction that can be reached from two different control
flow paths, one path with more active LVT variables than the other path.
Consider the following (a simplification of the problematic code from
BundleStopper.basicStopBundles()):
    for (...) {
        try {
            Object o1 = new Object();
        } catch (Exception e) {
            Object o1 = new Object();
            Object o2 = new Object();
            Object o3 = new Object();
        } finally {
            System.out.println("hi");
        }
    }
This gets compiled such that the end of the try block and the end of the catch
block both JUMP to a label, which is immediately followed by a JSR to the
finally block (the compiler could also have put a separate JSR at the end of
each block, which would have avoided the bug, but for whatever reason it chose
to JUMP at the end of each block and then do the JSR).
Now consider what happens if Analyzer happens to process the catch block in the
queue before the try block. The catch block uses more LVT slots, and so when it
JUMPs to the label right before the JSR, initially that label will be populated
with type information for all of those LVT slots.
Then let's say that the JSR subroutine happens to be processed next in the
queue. When it reaches the RET instruction at the end of the subroutine, it
will combine its current frame with the frame of the calling instruction
(apart from variables accessed in the subroutine) and merge the result into
the return target (the instruction following the JSR instruction). At this
point the calling JSR instruction still has a state with the extra LVT
variables, and so it propagates this to the next instruction (the return
target, which happens to be a label).
Eventually the queue reaches the try block, and at the end of the try block we
merge our LVT state with the JUMP target, which reduces the number of valid LVT
slots at the JSR instruction. The JSR instruction gets processed again, but
there are no changes to be merged into the subroutine (by this point it already
had the smaller set of LVT slots), and so the analysis terminates.
However, this leaves the return target of the JSR instruction (the instruction
after the JSR instruction) with an LVT that has too many variables in it.
The key insight to fixing this is to realize that anytime the LVT of a JSR
instruction changes, we must add to the queue the RET instructions of the
relevant subroutine.
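
The event order described above can be replayed in a small toy model. This is
only a sketch, not ASM code: frames are plain string arrays, "R" stands for a
reference type and "." for an unusable slot, the slot-wise merge is an assumed
simplification of the interpreter's merge, and the subroutine is assumed to
touch none of the tracked slots. All names are hypothetical.

```java
import java.util.Arrays;

/** Toy model of the stale return-target frame; hypothetical, not ASM code. */
class StaleReturnTargetSketch {
    static final String UNUSABLE = ".";

    /** Slot-wise merge: slots whose types differ collapse to UNUSABLE. */
    static String[] merge(String[] old, String[] incoming) {
        if (old == null) {
            return incoming.clone(); // first visit: take the incoming frame
        }
        String[] out = new String[old.length];
        for (int i = 0; i < old.length; i++) {
            out[i] = old[i].equals(incoming[i]) ? old[i] : UNUSABLE;
        }
        return out;
    }

    /** Replays the event order from the report; returns the return-target frame. */
    static String[] analyze(boolean requeueRetWhenJsrChanges) {
        String[] catchPath = {"R", "R", "R", "R"}; // path with more typed slots
        String[] tryPath = {"R", "R", ".", "."};   // path with fewer typed slots
        String[] jsrFrame = null;     // frame at the JSR instruction
        String[] returnTarget = null; // frame at the instruction after the JSR

        // 1. The catch block is analyzed first and jumps to the JSR.
        jsrFrame = merge(jsrFrame, catchPath);
        // 2. The subroutine reaches RET and passes the caller's slots on.
        returnTarget = merge(returnTarget, jsrFrame);
        // 3. The try block is analyzed later and narrows the frame at the JSR.
        jsrFrame = merge(jsrFrame, tryPath);
        // 4. The JSR is revisited. Nothing changes inside the subroutine, so
        //    the RET is only revisited if it is explicitly re-queued.
        if (requeueRetWhenJsrChanges) {
            returnTarget = merge(returnTarget, jsrFrame);
        }
        return returnTarget;
    }

    public static void main(String[] args) {
        System.out.println("without re-queue: " + Arrays.toString(analyze(false)));
        System.out.println("with re-queue:    " + Arrays.toString(analyze(true)));
    }
}
```

In this model the return target keeps all four typed slots when the RET is not
re-queued, and the two extra slots become unusable when it is.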
I have implemented this in the attached patch, and verified that it works
correctly. The patch uses a map to track the RET instruction(s) that have
already been processed and that jump to a given JSR return target. When a JSR
is processed in the queue, it always adds these RET instructions to the queue
as well (even if no changes were caused by the JSR instruction itself: if we
are processing the JSR instruction, something caused its state to change, so
we just propagate this one step further).
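
The bookkeeping might look roughly like the following. This is a minimal
sketch and not the patch itself: the class, the field names, the use of
instruction indices as keys, and the assumption that the return target is the
instruction at index JSR + 1 are all illustrative.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical worklist bookkeeping; instructions are identified by index. */
class RetRequeueSketch {
    // Return-target index -> RET instructions already seen returning to it.
    final Map<Integer, Set<Integer>> retsByReturnTarget = new HashMap<>();
    final Deque<Integer> queue = new ArrayDeque<>();
    final Set<Integer> queued = new HashSet<>();

    /** Adds an instruction to the worklist unless it is already pending. */
    void enqueue(int insn) {
        if (queued.add(insn)) {
            queue.addLast(insn);
        }
    }

    /** Takes the next pending instruction off the worklist. */
    int dequeue() {
        int insn = queue.removeFirst();
        queued.remove(insn);
        return insn;
    }

    /** Called when a processed RET propagates its frame to a return target. */
    void recordRet(int returnTarget, int retInsn) {
        retsByReturnTarget
            .computeIfAbsent(returnTarget, k -> new LinkedHashSet<>())
            .add(retInsn);
    }

    /** Called whenever a JSR is processed: re-queue the RETs that return to it. */
    void onJsrProcessed(int jsrInsn) {
        int returnTarget = jsrInsn + 1; // assumed: instruction after the JSR
        Set<Integer> rets =
            retsByReturnTarget.getOrDefault(returnTarget, Collections.emptySet());
        for (int ret : rets) {
            enqueue(ret);
        }
    }
}
```

Re-queuing unconditionally costs a few extra visits of the RET instructions,
but it avoids having to detect exactly which change at the JSR made the return
target stale.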
I will be attaching the printout of the LVT/STACK of the actual
BundleStopper.basicStopBundles method analysis before the fix and after the
fix, as well as the patch against the SVN /tags/ASM_3_3_2 release. I'm also
attaching the .jar file with the .class file that will reproduce the problem.
I also fixed another problem along the way with this patch -- if there are
multiple callers for the same subroutine then the code to add another caller to
the subroutine doesn't update all of the copies of the subroutine in the
subroutines array (because findSubroutine makes a deep copy of the subroutine
for each entry). The fix is to update all of the subroutines entries when a
second caller is added to a subroutine object.
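
The second problem and its fix can be illustrated with a toy version of the
data structure. This is a sketch under assumptions, not the ASM source: the
Sub class, its fields, and the addCaller helper are hypothetical stand-ins for
the subroutine objects and the per-instruction subroutines array.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of per-instruction subroutine copies. */
class SubroutineCopiesSketch {
    static final class Sub {
        final int start;                                 // first instruction
        final List<Integer> callers = new ArrayList<>(); // JSR instruction indices

        Sub(int start) {
            this.start = start;
        }

        /** Deep copy: the copy gets its own independent caller list. */
        Sub copy() {
            Sub c = new Sub(start);
            c.callers.addAll(callers);
            return c;
        }
    }

    /** Adds the caller to every array entry that belongs to the same subroutine. */
    static void addCaller(Sub[] subroutines, int start, int jsrInsn) {
        for (Sub s : subroutines) {
            if (s != null && s.start == start && !s.callers.contains(jsrInsn)) {
                s.callers.add(jsrInsn);
            }
        }
    }

    public static void main(String[] args) {
        Sub first = new Sub(30);
        first.callers.add(10);
        // One deep copy per instruction of the subroutine.
        Sub[] subroutines = {null, first, first.copy(), first.copy()};

        subroutines[1].callers.add(25); // updating a single copy is not enough
        System.out.println("copy 2 before fix: " + subroutines[2].callers);

        addCaller(subroutines, 30, 25); // update every copy instead
        System.out.println("copy 2 after fix:  " + subroutines[2].callers);
    }
}
```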