The problem (which we've only been able to reproduce consistently with
the openjdk-src process=interpret build on Linux virtual machines) was
a race condition during VM shutdown. Thread "A" would exit, see there
were other threads still running and thus enter ZombieState, which
involves acquiring and releasing a lock using RawMonitorResource.
Then the last thread (thread "B") would exit, wait for thread "A" to
release the lock, then shut down the VM, freeing all memory. However,
thread "A" writes to its Thread object one last time after releasing
the lock (in ~Resource, the destructor of the superclass of
RawMonitorResource, which sets Thread::resource). If thread "B" frees
that Thread before ~Resource runs, we end up writing to freed memory.
Thus, we need to update Thread::resource before releasing the lock.
Apparently C++ destructors run in order from most derived to least
derived, which is not what we want here. My solution to split
Resource into two classes, one that has no destructor and another that
extends it (called AutoResource) which does hafe a destructor. Now
all the classes which used to extend Resource extend AutoResource,
except for RawMonitorResource, which extends Resource directly so it
can control the order of operations.
This ensures that all tests pass when Avian is built with an
openjdk=$path option such that $path points to either OpenJDK 7 or 8.
Note that I have not yet tried using the openjdk-src option with
OpenJDK 8. I'll work on that next.
For some reason, running Avian under the SVN version of Valgrind
caused mmap to fail, which caused tryAllocateExecutable to return a
null pointer, which led to a non-obvious crash later on. Adding an
expect to check the result immediately will at least make it obvious
what went wrong.
Turns out Function can do the jobs of both CallbackReceiver and
FunctionReceiver, so I've removed the latter two.
Also, shift and reset should work with a combination of types, not
just a single type, so I've expanded their generic signatures.
Tail call and dead code optimizations can cause code after a throw to
be eliminated, which confuses findUnwindTarget because it doesn't know
what code is throwing the exception. So we need at least one
instruction to follow the call to the throw_ thunk. Previously, we
only added such an instruction when we knew the throw was the last
instruction in the bytecode, but it turns out there are other cases
where it is needed, including certain try/finally situations.
There's a small optimization in compileDirectInvoke which tries to
avoid generating calls to empty methods. However, this causes
problems for code which uses such a call to ensure a class is
initialized -- if we omit that call, the class may not be
initialized and any side effects of that initialization may not
happen when the program expects them to.
This commit ensures that the compiler only omits empty method calls
when the target class does not need initialization. It also removes
commented-out code in classpath-openjdk.cpp which was responsible for
loading libmawt proactively; that was a hack to get JogAmp to work
before we understood what the real problem was.
Clang was complaining that newIp might be used uninitialized at the
bottom of our giant, unstructured compile loop, so I initialized it
with a bogus value, which means it will at least fail consistently if
Clang is right and there really is a path by which that code is
reached without otherwise initializing newIp.
Previously, I used a shell script to extract modification date ranges
from the Git history, but that was complicated and unreliable, so now
every file just gets the same year range in its copyright header. If
someone needs to know when a specific file was modified and by whom,
they can look at the Git history themselves; no need to include it
redundantly in the header.
Previously, we would attempt to initialize a class (e.g. call its
static initializer) whenever a method in that class was called, as
well as in any of the cases listed in
http://docs.oracle.com/javase/specs/jls/se7/html/jls-12.html#jls-12.4.
However, the above approach may lead to deadlock in an app which
relies on being able to call non-static methods in parallel with a
static initializer invocation in the same class. Thus, this commit
ensures that we initialize classes only in the cases defined by the
standard.
scalac may generate bytecode such that an exception is thrown within
the bounds of a handler for that exception such that the throw is the
last instruction in the method, which we weren't handling properly.
This ensures that, if an exception is thrown later but before the
method has been fully compiled, we will know exactly how much memory
to free. Previously, we would abort when trying to free the wrong
amount due to an assertion failure.
This is necessary to avoid name conflicts on various platforms. For
example, iOS has its own util.h, and Windows has a process.h. By
including our version as e.g. "avian/util.h", we avoid confusion with
the system version.
The eventual intent with the lir namespace is to formalize some of
the important bits of Assembler interface, to be tested, debug-printed,
and potentially, serialized.
Also, group arguments to apply(...) in OperandInfos
The primary motivation behind this is to allow all the different Assemblers
to be built at once, on a single machine. This should dramatically reduce
the time required to make sure that a particular change doesn't break
the build for one of the not-so-common architectures (arm, powerpc)
Simply pass "codegen-targets=all" to make to compile all
src/codegen/<arch>/assembler.cpp.
Note that while these architectures are built, they will not be fully-
functional. Certain stuff is assumed to be the same across the entire
build (such as TargetBytesPerWord), but this isn't the case anymore.
In order to calculate the initial stack map of GC roots for an
exception handler, we do a logical "and" of maps across all the
instructions contained in the try block for that handler. This is
complicated by the presence of jsr/ret instructions, though, because
instructions in a subroutine may have multiple maps associated with
them corresponding to all the paths from which execution might flow to
them.
The bug in this case was that we were using an uninitialized map in
our calculation, resulting in a map with no GC roots at all. By the
time the map was initialized, the damage had already been done. The
solution is to treat an uninitialized map as if it has roots at all
positions so that it has no effect on the calculation until it has
been initialized with real data.
Some OSes (notably, Windows CE) restrict the size of the call stack
such that recursive compilation of branch instructions can lead to
stack overflow in methods with large numbers of such instructions. In
fact, a worst-case method could even lead to overflow when the stack
size limit is relatively generous.
The solution is to convert this recursion into iteration with an
explicit stack to maintain state about alternate paths through each
branch.
We weren't adding entries to the frame map for calls to the instanceof
thunk when compiling methods. However, that thunk may trigger a GC,
in which case we'll need to unwind the stack, which will lead to a
crash if we don't have a frame map entry for that instruction.
Java requires that NaNs be converted to zero and that numbers at or
beyond the limits of integer representation be clamped to the largest
or smallest value that can be represented, respectively.
The existing code handled such odd switch statements correctly in the
JIT case, but did the wrong thing for the AOT case, leading to an
assertion failure later on.
4512a9a introduced a new ArgumentList constructor which was handling
some types incorrectly (e.g. implicitly converting floats to
integers). This commit fixes it.
Our Thread.getStackTrace implementation is tricky because it might be
invoked on a thread executing arbitrary native or Java code, and there
are numerous edge cases to consider. Unsurprisingly, there were a few
lingering, non-fatal bugs revealed by Valgrind recently, one involving
the brief interval just before and after returning from invokeNative,
and the other involving an off-by-one error in x86.cpp's nextFrame
implementation. This commit fixes both.
If a class references a field or method as static and we find it's
actually non-static -- or vice-versa -- we ought to throw an error
rather than abort.
The first problem was that, on x86, we failed to properly keep track
of whether to expect the return address to be on the stack or not when
unwinding through a frame. We were relying on a "stackLimit" pointer
to tell us whether we were looking at the most recently-called frame
by comparing it with the stack pointer for that frame. That was
inaccurate in the case of a thread executing at the beginning of a
method before a new frame is allocated, in which case the most recent
two frames share a stack pointer, confusing the unwinder. The
solution involves keeping track of how many frames we've looked at
while walking the stack.
The other problem was that compareIpToMethodBounds assumed every
method was followed by at least one byte of padding before the next
method started. That assumption was usually valid because we were
storing the size following method code prior to the code itself.
However, the last method of an AOT-compiled code image is not followed
by any such method header and may instead be followed directly by
native code with no intervening padding. In that case, we risk
interpreting that native code as part of the preceding method, with
potentially bizarre results.
The reason for the compareIpToMethodBounds assumption was that methods
which throw exceptions as their last instruction generate a
non-returning call, which nonetheless push a return address on the
stack which points past the end of the method, and the unwinder needs
to know that return address belongs to that method. A better solution
is to add an extra trap instruction to the end of such methods, which
is what this patch does.
Scala occasionally generates exception handler tables with interval
bounds which fall outside the range of valid bytecode indexes, so we
must clamp them or risk out-of-bounds array accesses.
Floats are implicitly promoted to doubles when passed as part of a
variable-length argument list, so we can't treat them the same way as
32-bit integers.
Until now, the bootimage build hasn't supported using the Java
invocation API to create a VM, destroy it, and create another in the
same process. Ideally, we would be able to create multiple VMs
simultaneously without any interference between them. In fact, Avian
is designed to support this for the most part, but there are a few
places we use global, mutable state which prevent this from working.
Most notably, the bootimage is modified in-place at runtime, so the
best we can do without extensive changes is to clean up the bootimage
when the VM is destroyed so it's ready for later instances. Hence
this commit.
Ultimately, we can move towards a fully reentrant VM by making the
bootimage immutable, but this will require some care to avoid
performance regressions. Another challenge is our Posix signal
handlers, which currently rely on a global handle to the VM, since you
can't, to my knowledge, pass a context pointer when registering a
signal handler. Thread local variables won't necessarily help, since
a thread might attatch to more than one VM at a time.
This reverts commit 88d614eb25.
It turns out we still need separate sets of thunks for AOT-compiled
and JIT-compiled code to ensure we can always generate efficient jumps
and calls to thunks on architectures such as ARM and PowerPC, whose
relative jumps and calls have limited ranges.
Now that the AOT-compiled code image is position-independent, there is
no further need for this distinction. In fact, it was harmful,
because we were still using runtime-generated thunks when we should
have been using the ones in the code image. This resulted in
EXC_BAD_ACCESS errors on non-jailbroken iOS devices.
This avoids the requirement of putting the code image in a
section/segment which is both writable and executable, which is good
for security and avoids trouble with systems like iOS which disallow
such things.
The implementation relies on relative addressing such that the offset
of the desired address is fixed as a compile-time constant relative to
the start of the memory area of interest (e.g. the code image, heap
image, or thunk table). At runtime, the base pointer to the memory
area is retrieved from the thread structure and added to the offset to
compute the final address. Using the thread pointer allows us to
generate read-only, position-independent code while avoiding the use
of IP-relative addressing, which is not available on all
architectures.
This fixes a number of bugs concerning cross-architecture bootimage
builds involving diffent endianesses. There will be more work to do
before it works.
This monster commit is the first step towards supporting
cross-architecture bootimage builds. The challenge is to build a heap
and code image for the target platform where the word size and
endianess may differ from those of the build architecture. That means
the memory layout of objects may differ due to alignment and size
differences, so we can't just copy objects into the heap image
unchanged; we must copy field by field, resizing values, reversing
endianess and shifting offsets as necessary.
This commit also removes POD (plain old data) type support from the
type generator because it added a lot of complication and little
value.
We must throw an AbstractMethodError when such a call is executed (not
when the call is compiled), so we compile this case as a call to a
thunk which throws such an error.
sun.font.FontManager.initIDs is a native method defined in
libfontmanager.so, yet there seems to be no mechanism in OpenJDK's
class library to actually load that library, so we lazily load it
before trying to resolve the method.
As described in commit 36aa0d6, apps such as jython which generate
bytecode dynamically can produce patterns of bytecode for which the
VM's compiler could not handle properly. However, that commit
introduced a regression and had to be partially reverted.
It turns out the real problem was the call to Compiler::restoreState
which we made before checking whether we were actually ready to
compile the exception handler (we delay compiling an exception handler
until and unless the try/catch block it serves has been compiled so we
can calculate the stack maps properly). That confused the compiler in
rare cases, so we now only call restoreState once we're actually ready
to compile the handler.
My last commit introduced a regression in JIT compilation of
subroutines. This reverts the specific change which caused the
regression. Further work will be needed to address the case which
that change was intended to fix (namely, exception handlers which
apply to multiple try/catch blocks).
Bytecode generated by compilers other than javac or ecj (such as
jython's dynamically generated classes) can contain unreachable code
and exception handlers which apply to more than one try/catch scope.
Previously, the VM's JIT compiler did not handle either of these cases
well, hence this commit.
Also, assume any class which has an ancestor class which has a static
initializer needs initialization even if it doesn't have one itself,
per the Java Language Spec.
OpenJDK's sun.reflect.MethodAccessorGenerator can generate
invokevirtual calls to private methods (which we normally consider
non-virtual); we must compile them as non-virtual calls since they
aren't in the vtable.
It turns out commit 31eb047 was too aggressive and led to incorrect
calculation of line numbers for machine addresses, as well as
potentially incorrect exception handler scope calculation. This fixes
the regression.
I recently encountered a Batik JAR with a method containing a
redundant goto which confused the JIT compiler because it was refered
to in the exception handler and line number tables despite being
unreachable. I don't know how such code was generated, but this
commit ensures the compiler can handle it.
We can't blindly try release the monitors for all synchronized methods
when unwinding the stack since we may not have finished acquiring the
most recent one when the exception was thrown.
Big applications can exceed the 16MB limit we previously used.
Increasing this above 30MB (if/when desired) will require changes to
the ARM and PowerPC JIT code to work around immediate branch encoding
limits on those platforms,
Unlike the interpreter, the JIT compiler tries to resolve all the
symbols referenced by a method when compiling that method. However,
this can backfire if a symbol cannot be resolved: we end up throwing
an e.g. NoClassDefFoundError for code which may never be executed.
This is particularly troublesome for code which supports multiple
APIs, choosing one at runtime.
The solution is to defer to stub code for symbols which can't be
resolved at JIT compile time. Such a stub will try again at runtime
to resolve the needed symbol and throw an appropriate error if it
still can't be found.
Due to encoding limitations, the immediate operand of conditional
branches can be no more than 32KB forward or backward. Since the
JIT-compiled form of some methods can be larger than 32KB, and we also
do conditional jumps to code outside the current method in some cases,
we must work around this limitation.
The strategy of this commit is to provide inline, intermediate jump
tables where necessary. A given conditional branch whose target is
too far for a direct jump will instead point to an unconditional
branch in the nearest jump table which points to the actual target.
Unconditional immediate branches are also limited on PowerPC, but this
limit is 32MB, which is not an impediment in practice. If it does
become a problem, we'll need to encode such branches using multiple
instructions.