The primary motivation behind this is to allow all the different Assemblers
to be built at once, on a single machine. This should dramatically reduce
the time required to make sure that a particular change doesn't break
the build for one of the not-so-common architectures (arm, powerpc)
Simply pass "codegen-targets=all" to make to compile all
src/codegen/<arch>/assembler.cpp.
Note that while these architectures are built, they will not be fully-
functional. Certain stuff is assumed to be the same across the entire
build (such as TargetBytesPerWord), but this isn't the case anymore.
In order to calculate the initial stack map of GC roots for an
exception handler, we do a logical "and" of maps across all the
instructions contained in the try block for that handler. This is
complicated by the presence of jsr/ret instructions, though, because
instructions in a subroutine may have multiple maps associated with
them corresponding to all the paths from which execution might flow to
them.
The bug in this case was that we were using an uninitialized map in
our calculation, resulting in a map with no GC roots at all. By the
time the map was initialized, the damage had already been done. The
solution is to treat an uninitialized map as if it has roots at all
positions so that it has no effect on the calculation until it has
been initialized with real data.
Some OSes (notably, Windows CE) restrict the size of the call stack
such that recursive compilation of branch instructions can lead to
stack overflow in methods with large numbers of such instructions. In
fact, a worst-case method could even lead to overflow when the stack
size limit is relatively generous.
The solution is to convert this recursion into iteration with an
explicit stack to maintain state about alternate paths through each
branch.
We weren't adding entries to the frame map for calls to the instanceof
thunk when compiling methods. However, that thunk may trigger a GC,
in which case we'll need to unwind the stack, which will lead to a
crash if we don't have a frame map entry for that instruction.
Java requires that NaNs be converted to zero and that numbers at or
beyond the limits of integer representation be clamped to the largest
or smallest value that can be represented, respectively.
The existing code handled such odd switch statements correctly in the
JIT case, but did the wrong thing for the AOT case, leading to an
assertion failure later on.
4512a9a introduced a new ArgumentList constructor which was handling
some types incorrectly (e.g. implicitly converting floats to
integers). This commit fixes it.
Our Thread.getStackTrace implementation is tricky because it might be
invoked on a thread executing arbitrary native or Java code, and there
are numerous edge cases to consider. Unsurprisingly, there were a few
lingering, non-fatal bugs revealed by Valgrind recently, one involving
the brief interval just before and after returning from invokeNative,
and the other involving an off-by-one error in x86.cpp's nextFrame
implementation. This commit fixes both.
If a class references a field or method as static and we find it's
actually non-static -- or vice-versa -- we ought to throw an error
rather than abort.
The first problem was that, on x86, we failed to properly keep track
of whether to expect the return address to be on the stack or not when
unwinding through a frame. We were relying on a "stackLimit" pointer
to tell us whether we were looking at the most recently-called frame
by comparing it with the stack pointer for that frame. That was
inaccurate in the case of a thread executing at the beginning of a
method before a new frame is allocated, in which case the most recent
two frames share a stack pointer, confusing the unwinder. The
solution involves keeping track of how many frames we've looked at
while walking the stack.
The other problem was that compareIpToMethodBounds assumed every
method was followed by at least one byte of padding before the next
method started. That assumption was usually valid because we were
storing the size following method code prior to the code itself.
However, the last method of an AOT-compiled code image is not followed
by any such method header and may instead be followed directly by
native code with no intervening padding. In that case, we risk
interpreting that native code as part of the preceding method, with
potentially bizarre results.
The reason for the compareIpToMethodBounds assumption was that methods
which throw exceptions as their last instruction generate a
non-returning call, which nonetheless push a return address on the
stack which points past the end of the method, and the unwinder needs
to know that return address belongs to that method. A better solution
is to add an extra trap instruction to the end of such methods, which
is what this patch does.
Scala occasionally generates exception handler tables with interval
bounds which fall outside the range of valid bytecode indexes, so we
must clamp them or risk out-of-bounds array accesses.
Floats are implicitly promoted to doubles when passed as part of a
variable-length argument list, so we can't treat them the same way as
32-bit integers.
Until now, the bootimage build hasn't supported using the Java
invocation API to create a VM, destroy it, and create another in the
same process. Ideally, we would be able to create multiple VMs
simultaneously without any interference between them. In fact, Avian
is designed to support this for the most part, but there are a few
places we use global, mutable state which prevent this from working.
Most notably, the bootimage is modified in-place at runtime, so the
best we can do without extensive changes is to clean up the bootimage
when the VM is destroyed so it's ready for later instances. Hence
this commit.
Ultimately, we can move towards a fully reentrant VM by making the
bootimage immutable, but this will require some care to avoid
performance regressions. Another challenge is our Posix signal
handlers, which currently rely on a global handle to the VM, since you
can't, to my knowledge, pass a context pointer when registering a
signal handler. Thread local variables won't necessarily help, since
a thread might attatch to more than one VM at a time.
This reverts commit 88d614eb25.
It turns out we still need separate sets of thunks for AOT-compiled
and JIT-compiled code to ensure we can always generate efficient jumps
and calls to thunks on architectures such as ARM and PowerPC, whose
relative jumps and calls have limited ranges.
Now that the AOT-compiled code image is position-independent, there is
no further need for this distinction. In fact, it was harmful,
because we were still using runtime-generated thunks when we should
have been using the ones in the code image. This resulted in
EXC_BAD_ACCESS errors on non-jailbroken iOS devices.