We had be using System::Monitor::wait to block threads internally in
the VM as well as to implement Object.wait. However, the interrupted
flag should only be cleared in the latter case, not the former. This
commit adds a new method and changes the semantics of the old one in
order to acheive the desired behavior.
This fixes the tails=true build (at least for x86_64) and eliminates
the need for a frame table in the tails=false build. In the
tails=true build, we still need a frame table on x86(_64) to help
determine whether we've caught a thread executing code to do a tail
call or pop arguments off the stack. However, I've not yet written
the code to actually use this table, and it is only needed to handle
asynchronous unwinds via Thread.getStackTrace.
Previously, we unwound the stack by following the chain of frame
pointers for normal returns, stack trace creation, and exception
unwinding. On x86, this required reserving EBP/RBP for frame pointer
duties, making it unavailable for general computation and requiring
that it be explicitly saved and restored on entry and exit,
respectively.
On PowerPC, we use an ABI that makes the stack pointer double as a
frame pointer, so it doesn't cost us anything. We've been using the
same convention on ARM, but it doesn't match the native calling
convention, which makes it unusable when we want to call native code
from Java and pass arguments on the stack.
So far, the ARM calling convention mismatch hasn't been an issue
because we've never passed more arguments from Java to native code
than would fit in registers. However, we must now pass an extra
argument (the thread pointer) to e.g. divideLong so it can throw an
exception on divide by zero, which means the last argument must be
passed on the stack. This will clobber the linkage area we've been
using to hold the frame pointer, so we need to stop using it.
One solution would be to use the same convention on ARM as we do on
x86, but this would introduce the same overhead of making a register
unavailable for general use and extra code at method entry and exit.
Instead, this commit removes the need for a frame pointer. Unwinding
involves consulting a map of instruction offsets to frame sizes which
is generated at compile time. This is necessary because stack trace
creation can happen at any time due to Thread.getStackTrace being
called by another thread, and the frame size varies during the
execution of a method.
So far, only x86(_64) is working, and continuations and tail call
optimization are probably broken. More to come.
This allows OpenJDK to access time zone data which is normally found
under java.home, but which we must embed in the executable itself to
create a self-contained build. The VM intercepts various file
operations, looking for paths which start with a prefix specified by
the avian.embed.prefix property and redirecting those operations to an
embedded JAR.
For example, if avian.embed.prefix is "/avian-embedded", and code
calls File.exists() with a path of
"/avian-embedded/javahomeJar/foo.txt", the VM looks for a function
named javahomeJar via dlsym, calls the function to find the memory
region containing the embeded JAR, and finally consults the JAR to see
if the file "foo.txt" exists.
We now consult the JAVA_HOME environment variable to determine where
to find the system library JARs and SOs. Ultimately, we'll want to
support self-contained build, but this allows Avian to behave like a
conventional libjvm.so.
Whereas the GNU Classpath port used the strategy of patching Classpath
with core classes from Avian so as to minimize changes to the VM, this
port uses the opposite strategy: abstract and isolate
classpath-specific features in the VM similar to how we abstract away
platform-specific features in system.h. This allows us to use an
unmodified copy of OpenJDK's class library, including its core classes
and augmented by a few VM-specific classes in the "avian" package.
The trick is to make all destructors non-virtual. This is safe because
we never use the delete operator, which is the only case where virtual
destructors are relevant. This is a better solution than implementing
our own delete operator, because we want libraries loaded at runtime to
use the libstdc++ version, not ours.
We now support immortal objects, which the GC will scan for references
but not consider for collection. On x86_64, we allocate JIT code memory
via mmap, which lets us map memory into the bottom 2GB of the address
space, ensuring that 32-bit relative jumps and calls work.
This includes support for using the least significant bits of the class
pointer to indicate object state, which we'll use to indicate the
presence of a monitor pointer, among other things.