If AVIAN_USE_FRAME_POINTER is not defined, the caller of vmInvoke will
calculate a frame size which assumes vmInvoke does not push rbp on the
stack before allocating the frame. However, vmInvoke pushes rbp
regardless, so we need to adjust the frame size to ensure the stack
remains aligned.
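As a rough illustration of the adjustment described above, the sketch
below pads a frame size so that the frame plus any extra pushed words
stays a multiple of the stack alignment. The function name, the 8-byte
word size, and the 16-byte alignment are assumptions for illustration;
this is not the actual vmInvoke code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constants: x86_64 word size and ABI stack alignment.
constexpr std::size_t WordSize = 8;
constexpr std::size_t StackAlignment = 16;

// Given a frame size (in bytes) computed on the assumption that the callee
// pushes nothing extra, return a frame size such that the frame plus
// `extraPushes` pushed words (e.g. rbp) is a multiple of StackAlignment.
inline std::size_t adjustFrameSize(std::size_t frameSize,
                                   std::size_t extraPushes) {
  std::size_t pushed = extraPushes * WordSize;
  std::size_t total = frameSize + pushed;
  // Round the combined size up to the next alignment boundary.
  std::size_t padded = (total + StackAlignment - 1) & ~(StackAlignment - 1);
  return padded - pushed;
}
```

For example, a 32-byte frame with one extra push becomes a 40-byte
frame, so that frame plus push totals 48 bytes.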
That code was unused and will be unnecessary until we add proper
support for unwinding through tail calls in nextFrame, at which point
it may be reinstated in some form.
It is dangerous to initiate a GC from a thunk like divideLong (which
was possible when allocating a new ArithmeticException to signal
divide-by-zero) since we don't currently generate a GC root frame map
for the return address of the thunk call. Instead, we use the backup
heap area if there is room, or else throw a pre-allocated exception.
Like PowerPC, ARM has an instruction cache which must be manually
flushed if/when we compile a new method. This commit updates
syncInstructionCache to use GCC's builtin __clear_cache routine.
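A minimal sketch of such a flush, assuming a GCC-compatible compiler.
The signature shown here is an assumption, not necessarily the one
syncInstructionCache has in the VM; __builtin___clear_cache is the
compiler built-in form of the routine.

```cpp
#include <cassert>
#include <cstddef>

// Flush the instruction cache for [start, start + size) so that newly
// written machine code becomes visible to instruction fetch. On targets
// with coherent instruction caches (e.g. x86) this compiles to nothing.
inline void syncInstructionCache(void* start, std::size_t size) {
  char* begin = static_cast<char*>(start);
  __builtin___clear_cache(begin, begin + size);
}
```

This would be called once after the compiler finishes writing a
method's code into executable memory and before the first jump to it.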
This fixes the tails=true build (at least for x86_64) and eliminates
the need for a frame table in the tails=false build. In the
tails=true build, we still need a frame table on x86(_64) to help
determine whether we've caught a thread executing code to do a tail
call or pop arguments off the stack. However, I've not yet written
the code to actually use this table, and it is only needed to handle
asynchronous unwinds via Thread.getStackTrace.
This is necessary to accommodate classes loaded at runtime which refer
to primitive array types. Otherwise, they won't be included unless
classes in the bootimage refer to them.
When loading a class which extends another class that contained a
field of primitive array type using defineClass in a bootimage=true
build, the VM was unable to find the primitive array class, and
makeArrayClass refused to create one since it should already have
existed.
The problem was that the bootimage=true build uses an empty
Machine::BootstrapClassMap, and resolveArrayClass expected to find the
primitive array classes there. The fix is to check the
Machine::BootLoader map if we can't find it in
Machine::BootstrapClassMap.
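The fallback lookup can be sketched roughly as follows. The types and
the function name findArrayClass are hypothetical stand-ins; the real
maps are VM heap objects rather than standard containers.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the VM's class and class-map types.
struct Class {
  std::string name;
};
using ClassMap = std::unordered_map<std::string, Class*>;

// Look in the bootstrap class map first; if the class is not there (as in
// a bootimage=true build, where that map is empty), fall back to the boot
// loader's map. Returns nullptr if neither map contains the class.
inline Class* findArrayClass(const ClassMap& bootstrapClassMap,
                             const ClassMap& bootLoaderMap,
                             const std::string& spec) {
  auto it = bootstrapClassMap.find(spec);
  if (it != bootstrapClassMap.end()) return it->second;
  it = bootLoaderMap.find(spec);
  return it != bootLoaderMap.end() ? it->second : nullptr;
}
```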
Previously, we unwound the stack by following the chain of frame
pointers for normal returns, stack trace creation, and exception
unwinding. On x86, this required reserving EBP/RBP for frame pointer
duties, making it unavailable for general computation and requiring
that it be explicitly saved and restored on entry and exit,
respectively.
On PowerPC, we use an ABI that makes the stack pointer double as a
frame pointer, so it doesn't cost us anything. We've been using the
same convention on ARM, but it doesn't match the native calling
convention, which makes it unusable when we want to call native code
from Java and pass arguments on the stack.
So far, the ARM calling convention mismatch hasn't been an issue
because we've never passed more arguments from Java to native code
than would fit in registers. However, we must now pass an extra
argument (the thread pointer) to e.g. divideLong so it can throw an
exception on divide by zero, which means the last argument must be
passed on the stack. This will clobber the linkage area we've been
using to hold the frame pointer, so we need to stop using it.
One solution would be to use the same convention on ARM as we do on
x86, but this would introduce the same overhead of making a register
unavailable for general use and extra code at method entry and exit.
Instead, this commit removes the need for a frame pointer. Unwinding
involves consulting a map of instruction offsets to frame sizes which
is generated at compile time. This is necessary because stack trace
creation can happen at any time due to Thread.getStackTrace being
called by another thread, and the frame size varies during the
execution of a method.
So far, only x86(_64) is working, and continuations and tail call
optimization are probably broken. More to come.
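To illustrate the idea of a map from instruction offsets to frame
sizes, here is a minimal sketch of a lookup by binary search. The
FrameEvent layout and the name frameSizeAt are assumptions made for
this example, not the table format the compiler actually emits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical table entry: from `offset` (inclusive) until the next
// entry's offset, the frame occupies `frameSize` words.
struct FrameEvent {
  std::uint32_t offset;
  std::uint32_t frameSize;
};

// Return the frame size in effect at instruction offset `ip`.
// `table` must be sorted by offset and start with an entry at offset 0.
inline std::uint32_t frameSizeAt(const std::vector<FrameEvent>& table,
                                 std::uint32_t ip) {
  assert(!table.empty() && table.front().offset == 0);
  // Find the first entry whose offset is greater than ip, then step back
  // to the entry that covers ip.
  auto it = std::upper_bound(
      table.begin(), table.end(), ip,
      [](std::uint32_t value, const FrameEvent& e) {
        return value < e.offset;
      });
  return (it - 1)->frameSize;
}
```

An unwinder would use the result to find the caller's frame from the
current stack pointer without needing a saved frame pointer.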
This rather large commit modifies the VM to use non-local returns to
throw exceptions instead of simply setting Thread::exception and
returning frame-by-frame as it used to. This has several benefits:
* Functions no longer need to check Thread::exception after each call
which might throw an exception (which would be especially tedious
and error-prone now that any function which allocates objects
directly or indirectly might throw an OutOfMemoryError)
* There's no need to audit the code for calls to functions which
previously did not throw exceptions but later do
* Performance should be improved slightly due to both the reduced
need for conditionals and because unwinding now occurs in a single
jump instead of a series of returns
The main disadvantages are:
* Slightly higher overhead for entering and leaving the VM via the
JNI and JDK methods
* Non-local returns can make the code harder to read
* We must be careful to register destructors for stack-allocated
resources with the Thread so they can be called prior to a
non-local return
The non-local return implementation is similar to setjmp/longjmp,
except it uses continuation-passing style to avoid the need for
cooperation from the C/C++ compiler. Native C++ exceptions would have
also been an option, but that would introduce a dependence on
libstdc++, which we're trying to avoid for portability reasons.
Finally, this commit ensures that the VM throws an OutOfMemoryError
instead of aborting when it reaches its memory ceiling. Currently, we
treat the ceiling as a soft limit and temporarily exceed it as
necessary to allow garbage collection and certain internal allocations
to succeed, but refuse to allocate any Java objects until the heap
size drops back below the ceiling.
There is a delay between when we tell the OS to start a thread and
when it actually starts, and during that time a thread might
mistakenly think it was the last to exit, try to shut down the VM, and
then block in joinAll when it finds it wasn't the last one after all.
The solution is to increment Machine::liveCount and add the new thread
to the process tree before starting it -- all while holding
Machine::stateLock for atomicity. This helps guarantee that when
liveCount is one, we can be sure there's really only one thread
running or staged to run.
If we don't do this, the VM will crash when it tries to create a stack
trace for the error because makeObjectArray will return null
immediately when it sees there is a pending exception.
GCC 4.5.1 and later use a naming convention where functions are not
prefixed with an underscore, whereas previous versions added the
underscore. This change was made to ensure compatibility with
Microsoft's compiler. Since GCC 4.5.0 has a serious code generation
bug, we now only support later versions, so it makes sense to assume
the newer convention.
We now check for stack overflow in the JIT build as well as the
interpreted build, throwing a StackOverflowError if the limit
(currently hard-coded to 64KB, but should be easy to make
configurable) is exceeded.
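A check of this kind might look roughly like the sketch below, which
assumes a downward-growing stack. The names and the way the stack base
is obtained are hypothetical; only the 64KB figure comes from the text
above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Limit taken from the description above; hard-coded here as well.
constexpr std::size_t StackSizeLimit = 64 * 1024;

// Return true if pushing a frame of `frameBytes` would take stack usage
// past the limit. Assumes the stack grows downward from `stackBase`.
inline bool exceedsStackLimit(std::uintptr_t stackBase,
                              std::uintptr_t stackPointer,
                              std::size_t frameBytes) {
  assert(stackPointer <= stackBase);
  std::size_t used = stackBase - stackPointer;
  // Written to avoid overflow in used + frameBytes.
  return frameBytes > StackSizeLimit || used > StackSizeLimit - frameBytes;
}
```

When the check returns true, the caller would throw a
StackOverflowError instead of entering the method.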
There was an unlikely but dangerous race condition in monitorRelease
such that when a thread released a monitor and then tried to notify
the next thread in line, the latter thread might exit before it can be
notified. This potentially led to a crash as the former thread tried
to acquire and notify the latter thread's private lock after it had
been disposed.
The solution is to do as we do in the interrupt and join cases: call
acquireSystem first and thereby either block the target thread from
exiting until we're done or find that it has already exited, in which
case nothing needs to be done.
I also looked at monitorNotify to see if we have a similar bug there,
but in that case the target thread can't exit without first acquiring
and releasing the monitor, and since we ensure that no thread can
execute monitorNotify without holding the monitor, there's no
potential for a race.
In makeCodeImage, we were passing zero to Promise::Listener::resolve,
which would lead to an assertion error if the address of the code
image was further from the base of the address space (i.e. zero) than
could be spanned by a jump on the target architecture. Since, in this
context, we immediately overwrite the value stored, we may pass
whatever we want to this function (we're only calling it so we can
retrieve the location of the value in the image), and the code image
pointer is a better choice for the above reason.
When trying to create an array class, we try to resolve
java.lang.Object so we can use its vtable in the array class.
However, if Object is missing, we'll try to create and throw a
ClassNotFoundException, which requires creating an array to store the
stack trace, which requires creating an array class, which requires
resolving Object, etc. This commit short-circuits this process by
telling resolveClass not to create and throw an exception if it can't
find Object.
While doing the above work, I noticed that the implementations of
Classpath::makeThrowable in classpath-avian.cpp and
classpath-openjdk.cpp were identical, so I made makeThrowable a
top-level function.
Finally, I discovered that Thread.setDaemon can only be called before
the target thread has been started, which allowed me to simplify the
code to track daemon threads in the VM.
We have to be careful about how we calculate return addresses on ARM
due to padding introduced by constant pools interspersed with code.
When calculating the offset of code where we're inserting a constant
pool, we want the offset of the end of the pool for jump targets, but
we want the offset just prior to the beginning of the pool (i.e. the
offset of the instruction responsible for jumping past the pool) when
calculating a return address.
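The distinction can be sketched as follows. PoolSite, Use, and
resolveOffset are hypothetical names, and the model (a list of pool
insertion points with their padded sizes) is a simplification of what
the assembler tracks.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record of a constant pool inserted into the code stream.
struct PoolSite {
  std::uint32_t offset;      // pool-free code offset where the pool goes
  std::uint32_t paddedSize;  // bytes added: jump past the pool + pool data
};

enum class Use { JumpTarget, ReturnAddress };

// Translate a pool-free code offset into an offset in the final code.
// A pool inserted exactly at `offset` counts for jump targets (which must
// land after the pool) but not for return addresses (which must point
// just before it, at the instruction that jumps past the pool).
inline std::uint32_t resolveOffset(const std::vector<PoolSite>& pools,
                                   std::uint32_t offset, Use use) {
  std::uint32_t padding = 0;
  for (const PoolSite& p : pools) {
    if (p.offset < offset ||
        (p.offset == offset && use == Use::JumpTarget)) {
      padding += p.paddedSize;
    }
  }
  return offset + padding;
}
```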
The code added to runJavaThread was unnecessary and harmful since it
allowed the global daemon thread count to become permanently
out-of-sync with the actual number of daemon threads.
This mainly involves some makefile ugliness to work around bugs in the
native Windows OpenJDK code involving conflicting static and
non-static declarations which GCC 4.0 and later justifiably reject but
MSVC tolerates.
We weren't properly handling the case where a 64-bit value is
multiplied with itself in multiplyRR, leading to wrong code. Also,
addCarryCR didn't know how to handle constants more than 8-bits wide.
* add libnet.so and libnio.so to built-in libraries for openjdk-src build
* implement sun.misc.Unsafe.park/unpark
* implement JVM_SetClassSigners/JVM_GetClassSigners
* etc.
The main change here is to use a lazily-populated vector to associate
runtime data with classes instead of referencing them directly from
the class, which requires updating immutable references in the heap
image. The other changes employ other strategies to avoid trying to
update immutable references.
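A lazily-populated side table of this sort could be sketched as below.
RuntimeData, RuntimeDataTable, and the use of a per-class index are
assumptions for illustration; the VM's own structure lives in its
managed heap.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical per-class mutable state kept outside the class object.
struct RuntimeData {
  int initState = 0;
};

// Side table indexed by a per-class index. Entries are created on first
// access, so immutable class objects in the heap image never need to be
// updated to point at their runtime data.
class RuntimeDataTable {
 public:
  RuntimeData& get(std::size_t classIndex) {
    if (classIndex >= entries_.size()) entries_.resize(classIndex + 1);
    if (!entries_[classIndex]) entries_[classIndex].reset(new RuntimeData());
    return *entries_[classIndex];
  }

  bool populated(std::size_t classIndex) const {
    return classIndex < entries_.size() &&
           entries_[classIndex] != nullptr;
  }

 private:
  std::vector<std::unique_ptr<RuntimeData>> entries_;
};
```

The class itself only needs to carry a stable index, which can be
fixed when the image is built.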
Compiling the entire OpenJDK class library into a bootimage revealed
some corner cases which broke the compiler, including synchronization
in a finally block and gotos targeting the first instruction of an
unsynchronized method.
We must call notifyAll on visitLock after setting threadVisitor to
null in case another thread is waiting to do a visit of its own.
Otherwise, the latter thread will wait forever, eventually deadlocking
the whole VM at the next GC since it's in an active state.
If the VM runs out of heap space and the "avian.heap.dump" system
property was specified at startup, the VM will write a heap dump to
the filename indicated by that property. This dump may be analyzed
using e.g. DumpStats.java.
My recent commit to ensure that OS resources are released immediately
upon thread exit introduced a race condition where interrupting or
joining a thread as it exited could lead to attempts to use
already-released resources. This commit adds locking to avoid the
race.
This makes heap dumps more useful since these classes are now referred
to by name instead of number.
This commit also adds a couple of utilities for parsing heap dumps:
PrintDump and DumpStats.
The primary change is to ensure we output a Mach-O file of appropriate
endianness when cross-compiling for an opposite-endian architecture.
Earlier versions of Xcode's linker accepted files of either
endianness, regardless of architecture, but later versions don't,
hence the change.
Previously, loading an arbitrary 32-bit constant required up to four
instructions (16 bytes), since we did so one byte at a time via
immediate-mode operations.
The preferred way to load constants on ARM is via PC-relative
addressing, but this is challenging because immediate memory offsets
are limited to 4096 bytes in either direction. We frequently need to
compile methods which are larger than 4096, or even 8192, bytes, so we
must intersperse code and data if we want to use PC-relative loads
everywhere.
This commit enables pervasive PC-relative loads by handling the
following cases:
1. Method is shorter than 4096 bytes: append data table to end
2. Method is longer than 4096 bytes, but no basic block is longer
than 4096 bytes: insert data tables as necessary after blocks, taking
care to minimize the total number of tables
3. Method is longer than 4096 bytes, and some blocks are longer than
4096 bytes: split large basic blocks and insert data tables as above
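The three cases can be approximated with a simple greedy placement, as
in the sketch below. It is a simplification under stated assumptions:
it works on block sizes only, ignores the size of the pools themselves
and of the branches around them, and the names splitBlocks and
poolSites are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Maximum reach of a PC-relative load, per the description above.
constexpr std::uint32_t PoolRange = 4096;

// Case 3: split any block longer than PoolRange into smaller pieces.
inline std::vector<std::uint32_t> splitBlocks(
    const std::vector<std::uint32_t>& blocks) {
  std::vector<std::uint32_t> out;
  for (std::uint32_t size : blocks) {
    while (size > PoolRange) {
      out.push_back(PoolRange);
      size -= PoolRange;
    }
    if (size > 0) out.push_back(size);
  }
  return out;
}

// Cases 1 and 2: return the indices of the blocks after which a data
// table is emitted. A table is deferred as long as possible, which keeps
// the number of tables small; the last block always gets one.
inline std::vector<std::size_t> poolSites(
    const std::vector<std::uint32_t>& blocks) {
  std::vector<std::size_t> sites;
  std::uint32_t span = 0;  // code bytes since the last table
  for (std::size_t i = 0; i < blocks.size(); ++i) {
    if (span > 0 && span + blocks[i] > PoolRange) {
      sites.push_back(i - 1);
      span = 0;
    }
    span += blocks[i];
  }
  if (!blocks.empty()) sites.push_back(blocks.size() - 1);
  return sites;
}
```

A short method yields a single table after its last block, while a
method with blocks of 3000, 2000, and 1000 bytes gets tables after the
first and last blocks.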
Previously, we waited until the next GC to do this, but that can be
too long for workloads which create a lot of short-lived threads but
don't do much allocation.
This requires adding LinkRegister to the list of reserved registers,
since it must be preserved in the thunk code generated by
compileDirectInvoke. An alternative would be to explicitly preserve
it in that special case, but that would complicate the code quite a
bit.
All the tests are passing for openjdk-src builds, but the non-src
openjdk build is crashing and there's trouble loading time zone info
from the embedded java.home directory.
This allows OpenJDK to access time zone data which is normally found
under java.home, but which we must embed in the executable itself to
create a self-contained build. The VM intercepts various file
operations, looking for paths which start with a prefix specified by
the avian.embed.prefix property and redirecting those operations to an
embedded JAR.
For example, if avian.embed.prefix is "/avian-embedded", and code
calls File.exists() with a path of
"/avian-embedded/javahomeJar/foo.txt", the VM looks for a function
named javahomeJar via dlsym, calls the function to find the memory
region containing the embedded JAR, and finally consults the JAR to see
if the file "foo.txt" exists.
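The path-splitting step of that example can be sketched as follows.
EmbeddedPath and parseEmbeddedPath are hypothetical names; the dlsym
lookup and the JAR access that would follow are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result of splitting an intercepted path.
struct EmbeddedPath {
  std::string symbol;  // function to look up, e.g. "javahomeJar"
  std::string entry;   // path inside the embedded JAR, e.g. "foo.txt"
};

// If `path` has the form <prefix>/<symbol>/<entry>, fill `out` and return
// true. Otherwise return false so the caller can fall through to the
// normal file system.
inline bool parseEmbeddedPath(const std::string& prefix,
                              const std::string& path,
                              EmbeddedPath* out) {
  if (path.compare(0, prefix.size(), prefix) != 0) return false;
  if (path.size() <= prefix.size() || path[prefix.size()] != '/') {
    return false;
  }
  std::size_t start = prefix.size() + 1;
  std::size_t slash = path.find('/', start);
  if (slash == std::string::npos || slash == start) return false;
  out->symbol = path.substr(start, slash - start);
  out->entry = path.substr(slash + 1);
  return true;
}
```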
sun.misc.Unsafe.getUnsafe expects a null result if the class loader is
the boot classloader and will throw a SecurityException otherwise
(whereas it should really be checking both for null and comparing
against the system classloader). However, just returning null
whenever the loader is the boot loader can cause trouble for embedded
apps which put everything in the boot loader, including application
resources.
Therefore, we only return null if it's the boot loader and we're being
called from Unsafe.getUnsafe.
As described in readme.txt, a standalone OpenJDK build embeds all
libraries, classes, and other files needed at runtime in the resulting
binary, eliminating dependencies on external resources.
Rather than try to support mixing Avian's core classes with those of
an external class library -- which necessitates adding a lot of stub
methods which throw UnsupportedOperationExceptions, among other
compromises -- we're looking to support such external class libraries
in their unmodified forms. The latter strategy has already proven
successful with OpenJDK's class library. Thus, this commit removes
the stub methods, etc., which not only cleans up the code but avoids
misleading application developers as to what classes and methods
Avian's built-in class library supports.
We now consult the JAVA_HOME environment variable to determine where
to find the system library JARs and SOs. Ultimately, we'll want to
support self-contained builds, but this allows Avian to behave like a
conventional libjvm.so.
The main changes in this commit ensure that we don't hold the global
class lock when doing class resolution using application-defined
classloaders. Such classloaders may do their own locking (in fact,
it's almost certain), making deadlock likely when mixed with VM-level
locking in various orders.
Other changes include a fix to avoid overflow when waiting for
extremely long intervals and a fix for a GC root stack mapping bug.
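One common way to avoid this kind of overflow is to saturate when
computing the wake-up time, as in the sketch below. This is an assumed
approach for illustration, not necessarily the fix used in the commit,
and deadlineFor is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Compute an absolute deadline from the current time and a wait interval,
// both in milliseconds and both assumed non-negative. If the sum would
// overflow, clamp to the largest representable time instead of wrapping
// around to a negative value.
inline std::int64_t deadlineFor(std::int64_t nowMillis,
                                std::int64_t waitMillis) {
  const std::int64_t max = std::numeric_limits<std::int64_t>::max();
  if (waitMillis > max - nowMillis) return max;
  return nowMillis + waitMillis;
}
```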
The biggest change in this commit is to split the system classloader
into two: one for boot classes (e.g. java.lang.*) and another for
application classes. This is necessary to make OpenJDK's security
checks happy.
The rest of the changes include bugfixes and additional JVM method
implementations in classpath-openjdk.cpp.
Whereas the GNU Classpath port used the strategy of patching Classpath
with core classes from Avian so as to minimize changes to the VM, this
port uses the opposite strategy: abstract and isolate
classpath-specific features in the VM similar to how we abstract away
platform-specific features in system.h. This allows us to use an
unmodified copy of OpenJDK's class library, including its core classes
and augmented by a few VM-specific classes in the "avian" package.
We've been getting away with not doing this so far since our Java
calling convention matches the native calling convention concerning
where the return address is saved, so when our thunk calls native code
it gets saved for us automatically. However, there was still the
danger that a thread would interrupt another thread after the stack
pointer was saved to the thread field but before the native code was
called and try to get a stack trace, at which point it would try to
find the return address relative to that stack pointer and find
garbage instead. This commit ensures that we save the return address
before saving the stack pointer to avoid such a situation.