It seems that GCC 4.6.1 gets confused at LTO time when we take the
address of inline functions, so I'm switching them to non-inline
linkage to make it happy.
Apple's linker tends to remove functions which are never called, which
is not what we want for e.g. vmPrintTrace, since that function is only
intended to be called interactively from within GDB.
My previous attempt wasn't quite sufficient, since it was too late to
call join on a thread which had already exited given the code was
written to aggressively dispose of system handles as soon as the
thread exited. The solution is to delay disposing these handles until
after we're able to join the thread.
The bug here is that when a thread exits and becomes a "zombie", the
OS resources associated with it are not necessarily released until we
actually join and dispose of that thread. Since that only happens
during garbage collection, and collection normally only happens in
response to heap memory pressure, there's no guarantee that we'll GC
frequently enough to clean up zombies promptly and avoid running out
of resources.
The solution is to force a GC whenever we start a new thread and there
are at least N zombies waiting to be disposed, where N=16 for now.
There was a subtle race condition in the VM shutdown process such that
a System::Thread would be disposed after the System instance it was
created under has been disposed, in which case doing a virtual call to
System::free with that instance would potentially cause a crash. The
solution is to just call the C library version of free directly, since
that's all System::free does.
Until now, the bootimage build hasn't supported using the Java
invocation API to create a VM, destroy it, and create another in the
same process. Ideally, we would be able to create multiple VMs
simultaneously without any interference between them. In fact, Avian
is designed to support this for the most part, but there are a few
places we use global, mutable state which prevent this from working.
Most notably, the bootimage is modified in-place at runtime, so the
best we can do without extensive changes is to clean up the bootimage
when the VM is destroyed so it's ready for later instances. Hence
this commit.
Ultimately, we can move towards a fully reentrant VM by making the
bootimage immutable, but this will require some care to avoid
performance regressions. Another challenge is our Posix signal
handlers, which currently rely on a global handle to the VM, since you
can't, to my knowledge, pass a context pointer when registering a
signal handler. Thread local variables won't necessarily help, since
a thread might attatch to more than one VM at a time.
This avoids the requirement of putting the code image in a
section/segment which is both writable and executable, which is good
for security and avoids trouble with systems like iOS which disallow
such things.
The implementation relies on relative addressing such that the offset
of the desired address is fixed as a compile-time constant relative to
the start of the memory area of interest (e.g. the code image, heap
image, or thunk table). At runtime, the base pointer to the memory
area is retrieved from the thread structure and added to the offset to
compute the final address. Using the thread pointer allows us to
generate read-only, position-independent code while avoiding the use
of IP-relative addressing, which is not available on all
architectures.
This monster commit is the first step towards supporting
cross-architecture bootimage builds. The challenge is to build a heap
and code image for the target platform where the word size and
endianess may differ from those of the build architecture. That means
the memory layout of objects may differ due to alignment and size
differences, so we can't just copy objects into the heap image
unchanged; we must copy field by field, resizing values, reversing
endianess and shifting offsets as necessary.
This commit also removes POD (plain old data) type support from the
type generator because it added a lot of complication and little
value.
Internally, the VM augments the method tables for abstract classes
with any inherited abstract methods to make code simpler elsewhere,
but that means we can't use that table to construct the result of
Class.getDeclaredMethods since it would include methods not actually
declared in the class. This commit ensures that we preserve and use
the original, un-augmented table for that purpose.
Previously, we would abort the process if we encountered a truncated
multibyte character in parseUtf8NonAscii (called by the JNI method
NewStringUTF). Now we simply terminate the string at that point.
Also, assume any class which has an ancestor class which has a static
initializer needs initialization even if it doesn't have one itself,
per the Java Language Spec.
The result of Class.getInterfaces should not include interfaces
declared to be implemented/extended by superclasses/superinterfaces,
only those declared by the class itself. This is important because it
influences how java.io.ObjectStreamClass calculates serial version
IDs.
This includes a proper implementation of JVM_ActiveProcessorCount, as
well as JVM_SetLength and JVM_NewMultiArray. Also, we now accept up
to JNI_VERSION_1_6 in JVM_IsSupportedJNIVersion.
We must not allocate heap objects from doCollect, since it might
trigger a GC while one is already in progress, which can cause trouble
when we're still queuing up objects to finalize, among other things.
To avoid this, I've added extra fields to the finalizer and cleaner
types which we can use to link instances up during GC without
allocating new memory.
OpenJDK uses an alternative to Object.finalize for resource cleanup in
the form of sun.misc.Cleaner. Normally, OpenJDK's
java.lang.ref.Reference.ReferenceHandler thread handles this, calling
Cleaner.clean on any instances it finds in its "pending" queue.
However, Avian handles reference queuing internally, so it never
actually adds anything to that queue, so the VM must call
Cleaner.clean itself.
The main changes here are:
* fixes for runtime annotation support
* proper support for runtime generic type introspection
* throw NoClassDefFoundErrors instead of ClassNotFoundExceptions
where appropriate
This commit ensures that we use the proper memory barriers or locking
necessary to preserve volatile semantics for such fields when accessed
or updated via JNI.
Unlike the interpreter, the JIT compiler tries to resolve all the
symbols referenced by a method when compiling that method. However,
this can backfire if a symbol cannot be resolved: we end up throwing
an e.g. NoClassDefFoundError for code which may never be executed.
This is particularly troublesome for code which supports multiple
APIs, choosing one at runtime.
The solution is to defer to stub code for symbols which can't be
resolved at JIT compile time. Such a stub will try again at runtime
to resolve the needed symbol and throw an appropriate error if it
still can't be found.
It is possible to create an Exception with no stack trace by
overriding Throwable.fillInStackTrace, so we can't assume any given
instance will have one.
There was a race between these two functions such that one thread A
would run dispose on thread B just before thread B finishes exit, with
the result that Thread::lock and/or Thread::systemThread would be
disposed twice, resulting in a crash.
It seems that older versions of GCC (4.0 and older, at least) generate
assembly files with duplicate symbols for function templates which
differ only by the attributes of the templated types. Newer versions
have no such problem, but we need to support both, hence the
workaround in this commit of using a dedicated, non-template "alias"
function where we previously used "cast<alias_t>".
We use a template function called "cast" to get raw access to fields
in in the VM. In particular, we use this function in util.cpp to
treat reference fields as intptr_t fields so we can use the least
significant bit as the red/black flag in red/black tree nodes.
Unfortunately, this runs afoul of the type aliasing rules in C/C++,
and the compiler is permitted to optimize in a way that assumes such
aliasing cannot occur. Such optimization caused all the nodes in the
tree to be black, leading to extremely unbalanced trees and thus slow
performance.
The fix in this case is to use the __may_alias__ attribute to tell the
compiler we're doing something devious. I've also used this technique
to avoid other potential aliasing problems. There may be others
lurking, so a complete audit of the VM might be a good idea.