This is necessary to avoid name conflicts on various platforms. For
example, iOS has its own util.h, and Windows has a process.h. By
including our version as e.g. "avian/util.h", we avoid confusion with
the system version.
Previously, if you forgot to use RUNTIME_ARRAY_BODY to reference an
array declared with (THREAD_)RUNTIME_ARRAY, you wouldn't get a
compiler error until you tried to build on e.g. MSVC, where
runtime-sized stack arrays aren't supported. This change ensures you
find out regardless of what compiler you're using, which ought to
protect us from regressions going forward.
We must use separate va_start/va_end pairs for each call to vsnprintf
on Linux and possibly other platforms in order to avoid a crash.
Also, we need to give it room to null terminate the string at the
right point.
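
As a minimal sketch of the pattern described above (the helper name
formatString is hypothetical and is not the VM's actual code), each
vsnprintf call gets its own va_start/va_end pair, and the buffer
reserves one extra byte for the null terminator:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper for illustration only: formats into a std::string.
std::string formatString(const char* format, ...)
{
  assert(format != nullptr);

  // First pass: measure. With a null buffer and size 0, vsnprintf returns
  // the number of characters needed, excluding the null terminator.
  va_list args;
  va_start(args, format);
  int length = std::vsnprintf(nullptr, 0, format, args);
  va_end(args);

  if (length < 0) {
    return std::string();  // encoding error
  }

  // Reserve one extra byte so vsnprintf has room to null terminate.
  std::string buffer(static_cast<size_t>(length) + 1, '\0');

  // Second pass: the first va_list is indeterminate once vsnprintf has
  // consumed it, so start a fresh one instead of reusing it.
  va_start(args, format);
  std::vsnprintf(&buffer[0], buffer.size(), format, args);
  va_end(args);

  buffer.resize(static_cast<size_t>(length));  // drop the terminator
  return buffer;
}
```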
When GetStringCritical or GetPrimitiveArrayCritical is called, the VM
cannot risk new Java heap allocations until the corresponding release
method is called because allocations may result in GC, which cannot
happen while a string or array is pinned in memory. We already check
for the latter in the footprint function used during GC, but it's
best to catch the problem as early as possible.
Previously, we would blithely exceed the heap ceiling and force the
next allocation to deal with the problem, including a major GC and
possible OutOfMemoryError. As of this commit, we throw an error
immediately if we find that the allocation will push us over the
ceiling.
Commit c918cbc added this reference to ensure
sun.misc.Unsafe.getLongVolatile could be implemented efficiently on
32-bit platforms. However, I neglected to ensure the reference was
updated to point to the final class instance instead of the temporary
one used in parseClass. This led to extra memory usage and
inconsistent locking behavior, plus broken bootimage builds.
If we don't clear these references, we risk finalizing objects which
can still be reached by one of the special reference types.
It's a bit of a chicken-and-egg problem. We need to visit finalizable
objects before visiting weak references, since some of the weak
references and/or their targets may become reachable once the
finalizable objects are visited. However, that ordering means we have
no efficient way of distinguishing between objects which are reachable
from one or more normal GC roots and those which are only reachable
via the finalization queue. The solution is to clear all weak
references to finalizable objects before visiting them.
The original stub implementation just echoed back its argument, but
that confused URLClassLoader when dealing with sealed JARs --
returning a non-null value for a non-system class from
JVM_GetSystemPackage made URLClassLoader think it had already loaded a
class from a package which was supposed to be sealed, resulting in
SecurityExceptions which ultimately triggered NoClassDefFoundErrors.
The solution is to only return non-null values for actual system
classes.
We weren't wrapping exceptions thrown by invoked methods in
InvocationTargetExceptions in JVM_InvokeMethod or
JVM_NewInstanceFromConstructor. Also, JVM_GetCallerClass is supposed
to ignore Method.invoke frames when walking the stack.
My earlier fix (f8e8609) was almost -- but not quite -- sufficient.
It asked the heap to mark the dead fixies too early, so some of them
were marked dead even though they ultimately survived, causing us to
clear weak JNI references when we shouldn't.
The existing code did not handle static field lookups for
synchronization on 32-bit systems, which is necessary because such
systems generally don't support atomic operations on 64-bit values.
resolveClass was correctly respecting throw_ == false if the requested
class was not found, but it still threw an exception if e.g. the
superclass was missing. Now we catch such exceptions and return null
as appropriate.
This led to fixed-position objects being considered unreachable when
they were actually still reachable, causing global weak JNI references
to be cleared prematurely, most notably leading to crashes in AWT
buffered image code.
This commit also fixes a field offset calculation mismatch in
bootimage.cpp relative to machine.cpp.
We were assuming the array element size was always the native word
size, which is not correct in general for primitive arrays, and this
led to wasted space at best and memory corruption at worst.
The first problem was that, on x86, we failed to properly keep track
of whether to expect the return address to be on the stack or not when
unwinding through a frame. We were relying on a "stackLimit" pointer
to tell us whether we were looking at the most recently-called frame
by comparing it with the stack pointer for that frame. That was
inaccurate in the case of a thread executing at the beginning of a
method before a new frame is allocated, in which case the most recent
two frames share a stack pointer, confusing the unwinder. The
solution involves keeping track of how many frames we've looked at
while walking the stack.
The other problem was that compareIpToMethodBounds assumed every
method was followed by at least one byte of padding before the next
method started. That assumption was usually valid because we were
storing the size following method code prior to the code itself.
However, the last method of an AOT-compiled code image is not followed
by any such method header and may instead be followed directly by
native code with no intervening padding. In that case, we risk
interpreting that native code as part of the preceding method, with
potentially bizarre results.
The reason for the compareIpToMethodBounds assumption was that methods
which throw exceptions as their last instruction generate a
non-returning call, which nonetheless pushes a return address on the
stack which points past the end of the method, and the unwinder needs
to know that return address belongs to that method. A better solution
is to add an extra trap instruction to the end of such methods, which
is what this patch does.
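
To illustrate why the trailing trap helps, here is a simplified sketch
of a bounds comparison with an exclusive upper bound. The MethodBounds
type and compareIpToBounds function are hypothetical stand-ins, not
the real compareIpToMethodBounds signature. Assuming the trap is
counted in the method's size, the return address pushed by a final
non-returning call falls inside the method without any reliance on
padding:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a compiled method's extent in the code image.
struct MethodBounds {
  uintptr_t start;  // address of the first instruction
  size_t size;      // code size in bytes, including any trailing trap
};

// Returns -1 if ip precedes the method, 0 if it lies inside it, and 1 if
// it follows it. The upper bound is exclusive, so the byte immediately
// after the method is never attributed to it.
int compareIpToBounds(uintptr_t ip, const MethodBounds& method)
{
  assert(method.size > 0);
  if (ip < method.start) {
    return -1;
  }
  if (ip < method.start + method.size) {
    return 0;
  }
  return 1;
}
```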
It seems that GCC 4.6.1 gets confused at LTO time when we take the
address of inline functions, so I'm switching them to non-inline
linkage to make it happy.
Apple's linker tends to remove functions which are never called, which
is not what we want for e.g. vmPrintTrace, since that function is only
intended to be called interactively from within GDB.
My previous attempt wasn't quite sufficient: the code was written to
aggressively dispose of system handles as soon as a thread exited, so
by the time we tried to join that thread it was too late. The
solution is to delay disposing of these handles until after we're able
to join the thread.
The bug here is that when a thread exits and becomes a "zombie", the
OS resources associated with it are not necessarily released until we
actually join and dispose of that thread. Since that only happens
during garbage collection, and collection normally only happens in
response to heap memory pressure, there's no guarantee that we'll GC
frequently enough to clean up zombies promptly and avoid running out
of resources.
The solution is to force a GC whenever we start a new thread and there
are at least N zombies waiting to be disposed, where N=16 for now.
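
The threshold check can be sketched roughly as follows. The Machine
struct and both function names are hypothetical simplifications; the
real VM tracks zombies as part of its thread tree rather than as a
bare counter:

```cpp
#include <cassert>

const unsigned ZombieCollectionThreshold = 16;  // "N" in the text above

// Hypothetical, heavily simplified VM state.
struct Machine {
  unsigned zombieCount;      // exited threads not yet joined and disposed
  unsigned collectionCount;  // number of collections run so far
};

// Stand-in for a real collection, which joins and disposes of zombies.
void collect(Machine* m)
{
  m->zombieCount = 0;
  ++m->collectionCount;
}

// Called just before a new thread is started.
void prepareToStartThread(Machine* m)
{
  assert(m != nullptr);
  if (m->zombieCount >= ZombieCollectionThreshold) {
    collect(m);  // force a GC so the zombies release their OS resources
  }
}
```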
There was a subtle race condition in the VM shutdown process such that
a System::Thread would be disposed after the System instance it was
created under had been disposed, in which case doing a virtual call to
System::free with that instance would potentially cause a crash. The
solution is to just call the C library version of free directly, since
that's all System::free does.
Until now, the bootimage build hasn't supported using the Java
invocation API to create a VM, destroy it, and create another in the
same process. Ideally, we would be able to create multiple VMs
simultaneously without any interference between them. In fact, Avian
is designed to support this for the most part, but there are a few
places we use global, mutable state which prevent this from working.
Most notably, the bootimage is modified in-place at runtime, so the
best we can do without extensive changes is to clean up the bootimage
when the VM is destroyed so it's ready for later instances. Hence
this commit.
Ultimately, we can move towards a fully reentrant VM by making the
bootimage immutable, but this will require some care to avoid
performance regressions. Another challenge is our Posix signal
handlers, which currently rely on a global handle to the VM, since you
can't, to my knowledge, pass a context pointer when registering a
signal handler. Thread local variables won't necessarily help, since
a thread might attach to more than one VM at a time.
This avoids the requirement of putting the code image in a
section/segment which is both writable and executable, which is good
for security and avoids trouble with systems like iOS which disallow
such things.
The implementation relies on relative addressing such that the offset
of the desired address is fixed as a compile-time constant relative to
the start of the memory area of interest (e.g. the code image, heap
image, or thunk table). At runtime, the base pointer to the memory
area is retrieved from the thread structure and added to the offset to
compute the final address. Using the thread pointer allows us to
generate read-only, position-independent code while avoiding the use
of IP-relative addressing, which is not available on all
architectures.
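
In C++ terms, the address computation the generated code performs
looks roughly like the sketch below. ThreadContext, its fields, and
HelperOffset are hypothetical names chosen for illustration; the
actual thread structure and offsets differ:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical subset of the per-thread structure holding base pointers.
struct ThreadContext {
  uint8_t* codeImage;  // base of the read-only code image
  uint8_t* heapImage;  // base of the heap image
};

// Offset of some routine within the code image, fixed at compile time.
// The value is an arbitrary example.
const size_t HelperOffset = 16;

// Computes the final address at runtime as base + offset. Nothing in the
// code image itself has to be patched, so it can stay read-only.
inline uint8_t* codeAddress(const ThreadContext* t, size_t offset)
{
  assert(t != nullptr && t->codeImage != nullptr);
  return t->codeImage + offset;
}
```

The same offset resolves correctly wherever the image happens to be
loaded, because only the base pointer stored in the thread changes.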
This monster commit is the first step towards supporting
cross-architecture bootimage builds. The challenge is to build a heap
and code image for the target platform where the word size and
endianness may differ from those of the build architecture. That means
the memory layout of objects may differ due to alignment and size
differences, so we can't just copy objects into the heap image
unchanged; we must copy field by field, resizing values, reversing
endianness and shifting offsets as necessary.
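
A minimal sketch of such a field-by-field copy follows. FieldMapping
and the three functions are hypothetical, not the bootimage code
itself, and the sketch handles unsigned integer values only (widening
a signed value would also need sign extension):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical description of one field's location in both layouts.
struct FieldMapping {
  size_t buildOffset;   // offset within the build-machine object
  size_t buildSize;     // size in bytes on the build machine
  size_t targetOffset;  // offset within the target-image object
  size_t targetSize;    // size in bytes on the target
};

// Reads a 1-, 2-, 4- or 8-byte unsigned integer stored in the build
// machine's native byte order.
uint64_t readBuildInteger(const uint8_t* source, size_t size)
{
  switch (size) {
  case 1: return *source;
  case 2: { uint16_t v; std::memcpy(&v, source, sizeof(v)); return v; }
  case 4: { uint32_t v; std::memcpy(&v, source, sizeof(v)); return v; }
  default: {
    assert(size == 8);
    uint64_t v; std::memcpy(&v, source, sizeof(v)); return v;
  }
  }
}

// Writes the low `size` bytes of `value` in the target's byte order,
// independent of the build machine's own endianness.
void writeTargetInteger(uint8_t* destination, uint64_t value, size_t size,
                        bool targetBigEndian)
{
  for (size_t i = 0; i < size; ++i) {
    uint8_t byte = static_cast<uint8_t>(value >> (8 * i));
    destination[targetBigEndian ? size - 1 - i : i] = byte;
  }
}

// Copies one field, resizing it, fixing its byte order and moving it to
// its target offset.
void copyField(const uint8_t* buildObject, uint8_t* targetObject,
               const FieldMapping& field, bool targetBigEndian)
{
  uint64_t value = readBuildInteger(buildObject + field.buildOffset,
                                    field.buildSize);
  writeTargetInteger(targetObject + field.targetOffset, value,
                     field.targetSize, targetBigEndian);
}
```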
This commit also removes POD (plain old data) type support from the
type generator because it added a lot of complication and little
value.
Internally, the VM augments the method tables for abstract classes
with any inherited abstract methods to make code simpler elsewhere,
but that means we can't use that table to construct the result of
Class.getDeclaredMethods since it would include methods not actually
declared in the class. This commit ensures that we preserve and use
the original, un-augmented table for that purpose.
Previously, we would abort the process if we encountered a truncated
multibyte character in parseUtf8NonAscii (called by the JNI method
NewStringUTF). Now we simply terminate the string at that point.
Also, assume any class which has an ancestor class which has a static
initializer needs initialization even if it doesn't have one itself,
per the Java Language Spec.
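
The truncation handling described above for parseUtf8NonAscii can be
sketched as follows. This decodeUtf8 function is a hypothetical,
simplified decoder rather than the VM's code: it handles only one-,
two- and three-byte sequences, does not validate continuation bytes,
and (as an assumption of this sketch) also stops at an unsupported
lead byte:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decodes UTF-8 bytes into UTF-16 code units, ending the result at the
// first truncated multibyte character instead of aborting.
std::vector<uint16_t> decodeUtf8(const uint8_t* bytes, size_t length)
{
  assert(bytes != nullptr || length == 0);

  std::vector<uint16_t> result;
  size_t i = 0;
  while (i < length) {
    uint8_t a = bytes[i];
    size_t needed;
    if ((a & 0x80) == 0x00) {
      needed = 1;
    } else if ((a & 0xE0) == 0xC0) {
      needed = 2;
    } else if ((a & 0xF0) == 0xE0) {
      needed = 3;
    } else {
      break;  // assumption: an unsupported lead byte also ends the string
    }

    if (length - i < needed) {
      break;  // truncated character: terminate the string at this point
    }

    uint16_t unit;
    if (needed == 1) {
      unit = a;
    } else if (needed == 2) {
      unit = static_cast<uint16_t>(((a & 0x1F) << 6)
                                   | (bytes[i + 1] & 0x3F));
    } else {
      unit = static_cast<uint16_t>(((a & 0x0F) << 12)
                                   | ((bytes[i + 1] & 0x3F) << 6)
                                   | (bytes[i + 2] & 0x3F));
    }
    result.push_back(unit);
    i += needed;
  }
  return result;
}
```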
The result of Class.getInterfaces should not include interfaces
declared to be implemented/extended by superclasses/superinterfaces,
only those declared by the class itself. This is important because it
influences how java.io.ObjectStreamClass calculates serial version
IDs.
This includes a proper implementation of JVM_ActiveProcessorCount, as
well as JVM_SetLength and JVM_NewMultiArray. Also, we now accept up
to JNI_VERSION_1_6 in JVM_IsSupportedJNIVersion.
We must not allocate heap objects from doCollect, since it might
trigger a GC while one is already in progress, which can cause trouble
when we're still queuing up objects to finalize, among other things.
To avoid this, I've added extra fields to the finalizer and cleaner
types which we can use to link instances up during GC without
allocating new memory.
OpenJDK uses an alternative to Object.finalize for resource cleanup in
the form of sun.misc.Cleaner. Normally, OpenJDK's
java.lang.ref.Reference.ReferenceHandler thread handles this, calling
Cleaner.clean on any instances it finds in its "pending" queue.
However, Avian handles reference queuing internally and never actually
adds anything to that queue, so the VM must call Cleaner.clean itself.
The main changes here are:
* fixes for runtime annotation support
* proper support for runtime generic type introspection
* throw NoClassDefFoundErrors instead of ClassNotFoundExceptions
where appropriate
This commit ensures that we use the proper memory barriers or locking
necessary to preserve volatile semantics for such fields when accessed
or updated via JNI.
Unlike the interpreter, the JIT compiler tries to resolve all the
symbols referenced by a method when compiling that method. However,
this can backfire if a symbol cannot be resolved: we end up throwing
e.g. a NoClassDefFoundError for code which may never be executed.
This is particularly troublesome for code which supports multiple
APIs, choosing one at runtime.
The solution is to defer to stub code for symbols which can't be
resolved at JIT compile time. Such a stub will try again at runtime
to resolve the needed symbol and throw an appropriate error if it
still can't be found.
It is possible to create an Exception with no stack trace by
overriding Throwable.fillInStackTrace, so we can't assume any given
instance will have one.
There was a race between these two functions such that thread A would
run dispose on thread B just before thread B finished exit, with the
result that Thread::lock and/or Thread::systemThread would be
disposed twice, resulting in a crash.
It seems that older versions of GCC (4.0 and older, at least) generate
assembly files with duplicate symbols for function templates which
differ only by the attributes of the templated types. Newer versions
have no such problem, but we need to support both, hence the
workaround in this commit of using a dedicated, non-template "alias"
function where we previously used "cast<alias_t>".
We use a template function called "cast" to get raw access to fields
in the VM. In particular, we use this function in util.cpp to
treat reference fields as intptr_t fields so we can use the least
significant bit as the red/black flag in red/black tree nodes.
Unfortunately, this runs afoul of the type aliasing rules in C/C++,
and the compiler is permitted to optimize in a way that assumes such
aliasing cannot occur. Such optimization caused all the nodes in the
tree to be black, leading to extremely unbalanced trees and thus slow
performance.
The fix in this case is to use the __may_alias__ attribute to tell the
compiler we're doing something devious. I've also used this technique
to avoid other potential aliasing problems. There may be others
lurking, so a complete audit of the VM might be a good idea.
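
For reference, here is a small sketch of the technique, assuming GCC
or Clang (the attribute is a compiler extension). The Node struct and
the helper functions are illustrative and are not the VM's actual
definitions; only the use of __may_alias__ on an intptr_t typedef
reflects the fix described above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Accesses through alias_t are allowed to alias objects of any other
// type, so the compiler may not assume they leave Node fields untouched.
typedef intptr_t __attribute__((__may_alias__)) alias_t;

// Illustrative tree node; pointer alignment leaves the low bit free.
struct Node {
  Node* left;
  Node* right;
};

// Raw, alias-safe access to the word at `offset` bytes into `base`.
inline alias_t& alias(void* base, size_t offset)
{
  assert(base != nullptr);
  return *reinterpret_cast<alias_t*>(static_cast<uint8_t*>(base) + offset);
}

const intptr_t RedFlag = 1;

// The red/black flag lives in the least significant bit of `left`.
inline bool isRed(Node* n)
{
  return (alias(n, offsetof(Node, left)) & RedFlag) != 0;
}

inline void setRed(Node* n, bool red)
{
  alias_t& word = alias(n, offsetof(Node, left));
  word = red ? (word | RedFlag) : (word & ~RedFlag);
}

// Returns the left child with the flag bit masked off.
inline Node* leftChild(Node* n)
{
  return reinterpret_cast<Node*>(alias(n, offsetof(Node, left)) & ~RedFlag);
}
```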