We now check for stack overflow in the JIT build as well as the
interpreted build, throwing a StackOverflowError if the limit
(currently hard-coded to 64KB, but should be easy to make
configurable) is exceeded.
When trying to create an array class, we try to resolve
java.lang.Object so we can use its vtable in the array class.
However, if Object is missing, we'll try to create and throw a
ClassNotFoundException, which requires creating an array to store the
stack trace, which requires creating an array class, which requires
resolving Object, etc.. This commit short-circuits this process by
telling resolveClass not to create and throw an exception if it can't
find Object.
While doing the above work, I noticed that the implementations of
Classpath::makeThrowable in classpath-avian.cpp and
classpath-openjdk.cpp were identical, so I made makeThrowable a
top-level function.
Finally, I discovered that Thread.setDaemon can only be called before
the target thread has been started, which allowed me to simplify the
code to track daemon threads in the VM.
* add libnet.so and libnio.so to built-in libraries for openjdk-src build
* implement sun.misc.Unsafe.park/unpark
* implement JVM_SetClassSigners/JVM_GetClassSigners
* etc.
The main change here is to use a lazily-populated vector to associate
runtime data with classes instead of referencing them directly from
the class which requires updating immutable references in the heap
image. The other changes employ other strategies to avoid trying to
update immutable references.
Previously, loading an arbitrary 32-bit constant required up to four
instructions (128 bytes), since we did so one byte at a time via
immediate-mode operations.
The preferred way to load constants on ARM is via PC-relative
addressing, but this is challenging because immediate memory offsets
are limited to 4096 bytes in either direction. We frequently need to
compile methods which are larger than 4096, or even 8192, bytes, so we
must intersperse code and data if we want to use PC-relative loads
everywhere.
This commit enables pervasive PC-relative loads by handling the
following cases:
1. Method is shorter than 4096 bytes: append data table to end
2. Method is longer than 4096 bytes, but no basic block is longer
than 4096 bytes: insert data tables as necessary after blocks, taking
care to minimize the total number of tables
3. Method is longer than 4096 bytes, and some blocks are longer than
4096 bytes: split large basic blocks and insert data tables as above
This allows OpenJDK to access time zone data which is normally found
under java.home, but which we must embed in the executable itself to
create a self-contained build. The VM intercepts various file
operations, looking for paths which start with a prefix specified by
the avian.embed.prefix property and redirecting those operations to an
embedded JAR.
For example, if avian.embed.prefix is "/avian-embedded", and code
calls File.exists() with a path of
"/avian-embedded/javahomeJar/foo.txt", the VM looks for a function
named javahomeJar via dlsym, calls the function to find the memory
region containing the embeded JAR, and finally consults the JAR to see
if the file "foo.txt" exists.
As described in readme.txt, a standalone OpenJDK build embeds all
libraries, classes, and other files needed at runtime in the resulting
binary, eliminating dependencies on external resources.
The main changes in this commit ensure that we don't hold the global
class lock when doing class resolution using application-defined
classloaders. Such classloaders may do their own locking (in fact,
it's almost certain), making deadlock likely when mixed with VM-level
locking in various orders.
Other changes include a fix to avoid overflow when waiting for
extremely long intervals and a GC root stack mapping bug.
The biggest change in this commit is to split the system classloader
into two: one for boot classes (e.g. java.lang.*) and another for
application classes. This is necessary to make OpenJDK's security
checks happy.
The rest of the changes include bugfixes and additional JVM method
implementations in classpath-openjdk.cpp.
Whereas the GNU Classpath port used the strategy of patching Classpath
with core classes from Avian so as to minimize changes to the VM, this
port uses the opposite strategy: abstract and isolate
classpath-specific features in the VM similar to how we abstract away
platform-specific features in system.h. This allows us to use an
unmodified copy of OpenJDK's class library, including its core classes
and augmented by a few VM-specific classes in the "avian" package.
In order to facilitate making the VM compatible with multiple class
libraries, it's useful to separate the VM-specific representation of
these classes from the library implementations. This commit
introduces VMClass, VMField, and VMMethod for that purpose.
A long time ago, I refactored the class initialization code in the VM,
but did not notice until today that it had caused the
process=interpret build to break on certain recursive initializations.
In particular, we were not always detecting when a thread recursively
tried to initialize a class it was already in the process of
initializing, leading to the mistaken assumption that another thread
was initializing it and that we should wait until it was done, in
which case we would wait forever.
This commit ensures that we always detect recursive initialization and
short-circuit it.
If we catch the target thread in a virtual thunk when getting its
stack trace, we must assume its Thread::stack field is garbage and use
the register values instead. Previously, we treated these thunks as
any other native code, leading to crashes when we tried to use the
garbage pointer.
compileDirectInvoke does some magic to optimize tail calls to native
methods which involves storing the return address (which we'll never
actually return to, since it's a tail call) in a thread-local field so
the thunk function can figure out which native method to look up at
runtime. Since this address will change when the boot image is
loaded, the boot image creation code needs to know about it.
callContinuation failed to call the correct continuation when feeding
it an exception due to a regression introduced with the
Thread.getStackTrace changes.
It's not safe to use malloc from a signal handler, so we can't
allocate new memory when handling segfaults or Thread.getStackTrace
signals. Instead, we allocate a fixed-size backup heap for each
thread ahead of time and use it if there's no space left in the normal
heap pool. In the rare case that the backup heap isn't large enough,
we fall back to using a preallocated exception without a stack trace
as a last resort.
This function was broken in two different ways:
1. It only checked MyProcessor::thunks, not MyProcessor::bootThunks.
It needs to check both.
2. When checking MyProcessor::thunks, it used fields from
MyProcessor::bootThunks instead of from the same thunk collection.
This fixes both problems.
Implementing Thread.getStackTrace is tricky. A thread may interrupt
another thread at any time to grab a stack trace, including while the
latter is executing Java code, JNI code, helper thunks, VM code, or
while transitioning between any of these.
To create a stack trace we use several context fields associated with
the target thread, including snapshots of the instruction pointer,
stack pointer, and frame pointer. These fields must be current,
accurate, and consistent with each other in order to get a reliable
trace. Otherwise, we risk crashing the VM by trying to walk garbage
stack frames or by misinterpreting the size and/or content of
legitimate frames.
This commit addresses sensitive transition points such as entering the
helper thunks which bridge the transitions from Java to native code
(where we must save the stack and frame registers for use from native
code) and stack unwinding (where we must atomically update the thread
context fields to indicate which frame we are unwinding to). When
grabbing a trace for another thread, we determine what kind of code we
caught the thread executing in and use that information to choose the
thread context values with which to begin the trace. See
MyProcessor::getStackTrace::Visitor::visit for details.
In order to atomically update the thread context fields, we do the
following:
1. Create a temporary "transition" object to serve as a staging area
and populate it with the new field values.
2. Update a transition pointer in the thread object to point to the
object created above. As long as this pointer is non-null,
interrupting threads will use the context values in the staging
object instead of those in the thread object.
3. Update the fields in the thread object.
4. Clear the transition pointer in the thread object.
We use a memory barrier between each of these steps to ensure they are
made visible to other threads in program order. See
MyThread::doTransition for details.
We were generating code which clobbered the data we were putting into
64-bit volatile fields (and potentially also clobbering the target or
source object in the case of non-static fields) due to misplaced
synchronization code. Reordering this code ensures that both the data
and the target or source survive across calls to synchronization
helper functions.
Previously, the stack frame mapping code (responsible for statically
calculating the map of GC roots for a method's stack frame during JIT
compilation) would assume that the map of GC roots on entry to an
exception handler is the same as on entry to the "try" block which the
handler is attached to. Technically, this is true, but the algorithm
we use does not consider whether a local variable is still "live"
(i.e. will be read later) when calculating the map - only whether we
can expect to find a reference there via normal (non-exceptional)
control flow. This can backfire if, within a "try" block, the stack
location which held an object reference on entry to the block gets
overwritten with a non-reference (i.e. a primitive). If an exception
is later thrown from such a block, we might end up trying to treat
that non-reference as a reference during GC, which will crash the VM.
The ideal way to fix this is to calculate the true interval for which
each value is live and use that to produce the stack frame maps. This
would provide the added benefit of ensuring that the garbage collector
does not visit references which, although still present on the stack,
will not be used again.
However, this commit uses the less invasive strategy of ANDing
together the root maps at each GC point within a "try" block and using
the result as the map on entry to the corresponding exception
handler(s). This should give us safe, if not optimal, results. Later
on, we can refine it as described above.
We were miscompiling methods which contained getfield, getstatic,
putfield, or putstatic instructions for volatile 64-bit primitives on
32-bit PowerPC due to not noticing that values in registers are clobbered
across function calls.
The solution is to create a separate Compiler::Operand instance for each
object monitor reference before and after the function call to avoid
confusing the compiler. To avoid duplicate entries in the constant pool,
we add code look for and, if found, reuse any existing entry for the same
constant.
Before allocating a new reference in NewGlobalReference or when
creating a local reference, we look for a previously-allocated
reference pointing to the same object. This is a linear search, but
usually the number of elements in the reference list is small, whereas
the memory, locking, and allocation overhead of creating duplicate
references can be large.
We need to check to see if we caught the thread somewhere in the thunk
code (i.e. about to call a helper function), in which case the stack
and base pointers are valid and may be used to create an accurate
trace.