compileDirectInvoke does some magic to optimize tail calls to native
methods which involves storing the return address (which we'll never
actually return to, since it's a tail call) in a thread-local field so
the thunk function can figure out which native method to look up at
runtime. Since this address will change when the boot image is
loaded, the boot image creation code needs to know about it.
callContinuation failed to call the correct continuation when feeding
it an exception due to a regression introduced with the
Thread.getStackTrace changes.
The new Thread::defaultHeap declaration has increased the offset of all
the fields following it.
This commit also makes vmInvoke_returnAddress global so it can be refered
to from compile.cpp.
It's not safe to use malloc from a signal handler, so we can't
allocate new memory when handling segfaults or Thread.getStackTrace
signals. Instead, we allocate a fixed-size backup heap for each
thread ahead of time and use it if there's no space left in the normal
heap pool. In the rare case that the backup heap isn't large enough,
we fall back to using a preallocated exception without a stack trace
as a last resort.
This function was broken in two different ways:
1. It only checked MyProcessor::thunks, not MyProcessor::bootThunks.
It needs to check both.
2. When checking MyProcessor::thunks, it used fields from
MyProcessor::bootThunks instead of from the same thunk collection.
This fixes both problems.
Implementing Thread.getStackTrace is tricky. A thread may interrupt
another thread at any time to grab a stack trace, including while the
latter is executing Java code, JNI code, helper thunks, VM code, or
while transitioning between any of these.
To create a stack trace we use several context fields associated with
the target thread, including snapshots of the instruction pointer,
stack pointer, and frame pointer. These fields must be current,
accurate, and consistent with each other in order to get a reliable
trace. Otherwise, we risk crashing the VM by trying to walk garbage
stack frames or by misinterpreting the size and/or content of
legitimate frames.
This commit addresses sensitive transition points such as entering the
helper thunks which bridge the transitions from Java to native code
(where we must save the stack and frame registers for use from native
code) and stack unwinding (where we must atomically update the thread
context fields to indicate which frame we are unwinding to). When
grabbing a trace for another thread, we determine what kind of code we
caught the thread executing in and use that information to choose the
thread context values with which to begin the trace. See
MyProcessor::getStackTrace::Visitor::visit for details.
In order to atomically update the thread context fields, we do the
following:
1. Create a temporary "transition" object to serve as a staging area
and populate it with the new field values.
2. Update a transition pointer in the thread object to point to the
object created above. As long as this pointer is non-null,
interrupting threads will use the context values in the staging
object instead of those in the thread object.
3. Update the fields in the thread object.
4. Clear the transition pointer in the thread object.
We use a memory barrier between each of these steps to ensure they are
made visible to other threads in program order. See
MyThread::doTransition for details.
In java-nio.cpp, we can't use GetPrimitiveArrayCritical when reading
from or writing to blocking sockets since it may block the rest of the
VM indefinitely.
In SelectableChannel.java, we can't use a null test on
SelectableChannel.key to determine whether the channel is open since
it might never be registered with a Selector. According to the Sun
documentation, a SelectableChannel is open as soon as it's created, so
that's what we now implement.
In Mac OS X, if a path contains a space, the path of the main executable
will contain a special URL-encoded character (%20 in this case). This
probably happens when any non-ASCII character is provided.
The fix is to use CFURLCreateStringByReplacingPercentEscapes which
creates a path that the POSIX API likes better.
We were generating code which clobbered the data we were putting into
64-bit volatile fields (and potentially also clobbering the target or
source object in the case of non-static fields) due to misplaced
synchronization code. Reordering this code ensures that both the data
and the target or source survive across calls to synchronization
helper functions.
Previously, the stack frame mapping code (responsible for statically
calculating the map of GC roots for a method's stack frame during JIT
compilation) would assume that the map of GC roots on entry to an
exception handler is the same as on entry to the "try" block which the
handler is attached to. Technically, this is true, but the algorithm
we use does not consider whether a local variable is still "live"
(i.e. will be read later) when calculating the map - only whether we
can expect to find a reference there via normal (non-exceptional)
control flow. This can backfire if, within a "try" block, the stack
location which held an object reference on entry to the block gets
overwritten with a non-reference (i.e. a primitive). If an exception
is later thrown from such a block, we might end up trying to treat
that non-reference as a reference during GC, which will crash the VM.
The ideal way to fix this is to calculate the true interval for which
each value is live and use that to produce the stack frame maps. This
would provide the added benefit of ensuring that the garbage collector
does not visit references which, although still present on the stack,
will not be used again.
However, this commit uses the less invasive strategy of ANDing
together the root maps at each GC point within a "try" block and using
the result as the map on entry to the corresponding exception
handler(s). This should give us safe, if not optimal, results. Later
on, we can refine it as described above.
See commit 8120bee4dc for the original
problem description and solution. That commit and a couple of related
ones had to be reverted when we found they had introduced GC-safety
regressions leading to crashes.
This commit restores the reverted code and fixes the regressions.
We're seeing race conditions which occasionally lead to assertion
failures and thus crashes, so I'm reverting these changes for now:
29309fb414e92674cb738120bee4dc
We don't want to check Thread::waiting until we have re-acquired the
monitor, since another thread might notify us between releasing
Thread::lock and acquiring the monitor.
We need to prefix instructions of the form "mov R,M" with a REX byte
when R is %spl, %bpl, %sil, or %dil. Such moves are unencodable on
32-bit x86, and, because of the order in which we pick registers,
pretty rare on 64-bit systems, which is why this took so long to
notice.