Whereas the GNU Classpath port used the strategy of patching Classpath
with core classes from Avian so as to minimize changes to the VM, this
port uses the opposite strategy: abstract and isolate
classpath-specific features in the VM similar to how we abstract away
platform-specific features in system.h. This allows us to use an
unmodified copy of OpenJDK's class library, including its core classes
and augmented by a few VM-specific classes in the "avian" package.
In order to facilitate making the VM compatible with multiple class
libraries, it's useful to separate the VM-specific representation of
these classes from the library implementations. This commit
introduces VMClass, VMField, and VMMethod for that purpose.
Previously, we risked segfaults by passing negative numbers to memcpy.
This commit also makes arraycopy throw an IndexOutOfBounds exception
instead of an ArrayStoreException if the specified offsets and lengths
would take us outside the bounds of one or both of the arrays, per the
Sun documentation.
It's not safe to use malloc from a signal handler, so we can't
allocate new memory when handling segfaults or Thread.getStackTrace
signals. Instead, we allocate a fixed-size backup heap for each
thread ahead of time and use it if there's no space left in the normal
heap pool. In the rare case that the backup heap isn't large enough,
we fall back to using a preallocated exception without a stack trace
as a last resort.
See commit 8120bee4dc5f9ae2dec75a907778f1479ad398bd for the original
problem description and solution. That commit and a couple of related
ones had to be reverted when we found they had introduced GC-safety
regressions leading to crashes.
This commit restores the reverted code and fixes the regressions.
We're seeing race conditions which occasionally lead to assertion
failures and thus crashes, so I'm reverting these changes for now:
29309fb4149ec02f993f84ffe4675e95c98db832
e92674cb7337355dc4dd6317219010e5d1ce7e1c
8120bee4dc5f9ae2dec75a907778f1479ad398bd
We don't want to check Thread::waiting until we have re-acquired the
monitor, since another thread might notify us between releasing
Thread::lock and acquiring the monitor.
Due to SWT's nasty habit of creating a new object monitor for every
task added to Display.asyncExec, we've found that, on Windows at
least, we tend to run out of OS handles due to the large number of
mutexes we create between garbage collections.
One way to address this might be to trigger a GC when either the
number of monitors created since the last GC exceeds a certain number
or when the total number of monitors in the VM reaches a certain
number. Both of these risk hurting performance, especially if they
force major collections which would otherwise be infrequent. Also,
it's hard to know what the values of such thresholds should be on a
given system.
Instead, we reimplement Java monitors using atomic compare-and-swap
(CAS) and thread-specific native locks for blocking in the case of
contention. This way, we can create an arbitrary number of monitors
without creating any new native locks. The total number of native
locks needed by the VM is bounded instead by the number of live
threads plus a small constant.
Note that if we ever add support for an architecture which does not
support CAS, we'll need to provide a fallback monitor implementation.
Before allocating a new reference in NewGlobalReference or when
creating a local reference, we look for a previously-allocated
reference pointing to the same object. This is a linear search, but
usually the number of elements in the reference list is small, whereas
the memory, locking, and allocation overhead of creating duplicate
references can be large.
This implementation does not conform to the Java standard in that
finalize methods are called from whichever thread happens to be garbage
collecting, and that thread may hold locks, whereas the standard
guarantees that finalize will be run from a thread which holds no locks.
Also, an object will never be finalized more than once, even if its
finalize method "rescues" (i.e. makes reachable) the object such that it
might become unreachable a second time and thus a candidate for
finalization once more. It's not clear to me from the standard if this
is OK or not.
Nonwithstanding the above, this implementation is useful for "normal"
finalize methods which simply release resources associated with an
object.
The previous code relied on the invalid assumption that the thread-local
heaps for all threads would have been cleared immediately following a
garbage collection. However, the last thing the garbage collection
function does is run finalizers which may allocate new objects. This
can lead allocate3 to call allocateSmall with a size which is too large
to accomodate, overflowing the heap.
The solution is to iterate until there really is enough room for the
original allocation request.
We now create a unique thunk for each vtable position so as to avoid
relying on using the return address to determine what method is to be
compiled and invoked, since we will not have the correct return address
in the case of a tail call. This required refactoring how executable
memory is allocated in order to keep AOT compilation working. Also, we
must always use the same register to hold the class pointer when
compiling virtual calls, and ensure that the pointer stays there until
the call instruction is executed so we know where to find it in the
thunk.
This helps us support the Java Memory Model without adding a memory
barrier to every object allocation. It's also potentially more
efficient, since we zero out each heap segment all at once instead of
bit-by-bit with each object allocation.