At first, it might look like the atomicIncrement operations here,
since they resolve to OSAtomicCompareAndSwap32Barrier, ought to
provide all the memory barrier guarantees we need; however, it turns
out it's not quite sufficient.
Here's why: Apple's OSAtomic*Barrier operations guarantee that memory
operations *before* the atomic op won't be reordered with memory
operations *after* the atomic op - but makes no guarantee that the
atomic op itself won't be reordered with other operations before or
after it. The key is that the atomic operation is not really atomic,
but rather consists of separate read, check and write steps - in a
loop, no less. Here, presumably, the read of t->m->exclusive is
hoisted by the out-of-order processor to between the read and write
steps of the "atomic" increment of t->m->activeCount.
Longer term, it probably makes sense to replace this with the c11
<stdatomic.h> operations or the c++11 <atomic> types. We ought to
actually use the atomic increment operations provided there. As it
is, our atomicIncrement looks substantially less efficient than those,
since it's actually implemented on arm64 as two nested loops (one in
atomicIncrement and one in the compare-and-swap) instead of one. We
should also evaluate whether the various instances of atomic
operations actually need as strong of barriers as we're giving them.
FWIW, the gcc __sync_* builtins seem to have the same problem, at
least on arm64.
such as %s, %d, %x... also width, left justify, zerofill flags are
implemented. Many of the other formats do not work, see the comments in
avian.FormatString javadoc and look for FIXME and TODO items for more
information on features that are known to not work correctly.
This would theoretically break compatibility with apps using embedded
classpaths, on big-endian architectures - because of the size type
extension. However, we don't currently support any big-endian
architectures, so it shouldn't be a problem.
This improves support for building with openjdk=$jdk8. Note, however,
that the Processes test does not pass, since JDK 8's
java.lang.UNIXProcess uses invokedynamic, which Avian does not yet
support.
OpenJDK 8 includes a core class (java.lang.Thread) which so many
fields that it exceeds the class size limit in type-generator dictated
by the logic responsible for calculating each class's GC object mask,
at least on 32-bit systems. There was no fundamental need for this
limit -- it just made the code simpler.
This commit removes the above limit at the cost of slightly more
complicated code. The original motivation for this change is that the
platform=macosx arch=i386 openjdk=$jdk8 build was failing. However,
there doesn't seem to be a prebuild JDK 8 for 32-bit OS X anywhere on
the Internet, nor is there any obvious way to build one on a modern
Mac, so it's safe to say we won't be supporting this combination
anyway. The problem also occurs on Linux and Windows, though,
so it needs to be fixed.