On Mac OS, signals sent using pthread_kill are never delivered if the
target thread is blocked (e.g. acquiring a lock or waiting on a
condition), so we can't rely on it and must use the Mach-specific
thread execution API instead to implement Thread.getStackTrace.
For Thread.interrupt, we must not only use pthread_kill but also
pthread_cond_signal to ensure the thread is woken up.
The primary change is to ensure we output a Mach-O file of appropriate
endianness when cross-compiling for an opposite-endian architecture.
Earlier versions of XCode's linker accepted files of either
endianness, reguardless of architecture, but later versions don't,
hence the change.
These paths reduce contention among threads by using atomic operations
and memory barriers instead of mutexes where possible. This is
especially important for JNI calls, since each such call involves two
state transitions: from "active" to "idle" and back.