The primary change is to ensure we output a Mach-O file of appropriate
endianness when cross-compiling for an opposite-endian architecture.
Earlier versions of XCode's linker accepted files of either
endianness, reguardless of architecture, but later versions don't,
hence the change.
These paths reduce contention among threads by using atomic operations
and memory barriers instead of mutexes where possible. This is
especially important for JNI calls, since each such call involves two
state transitions: from "active" to "idle" and back.