There was a subtle bug in that we were not considering alignment
padding for fields defined in superclasses when calculating field
offsets for a derived class when the superclass(es) were visited by
the bootimage generator before the derived class.
This avoids the requirement of putting the code image in a
section/segment which is both writable and executable, which is good
for security and avoids trouble with systems like iOS which disallow
such things.
The implementation relies on relative addressing such that the offset
of the desired address is fixed as a compile-time constant relative to
the start of the memory area of interest (e.g. the code image, heap
image, or thunk table). At runtime, the base pointer to the memory
area is retrieved from the thread structure and added to the offset to
compute the final address. Using the thread pointer allows us to
generate read-only, position-independent code while avoiding the use
of IP-relative addressing, which is not available on all
architectures.
This fixes a number of bugs concerning cross-architecture bootimage
builds involving diffent endianesses. There will be more work to do
before it works.
This monster commit is the first step towards supporting
cross-architecture bootimage builds. The challenge is to build a heap
and code image for the target platform where the word size and
endianess may differ from those of the build architecture. That means
the memory layout of objects may differ due to alignment and size
differences, so we can't just copy objects into the heap image
unchanged; we must copy field by field, resizing values, reversing
endianess and shifting offsets as necessary.
This commit also removes POD (plain old data) type support from the
type generator because it added a lot of complication and little
value.
This requires reducing HeapCapacity and CodeCapacity back to 128MB and
30MB respectively. I had set them to larger values to test
non-ProGuard'ed OpenJDK bootimage builds, which naturally needed a lot
more space. However, such builds aren't really useful in the real
world, and the compiler currently can't handle jumps or calls spanning
more than the maximum size of an immediate branch offset on ARM or
PowerPC, so I'm lowering them back down to more realistic values.
This is necessary to accomodate classes loaded at runtime which refer
to primitive array types. Otherwise, they won't be included unless
classes in the bootimage refer to them.
This rather large commit modifies the VM to use non-local returns to
throw exceptions instead of simply setting Thread::exception and
returning frame-by-frame as it used to. This has several benefits:
* Functions no longer need to check Thread::exception after each call
which might throw an exception (which would be especially tedious
and error-prone now that any function which allocates objects
directly or indirectly might throw an OutOfMemoryError)
* There's no need to audit the code for calls to functions which
previously did not throw exceptions but later do
* Performance should be improved slightly due to both the reduced
need for conditionals and because undwinding now occurs in a single
jump instead of a series of returns
The main disadvantages are:
* Slightly higher overhead for entering and leaving the VM via the
JNI and JDK methods
* Non-local returns can make the code harder to read
* We must be careful to register destructors for stack-allocated
resources with the Thread so they can be called prior to a
non-local return
The non-local return implementation is similar to setjmp/longjmp,
except it uses continuation-passing style to avoid the need for
cooperation from the C/C++ compiler. Native C++ exceptions would have
also been an option, but that would introduce a dependence on
libstdc++, which we're trying to avoid for portability reasons.
Finally, this commit ensures that the VM throws an OutOfMemoryError
instead of aborting when it reaches its memory ceiling. Currently, we
treat the ceiling as a soft limit and temporarily exceed it as
necessary to allow garbage collection and certain internal allocations
to succeed, but refuse to allocate any Java objects until the heap
size drops back below the ceiling.
We now check for stack overflow in the JIT build as well as the
interpreted build, throwing a StackOverflowError if the limit
(currently hard-coded to 64KB, but should be easy to make
configurable) is exceeded.
In makeCodeImage, we were passing zero to Promise::Listener::resolve,
which would lead to an assertion error if the address of the code
image was further from the base of the address space (i.e. zero) than
could be spanned by a jump on the target architecture. Since, in this
context, we immediately overwrite the value stored, we may pass
whatever we want to this function (we're only calling it so we can
retrieve the location of the value in the image), and the code image
pointer is a better choice for the above reason.
The main change here is to use a lazily-populated vector to associate
runtime data with classes instead of referencing them directly from
the class which requires updating immutable references in the heap
image. The other changes employ other strategies to avoid trying to
update immutable references.
32MB was just slightly too large for PowerPC immediate call instructions
to span, and 16MB matches the JIT executable memory area we use in
compile.cpp.
We now create a unique thunk for each vtable position so as to avoid
relying on using the return address to determine what method is to be
compiled and invoked, since we will not have the correct return address
in the case of a tail call. This required refactoring how executable
memory is allocated in order to keep AOT compilation working. Also, we
must always use the same register to hold the class pointer when
compiling virtual calls, and ensure that the pointer stays there until
the call instruction is executed so we know where to find it in the
thunk.
bootimage.cpp
This is useful for debugging the compiler, since it allows you to
compile one method in particular if that's where a bug manifests itself.