Previously, I used a shell script to extract modification date ranges
from the Git history, but that was complicated and unreliable, so now
every file just gets the same year range in its copyright header. If
someone needs to know when a specific file was modified and by whom,
they can look at the Git history themselves; no need to include it
redundantly in the header.
This is necessary to avoid name conflicts on various platforms. For
example, iOS has its own util.h, and Windows has a process.h. By
including our version as e.g. "avian/util.h", we avoid confusion with
the system version.
It seems that older versions of GCC (4.0 and older, at least) generate
assembly files with duplicate symbols for function templates which
differ only by the attributes of the templated types. Newer versions
have no such problem, but we need to support both, hence the
workaround in this commit of using a dedicated, non-template "alias"
function where we previously used "cast<alias_t>".
We use a template function called "cast" to get raw access to fields
in in the VM. In particular, we use this function in util.cpp to
treat reference fields as intptr_t fields so we can use the least
significant bit as the red/black flag in red/black tree nodes.
Unfortunately, this runs afoul of the type aliasing rules in C/C++,
and the compiler is permitted to optimize in a way that assumes such
aliasing cannot occur. Such optimization caused all the nodes in the
tree to be black, leading to extremely unbalanced trees and thus slow
performance.
The fix in this case is to use the __may_alias__ attribute to tell the
compiler we're doing something devious. I've also used this technique
to avoid other potential aliasing problems. There may be others
lurking, so a complete audit of the VM might be a good idea.
This allows OpenJDK to access time zone data which is normally found
under java.home, but which we must embed in the executable itself to
create a self-contained build. The VM intercepts various file
operations, looking for paths which start with a prefix specified by
the avian.embed.prefix property and redirecting those operations to an
embedded JAR.
For example, if avian.embed.prefix is "/avian-embedded", and code
calls File.exists() with a path of
"/avian-embedded/javahomeJar/foo.txt", the VM looks for a function
named javahomeJar via dlsym, calls the function to find the memory
region containing the embeded JAR, and finally consults the JAR to see
if the file "foo.txt" exists.
The biggest change in this commit is to split the system classloader
into two: one for boot classes (e.g. java.lang.*) and another for
application classes. This is necessary to make OpenJDK's security
checks happy.
The rest of the changes include bugfixes and additional JVM method
implementations in classpath-openjdk.cpp.
This helps us support the Java Memory Model without adding a memory
barrier to every object allocation. It's also potentially more
efficient, since we zero out each heap segment all at once instead of
bit-by-bit with each object allocation.