Most of these regressions were simply due to testing a lot more stuff,
esp. annotations and reflection, revealing holes in the Android
compatibility code. There are still some holes, but at least the
suite is passing (except for a fragile test in Serialize.java which I
will open an issue for).
Sorry this is such a big commit; there was more to address than I
initially expected.
In the Android class path, TreeMap is implemented differently and as a
consequence its serialization is incompatible with OpenJDK's. So let's
test the serialization of a private static class instead, to make sure
that the wire protocol defined by the Java Object Serialization
Specification is implemented.
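
A minimal sketch of such a round-trip check, assuming a hypothetical
private static class Point (the class and helper names below are
illustrative and not taken from Serialize.java):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializeRoundTrip {
    // Hypothetical stand-in for the private static class used by the test.
    private static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final String label;
        Point(int x, String label) { this.x = x; this.label = label; }
    }

    // Writes the object with the standard serialization stream format.
    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads a single object back from the given bytes.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes a Point, reads it back and summarizes the copy's fields.
    static String roundTrip(int x, String label) {
        Point copy = (Point) deserialize(serialize(new Point(x, label)));
        return copy.x + ":" + copy.label;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42, "answer")); // prints 42:answer
    }
}
```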
This addresses issue #123 reported by Joel Dice.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The intent of this target is to run our test suite against the installed jre.
This should help prevent our VM from diverging in implementation from the jdk.
The remainder of this commit fixes the problems that this exposes.
OpenJDK's regex engine can only handle look-behinds of bounded length.
So let's just test for that, not the unbounded one we had before (which
our own regex engine handles just fine, though).
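
To illustrate what a bounded look-behind looks like, here is a small
sketch using java.util.regex; the pattern and the helper name are
chosen for illustration and are not the ones used in the test suite:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundedLookBehind {
    // The look-behind has an explicit upper bound ({1,3}), so its
    // maximum length is known when the pattern is compiled.
    static final Pattern BOUNDED = Pattern.compile("(?<=a{1,3})b");

    // Returns the start index of the first match, or -1 if there is none.
    static int firstMatchStart(String input) {
        Matcher m = BOUNDED.matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(firstMatchStart("aab")); // prints 2
        System.out.println(firstMatchStart("cb"));  // prints -1
    }
}
```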
This fixes issue #115.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Method.invoke should initialize its class before invoking the method,
throwing an ExceptionInInitializerError if it fails, without wrapping
said error in an InvocationTargetException.
Also, we must initialize ExceptionInInitializerError.exception when
throwing instances from the VM, since OpenJDK's
ExceptionInInitializerError.getCause uses the exception field, not the
cause field.
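
The expected behavior can be sketched as follows. The class Broken and
the helper names are hypothetical and only serve to show an initializer
failure surfacing through Method.invoke:

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class InitializerFailureDemo {
    // Hypothetical class whose static initialization always fails.
    static class Broken {
        static final int VALUE = compute();
        static int compute() { throw new IllegalStateException("boom"); }
        static void hello() {}
    }

    private static String cached;

    // The probe runs only once: a second initialization attempt of the
    // same class would raise NoClassDefFoundError instead.
    static synchronized String invokeOutcome() {
        if (cached == null) cached = probe();
        return cached;
    }

    private static String probe() {
        try {
            // Looking up the method does not initialize Broken yet.
            Method m = Broken.class.getDeclaredMethod("hello");
            m.invoke(null); // initialization is triggered here
            return "no error";
        } catch (ExceptionInInitializerError e) {
            return "ExceptionInInitializerError caused by "
                + e.getCause().getClass().getSimpleName();
        } catch (InvocationTargetException e) {
            return "InvocationTargetException";
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // prints ExceptionInInitializerError caused by IllegalStateException
        System.out.println(invokeOutcome());
    }
}
```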
Inner classes can have inner classes, but getDeclaredClasses() is
supposed to list *only* the immediate inner classes.
Example: if class Reflection contains a class Hello that contains
a class World, Reflection.class.getDeclaredClasses() must not
include World in its result.
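
The example above, written out as a sketch (declaredClassNames is a
helper introduced here for illustration only):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Reflection {
    static class Hello {
        static class World {}
    }

    // Lists the simple names of the classes declared directly in c.
    static List<String> declaredClassNames(Class<?> c) {
        List<String> names = new ArrayList<>();
        for (Class<?> inner : c.getDeclaredClasses()) {
            names.add(inner.getSimpleName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(declaredClassNames(Reflection.class)); // prints [Hello]
        System.out.println(declaredClassNames(Hello.class));      // prints [World]
    }
}
```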
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The particular pattern we use to test it is used in ImgLib2, based on
this answer on Stack Overflow:
http://stackoverflow.com/a/279337
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This adds support for character classes such as \d or \W, leaving \p{...}
style character classes as an exercise for later.
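
For reference, a short sketch of how such character classes behave,
shown with java.util.regex (the helper name is illustrative):

```java
import java.util.regex.Pattern;

class CharacterClassDemo {
    // Returns whether the whole input matches the given regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\d+", "2013")); // prints true
        System.out.println(matches("\\W", "!"));     // prints true
        System.out.println(matches("\\W", "a"));     // prints false
    }
}
```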
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A program for the PikeVM corresponds to a regular expression pattern. The
program matches the character sequence in left-to-right order. However,
for look-behind expressions, we will want to match the character sequence
backwards.
To this end, it is nice that regular expression patterns can be reversed
in a straightforward manner. However, we would like to avoid multiple
parsing passes: we simply parse look-behind expressions as if they were
look-ahead ones, and then reverse the program for that part.
Happily, it is not difficult to reverse the program so it is equivalent to
matching the pattern backwards.
There is one catch, though. Imagine matching the sequence "a" against the
regular expression "(a?)a?". If we match forward, the group will match the
letter "a"; when matching backwards, it will match the empty string. So
the reverse pattern is equivalent to the forward pattern in terms of
"does the pattern match that sequence", but not in its sub-matches. For that
reason, Java simply ignores capturing groups in look-behind patterns (and
for consistency, the same holds for look-ahead patterns).
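
The difference in sub-matches can be reproduced with java.util.regex.
This sketch emulates backwards matching by reversing the pattern by
hand ("a?(a?)") and matching it against the reversed input, which is an
approximation of the reversed program rather than the PikeVM itself:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReversedGroupDemo {
    // Returns what group 1 captured when the regex matches the whole input.
    static String group1(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("no match");
        }
        return m.group(1);
    }

    public static void main(String[] args) {
        // Forward: the group greedily takes the only "a".
        System.out.println("[" + group1("(a?)a?", "a") + "]"); // prints [a]
        // Hand-reversed pattern on the reversed input (still "a"):
        // the leading a? takes the "a", so the group is left empty.
        System.out.println("[" + group1("a?(a?)", "a") + "]"); // prints []
    }
}
```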
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Now that we have non-greedy repeats, we can implement find() (which
essentially prefixes the regular expression pattern with '.*?').
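
The idea can be sketched with java.util.regex: anchoring a pattern that
is prefixed with a reluctant '.*?' at the start of the input finds the
same position as find(). The helper names and the use of the DOTALL
flag (so that '.' also skips line terminators) are assumptions made for
this illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAsPrefixDemo {
    // Start index of the first match according to find(), or -1.
    static int findStart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    // Same question, answered by matching ".*?(regex)" anchored at index 0.
    static int prefixedStart(String regex, String input) {
        Matcher m = Pattern.compile("(?s).*?(" + regex + ")").matcher(input);
        return m.lookingAt() ? m.start(1) : -1;
    }

    public static void main(String[] args) {
        System.out.println(findStart("\\d+", "abc 123 def 456"));     // prints 4
        System.out.println(prefixedStart("\\d+", "abc 123 def 456")); // prints 4
    }
}
```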
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Now that we have reluctant quantifiers, we can get rid of the hardcoded
program for the challenging regular expression pattern.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
While at it, let's get rid of the unescaping in TrivialPattern which was
buggy anyway: special operators such as \b were misinterpreted as trivial
patterns.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Originally, this developer wanted to (ab)use the PikeVM with a
hand-crafted program and an added "callback" opcode to parse the regular
expressions.
However, this turned out to be completely unnecessary: there are no
ambiguities in regular expression patterns, so there is no need to do
anything other than parse the pattern, one character at a time, into a
nested expression that then knows how to write itself into a program for
the PikeVM.
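
A much reduced sketch of that approach follows. It handles only literal
characters, grouping parentheses and the '?' operator, and it emits a
hypothetical textual instruction set (char, split, match); the real
compiler's classes and opcodes may well differ:

```java
import java.util.ArrayList;
import java.util.List;

class TinyRegexCompiler {
    // Every parsed node knows how to append its own instructions.
    interface Expression { void writeCode(List<String> program); }

    static final class Literal implements Expression {
        final char c;
        Literal(char c) { this.c = c; }
        public void writeCode(List<String> program) { program.add("char " + c); }
    }

    static final class Sequence implements Expression {
        final List<Expression> parts = new ArrayList<>();
        public void writeCode(List<String> program) {
            for (Expression e : parts) e.writeCode(program);
        }
    }

    static final class OptionalExpression implements Expression {
        final Expression inner;
        OptionalExpression(Expression inner) { this.inner = inner; }
        public void writeCode(List<String> program) {
            int split = program.size();
            program.add(null); // placeholder, patched once the end is known
            inner.writeCode(program);
            // Prefer entering the inner expression, else jump past it.
            program.set(split, "split " + (split + 1) + " " + program.size());
        }
    }

    private final String pattern;
    private int pos;

    TinyRegexCompiler(String pattern) { this.pattern = pattern; }

    // Consumes characters up to the end of the pattern or a closing ')'.
    Sequence parseSequence() {
        Sequence seq = new Sequence();
        while (pos < pattern.length() && pattern.charAt(pos) != ')') {
            char c = pattern.charAt(pos++);
            Expression e;
            if (c == '(') {
                e = parseSequence();
                if (pos >= pattern.length()) {
                    throw new IllegalArgumentException("missing ')'");
                }
                pos++; // skip the ')'
            } else {
                e = new Literal(c);
            }
            if (pos < pattern.length() && pattern.charAt(pos) == '?') {
                pos++;
                e = new OptionalExpression(e);
            }
            seq.parts.add(e);
        }
        return seq;
    }

    static List<String> compile(String pattern) {
        TinyRegexCompiler compiler = new TinyRegexCompiler(pattern);
        Sequence seq = compiler.parseSequence();
        if (compiler.pos != pattern.length()) {
            throw new IllegalArgumentException("unbalanced ')'");
        }
        List<String> program = new ArrayList<>();
        seq.writeCode(program);
        program.add("match");
        return program;
    }

    public static void main(String[] args) {
        // prints [char a, split 2 4, char b, char c, char d, match]
        System.out.println(compile("a(bc)?d"));
    }
}
```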
For the moment, we still hardcode the program for the regular expression
pattern demonstrating the challenge with the prioritized threads because
the compiler cannot yet parse reluctant operators.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This will be used to match character classes (such as '[0-9a-f]'),
but it will also be used by the regular expression pattern compiler
to determine whether a character has special meaning in regular
expressions.
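
Such a matcher could look roughly like this sketch. It is limited to
ASCII, and the class name, the method names and the set of special
characters are assumptions made for illustration:

```java
class SimpleCharacterMatcher {
    // One flag per ASCII character; larger characters never match here.
    private final boolean[] map = new boolean[128];

    // Marks every character in the inclusive range from..to.
    SimpleCharacterMatcher addRange(char from, char to) {
        if (from > to || to >= map.length) {
            throw new IllegalArgumentException("unsupported range");
        }
        for (int c = from; c <= to; c++) {
            map[c] = true;
        }
        return this;
    }

    // Marks each character of the given string individually.
    SimpleCharacterMatcher add(String chars) {
        for (int i = 0; i < chars.length(); i++) {
            addRange(chars.charAt(i), chars.charAt(i));
        }
        return this;
    }

    boolean matches(char c) {
        return c < map.length && map[c];
    }

    // Equivalent of the character class [0-9a-f].
    static final SimpleCharacterMatcher HEX =
        new SimpleCharacterMatcher().addRange('0', '9').addRange('a', 'f');

    // Characters a pattern compiler might treat as special (assumed set).
    static final SimpleCharacterMatcher SPECIAL =
        new SimpleCharacterMatcher().add("\\.*+?|[]{}()^$");

    public static void main(String[] args) {
        System.out.println(HEX.matches('c'));     // prints true
        System.out.println(HEX.matches('g'));     // prints false
        System.out.println(SPECIAL.matches('*')); // prints true
    }
}
```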
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>