This adds support for character classes such as \d or \W, leaving
\p{...}-style character classes as an exercise for later.
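As a rough illustration of what such shorthand classes boil down to, here is a minimal Python sketch. The helper name `class_matcher` and the ASCII-only character sets are assumptions made for this example, not the code added by this commit.

```python
import string

# Assumed ASCII-only definitions of the three shorthand classes.
_DIGITS = set(string.digits)
_WORD = set(string.ascii_letters + string.digits + "_")
_SPACE = set(" \t\n\r\f\v")


def class_matcher(letter):
    """Return a predicate for \\d, \\w, \\s or their negations \\D, \\W, \\S.

    `letter` is the character following the backslash (hypothetical API).
    """
    base = {"d": _DIGITS, "w": _WORD, "s": _SPACE}[letter.lower()]
    negate = letter.isupper()  # the upper-case variant inverts the class
    return lambda ch: (ch in base) != negate
```

For instance, `class_matcher("W")` accepts `-` but rejects `a`.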
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Now that we have non-greedy repeats, we can implement find(), which
essentially prefixes the regular expression pattern with '.*?'.
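The equivalence can be checked with any engine that offers an anchored match. The sketch below uses Python's standard `re` module as a stand-in for the engine in this repository; the function name `find_via_prefix` is made up for the example.

```python
import re


def find_via_prefix(pattern, text):
    """Emulate an unanchored find using only an anchored match.

    Returns the (start, end) span of the leftmost match, or None.
    """
    # The reluctant '.*?' prefix consumes as little as possible, so the
    # wrapped pattern matches at the leftmost possible position.
    # (?s:...) lets '.' skip over line breaks as well.
    m = re.match("(?s:.*?)(" + pattern + ")", text)
    return m.span(1) if m else None
```

For the inputs tried here, the result agrees with `re.search(pattern, text).span()`, e.g. `(2, 5)` for `\d+` on `ab123cd`.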
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Now that we have reluctant quantifiers, we can get rid of the hardcoded
program for the challenging regular expression pattern.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
While at it, let's get rid of the unescaping in TrivialPattern which was
buggy anyway: special operators such as \b were misinterpreted as trivial
patterns.
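One conservative way to avoid that class of bug is to treat a pattern as trivial only if it contains no metacharacter at all, backslash included. The following Python sketch assumes that policy; `is_trivial` and the metacharacter set are hypothetical and not taken from TrivialPattern itself.

```python
# Assumed set of characters that carry special meaning in a pattern.
_META = set(".^$*+?()[]{}|\\")


def is_trivial(pattern):
    """Return True if `pattern` can be matched as a plain literal string."""
    # Any backslash disqualifies the pattern, so escapes such as \b are
    # never mistaken for a literal 'b'.
    return not any(ch in _META for ch in pattern)
```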
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Originally, this developer wanted to (ab)use the PikeVM with a
hand-crafted program and an added "callback" opcode to parse the regular
expressions.
However, this turned out to be completely unnecessary: there are no
ambiguities in regular expression patterns, so there is no need to do
anything other than parse the pattern, one character at a time, into a
nested expression that then knows how to write itself into a program for
the PikeVM.
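The approach can be sketched in Python as a tiny recursive-descent parser whose nodes are then emitted as instructions. The opcode names (`char`, `any`, `split`, `jmp`, `match`), the tuple-based nodes, and the supported subset (literals, `.`, `|`, `*`, `+`, `?`, parentheses, greedy only) are assumptions for this illustration; the actual classes in this repository may look quite different.

```python
class Parser:
    """Parse a pattern, one character at a time, into nested tuples."""

    def __init__(self, pattern):
        self.p, self.i = pattern, 0

    def peek(self):
        return self.p[self.i] if self.i < len(self.p) else None

    def alternation(self):
        left = self.sequence()
        while self.peek() == "|":
            self.i += 1
            left = ("alt", left, self.sequence())
        return left

    def sequence(self):
        items = []
        while self.peek() not in (None, "|", ")"):
            items.append(self.repeat())
        return ("seq", items)

    def repeat(self):
        node = self.atom()
        while self.peek() in ("*", "+", "?"):
            node = (self.peek(), node)
            self.i += 1
        return node

    def atom(self):
        ch = self.p[self.i]
        self.i += 1
        if ch == "(":
            inner = self.alternation()
            self.i += 1  # skip ')'; no error checking in this sketch
            return inner
        return ("any",) if ch == "." else ("char", ch)


def emit(node, prog):
    """Append the instructions for `node` to the list `prog`."""
    kind = node[0]
    if kind == "char":
        prog.append(["char", node[1]])
    elif kind == "any":
        prog.append(["any"])
    elif kind == "seq":
        for item in node[1]:
            emit(item, prog)
    elif kind == "alt":
        split = ["split", len(prog) + 1, None]
        prog.append(split)
        emit(node[1], prog)
        jmp = ["jmp", None]
        prog.append(jmp)
        split[2] = len(prog)  # second branch starts here
        emit(node[2], prog)
        jmp[1] = len(prog)  # first branch jumps past the second
    elif kind == "*":
        start = len(prog)
        split = ["split", start + 1, None]
        prog.append(split)
        emit(node[1], prog)
        prog.append(["jmp", start])
        split[2] = len(prog)  # exit of the loop
    elif kind == "+":
        start = len(prog)
        emit(node[1], prog)
        prog.append(["split", start, len(prog) + 1])
    elif kind == "?":
        split = ["split", len(prog) + 1, None]
        prog.append(split)
        emit(node[1], prog)
        split[2] = len(prog)


def compile_pattern(pattern):
    """Parse `pattern` and return the instruction list, ending in 'match'."""
    prog = []
    emit(Parser(pattern).alternation(), prog)
    prog.append(["match"])
    return prog
```

In this sketch the first target of each `split` is the preferred one, so all quantifiers are greedy; a reluctant variant would swap the two targets.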
For the moment, we still hardcode the program for the regular expression
pattern demonstrating the challenge with the prioritized threads because
the compiler cannot yet parse reluctant operators.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>