old version was too slow to recognize nested identical paths because the
complete context needed to be analyzed.
now the token analysis saves meta information about whether a given
token is from the context or the target element. using this we can
detect nested identical paths by checking if the context starts at the
same position if two token lists are identical.
this can be done whenever the index list is full (no gaps), which is a
lot sooner than the end of the tokenCombinations call.
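
a minimal sketch of the check described above, assuming each token carries a flag that says whether it came from the context. `Token`, `contextStart` and `nestedIdentical` are hypothetical names for illustration, not the classes used by the token analysis.

```java
import java.util.List;

final class ContextStartSketch {
    /** Hypothetical token: its text plus the meta information "is from context". */
    static final class Token {
        final String text;
        final boolean fromContext;

        Token(String text, boolean fromContext) {
            this.text = text;
            this.fromContext = fromContext;
        }
    }

    /** Index of the first context token, or the list size if there is none. */
    static int contextStart(List<Token> tokens) {
        for (int i = 0; i < tokens.size(); i++) {
            if (tokens.get(i).fromContext) {
                return i;
            }
        }
        return tokens.size();
    }

    /** True if both lists contain the same token texts in the same order. */
    static boolean sameTokens(List<Token> a, List<Token> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (!a.get(i).text.equals(b.get(i).text)) {
                return false;
            }
        }
        return true;
    }

    /** Assumption: identical tokens plus the same context start means nested identical paths. */
    static boolean nestedIdentical(List<Token> a, List<Token> b) {
        return sameTokens(a, b) && contextStart(a) == contextStart(b);
    }
}
```

the point of the sketch is that only the flag position is compared, so the complete context does not have to be analyzed.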
our algorithm can't deal with recursive prefixes. that's not a huge
problem though since ANTLR uses LL(*) which can't deal with these
constructs either.
solution for the moment: if the minimal path difference analysis errors
out, check if more than 2 paths are recursive. if so, throw an exception.
possible improvement for later: just consider predicates and not tokens
- similar to ANTLR which falls back to LL(1)
renamed NestedPrefixAlternativesException
cleanup of guard classes
redo of parentheses avoidance
removed general start and end parentheses (antlr should be able to deal
with that)
because the hash code of TokenAnalysisPath objects changes on
add-operations, we use lists instead of sets and manually check for
collisions when merging
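
the underlying problem can be reproduced with plain JDK collections. the sketch below uses an `ArrayList` as a stand-in for a path object whose hash code depends on its content; `mergeInto` is a hypothetical helper showing the manual collision check, not the actual merge code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MutableHashSketch {
    /** Returns whether a HashSet still finds an element after the element was mutated. */
    static boolean setStillFindsAfterAdd() {
        List<String> path = new ArrayList<>();
        path.add("a");
        Set<List<String>> paths = new HashSet<>();
        paths.add(path);
        path.add("b"); // the hash code of the list changes here
        return paths.contains(path); // lookup uses the new hash code
    }

    /** Merges src into dst and checks for duplicates with equals() instead of hashing. */
    static <T> void mergeInto(List<T> dst, List<T> src) {
        // iterate over a copy so that merging a list into itself is safe
        for (T candidate : new ArrayList<>(src)) {
            if (!dst.contains(candidate)) {
                dst.add(candidate);
            }
        }
    }
}
```

the linear `contains` check is slower than hashing, but it stays correct when elements change after insertion.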
there is no need to throw an exception if a loop has no progress -> it
can be skipped
fixed bug with merge() in TokenAnalysisPaths: merging with empty path is
an identity operation but further merges with the same path cause
problems because of concurrent modifications in the LinkedHashSet
check progress on smallest branch in every iteration not just the first
-> now loops in token analysis should always exit, either by being done
or by tripping the endless loop protection
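
a rough sketch of such a loop, under the assumption that progress is measured as the length of the smallest branch. `runLoop`, `Outcome` and the `step` function are hypothetical and only illustrate the exit conditions.

```java
import java.util.function.IntUnaryOperator;

final class LoopProgressSketch {
    enum Outcome { DONE, NO_PROGRESS }

    /**
     * step maps the current length of the smallest branch to its length
     * after one more loop iteration (hypothetical stand-in for the analysis).
     */
    static Outcome runLoop(int start, int tokenLimit, IntUnaryOperator step) {
        int smallest = start;
        while (smallest < tokenLimit) {
            int next = step.applyAsInt(smallest);
            // progress is checked in every iteration, not just the first one
            if (next <= smallest) {
                return Outcome.NO_PROGRESS; // endless loop protection trips
            }
            smallest = next;
        }
        return Outcome.DONE;
    }
}
```

since the smallest branch either grows or the loop is aborted, the number of iterations is bounded by the token limit.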
use path in grammar as key for caching elements; using the elements
themselves doesn't work because the reference will be different for the
content assist grammar run.
this does not work for generated elements; the only place that generated
elements are necessary is when dealing with nested identical paths, in
this case caching is disabled.
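
a small sketch of a cache keyed by the path in the grammar, with a switch to bypass it. the class name, the path format (`"S/alternatives/0"`) and the `cachingEnabled` parameter are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class PathKeyedCacheSketch<V> {
    private final Map<String, V> cache = new HashMap<>();

    /**
     * Looks up the value for a grammar path. The path is a plain string, so
     * two grammar runs with different element instances share the same key.
     * With cachingEnabled == false (e.g. generated elements) the value is
     * always recomputed and never stored.
     */
    V lookup(String grammarPath, boolean cachingEnabled, Supplier<V> compute) {
        if (!cachingEnabled) {
            return compute.get();
        }
        V cached = cache.get(grammarPath);
        if (cached == null) {
            cached = compute.get();
            cache.put(grammarPath, cached);
        }
        return cached;
    }
}
```

a generated element has no stable position in the grammar, which is why the sketch skips the cache for it instead of inventing a key.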
the benchmark test case does not validate the result since we are only
interested in the runtime.
removed special case for non-trivial cardinality - use regular
findGuardForElement instead
adjusted test cases to follow antlr semantics - ignore following
predicates
made HoistingProcessor Singleton
preparation for better caching
tests predicate rendering in alternatives, in non-trivial cardinalities,
and setup block rendering, in both the production grammar and content
assist grammar.
added debug grammar generators for production grammar and content assist
grammar which use DebugGrammarNaming to avoid null-ptr because of
missing adapters in ecore objects. (existing AntlrDebugGrammarGenerator
can't be used because setup blocks are handled by AntlrGrammarGenerator
and AntlrContentAssistGrammarGenerator respectively, not
AbstractAntlrGrammarGenerator.)
fixed small details in rendered output (removed guard of rule comment
and trimmed setup block).
- added test case of recursive rules (with start rule) and optional
context
- disable cardinalities (and repetitions in unordered groups) in context
analysis if the current element was already seen and there was no
progress in this recursion - the element itself will not be returned by
getNextElementsInContext()
- changed isStartRule() and findAllRuleCalls() in GrammarUtils to only
compare the name of the rule. otherwise constructed paths cause problems
(since the rule object is not the same and the alternatives-attribute
might (very probably) not be equal in the constructed rule object).
- changed findGuardForOptionalCardinalityWithoutContext() in
HoistingProcessor to avoid the construction of virtual elements and
instead provide the virtual cardinality to the token analysis via a new
parameter.
- fixed testCardinalityQuestionmarkWithExternalContext_expectContextCheck()
test case (context analysis is not possible if the context equals the
remainder of the prefixed alternative; changed context to be different)
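
the name-based comparison from the GrammarUtils item above, sketched with a hypothetical `Rule` stand-in (the real rule objects are ecore objects; nothing here is the actual GrammarUtils code).

```java
import java.util.ArrayList;
import java.util.List;

final class RuleNameSketch {
    /** Hypothetical rule model: a name plus some content that may differ in constructed copies. */
    static final class Rule {
        final String name;
        final List<String> alternatives;

        Rule(String name, List<String> alternatives) {
            this.name = name;
            this.alternatives = alternatives;
        }
    }

    /** Compares only the name, so a constructed copy still matches the original rule. */
    static boolean sameRule(Rule a, Rule b) {
        return a.name.equals(b.name);
    }

    /** Returns all entries of calledRules that refer to target by name. */
    static List<Rule> findCallsOf(Rule target, List<Rule> calledRules) {
        List<Rule> result = new ArrayList<>();
        for (Rule called : calledRules) {
            if (sameRule(target, called)) {
                result.add(called);
            }
        }
        return result;
    }
}
```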
empty paths keep the state of the prefix including the empty flag
which causes empty paths in alternatives to be lost.
fix: create new token state with prefix (reset empty flag)
recursion (only tail recursion) causes problems because the context of the
rule call causes endless recursion in getNextElementsInContext()
example:
S: {S} $$ p0 $$?=> 's'
| $$ p1 $$?=> 's' s=S
;
solution: added set of visited rule calls to parameter list, skip rule
call if it was already handled by another recursion
added test cases
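
a simplified sketch of the visited-set idea, assuming a toy grammar model (rule name -> alternatives -> symbols, where a symbol equal to a rule name is a rule call). `follow` is a hypothetical stand-in and much simpler than getNextElementsInContext().

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class VisitedRuleCallsSketch {
    /**
     * Collects the symbols that can follow a call of the given rule.
     * visitedCalls holds the call sites that were already handled.
     */
    static Set<String> follow(Map<String, List<List<String>>> grammar,
                              String rule, Set<String> visitedCalls) {
        Set<String> result = new LinkedHashSet<>();
        for (Map.Entry<String, List<List<String>>> entry : grammar.entrySet()) {
            String owner = entry.getKey();
            List<List<String>> alternatives = entry.getValue();
            for (int a = 0; a < alternatives.size(); a++) {
                List<String> alternative = alternatives.get(a);
                for (int i = 0; i < alternative.size(); i++) {
                    if (!alternative.get(i).equals(rule)) {
                        continue;
                    }
                    String callSite = owner + "/" + a + "/" + i;
                    if (!visitedCalls.add(callSite)) {
                        continue; // already handled by another recursion
                    }
                    if (i + 1 < alternative.size()) {
                        result.add(alternative.get(i + 1));
                    } else {
                        // call is the last element: continue in the context of the owner
                        result.addAll(follow(grammar, owner, visitedCalls));
                    }
                }
            }
        }
        return result;
    }
}
```

for the tail-recursive rule from the example, the only call site of S is the last element of S itself; without the visited set the recursion into the context of S would never end.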
when context analysis is needed and an element in context has multiple
cardinality and contains empty paths (e.g. ?-quantified) the quantified
element will be recursed endlessly without the token analysis path ever
being done (since there is always an empty path) - this also applies to
unordered groups.
recognizing empty paths is not trivial because of recursive rules.
solution: save "call stack" (actually just a set of visited elements)
during recursions in context analysis. if an element is seen multiple
times we check if there is any progress in the analysis paths. if not,
this is an endless recursion
-> throw exception
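
the progress check can be sketched as follows, assuming progress is expressed as the minimal path length at the time an element is visited. the class, the string element ids and the exception type are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class RecursionProgressSketch {
    // element id -> minimal path length at the last visit
    private final Map<String, Integer> seen = new HashMap<>();

    /**
     * Records a visit of an element during context analysis.
     * Throws if the element was seen before and the paths did not grow since.
     */
    void enter(String element, int minimalPathLength) {
        Integer previous = seen.put(element, minimalPathLength);
        if (previous != null && previous >= minimalPathLength) {
            throw new IllegalStateException("endless recursion at " + element);
        }
    }
}
```

a repeated visit with progress is allowed in this sketch, which matters for recursive rules where seeing an element twice is normal.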
non-trivial quantifiers in context path are not handled correctly.
change to getNextElementsInContext():
add non-trivially quantified elements in path (except first element
because of potential endless recursion) to result set
when the following context element is optional, the minimal sequence is
not going to get bigger. in this case further context analysis is
blocked by the exception.
=> removed exception
also made sure getNextElementsInContext() won't return non-token,
non-compound elements like actions or predicates.
this is necessary because otherwise the identity analysis might not be able
to detect if paths are identical up to the token limit (which should
cause an error)
testRecursiveRuleCallingAlternative_expectCorrectGuard
guard for nested alternatives might not be optimal if the positions that
discriminate in the first alternative are different from the positions
needed in the next one
collapsed paths lose the containing token guard
=> fixed problem
+ added positional condition in token sequence guard constructor to not
check positions twice (order matters: give local token guard first in
case it is sufficient)
removed testNestedAlternativesWithSingleTokenDifference, because it is
wrong: 'a' 'b' 'd' with p1 satisfies semantic predicates but not the
generated guard condition
nested prefix alternatives analysis
now flattenPaths() considers following repetitions
new problem: unordered groups and non-trivial cardinalities without
non-optional elements cause an explosion of generated alternatives, which
in turn causes the identity analysis to go out of control
example: S: ('a'? | 'b'?)+
with a token limit of 10 the nested prefix alternatives analysis would
generate over 1300 alternatives
current quick fix: limit of alternatives
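
to illustrate the growth and the quick fix, here is a simplified enumeration of the token sequences of a group like ('a'? | 'b'?)+ up to a token limit, with a cap on the number of alternatives. this is a toy model with hypothetical names; it does not reproduce the exact count of the nested prefix alternatives analysis.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class AlternativeLimitSketch {
    /**
     * Enumerates all token sequences (including the empty one) that a repeated
     * group of optional tokens can produce with at most tokenLimit tokens.
     * Throws as soon as more than maxAlternatives sequences were generated.
     */
    static Set<String> expand(String[] tokens, int tokenLimit, int maxAlternatives) {
        Set<String> all = new LinkedHashSet<>();
        Set<String> frontier = new LinkedHashSet<>();
        all.add("");
        frontier.add("");
        for (int length = 1; length <= tokenLimit; length++) {
            Set<String> next = new LinkedHashSet<>();
            for (String prefix : frontier) {
                for (String token : tokens) {
                    next.add(prefix + token + " ");
                }
            }
            all.addAll(next);
            if (all.size() > maxAlternatives) {
                throw new IllegalStateException("too many alternatives: " + all.size());
            }
            frontier = next;
        }
        return all;
    }
}
```

in this model the number of sequences roughly doubles with every additional token for two optional elements, so a cap is the cheapest way to stop the analysis early.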
after minimal path difference analysis failed, flatten paths (limited by
token limit; justification: identity check would error out if paths are
not distinguishable within the limit) and recompute alternative guard.
now nested prefixes will be collapsed with all corresponding guard
conditions