AlvisNLP

corpus processing engine

PatternMatcher

Synopsis

Matches a regular expression-like pattern on the sequence of annotations in a given layer.

Description

PatternMatcher searches for `pattern` in the sequence of annotations in the layer `layer`. Within a layer, annotations are sorted by increasing start boundary, then by decreasing end boundary; the order is undefined for annotations with exactly the same span.
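
The ordering rule above can be illustrated with a small Python sketch. This is not AlvisNLP code: the `Annotation` tuple, the `layer_order` function, and the sample offsets are hypothetical and only mirror the sort order described here.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    # Hypothetical stand-in for an annotation: character offsets plus surface form.
    start: int
    end: int
    form: str


def layer_order(annotations):
    """Sort by increasing start boundary, then by decreasing end boundary."""
    return sorted(annotations, key=lambda a: (a.start, -a.end))


# Offsets taken from the made-up text "the alpha globulin".
annotations = [
    Annotation(4, 9, "alpha"),
    Annotation(10, 18, "globulin"),
    Annotation(0, 3, "the"),
    Annotation(4, 18, "alpha globulin"),
]

ordered = layer_order(annotations)
for a in ordered:
    print(a.start, a.end, a.form)
```

With this order, an annotation that starts at the same position as another but spans more text comes first, so "alpha globulin" precedes "alpha". Python's `sorted` is stable and keeps the input order of annotations with identical spans; the module makes no such guarantee.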

For each match, PatternMatcher applies all the actions specified by `actions`. Each action targets a sub-group of the pattern; if no sub-group is specified, the action applies to the whole match.
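
The idea of running actions on a sub-group or on the whole match can be sketched by analogy with Python's `re` module. Everything below is an assumption made for illustration: the one-letter encoding of part-of-speech tags, the `apply_actions` helper, and the action labels are hypothetical and do not reflect the ElementPattern syntax or the MatchActionArray actions.

```python
import re

# Hypothetical annotation sequence: (form, part-of-speech tag).
tokens = [
    ("the", "DT"),
    ("small", "JJ"),
    ("protein", "NN"),
    ("binds", "VB"),
    ("DNA", "NN"),
]

# Encode each annotation as one character so that an ordinary regular
# expression can be matched against the sequence of annotations.
CODES = {"DT": "D", "JJ": "J", "NN": "N", "VB": "V"}
encoded = "".join(CODES[pos] for _, pos in tokens)

# Group 1 captures the adjectives, group 2 the head noun.
pattern = re.compile(r"D?(J*)(N)")


def apply_actions(tokens, encoded, pattern, actions):
    """Run every action on every match.

    Each action is a (group, label) pair; group 0 means the whole match.
    """
    results = []
    for match in pattern.finditer(encoded):
        for group, label in actions:
            start, end = match.span(group)
            if start < end:  # skip empty or non-participating groups
                forms = [form for form, _ in tokens[start:end]]
                results.append((label, forms))
    return results


actions = [(0, "noun_phrase"), (2, "head")]
results = apply_actions(tokens, encoded, pattern, actions)
for label, forms in results:
    print(label, forms)
```

Here the first action has no sub-group and therefore covers the whole match, while the second is restricted to the head noun captured by group 2.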

Snippet

<patternmatcher class="PatternMatcher">
    <actions></actions>
    <pattern></pattern>
</patternmatcher>

Mandatory parameters

actions

Mandatory

Actions to perform each time the pattern is matched on the annotation sequence. See MatchActionArray for all available actions.

pattern

Mandatory

Pattern to match. See ElementPattern for the pattern syntax.

Optional parameters

constantAnnotationFeatures

Optional
Type: Mapping

Constant features to add to each annotation created by this module.

constantRelationFeatures

Optional
Type: Mapping

Constant features to add to each relation created by this module.

constantTupleFeatures

Optional
Type: Mapping

Constant features to add to each tuple created by this module.

annotationComparator

Default value: `length`

Comparator to use when removing overlaps.

documentFilter

Default value: `true`
Type: Expression

Process only documents that satisfy this expression.

layer

Default value: `words`
Type: String

Match the pattern on the annotations contained in this layer.

overlappingBehaviour

Default value: `remove`

What to do if the layer contains overlapping annotations.

sectionFilter

Default value: `true and layer:words`
Type: Expression

Process only sections that satisfy this expression.

Deprecated parameters

layerName

Deprecated
Type: String

Deprecated alias for `layer`.