AlvisNLP

corpus processing engine

RegExp

Synopsis

Matches a regular expression against the contents of sections and creates an annotation for each match.

Description

RegExp searches for pattern in the contents of sections and creates an annotation for each match. Each annotation spans the entire match and is added to the layer named targetLayer of the corresponding section. If pattern contains groups, the elements inside the groups are matched as usual, but the grouping is not taken into account when the annotation is created.

Each created annotation automatically carries all features defined in constantAnnotationFeatures.

Snippet

<regexp class="RegExp">
    <pattern></pattern>
    <targetLayer></targetLayer>
</regexp>
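
As an illustration of the behaviour described above, here is a filled-in version of the snippet. It is only a sketch: the pattern and the layer name `enzymes` are hypothetical values chosen for this example, not defaults of the module.

```xml
<regexp class="RegExp">
    <!-- Hypothetical pattern: a word followed by "kinase" or "phosphatase".
         The group (kinase|phosphatase) takes part in the matching, but a
         single annotation is created that spans the entire match. -->
    <pattern>[A-Za-z]+ (kinase|phosphatase)</pattern>
    <!-- Hypothetical layer name: annotations are added to this layer
         in the section where the match was found. -->
    <targetLayer>enzymes</targetLayer>
</regexp>
```

Under these assumptions, a section containing the text "protein kinase" would receive one annotation covering "protein kinase" in the `enzymes` layer, and no separate annotation for the group.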

Mandatory parameters

pattern

Mandatory
Type: Pattern

Regular expression to match.

targetLayer

Mandatory
Type: String

Name of the layer in which to store the annotations created for matches.

Optional parameters

constantAnnotationFeatures

Optional
Type: Mapping

Constant features to add to each annotation created by this module.

documentFilter

Default value: `true`
Type: Expression

Process only documents that satisfy this expression.

sectionFilter

Default value: `true`
Type: Expression

Process only sections that satisfy this expression.

Deprecated parameters

targetLayerName

Deprecated
Type: String

Deprecated alias for targetLayer.