AlvisNLP

corpus processing engine

GeniaTagger

Synopsis

Runs Genia Tagger on annotations.

Description

GeniaTagger executes the Genia Tagger on annotations from the layer wordLayer and records the results in the features specified by posFeature, lemmaFeature, chunkFeature and entityFeature. GeniaTagger respects the sentence boundaries specified by annotations in the sentenceLayer layer.

Snippet

<geniatagger class="GeniaTagger">
    <geniaDir></geniaDir>
</geniatagger>

Mandatory parameters

geniaDir

Mandatory
Type: File

Directory where geniatagger is installed.
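
As an illustration, a filled-in version of the snippet might look like the following. The path `/opt/geniatagger` is a hypothetical example, not a default; substitute the directory where Genia Tagger is installed on your system.

```xml
<!-- Hypothetical path: replace with your actual Genia Tagger installation directory -->
<geniatagger class="GeniaTagger">
    <geniaDir>/opt/geniatagger</geniaDir>
</geniatagger>
```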

Optional parameters

chunkFeature

Optional
Type: String

Feature in which to store the chunk status.

entityFeature

Optional
Type: String

Feature in which to store the entity status.
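
Neither chunkFeature nor entityFeature has a documented default value, so a plan that needs the chunk and entity output presumably has to name the target features explicitly. A minimal sketch, in which the feature names `chunk` and `entity` and the installation path are arbitrary choices made for illustration:

```xml
<geniatagger class="GeniaTagger">
    <!-- Hypothetical installation path -->
    <geniaDir>/opt/geniatagger</geniaDir>
    <!-- Feature names chosen for illustration; no default is documented -->
    <chunkFeature>chunk</chunkFeature>
    <entityFeature>entity</entityFeature>
</geniatagger>
```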

documentFilter

Default value: `true`
Type: Expression

Only process documents that satisfy this expression.

geniaCharset

Default value: `UTF-8`
Type: String

Character encoding of geniatagger input and output.

geniaTaggerExecutable

Default value: `./geniatagger`
Type: File

Name of the geniatagger executable file.

lemmaFeature

Default value: `lemma`
Type: String

Feature in which to store the word lemma.

posFeature

Default value: `pos`
Type: String

Feature in which to store the POS tag.

sectionFilter

Default value: `true and layer:sentences and layer:words`
Type: Expression

Process only sections that satisfy this expression.

sentenceFilter

Default value: `true`
Type: Expression

Evaluated as a boolean with the sentence annotation as the context element. GeniaTagger only processes the sentence if the result is true. To skip sentences that are too long for Genia Tagger, use `length < 1024`.
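
Because the snippet format is XML, the less-than sign in the length expression has to be escaped as `&lt;` when it is written as element content. A sketch of the long-sentence filter, assuming the parameter is set as a child element like geniaDir in the snippet and using a hypothetical installation path:

```xml
<geniatagger class="GeniaTagger">
    <!-- Hypothetical installation path -->
    <geniaDir>/opt/geniatagger</geniaDir>
    <!-- Process only sentences whose length is below 1024; the less-than sign is escaped for XML -->
    <sentenceFilter>length &lt; 1024</sentenceFilter>
</geniatagger>
```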

sentenceLayer

Default value: `sentences`
Type: String

Name of the layer containing sentence annotations.

treeTaggerTagset

Default value: `false`
Type: Boolean

UNDOCUMENTED

wordFormFeature

Default value: `form`
Type: String

Feature containing the word surface form.

wordLayer

Default value: `words`
Type: String

Name of the layer containing word annotations.
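
If the sentence and word annotations live in layers with other names, both layer parameters can be set explicitly. The default sectionFilter refers to `layer:sentences` and `layer:words`, and the documentation does not say whether it follows the layer parameters, so this sketch also restates the filter with the new names. The layer names `sents` and `tokens` and the installation path are hypothetical.

```xml
<geniatagger class="GeniaTagger">
    <!-- Hypothetical installation path -->
    <geniaDir>/opt/geniatagger</geniaDir>
    <!-- Hypothetical layer names produced by earlier modules in the plan -->
    <sentenceLayer>sents</sentenceLayer>
    <wordLayer>tokens</wordLayer>
    <!-- Assumption: the section filter must be adjusted by hand to match the layer names -->
    <sectionFilter>layer:sents and layer:tokens</sectionFilter>
</geniatagger>
```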

Deprecated parameters

sentenceLayerName

Deprecated
Type: String

Deprecated alias for sentenceLayer.

wordLayerName

Deprecated
Type: String

Deprecated alias for wordLayer.