Stanza
Synopsis
Applies a Stanza pipeline on the sections.
This module is experimental.
Description
Stanza applies a Stanza pipeline on the contents of sections.
By default the pipeline tokenizes and predicts POS-tags. Stanza also applies dependency parsing if parse is set, constituency parsing if constituency is set, and named entity recognition if ner is set.
Tokenization can be skipped by setting pretokenized, in which case Stanza uses the existing tokens and sentences.
Snippet
<stanza class="Stanza">
  <alvisnlpPythonDirectory></alvisnlpPythonDirectory>
</stanza>
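As a sketch of how the options in the description combine, the following variant enables dependency parsing and named entity recognition. It assumes that alvisnlpPythonDirectory was set at install time (so it is omitted), that boolean parameters accept the value true, and that language takes a Stanza language code such as en. None of these value formats is confirmed by this page.

```xml
<stanza class="Stanza">
  <!-- Assumed value format: a Stanza language code -->
  <language>en</language>
  <!-- Adds dependency parsing to the default tokenization and POS tagging -->
  <parse>true</parse>
  <!-- Adds NER; named entities are stored in the layer named "entities" -->
  <ner>true</ner>
</stanza>
```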
Mandatory parameters
alvisnlpPythonDirectory
Directory where the AlvisNLP Python library is found. Normally this parameter is set by default when AlvisNLP is installed.
Optional parameters
conda
Path to the conda executable. If not set, Stanza uses the conda executable found in PATH. This parameter is ignored if condaEnvironment is not set.
condaEnvironment
Name of the conda environment in which the script must be run. If this parameter is not set, then the script is not run in a conda environment.
constantAnnotationFeatures
Constant features to add to each annotation created by this module.
constantDocumentFeatures
Constant features to add to each document created by this module.
constantRelationFeatures
Constant features to add to each relation created by this module.
constantSectionFeatures
Constant features to add to each section created by this module.
constantTupleFeatures
Constant features to add to each tuple created by this module.
environment
Additional environment variables to pass to the script.
python
Path to the Python executable. By default, the PATH environment variable determines the location of the Python executable.
workingDirectory
Directory in which to run the script. By default, this is the working directory of AlvisNLP.
constituency
Whether to predict constituents.
documentFilter
Only process documents that satisfy this expression.
language
Language of the text.
ner
Whether to perform named entity recognition (NER). Named entities are stored in a layer named entities.
parse
Whether to predict dependency trees.
pretokenized
Whether to skip tokenization and use the existing tokens and sentences.
sectionFilter
Process only sections that satisfy this expression.
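For illustration, the sketch below runs the module inside a conda environment and reuses existing tokens and sentences. The environment name stanza-env is hypothetical, and the value true for pretokenized is an assumed boolean format. Because conda is not set here, the conda executable would be looked up in PATH, as described for the conda parameter.

```xml
<stanza class="Stanza">
  <!-- Hypothetical conda environment name -->
  <condaEnvironment>stanza-env</condaEnvironment>
  <!-- Skip tokenization and reuse existing tokens and sentences -->
  <pretokenized>true</pretokenized>
</stanza>
```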