PythonScript
Synopsis
Runs a Python script. This module is useful for processing the corpus with Python libraries dedicated to NLP.
This module is experimental.
Description
PythonScript assumes that the script reads the AlvisNLP data structure, serialized as JSON, from standard input. It also assumes that the script writes its modifications, serialized as JSON, to standard output, unless outputFile is set.
The alvisnlp.py library facilitates the deserialization, serialization, and manipulation of the AlvisNLP data structure. It is located in the directory specified by alvisnlpPythonDirectory.
The script to run is specified with script.
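The input/output contract described above can be sketched as a small script skeleton. This is only a minimal illustration using the Python standard library: it does not use alvisnlp.py, and it treats the JSON value as opaque because the layout of the serialized data structure, and what counts as a valid modification payload, are not described here. The names transform, run, run_on_text and log are hypothetical.

```python
import io
import json
import sys


def log(message):
    # Standard output is reserved for the JSON result, so diagnostics
    # go to standard error instead.
    print(message, file=sys.stderr)


def transform(data):
    """Placeholder for the actual processing.

    Assumption: the structure is treated as an opaque JSON value and
    returned unchanged; a real script would build its modifications here.
    """
    return data


def run(stream_in, stream_out):
    """Read one JSON value from stream_in and write the result to stream_out."""
    data = json.load(stream_in)
    result = transform(data)
    json.dump(result, stream_out)
    stream_out.write("\n")


def run_on_text(text):
    """Convenience for trying the skeleton without AlvisNLP: text in, text out."""
    out = io.StringIO()
    run(io.StringIO(text), out)
    return out.getvalue()


def main():
    # When launched by PythonScript, the data arrives on standard input and
    # the result is expected on standard output.
    run(sys.stdin, sys.stdout)
```

A real script would end with a call to main(). Keeping the stream handling in run makes it possible to exercise the script on a saved JSON file or an in-memory string before plugging it into a plan.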
Snippet
<pythonscript class="PythonScript">
<alvisnlpPythonDirectory></alvisnlpPythonDirectory>
<script></script>
</pythonscript>
Mandatory parameters
alvisnlpPythonDirectory
Directory where the AlvisNLP Python library is found. In principle, this parameter is set by default when AlvisNLP is installed.
script
Path to the script to run.
Optional parameters
conda
Path to the conda executable. If not set, PythonScript uses the conda executable found in PATH. This parameter is ignored if condaEnvironment is not set.
condaEnvironment
Name of the conda environment in which the script must be run. If this parameter is not set, then the script is not run in a conda environment.
constantAnnotationFeatures
Constant features to add to each annotation created by this module.
constantDocumentFeatures
Constant features to add to each document created by this module.
constantRelationFeatures
Constant features to add to each relation created by this module.
constantSectionFeatures
Constant features to add to each section created by this module.
constantTupleFeatures
Constant features to add to each tuple created by this module.
environment
Additional environment variables, with their values, to pass to the script.
layers
Names of layers to serialize. Layers not mentioned in this parameter will not be serialized. Use this to limit the amount of serialized data. By default PythonScript serializes all annotations in all layers.
outputFile
Path of the file where the script's standard output is written. If this parameter is set, PythonScript does not read the script output for modifications.
python
Path to the python executable. By default, the PATH environment variable determines the location of the Python executable.
relations
Names of relations to serialize. Relations not mentioned in this parameter will not be serialized. Use this to limit the amount of serialized data. By default PythonScript serializes all tuples in all relations.
workingDirectory
Directory in which to run the script. Defaults to the working directory of AlvisNLP.
callPython
Whether to invoke the Python interpreter as the executable and pass the script to it. If this parameter is false, then the user must have execution rights on the script, and the script must have the appropriate shebang to locate the Python interpreter.
commandLine
Additional command line arguments to pass to the script.
documentFilter
Only process documents that satisfy this expression.
scriptParams
Parameters to pass through the serialized data structure. Expressions are evaluated from the corpus as strings.
sectionFilter
Process only sections that satisfy this expression.
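On the script side, values supplied through environment and commandLine can be read with the standard library. The sketch below is an illustration under assumptions: MODEL_DIR and --batch-size are hypothetical names chosen for this example, not names defined by PythonScript. The shebang line only matters when callPython is false and the script is executed directly.

```python
#!/usr/bin/env python3
# The shebang above is only used when the script is executed directly
# (callPython set to false); the script then also needs execute permission.
import argparse
import os
import sys


def read_settings(argv, environ):
    """Collect settings from command-line arguments and environment variables.

    MODEL_DIR and --batch-size are hypothetical names used for illustration.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch-size", type=int, default=16)
    args = parser.parse_args(argv)
    return {
        # Fall back to the current directory when the variable is absent.
        "model_dir": environ.get("MODEL_DIR", "."),
        "batch_size": args.batch_size,
    }


def main():
    # sys.argv[0] is the script path, so only the remaining items are parsed.
    settings = read_settings(sys.argv[1:], os.environ)
    # Report on standard error to keep standard output free for JSON.
    print(settings, file=sys.stderr)
```

Passing argv and environ explicitly, rather than reading sys.argv and os.environ inside the function, keeps the settings logic easy to check outside of an AlvisNLP run.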
Deprecated parameters
layerNames
Deprecated alias for layers.
relationNames
Deprecated alias for relations.