AlvisNLP

corpus processing engine

TomapTrain

Synopsis

TomapTrain analyzes terms in preparation for the classification of candidates with ToMap.

Description

TomapTrain assumes that each sentence or section is a proxy term, as defined by the ToMap method. It analyzes the syntactic structure of the sections and stores these structures in outFile. Use this file to classify terms with TomapProjector. The identifier associated with each proxy is specified with conceptIdentifier.

conceptIdentifier is an expression evaluated as a string from the section or sentence.

Snippet

<tomaptrain class="TomapTrain">
    <conceptIdentifier></conceptIdentifier>
    <outFile></outFile>
    <rcFile></rcFile>
    <yateaExecutable></yateaExecutable>
</tomaptrain>

Mandatory parameters

conceptIdentifier

Mandatory
Type: Expression

An expression evaluated as a string from the section or sentence that specifies the identifier associated with the proxy.

outFile

Mandatory

Path to the file in which the proxy syntactic structures and their associated identifiers are stored.

rcFile

Mandatory

Path to the YaTeA configuration file.

yateaExecutable

Mandatory

Path to the YaTeA executable file.
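As an illustration, the four mandatory parameters could be filled in as in the sketch below. Every value is a hypothetical placeholder: the paths depend on the local YaTeA installation, and the expression `@id` assumes that each section or sentence carries a feature named `id` holding the concept identifier, and that features are read with the `@name` notation.

```xml
<tomaptrain class="TomapTrain">
    <!-- Assumption: each proxy has a feature named "id" with the concept identifier -->
    <conceptIdentifier>@id</conceptIdentifier>
    <!-- Hypothetical output path; pass this file to TomapProjector afterwards -->
    <outFile>output/proxies.tomap</outFile>
    <!-- Hypothetical locations of the YaTeA configuration and executable -->
    <rcFile>/path/to/yatea.rc</rcFile>
    <yateaExecutable>/path/to/yatea</yateaExecutable>
</tomaptrain>
```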

Optional parameters

configDir

Optional

language

Optional
Type: String

localeDir

Optional

outputDir

Optional

perlLib

Optional
Type: String

Value of the PERLLIB environment variable set for the YaTeA executable.

postProcessingConfig

Optional
Type: InputFile

BioYaTeA option: path to the post-processing configuration file.

postProcessingOutput

Optional
Type: OutputFile

BioYaTeA option: path to the result file after post-processing.

suffix

Optional
Type: String

termListFile

Optional
Type: OutputFile

Path to the file in which to write the candidate list produced by YaTeA.

xmlTermsFile

Optional
Type: OutputFile

Path to the XML file in which to write the candidates produced by YaTeA.

bioYatea

Default value: `false`
Type: Boolean

documentFilter

Default value: `true`
Type: Expression

UNDOCUMENTED

formFeature

Default value: `form`
Type: String

Feature containing the word form.

lemmaFeature

Default value: `lemma`
Type: String

Feature containing the word lemma.

posFeature

Default value: `pos`
Type: String

Feature containing the word POS tag.

sectionFilter

Default value: `true and layer:words`
Type: Expression

UNDOCUMENTED

sentenceLayer

Default value: `sentences`
Type: String

Name of the layer containing sentence annotations. Sentence boundaries are reinforced.

wordLayer

Default value: `words`
Type: String

Name of the layer containing the word annotations.
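The layer and feature parameters only need to be set when the corpus does not use the default names. The sketch below assumes a hypothetical corpus whose word annotations live in a layer named `tokens` and whose POS tags are stored in a feature named `tag`; both names and all paths are invented for illustration.

```xml
<tomaptrain class="TomapTrain">
    <conceptIdentifier>@id</conceptIdentifier>
    <outFile>output/proxies.tomap</outFile>
    <rcFile>/path/to/yatea.rc</rcFile>
    <yateaExecutable>/path/to/yatea</yateaExecutable>
    <!-- Hypothetical non-default names; omit these to keep "words" and "pos" -->
    <wordLayer>tokens</wordLayer>
    <posFeature>tag</posFeature>
    <!-- formFeature, lemmaFeature and sentenceLayer keep their defaults -->
</tomaptrain>
```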

yateaDefaultConfig

Default value: `{}`
Type: Mapping

yateaOptions

Default value: `{}`
Type: Mapping

Deprecated parameters

sentenceLayerName

Deprecated
Type: String

Deprecated alias for sentenceLayer.

wordLayerName

Deprecated
Type: String

Deprecated alias for wordLayer.

workingDir

Deprecated

Path to the directory where YaTeA is launched. This parameter is deprecated; use xmlTermsFile and termListFile instead.
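A plan that still sets workingDir might be migrated along the lines of the sketch below, which names the two YaTeA output files explicitly. The file paths are hypothetical placeholders, as are the values of the mandatory parameters.

```xml
<tomaptrain class="TomapTrain">
    <conceptIdentifier>@id</conceptIdentifier>
    <outFile>output/proxies.tomap</outFile>
    <rcFile>/path/to/yatea.rc</rcFile>
    <yateaExecutable>/path/to/yatea</yateaExecutable>
    <!-- Replaces the deprecated workingDir: explicit YaTeA output files -->
    <xmlTermsFile>output/yatea/candidates.xml</xmlTermsFile>
    <termListFile>output/yatea/termList.txt</termListFile>
</tomaptrain>
```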