AlvisNLP

corpus processing engine

TyDIExportProjector

Synopsis

UNDOCUMENTED

Description

UNDOCUMENTED

Snippet

<tydiexportprojector class="TyDIExportProjector">
    <lemmaFile></lemmaFile>
    <mergeFile></mergeFile>
    <quasiSynonymsFile></quasiSynonymsFile>
    <synonymsFile></synonymsFile>
    <targetLayer></targetLayer>
</tydiexportprojector>

Mandatory parameters

lemmaFile

Mandatory

UNDOCUMENTED

mergeFile

Mandatory

UNDOCUMENTED

quasiSynonymsFile

Mandatory

UNDOCUMENTED

synonymsFile

Mandatory

UNDOCUMENTED

targetLayer

Mandatory
Type: String

Name of the layer that contains the match annotations.

Optional parameters

acronymsFile

Optional

UNDOCUMENTED

constantAnnotationFeatures

Optional
Type: Mapping

Constant features to add to each annotation created by this module.

saveDictFile

Optional

UNDOCUMENTED

trieSink

Optional
Type: OutputFile

If set, then TyDIExportProjector writes the compiled dictionary to the specified file.

trieSource

Optional
Type: InputFile

If set, read the compiled dictionary from the specified file. Loading a compiled dictionary is usually faster than rebuilding it when the dictionary is large.

typographicVariationsFile

Optional

UNDOCUMENTED

allUpperCaseInsensitive

Default value: `false`
Type: Boolean

If set to `true`, then allow case folding on all characters in words that are all upper case.

allowJoined

Default value: `false`
Type: Boolean

If set to `true`, then allow arbitrary suppression of whitespace characters in the subject. For instance, the contents `aminoacid` matches the key `amino acid`.

canonicalFormFeature

Default value: `lemma`
Type: String

UNDOCUMENTED

caseInsensitive

Default value: `false`
Type: Boolean

If set to `true`, then allow case folding on all characters.

documentFilter

Default value: `true`
Type: Expression

Process only documents that satisfy this expression.

ignoreDiacritics

Default value: `false`
Type: Boolean

If set to `true`, then allow diacritic removal on all characters. For instance, the contents `acide amine` matches the key `acide aminé`.
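The effect of diacritic removal can be illustrated outside AlvisNLP. The following is a minimal Python sketch, assuming that "diacritic removal" means dropping Unicode combining marks after canonical decomposition; the function name `strip_diacritics` is hypothetical and this is not the module's actual implementation.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks (accents) from text.

    Illustration only: assumes diacritic removal is equivalent to
    NFD decomposition followed by dropping combining characters.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def matches_ignoring_diacritics(content: str, key: str) -> bool:
    # Compare both sides after stripping accents.
    return strip_diacritics(content) == strip_diacritics(key)
```

Under that assumption, `matches_ignoring_diacritics("acide amine", "acide aminé")` holds, which mirrors the example given for this parameter.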

joinDash

Default value: `false`
Type: Boolean

If set to `true`, then treat dash characters (`-`) as whitespace characters with regard to `allowJoined`. For instance, the contents `aminoacid` matches the entry `amino-acid`.
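The interaction between `allowJoined` and `joinDash` may be easier to see in a small model. The sketch below is an assumption-laden Python illustration, not the module's code: each whitespace character of the key (and each dash when `join_dash` is set) is made optional in a regular expression. The names `joined_pattern` and `matches_joined` are hypothetical.

```python
import re


def joined_pattern(key: str, join_dash: bool = False) -> re.Pattern:
    """Build a pattern in which each separator of the key is optional.

    Separators are whitespace characters, plus '-' when join_dash is True.
    """
    sep = r"[\s-]" if join_dash else r"\s"
    # A capturing group makes re.split keep the separators at odd indices.
    tokens = re.split(f"({sep})", key)
    parts = []
    for i, tok in enumerate(tokens):
        if i % 2 == 1:
            parts.append(f"(?:{re.escape(tok)})?")  # separator may be dropped
        else:
            parts.append(re.escape(tok))
    return re.compile("".join(parts))


def matches_joined(content: str, key: str, join_dash: bool = False) -> bool:
    return joined_pattern(key, join_dash).fullmatch(content) is not None
```

In this model, `aminoacid` matches the key `amino acid` without `join_dash`, but matches the key `amino-acid` only when `join_dash` is set.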

matchStartCaseInsensitive

Default value: `false`
Type: Boolean

If set to `true`, then allow case folding on the first character of the entry key.

multipleEntryBehaviour

Default value: `all`

Specifies the behavior if the lexicon contains several entries with the same key.

sectionFilter

Default value: `true`
Type: Expression

Process only sections that satisfy this expression.

skipConsecutiveWhitespaces

Default value: `false`
Type: Boolean

If set to `true`, then allow the insertion of consecutive whitespace characters in the subject. For instance, the contents `amino  acid` (two consecutive spaces) matches the entry `amino acid`.

skipWhitespace

Default value: `false`
Type: Boolean

If set to `true`, then allow arbitrary insertion of whitespace characters in the subject. For instance, the contents `amino acid` matches the key `aminoacid`.

subject

Default value: `WORD`
Type: Subject

Specifies the contents to match.

substituteWhitespace

Default value: `false`
Type: Boolean

If set to `true`, then all whitespace characters match each other (including `\n`, `\r`, `\t`, and non-breaking spaces).
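The three whitespace options (`substituteWhitespace`, `skipConsecutiveWhitespaces`, `skipWhitespace`) differ in how much whitespace variation they tolerate. The Python sketch below models each one as a normalization applied to the contents before comparison. This is only an illustration under that assumption; the helper names are hypothetical and the module itself may proceed differently.

```python
import re


def substitute_whitespace(text: str) -> str:
    """Model of substituteWhitespace: every whitespace character
    (newline, tab, non-breaking space, ...) is treated as a plain space."""
    return re.sub(r"\s", " ", text)


def collapse_whitespace(text: str) -> str:
    """Model of skipConsecutiveWhitespaces: a run of whitespace
    characters counts as a single space."""
    return re.sub(r"\s+", " ", text)


def drop_whitespace(text: str) -> str:
    """Model of skipWhitespace: whitespace inserted in the contents
    is ignored altogether."""
    return re.sub(r"\s+", "", text)
```

For example, `collapse_whitespace` maps contents with two spaces between `amino` and `acid` onto the entry `amino acid`, while `drop_whitespace` maps `amino acid` onto the key `aminoacid`.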

wordStartCaseInsensitive

Default value: `false`
Type: Boolean

If set to `true`, then allow case folding on the first character of each word.
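The four case options (`caseInsensitive`, `allUpperCaseInsensitive`, `matchStartCaseInsensitive`, `wordStartCaseInsensitive`) relax case sensitivity to different degrees. The following Python sketch is one possible reading of their descriptions, written for illustration only: the function `case_match` and its `mode` argument are hypothetical, words are assumed to be separated by single spaces, and the actual module may handle edge cases differently.

```python
def fold_first(text: str) -> str:
    """Lower-case only the first character."""
    return text[:1].lower() + text[1:]


def fold_if_all_upper(word: str) -> str:
    """Lower-case a word only if all its cased characters are upper case."""
    return word.lower() if word.isupper() else word


def case_match(content: str, key: str, mode: str = "exact") -> bool:
    """Compare content and key under one of the case options (assumed semantics)."""
    if mode == "caseInsensitive":
        return content.lower() == key.lower()
    if mode == "matchStartCaseInsensitive":
        # Only the very first character may differ in case.
        return fold_first(content) == fold_first(key)
    if mode == "wordStartCaseInsensitive":
        # The first character of every word may differ in case.
        return ([fold_first(w) for w in content.split(" ")]
                == [fold_first(w) for w in key.split(" ")])
    if mode == "allUpperCaseInsensitive":
        # Words written entirely in upper case are folded as a whole.
        return ([fold_if_all_upper(w) for w in content.split(" ")]
                == [fold_if_all_upper(w) for w in key.split(" ")])
    return content == key
```

Under this reading, `Amino Acid` matches the key `amino acid` with `wordStartCaseInsensitive` but not with `matchStartCaseInsensitive`, and `AMINO ACID` matches it with `allUpperCaseInsensitive`.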

Deprecated parameters

targetLayerName

Deprecated
Type: String

Deprecated alias for `targetLayer`.