TokenizedReader

Synopsis

Reads a tokenized corpus: one token per line, with an empty line separating sentences.
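
To make the expected input format concrete, here is a minimal Python sketch of how such a file could be split into sentences. It is an illustration of the format only, not the module's implementation: the function name `read_tokenized` and the sample text are hypothetical, and the handling of whitespace-only lines and of consecutive empty lines is an assumption that the module may not share.

```python
from typing import Iterable, List


def read_tokenized(lines: Iterable[str]) -> List[List[str]]:
    """Group one-token-per-line input into sentences (hypothetical helper).

    Assumptions made for this sketch: surrounding whitespace is stripped,
    whitespace-only lines count as empty, and runs of empty lines do not
    produce empty sentences.
    """
    sentences: List[List[str]] = []
    current: List[str] = []
    for line in lines:
        token = line.strip()
        if token:
            # A non-empty line holds exactly one token.
            current.append(token)
        elif current:
            # An empty line closes the sentence in progress.
            sentences.append(current)
            current = []
    if current:
        # The last sentence may not be followed by an empty line.
        sentences.append(current)
    return sentences


# Made-up sample input: two sentences separated by an empty line.
sample = "The\ncat\nsleeps\n.\n\nIt\npurrs\n.\n"
sentences = read_tokenized(sample.splitlines())
```

In the module's terms, each inner list would correspond to one annotation in the sentence layer and each string to one annotation in the token layer.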

<tokenizedreader class="TokenizedReader">
    <source></source>
</tokenizedreader>

Mandatory

Path to the file or directory containing the tokenized text.

Optional

Constant features to add to each annotation created by this module.

Optional

Constant features to add to each document created by this module.

Optional

Constant features to add to each section created by this module.

Default value: `text`

Type: String

Name of the section containing the tokenized text.

Default value: `sentences`

Type: String

Name of the sentence layer.

Default value: `words`

Type: String

Name of the token layer.

Deprecated

Type: String

Deprecated alias for `section`.

Deprecated

Type: String

Deprecated alias for `sentenceLayer`.

Deprecated

Type: String

Deprecated alias for `tokenLayer`.