AlvisNLP

corpus processing engine

WekaTrain

Synopsis

Trains a Weka classifier where examples are elements.

Description

WekaTrain builds a Weka training set in which each example is an element, trains a classifier on it, and writes the classifier to classifierFile. The training set is specified by examples, and the attributes of each example are specified by relationDefinition.

WekaTrain activates cross validation if any of the following parameters is set: evaluationFile, foldFeature, predictedClassFeature.
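The activation rule can be read as a simple logical OR over the three parameters. The following is a minimal Java sketch of that reading, not the module's actual source. The class name CrossValidationActivation and the convention that an unset parameter is represented by null are assumptions made for illustration.

```java
// Sketch only: mirrors the rule stated in the description, not the module's source.
final class CrossValidationActivation {
    // Assumption: an unset parameter is represented by null.
    static boolean isActive(String evaluationFile, String foldFeature, String predictedClassFeature) {
        return evaluationFile != null || foldFeature != null || predictedClassFeature != null;
    }

    public static void main(String[] args) {
        // No cross validation parameter set: plain training only.
        System.out.println(isActive(null, null, null));
        // Setting foldFeature alone is enough to activate cross validation.
        System.out.println(isActive(null, "fold", null));
    }
}
```

Under this reading, setting crossFolds or randomSeed alone does not activate cross validation, since neither appears in the list.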

Snippet

<wekatrain class="WekaTrain">
    <algorithm></algorithm>
    <classifierFile></classifierFile>
    <examples></examples>
    <relationDefinition></relationDefinition>
</wekatrain>

Mandatory parameters

algorithm

Mandatory
Type: String

Classifier algorithm. This must be the canonical name of a class that extends Weka’s Classifier.
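A canonical name is the fully qualified class name, for example weka.classifiers.trees.J48 rather than J48. The following Java sketch shows one way such a name could be checked with reflection. The class AlgorithmNameCheck and its method are hypothetical, and the demonstration uses standard library types so that it runs without Weka on the classpath. With Weka available, the base type would be Weka's Classifier.

```java
// Sketch only: a hypothetical helper, not part of AlvisNLP or Weka.
final class AlgorithmNameCheck {
    // Returns true if className resolves to a class assignable to base.
    static boolean isSubtype(String className, Class<?> base) {
        try {
            // Load without initializing the class; only its type is inspected.
            Class<?> candidate = Class.forName(className, false, AlgorithmNameCheck.class.getClassLoader());
            return base.isAssignableFrom(candidate);
        } catch (ClassNotFoundException e) {
            // An unknown or non-canonical name (e.g. a simple name) ends up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in types: java.util.List plays the role of Weka's Classifier.
        System.out.println(isSubtype("java.util.ArrayList", java.util.List.class));
        System.out.println(isSubtype("ArrayList", java.util.List.class));
    }
}
```

The check works whether the base type is a class or an interface, because isAssignableFrom covers both cases.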

classifierFile

Mandatory
Type: File

File where to write the serialized trained classifier.

examples

Mandatory
Type: Expression

Training set examples. This expression is evaluated as a list of elements with the corpus as the context element.

relationDefinition

Mandatory

Specification of example attributes and class.

Optional parameters

arffFile

Optional

File where to write the training set in ARFF format.
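ARFF is Weka's plain-text data format: a header that names the relation and declares each attribute, followed by a data section with one comma-separated row per example. The Java sketch below assembles a tiny ARFF document to illustrate that layout. The class ArffSketch, the relation name, and the attribute names are hypothetical. The actual content written by WekaTrain depends on relationDefinition.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: illustrates the general ARFF layout, not WekaTrain's exact output.
final class ArffSketch {
    static String toArff(String relation, List<String> numericAttributes,
                         List<String> classValues, List<String> rows) {
        StringBuilder sb = new StringBuilder();
        // Header: relation name, then one declaration per attribute.
        sb.append("@relation ").append(relation).append("\n\n");
        for (String name : numericAttributes) {
            sb.append("@attribute ").append(name).append(" numeric\n");
        }
        // The class is declared as a nominal attribute listing its possible values.
        sb.append("@attribute class {").append(String.join(",", classValues)).append("}\n\n");
        // Data section: one comma-separated row per example.
        sb.append("@data\n");
        for (String row : rows) {
            sb.append(row).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(toArff("toy", Arrays.asList("length"),
                Arrays.asList("yes", "no"), Arrays.asList("3,yes", "7,no")));
    }
}
```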

classifierInfoFile

Optional

File where to write classifier information and statistics.

classifierOptions

Optional
Type: String[]

Options passed to the classifier algorithm.

crossFolds

Optional
Type: Integer

Number of folds (segments) used for cross validation.

evaluationFile

Optional

File where to write evaluation results.

foldFeature

Optional
Type: String

Feature where to write, for each training element, the number of the fold in which that element was part of the test set when cross validation is activated.
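In k-fold cross validation, each example belongs to the test set of exactly one fold, which is why a single fold number per element is enough. The Java sketch below shows one simple way to derive such numbers from a fold count and a random seed. It is an illustration under stated assumptions: the class FoldAssignmentSketch is hypothetical, folds are numbered from 0 here, and the scheme is a plain shuffle, which may differ from what Weka does internally (for example, stratification).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch only: a plain seeded shuffle, not necessarily Weka's fold assignment.
final class FoldAssignmentSketch {
    // Returns, for each example index, the fold in which that example is tested.
    static int[] assignFolds(int exampleCount, int folds, long seed) {
        if (folds < 2 || folds > exampleCount) {
            throw new IllegalArgumentException("folds must be between 2 and exampleCount");
        }
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < exampleCount; i++) {
            order.add(i);
        }
        // The same seed always yields the same shuffle, hence the same folds.
        Collections.shuffle(order, new Random(seed));
        int[] foldOf = new int[exampleCount];
        for (int pos = 0; pos < order.size(); pos++) {
            // Deal shuffled examples round-robin so fold sizes differ by at most one.
            foldOf[order.get(pos)] = pos % folds;
        }
        return foldOf;
    }

    public static void main(String[] args) {
        int[] foldOf = assignFolds(10, 3, 1L);
        for (int i = 0; i < foldOf.length; i++) {
            System.out.println("example " + i + " is tested in fold " + foldOf[i]);
        }
    }
}
```

In this sketch, the two arguments play the roles of crossFolds and randomSeed, and the returned value for an element corresponds to what would be stored in foldFeature.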

predictedClassFeature

Optional
Type: String

Feature where to write the predicted class of each element when cross validation is activated.

randomSeed

Default value: `1`
Type: Long

Random seed used by some algorithms and cross validation.

Deprecated parameters

foldFeatureKey

Deprecated
Type: String

Deprecated alias for foldFeature.

predictedClassFeatureKey

Deprecated
Type: String

Deprecated alias for predictedClassFeature.