AlvisNLP

corpus processing engine

REBERTPredict

Synopsis

Binary relation extraction using RE-BERT.

This module is experimental.

Description

REBERTPredict classifies candidate binary relations with an ensemble of models finetuned with RE-BERT.

Candidate relations can be asserted by specifying tuples of a relation, generated by specifying the sets of subject and object arguments, or both.

assertedCandidates specifies the elements corresponding to asserted candidates. Arguments are specified with assertedSubject and assertedObject.
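As a minimal sketch, a configuration for asserted candidates might look like the following. The paths, the relation name `candidates`, the role names, and the expression strings are assumptions about a hypothetical corpus and are not taken from this documentation; adapt them to your own plan.

```xml
<rebertpredict class="REBERTPredict">
    <!-- Hypothetical paths: replace with your own installation -->
    <rebertDir>/path/to/rebert</rebertDir>
    <finetunedModel>/path/to/finetuned-models</finetunedModel>
    <modelType>biobert</modelType>
    <!-- Assumption: candidates are the tuples of a relation named "candidates" -->
    <assertedCandidates>documents.sections.relations:candidates.tuples</assertedCandidates>
    <!-- Assumption: each tuple has arguments with roles "subject" and "object" -->
    <assertedSubject>args:subject</assertedSubject>
    <assertedObject>args:object</assertedObject>
</rebertpredict>
```

With this setup, the prediction would be stored on each asserted tuple unless createAssertedTuples is set to true.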

candidateGenerationScope specifies the scope of generated candidates, e.g. documents or sentences. The lists of elements corresponding to the arguments are specified with generatedSubjects and generatedObjects. For each scope, REBERTPredict generates a candidate for every combination of subject and object.
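The following sketch illustrates candidate generation within sentences. The layer names `genes` and `proteins`, the relation name `interaction`, the paths, and the expression strings are hypothetical and depend on how earlier modules annotated the corpus.

```xml
<rebertpredict class="REBERTPredict">
    <!-- Hypothetical paths: replace with your own installation -->
    <rebertDir>/path/to/rebert</rebertDir>
    <finetunedModel>/path/to/finetuned-models</finetunedModel>
    <modelType>biobert</modelType>
    <!-- Assumption: each sentence annotation is one scope -->
    <candidateGenerationScope>documents.sections.layer:sentences</candidateGenerationScope>
    <!-- Assumption: arguments are annotations of hypothetical layers inside the sentence -->
    <generatedSubjects>inside:genes</generatedSubjects>
    <generatedObjects>inside:proteins</generatedObjects>
    <!-- Hypothetical name of the relation that receives the generated tuples -->
    <relation>interaction</relation>
</rebertpredict>
```

Under these assumptions, a sentence containing two subjects and three objects would yield six candidates.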

The model is selected with modelType (bert, biobert, or scibert) and finetunedModel. ensembleNumber is the number of models used to make a prediction; it must be less than or equal to the number of finetuned models. The final prediction is aggregated by vote.
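For illustration, model selection with a three-model ensemble could be written as below. The paths are hypothetical, and the sketch assumes that the finetunedModel directory contains at least three finetuned models.

```xml
<rebertpredict class="REBERTPredict">
    <!-- Hypothetical paths: replace with your own installation -->
    <rebertDir>/path/to/rebert</rebertDir>
    <finetunedModel>/path/to/finetuned-models</finetunedModel>
    <modelType>scibert</modelType>
    <!-- Assumption: at least three finetuned models are available -->
    <ensembleNumber>3</ensembleNumber>
    <!-- ensembleModels is the mutually exclusive alternative and is not set here -->
    <aggregator>vote</aggregator>
</rebertpredict>
```

Setting aggregator explicitly is redundant here, since vote is both the default and the only available method.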

REBERTPredict stores the prediction in labelFeature, either on the asserted candidate or on a new tuple for generated candidates. REBERTPredict does not create tuples for negative predictions unless createNegativeTuples is set to true.
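A sketch of the output-related parameters follows. The paths and the prefix value `explain-` are hypothetical; `predicted-label` is the documented default for labelFeature and is shown only for clarity.

```xml
<rebertpredict class="REBERTPredict">
    <!-- Hypothetical paths: replace with your own installation -->
    <rebertDir>/path/to/rebert</rebertDir>
    <finetunedModel>/path/to/finetuned-models</finetunedModel>
    <modelType>bert</modelType>
    <!-- Feature that receives the predicted category (documented default) -->
    <labelFeature>predicted-label</labelFeature>
    <!-- Also keep generated candidates predicted as negative -->
    <createNegativeTuples>true</createNegativeTuples>
    <!-- Hypothetical prefix for the additional explanation features -->
    <explainFeaturePrefix>explain-</explainFeaturePrefix>
</rebertpredict>
```

Keeping negative tuples can be useful when inspecting model errors, at the cost of a larger relation.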

Snippet

<rebertpredict class="REBERTPredict">
    <finetunedModel></finetunedModel>
    <modelType></modelType>
    <rebertDir></rebertDir>
</rebertpredict>

Mandatory parameters

finetunedModel

Mandatory

Path to the directory containing the models finetuned with RE-BERT.

modelType

Mandatory
Type: String

BERT variant (bert, biobert, or scibert).

rebertDir

Mandatory

Path to RE-BERT directory.

Optional parameters

assertedCandidates

Optional
Type: Expression

Asserted candidates. The expression is evaluated as a list of elements from the corpus.

assertedObject

Optional
Type: Expression

Object (right) argument of asserted candidates. The expression is evaluated as a single element from the asserted candidate element (see assertedCandidates).

assertedSubject

Optional
Type: Expression

Subject (left) argument of asserted candidates. The expression is evaluated as a single element from the asserted candidate element (see assertedCandidates).

candidateGenerationScope

Optional
Type: Expression

Scopes for the generated candidates. The expression is evaluated as a list of elements from the corpus.

conda

Optional

Path to a conda executable. If not set, then REBERTPredict uses the conda from the environment PATH.

condaEnvironment

Optional
Type: String

Name of the conda environment. If this parameter is set, then RE-BERT is executed in this conda environment.

constantRelationFeatures

Optional
Type: Mapping

Constant features to add to each relation created by this module.

constantTupleFeatures

Optional
Type: Mapping

Constant features to add to each tuple created by this module.

ensembleModels

Optional
Type: Integer[]

Models to use for the prediction. Either this parameter or ensembleNumber is mandatory (and mutually exclusive).

explainFeaturePrefix

Optional
Type: String

Prefix for additional features (e.g. number of votes, probability).

generatedObjects

Optional
Type: Expression

Object (right) arguments of generated candidates within a scope. This expression is evaluated as a list of elements from the scope element.

generatedSubjects

Optional
Type: Expression

Subject (left) arguments of generated candidates within a scope. This expression is evaluated as a list of elements from the scope element.

python

Optional

Path to the Python interpreter. If not set, then REBERTPredict will use the Python interpreter from the PATH environment variable.

relation

Optional
Type: String

Name of the relation for generated candidates.

aggregator

Default value: `vote`

Method used to aggregate the predictions of the ensemble models. Only vote is available.

createAssertedTuples

Default value: `false`
Type: Boolean

If set to true, then REBERTPredict creates a tuple even for asserted candidates. labelFeature will be set on the created tuple instead of the asserted candidate element.

createNegativeTuples

Default value: `false`
Type: Boolean

If set to true, then REBERTPredict creates a tuple even for generated candidates predicted as negative.

end

Default value: `end`
Type: Expression

End position of candidates (asserted and generated). This expression is evaluated as an integer from the candidate element.

ensembleNumber

Default value: `1`
Type: Integer

Number of models that make a prediction. The value must be less than or equal to the number of finetuned models. Either this parameter or ensembleModels is mandatory (and mutually exclusive).

labelFeature

Default value: `predicted-label`
Type: String

Feature in which to store the predicted category.

negativeCategory

Default value: `0`
Type: Integer

Category that is considered negative (no relation).

objectRole

Default value: `object`
Type: String

Name of the object (right) argument in tuples created for generated candidates.

sentenceLayer

Default value: `sentences`
Type: String

Layer containing sentences.

start

Default value: `start`
Type: Expression

Start position of candidates (asserted and generated). This expression is evaluated as an integer from the candidate element.

subjectRole

Default value: `subject`
Type: String

Name of the subject (left) argument in tuples created for generated candidates.

useGPU

Default value: `false`
Type: Boolean

Use GPU instead of CPU.

Deprecated parameters

relationName

Deprecated
Type: String

Deprecated alias for relation.