AlvisNLP

corpus processing engine

OBOReader

Synopsis

Reads terms in OBO files as documents.

Description

OBOReader reads the files specified by oboFiles, which must be in OBO format. Each term is loaded as a distinct document, with the term identifier as the document identifier. Each document contains one section (nameSection) containing the term name, and one section for each term synonym (synonymSection). Optionally, OBOReader also sets features on the document with the term path from the root (pathFeature), the identifier of the parent term (parentFeature), the identifiers of each ancestor (ancestorsFeature), and the identifiers of each child term (childrenFeature).

Snippet

<oboreader class="OBOReader">
    <oboFiles></oboFiles>
</oboreader>
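
As a slightly fuller sketch, the snippet above might be filled in as follows. The file path `ontology.obo` and the feature names `ancestors` and `children` are placeholders chosen for illustration; they are not values defined by the module.

```xml
<oboreader class="OBOReader">
    <!-- hypothetical path to an OBO file -->
    <oboFiles>ontology.obo</oboFiles>
    <!-- illustrative feature names; no default is listed for these two parameters -->
    <ancestorsFeature>ancestors</ancestorsFeature>
    <childrenFeature>children</childrenFeature>
</oboreader>
```

With a configuration along these lines, each term document would be expected to carry the ancestor and child identifiers in the named features, in addition to the name and synonym sections.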

Mandatory parameters

oboFiles

Mandatory

OBO files to read.

Optional parameters

ancestorsFeature

Optional
Type: String

Name of the feature that contains the identifiers of the term's ancestors.

childrenFeature

Optional
Type: String

Name of the feature that contains the identifiers of the term's children.

constantDocumentFeatures

Optional
Type: Mapping

Constant features to add to each document created by this module.

constantSectionFeatures

Optional
Type: Mapping

Constant features to add to each section created by this module.
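
One possible use of the two constant-feature parameters is to tag everything this reader produces so that later modules can tell it apart from other input. The sketch below assumes that a Mapping value can be written as a `key=value` pair in the element body; that syntax is an assumption and should be checked against the AlvisNLP documentation on parameter types. The keys and values shown are invented for the example.

```xml
<oboreader class="OBOReader">
    <!-- hypothetical path to an OBO file -->
    <oboFiles>ontology.obo</oboFiles>
    <!-- assumed key=value syntax; "source" and "kind" are made-up feature names -->
    <constantDocumentFeatures>source=ontology</constantDocumentFeatures>
    <constantSectionFeatures>kind=term-label</constantSectionFeatures>
</oboreader>
```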

excludeOBOBuiltins

Default value: `true`
Type: Boolean

Whether to exclude built-in OBO terms.

idPrefix

Default value: `` (empty string)
Type: String

Prefix to prepend to each term identifier.
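
To illustrate, a prefix can help keep document identifiers distinct when several sources are loaded into the same corpus. In the sketch below, the prefix `obo-` and the file path are hypothetical. Assuming the prefix is placed directly in front of the identifier found in the file, a term `EX:0001` would presumably be loaded as document `obo-EX:0001`.

```xml
<oboreader class="OBOReader">
    <!-- hypothetical path to an OBO file -->
    <oboFiles>ontology.obo</oboFiles>
    <!-- illustrative prefix, prepended to every term identifier -->
    <idPrefix>obo-</idPrefix>
</oboreader>
```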

nameSection

Default value: `name`
Type: String

Name of the section that contains the term name.

parentFeature

Default value: `is_a`
Type: String

Name of the feature that contains the identifiers of the term's parents.

pathFeature

Default value: `path`
Type: String

Name of the feature that contains the term paths.

synonymSection

Default value: `synonym`
Type: String

Name of the sections that contain the term synonyms.

Deprecated parameters

nameSectionName

Deprecated
Type: String

Deprecated alias for nameSection.

synonymSectionName

Deprecated
Type: String

Deprecated alias for synonymSection.
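
Plans that still use the deprecated names can presumably be updated by renaming the elements and keeping their values. The sketch below shows the current parameter names; the section names `label` and `alt-label` and the file path are hypothetical.

```xml
<oboreader class="OBOReader">
    <!-- hypothetical path to an OBO file -->
    <oboFiles>ontology.obo</oboFiles>
    <!-- formerly written as nameSectionName -->
    <nameSection>label</nameSection>
    <!-- formerly written as synonymSectionName -->
    <synonymSection>alt-label</synonymSection>
</oboreader>
```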