PESVReader
Synopsis
Read documents and entities in the PESV format.
Description
PESVReader reads the CSV files given in docStream and creates one document per record. The document identifier is taken from the id column. The section content is built from the tokenization provided in the processed_text column, and the tokens themselves are recorded in the layer named by tokenLayer.
PESVReader also reads the CSV files given in entitiesStream and creates one entity annotation per record in the layer named by entityLayer. Each property is recorded in its own feature, and all properties are also recorded together in a single feature named by propertiesFeature.
Snippet
<pesvreader class="PESVReader">
<docStream></docStream>
<entitiesStream></entitiesStream>
</pesvreader>
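As an illustration, a filled-in version of the snippet might look like the sketch below. The file paths and the section, layer, and feature names are hypothetical values chosen for the example, not defaults documented here; only the parameter names come from this page.

```xml
<!-- Illustrative values only: the paths and names below are assumptions. -->
<pesvreader class="PESVReader">
  <!-- Mandatory: where to find the document and entity CSV files -->
  <docStream>data/documents.csv</docStream>
  <entitiesStream>data/entities.csv</entitiesStream>
  <!-- Optional: names of the created section, layers, and features -->
  <section>text</section>
  <tokenLayer>tokens</tokenLayer>
  <entityLayer>entities</entityLayer>
  <ordFeature>ord</ordFeature>
  <propertiesFeature>properties</propertiesFeature>
</pesvreader>
```

With such a configuration, the tokens read from processed_text would be recorded in the tokens layer of the text section, and each record of the entities file would become an annotation in the entities layer.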
Mandatory parameters
docStream
Path to the files or directories in which to look for document files.
entitiesStream
Path to the files or directories in which to look for entity files.
Optional parameters
constantAnnotationFeatures
Constant features to add to each annotation created by this module.
constantDocumentFeatures
Constant features to add to each document created by this module.
constantSectionFeatures
Constant features to add to each section created by this module.
entityLayer
Name of the layer in which entities are created.
ordFeature
Name of the feature in which the token ordinal is recorded.
propertiesFeature
Name of the feature in which all entity properties are recorded together. PESVReader also records each property in a separate feature.
section
Name of the single section created in each document.
tokenLayer
Name of the layer in which tokens are created.
Deprecated parameters
entityLayerName
Deprecated alias for entityLayer.
ordFeatureKey
Deprecated alias for ordFeature.
propertiesFeatureKey
Deprecated alias for propertiesFeature.
sectionName
Deprecated alias for section.
tokenLayerName
Deprecated alias for tokenLayer.