OgmiosTokenizer
Synopsis
Tokenizes section contents according to the Ogmios tokenizer specifications.
Description
OgmiosTokenizer creates an annotation for each token found in the section contents, according to the Ogmios tokenizer specifications, and adds these annotations to the targetLayer layer. Each created annotation carries a feature, named by tokenTypeFeature, with one of the following values:
- alpha: for an alphabetic token;
- num: for a numeric token;
- sep: for a whitespace token;
- symb: for all other tokens.
If separatorTokens is false, OgmiosTokenizer does not create annotations for whitespace tokens.
Snippet
<ogmiostokenizer class="OgmiosTokenizer">
    <targetLayer></targetLayer>
    <tokenTypeFeature></tokenTypeFeature>
</ogmiostokenizer>
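As an illustration, the snippet might be filled in as follows. This is only a sketch: the layer name "tokens" and the feature name "type" are hypothetical values chosen for the example, and writing the optional separatorTokens parameter as a child element with the text false is an assumption modelled on the mandatory parameters.

```xml
<!-- Hypothetical values: "tokens" and "type" are illustrative names, not defaults. -->
<ogmiostokenizer class="OgmiosTokenizer">
    <!-- Layer that receives one annotation per token. -->
    <targetLayer>tokens</targetLayer>
    <!-- Feature that holds the token type: alpha, num, sep or symb. -->
    <tokenTypeFeature>type</tokenTypeFeature>
    <!-- Assumed syntax for the optional boolean: skip whitespace (sep) tokens. -->
    <separatorTokens>false</separatorTokens>
</ogmiostokenizer>
```

With such a configuration, the tokens layer would be expected to contain only annotations whose type feature is alpha, num or symb.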
Mandatory parameters
targetLayer
Name of the layer where to store the tokens.
tokenTypeFeature
Name of the token feature where to store the token type (alpha, num, sep, symb).
Optional parameters
constantAnnotationFeatures
Constant features to add to each annotation created by this module.
documentFilter
Process only documents that satisfy this filter.
sectionFilter
Process only sections that satisfy this filter.
separatorTokens
Whether annotations should be created for separator (whitespace) tokens.
Deprecated parameters
targetLayerName
Deprecated alias for targetLayer.