AlvisNLP

corpus processing engine

XMLReader

Synopsis

Reads XML files and creates elements.

Description

XMLReader reads its input from source as XML and creates documents, sections, annotations, relations or tuples. The xslTransform XSLT stylesheet determines how the structure of the input XML is mapped to these elements.

XMLReader also provides XSLT function and element extensions. All extensions share the namespace `xalan://fr.inra.maiage.bibliome.alvisnlp.bibliomefactory.modules.xml.XMLReader2`.
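
To use the extensions, a stylesheet has to bind a prefix to this namespace. The skeleton below is a minimal sketch under stated assumptions: the prefix a is an arbitrary choice, extension-element-prefixes is the standard XSLT 1.0 attribute for declaring extension element namespaces, and the template body is left empty because the individual extension elements and functions are not listed here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the prefix "a" is an arbitrary choice, not mandated by XMLReader. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:a="xalan://fr.inra.maiage.bibliome.alvisnlp.bibliomefactory.modules.xml.XMLReader2"
    extension-element-prefixes="a">

  <!-- Entry point: matches the root of the input document. -->
  <xsl:template match="/">
    <!-- Extension elements bound to the "a" prefix would go here. -->
  </xsl:template>

</xsl:stylesheet>
```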

Element extensions

Function extensions

Snippet

<xmlreader class="XMLReader">
    <source></source>
    <xslTransform></xslTransform>
</xmlreader>
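
As an illustration, a filled-in version of the snippet might look like the following. Both paths are hypothetical placeholders, not files shipped with AlvisNLP.

```xml
<!-- Hypothetical paths, for illustration only. -->
<xmlreader class="XMLReader">
    <source>corpus/xml</source>
    <xslTransform>stylesheets/corpus2alvisnlp.xslt</xslTransform>
</xmlreader>
```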

Mandatory parameters

source

Mandatory

Path to the source directory or source file.

xslTransform

Mandatory

XSLT stylesheet to apply to the input.

Optional parameters

constantAnnotationFeatures

Optional
Type: Mapping

Constant features to add to each annotation created by this module.

constantDocumentFeatures

Optional
Type: Mapping

Constant features to add to each document created by this module.

constantRelationFeatures

Optional
Type: Mapping

Constant features to add to each relation created by this module.

constantSectionFeatures

Optional
Type: Mapping

Constant features to add to each section created by this module.

constantTupleFeatures

Optional
Type: Mapping

Constant features to add to each tuple created by this module.

stringParams

Optional
Type: Mapping

Parameters to pass to the XSLT stylesheet specified by xslTransform.
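
On the stylesheet side, externally supplied parameters are normally received through top-level xsl:param declarations. The fragment below is a sketch that assumes each stringParams entry reaches the stylesheet as a string parameter named after its key. The parameter name corpusName and its default value are hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter; the default applies when no value is passed. -->
  <xsl:param name="corpusName" select="'unknown'"/>

  <xsl:template match="/">
    <!-- The value is referenced as $corpusName in XPath expressions. -->
    <xsl:value-of select="$corpusName"/>
  </xsl:template>

</xsl:stylesheet>
```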

html

Default value: `false`
Type: Boolean

Set to true if the input is HTML rather than XML.

rawTagNames

Default value: `false`
Type: Boolean

If true, do not convert tag names to upper case.

Deprecated parameters

sourcePath

Deprecated

Alias for source. Use source instead.