ELEXIS WP1 dictionary transformation backend demo

Description of the mapping (in JSON):

Dictionary to be transformed (in XML):






Results:

Syntax of the JSON mapping descriptions

The JSON object should contain the following members:

Selector descriptions

A selector is a rule that selects 0 or more elements in the input XML tree.

The description of a selector must be a JSON object. This object must contain an attribute named type, whose value specifies the type of the selector, plus one or more other attributes whose names and meanings depend on the selector type.
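
As a rough illustration of how a selector description of type "xpath" could be evaluated, here is a minimal Python sketch. The select function is a hypothetical helper, not part of the backend, and it uses the standard library's ElementTree, which implements only a subset of XPath; the real backend may use a different XML library and support more of XPath.

```python
import xml.etree.ElementTree as ET

def select(root, selector):
    """Hypothetical helper: evaluate a selector description against an XML tree."""
    if selector.get("type") == "xpath":
        # ElementTree supports only a subset of XPath; enough for this sketch.
        return root.findall(selector["expr"])
    raise ValueError("unsupported selector type: %r" % selector.get("type"))

doc = ET.fromstring(
    "<Entry><ExampleCtn><Locale lang='fr'/></ExampleCtn><Locale lang='en'/></Entry>"
)
# Only the <Locale> nested inside <ExampleCtn> is selected.
matched = select(doc, {"type": "xpath", "expr": ".//ExampleCtn//Locale"})
# A selector may also select nothing at all.
unmatched = select(doc, {"type": "xpath", "expr": ".//Translation"})
```

The second call shows the "0 or more" part of the definition: a selector that matches nothing simply yields an empty list.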

The following types of selectors are currently supported:

Transformer descriptions

A transformer is a rule that describes which data from the input document must be transformed into a certain type of element in the output document.

The description of a transformer must be a JSON object. This object must contain an attribute named type, whose value specifies the type of the transformer, plus one or more other attributes whose names and meanings depend on the transformer type.

The following types of transformers are currently supported:

(1) Simple transformers

A simple transformer selects a set of elements and extracts either an attribute or the inner text from each of them. Optionally, it then applies a regular expression to the resulting text and returns the substring matched by a specific group within that regular expression.

The JSON object that describes a simple transformer must contain the following attributes:

A simple example:

{ "type": "simple",
  "selector": {"type": "xpath", "expr": ".//ExampleCtn//Locale"},
  "attr": "lang" }

A more complex example:

{ "type": "simple",
  "selector": {"type": "xpath", "expr": ".//sense/seg[1][@type='beleg']"},
  "attr": "{http://elex.is/wp1/teiLex0Mapper/meta}innerTextRec",
  "rex": "'(?P<insideQuotes>[^']*)'",
  "rexGroup": "insideQuotes" }

This transformer selects the first <seg> in each <sense> (provided that its type attribute is 'beleg'), builds its inner text recursively, and extracts the first substring delimited by single quote marks.
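
The select, extract, and match steps of a simple transformer can be sketched in Python as follows. This is only an illustration under stated assumptions: apply_simple and its parameters are hypothetical names, the sample XML is invented, and skipping elements whose text does not match the regular expression is an assumption rather than documented backend behaviour.

```python
import re
import xml.etree.ElementTree as ET

def apply_simple(root, expr, attr=None, rex=None, rex_group=None):
    """Hypothetical helper mimicking a simple transformer on an ElementTree root."""
    results = []
    for elem in root.findall(expr):
        # Use the named attribute if given, otherwise the recursive inner text.
        text = elem.get(attr) if attr is not None else "".join(elem.itertext())
        if text is None:
            continue  # attribute not present on this element
        if rex is not None:
            match = re.search(rex, text)
            if match is None:
                continue  # assumption: non-matching elements produce no output
            text = match.group(rex_group) if rex_group is not None else match.group(0)
        results.append(text)
    return results

sample = ET.fromstring(
    "<entry><sense><seg type='beleg'>He said 'good <i>morning</i>' twice.</seg></sense></entry>"
)
quoted = apply_simple(
    sample,
    ".//sense/seg",
    rex=r"'(?P<insideQuotes>[^']*)'",
    rex_group="insideQuotes",
)
```

Here the inner text is built across the nested <i> element before the regular expression is applied, so the named group captures the whole quoted phrase.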

An example of a constant-output transformer:

{ "type": "simple",
  "selector": {"type": "xpath", "expr": ".//artikel"},
  "attr": "{http://elex.is/wp1/teiLex0Mapper/meta}constant",
  "const": "nl" }

An example of a transformer with an adoption operation:

"ex": {
  "type": "simple",
  "selector": {"type": "xpath", "expr": ".//Example"},
  "adoptSelector": {"type": "xpath", "expr": "..//Translation"}
},
"ex_tr": {
  "type": "simple",
  "selector": {"type": "xpath", "expr": ".//ExampleCtn//Translation"}
},

Here, whenever the transformer finds an <Example>, it also finds all <Translation> elements that are descendants of its parent (the "..//Translation" XPath is evaluated relative to the <Example> that the transformer is currently working on). Those <Translation> elements are then moved inside the <Example>, resulting in a transformed structure of the form:

  <cit type="example">
    <quote>J'ai mangé une pomme.</quote>
    <cit type="translation"><quote xml:lang="dk">Jeg har spist et æble.</quote></cit>
    ...
  </cit>
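
The adoption step itself can be pictured with the following Python sketch. It is an illustration only: adopt is a hypothetical helper, the input XML is invented, and the handling of several <Example> elements that share one parent (each adoptee is moved at most once) is an assumption, not documented behaviour. ElementTree cannot evaluate ".." from the start element, so the sketch keeps an explicit child-to-parent map instead of evaluating the "..//Translation" expression directly.

```python
import xml.etree.ElementTree as ET

def adopt(root, selector_expr, adoptee_tag):
    """Hypothetical helper: move adoptee_tag descendants of each selected
    element's parent inside the selected element."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    adopted = set()
    for elem in root.findall(selector_expr):
        parent = parent_of.get(elem)
        if parent is None:
            continue  # the selected element is the root; nothing to adopt
        for adoptee in list(parent.iter(adoptee_tag)):
            if adoptee is parent or adoptee is elem or adoptee in adopted:
                continue
            if parent_of[adoptee] is elem:
                continue  # already a direct child of the adopter
            parent_of[adoptee].remove(adoptee)
            elem.append(adoptee)
            parent_of[adoptee] = elem
            adopted.add(adoptee)
    return root

tree = ET.fromstring(
    "<Entry><ExampleCtn>"
    "<Example>J'ai mangé une pomme.</Example>"
    "<Translation>Jeg har spist et æble.</Translation>"
    "</ExampleCtn></Entry>"
)
adopt(tree, ".//Example", "Translation")
```

After the call, the <Translation> is a child of the <Example> rather than its sibling, which mirrors the nesting of the translation <cit> inside the example <cit> shown above.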

(2) Union transformers

A union transformer takes a set of simple transformers and performs all of their transformations. This might be useful if you need to combine several different transformation rules, e.g. extract attribute @a from instances of the element <b> and also extract attribute @c from instances of the element <d>.
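
The @a-from-<b> and @c-from-<d> case can be sketched in Python like this. The apply_union helper and its (expression, attribute) rule format are hypothetical simplifications of the JSON description, and returning the results rule by rule rather than in document order is an assumption.

```python
import xml.etree.ElementTree as ET

def apply_union(root, rules):
    """Hypothetical helper: run several (xpath, attribute) rules and pool the results."""
    results = []
    for expr, attr in rules:
        for elem in root.findall(expr):
            value = elem.get(attr)
            if value is not None:
                results.append(value)
    return results

union_doc = ET.fromstring("<root><b a='1'/><d c='2'/><b a='3'/></root>")
# Extract @a from every <b> and @c from every <d>.
values = apply_union(union_doc, [(".//b", "a"), (".//d", "c")])
```

With this input, values holds the two @a values followed by the single @c value.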

The JSON object that describes a union transformer must contain the following attributes:


Janez Brank