Automatic transformation of XML namespaces/Transformations

From Wikiversity

The requirements (non-normative)

The requirements

Pipeline

The pipeline applies, in order, every element of the workflow list of transformers given in the user options.

If the user options request it, executing our application consists of running the pipeline.

The user-supplied XML document is fed to the pipeline, and the XML document that results from applying the pipeline to it is produced as output.

Note that while the pipeline is being processed, some variables concerning automatic transformers are kept (see below).
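The pipeline described above can be sketched as a simple loop. This is only an illustration under stated assumptions: the names run_pipeline and WorkflowElement are hypothetical, documents are represented as plain strings, and each workflow element is modelled as a function from a document to a document.

```python
from typing import Callable, Iterable

# Hypothetical type: a workflow element maps an XML document (as text)
# to a new XML document.
WorkflowElement = Callable[[str], str]


def run_pipeline(document: str, workflow: Iterable[WorkflowElement]) -> str:
    """Apply every workflow element in order and return the final document."""
    for element in workflow:
        document = element(document)
    return document


# Toy stand-ins for real transformers (assumptions, not part of the specification).
workflow = [
    lambda d: d.replace("<a>", "<b>").replace("</a>", "</b>"),
    lambda d: d.replace("x", "y"),
]

result = run_pipeline("<a>x</a>", workflow)
print(result)
```

An empty workflow leaves the document unchanged, which matches the idea that the pipeline is nothing more than the composition of its elements.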

Digraph of executed scripts

An enriched script is a pair consisting of a script and a transformer that is connected with this script by the predicate :script. The properties of an enriched script are the properties of both the script and the transformer (for example, the minimum version of the script and the destination namespaces of the transformer are both properties of an enriched script).

Executable scripts are those scripts for which we have an execution facility (for example, a suitable version of an interpreter is installed). Executable enriched scripts are enriched scripts whose script part is executable.
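The definitions above can be illustrated with a small data model. This is a sketch only: the class and field names are hypothetical, versions are modelled as tuples of integers, and the "minimum version" is assumed here to mean the minimum interpreter version the script requires.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

Version = Tuple[int, ...]


@dataclass(frozen=True)
class Script:
    language: str          # hypothetical field, e.g. "XSLT"
    min_version: Version   # assumed: minimum interpreter version required


@dataclass(frozen=True)
class Transformer:
    source_namespaces: FrozenSet[str]
    target_namespaces: FrozenSet[str]


@dataclass(frozen=True)
class EnrichedScript:
    """A script together with the transformer it is attached to via :script."""
    script: Script
    transformer: Transformer


def is_executable(enriched: EnrichedScript, installed: Dict[str, Version]) -> bool:
    """True if an interpreter for the script is installed in a suitable version."""
    version = installed.get(enriched.script.language)
    return version is not None and version >= enriched.script.min_version


example = EnrichedScript(
    Script("XSLT", (2, 0)),
    Transformer(
        frozenset({"http://example.org/old"}),
        frozenset({"http://example.org/new"}),
    ),
)
```

Here is_executable(example, {"XSLT": (3, 0)}) holds, while an installation offering only version (1, 0), or no XSLT interpreter at all, makes the enriched script non-executable.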

A digraph is maintained whose vertices are namespaces and whose edges are the enriched scripts that were run during the pipeline. The digraph also has a vertex signifying any namespace (as defined by :targetNamespacesSet :allNamespaces).

It is managed in such a way that between every pair of vertices there is at most one path. Moreover, from any given vertex there cannot be paths to more than one destination namespace, nor paths both to a destination namespace and to null. (TODO: Prove it.)

The algorithm below is constructed so that, once an enriched script appears on a path in this digraph, any later transformation from a namespace N to a namespace M follows the path in this digraph from N to M (if such a path exists before the transformation).

Rationale: The meanings of XML tags must be kept consistent, so that, for example, an XML tag meaning emphasis is not translated sometimes into bold and sometimes into italic text.
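The rule that there is at most one path between every pair of vertices can be checked before an edge is added. The following is a minimal sketch, not the algorithm of this specification: the class ExecutedScriptsDigraph and its methods are hypothetical, namespaces and scripts are plain strings, and only the path-uniqueness rule is covered (the rule about destination namespaces and null is not).

```python
class ExecutedScriptsDigraph:
    """Namespaces as vertices, executed enriched scripts as labelled edges."""

    def __init__(self):
        self.edges = {}  # source namespace -> {target namespace: script label}

    def reachable(self, start):
        """Return every vertex reachable from start, including start itself."""
        seen, stack = {start}, [start]
        while stack:
            for nxt in self.edges.get(stack.pop(), {}):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def can_add(self, source, target):
        """True if adding source -> target keeps at most one path per vertex pair."""
        below = self.reachable(target)
        if source in below:  # the new edge would close a cycle
            return False
        above = {v for v in self.edges if source in self.reachable(v)} | {source}
        # A new path x -> source -> target -> y would duplicate an old path x -> y.
        return all(self.reachable(x).isdisjoint(below) for x in above)

    def add(self, source, target, script):
        if not self.can_add(source, target):
            raise ValueError(f"edge {source} -> {target} would break path uniqueness")
        self.edges.setdefault(source, {})[target] = script


g = ExecutedScriptsDigraph()
g.add("A", "B", "script1")
g.add("B", "C", "script2")
```

With these two edges, a direct edge from A to C is refused because the path A, B, C already exists, an edge from C to A is refused because it would close a cycle, and an edge from A to a fresh namespace D is accepted.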

Variables used by the algorithm

The following data is used for automatic transformations (note that if there are several automatic transformations in the workflow, the data is preserved between transformations, that is, the same variables may be used by several transformations):

  • the list of loaded RDF files;
  • the list of "see also" links to process;
  • the current XML document;
  • the list of namespaces whose information has been loaded;
  • the list of transformers loaded;
  • the precedences data;
  • digraph of executed scripts (see above).
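The variables listed above can be gathered into one state object that is shared by all automatic transformations in a workflow. This is a sketch under assumptions: the class name, the field names and all field types are hypothetical, since the specification does not fix a representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class AutomaticTransformationState:
    loaded_rdf_files: List[str] = field(default_factory=list)
    see_also_links: List[str] = field(default_factory=list)   # links still to process
    current_document: Optional[str] = None
    loaded_namespaces: List[str] = field(default_factory=list)
    loaded_transformers: List[str] = field(default_factory=list)
    precedences: Dict[str, List[str]] = field(default_factory=dict)  # assumed shape
    # Assumed shape: (source namespace, target namespace) -> executed script.
    executed_scripts: Dict[Tuple[str, str], str] = field(default_factory=dict)


def first_transformation(state: AutomaticTransformationState) -> None:
    state.loaded_rdf_files.append("http://example.org/a.rdf")


def second_transformation(state: AutomaticTransformationState) -> None:
    # Sees the data left behind by the first transformation.
    state.loaded_rdf_files.append("http://example.org/b.rdf")


state = AutomaticTransformationState()
first_transformation(state)
second_transformation(state)
```

Passing the same object to every transformation is one simple way to obtain the preservation of data between transformations that the text requires.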

The list of available enriched scripts between two given namespaces is:

  • a one-element list consisting of the enriched script, if an enriched script between these namespaces has already been executed;
  • otherwise, the list of all enriched scripts with the given source and target.
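The two cases above can be written as a short function. This is a sketch only: the function name and the representations are assumptions (executed scripts as a dictionary keyed by a pair of namespaces, loaded enriched scripts as triples of source, target and script).

```python
def available_scripts(source, target, executed, loaded):
    """Return the enriched scripts that may be used from source to target.

    executed: dict mapping (source, target) to the enriched script already run.
    loaded:   iterable of (source, target, script) triples for all known scripts.
    """
    if (source, target) in executed:
        # An executed script is the only choice, to keep tag meanings consistent.
        return [executed[(source, target)]]
    return [s for (src, tgt, s) in loaded if src == source and tgt == target]


loaded = [("A", "B", "s1"), ("A", "B", "s2"), ("B", "C", "s3")]
executed = {("A", "B"): "s2"}
```

With this data, available_scripts("A", "B", executed, loaded) yields only s2, while available_scripts("B", "C", executed, loaded) yields every loaded candidate, here just s3.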

The current document is ready for transformation if and only if, from every namespace in it, there is a path ending either in a destination namespace or in no namespace.
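The readiness condition can be sketched as a reachability test. The function below is illustrative only and rests on assumptions: edges is a hypothetical mapping from a namespace to its successors, "no namespace" is represented by None, and a namespace that is itself a destination counts as ready through the empty path.

```python
def is_ready(document_namespaces, edges, destinations):
    """True if every namespace has a path to a destination or to None."""
    goals = set(destinations) | {None}

    def reaches_goal(start):
        seen, stack = {start}, [start]
        while stack:
            vertex = stack.pop()
            if vertex in goals:
                return True
            for nxt in edges.get(vertex, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    return all(reaches_goal(ns) for ns in document_namespaces)


# A leads to the destination X through B; C is transformed away (None); D is stuck.
edges = {"A": ["B"], "B": ["X"], "C": [None]}
destinations = {"X"}
```

For this data, is_ready({"A", "C"}, edges, destinations) holds, whereas adding the namespace D, which has no outgoing path, makes the check fail.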

Before the pipeline starts, all user-specified RDF files are downloaded and processed.

Transformers and scripts in workflow

Scripts are applied to an XML document (receiving an XML document and producing another XML document) in accordance with "Order kinds of document transformers" below.

If the next element of the workflow refers to a :Script object, the referenced script is simply executed, transforming the input XML document into the output XML document. The script is not added to the digraph of executed scripts.

If the next element of the workflow refers to a :Transformer object, then the script is determined as described below (TODO: move the explanation above) and used to transform the input XML document into the output XML document. The script is not added to the digraph of executed scripts.

Rationale: Doing otherwise could violate the rule that there is at most one path between any pair of vertices in the digraph.
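The handling of the two kinds of workflow element can be sketched as follows. All names are hypothetical, documents are plain strings, and choose_script is only a placeholder that takes the first script of a transformer; the real selection rules are those described elsewhere in this specification.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Script:
    run: Callable[[str], str]   # transforms one XML document into another


@dataclass
class Transformer:
    scripts: List[Script]


def choose_script(transformer: Transformer) -> Script:
    """Placeholder for the real selection rules (assumption: take the first)."""
    return transformer.scripts[0]


def apply_element(element: Union[Script, Transformer], document: str) -> str:
    """Apply one workflow element; the digraph of executed scripts is not updated."""
    script = element if isinstance(element, Script) else choose_script(element)
    return script.run(document)


upper = Script(lambda d: d.upper())
wrap = Transformer([Script(lambda d: "<r>" + d + "</r>")])

out = apply_element(wrap, apply_element(upper, "<a/>"))
print(out)
```

Note that apply_element never records the script it ran, which mirrors the requirement that scripts named explicitly in the workflow are not added to the digraph.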

Order kinds of document transformers

See Order kinds of transformers

Automatic transformation

Automatic transformation