XML fragment processing

In some cases the stream-based approach to processing content is not very convenient. For example, suppose you have a <bookinfo> element that contains a large collection of metadata, and you want to extract just the <isbn> element it contains. You could do this by creating mappings to scan and suppress all of the other elements, but that could require a very large number of mappings. TopLeaf provides a more efficient way to extract content from an element that is part of the document being processed.

To identify the element to be processed, use a command such as this:

<set var="Fragment" copy="element"/>

The element pseudo-variable refers to the element matched by the current mapping. The mapping must have the scan box checked on the Content tab.

A variable created this way contains an object that represents a fragment of the input document. To extract values from it, use the xmlproc command, which accepts the following attributes:

  • The source attribute is required. It contains the name of the variable containing the fragment.

  • The select attribute is required. It contains an XPath 1.0 expression that is evaluated in the context of the root node of the fragment.

  • The var attribute is optional. If it is present, it contains the name of the variable that receives the extracted value. If it is not present, the value is processed as input.

The <isbn> extraction described above could be achieved with something like the following:

<set var="Bookinfo" copy="element"/>
<xmlproc source="Bookinfo" select="//isbn"/>
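
If the value is needed in a variable rather than being processed as input, the var attribute can be added. A minimal sketch, assuming a variable name Isbn chosen purely for illustration:

<set var="Bookinfo" copy="element"/>
<xmlproc source="Bookinfo" select="//isbn" var="Isbn"/>

With this form the extracted value is stored in Isbn instead of being processed as input.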

Namespaces in fragments

Any namespace prefixes used in the select expression must be declared using the following command:

<xmlns prefix="PRE" uri="NAMESPACE"/>
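
As an illustration, suppose the <isbn> element were in a namespace. The sketch below assumes a hypothetical prefix bk and a hypothetical namespace URI http://example.com/ns/book. Neither is defined by TopLeaf, so substitute the values used by your own documents. It declares the prefix and then uses it in the select expression:

<xmlns prefix="bk" uri="http://example.com/ns/book"/>
<set var="Bookinfo" copy="element"/>
<xmlproc source="Bookinfo" select="//bk:isbn"/>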

The following prefixes are declared automatically for processing files generated by TopLeaf:

Prefix  Namespace                                        Used for
tlt     http://www.turnkey.com.au/topleaf/v7.0/tltocndx  Tags in TOC file
tli     http://www.turnkey.com.au/topleaf/v7.0/tltocndx  Tags in Index file
tlx     http://www.turnkey.com.au/topleaf/v7.0/tlxref    Tags in XREF file
tlf     http://www.turnkey.com.au/topleaf/v7.0/tlfiling  Tags in Filing instructions/Live pages
cmk     http://turnkey.com.au/namespace/topleaf/custom   Custom markers in any file

Restrictions

The following restrictions apply when processing XML fragments:

  • A variable containing a fragment can only be used in the xmlproc command, or in the copy attribute of the set command to copy it to another variable. If it is used in any other context it will be treated as an empty string.

  • The element pseudo-variable name is valid only in the copy attribute of the set command. You cannot use {element} to refer to the content of the fragment.

  • The fragment must be parseable as a stand-alone XML document. TopLeaf will report an error if the fragment contains references to external entities, or to general entities that are not defined in the publication DTD or the internal DTD subset. Some predefined entities will always be resolved, even if they are not defined in either place.

  • It is not possible to process the content of a custom marker as an XML fragment.

  • Processing instructions are not included in the fragment content.
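
The first restriction above allows a fragment to be copied to another variable. A minimal sketch, assuming the hypothetical variable names Bookinfo and SavedInfo, and assuming that the copy attribute accepts the name of a fragment variable in the same way that it accepts the element pseudo-variable:

<set var="Bookinfo" copy="element"/>
<set var="SavedInfo" copy="Bookinfo"/>
<xmlproc source="SavedInfo" select="//isbn"/>

Here SavedInfo is used only as the source of an xmlproc command, which is one of the two uses that the restriction permits.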