Using XML Queries and Transformations

Pre-defined Templates

Apart from the templates that you define and implement yourself, two default (built-in) templates are provided for free. You can override them by creating a template that matches the same nodes. We haven't covered the implementation of templates yet, but it is still instructive to see what real, implemented templates look like:

<xsl:template match="*|/">
   <xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()|@*">
   <xsl:value-of select="."/>
</xsl:template>

What do we see? There are two templates defined. One matches all elements and the root (*|/). The other one matches both text nodes and all attributes. The implementation of the templates is fairly simple. The first one has only an xsl:apply-templates element. The implementation of the second template uses another element: xsl:value-of. This element generates text output containing the string value of the context node.

Now suppose that we try to transform the sample source document (family.xml) using only the built-in templates. What would happen? The document root would be matched by the first built-in template, whose pattern "*|/" matches any element as well as the root. The only thing this template does is call xsl:apply-templates with no select attribute. This causes the processor to process all child nodes (but not attributes).

The result of our sample source, transformed by only built-in templates, would be an empty document. If it contained any text nodes, these would appear in the output. But although no output appears in the result, all nodes in the document have been processed. This is an important fact. The default templates will process all nodes in the document.
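To see the built-in templates at work on a document that does contain text, consider a stylesheet with no templates of its own. This is only a sketch: the source document in the comment is hypothetical (it is not family.xml), and the xsl:output element is there only to suppress the XML declaration.

```xml
<!-- Hypothetical source document, used only for illustration:
     <GREETING>Hello <NAME>World</NAME></GREETING> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Plain text output, so no XML declaration is written -->
  <xsl:output method="text"/>
  <!-- No templates at all: every node is handled by the built-in rules -->
</xsl:stylesheet>
```

Applied to that source, the result should be just the text Hello World: the elements are visited but produce nothing themselves, while the text nodes are copied through by the second built-in template.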

If you implement your own template, you define the output for the element you are matching. But if you want the children of this element to be processed as well, you must make sure that you pass the context on to them. One of the most common mistakes is using a stylesheet like this:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:template match="/">
    <HTML><BODY>
    </BODY></HTML>
 </xsl:template>
 <xsl:template match="*">
    <!-- some content here -->
 </xsl:template>
</xsl:stylesheet>

Note that the first template contains no xsl:apply-templates element. This means that after processing the document root and outputting a document like this:

<HTML>
  <BODY/>
</HTML>  

the processor will stop. The context is not passed to any other node, so the XSLT processor assumes that the job is done. We must change that template to:

<xsl:template match="/">
   <HTML><BODY>
           <xsl:apply-templates/>
   </BODY></HTML>
</xsl:template>

Forgetting to pass the context from a node to its children is one of the most common mistakes when developing XSLT documents.

Of course, you may have good reasons to do it on purpose. Often, you don't want all nodes to appear in the destination document and you may decide not to pass focus to them at all. That's fine, as long as it is a deliberate decision to leave out apply-templates.
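A deliberate omission can be as small as an empty template. The sketch below assumes that we want to drop all PET elements of the family document from the output.

```xml
<!-- Empty template: a matched PET element produces no output, and because
     there is no xsl:apply-templates, its children are never visited -->
<xsl:template match="PET"/>
```

Because this template is more specific than the built-in one, it takes over for PET elements, while all other elements are still handled as before.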

Elements that Generate Output Elements

The most easily understandable elements in an XSLT document are the literals. A literal is any well-formed fragment of XML inside an xsl:template element that is not in the XSLT namespace. In other words, any content that is not prefixed xsl: (the conventional prefix for that namespace) is passed on to the result document. The output in the destination document is identical to the literal value in the XSLT document. This can be a piece of text, but also a tree of XML nodes.

This template will output a LITERALS element for each PERSON element it is used on (we have actually seen this already in the example for the xsl:template element). If the PERSON element has any child elements or attributes, these will not be included in the destination document:

<xsl:template match="PERSON"> 
   <LITERALS/>
  </xsl:template>  

Literal values can include both text and XML elements. Other nodes, like comments and processing instructions, cannot be output as literal values. A literal value must always be a well-formed piece of XML. So we cannot generate only an opening tag. This would prevent the XSLT document from being well-formed.
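As an illustration, a literal can be a small tree that mixes elements, attributes and text. The names ENTRY, LABEL and kind below are made up for this sketch and do not come from the sample documents.

```xml
<xsl:template match="PERSON">
  <!-- Everything inside this template is literal: two elements,
       one attribute and a piece of text -->
  <ENTRY kind="person">
    <LABEL>Family member</LABEL>
  </ENTRY>
</xsl:template>
```

Each matched PERSON element would produce the same ENTRY tree, because nothing in the template refers to the source node.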

value-of

The value-of element generates the string value of the specified node in the destination document.

The select attribute indicates which node's value should be output. It contains an XPath expression that is evaluated in the template's context. For example, this code would generate the text string in the destination document with the value of the name attribute of the matched PERSON element:

<xsl:template match="PERSON">
   <xsl:value-of select="@name"/>
  </xsl:template>  

copy

The copy element creates a node in the destination document with the same node name and node type as the context node. The copy element will not copy any children or attributes of an element. An example of using this element would be:

<xsl:template match="PERSON|PET">
     <xsl:copy/>
   </xsl:template> 

This template will output a PERSON element for each matched PERSON element in the source document and a PET element for each matched PET element in the source document. Any attributes of the copied elements will not show up in the destination document.

copy-of

The copy-of element is used to copy a set of nodes to the destination document. The select attribute can be used to indicate which nodes are to be copied. Unlike the copy element, copy-of will copy all children and all attributes of an element.

The copy-of element is very much like the value-of element, with two differences: copy-of does not convert the selected nodes to a string value, and it copies all selected nodes, not only the first. For example:

<xsl:template match="PERSON">
    <xsl:copy>
     <xsl:copy-of select="@name"/> 
   </xsl:copy>
 </xsl:template>  

This template creates a PERSON element for each matched PERSON element in the source document and copies any existing name attribute into it. Note how the copying of the attribute is placed within the copying of the element.

<xsl:template match="PERSON">
    <xsl:copy-of select="."/>
 </xsl:template>  

This template will copy a PERSON element with all its attributes and children (and further descendants) to the destination document for each matched PERSON element in the source.
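A widely used idiom combines copy with apply-templates to copy a document node by node. It is not part of the sample stylesheet; the sketch below is included only to show how copy and passing on the context work together.

```xml
<!-- Copies every node it matches, then passes the context on to that
     node's attributes and children so that they are copied in turn -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
```

On its own this reproduces the source document. The advantage over a single copy-of is that you can add more specific templates, for instance one matching PET, to change or drop individual nodes while everything else is copied unchanged.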

element

The element element (how meta can you get?) allows us to create elements in the destination document. You must use the name attribute to specify the element name. The namespace of the created element can be set using the optional namespace attribute. If you include a namespace attribute, the XSLT engine may decide to change the prefix you specified in the name attribute. The local name (everything after the colon) will remain intact.

<xsl:template match="PERSON">
    <xsl:element name="PERSONAL_DATA"/>
 </xsl:template>  

This template produces the same kind of output as the example for literals: one empty element (here named PERSONAL_DATA) for each matched PERSON element. You may wonder why you would ever use the element element if you can use literals. The extra value lies in the fact that the name and namespace attributes are not normal attributes, but 'attribute value templates'. We will explain those later.

attribute

The attribute element generates attributes in the destination document. It works in the same way as the element element, but inserting attributes is bound to some limitations:

  • You may not insert an attribute in an element after child elements have been added to that element.
  • You can only use this in the context of an element. Adding an attribute to a comment node is not allowed.
  • Within the attribute element, no nodes may be generated other than text nodes. Attribute nodes can not have child nodes.

This template will create a species attribute for each matched type attribute, inserting the value of the type attribute in the species attribute:

<xsl:template match="@type">
    <xsl:attribute name="species">
     <xsl:value-of select="."/>
   </xsl:attribute>
  </xsl:template>  
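Keep in mind that the built-in templates never visit attributes, so a template matching @type only runs when the attribute is selected explicitly. A possible way to call it, respecting the rule that attributes must be added before any child content, is sketched here:

```xml
<xsl:template match="PET">
  <PET>
    <!-- Select the attribute explicitly; this must come first, before
         any child nodes are added to the new PET element -->
    <xsl:apply-templates select="@type"/>
    <!-- Child content (a text node) follows the attribute -->
    <xsl:value-of select="@name"/>
  </PET>
</xsl:template>
```

Swapping the two instructions would break the first restriction in the list above, because the text node would already have been added to PET when the attribute is created.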

Attribute Value Templates

The attribute element is often used to create attributes in the output that have a calculated value. Because their value is not fixed, they cannot be specified in a literal element. Or can they? XSLT defines a special kind of attribute, called an attribute value template. All attributes of literal elements in an XSLT document are attribute value templates, and so are many attributes of predefined XSLT elements. An attribute value template can contain an expression part, which is evaluated before the element that the attribute belongs to is executed. The expression must be placed in curly braces, so this code:

<LITERAL some="blah{4+5}"/>

would create this node in the output:

<LITERAL some="blah9"/>

The expression is an XPath expression, so it can also select nodes from the source document. Using attribute value templates, a transformation can be made much more readable than it is with attribute elements. For example, this source fragment and template:

<photograph>
   <url>img/pic.jpg</url>
   <size width="40"/>
</photograph>

<xsl:template match="photograph">
   <img src="{url}" width="{size/@width}"/>
</xsl:template>

would create:

<img src="img/pic.jpg" width="40"/>

You cannot use nested braces. If you need to specify a {, use a double brace: {{. Check Appendix D to find out which attributes can be used as value templates.
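Attribute value templates are also what makes the element element more flexible than a literal: the element name itself can be calculated. The sketch below assumes that the type attribute of a PET always holds a valid XML name; if it does not, the processor will report an error.

```xml
<xsl:template match="PET">
  <!-- name is an attribute value template: a PET with type="dog"
       produces an element named dog -->
  <xsl:element name="{@type}">
    <xsl:value-of select="@name"/>
  </xsl:element>
</xsl:template>
```

A literal element cannot do this, because its name has to be written out in the stylesheet.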

A Stylesheet Example

Before we go on with any theory, we will now have a look at a sample. Remember the two XML documents specifying information about a family? It was the first code sample of Chapter 2.

We will create a transformation document to convert documents of type A into documents of type B. To work along and try the result of several elements, you may want to use a tool that allows you to see source, rules and destination documents side by side. Some good commercial tools exist, but we suggest using the free open source tool under development by some members of the VBXML mailing list. It is called XSLTester and can be downloaded from www.vbxml.com. The sample XSL files can be downloaded from the Wrox web site.

First we define a template that matches the root of the document and outputs all standard elements:

<xsl:template match="/">
   <FAMILY>
     <PERSONS>
      <xsl:apply-templates select="FAMILY/PERSON"/>
     </PERSONS>
     <PETS>
      <xsl:apply-templates  select="FAMILY/PET"/>
     </PETS>
    </FAMILY>
 </xsl:template>

The template generates a framework for the document and specifies the places where other content should appear. In this case, it makes the PERSON and PET elements appear in two different places. Note how two XPath expressions are used to select the nodes to which templates are applied next.

For each of the PERSON elements, we want to do a simple transformation: instead of having the name in a name attribute, it should be the content of the element:

<xsl:template match="PERSON">
   <PERSON>
     <xsl:value-of select="@name"/>
   </PERSON>
 </xsl:template>  

The PET element needs a more complex transformation. Like the PERSON element, it has its name attribute transformed into the element content. But the PET element in the source document also has a type attribute. In the destination syntax, this attribute is called species. We achieve this transformation with this template:

<xsl:template match="PET">
   <PET>
     <xsl:attribute name="species">
       <xsl:value-of select="@type"/>
     </xsl:attribute>
     <xsl:value-of select="@name"/>
   </PET>
 </xsl:template>  
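Put together in one file, the three templates might look like this. The xsl:stylesheet wrapper and its version attribute are not shown in the fragments above and are added here so that the file can be run as it stands.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Framework of the destination document -->
  <xsl:template match="/">
    <FAMILY>
      <PERSONS>
        <xsl:apply-templates select="FAMILY/PERSON"/>
      </PERSONS>
      <PETS>
        <xsl:apply-templates select="FAMILY/PET"/>
      </PETS>
    </FAMILY>
  </xsl:template>

  <!-- The name attribute becomes the element content -->
  <xsl:template match="PERSON">
    <PERSON>
      <xsl:value-of select="@name"/>
    </PERSON>
  </xsl:template>

  <!-- The type attribute is renamed to species; name becomes content -->
  <xsl:template match="PET">
    <PET>
      <xsl:attribute name="species">
        <xsl:value-of select="@type"/>
      </xsl:attribute>
      <xsl:value-of select="@name"/>
    </PET>
  </xsl:template>

</xsl:stylesheet>
```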

There it is – our first complete and functional XSLT document. Using MSXML, we could program a VB application that does this transformation containing code like this:

'Object to hold the format we cannot handle
Dim oDocFormatA As New DOMDocument
'Object that holds the format we know
Dim oDocFormatB As New DOMDocument
'Object that holds the XSLT stylesheet
Dim oXSLT As New DOMDocument

oDocFormatA.async = False
oXSLT.async = False
oDocFormatA.load "D:\sourceDocument.xml"
oXSLT.load "D:\stylesheet.xsl"
' Now save this string or process it further
oDocFormatB.loadXML (oDocFormatA.transformNode(oXSLT))

text

The text element creates a text node in the destination document, holding the content of the text element. This can also be achieved using literal text, but the content of a text element is also included if it consists only of whitespace. Including whitespace is the main reason for using the text element. See the sections on strip-space and preserve-space for more information on whitespace stripping. The following two templates output the same sentence; the only difference is that the second one also outputs the line breaks and indentation that surround the literal text:

<xsl:template match="PERSON">
   <xsl:text>A person element found</xsl:text>
 </xsl:template>
 <xsl:template match="PERSON">
   A person element found 
 </xsl:template>  
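The whitespace case is where the text element really matters. As a sketch, this template separates two attribute values of a PET element with a single space:

```xml
<xsl:template match="PET">
  <xsl:value-of select="@name"/>
  <!-- A single space; as a bare whitespace-only text node it would be
       stripped from the stylesheet, inside xsl:text it is kept -->
  <xsl:text> </xsl:text>
  <xsl:value-of select="@type"/>
</xsl:template>
```

Without the text element, the two values would be written directly after each other.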

processing-instruction

The processing-instruction element generates a processing instruction in the destination document. The syntax for creating a processing instruction is different from that for elements. So this code:

<xsl:processing-instruction name="xml-stylesheet">
   <xsl:text>href="style.xsl" type="text/xsl"</xsl:text>
</xsl:processing-instruction>

would generate in the destination document:

<?xml-stylesheet href="style.xsl" type="text/xsl"?>

This would be typical for an XSLT document that is used for pre-processing – specifying the transformation rules for the next step. Look at the very end of this chapter to see what the effect of this processing instruction is.

The attributes of the processing instruction (href and type) must be created as a text node instead of attributes. This is because the content of the processing instruction does not necessarily use an XML-based syntax.

The name attribute must contain a valid name for a processing instruction. This means that it cannot be 'xml' and therefore cannot be used to generate the XML declaration itself. To learn about how to create XML declarations, see the section on the xsl:output element.

It is not allowed to create any node other than a text node within the processing-instruction element. It is also forbidden to create textual content holding the string '?>' – it will be interpreted as the end of the processing instruction.

comment

The comment element is the way to generate a new comment in the destination document. A comment written in the XSLT document itself is ignored by the processor, so it never reaches the output. So this code:

<xsl:comment>This file was generated using XSLT</xsl:comment>

would generate this line in the destination document:

<!--This file was generated using XSLT-->

A comment can, of course, contain nothing but text nodes. As in any XML comment, the text must not contain the string '--'.
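The text of a comment does not have to be fixed; it can be generated like any other text. A minimal sketch, assuming the family document with its FAMILY root element:

```xml
<xsl:template match="/">
  <xsl:comment>
    <xsl:text>Persons in source: </xsl:text>
    <!-- value-of creates a text node, which is allowed inside a comment -->
    <xsl:value-of select="count(FAMILY/PERSON)"/>
  </xsl:comment>
  <xsl:apply-templates/>
</xsl:template>
```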

number

The number element is a special one. It is more or less a numerical conversion tool. It creates a numeric value in the output and has a ton of attributes for specifying which number and format should be output:

<xsl:number
   value = number-expression
   level = "single" | "multiple" | "any"
   count = pattern
   from = pattern
   format = { string }
   grouping-separator = { char }
   grouping-size = { number }
 />  

The simplest way to use the number element is by specifying the numeric value that should be output using the value attribute. The value attribute is evaluated and converted to a number (as if using the number function). This number is rounded to an integer value and converted back to a string value. So this code would output the position of the context node in the list of nodes currently being processed, followed by a dot and a space:

<xsl:number value="position()" format="1. "/>

The attributes of the number element can be separated into two groups: those necessary to calculate the numeric value and those necessary to format the numerical value into a string.
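The two groups can be sketched as follows. The first template lets the processor calculate the number from the position of the node in the source document; the second, a named template invented for this illustration (format-demo), only shows formatting of fixed values.

```xml
<xsl:template match="PERSON">
  <!-- No value attribute: the number is this PERSON's position among
       its PERSON siblings, written as "1. ", "2. " and so on -->
  <xsl:number count="PERSON" format="1. "/>
  <xsl:value-of select="@name"/>
</xsl:template>

<xsl:template name="format-demo">
  <!-- Alphabetic numbering: outputs "c) " -->
  <xsl:number value="3" format="a) "/>
  <!-- Roman numerals: outputs "IV" -->
  <xsl:number value="4" format="I"/>
  <!-- Digit grouping: outputs "1,234,567" -->
  <xsl:number value="1234567" grouping-separator="," grouping-size="3"/>
</xsl:template>
```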
