An XSLT document defines rules for transforming a specific kind of XML document into another kind of document. These rules are themselves defined in an XML-based document syntax. Most of this chapter will be used to describe all of the available elements in an XSLT document.
To differentiate the XSLT-specific elements in a stylesheet from other XML content, XSLT uses namespaces. The official XSLT namespace is http://www.w3.org/1999/XSL/Transform.
Remember that this URI does not necessarily point to any resource. It only tells the XSLT processor that these elements are part of an XSLT stylesheet. In this chapter we will always use the xsl namespace prefix for XSLT elements. This assumes that all our stylesheets contain this namespace declaration:
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
For example, if we talk about the template element in the XSLT namespace, we will display it as xsl:template.
The namespace URI serves only as a unique identifier that distinguishes XSLT elements from all other kinds of elements in the stylesheet.
stylesheet
The root element of any XSLT stylesheet document is normally the stylesheet element (exceptions are the transform element and the simplified syntax; both will be explained later). It holds a number of templates and can hold some more elements that specify settings. Elements that can appear in the stylesheet element (and only there) are called top-level elements. The syntax of the stylesheet element is shown:
<xsl:stylesheet
  id = id
  extension-element-prefixes = tokens
  exclude-result-prefixes = tokens
  version = number>
</xsl:stylesheet>
The version attribute of the stylesheet element is required. It ensures that later additions to the XSLT specification can be implemented without changing older stylesheets. This chapter describes version 1.0. When newer versions of the recommendation are specified, the version number can be increased (but the namespace for XSLT will remain stable, including the '1999'). If the version is set to anything higher than 1.0, this also affects the way a 1.0 processor works: the processor switches to forwards-compatible mode. In this mode, the processor ignores unknown elements and elements in unexpected places instead of reporting an error. You will rarely use the other attributes of the stylesheet element, but we'll discuss them here briefly anyway.
With the extension-element-prefixes attribute, it is possible to designate a number of namespace prefixes, other than the XSLT prefix, as extension element prefixes. This tells an XSLT processor that supports extensions to treat elements in these namespaces as possible extension elements; they might be extensions that it knows. Each prefix listed must be bound to a declared namespace.
If the stylesheet contains namespace declarations, these will normally appear in the result document as well, because they are copied along with the literal elements they are in scope for. The exceptions are the XSLT namespace itself and any namespaces declared as extension namespaces. If the stylesheet declares any other namespaces that you do not want to show up in the output, these can be excluded with the exclude-result-prefixes attribute.
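As a minimal sketch of exclude-result-prefixes, the stylesheet below declares a namespace only so that it can be used in an XPath expression. The doc prefix, its URI, and the summary element are hypothetical names chosen for illustration, and xsl:value-of is an output element that has not been described yet. Without the exclude-result-prefixes attribute, the declaration for doc would be copied onto the summary element in the output.

```xml
<?xml version="1.0"?>
<!-- The "doc" prefix and its URI are hypothetical, used only for illustration. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:doc="http://example.com/hypothetical/doc"
    exclude-result-prefixes="doc">

  <!-- The doc prefix is needed only inside the select expression. -->
  <xsl:template match="/">
    <summary>
      <xsl:value-of select="count(//doc:chapter)"/>
    </summary>
  </xsl:template>

</xsl:stylesheet>
```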
Just to give you the idea, we'll have a look at an extremely simple stylesheet here. We'll use some elements that we have not described yet, but we'll describe what happens afterwards.
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <root_node/>
  </xsl:template>
</xsl:stylesheet>
You will recognize the stylesheet element carrying the namespace declaration to indicate that this is an XSLT stylesheet. Inside the stylesheet is one xsl:template element. This element has a match attribute set to "/" and a child element root_node.
This template matches ('is a suitable template for') the document root (indicated by '/').
The only content of the template is the root_node element. This is not an XSLT element, but a literal element that is added to the output when this template is executed. When this stylesheet is used to transform an arbitrary XML document, the processor will start by processing the document root of the source document. It will find a suitable template in the stylesheet (the only template we have) and use it to process the document root. The only thing the template does is create a root_node element in the output document. This stylesheet will transform an arbitrary XML source document to:
<root_node/>
transform
The transform element is synonymous with the stylesheet element. It is included because the uses for XSLT have grown much wider than just giving style to XML content, but the stylesheet element is still the most common way to define a transformation. Functionally, there is no difference.
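To illustrate the equivalence, here is the simple stylesheet from the previous section written with xsl:transform as the root element instead. This is only a sketch; a conforming processor should treat both forms identically.

```xml
<?xml version="1.0"?>
<!-- Same transformation as before, with xsl:transform as the root element. -->
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <root_node/>
  </xsl:template>

</xsl:transform>
```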
import
To construct a stylesheet from several reusable fragments, the XSLT specification supports the importing of external stylesheet document fragments. This is done with either the import or include elements, for example:
<xsl:import href=uri-reference/>
The document retrieved from the URI should be a stylesheet document itself; the children of its stylesheet element are imported directly into the main stylesheet. The import element can only be used as a top-level element and must appear before all other top-level elements in the document, including any include elements. When the XSLT processor tries to match a node in the source document to a template in the stylesheet, it will first try to use one of the templates in the importing document before trying to use one of the imported templates. This allows you to create rules that are used in many stylesheets. A rule can be overridden by defining it again locally.
Neither the import nor the include element may reference the stylesheet that contains it, not even indirectly.
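The following sketch shows how a local rule overrides an imported one. The file name common.xsl and the p output element are hypothetical, and the example assumes that common.xsl is a stylesheet that may define its own template for para. The xsl:apply-templates element used in the template body is described later in this chapter.

```xml
<?xml version="1.0"?>
<!-- "common.xsl" is a hypothetical file name used only for illustration. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Imports come first among the top-level elements. -->
  <xsl:import href="common.xsl"/>

  <!-- Takes precedence over any template for para defined in common.xsl. -->
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>
```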
include
The include element is the simpler sibling of the import element:
<xsl:include href=uri-reference/>
It just inserts the rules from the referenced URI. These are parsed as if they were in the original document.
Like the import element, include can only appear at the top level. Unlike import, there is no restriction on the location of this element among the top-level elements.
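A small sketch of include, assuming a hypothetical file tables.xsl that holds additional templates. Note that the include element appears after a template here, which would not be allowed for import.

```xml
<?xml version="1.0"?>
<!-- "tables.xsl" is a hypothetical file name used only for illustration. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <root_node/>
  </xsl:template>

  <!-- The rules in tables.xsl are treated as if they were written here. -->
  <xsl:include href="tables.xsl"/>

</xsl:stylesheet>
```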
template
The template element is one of the main building blocks of an XSLT stylesheet. It consists of two parts, the matching pattern and the implementation. Roughly, you can say that the pattern defines which nodes will be acceptable as input for the template. The implementation defines what the output will look like. We will cover the implementation later, when we discuss the elements that generate output.
<xsl:template
  match = pattern
  name = qname
  priority = number
  mode = qname>
  <!-- Content: implementation -->
</xsl:template>
The attributes name, priority and mode are used to differentiate between several templates that match on the same node. In these cases several rules determine which template is preferred over the others. In the section titled "What if Several Templates Match?" we will show the use of these attributes.
The match attribute holds the matching pattern for the template. The matching pattern defines for which nodes in the source document this template is the appropriate processing rule. The syntax used is a subset of XPath. It contains only the child and attribute axes (but it is also legal to use "//" from the abbreviated syntax, so the descendant axis is also available). A template matches a node if the node is part of the result set of the pattern from any available context, which basically says that a node should be "selectable" with the pattern. We'll take a look at a few examples to clear this up.
Imagine that we are processing a document with chapters and paragraphs. The paragraphs are marked up with the element para, the chapters with chapter. We will look at possible values for the match attribute of the xsl:template element. This matches any para element that has a chapter element as a parent:
<xsl:template match="child::chapter/child::para">
</xsl:template>
Note that this will only work when the chapter element has a parent node. This parent node is the context from which the pattern selects the chapter element and, through it, the para element. Fortunately, all elements have a parent (the root element has the document root for a parent), so this pattern matches all para elements that have a chapter as a parent. This example matches all para elements:
<xsl:template match="para">
</xsl:template>
This matches any para element as well as any chapter element (note that parentheses are not allowed in XSLT 1.0 patterns):
<xsl:template match="chapter|para">
</xsl:template>
This matches any para element that has a chapter element as an ancestor:
<xsl:template match="chapter//para">
</xsl:template>
This matches the root node:
<xsl:template match="/">
</xsl:template>
This matches any node other than an attribute node or the root node:
<xsl:template match="node()">
</xsl:template>
This matches any para element that is the first para child of its parent:
<xsl:template match="para[position() = 1]">
</xsl:template>
This matches any title attribute (not an element that has a title attribute):
<xsl:template match="@title">
</xsl:template>
This matches only the odd-numbered para elements within their parent:
<xsl:template match="para[position() mod 2 = 1]">
</xsl:template>
Two interesting extra functions that you can use in the pattern are id() and key(). id('someLiteral') evaluates to the element that has 'someLiteral' as its ID value. This pattern matches all para elements that are children of the element whose ID attribute is set to 'Table1':
<xsl:template match="id('Table1')/para">
</xsl:template>
Note that the ID attribute is not necessarily called ID; it can be any attribute that is declared as having type ID in the DTD or Schema. The key() function does something similar, but refers to defined keys instead of elements by ID. Refer to the section covering the xsl:key element to learn more about the key() function.
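As a rough preview of key() in a pattern, the sketch below declares a key and then matches only the nodes that the key returns for a given value. The element name item, the attribute type, the key name items-by-type, and the table_item output element are all hypothetical; the details of xsl:key are covered in its own section.

```xml
<?xml version="1.0"?>
<!-- The names "item", "type", "items-by-type" and "table_item" are
     hypothetical, chosen only for illustration. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index every item element by the value of its type attribute. -->
  <xsl:key name="items-by-type" match="item" use="@type"/>

  <!-- Matches only the item elements whose type attribute is "table". -->
  <xsl:template match="key('items-by-type', 'table')">
    <table_item/>
  </xsl:template>

</xsl:stylesheet>
```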
apply-templates
In the simple and rather non-functional example we looked at in the section about the stylesheet element, we had only one template. This template matched on the document root. When the XSLT processor starts transforming a document with that stylesheet, it will first search for a template to match the document root. Our only template does this, so it is executed. It generates an output element and processing stops. All content held by nodes other than the document root is not processed. We need a way to tell the processor to carry on processing another node.
<xsl:apply-templates
  select = node-set-expression
  mode = qname>
</xsl:apply-templates>
This is done using the xsl:apply-templates element. It selects the nodes that should be processed next using an XPath expression. The nodes in the node set that is selected by this XPath expression will become the new context nodes. For these new context nodes, the processor will search for a new matching template. The transformed output of these nodes will appear within the output generated by the current template.
You may compare the use of the apply-templates element with calling a subroutine in a procedural programming language. There are only two possible attributes for the apply-templates element: select and mode.
The select attribute is the more important one. It specifies which nodes should be transformed now and have their transformed output shown. It holds an XPath expression. The expression is evaluated with the current context node. For each node in the result set, the processor will search for the appropriate template and transform it.
The default value for the select attribute is 'child::node()'. This selects all child nodes, but not attributes.
Let's make a few changes to our example and use xsl:apply-templates:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <root_node>
      <xsl:apply-templates/>
    </root_node>
  </xsl:template>
  <xsl:template match="*">
    <result_node>
      <xsl:apply-templates/>
    </result_node>
  </xsl:template>
</xsl:stylesheet>
Now we'll use the following source document to test the transformation:
<?xml version="1.0" ?>
<FAMILY>
  <PERSON name="Freddy" />
  <PERSON name="Maartje" />
  <PERSON name="Gerard"/>
  <PERSON name="Peter"/>
  <PET name="Bonzo" type="dog"/>
  <PET name="Arnie" type="cat"/>
</FAMILY>
Let's first have a look at the changes in the stylesheet. Something was added to the original template: the root_node element now has a child element, xsl:apply-templates. This means that when the template is executed, it will still write a root_node element to the output document, but between outputting the start tag and the end tag, it will try to process all nodes that are selected by the xsl:apply-templates element. This element has no select attribute, so that defaults to child::node(), which selects all child nodes of the current context (excluding attributes).
Another change is that we added a new template, matching on "*". All it does is generate a result_node element in the output document (which does not mean anything, it is just test output). This element too has an xsl:apply-templates child element.
We saved the sample XML source as family.xml and the stylesheet as test.xsl. Then we called the SAXON processor like this:
saxon -o destination.xml family.xml test.xsl
We'll follow the XSLT processor step-by-step as it creates an output document from the sample source document and our test stylesheet:
1. Try to match the root to one of the templates: the first template matches.
2. Process the implementation of the first template, using the root as the context node.
3. The implementation causes the output of a root_node element to the destination document and tells us to process all the child nodes of the root. The only child node is the FAMILY element. The XML declaration (<?xml version="1.0"?>) is not a node in the XPath data model, so it is not selected and will not be processed.
4. The processor tries to match the FAMILY element to one of the templates: the second template matches.
5. The implementation causes the output of a result_node element to the destination document (as a child of the root_node element) and tells us to process all the child nodes of the FAMILY element. These are all PERSON and PET elements.
6. The processor tries to match the PERSON element to one of the templates: the second template matches.
7. The second template generates a result_node element in the output and tells the processor to process the children of the element. It finds no children.
8. Steps 6 and 7 are repeated for all PERSON and PET elements.
The result of all this processing looks like this:
<root_node>
  <result_node>
    <result_node/>
    <result_node/>
    <result_node/>
    <result_node/>
    <result_node/>
    <result_node/>
  </result_node>
</root_node>
The outer element (root_node) is the transformed result of the document root; the element within the root_node is the transformed result of the FAMILY element in the source. All of the PERSON and PET elements are transformed to the six empty result_node elements.
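As a variation on the walkthrough above, the following sketch uses an explicit select attribute so that only some nodes are processed. It assumes the same family.xml source document. The first template selects only the PET elements below FAMILY, so the FAMILY and PERSON elements are never handed to a template. Run against family.xml, this should produce a root_node element containing two empty result_node elements, one for each PET.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <root_node>
      <!-- Only the PET children of FAMILY are selected; PERSON elements are skipped. -->
      <xsl:apply-templates select="FAMILY/PET"/>
    </root_node>
  </xsl:template>

  <!-- Each selected element becomes one empty result_node. -->
  <xsl:template match="*">
    <result_node/>
  </xsl:template>

</xsl:stylesheet>
```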
So, what about the mode attribute? We will discuss that in the section "What if Several Templates Match?"