Let's have a look at some more examples to demonstrate the use of XSLT. In the last part of this section, we will look at using XSLT to style an XML document in HTML, and there will be more examples there. Here we will cover examples that are not HTML-related, but aimed at converting one XML dialect into another. This will be a very common case in business-to-business e-commerce, where XML documents containing orders, inventories, product descriptions, etc., are sent automatically and converted on the fly to a format that is suitable for the target system.
Product Information Import
Think of a system that retrieves product descriptions from several suppliers to present users in the organization with a coherent view of all available products. Some of these suppliers will have their product range available in an XML format. In an ideal world, an agreement could be made with all suppliers about the format used for delivering the data. Unfortunately, in the real world not every supplier will be willing to do that, and the user will have to settle for whatever format is offered. Some suppliers will conform to an industry standard but, in the end, a transformation from some other format into the required one will be necessary.
The format that can be natively imported by our application looks like this:
<?xml version="1.0"?>
<Product>
  <ID>21456</ID>
  <Name>
    Nail clipper
  </Name>
  <Product_category>Personal care</Product_category>
  <Supplier>
    <Name>Clippers Inc.</Name>
    <Address>
      <Street_address>
        234, Wood lane
        Humblestown, MA
      </Street_address>
      <Country>USA</Country>
    </Address>
    <Contact>Macy Marble</Contact>
  </Supplier>
</Product>
The XML descriptions we receive from Clippers Inc. look like this:
<Clipper product-reference="21456">
  <FullName>Solid quality nail clipper, San Juanito steel</FullName>
  <Short>Nail clipper</Short>
</Clipper>
We want to transform this delivered format into our native format using XSLT. We could create a stylesheet for the transformation like this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <Product>
      <ID><xsl:value-of select="Clipper/@product-reference"/></ID>
      <Name>
        <!-- element names are case-sensitive: the source element is Short -->
        <xsl:value-of select="Clipper/Short"/>
      </Name>
      <Product_category>Personal care</Product_category>
      <!-- look up the supplier record in our own supplier file -->
      <xsl:copy-of
        select="document('http://ourserver/supplier_lookup.xml')/suppliers/supplier[Name = 'Clippers Inc.']"/>
    </Product>
  </xsl:template>
</xsl:stylesheet>
Let's look at the sample piece by piece. There is only one template, matching the root. This template contains a framework for the output document. The Product element and its ID child element are inserted as literals. The value of the ID element is fetched from the source document, by inserting the value of the product-reference attribute from the source. The same is done for the name: we create a Name element with literals and insert a value from the source document into it. Note that we chose to use the short name from the source and discard the long name. The Product_category element is hard-coded, because we expect only products in this category from this supplier.
Now comes the hard part. The supplier information is not provided in this case. Some suppliers will provide it, some will not. We could hard-code the supplier information in the stylesheet, but that would force us to update the stylesheet every time the supplier changes its address or we get a new contact person. Instead, we decided to store all supplier information in our own format in one file. While transforming the document, the processor does a lookup in the supplier_lookup.xml document and copies a whole fragment from that document to the destination document using copy-of.
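To make the lookup more concrete, here is a sketch of what supplier_lookup.xml might contain. The layout is an assumption: only the path /suppliers/supplier and the Name child are implied by the select expression in the stylesheet, and the remaining children simply mirror the Supplier part of our native format.

```xml
<?xml version="1.0"?>
<!-- Hypothetical supplier_lookup.xml; the layout is assumed for illustration -->
<suppliers>
  <supplier>
    <!-- the predicate [Name = 'Clippers Inc.'] is compared against this value -->
    <Name>Clippers Inc.</Name>
    <Address>
      <Street_address>
        234, Wood lane
        Humblestown, MA
      </Street_address>
      <Country>USA</Country>
    </Address>
    <Contact>Macy Marble</Contact>
  </supplier>
  <!-- one supplier element per supplier would follow here -->
</suppliers>
```

With a layout like this, copy-of copies the selected element as a whole, including its tag name. If the element in the lookup file is not named exactly as the native format expects (Supplier), one option is to write a literal Supplier element in the template and copy only the children, by appending /* to the select expression.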
Author Summary
Our second example is for a publishing company. All books are stored in a giant XML document (in fact they are stored in a database, but this database allows access to the data as if it were an XML document). A fragment of this document looks like this:
<publisher>
  <books>
    <book>
      <title>Stranger in a strange land</title>
      <ISBN>0441788386</ISBN>
      <author-ref ref="rh"/>
      <sold>2300000</sold>
    </book>
    <book>
      <title>Starman Jones</title>
      <ISBN>0345328116</ISBN>
      <author-ref ref="rh"/>
      <author-ref ref="jldr"/>
      <sold>80000</sold>
    </book>
    ...
  </books>
  <authors>
    <author id="rh">
      <first_name>Robert</first_name>
      <last_name>Heinlein</last_name>
    </author>
    <author id="jldr">
      <first_name>Judy-Lyn</first_name>
      <last_name>Del Rey</last_name>
    </author>
  </authors>
</publisher>
Note how the second book has several authors. To produce an overview of the most successful authors, the publisher wants to transform this huge books file into something like this:
<author>
  <name>Heinlein, Robert</name>
  <total_publications>67</total_publications>
  <total_sold>7343990</total_sold>
  <rank>1</rank>
</author>
Authors will be ranked by the total number of copies of books sold, and this should also determine their position in the document. So, the best selling author in the books document should be the highest on the list. This can be accomplished by this stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <bestsellers-list>
      <xsl:apply-templates select="/publisher/authors/author">
        <!-- sort numerically, best-selling author first -->
        <xsl:sort select="sum(/publisher/books/book[author-ref/@ref=current()/@id]/sold)"
                  data-type="number" order="descending"/>
        <!-- authors with equal sales are ordered by last name -->
        <xsl:sort select="last_name"/>
      </xsl:apply-templates>
    </bestsellers-list>
  </xsl:template>
  <xsl:template match="author">
    <!-- xsl:copy copies only the author tag; attributes and children are not copied -->
    <xsl:copy>
      <name><xsl:value-of select="last_name"/>, <xsl:value-of select="first_name"/></name>
      <total_publications>
        <xsl:value-of select="count(/publisher/books/book[author-ref/@ref=current()/@id])"/>
      </total_publications>
      <total_sold>
        <xsl:value-of select="sum(/publisher/books/book[author-ref/@ref=current()/@id]/sold)"/>
      </total_sold>
      <!-- position() is the author's place in the sorted list -->
      <rank><xsl:value-of select="position()"/></rank>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Some things in this stylesheet are worthy of further comment. First, note how the sum() and count() functions are used, both in the author template for calculating the number of publications and total number sold for each author, and in the sort element within the apply-templates element. Note also how the current() function is used to match the author-ref elements to the author elements they refer to. An interesting detail is that, inside the sort element, current() refers to a node from the newly selected set (the author being sorted), not to the context node of the template that contains the apply-templates element.
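The role of current() inside a predicate is easy to get wrong, so here is a minimal sketch that isolates it. It assumes the publisher document shown earlier; the titles-by-author wrapper element and the id attribute on the output are names invented for this illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- titles-by-author is a hypothetical wrapper element -->
    <titles-by-author>
      <xsl:apply-templates select="/publisher/authors/author"/>
    </titles-by-author>
  </xsl:template>
  <xsl:template match="author">
    <author id="{@id}">
      <!-- Inside the predicate, the context node is the book being tested.
           current() still refers to the author this template was invoked for. -->
      <xsl:copy-of select="/publisher/books/book[author-ref/@ref = current()/@id]/title"/>
      <!-- By contrast, book[author-ref/@ref = @id] would look for an id attribute
           on the book itself. Books have no id attribute here, so it selects nothing. -->
    </author>
  </xsl:template>
</xsl:stylesheet>
```

For the two books shown in the fragment, the author element for rh would receive both titles and the one for jldr only Starman Jones.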
If the source document is large, this stylesheet will probably take a long time to process. Many calculations are done in counting and summing the nodes. In these calculations, a lot of searching is done for books that have an author-ref element with a certain ref attribute. We could also implement this using a key. If the processor is optimized for using keys, this will speed things up significantly (but I don't know of any such processor at the time of writing). Even if it doesn't give us a performance gain (it still might in the future), our code becomes somewhat cleaner. The stylesheet would then look like this. See if you can figure it out.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- index every book under the ref value of each of its author-ref children -->
  <xsl:key match="/publisher/books/book"
           use="author-ref/@ref" name="books-by-author"/>
  <xsl:template match="/">
    <bestsellers-list>
      <xsl:apply-templates select="/publisher/authors/author">
        <!-- sort numerically, best-selling author first -->
        <xsl:sort select="sum(key('books-by-author', @id)/sold)"
                  data-type="number" order="descending"/>
        <xsl:sort select="last_name"/>
      </xsl:apply-templates>
    </bestsellers-list>
  </xsl:template>
  <xsl:template match="author">
    <!-- xsl:copy copies only the author tag; attributes and children are not copied -->
    <xsl:copy>
      <name><xsl:value-of select="last_name"/>, <xsl:value-of select="first_name"/></name>
      <total_publications>
        <xsl:value-of select="count(key('books-by-author', @id))"/>
      </total_publications>
      <total_sold>
        <xsl:value-of select="sum(key('books-by-author', @id)/sold)"/>
      </total_sold>
      <rank>
        <xsl:value-of select="position()"/>
      </rank>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
At the beginning of the stylesheet, we added an xsl:key element named 'books-by-author'. The key gives us direct access to a set of nodes from the source document. With the match attribute we specify which nodes we want to be able to access. In our case, we want access through the key to all book elements in the document (match="/publisher/books/book"). With the use attribute we specify the key value we want to use to access a book element. Here that is the ref attribute on the author-ref child element(s) of the book (use="author-ref/@ref"). Now suppose we use the key() function anywhere in the stylesheet like this:
key('books-by-author', 'rh')
This returns a node set containing all book elements that have an author-ref child element with ref="rh". Effectively, these are all books by Robert Heinlein. Using this, we could simplify some of the expressions in the stylesheet significantly.
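As a further illustration of the key() call above, the following sketch lists the titles of all books indexed under 'rh'. It assumes the publisher document shown earlier; the heinlein-titles element is a name made up for this example.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- same key as in the stylesheet above -->
  <xsl:key name="books-by-author" match="/publisher/books/book" use="author-ref/@ref"/>
  <xsl:template match="/">
    <!-- heinlein-titles is a hypothetical output element -->
    <heinlein-titles>
      <!-- key() returns every book that has an author-ref child with ref="rh" -->
      <xsl:for-each select="key('books-by-author', 'rh')">
        <xsl:sort select="title"/>
        <xsl:copy-of select="title"/>
      </xsl:for-each>
    </heinlein-titles>
  </xsl:template>
</xsl:stylesheet>
```

Because the use expression selects a node set, a book with several author-ref children is indexed once for each ref value. Starman Jones would therefore also be returned by key('books-by-author', 'jldr').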