XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (144 page)

Authors: Michael Kay

xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="2.0"

as="xs:integer"/>

select="$list[position()!=1]"/>
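
A minimal sketch of how a recursive template carrying these attributes might look. It assumes the speeches are SPEECH elements with LINE children; the template name max, the variable names first and max-of-rest, and the longest-speech wrapper element are all hypothetical. Only the parameter name list comes from the attributes above.

```xslt
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

  <!-- Returns the SPEECH with the most LINE children,
       or the empty sequence if the list is empty. -->
  <xsl:template name="max" as="element(SPEECH)?">
    <xsl:param name="list" as="element(SPEECH)*"/>
    <xsl:if test="exists($list)">
      <xsl:variable name="first" select="count($list[1]/LINE)" as="xs:integer"/>
      <!-- Recurse on everything after the first item. -->
      <xsl:variable name="max-of-rest" as="element(SPEECH)?">
        <xsl:call-template name="max">
          <xsl:with-param name="list" select="$list[position()!=1]"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:choose>
        <!-- "ge" keeps the earlier speech when two are equally long. -->
        <xsl:when test="$first ge count($max-of-rest/LINE)">
          <xsl:sequence select="$list[1]"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:sequence select="$max-of-rest"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:if>
  </xsl:template>

  <!-- Hypothetical entry point: wrap the longest speech in a result element. -->
  <xsl:template match="/">
    <longest-speech>
      <xsl:call-template name="max">
        <xsl:with-param name="list" select="//SPEECH"/>
      </xsl:call-template>
    </longest-speech>
  </xsl:template>

</xsl:stylesheet>
```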

Output

The output gives the text of the longest speech in this scene. It starts like this:

IAGO

O, sir, content you;

I follow him to serve my turn upon him:

We cannot all be masters, nor all masters

Cannot be truly follow'd. You shall mark

Many a duteous and knee-crooking knave,

That, doting on his own obsequious bondage,

Wears out his time, much like his master's ass,

For nought but provender, and when he's old, cashier'd:

Whip me such honest knaves…

Our version of AltovaXML 2008 gave the wrong answer on this stylesheet. Altova tell us there's a fix in the next release.

Note that this takes advantage of several new features of XSLT 2.0. The template uses <xsl:sequence> to return a reference to an existing node, rather than creating a copy of the node using <xsl:copy-of>. It also declares the type of the parameters expected by the template, and the type of the result, which is useful documentation and gives the XSLT processor information it can use to generate optimized code. I also found that while I was developing the stylesheet, many of my errors were trapped by the type checking. Note that the form as="element(SPEECH)" can be used even when there is no schema. The example could have been rewritten to make much heavier use of XSLT 2.0 features; for example, it could have been written using <xsl:function> rather than <xsl:template>, and the <xsl:choose> instruction could have been replaced by an XPath 2.0 if expression. The result would have occupied fewer lines of code, but it would not necessarily have been any more readable or more efficient.
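
To illustrate the alternative just mentioned, here is a minimal sketch of the same recursion written as a stylesheet function with XPath 2.0 if expressions. The function name local:longest, the variable name rest, the namespace URI bound to the local prefix, and the longest-speech wrapper element are assumptions made for the illustration, as is the LINE child element used to measure a speech.

```xslt
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="local-functions.uri"
    exclude-result-prefixes="xs local"
    version="2.0">

  <!-- Hypothetical function: the SPEECH with the most LINE children. -->
  <xsl:function name="local:longest" as="element(SPEECH)?">
    <xsl:param name="list" as="element(SPEECH)*"/>
    <!-- Longest speech among the items after the first, if there are any. -->
    <xsl:variable name="rest" as="element(SPEECH)?"
        select="if (count($list) gt 1)
                then local:longest($list[position() gt 1])
                else ()"/>
    <!-- For an empty list both counts are zero and $list[1] is empty,
         so the function returns the empty sequence. -->
    <xsl:sequence
        select="if (count($list[1]/LINE) ge count($rest/LINE))
                then $list[1]
                else $rest"/>
  </xsl:function>

  <xsl:template match="/">
    <longest-speech>
      <xsl:sequence select="local:longest(//SPEECH)"/>
    </longest-speech>
  </xsl:template>

</xsl:stylesheet>
```

The function is shorter than a named template with nested instructions, but the logic is packed into two expressions, which is the readability trade-off the text describes.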

There is another solution to this problem that may be more appropriate depending on the circumstances. This involves sorting the node-set, and taking the first or last element. It goes like this:
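
A minimal sketch of the sorting approach, assuming the speeches are SPEECH elements with LINE children. The variable name longest-speech and the wrapper element are hypothetical, and this is an illustration rather than necessarily the listing from the book.

```xslt
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">

  <xsl:template match="/">
    <xsl:variable name="longest-speech" as="element(SPEECH)?">
      <xsl:for-each select="//SPEECH">
        <!-- count() returns an xs:integer, so the sort is numeric, ascending. -->
        <xsl:sort select="count(LINE)"/>
        <!-- After sorting, the last item is the one with the most lines. -->
        <xsl:if test="position() = last()">
          <xsl:sequence select="."/>
        </xsl:if>
      </xsl:for-each>
    </xsl:variable>
    <longest-speech>
      <xsl:sequence select="$longest-speech"/>
    </longest-speech>
  </xsl:template>

</xsl:stylesheet>
```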

In principle, the recursive solution should be faster, because it only looks at each node once, whereas sorting all the values requires more work than is strictly necessary to find the largest. In practice, though, it rather depends on how efficiently recursion is implemented in the particular processor.

Another case where recursion has traditionally been useful is processing of a list presented in the form of a string containing a list of tokens. In XSLT 2.0, most such problems can be tackled much more conveniently using the XPath 2.0 tokenize() function, which breaks a string into a sequence by using regular expressions, or by using the <xsl:analyze-string> instruction described on page 230. But although these facilities are excellent at breaking a string into a sequence of substrings, they don't by themselves provide any ability to process the resulting sequence in a nonlinear way. Sometimes recursion is still the best way of tackling such problems.
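
As a small illustration of the two facilities, the sketch below splits one line of text into words, first with tokenize() and then with xsl:analyze-string. The element names result, token, and word are hypothetical, and the sample string is taken from the speech quoted earlier.

```xslt
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">

  <xsl:template match="/">
    <xsl:variable name="line" select="'We cannot all be masters, nor all masters'"/>
    <result>
      <!-- tokenize() treats the regex as a separator:
           here, any run of non-word characters. -->
      <xsl:for-each select="tokenize($line, '\W+')[. ne '']">
        <token><xsl:value-of select="."/></token>
      </xsl:for-each>
      <!-- xsl:analyze-string treats the regex as the thing to match:
           here, each run of word characters. -->
      <xsl:analyze-string select="$line" regex="\w+">
        <xsl:matching-substring>
          <word><xsl:value-of select="."/></word>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Both loops should yield the same eight words. Either way the result is a flat sequence handled one item at a time, which is why a task that needs to look at neighbouring items can still call for recursion.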

Example: Using Recursion to Process a Sequence of Strings

Suppose that you want to find all the lines within a play that contain the phrase A and B, where A and B are both names of characters in the play.

Source

There is only one line in the whole of Othello that meets these criteria. So you will need to run the stylesheet against the full play, othello.xml.

Stylesheet

The stylesheet naming-lines.xsl starts by declaring a global variable whose value is the set of names of the characters in the play, with duplicates removed and case normalized for efficiency:

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="local-functions.uri"
    exclude-result-prefixes="xs local"
    version="2.0">

<xsl:variable name="speakers"
    select="for $w in distinct-values(//SPEAKER) return upper-case($w)"/>

We'll write a function that splits a line into its words. This was hard work in XSLT 1.0, but it is now much easier.
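
A minimal sketch of such a function, written as a fragment that belongs inside the stylesheet element and relies on the xs and local prefixes declared in its header. The function name local:split and the choice of '\W+' as the separator are assumptions.

```xslt
<!-- Hypothetical helper: break a line of text into its words. -->
<xsl:function name="local:split" as="xs:string*">
  <xsl:param name="line" as="xs:string"/>
  <!-- Split on runs of non-word characters; the predicate drops the empty
       token produced when the line starts or ends with punctuation. -->
  <xsl:sequence select="tokenize($line, '\W+')[. ne '']"/>
</xsl:function>
```

Note that an apostrophe counts as a non-word character here, so a word such as follow'd comes back as two tokens.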

The next step is a function that tests whether a given word is the name of a character in the play:
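
A minimal sketch, again as a fragment inside the stylesheet element. The function name local:is-character is the one called later in the stylesheet; the parameter name word is an assumption, and $speakers is the global variable holding the upper-cased speaker names.

```xslt
<!-- True if the word, ignoring case, is the name of a speaker in the play. -->
<xsl:function name="local:is-character" as="xs:boolean">
  <xsl:param name="word" as="xs:string"/>
  <!-- "=" is a general comparison: true if any item of $speakers matches. -->
  <xsl:sequence select="upper-case($word) = $speakers"/>
</xsl:function>
```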

This way of doing case-independent matching isn't really recommended; it's better to use a collation designed for the purpose. But it works with this data. Note that we are relying on the “existential” properties of the = operator: that is, the fact that it compares the word on the left with every string in the $speakers sequence, and is true if any of them matches.

Now I'll write a named template that processes a sequence of words, and looks for the phrase A and B where A and B are both the names of characters.

lower-case($words[2]) = 'and' and
local:is-character($words[3])">

select="$words[position() gt 1]"/>
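
A minimal sketch of a recursive template consistent with these fragments, written as a fragment inside the stylesheet element. The template name scan-line and the output element hit are hypothetical, and the test on $words[1] is assumed by symmetry with the test on $words[3].

```xslt
<!-- Slides a three-word window along the sequence of words. -->
<xsl:template name="scan-line">
  <xsl:param name="words" as="xs:string*"/>
  <!-- Stop once fewer than three words remain. -->
  <xsl:if test="count($words) ge 3">
    <xsl:if test="local:is-character($words[1]) and
                  lower-case($words[2]) = 'and' and
                  local:is-character($words[3])">
      <hit>
        <xsl:value-of select="subsequence($words, 1, 3)" separator=" "/>
      </hit>
    </xsl:if>
    <!-- Recurse on the sequence with its first word removed. -->
    <xsl:call-template name="scan-line">
      <xsl:with-param name="words" select="$words[position() gt 1]"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>
```

Each call needs to see three adjacent words at once, which is the kind of nonlinear processing a plain loop over the tokens does not give you.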

Then comes the “main program,” the template rule that matches the root node. This simply calls the named template for each LINE element in the document, which causes an element to be output for every matching sequence:
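
A minimal sketch of such a template rule, as a fragment inside the stylesheet element. It assumes that lines are marked up as LINE elements, that a word-splitting function is available as local:split, and that the named template is called scan-line; both of those names are hypothetical.

```xslt
<xsl:template match="/">
  <xsl:for-each select="//LINE">
    <!-- The LINE element is atomized to its string value,
         then split into words. -->
    <xsl:call-template name="scan-line">
      <xsl:with-param name="words" select="local:split(.)"/>
    </xsl:call-template>
  </xsl:for-each>
</xsl:template>
```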

Output

The output is simply:

Othello and Desdemona
