XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (696 page)

This might mean writing a little more code, and it might take a little longer because work is being repeated—but it is usually the right approach. The problem of repeated processing can often be solved by using variables for the sequences used as input to both calculations: if you need to use a particular set of nodes as input to more than one process, save that sequence in a variable, which can then be supplied as a parameter to the two separate templates or functions.

One problem I encountered where multiple results were needed involved processing a tree, making small changes if particular situations were encountered. At the end I wanted both the new tree and a flag telling me whether any changes had been made. It turned out that the best way to solve this was first to search the tree looking for the nodes that needed to change and then to process it again, making the changes. Although this incurs an extra cost in processing the tree twice when changes are needed, it achieves a significant saving when no changes are needed, because it avoids making an unnecessary copy of the tree.
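The search-then-rewrite idea can be sketched outside XSLT. The following is a minimal Python illustration, not the stylesheet I actually used: the `Node` class, the `needs_change` rule (rename any element called "old"), and the function names are all hypothetical. The first pass only reads the tree; the second pass, which builds a new tree, runs only when the first pass found something to change.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical immutable tree node, standing in for an XML element."""
    name: str
    children: Tuple["Node", ...] = ()


def needs_change(node: Node) -> bool:
    # Hypothetical rule: elements named "old" must be renamed.
    return node.name == "old"


def any_change_needed(node: Node) -> bool:
    # Pass 1: a read-only search. Nothing is copied here.
    return needs_change(node) or any(any_change_needed(c) for c in node.children)


def rewrite(node: Node) -> Node:
    # Pass 2: build a new tree with the changes applied.
    name = "new" if needs_change(node) else node.name
    return Node(name, tuple(rewrite(c) for c in node.children))


def process(tree: Node) -> Tuple[Node, bool]:
    # Returns both results: the (possibly new) tree and the "changed" flag.
    changed = any_change_needed(tree)
    return (rewrite(tree) if changed else tree, changed)
```

When no change is needed, `process` hands back the original tree object untouched, which is where the saving comes from.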

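An approach that may sometimes give better performance is to write a template or function that returns a composite result. With XSLT 2.0, it is possible to return a composite result structured either as a sequence or as a tree. The calling code is then able to access the individual items of this sequence, or the individual nodes of the tree, using an XPath expression. For example, the following recursive template, when supplied with a sequence of nodes as a parameter, constructs a tree containing two elements, <min> and <max>, set to the minimum and maximum value of the nodes, respectively. A working stylesheet based on this example can be found in the download file as minimax.xsl; it can be used with a source document such as booklist.xml.

<!-- Assumed for illustration: the template name, the <min-max> wrapper element,
     the test on xsl:when, and the parameter defaults. The xs prefix must be
     bound to http://www.w3.org/2001/XMLSchema. -->
<xsl:template name="get-min-and-max">
  <xsl:param name="nodes" as="node()*"/>
  <xsl:param name="min-so-far" as="xs:double" select="number('INF')"/>
  <xsl:param name="max-so-far" as="xs:double" select="number('-INF')"/>
  <xsl:choose>
    <xsl:when test="exists($nodes)">
      <!-- Recurse on the rest of the sequence, carrying the running results -->
      <xsl:call-template name="get-min-and-max">
        <xsl:with-param name="nodes" select="remove($nodes, 1)"/>
        <xsl:with-param name="min-so-far"
                        select="if (number($nodes[1]) lt $min-so-far)
                                  then $nodes[1]
                                  else $min-so-far"/>
        <xsl:with-param name="max-so-far"
                        select="if (number($nodes[1]) gt $max-so-far)
                                  then $nodes[1]
                                  else $max-so-far"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <!-- No nodes left: return the composite result as a small tree -->
      <min-max>
         <min><xsl:value-of select="$min-so-far"/></min>
         <max><xsl:value-of select="$max-so-far"/></max>
      </min-max>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

When you call this template, you can let the second and third parameters take their default values:

<!-- Assumed for illustration: the variable name and the //price selection -->
<xsl:variable name="result">
  <xsl:call-template name="get-min-and-max">
    <xsl:with-param name="nodes" select="//price"/>
  </xsl:call-template>
</xsl:variable>
Minimum price is: <xsl:value-of select="$result/min-max/min"/>.
Maximum price is: <xsl:value-of select="$result/min-max/max"/>.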
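The same pattern, a recursive routine that carries its running results as parameters and returns one composite value, can be sketched in Python. This is only an analogy to the stylesheet, not a translation of minimax.xsl; the `MinMax` type, the function name, and the sample prices are assumptions made for the illustration.

```python
import math
from typing import NamedTuple, Sequence


class MinMax(NamedTuple):
    """Composite result: plays the role of the tree holding min and max."""
    minimum: float
    maximum: float


def get_min_and_max(values: Sequence[float],
                    min_so_far: float = math.inf,
                    max_so_far: float = -math.inf) -> MinMax:
    # Terminating condition: nothing left, so return both results together.
    if not values:
        return MinMax(min_so_far, max_so_far)
    first = float(values[0])
    # Recurse on the remainder; no variable is ever updated in place.
    return get_min_and_max(
        values[1:],
        first if first < min_so_far else min_so_far,
        first if first > max_so_far else max_so_far,
    )


# Hypothetical sample data; the caller relies on the default parameters.
prices = [12.50, 7.99, 30.00]
result = get_min_and_max(prices)
print(f"Minimum price is: {result.minimum}.")
print(f"Maximum price is: {result.maximum}.")
```

The caller picks the individual results out of the composite value by name, much as an XPath expression selects nodes from the returned tree. Note that Python limits recursion depth by default, so this sketch suits short sequences only.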

One particular situation where it is a good idea to save intermediate results in a variable, and then use them as input to more than one process, is where the intermediate results are sorted. If you've got a large set of nodes to sort, the last thing you want to do is to sort it more than once. The answer to this is to do the transformation in two passes: the first pass creates a sorted sequence, and the second does a transformation on this sorted sequence. If the first pass does nothing other than sorting, then the data passed between the two phases can simply be a sequence of nodes in sorted order. If it does other tasks as well as sorting (perhaps numbering or grouping), then it might be more appropriate for the first phase to construct a temporary tree.
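As a rough Python sketch of the sort-once idea (the book data and the function names are hypothetical), the first pass produces a sorted sequence that is held in a variable, and two separate second-pass computations both read from it:

```python
from typing import List, Tuple

Book = Tuple[str, float]  # (title, price); hypothetical record shape


def sort_by_price(books: List[Book]) -> List[Book]:
    # Pass 1: the only place where sorting happens.
    return sorted(books, key=lambda book: book[1])


def cheapest_titles(sorted_books: List[Book], n: int) -> List[str]:
    # Pass 2a: relies on the input already being in price order.
    return [title for title, _ in sorted_books[:n]]


def numbered_listing(sorted_books: List[Book]) -> List[str]:
    # Pass 2b: numbers the same sorted sequence without re-sorting it.
    return [f"{i}. {title} ({price:.2f})"
            for i, (title, price) in enumerate(sorted_books, start=1)]


books = [("Alpha", 25.0), ("Beta", 9.99), ("Gamma", 14.5)]
by_price = sort_by_price(books)       # saved once, used twice
top_two = cheapest_titles(by_price, 2)
listing = numbered_listing(by_price)
```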

There are actually two ways you can achieve a multistage transformation in XSLT:

  • Create a temporary tree in which the nodes appear in sorted order. Then use <xsl:for-each> or <xsl:apply-templates> to process the nodes on this tree in their sorted order. This is similar to the min-and-max example above; it relies on having either XSLT 2.0 or an XSLT 1.0 processor with the exslt:node-set() extension function.
  • Use a sequence of stylesheets (often called a pipeline): the first stylesheet creates a document in which the nodes are sorted in the right order, and subsequent stylesheets take this document as their input. Such a chain of stylesheets can be conveniently constructed using an XProc pipeline processor (http://www.w3.org/XML/Processing/) or, if you prefer to write your own Java code, by using the JAXP interface described in Appendix E. The advantage of this approach compared with a single stylesheet is that the individual stylesheets in the chain are easier to split apart and reuse in different combinations for different applications.

Note that neither of these techniques violates the XSLT design principle of “no side effects.”
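The shape of a pipeline can be illustrated with ordinary functions. The sketch below is an analogy only: it uses neither XProc nor JAXP, and the stage names and record layout are hypothetical. Each stage takes a document and returns a new one, so stages can be recombined freely and none of them modifies its input.

```python
from functools import reduce
from typing import Callable, Dict, List

Document = List[Dict[str, object]]        # hypothetical stand-in for an XML document
Stage = Callable[[Document], Document]    # one "stylesheet" in the chain


def sort_stage(records: Document) -> Document:
    # First stage: produce a new document with records in price order.
    return sorted(records, key=lambda r: r["price"])


def number_stage(records: Document) -> Document:
    # Later stage: takes the sorted document as input and adds positions.
    return [{**r, "position": i} for i, r in enumerate(records, start=1)]


def run_pipeline(stages: List[Stage], source: Document) -> Document:
    # Feed the output of each stage into the next one.
    return reduce(lambda doc, stage: stage(doc), stages, source)


source = [{"title": "A", "price": 3}, {"title": "B", "price": 1}]
output = run_pipeline([sort_stage, number_stage], source)
```

Because every stage builds a fresh result, the source document is unchanged after the run, which matches the "no side effects" principle.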

Don't Iterate, Recurse

One of the most common uses of variables in conventional programming is to keep track of where you are in a loop. Whether this is done using an integer counter in a for loop, or using an Iterator object to process a list, the principle is the same: we have a variable that represents how far we have got and that tells us when we are finished.

In a functional program, you can't do this, because you can't update variables. So instead of writing a loop, you need to write a recursive function.

In a conventional program a common way to process a list of items is as follows:

Iterator<Item> iterator = list.iterator();
while (iterator.hasNext()) {
   Item item = iterator.next();
   item.doSomething();
}

There's no assignment statement here (the item variable is declared repeatedly within the loop and never changes its value once initialized). However, the code relies on the iterator containing some sort of updateable variable that keeps track of where it is, and the iterator.next() call implicitly changes the state of this internal variable.

In a functional program we handle this by recursion rather than iteration. The pseudocode becomes:

function process(list) {
   if (!isEmpty(list)) {
      doSomething(getFirst(list));
      process(getRemainder(list));
   }
}

This function is called to process a list of objects. It does whatever is necessary with the first object in the list, and then calls itself to handle the rest of the list. (I'm assuming that getFirst() gets the first item in the list and getRemainder() gets a list containing all items except the first.) The list gets smaller each time the function is called, and when it finally becomes empty, the function exits, and unwinds through all the recursive calls.

It's important to make sure there is a terminating condition such as the list becoming empty. Otherwise, the function will keep calling itself forever—the recursive equivalent of an infinite loop.
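A runnable Python version of the pseudocode makes the terminating condition explicit. This is a minimal sketch: the `do_something` callback is a hypothetical stand-in for whatever processing each item needs, and slicing plays the role of getFirst() and getRemainder().

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def process(items: Sequence[T], do_something: Callable[[T], None]) -> None:
    # Terminating condition: an empty list ends the recursion.
    if not items:
        return
    do_something(items[0])             # handle the first item
    process(items[1:], do_something)   # recurse on a strictly shorter list


# Hypothetical usage: record the order in which items are visited.
visited = []
process(["a", "b", "c"], visited.append)
```

Each call receives a list one item shorter than the last, so the empty case is always reached. Dropping the emptiness test, or recursing on the full list instead of the remainder, would recurse until the interpreter's depth limit is hit.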

So, the first lesson in programming without variables is to use recursion rather than iteration to process a list. With XSLT, this technique isn't necessary to handle every kind of loop, because XSLT and XPath collectively provide built-in facilities, such as <xsl:apply-templates> and <xsl:for-each> and the for expression, that process all the members of a sequence, as well as functions like sum() and count() to do some common operations on sequences; but whenever you need to process a set of things that can't be handled with these constructs, you need to use recursion.
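To illustrate the distinction in Python terms (an analogy, with hypothetical data and function names): a total or a count is covered by a built-in, but a computation in which each result depends on the one before it, such as a list of running totals, is a natural fit for recursion with the running value passed as a parameter.

```python
from typing import List, Sequence

amounts = [10, 20, 5]  # hypothetical input sequence

# Covered by built-ins, much like sum() and count() in XPath.
total = sum(amounts)
count = len(amounts)


def running_totals(values: Sequence[int], total_so_far: int = 0) -> List[int]:
    # Each output depends on the previous one, so the running total
    # is carried forward as a parameter rather than updated in place.
    if not values:
        return []
    current = total_so_far + values[0]
    return [current] + running_totals(values[1:], current)


totals = running_totals(amounts)
```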
