XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition

The language specification says nothing about how extension functions are written, and how they are linked to the stylesheet. The notes that follow are provided to give an indication of the kind of techniques you are likely to encounter.

In the case of Java, several processors have provided a mechanism in which the name of the Java class is contained in the namespace URI of the function, while the name of the method is represented by the local name. This mechanism means that all the information needed to identify and call the function is contained within the function name itself. For example, if you want to call the Java method random() in class java.lang.Math to obtain a random number between 0.0 and 1.0, you can write:

<xsl:value-of select="Math:random()"
    xmlns:Math="ext://java.lang.Math"/>

Unfortunately, each processor has slightly different rules for forming the namespace URI, as well as different rules for converting function arguments and results between Java classes and the XPath type system, so it won't always be possible to make such calls portable between XSLT processors. But the example above works with both Saxon and Xalan.

This example calls a static method in Java, but most products also allow you to call Java constructors to return object instances, and then to call instance methods on those objects. To make this possible, the processor needs to extend the XPath type system to allow expressions to return values that are essentially wrappers around external Java objects. The XSLT and XPath specifications are written to explicitly permit this, though the details are left to the implementation.

For example, suppose you want to monitor the amount of free memory that is available, perhaps to diagnose an “out of memory” error in a stylesheet. You could do this by writing:

<xsl:message>
   Free memory:
   <xsl:value-of select="rt:freeMemory(rt:getRuntime())"
                 xmlns:rt="ext://java.lang.Runtime"/>
</xsl:message>


Again, this example is written to work with both Saxon and Xalan.

There are two extension function calls here: the call on getRuntime() calls a static method in the class java.lang.Runtime, which returns an instance of this class. The call on freeMemory() invokes an instance method of this class. By convention, instance methods are called by supplying the target instance as an extra first parameter in the call.
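
The calling convention just described can be sketched in plain Java using reflection. This is only an illustration of the idea, not how Saxon or Xalan actually implement it: the class ExtensionCallSketch and its call() helper are hypothetical, the ext:// prefix is simply stripped to obtain the class name, and no conversion between XPath and Java types or overload resolution by argument type is attempted.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;

public class ExtensionCallSketch {

    private static final String PREFIX = "ext://";

    /**
     * Hypothetical helper: the namespace URI names the class, the local
     * name names the method. For an instance method, the first argument
     * is taken to be the target object.
     */
    public static Object call(String namespaceUri, String localName, Object... args) {
        if (!namespaceUri.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Unsupported namespace: " + namespaceUri);
        }
        // "ext://java.lang.Math" -> "java.lang.Math"
        String className = namespaceUri.substring(PREFIX.length());
        try {
            Class<?> cls = Class.forName(className);
            for (Method m : cls.getMethods()) {
                if (!m.getName().equals(localName)) {
                    continue;
                }
                boolean isStatic = Modifier.isStatic(m.getModifiers());
                if (isStatic && m.getParameterCount() == args.length) {
                    // Static method: every argument is passed through.
                    return m.invoke(null, args);
                }
                if (!isStatic && args.length >= 1 && cls.isInstance(args[0])
                        && m.getParameterCount() == args.length - 1) {
                    // Instance method: the extra first argument is the target.
                    Object[] rest = Arrays.copyOfRange(args, 1, args.length);
                    return m.invoke(args[0], rest);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot call " + localName, e);
        }
        throw new IllegalStateException("No method " + localName + " in " + className);
    }

    public static void main(String[] argv) {
        // Corresponds to Math:random()
        Object random = call("ext://java.lang.Math", "random");
        // Corresponds to rt:freeMemory(rt:getRuntime())
        Object runtime = call("ext://java.lang.Runtime", "getRuntime");
        Object free = call("ext://java.lang.Runtime", "freeMemory", runtime);
        System.out.println("random = " + random);
        System.out.println("free memory = " + free);
    }
}
```

A real processor also has to choose between overloaded methods and map XPath values to Java types, which is where the products differ most.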

Another technique that's used for linking an extension function is to use a declaration in the stylesheet. Microsoft's processors use this approach to bind script functions written in JavaScript or VBScript. Here is an example of a simple extension function, written in VBScript, implemented using this mechanism with Microsoft's MSXML3/4 processor, together with an expression that calls it.

<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:ms="javascript:my-extensions">

<msxsl:script
   xmlns:msxsl="urn:schemas-microsoft-com:xslt"
   language="VBScript"
   implements-prefix="ms"
>
Function ToMillimetres(inches)
   ToMillimetres = inches * 25.4
End Function
</msxsl:script>

<xsl:template match="/">
   <xsl:variable name="test" select="12"/>
   <size><xsl:value-of select="ms:ToMillimetres($test)"/></size>
</xsl:template>

</xsl:stylesheet>

This is not a particularly well-chosen example, because it could easily be coded in XSLT, and it's generally a good idea to stick to XSLT code unless there is a very good reason not to; but it illustrates how it's done.

People sometimes get confused about the difference between script in the stylesheet, which is designed to be called as part of the transformation process, and script in the HTML output page, which is designed to be called during the display of the HTML in the browser. When the transformation is being done within the browser, and is perhaps invoked from script in another HTML page, it can be difficult to keep the distinctions clearly in mind. I find that it always helps in this environment to create a mock-up of the HTML page that you want to generate, test that it works as expected in the browser, and then start thinking about writing the XSLT code to generate it.

Sometimes you need to change configuration files or environment variables, or call special methods in the processor's API to make extension functions available; this is particularly true of products written in C or C++, which are less well suited to dynamic loading and linking.

In XSLT 2.0 (this is a change from XSLT 1.0), it is a static error if the stylesheet contains a call on a function that the compiler cannot locate. If you want to write code that is portable across processors offering different extension functions, you should therefore use the new use-when attribute to ensure that code containing such calls is not compiled unless the function is available. You can test whether a particular extension function is available by using the function-available() function. For example:

<xsl:template match="/">
   <xsl:sequence select="acme:moonshine($x)"
                 use-when="function-available('acme:moonshine')"/>
   <xsl:message use-when="not(function-available('acme:moonshine'))"
          >*** Sorry, moonshine is off today ***</xsl:message>
</xsl:template>


Extension functions, because they are written in general-purpose programming languages, can have side effects. For example, they can write to databases, they can ask the user for input, or they can maintain counters. At one time Xalan provided a sample application to implement a counter using extension functions, effectively circumventing the restriction that XSLT variables cannot be modified in situ. Even the simple Math:random() function introduced earlier has side effects, because it returns different results each time it is called. However, extension functions with side effects should be used with great care, because the XSLT specification doesn't say what order things are supposed to happen in. For example, it doesn't say whether a variable is evaluated when its declaration is first encountered, or when its value is first used. The more advanced XSLT processors adopt a lazy evaluation strategy in which (for example) variables are not evaluated until they are used. If extension functions with side effects are used to evaluate such variables, the results can be very surprising, because the order in which the extension functions are called becomes quite unpredictable. For example, if one function writes to a log file and another closes this file, you could find that the log file is closed before it is written to. In fact, if a variable is never used, the extension function contained in its definition might not be evaluated at all.
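
The effect of lazy evaluation on side effects can be imitated in a few lines of Java. This is a sketch under stated assumptions, not a model of any particular XSLT processor: the class LazySideEffects and its helpers are hypothetical, and a memoizing Supplier stands in for a variable whose value is computed only on first use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class LazySideEffects {

    // Records the order in which the side-effecting functions actually run.
    private static final List<String> LOG = new ArrayList<>();

    // Stand-in for an extension function with a side effect.
    static String sideEffect(String name) {
        LOG.add(name);
        return name;
    }

    // Stand-in for a lazily evaluated variable: computed at most once, on first use.
    static <T> Supplier<T> lazy(Supplier<T> initializer) {
        return new Supplier<T>() {
            private boolean evaluated;
            private T value;

            @Override
            public T get() {
                if (!evaluated) {
                    value = initializer.get();
                    evaluated = true;
                }
                return value;
            }
        };
    }

    public static List<String> run() {
        LOG.clear();
        // Declared in the order: write, close, unused.
        Supplier<String> written = lazy(() -> sideEffect("write"));
        Supplier<String> closed = lazy(() -> sideEffect("close"));
        Supplier<String> unused = lazy(() -> sideEffect("never called"));

        // Used in a different order: close first, then write (twice).
        String result = closed.get() + "," + written.get() + "," + written.get();
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [close, write]
    }
}
```

The log shows the close happening before the write, the write happening only once although the variable is referenced twice, and the third function never running at all.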

Before writing an extension function, there are a number of alternatives you should consider:

  • Can the function be written in XSLT, using an <xsl:function> element?
  • Is it possible to supply the required information as a stylesheet parameter? Generally this provides a cleaner and more portable solution.
  • Is it possible to get the result by calling the document() function, with a suitable URI? The URI passed to the document() function does not have to identify a static file; it could also invoke a web service. The Java JAXP API allows you to write a URIResolver class that intercepts the call on the document() function, so the URIResolver can return the results directly without needing to access any external resources. The System.Xml namespace in the Microsoft .NET framework has a similar capability, referred to as an XmlResolver.
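
As an illustration of the last alternative, here is a minimal sketch of a JAXP URIResolver that answers one particular URI from memory. The class name InMemoryResolver, the URI app:greeting, and the XML it returns are all made up for the example; only the javax.xml.transform interfaces are real.

```java
import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

public class InMemoryResolver implements URIResolver {

    // Hypothetical URI that a stylesheet would pass to document().
    static final String GREETING_URI = "app:greeting";

    @Override
    public Source resolve(String href, String base) {
        if (GREETING_URI.equals(href)) {
            // Build the document in memory; no file or network access is needed.
            StreamSource source =
                    new StreamSource(new StringReader("<greeting>Hello</greeting>"));
            source.setSystemId(href);
            return source;
        }
        // Returning null asks the processor to resolve the URI itself.
        return null;
    }

    public static void main(String[] args) throws Exception {
        InMemoryResolver resolver = new InMemoryResolver();

        // Registering the resolver on a Transformer makes it the handler
        // for document() calls made during that transformation.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setURIResolver(resolver);

        // Calling the resolver directly shows what the processor would receive.
        Source source = resolver.resolve(GREETING_URI, "");
        System.out.println("Resolved: " + source.getSystemId());
    }
}
```

A stylesheet run with this resolver could then call document('app:greeting') and receive the in-memory document, assuming the processor consults the registered resolver for that call.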

Extension Instructions

An extension instruction is an element occurring within a sequence constructor that belongs to a namespace designated as an extension namespace. A namespace is designated as an extension namespace by including its namespace prefix in the extension-element-prefixes attribute of the <xsl:stylesheet> element, or in the xsl:extension-element-prefixes attribute of the element itself, or of a containing extension element or literal result element. For example, Saxon provides an extension instruction <saxon:while> to perform looping while a condition remains true. There is no standard XSLT construct for this because without side effects, a condition once true can never become false. But when used in conjunction with extension functions, <saxon:while> can be a useful addition.

Example: Using an Extension Instruction

The following stylesheet (sysprops.xsl) uses the <saxon:while> element to display the values of all the Java system properties. It does not use a source document, and can be run in Saxon by using the option -it:main on the command line.

Stylesheet

The stylesheet calls five methods in the Java class library:

  • System.getProperties() to get a Properties object containing all the system properties
  • Properties.propertyNames() to get an Enumeration of the names of the system properties
  • Enumeration.hasMoreElements() to determine whether there are more system properties to come
  • Enumeration.nextElement() to get the next system property
  • Properties.getProperty() to get the value of the system property with a given name. For this method, the Properties object is supplied as the first argument, and the name of the required property as the second

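<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>

<xsl:template name="main">
  <system-properties
       xmlns:System="ext://java.lang.System"
       xmlns:Properties="ext://java.util.Properties"
       xmlns:Enumeration="ext://java.util.Enumeration"
       xsl:exclude-result-prefixes="System Properties Enumeration">
    <xsl:variable name="props"
                  select="System:getProperties()"/>
    <xsl:variable name="enum"
                  select="Properties:propertyNames($props)"/>
    <saxon:while test="Enumeration:hasMoreElements($enum)"
          xsl:extension-element-prefixes="saxon"
          xmlns:saxon="http://saxon.sf.net/">
       <xsl:variable name="property-name"
                     select="Enumeration:nextElement($enum)"/>
       <property name="{$property-name}"
           value="{Properties:getProperty($props, $property-name)}"/>
    </saxon:while>
  </system-properties>
</xsl:template>

</xsl:stylesheet>
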

Note that for this to work, saxon must be declared as an extension element prefix; otherwise, <saxon:while> would be interpreted as a literal result element and would be copied to the output. I've chosen to declare it with the smallest possible scope, to mark the parts of the stylesheet that are non-portable. The xsl:exclude-result-prefixes attribute is not strictly necessary, but it prevents the output being cluttered with unnecessary namespace declarations.

Technically, this code is unsafe. Although it appears that the extension functions are read-only, the Enumeration object actually contains information about the current position in a sequence, and the call to nextElement() modifies this information; it is therefore a function call with side effects. In practice you can usually get away with such calls. However, as optimizers become more sophisticated, stylesheets that rely on side effects can sometimes work with one version of an XSLT processor, and fail with the next version. So you should use such constructs only when you have no alternative.
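
For comparison, the same five Java calls can be written directly in Java, which makes the hidden state in the Enumeration easier to see. This is only a sketch: the class name SysPropsSketch and the choice of a map as the result are assumptions made for the illustration.

```java
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

public class SysPropsSketch {

    // Collects every system property using the same five calls as the stylesheet.
    public static Map<String, String> collect() {
        Properties props = System.getProperties();        // System.getProperties()
        Enumeration<?> names = props.propertyNames();     // Properties.propertyNames()
        Map<String, String> result = new LinkedHashMap<>();
        while (names.hasMoreElements()) {                 // Enumeration.hasMoreElements()
            // nextElement() advances the enumeration's internal position:
            // this is the side effect that the loop depends on.
            String name = (String) names.nextElement();   // Enumeration.nextElement()
            result.put(name, props.getProperty(name));    // Properties.getProperty()
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : collect().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

In Java the order of the calls is fixed by the language, so the loop is safe; in the stylesheet the same loop works only as long as the processor happens to evaluate the calls in the order written.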

A tip: with Saxon, the -TJ option on the command line can be useful for debugging. It gives you diagnostic output showing which Java classes were searched to find methods matching the extension function calls. Another useful option is -explain, which shows how the optimizer has rearranged the execution plan.
