XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition
Authors: Michael Kay
The language specification says nothing about how extension functions are written, and how they are linked to the stylesheet. The notes that follow are provided to give an indication of the kind of techniques you are likely to encounter.
In the case of Java, several processors have provided a mechanism in which the name of the Java class is contained in the namespace URI of the function, while the name of the method is represented by the local name. This mechanism means that all the information needed to identify and call the function is contained within the function name itself. For example, if you want to call the Java method random() in class java.lang.Math to obtain a random number between 0.0 and 1.0, you can write:
<xsl:value-of select="Math:random()"
    xmlns:Math="ext://java.lang.Math"/>
Unfortunately, each processor has slightly different rules for forming the namespace URI, as well as different rules for converting function arguments and results between Java classes and the XPath type system, so it won't always be possible to make such calls portable between XSLT processors. But the example above works with both Saxon and Xalan.
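To make the naming convention concrete, here is a minimal Java sketch of how a processor might turn such a function name into a method call. It is an illustration only, not the code of Saxon, Xalan, or any other product: the class ExtUriBinding and its callStatic() method are hypothetical names, the ext:// prefix is simply the one used in the example above, and only public static methods without arguments are handled.

```java
import java.lang.reflect.Method;

// Hypothetical helper for illustration; not part of any real XSLT processor.
final class ExtUriBinding {
    // URI scheme taken from the example in the text; real products differ.
    static final String SCHEME = "ext://";

    // The namespace URI supplies the class name, the local name supplies
    // the method name. Only public static zero-argument methods are handled.
    static Object callStatic(String namespaceUri, String localName) {
        if (!namespaceUri.startsWith(SCHEME)) {
            throw new IllegalArgumentException("Not an extension namespace: " + namespaceUri);
        }
        String className = namespaceUri.substring(SCHEME.length());
        try {
            Class<?> target = Class.forName(className);
            Method method = target.getMethod(localName);
            return method.invoke(null); // null target: the method is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot call " + className + "." + localName, e);
        }
    }

    public static void main(String[] args) {
        // Corresponds to the call Math:random() in the stylesheet
        System.out.println(callStatic("ext://java.lang.Math", "random"));
    }
}
```

A real processor also has to convert the XPath arguments to Java types and the Java result back to an XPath value, which is where most of the differences between products arise.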
This example calls a static method in Java, but most products also allow you to call Java constructors to return object instances, and then to call instance methods on those objects. To make this possible, the processor needs to extend the XPath type system to allow expressions to return values that are essentially wrappers around external Java objects. The XSLT and XPath specifications are written to explicitly permit this, though the details are left to the implementation.
For example, suppose you want to monitor the amount of free memory that is available, perhaps to diagnose an “out of memory” error in a stylesheet. You could do this by writing:
<xsl:value-of select="rt:freeMemory(rt:getRuntime())"
    xmlns:rt="ext://java.lang.Runtime"/>
Again, this example is written to work with both Saxon and Xalan.
There are two extension function calls here: the call on getRuntime() calls a static method in the class java.lang.Runtime, which returns an instance of this class. The call on freeMemory() is an instance method in this class. By convention, instance methods are called by supplying the target instance as an extra first parameter in the call.
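The convention of passing the target object as an extra first argument can be sketched in Java as follows. This is an assumption about how a processor might implement the rule, not a description of any particular product: ExtInstanceCall and call() are hypothetical names, and methods are matched by name and argument count only, with no type conversion.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;

// Hypothetical dispatcher for illustration; not a real processor API.
final class ExtInstanceCall {
    // Finds a public method by name and arity. A static method receives all
    // the arguments; an instance method treats args[0] as the target object
    // and receives the remaining arguments.
    static Object call(Class<?> cls, String name, Object... args) {
        try {
            for (Method m : cls.getMethods()) {
                if (!m.getName().equals(name)) {
                    continue;
                }
                boolean isStatic = Modifier.isStatic(m.getModifiers());
                int expected = m.getParameterCount() + (isStatic ? 0 : 1);
                if (expected != args.length) {
                    continue;
                }
                if (isStatic) {
                    return m.invoke(null, args);
                }
                Object target = args[0];
                Object[] rest = Arrays.copyOfRange(args, 1, args.length);
                return m.invoke(target, rest);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Call to " + name + " failed", e);
        }
        throw new IllegalArgumentException("No matching method: " + name);
    }

    public static void main(String[] args) {
        // Corresponds to rt:freeMemory(rt:getRuntime())
        Object runtime = call(Runtime.class, "getRuntime");
        System.out.println(call(Runtime.class, "freeMemory", runtime));
    }
}
```

The value returned by getRuntime() is exactly the kind of wrapped external object described above: the stylesheet cannot look inside it, but it can pass it back to another extension function.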
Another technique that's used for linking an extension function is to use a declaration in the stylesheet. Microsoft's processors use this approach to bind functions written in a scripting language such as JavaScript or VBScript. Here is an example of a simple extension function implemented using this mechanism with Microsoft's MSXML3/4 processor, and an expression that calls it.
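<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ms="javascript:my-extensions">
<msxsl:script
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    language="VBScript"
    implements-prefix="ms"
>
Function ToMillimetres(inches)
    ToMillimetres = inches * 25.4
End Function
</msxsl:script>
<xsl:template match="/">
<xsl:variable name="test" select="12"/>
<size><xsl:value-of select="ms:ToMillimetres($test)"/></size>
</xsl:template>
</xsl:stylesheet>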
This is not a particularly well-chosen example, because it could easily be coded in XSLT, and it's generally a good idea to stick to XSLT code unless there is a very good reason not to; but it illustrates how it's done.
People sometimes get confused about the difference between script in the stylesheet, which is designed to be called as part of the transformation process, and script in the HTML output page, which is designed to be called during the display of the HTML in the browser. When the transformation is being done within the browser, and is perhaps invoked from script in another HTML page, it can be difficult to keep the distinctions clearly in mind. I find that it always helps in this environment to create a mock-up of the HTML page that you want to generate, test that it works as expected in the browser, and then start thinking about writing the XSLT code to generate it.
Sometimes you need to change configuration files or environment variables, or call special methods in the processor's API to make extension functions available; this is particularly true of products written in C or C++, which are less well suited to dynamic loading and linking.
In XSLT 2.0 (this is a change from XSLT 1.0), it is a static error if the stylesheet contains a call on a function that the compiler cannot locate. If you want to write code that is portable across processors offering different extension functions, you should therefore use the new use-when attribute to ensure that code containing such calls is not compiled unless the function is available. You can test whether a particular extension function is available by using the function-available() function. For example:
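<xsl:sequence select="acme:moonshine($x)"
    use-when="function-available('acme:moonshine')"/>
<xsl:text
    use-when="not(function-available('acme:moonshine'))"
    >*** Sorry, moonshine is off today ***</xsl:text>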
Extension functions, because they are written in general-purpose programming languages, can have side effects. For example, they can write to databases, they can ask the user for input, or they can maintain counters. At one time Xalan provided a sample application to implement a counter using extension functions, effectively circumventing the restriction that XSLT variables cannot be modified in situ. Even the simple Math:random() function introduced earlier has side effects, because it returns different results each time it is called. However, extension functions with side effects should be used with great care, because the XSLT specification doesn't say what order things are supposed to happen in. For example, it doesn't say whether a variable is evaluated when its declaration is first encountered, or when its value is first used. The more advanced XSLT processors adopt a lazy evaluation strategy in which (for example) variables are not evaluated until they are used. If extension functions with side effects are used to evaluate such variables, the results can be very surprising, because the order in which the extension functions are called becomes quite unpredictable. For example, if one function writes to a log file and another closes this file, you could find that the log file is closed before it is written to. In fact, if a variable is never used, the extension function contained in its definition might not be evaluated at all.
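The effect of lazy evaluation on side effects can be imitated in a few lines of Java. The sketch below is not how any XSLT processor is implemented; the Lazy class and the names in it are hypothetical, and a list of strings stands in for the log file. Each "variable" is bound to a piece of code that runs only when its value is first requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Illustration only: imitates a processor that evaluates variables lazily.
final class LazySideEffects {
    // A value whose defining code runs on first use, and at most once.
    static final class Lazy<T> implements Supplier<T> {
        private Supplier<T> body;
        private T value;
        private boolean evaluated;

        Lazy(Supplier<T> body) {
            this.body = body;
        }

        @Override
        public T get() {
            if (!evaluated) {
                value = body.get();
                evaluated = true;
                body = null; // the defining code is no longer needed
            }
            return value;
        }
    }

    // Declares three variables whose definitions have side effects, then
    // uses two of them in the opposite order to their declaration.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Lazy<String> written = new Lazy<>(() -> { log.add("write"); return "w"; });
        Lazy<String> closed = new Lazy<>(() -> { log.add("close"); return "c"; });
        Lazy<String> unused = new Lazy<>(() -> { log.add("never"); return "n"; });
        String result = closed.get() + written.get();
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [close, write]
    }
}
```

The log records the close before the write, because the order of the side effects follows the order in which the values are used, and the third definition never runs at all.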
Before writing an extension function, there are a number of alternatives you should consider:
Extension Instructions
An extension instruction is an element occurring within a sequence constructor that belongs to a namespace designated as an extension namespace. A namespace is designated as an extension namespace by including its namespace prefix in the extension-element-prefixes attribute of the <xsl:stylesheet> element, or in the xsl:extension-element-prefixes attribute of the element itself, or of a containing extension element or literal result element. For example, Saxon provides an extension instruction, <saxon:while>, which repeats its contents while a condition remains true.
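Example: Using an Extension Instruction
The following stylesheet (sysprops.xsl) uses the <saxon:while> extension instruction to list the Java system properties. It needs no source document: you can run it by naming the initial template with -it:main on the command line.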
Stylesheet
The stylesheet calls five methods in the Java class library:
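<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:template name="main">
<system-properties
    xmlns:System="ext://java.lang.System"
    xmlns:Properties="ext://java.util.Properties"
    xmlns:Enumeration="ext://java.util.Enumeration"
    xsl:exclude-result-prefixes="System Properties Enumeration">
  <xsl:variable name="props"
      select="System:getProperties()"/>
  <xsl:variable name="enum"
      select="Properties:propertyNames($props)"/>
  <saxon:while test="Enumeration:hasMoreElements($enum)"
      xsl:extension-element-prefixes="saxon"
      xmlns:saxon="http://saxon.sf.net/">
    <xsl:variable name="property-name"
        select="Enumeration:nextElement($enum)"/>
    <property name="{$property-name}"
        value="{Properties:getProperty($props, $property-name)}"/>
  </saxon:while>
</system-properties>
</xsl:template>
</xsl:stylesheet>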
Note that for this to work, saxon must be declared as an extension element prefix; otherwise, <saxon:while> would be treated as a literal result element and copied to the output. The xsl:exclude-result-prefixes attribute is not strictly necessary, but it prevents the output being cluttered with unnecessary namespace declarations.
Technically, this code is unsafe. Although it appears that the extension functions are read-only, the Enumeration object actually contains information about the current position in a sequence, and the call to nextElement() modifies this information; it is therefore a function call with side effects. In practice you can usually get away with such calls. However, as optimizers become more sophisticated, stylesheets that rely on side effects can sometimes work with one version of an XSLT processor, and fail with the next version. So you should use such constructs only when you have no alternative.
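It may help to see the same loop written directly in Java, where the hidden state is easier to spot. This is a sketch for comparison rather than part of the stylesheet; the class name SysPropsLoop and the method listProperties() are made up for the illustration, and the result is collected in a sorted map only to make the output predictable.

```java
import java.util.Enumeration;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Illustration only: the Java equivalent of the loop in the stylesheet.
final class SysPropsLoop {
    // Collects every property name and value, sorted by name.
    static Map<String, String> listProperties(Properties props) {
        Map<String, String> result = new TreeMap<>();
        Enumeration<?> names = props.propertyNames();
        // Each call to nextElement() moves the enumeration forward, so the
        // same expression returns a different value every time it is evaluated.
        while (names.hasMoreElements()) {
            String name = (String) names.nextElement();
            result.put(name, props.getProperty(name));
        }
        return result;
    }

    public static void main(String[] args) {
        listProperties(System.getProperties())
            .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

In Java the statements execute in the order written, so the loop is safe. The stylesheet relies on the processor evaluating hasMoreElements() and nextElement() once per iteration and in that order, which is something the XSLT specification does not promise.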
A tip: with Saxon, the -TJ option on the command line can be useful for debugging. It gives you diagnostic output showing which Java classes were searched to find methods matching the extension function calls. Another useful option is -explain, which shows how the optimizer has rearranged the execution plan.