XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (103 page)

Authors: Michael Kay

are equal or not (spot the difference). XPath itself doesn't define what the default collation is (and neither does XSLT); it leaves the choice to the user, and the way you select it depends on the configuration options of your particular XSLT processor. If you want more control over the choice of a collation, you can use the compare() function, which is described in detail in Chapter 13 (see page 727).
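As a minimal sketch of choosing a collation explicitly, the stylesheet below calls compare() with the Unicode codepoint collation, which every XPath 2.0 processor is required to support. The template name main and the variable name codepoint are illustrative choices, and the result of the last call is left unstated because it depends on how your processor's default collation is configured.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- URI of the Unicode codepoint collation (compares strings by code point). -->
  <xsl:variable name="codepoint"
      select="'http://www.w3.org/2005/xpath-functions/collation/codepoint'"/>

  <!-- Hypothetical entry point; invoke it as the initial named template. -->
  <xsl:template name="main">
    <xsl:value-of separator="&#10;" select="(
        compare('a', 'B', $codepoint),  (: 1, because 'a' is U+0061 and 'B' is U+0042 :)
        compare('B', 'a', $codepoint),  (: -1 :)
        compare('a', 'a', $codepoint),  (: 0 :)
        compare('a', 'B')               (: uses the default collation; result may differ :)
      )"/>
  </xsl:template>

</xsl:stylesheet>
```

With a processor that accepts a named initial template (Saxon, for example, offers an option for this on its command line), the first three lines of output should be 1, -1, and 0.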

The handling of the < and > operators is not backward compatible with XPath 1.0. In XPath 1.0, these operators, when applied to two strings, attempted to convert both strings to numbers, and compared them numerically. This meant, for example, that "4" = "4.0" was false (because they were compared as strings), while "4" >= "4.0" was true (because they were compared as numbers). In XPath 2.0, two strings are compared as strings, so if you want to compare them as numbers, you must convert them to numbers explicitly, for example by using the number() function.
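The difference can be sketched with a small stylesheet that evaluates the same comparisons both ways. This assumes the default collation is the Unicode codepoint collation (a common default, but configurable), and the template name main is only an illustrative choice. Note that < must be written as &lt; inside an XML attribute.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical entry point; invoke it as the initial named template. -->
  <xsl:template name="main">
    <xsl:value-of separator="&#10;" select="(
        '4' = '4.0',                     (: false: different strings :)
        '4' &gt;= '4.0',                 (: false as strings under codepoint collation :)
        number('4') &gt;= number('4.0'), (: true: 4 and 4.0 are equal as numbers :)
        '10' &lt; '9',                   (: true as strings: '1' sorts before '9' :)
        number('10') &lt; number('9')    (: false as numbers :)
      )"/>
  </xsl:template>

</xsl:stylesheet>
```

The stylesheet sets version="2.0", so backward-compatibility mode is off and the string literals are compared as strings unless number() is applied first.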

The library of functions available for handling strings is considerably expanded from XPath 1.0. It includes:

concat() and string-join() to concatenate strings with or without separators
contains(), starts-with(), and ends-with() to test whether a string contains a particular substring
substring(), substring-before(), and substring-after() to extract part of a string
upper-case() and lower-case() to change the case of characters in a string
string-length() to find the length of a string
normalize-space() to remove unwanted leading, trailing, and inner whitespace characters
normalize-unicode() to remove differences in the way equivalent Unicode characters are represented (for example, the letter Ç with a cedilla can be represented as either one Unicode character or two)
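The functions listed above can be tried out with a short stylesheet such as the following sketch. The input strings and the template name main are made up for illustration; the expected results in the XPath comments assume the default collation is the Unicode codepoint collation.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical entry point; invoke it as the initial named template. -->
  <xsl:template name="main">
    <xsl:value-of separator="&#10;" select="(
        concat('XSLT', ' ', '2.0'),              (: XSLT 2.0 :)
        string-join(('a', 'b', 'c'), '-'),       (: a-b-c :)
        contains('stylesheet', 'sheet'),         (: true :)
        starts-with('stylesheet', 'style'),      (: true :)
        ends-with('stylesheet', 'style'),        (: false :)
        substring('stylesheet', 6, 5),           (: sheet, positions count from 1 :)
        substring-before('name=value', '='),     (: name :)
        substring-after('name=value', '='),      (: value :)
        upper-case('xpath'),                     (: XPATH :)
        lower-case('XPath'),                     (: xpath :)
        string-length('xpath'),                  (: 5 :)
        normalize-space('  too   many  spaces '),(: too many spaces :)
        string-length('C&#x327;'),               (: 2: C followed by a combining cedilla :)
        string-length(normalize-unicode('C&#x327;', 'NFC')) (: 1: composed into one character :)
      )"/>
  </xsl:template>

</xsl:stylesheet>
```

The last two lines show the cedilla case: the decomposed form is two characters long, and normalizing it to NFC yields the single precomposed character.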

Perhaps the most powerful addition to the string-handling capability in XPath 2.0 is the introduction of support for regular expressions, familiar to programmers using languages such as Perl. Regular expressions provide a powerful way of matching and manipulating the contents of a string. They are used in three functions:

matches() tests whether a string matches a particular regular expression. For example, matches("W151TBH", "^[A-Z][0-9]+[A-Z]+$") returns true. (This regular expression matches any string consisting of one uppercase letter, then one or more digits, and then one or more uppercase letters.)
replace() replaces the parts of a string that match a given regular expression with a replacement string. For example, replace("W151TBH", "^[A-Z]([0-9]+)[A-Z]+$", "$1") returns 151. The $1 in the replacement string supplied as the third argument picks up the characters that were matched by the part of the regular expression written in parentheses.
tokenize() splits a string into a sequence of strings, by treating any character sequence that matches the regular expression as a separator. For example, tokenize("abc/123/x", "/") returns the sequence "abc", "123", "x".
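The three regular expression functions can be exercised together in a stylesheet like the sketch below. It repeats the examples from the text and adds a few made-up inputs (the date string, the comma-separated list, and the template name main are illustrative assumptions, not taken from the book).

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical entry point; invoke it as the initial named template. -->
  <xsl:template name="main">
    <xsl:value-of separator="&#10;" select="(
        matches('W151TBH', '^[A-Z][0-9]+[A-Z]+$'),   (: true :)
        matches('w151tbh', '^[A-Z][0-9]+[A-Z]+$'),   (: false: lowercase letters :)
        matches('W151TBH', '[0-9]+'),                (: true: unanchored, a substring matches :)
        replace('W151TBH', '^[A-Z]([0-9]+)[A-Z]+$', '$1'),         (: 151 :)
        replace('2009-03-15', '^(\d+)-(\d+)-(\d+)$', '$3/$2/$1'),  (: 15/03/2009 :)
        string-join(tokenize('abc/123/x', '/'), ' | '),            (: abc | 123 | x :)
        string-join(tokenize('red , green,blue', '\s*,\s*'), ' | ') (: red | green | blue :)
      )"/>
  </xsl:template>

</xsl:stylesheet>
```

The third call is worth noting: without the ^ and $ anchors, matches() returns true as soon as any substring matches the expression. The tokenize() results are joined with string-join() here only to make the sequence boundaries visible in text output.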
