XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition

are equal or not (spot the difference). XPath itself doesn't define what the default collation is (and neither does XSLT); it leaves the choice to the user, and the way you select it depends on the configuration options for your particular XSLT processor. If you want more control over the choice of a collation, you can use the compare() function, which is described in detail in Chapter 13 (see page 727).
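
As a rough illustration of choosing a collation explicitly, the expressions below pass a collation URI as the third argument of compare(). The URI shown is the Unicode codepoint collation, which every conforming processor has to support; other collation URIs are processor-specific and are not shown here. Each line is a separate XPath 2.0 expression, with the expected result in a comment.

```xpath
(: Codepoint collation: strings are ordered by their Unicode codepoint values :)
compare("abc", "abd", "http://www.w3.org/2005/xpath-functions/collation/codepoint")   (: -1 :)
compare("abc", "abc", "http://www.w3.org/2005/xpath-functions/collation/codepoint")   (: 0 :)
compare("a", "B", "http://www.w3.org/2005/xpath-functions/collation/codepoint")       (: 1, because "a" is codepoint 97 and "B" is 66 :)

(: Two-argument form: uses the default collation, so the result depends on the processor configuration :)
compare("a", "B")
```

The function returns -1, 0, or 1 according to whether the first string sorts before, equal to, or after the second under the chosen collation.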

The handling of the < and > operators is not backward compatible with XPath 1.0. In XPath 1.0, these operators, when applied to two strings, attempted to convert both strings to numbers and compared them numerically. This meant, for example, that "4" = "4.0" was false (because they were compared as strings), while "4" >= "4.0" was true (because they were compared as numbers). In XPath 2.0, these operators compare two strings as strings, so if you want to compare strings as numbers, you must convert them to numbers explicitly, for example by using the number() function.
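
The difference can be sketched with a few expressions. The results assume an XPath 2.0 processor running without XPath 1.0 compatibility mode and, for the string comparisons, the Unicode codepoint collation as the default; with a different default collation the string results may differ.

```xpath
"4" = "4.0"                      (: false: the two strings differ :)
"4" >= "4.0"                     (: false in XPath 2.0: "4" sorts before "4.0" as a string :)
"10" < "9"                       (: true: compared as strings, "1" sorts before "9" :)

(: Explicit conversion restores the numeric comparison :)
number("4") >= number("4.0")     (: true :)
number("10") < number("9")       (: false :)
```

When one of these expressions is written inside an attribute of an XSLT stylesheet, the < character has to be escaped as &lt; because the stylesheet is an XML document.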

The library of functions available for handling strings is considerably expanded from XPath 1.0. It includes:

  • concat() and string-join() to concatenate strings with or without separators
  • contains(), starts-with(), and ends-with() to test whether a string contains a particular substring
  • substring(), substring-before(), and substring-after() to extract part of a string
  • upper-case() and lower-case() to change the case of characters in a string
  • string-length() to find the length of a string
  • normalize-space() to remove unwanted leading, trailing, and inner whitespace characters
  • normalize-unicode() to remove differences in the way equivalent Unicode characters are represented (for example, the letter Ç with a cedilla can be represented as either one Unicode character or two)
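
The functions listed above can be tried out with expressions along the following lines. The sample strings are made up for illustration, each line is a separate XPath 2.0 expression, and the results in the comments assume the default collation treats these plain ASCII strings in the obvious way.

```xpath
concat("a", "b", "c")                        (: "abc" :)
string-join(("a", "b", "c"), ", ")           (: "a, b, c" :)

contains("banana", "nan")                    (: true :)
starts-with("banana", "ban")                 (: true :)
ends-with("banana", "ban")                   (: false :)

substring("banana", 2, 3)                    (: "ana": three characters starting at position 2 :)
substring-before("2008-04-01", "-")          (: "2008" :)
substring-after("2008-04-01", "-")           (: "04-01" :)

upper-case("banana")                         (: "BANANA" :)
lower-case("BANANA")                         (: "banana" :)
string-length("banana")                      (: 6 :)
normalize-space("  a   b  ")                 (: "a b" :)

(: "C" (codepoint 67) followed by a combining cedilla (codepoint 807) :)
string-length(codepoints-to-string((67, 807)))                       (: 2 :)
string-length(normalize-unicode(codepoints-to-string((67, 807))))    (: 1: composed into the single character Ç :)
```

Note that normalize-space() does not delete inner whitespace altogether: it collapses each inner run of whitespace to a single space.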

Perhaps the most powerful addition to the string-handling capability in XPath 2.0 is the introduction of support for regular expressions, familiar to programmers using languages such as Perl. Regular expressions provide a powerful way of matching and manipulating the contents of a string. They are used in three functions:

  • matches() tests whether a string matches a particular regular expression. For example, matches("W151TBH", "^[A-Z][0-9]+[A-Z]+$") returns true. (This regular expression matches any string consisting of one uppercase letter, then one or more digits, and then one or more uppercase letters.)
  • replace() replaces the parts of a string that match a given regular expression with a replacement string. For example, replace("W151TBH", "^[A-Z]([0-9]+)[A-Z]+$", "$1") returns 151. The $1 in the replacement string supplied as the third argument picks up the characters that were matched by the part of the regular expression written in parentheses.
  • tokenize() splits a string into a sequence of strings, by treating any character sequence that matches the regular expression as a separator. For example, tokenize("abc/123/x", "/") returns the sequence "abc", "123", "x".
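
A few further variations on the three functions above may help to show how anchoring, flags, and captured groups behave. These are illustrative expressions rather than examples from the text; each line is a separate XPath 2.0 expression, with the expected result in a comment.

```xpath
(: Without ^ and $ the pattern may match anywhere in the string :)
matches("W151TBH", "[0-9]{3}")                       (: true :)
matches("W151TBH", "^[0-9]+$")                       (: false: the whole string is not digits :)

(: The optional third argument supplies flags; "i" requests case-insensitive matching :)
matches("w151tbh", "^[A-Z][0-9]+[A-Z]+$", "i")       (: true :)

(: Every non-overlapping match is replaced :)
replace("W151TBH", "[0-9]", "#")                     (: "W###TBH" :)

(: $1, $2, $3 refer to the first, second, and third parenthesized groups :)
replace("W151TBH", "^([A-Z])([0-9]+)([A-Z]+)$", "$3-$2-$1")   (: "TBH-151-W" :)

(: The separator is itself a regular expression :)
tokenize("a, b,c", ",\s*")                           (: ("a", "b", "c") :)
tokenize("abc/123/x", "/")[2]                        (: "123" :)
```

Because a backslash has no special meaning in an XPath string literal, a regular expression such as ",\s*" can be written without doubling the backslash.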
