XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (555 page)


Historically, the same algorithm has been used to escape URLs and URIs using character encodings other than UTF-8. However, in most environments where XPath is used, UTF-8 is the recommended encoding for URIs, and this is therefore the only encoding that the encode-for-uri() function supports.

Which characters need to be escaped? The answer to this depends on context. Essentially, characters fall into three categories: those that can be used freely anywhere in a URI, those that cannot be used anywhere and must always be escaped, and those that have a special meaning in a URI and must be escaped if they are to be used without this special meaning. This function escapes everything except the first category (referred to in the RFC as unreserved characters).

Because this function applies escaping to characters that have special meaning in a URI, such as / and :, it should never be used to escape a URI as a whole, only to escape the strings that make up the components of a URI while it is being constructed. In theory, this is always the right way to construct a URI: each of its components should be escaped individually (for example, the URI scheme, the authority, the path components, the query parameters, and the fragment identifier), and the components should then be assembled by adding the appropriate delimiters. This is the only way of ensuring, for example, that an = sign is escaped if it appears as an ordinary character in a path component, but not if it appears between a keyword and a value in the query part.
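
The component-by-component approach can be sketched as a small XSLT 2.0 stylesheet. This is an illustration under stated assumptions: the host name and the three component values are hypothetical, and the stylesheet is assumed to be run with the named template main as its entry point.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical component values, chosen to contain reserved characters -->
  <xsl:variable name="folder" select="'a=b'"/>
  <xsl:variable name="file" select="'100% cotton.html'"/>
  <xsl:variable name="query" select="'R&amp;D'"/>

  <xsl:template name="main">
    <!-- Escape each component on its own, then add the delimiters unescaped -->
    <xsl:value-of select="concat(
        'http://www.example.com/',
        encode-for-uri($folder), '/',
        encode-for-uri($file),
        '?q=', encode-for-uri($query))"/>
    <!-- Output: http://www.example.com/a%3Db/100%25%20cotton.html?q=R%26D -->
  </xsl:template>

</xsl:stylesheet>
```

The = inside the path component is escaped as %3D, while the = that separates the keyword q from its value is added as a literal delimiter and so stays as it is.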

But often in practice the unescaped URI (if I may call it that; technically, if it isn't escaped then it isn't a URI) arrives in one piece and escaping needs to be applied to the whole string. In this case an alternative approach is to use the iri-to-uri() function described on page 811, or, if the URI appears in the context of an HTML document, the escape-html-uri() function described on page 775.
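
To make the difference concrete, the following sketch applies all three functions to the same whole string. The input value is hypothetical, and the stylesheet is again assumed to be run with the named template main as its entry point.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical input containing a space and a non-ASCII character (e-acute) -->
  <xsl:variable name="raw" select="'http://www.example.com/my caf&#xE9;/'"/>

  <xsl:template name="main">
    <xsl:value-of separator="&#10;" select="(
        encode-for-uri($raw),
        iri-to-uri($raw),
        escape-html-uri($raw))"/>
    <!-- Output, one result per line:
         http%3A%2F%2Fwww.example.com%2Fmy%20caf%C3%A9%2F
         http://www.example.com/my%20caf%C3%A9/
         http://www.example.com/my caf%C3%A9/
    -->
  </xsl:template>

</xsl:stylesheet>
```

Only encode-for-uri() destroys the delimiters. iri-to-uri() escapes the space and the non-ASCII character, while escape-html-uri() escapes only the non-ASCII character and leaves the space alone.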

See Also

escape-html-uri() on page 775

iri-to-uri() on page 811

escape-uri-attributes serialization option: Chapter 15 page 938

ends-with

The ends-with() function tests whether one string ends with another string. For example, the expression ends-with('17 cm', 'cm') returns true.

Signature

Argument               Type         Meaning
input                  xs:string?   The containing string
test                   xs:string?   The test string
collation (optional)   xs:string    A collation URI
Result                 xs:boolean   True if the input string ends with the test string; otherwise, false

Effect

If the Unicode codepoint collation is used (this is the default), then the system tests to see whether the last N characters of the input string match the characters in the test string (where N is the length of the test string). If so, the result is true; otherwise, it is false. Characters match if they have the same Unicode value.

If the test string is zero-length, the result is always true. If the input string is zero-length, the result is true only if the test string is also zero-length. If the test string is longer than the input, the result is always false.

If either the input or the test argument is an empty sequence, it is treated in the same way as a zero-length string.
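
The codepoint rules above can be restated as a user-written function. The sketch below is an illustration only and says nothing about how a processor actually implements ends-with(); the function name my:ends-with and its namespace URI are hypothetical, and the stylesheet is assumed to be run with the named template main as its entry point.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <!-- Hypothetical function: compares the last N characters of $input with $test -->
  <xsl:function name="my:ends-with" as="xs:boolean">
    <xsl:param name="input" as="xs:string?"/>
    <xsl:param name="test" as="xs:string?"/>
    <!-- Position where the last N characters start; zero or negative
         when $test is longer than $input -->
    <xsl:variable name="start" as="xs:integer"
        select="string-length($input) - string-length($test) + 1"/>
    <!-- substring() and string() both turn an empty sequence into "" -->
    <xsl:sequence
        select="codepoint-equal(substring($input, $start), string($test))"/>
  </xsl:function>

  <xsl:template name="main">
    <xsl:value-of separator=" " select="(
        my:ends-with('a.xml', '.xml'),
        my:ends-with('a.xml', '.xsl'),
        my:ends-with('ab', 'abc'),
        my:ends-with('a.xml', ''),
        my:ends-with((), ()))"/>
    <!-- Output: true false false true true -->
  </xsl:template>

</xsl:stylesheet>
```

When the test string is longer than the input, the start position is zero or negative, so substring() returns the whole input, which cannot equal the longer test string.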

If no collation is specified, then the default collation from the static context is used. If the collation argument is supplied, that collation is used instead to test whether the strings match. See the description of the contains() function on page 730 for an account of how substring matching works with a collation.
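
As a small illustration of the three-argument form, the sketch below names the Unicode codepoint collation explicitly, since its URI is the only one defined by the specification; other collation URIs depend on the processor. The sample strings are hypothetical, and the stylesheet is assumed to be run with the named template main as its entry point.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- URI of the Unicode codepoint collation -->
  <xsl:variable name="codepoint"
      select="'http://www.w3.org/2005/xpath-functions/collation/codepoint'"/>

  <xsl:template name="main">
    <xsl:value-of separator=" " select="(
        ends-with('a.xml', '.xml', $codepoint),
        ends-with('a.XML', '.xml', $codepoint),
        ends-with(lower-case('a.XML'), '.xml', $codepoint))"/>
    <!-- Output: true false true -->
  </xsl:template>

</xsl:stylesheet>
```

Under the codepoint collation, upper-case and lower-case letters are different characters, so the second test fails. Normalizing the input with lower-case() first, as in the third test, is one simple workaround when no case-insensitive collation is available.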

Examples

These examples assume that the default collation is the Unicode codepoint collation, which compares strings codepoint by codepoint.

Expression                    Result
ends-with("a.xml", ".xml")    true
ends-with("a.xml", ".xsl")    false
ends-with("a.xml", "")        true
ends-with("", "")             true
ends-with((), ())             true

Usage

The ends-with() function is useful when the content of text values, or attributes, has some internal structure. For example, the following code can be used to strip an unwanted / at the end of an href attribute:
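
One way to write such a rule is sketched below. It is not necessarily the original listing: it assumes that every other node is copied unchanged by an identity template, and that only a single trailing / is to be removed.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity rule: copy everything that no other rule handles -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace any href attribute that ends in "/" with a copy
       that lacks the final character -->
  <xsl:template match="@href[ends-with(., '/')]">
    <xsl:attribute name="href"
        select="substring(., 1, string-length(.) - 1)"/>
  </xsl:template>

</xsl:stylesheet>
```

With this sketch, href="docs/" becomes href="docs". Note that an href consisting of a single / would become an empty string, so a real stylesheet may want an extra condition to exclude that case.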
