XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (238 page)

If you say nothing at all about the collation you want, then the default is implementation-defined. For XPath operators and functions, the default is always the Unicode codepoint collation, but this is not necessarily the default for

.
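To see what codepoint collation means in practice, here is a minimal sketch in Python rather than XSLT. It relies on the fact that Python compares strings code point by code point, which gives the same ordering as the Unicode codepoint collation. The word list is made up for illustration.

```python
# Python's built-in string comparison works code point by code point,
# so sorted() with no key gives codepoint order.
words = ["apple", "Zebra", "école", "zoo"]

codepoint_order = sorted(words)

# 'Z' (U+005A) < 'a' (U+0061) < 'z' (U+007A) < 'é' (U+00E9)
print(codepoint_order)  # ['Zebra', 'apple', 'zoo', 'école']
```

Upper-case letters sort before all lower-case letters, and accented letters sort after z, which is rarely what a human reader expects.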

Usage

In this section we'll consider two specific aspects of sorting that tend to be troublesome. The first is the choice of a collation, and the second is how to achieve dynamic sorting—that is, sorting on a key chosen by the user at runtime, perhaps by clicking on a column heading.

Using Collations

XSLT is designed to be capable of handling serious professional publishing applications, and clearly this requires some fairly powerful sorting capabilities. In practice, however, the most demanding applications almost invariably have domain-specific collating rules; for instance, the rules for sorting personal names in a telephone directory are unlikely to work well for geographical names in a gazetteer. This is why the working groups decided to make the specification so open-ended in its support for collations.

Collations based on the Unicode collation algorithm (UCA) generally assign each character in the sort key value a set of weights. The primary weight distinguishes characters that are fundamentally different: A is different from B. The secondary weight distinguishes secondary differences; for example, the distinction between A and Ä. The tertiary weight is used to represent the difference between upper and lower case, for example A and a. The way that weights are used varies a little in non-Latin scripts, but the principles are similar.
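The idea of three weights per character can be sketched with a toy function. This is an illustration in Python, not real UCA data: the function name toy_weights and the weighting scheme (base letter, presence of an accent, upper case) are assumptions chosen only to mirror the A/B, A/Ä, and A/a examples above.

```python
import unicodedata

def toy_weights(ch):
    """Return a (primary, secondary, tertiary) triple for one character.

    Toy scheme, NOT the real UCA weight tables:
      primary   = the base letter, lower-cased
      secondary = 1 if the character carries a combining mark, else 0
      tertiary  = 1 if the base letter is upper case, else 0
    """
    decomposed = unicodedata.normalize("NFD", ch)   # e.g. 'Ä' -> 'A' + U+0308
    base = decomposed[0]
    has_mark = any(unicodedata.combining(c) for c in decomposed[1:])
    return (base.lower(), int(has_mark), int(base.isupper()))

for ch in ["A", "B", "Ä", "a"]:
    print(ch, toy_weights(ch))
# A ('a', 0, 1)
# B ('b', 0, 1)
# Ä ('a', 1, 1)
# a ('a', 0, 0)
```

In this sketch A and B differ in the primary weight, A and Ä only in the secondary weight, and A and a only in the tertiary weight.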

Rather than looking at each character separately, the Unicode collation algorithm compares two strings as a whole. First, it looks to see if there are two characters whose primary weights differ; if so, the first such character determines the ordering. If all the primary weights are the same, it looks at the secondary weights, and it only considers the tertiary weights if all the secondary weights are the same. This means for example that in French, attache sorts before attaché, which in turn sorts before attachement. The acute accent is taken into account when comparing attache with attaché, because there is no primary difference between the strings, but it is ignored when comparing attaché with attachement.
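The whole-string comparison can be sketched in Python by building a sort key that lists all the primary weights first, then all the secondary weights, then all the tertiary weights. The names char_weights and toy_sort_key and the toy weighting scheme are assumptions made for illustration; a real UCA implementation uses published weight tables and handles many more cases.

```python
import unicodedata

def char_weights(ch):
    """Toy (primary, secondary, tertiary) weights; not real UCA data."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = decomposed[0]
    has_mark = any(unicodedata.combining(c) for c in decomposed[1:])
    return (base.lower(), int(has_mark), int(base.isupper()))

def toy_sort_key(s):
    """Key that compares all primaries, then all secondaries, then tertiaries."""
    weights = [char_weights(ch) for ch in unicodedata.normalize("NFC", s)]
    primary = tuple(w[0] for w in weights)
    secondary = tuple(w[1] for w in weights)
    tertiary = tuple(w[2] for w in weights)
    # Tuples compare left to right, so a secondary difference only matters
    # when the primary tuples are identical.
    return (primary, secondary, tertiary)

words = ["attachement", "attaché", "attache"]

multilevel = sorted(words, key=toy_sort_key)
codepoint = sorted(words)  # plain code point comparison, for contrast

print(multilevel)  # ['attache', 'attaché', 'attachement']
print(codepoint)   # ['attache', 'attachement', 'attaché']
```

With the multilevel key the accent on attaché decides only the comparison with attache, where the primary weights are identical. Under plain codepoint ordering the accented é sorts after every unaccented letter, so attaché ends up after attachement.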
