XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition
Authors: Michael Kay
Throughout this book we use the namespace prefix xs to refer to the namespace http://www.w3.org/2001/XMLSchema, which is the namespace in which these types are defined.
However, XPath is designed to be used in a wide variety of different environments, and host languages (that is, specifications that incorporate XPath as a sublanguage) are allowed to tailor this list, both by omitting types from the list and by adding to it. The host language we are primarily concerned with in this book is XSLT 2.0, and this defines two conformance levels: a basic XSLT processor supports all the above 19 types with the exception of xs:NOTATION, while a schema-aware XSLT processor recognizes the full list.
The type xs:integer is unusual. On the one hand it has a special status in the XPath language (it is one of the few types for which values can be written directly as literals). On the other hand, it is actually not a primitive type, but a type that is derived as a restriction of xs:decimal. This is because the set of all possible xs:integer values is a subset of the set of all possible xs:decimal values.
In fact there are four types for which XPath provides a syntax for defining literal constants:
Type | Example literals |
xs:string | "New York", 'Moscow', "" |
xs:integer | 3, 42, 0 |
xs:decimal | 93.7, 1.0, 0.0 |
xs:double | 17.5e6, 1.0e-3, 0e0 |
A number can always be preceded by a plus or minus sign when it appears in an XPath expression, but technically the sign is not part of the numeric literal; it is an arithmetic operator.
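The four literal forms can be checked with the instance of operator. The stylesheet below is a minimal sketch, assuming an XSLT 2.0 processor that can be started at a named template; the template name main and the output element names are illustrative choices, not part of any specification. Each test evaluates to true.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- "main" is a hypothetical name for the initial template -->
  <xsl:template name="main">
    <literals>
      <!-- A quoted literal is an xs:string -->
      <test><xsl:value-of select="'Moscow' instance of xs:string"/></test>
      <!-- Digits only: xs:integer -->
      <test><xsl:value-of select="42 instance of xs:integer"/></test>
      <!-- A decimal point but no exponent: xs:decimal -->
      <test><xsl:value-of select="93.7 instance of xs:decimal"/></test>
      <!-- An exponent makes the literal an xs:double -->
      <test><xsl:value-of select="17.5e6 instance of xs:double"/></test>
      <!-- The minus sign is a unary operator applied to the literal 3 -->
      <test><xsl:value-of select="(-3) instance of xs:integer"/></test>
    </literals>
  </xsl:template>

</xsl:stylesheet>
```

Inside an XML attribute it is usually convenient to delimit the attribute with double quotes and the XPath string literal with single quotes, as done here.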
Values of type xs:boolean can be represented using the function calls false() and true(), listed in the library of functions described in Chapter 13. Values of any other type can be written using constructor functions, where the name of the function is the same as the name of the type. For example, a constant date can be written as xs:date("2004-07-31").
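As an illustration of these two ways of writing constants, the following sketch uses true() and the xs:date constructor inside an XSLT 2.0 stylesheet. The template name main and the output element names are assumptions made for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- "main" is a hypothetical name for the initial template -->
  <xsl:template name="main">
    <constants>
      <!-- Booleans are written as function calls, not as literals -->
      <flag><xsl:value-of select="true()"/></flag>
      <!-- The constructor function has the same name as the type -->
      <date><xsl:value-of select="xs:date('2004-07-31')"/></date>
      <!-- The constructed value carries the type label xs:date -->
      <is-date><xsl:value-of
          select="xs:date('2004-07-31') instance of xs:date"/></is-date>
    </constants>
  </xsl:template>

</xsl:stylesheet>
```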
There is one other type we need to mention in this section: the type xs:untypedAtomic. This type is defined not by XML Schema, but in the XPath specifications (in working drafts it was also in a different namespace, with the conventional prefix xdt, which you may still find used in some products). This type is used to label values that have not been validated using any schema, and which therefore do not belong to any schema-defined type. It is also used to label values that have been validated against a schema, in cases where the schema imposes no constraints. The set of possible values for this type is exactly the same as the value space for the xs:string type. The values are not strictly strings, because they have a different label (xs:untypedAtomic is not derived by restricting xs:string). Nevertheless, an xs:untypedAtomic value can be used anywhere that an xs:string can be used. In fact, it can be used anywhere that a value of any atomic type can be used; for example, it can be used where an integer or a boolean or a date is expected. In effect, xs:untypedAtomic is a label applied to values whose type has not been established.
If an xs:untypedAtomic value is used where an integer is expected, then the system tries to convert it to an integer at the time of use. If the actual value is not valid for an integer, then a runtime failure will occur. In this respect xs:untypedAtomic is quite different from xs:string, because if you try to use a string where an integer is expected, you will get a type error regardless of whether it could be converted or not.
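The difference can be seen by calling a function that declares an xs:integer parameter. This is a sketch only: the function f:twice, its namespace URI, and the template name main are hypothetical names introduced for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.com/functions"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical function that insists on an xs:integer argument -->
  <xsl:function name="f:twice" as="xs:integer">
    <xsl:param name="n" as="xs:integer"/>
    <xsl:sequence select="$n * 2"/>
  </xsl:function>

  <!-- "main" is a hypothetical name for the initial template -->
  <xsl:template name="main">
    <!-- The untyped value is converted to xs:integer at the point of use,
         so this call succeeds and yields 42 -->
    <ok><xsl:value-of select="f:twice(xs:untypedAtomic('21'))"/></ok>

    <!-- Not executed here, because each call would fail:
           f:twice(xs:untypedAtomic('abc'))  conversion is attempted and
                                             fails at run time
           f:twice('21')                     type error, even though the
                                             string looks like an integer -->
  </xsl:template>

</xsl:stylesheet>
```

The failing calls are left in a comment because a processor is free to report the type error at compile time, which would prevent the rest of the stylesheet from running.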
Atomic Types
We've been talking about atomic values and we've introduced the 19 primitive atomic types. In this section we'll look at these types more closely, and we'll also see what other atomic types are available.
Notice that we're talking here about atomic types rather than simple types. In XML Schema, we use an
This defines a simple type whose value allows a list of names (the type xs:NCName defines a name that follows the XML rules: NCName means no-colon-name). An example of an attribute conforming to this type might be a="h1 h2 h3". This is a simple type, but it is not an atomic type. Atomic types do not allow lists.
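A list type of the kind described above might be declared as follows. This is a sketch rather than the author's own declaration: the type name listOfNames and the element name e are assumptions chosen for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical list type: a whitespace-separated list of NCNames -->
  <xs:simpleType name="listOfNames">
    <xs:list itemType="xs:NCName"/>
  </xs:simpleType>

  <!-- Hypothetical element whose attribute "a" uses the list type,
       so that a="h1 h2 h3" would be valid -->
  <xs:element name="e">
    <xs:complexType>
      <xs:attribute name="a" type="listOfNames"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```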
XML Schema also allows simple types to be defined as a choice; for example, a simple type might allow either a decimal number, or the string N/A. This is referred to as a union type. Like list types, union types are simple types, but they are not atomic types.
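The decimal-or-N/A case could be written as a union of xs:decimal and an anonymous type that restricts xs:string to a single value. The sketch below shows one way to do it; the type name decimalOrNA is a hypothetical name.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical union type: either a decimal number or the string N/A -->
  <xs:simpleType name="decimalOrNA">
    <xs:union memberTypes="xs:decimal">
      <!-- Anonymous member type permitting only the value N/A -->
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="N/A"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:union>
  </xs:simpleType>

</xs:schema>
```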
Atomic types come from a number of sources.
As well as the 19 primitive types, the XML Schema specification defines 25 derived types that can be used in any schema; together these are referred to as the built-in types. Like the primitive types, these types are all in the XML Schema namespace http://www.w3.org/2001/XMLSchema.
There is also a second namespace for schema-defined types, called http://www.w3.org/2001/XMLSchema-datatypes. Frankly, this namespace is best forgotten. It doesn't provide anything that you don't get by using the ordinary XML Schema namespace, and it creates some technical problems because the types in this namespace are not exact synonyms of the types in the ordinary namespace. My advice is, don't go anywhere near it.
XPath 2.0 adds four more atomic types: xs:dayTimeDuration, xs:yearMonthDuration, xs:anyAtomicType, and xs:untypedAtomic. We've already covered xs:untypedAtomic in the previous section. The two duration types are described later in this chapter.
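As a brief preview of the two duration types, the sketch below adds each kind of duration to a date. The template name main and the output element names are assumptions made for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- "main" is a hypothetical name for the initial template -->
  <xsl:template name="main">
    <durations>
      <!-- One day after 31 July 2004: 2004-08-01 -->
      <next-day><xsl:value-of
          select="xs:date('2004-07-31') + xs:dayTimeDuration('P1D')"/></next-day>
      <!-- One month after 31 July 2004: 2004-08-31 -->
      <next-month><xsl:value-of
          select="xs:date('2004-07-31') + xs:yearMonthDuration('P1M')"/></next-month>
    </durations>
  </xsl:template>

</xsl:stylesheet>
```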
The type xs:anyAtomicType is simply an abstract supertype for all the other atomic types. It is used mainly in function signatures, when you want to write a function that can handle atomic values of any type (the min() and max() functions are examples).
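A user-written function can use xs:anyAtomicType in the same way. The following sketch declares a parameter of that type and then inspects the actual type with instance of; the function f:kind, its namespace URI, and the template name main are hypothetical names.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.com/functions"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical function accepting an atomic value of any type -->
  <xsl:function name="f:kind" as="xs:string">
    <xsl:param name="value" as="xs:anyAtomicType"/>
    <!-- xs:integer is tested before xs:decimal because every
         xs:integer is also an instance of xs:decimal -->
    <xsl:sequence select="
        if ($value instance of xs:integer) then 'integer'
        else if ($value instance of xs:decimal) then 'decimal'
        else if ($value instance of xs:string) then 'string'
        else 'other'"/>
  </xsl:function>

  <!-- "main" is a hypothetical name for the initial template -->
  <xsl:template name="main">
    <kinds>
      <kind><xsl:value-of select="f:kind(42)"/></kind>        <!-- integer -->
      <kind><xsl:value-of select="f:kind(93.7)"/></kind>      <!-- decimal -->
      <kind><xsl:value-of select="f:kind('Moscow')"/></kind>  <!-- string -->
      <kind><xsl:value-of select="f:kind(true())"/></kind>    <!-- other -->
    </kinds>
  </xsl:template>

</xsl:stylesheet>
```

The order of the tests matters: reversing the first two branches would report every integer as a decimal, since xs:integer is derived from xs:decimal.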
In a basic XSLT processor (as distinct from one that is schema-aware), the only built-in derived types that are recognized are xs:integer, xs:dayTimeDuration, xs:yearMonthDuration, xs:anyAtomicType, and xs:untypedAtomic.
In a schema-aware processor, all the built-in types are available, and you can also define your own atomic types in a schema. As we saw in Chapter 4, a type defined in a schema becomes available for use in a stylesheet when the schema is imported using an
Implementors can also add their own atomic types. There are a number of reasons they might want to do this. The most likely reason is to make it easier for XPath expressions to make calls on external functions; for example, functions written in C# or Java. The XPath specification doesn't say how this is done, and leaves it to implementors to define. Another reason implementors might want to add extra types is to support XPath access to some specialized database, for example, an LDAP directory. XPath is defined in terms of a data model with an obvious relationship to XML, but there is no reason why other sources of data cannot be mapped to the data model equally well, and doing this effectively might involve defining some custom types. (I mentioned LDAP because it is a hierarchic database, which provides a particularly good fit to the XPath data model.) Generally, any extra types added by the implementor will have names that are in some implementation-controlled namespace.