XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition
Authors: Michael Kay
In some other regular expression languages (for example, Perl), the x flag also allows comments to be included in a regular expression, starting with # and ending with a newline. This feature wasn't included in the XPath dialect because newline is an inappropriate choice of delimiter, given that XML parsers are expected to replace newlines with spaces when processing an attribute value. If you want to include a comment in a regular expression within an XSLT stylesheet, use XML comments:
([0-9]+)
\s+
([A-Za-z]+)
\s+
([0-9]{4})
…
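As a minimal sketch of the XML-comment approach described above (an illustration, not the author's original listing), the regular expression can be held in the content of a variable, where each part carries an ordinary XML comment. The variable name date-regex, the template name main, the sample input string, and the output element date are all assumptions made for this example. Comments are removed when the stylesheet is read, and the x flag then causes the remaining whitespace in the expression to be ignored.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical variable holding the annotated regular expression.
       The comments inside it never reach the regex processor. -->
  <xsl:variable name="date-regex">
    ([0-9]+)      <!-- group 1: day of the month -->
    \s+
    ([A-Za-z]+)   <!-- group 2: month name -->
    \s+
    ([0-9]{4})    <!-- group 3: four-digit year -->
  </xsl:variable>

  <!-- Hypothetical entry point, run as the initial template -->
  <xsl:template name="main">
    <!-- flags="x" makes the layout whitespace in $date-regex insignificant -->
    <xsl:analyze-string select="'23 March 2002'"
                        regex="{$date-regex}" flags="x">
      <xsl:matching-substring>
        <date day="{regex-group(1)}"
              month="{regex-group(2)}"
              year="{regex-group(3)}"/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With main as the initial template, this sketch should yield a date element with day 23, month March, and year 2002. Holding the expression in element content has a side benefit: the braces in {4} can be written once, whereas inside the regex attribute itself, which is an attribute value template, they would have to be doubled.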
Perl treats an escaped space (a space preceded by a backslash) as a significant space, despite the presence of this flag. The XPath rules don't follow this precedent—spaces are ignored even if preceded by a backslash.
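Because a backslash does not protect a space when the x flag is set, one way to keep whitespace significant is to use the \s escape instead of a literal space. The following is a small sketch of that idea; the template name main and the wrapper element names are assumptions chosen for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical entry point, run as the initial template -->
  <xsl:template name="main">
    <results>
      <!-- The space in the regex is ignored, so the pattern is
           effectively 'helloworld': expected to be true -->
      <space-ignored>
        <xsl:value-of select="matches('helloworld', 'hello world', 'x')"/>
      </space-ignored>

      <!-- For the same reason the input 'hello world' does not match:
           expected to be false -->
      <literal-space>
        <xsl:value-of select="matches('hello world', 'hello world', 'x')"/>
      </literal-space>

      <!-- \s is an escape, not a whitespace character, so it survives
           the stripping: expected to be true -->
      <escape-s>
        <xsl:value-of select="matches('hello world', 'hello \s world', 'x')"/>
      </escape-s>
    </results>
  </xsl:template>

</xsl:stylesheet>
```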
Disallowed Constructs
Finally, some constructs that may be familiar from other regular expression dialects are not included in the XPath 2.0 definition. A conformant XPath 2.0 processor is expected to reject any attempt to use constructs that aren't allowed by the grammar presented in this chapter. A few of these constructs are shown in the following table.
Disallowed Construct | Meaning in other languages |
[a-z&&[^oi]] | Intersection: any character in the range a to z, except for o and i |
[a-z[A-Z]] | Union: same as [a-zA-Z] |
\0nn, \xnn, \unnnn | Character identified by Unicode codepoint in octal or hexadecimal |
\a, \e, \f, \cN | Various control characters not allowed in XML 1.0 |
\p{Alpha}, \P{Alpha} | Character classes defined in POSIX |
\b, \B | Word boundary |
\A, \Z, \z | Beginning and end of input string |
\g, \G | End of the previous match |
X*+ | Non-backtracking or possessive quantifiers (in Java, these force the matching engine down this path even if this results in the match as a whole failing) |
(?...) | Expressions that set various special options; non-capturing subexpressions; comments |
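For a few of the rows above, a similar effect can usually be had with constructs that the XPath 2.0 regular expression grammar does accept. The sketch below shows some possible substitutions. The pairings are suggestions rather than exact equivalences, and the template name main and the result element names are assumptions for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical entry point, run as the initial template -->
  <xsl:template name="main">
    <results>
      <!-- In place of the intersection [a-z&&[^oi]]: character class
           subtraction. Expected: true for 'k', false for 'o' -->
      <subtraction-k>
        <xsl:value-of select="matches('k', '^[a-z-[oi]]$')"/>
      </subtraction-k>
      <subtraction-o>
        <xsl:value-of select="matches('o', '^[a-z-[oi]]$')"/>
      </subtraction-o>

      <!-- In place of \x09: an XML character reference, which the XML
           parser resolves before the regex is seen. Expected: true -->
      <char-reference>
        <xsl:value-of select="matches('a&#x9;b', '^a&#x9;b$')"/>
      </char-reference>

      <!-- In place of \p{Alpha}: the Unicode category \p{L}, which covers
           letters in every script, not only ASCII. Expected: true -->
      <unicode-letters>
        <xsl:value-of select="matches('Kay', '^\p{L}+$')"/>
      </unicode-letters>

      <!-- In place of \A and \z: without the m flag, ^ and $ anchor to
           the start and end of the whole string. Expected: false -->
      <anchors>
        <xsl:value-of select="matches('2002 AD', '^[0-9]+$')"/>
      </anchors>
    </results>
  </xsl:template>

</xsl:stylesheet>
```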