XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (679 page)


Authors: Michael Kay


escape-uri-attributes	This attribute determines whether non-ASCII characters appearing in URI-valued attributes should be escaped using the %HH convention. The default is yes. Although HTML requires URIs to be escaped in this way, there are several reasons why you might choose to suppress this. Firstly, the URIs might already be in escaped form: you can do the escaping from within the stylesheet, with much greater control, using the escape-html-uri() function described in Chapter 13. Secondly, browsers do not always handle escaped URIs correctly. This is especially true when the URI is handled on the client side; for example, when it invokes JavaScript functions, or when it contains a fragment identifier.
include-content-type	If this attribute is set to yes (or if it is omitted), the serializer will add a <meta> element as a child of the HTML <head> element, provided that the result tree contains a <head> element. This <meta> element contains details of the media type and encoding of the document. Any existing <meta> element containing this information will be replaced. You may want to suppress this by specifying the value no, for example if the stylesheet is copying a document that already includes such an element.
indent	If this attribute has the value yes, the idea is that the HTML output should be indented to show its hierarchic structure. The XSLT processor is not obliged to respect this request, and if it does so, the precise form of the output is not defined. When producing indented output, the processor has much more freedom to add or remove whitespace than in the XML case, because of the way whitespace is handled in HTML. The processor can add or remove whitespace anywhere it likes so long as it doesn't change the way a browser would display the HTML.
media-type	This attribute defines the media type of the output file (often referred to as its MIME type). The default value is text/html. The specification doesn't say what use is made of this information; it doesn't affect the contents of the output file, but it may affect the way it is named, stored, or transmitted, depending on the environment. For example, the information might find its way into an HTTP protocol header.
normalization-form	This attribute is used in the same way as for the XML output method, described on page 934.
use-character-maps	This attribute is used in the same way as for the XML output method, described on page 935.
version	This attribute determines the version of HTML used in the output document. It is up to the implementation to decide which versions of HTML should be supported, though all implementations can be expected to support the default version, namely version 4.0.
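
As a rough illustration of how the attributes above combine, an <xsl:output> declaration for the HTML method might look like the following sketch. The attribute values are chosen purely for illustration, and the source element named link with its target attribute is hypothetical. Setting escape-uri-attributes to no here assumes the stylesheet does its own escaping with escape-html-uri().

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Illustrative values only. URI escaping is switched off in the
       serializer because the template below escapes the URI itself. -->
  <xsl:output method="html"
              version="4.0"
              indent="yes"
              media-type="text/html"
              escape-uri-attributes="no"
              include-content-type="yes"/>

  <xsl:template match="/">
    <html>
      <!-- A head element must be present for include-content-type
           to have any effect. -->
      <head><title>Example</title></head>
      <body>
        <!-- 'link' and '@target' are hypothetical names in the source document -->
        <a href="{escape-html-uri(/link/@target)}">link</a>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```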

The XHTML Output Method

An XHTML document is an XML document, so when you specify method="xhtml", most of the rules for the XML output method are inherited without change. However, there are special guidelines for serializing XHTML so that it is rendered correctly in browsers that were designed originally to handle HTML. In addition, some of the features of HTML serialization, such as URI escaping and the addition of <meta> elements, are also applicable to XHTML. So the XHTML output method is essentially a blend of features from the XML and HTML methods.
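
To make the blend concrete, here is a minimal sketch of an <xsl:output> declaration for the XHTML method. The particular values, including the choice of the XHTML 1.0 Strict DOCTYPE, are assumptions made for illustration. The first group of attributes comes from the XML method; the last two are the ones shared with the HTML method.

```xml
<!-- Illustrative values only -->
<xsl:output method="xhtml"
            omit-xml-declaration="yes"
            doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
            doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
            indent="no"
            escape-uri-attributes="yes"
            include-content-type="yes"/>
```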

It's worth asking yourself whether you really need to use this method. If the browser understands XHTML, then serializing the result tree as XML will work fine. If the browser doesn't understand XHTML, and is going to handle it as if it were HTML, then why not serialize the tree as HTML to start with?

In fact, the XHTML output method works in the same way as the XML output method (and uses all the serialization parameters that control the XML method) with specific exceptions. These exceptions are:

The way that empty elements are output depends on the way the element is declared in the XHTML DTD. For an element whose content model is empty, such as <br> or <hr> or <img>, the serializer should use an XML empty-element tag, taking care to include a space before the final />, so that the tag looks like <br /> or <hr />. For an element that is empty but allowed to have content, such as a <p> element, the serializer should use a start tag followed by an end tag, thus: <p></p>.

The entity reference &apos; is not recognized by all browsers, so the serializer will probably use &#39; instead.

The serializer needs to take care with whitespace (for example newlines) appearing in attribute values. The specification doesn't say exactly how this should be handled, but it's probably safest, if there is any whitespace other than a single space character, to represent it using numeric character references.

The serializer must not output redundant namespace declarations, since these would violate the XHTML DTD. (At one time this rule was wider and encouraged the serializer to put XHTML elements in the default namespace. However, the serializer has no discretion in this area: namespace prefixes are chosen by the user, not by the serializer.) Because DTDs are not namespace-aware, choosing the wrong prefix can always leave the result document of a transformation invalid against the DTD.
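
The empty-element rule can be tried out with a small stylesheet such as the sketch below. The choice of br (declared EMPTY in the XHTML DTD) and p (allowed to have content) is an assumption made for illustration, and the stylesheet needs an XSLT 2.0 processor to run.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xhtml" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <!-- Literal result elements are in the XHTML namespace because it is
         declared as the default namespace on xsl:stylesheet. -->
    <html>
      <head><title>Empty elements</title></head>
      <body>
        <br/>  <!-- EMPTY content model in the DTD -->
        <p/>   <!-- empty here, but allowed to have content -->
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Both elements are empty in the result tree and are written identically in the stylesheet. A serializer following the rules above would be expected to write the first as <br /> and the second as <p></p>; with method="xml" the two may instead be serialized in the same way as each other.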
