Read HTML The Definitive Guide Online
Authors: Chuck Musciano, Bill Kennedy
10.2.2.1 The application/x-www-form-urlencoded encoding
The standard encoding - application/x-www-form-urlencoded - converts any spaces in the form values to a plus sign (+), nonalphanumeric characters into a percent sign (%) followed by two hexadecimal digits that are the ASCII code of the character, and the line breaks in multiline form data into %0D%0A.
The standard encoding also includes a name for each field in the form. (A "field" is a discrete element in the form, whose value can be nearly anything from a single number to several lines of text - the user's address, for example.) If there is more than one value in the field, the values are separated by ampersands.
For example, here's what the browser sends to the server after the user fills out a form with two input fields labeled name and address; the former field has just one line of text, while the latter field has several lines of input:
name=O'Reilly+and+Associates&address=103+Morris+Street%0D%0A
Sebastopol,%0D%0ACA+95472
We've broken the value into two lines for clarity in this book, but in reality, the browser sends the data in an unbroken string. The name field is "O'Reilly and Associates" and the value of the address field, complete with embedded newline characters, is:
103 Morris Street
Sebastopol,
CA 95472
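The encoding rules above can be tried out with a short Python sketch. It assumes Python's standard urllib.parse module stands in for the browser and the server; it is not part of the original text. Note that Python is stricter than the sample shown earlier: it also escapes the apostrophe (%27) and the comma (%2C), but both forms decode to the same values.

```python
from urllib.parse import urlencode, parse_qs

# The two form fields from the example above; the address spans three lines.
fields = {
    "name": "O'Reilly and Associates",
    "address": "103 Morris Street\r\nSebastopol,\r\nCA 95472",
}

# Encode the fields the way a browser would for
# application/x-www-form-urlencoded.
encoded = urlencode(fields)
print(encoded)

# The unbroken string from the example above, as the server would receive it.
wire = ("name=O'Reilly+and+Associates&address=103+Morris+Street%0D%0A"
        "Sebastopol,%0D%0ACA+95472")

# Decode it back into field names and values.
decoded = parse_qs(wire)
print(decoded["name"][0])
print(decoded["address"][0])
```

Decoding either string yields the same two fields, with the line breaks in the address restored as carriage-return and line-feed pairs.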
10.2.2.2 The multipart/form-data encoding
The multipart/form-data encoding encapsulates the fields in the form as several parts of a single MIME-compatible compound document. Each field has its own section in the resulting file, set off by a standard delimiter. Within each section, one or more header lines define the name of the field, followed by one or more lines containing the value of the field. Since the value part of each section can contain binary data or otherwise unprintable characters, no character conversion or encoding occurs within the transmitted data.
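To make the layout concrete, here is a minimal Python sketch that assembles such a document by hand for text-only fields. The function name encode_multipart and the boundary string XYZ are hypothetical; a real browser picks its own, much longer delimiter and may add further headers to each section.

```python
def encode_multipart(fields, boundary):
    """Return a multipart/form-data body (bytes) for a dict of text fields."""
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(delimiter)                      # start of one section
        lines.append(b'Content-Disposition: form-data; name="'
                     + name.encode("utf-8") + b'"')  # header naming the field
        lines.append(b"")                            # blank line ends the headers
        lines.append(value.encode("utf-8"))          # the value, sent unconverted
    lines.append(delimiter + b"--")                  # closing delimiter
    lines.append(b"")
    return b"\r\n".join(lines)


body = encode_multipart(
    {
        "name": "O'Reilly and Associates",
        "address": "103 Morris Street\r\nSebastopol,\r\nCA 95472",
    },
    boundary="XYZ",  # hypothetical delimiter, chosen only for readability
)
print(body.decode("utf-8"))
```

Unlike the URL-encoded form, the spaces and line breaks in the values pass through untouched; only the delimiter lines and headers are added around them.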
This encoding format is by nature more verbose and longer than the application/x-www-form-urlencoded format. As such, it can be used only when the method attribute of the <form> tag is set to post.
10.2.4.1 POST or GET?
Which one should you use if your form-processing server supports both the POST and GET methods? Here are some rules of thumb:
● For best form-transmission performance, send small forms with a few short fields via the GET method.
● Because some server operating systems limit the number and length of command-line arguments that can be passed to an application at once, use the POST method to send forms that have many fields or that have long text fields.
● If you are inexperienced in writing server-side form-processing applications, choose GET. The extra steps involved in reading and decoding POST-style transmitted parameters, while not too difficult, may be more than you are willing to tackle.
● If security is an issue, choose POST. GET places the form parameters directly in the application URL where they easily can be captured by network sniffers or extracted from a server log file. If the parameters contain sensitive information like credit card numbers, you may be compromising your users without their knowledge. While POST applications are not without their security holes, they can at least take advantage of encryption when transmitting the parameters as a separate transaction with the server.
● If you want to invoke the server-side application outside the realm of a form, including passing it parameters, use GET because it lets you include form-like parameters as part of a URL. POST-style applications, on the other hand, expect an extra transmission from the browser after the URL, something you can't do as part of a conventional <a> tag.
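The difference in where the parameters travel can be seen in a small Python sketch that builds both kinds of request without sending them. The URL and the fields x and y are illustrative values only, and urllib.request is used here as a stand-in for the browser.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Illustrative application URL and form fields; nothing is sent over the network.
ACTION = "http://www.kumquat.com/cgi-bin/update"
params = urlencode({"x": 27, "y": 33})

# GET: the encoded parameters become part of the URL itself.
get_request = Request(ACTION + "?" + params)

# POST: the URL stays bare and the parameters travel separately as the body.
post_request = Request(
    ACTION,
    data=params.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

The GET request carries everything in a single URL, which is why it can be written into a hyperlink and why it also shows up in server logs; the POST request needs the extra body that a plain hyperlink cannot supply.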
10.2.4.2 Passing parameters explicitly
The foregoing bit of advice warrants some explanation. Suppose you had a simple form with two elements named x and y. When the values of these elements are encoded, they look like this:
x=27&y=33
If the form uses method=GET, the URL used to reference the server-side application looks something like this:
http://www.kumquat.com/cgi-bin/update?x=27&y=33
There is nothing to keep you from creating a conventional <a> tag that invokes the application with any parameter value you desire, like so:
<a href="http://www.kumquat.com/cgi-bin/update?x=27&y=33">
The only hitch is that the ampersand that separates the parameters is also the character-entity insertion character. When placed within the href attribute of the <a> tag, the ampersand will cause the browser to replace the characters following it with a corresponding character entity.
To keep this from happening, you must replace the literal ampersand with its entity equivalent, either &#38; or &amp;. With this substitution, our example of the nonform reference to the server-side application looks like this:
<a href="http://www.kumquat.com/cgi-bin/update?x=27&amp;y=33">
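If the link is generated by a program rather than typed by hand, the substitution can be automated. The following Python sketch assumes the standard html.escape function is an acceptable way to produce the entity; the link text "update" is made up for the example.

```python
from html import escape

# The application URL with its literal ampersand between the parameters.
url = "http://www.kumquat.com/cgi-bin/update?x=27&y=33"

# escape() rewrites & as &amp; (and would also escape quotes, < and >),
# making the URL safe to place inside an href attribute.
anchor = '<a href="' + escape(url, quote=True) + '">update</a>'
print(anchor)
```

The browser converts the entity back to a plain ampersand before it requests the URL, so the server still receives x=27&y=33.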
Because of the potential confusion that arises from having to escape the ampersands in the URL, server implementors are encouraged to also accept the semicolon as a parameter separator. You might want to check your server's documentation to see whether it honors this convention. See Appendix E.
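On the server side, honoring this convention takes little extra work. Here is a minimal Python sketch of a parameter parser that accepts either separator; the function name parse_params is hypothetical, and a real server or framework may behave differently.

```python
import re
from urllib.parse import unquote_plus


def parse_params(query):
    """Split a query string on & or ; and decode each name=value pair."""
    pairs = []
    for chunk in re.split(r"[&;]", query):
        if not chunk:
            continue  # ignore empty pieces such as a trailing separator
        name, _, value = chunk.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


print(parse_params("x=27&y=33"))
print(parse_params("x=27;y=33"))
```

Both calls return the same list of pairs, so a link written with semicolons needs no entity escaping in the href attribute.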
10.2.5 The target Attribute
With the advent of frames, it is possible to redirect the results of a form to another window or frame.
Simply add the target attribute to your <form> tag.
The title attribute defines a quote-enclosed string value to label the form. However, it entitles only the form segment; its value cannot be used in an applet reference or hyperlink. [The id attribute, 4.1.1.4] [The title attribute, 4.1.1.5]
10.2.7 The class, style, lang, and dir Attributes
The style attribute creates an inline style for the elements enclosed by the form, overriding any other style rule in effect. The class attribute lets you format the content according to a predefined class of the <form> tag.
The actual effects of style with
For instance, you may create a special font face and background color style for the form. The form's text labels, but not the text inside a text input form element, will appear in the specified font face and background color. Similarly, the text labels you put beside a set of radio buttons will be in the form-specified style, but not radio buttons themselves.
The lang attribute lets you specify the language used within the form, with its value being any of the ISO standard two-character language abbreviations, including an optional language modifier. For example, adding lang=en-GB tells the browser that the form is in English ("en") as spoken and written in the United Kingdom ("GB"). Presumably, the browser may make layout or typographic decisions based upon your language choice.
Similarly, the dir attribute tells the browser which direction to display the form contents, from left to right (dir=ltr), as with English or French, or from right to left (dir=rtl), as with Hebrew or Arabic.