Incomplete

1.0 (2015-07-16)

Attribute Lists

FIXME

Notes

Examples

Specification

Attribute lists occur in four places: in headers for pages and sections, on informational elements, on block elements, and on inline elements. For header, informational, and inline attribute lists, the attribute list starts with a left square bracket. For block attribute lists, the block declaration itself starts with a left square bracket, so the attribute list begins with the first whitespace character after the block name.

An attribute list is a whitespace-separated list of key-value pairs of the form name=value. The value may be quoted with single or double quotes, in which case the value ends at the next unescaped occurrence of that quote character. The quote character itself may be escaped in the value as $' or $", or written using other entity values, discussed below. Furthermore, $ itself may be escaped as $$.

If the value is not quoted, it ends with the first whitespace character or right square bracket, unless the right square bracket is escaped as $]. An unescaped right square bracket (outside a quoted value) ends the attribute list. Note that, unlike XML, no space is allowed around the = character.
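The key-value rules above can be sketched as a small scanner. This is only an illustration under stated assumptions, not a reference implementation: the function name parse_pairs and its return shape are hypothetical, it handles name=value pairs only (no bare words), and it returns values raw, skipping over $ escapes without decoding them.

```python
# Characters that may follow "$" to form a two-character escape.
ESCAPABLE = set('$*=-@[]()"\'')


def parse_pairs(text, pos=0):
    """Parse name=value pairs, starting just after the opening '['.

    Returns (pairs, end): a list of (name, raw_value) tuples and the
    index just past the closing ']'. Hypothetical helper for illustration.
    """
    pairs = []
    n = len(text)
    while pos < n:
        ch = text[pos]
        if ch.isspace():          # insignificant whitespace, newlines included
            pos += 1
            continue
        if ch == "]":             # unescaped ']' outside a value ends the list
            return pairs, pos + 1
        start = pos
        while pos < n and text[pos] != "=" and text[pos] != "]" and not text[pos].isspace():
            pos += 1
        # No space is allowed around '=', so the name must touch it.
        if pos == start or pos >= n or text[pos] != "=":
            raise ValueError("expected name=value at offset %d" % start)
        name = text[start:pos]
        pos += 1                  # skip '='
        if pos < n and text[pos] in "\"'":
            quote = text[pos]
            pos += 1
            start = pos
            while pos < n and text[pos] != quote:
                escaped = text[pos] == "$" and pos + 1 < n and text[pos + 1] in ESCAPABLE
                pos += 2 if escaped else 1
            if pos >= n:
                raise ValueError("unterminated quoted value")
            value = text[start:pos]
            pos += 1              # skip the closing quote
        else:
            start = pos
            while pos < n and text[pos] != "]" and not text[pos].isspace():
                escaped = text[pos] == "$" and pos + 1 < n and text[pos + 1] in ESCAPABLE
                pos += 2 if escaped else 1
            value = text[start:pos]
        pairs.append((name, value))
    raise ValueError("attribute list not closed")
```

For example, parse_pairs('id=intro path=a$]b]') yields the pairs ('id', 'intro') and ('path', 'a$]b'), while 'id = intro]' is rejected because of the spaces around the = character.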

The full attribute list may contain newlines, either as insignificant whitespace or as part of values. As long as the parser is inside the attribute list, only an unescaped right square bracket may end it. Normal indentation requirements do not hold in an attribute list, except for attribute lists on inline elements. Since block and inline parsing can be done in separate steps, if newlines occur in an inline attribute list, the continuation lines must be indented to at least the inner indent of the containing block.

In addition to regular key-value pairs, bare words are possible in attribute lists, possibly with leading sigils, for common cases. A bare word starts with either a sigil or a non-whitespace character, and finishes at the first whitespace character, unescaped right square bracket, or (exceptionally) = character. Note that an = character indicates a regular key-value pair, not a bare word.

  • If the word starts with >>, the value is taken as the value of the href attribute.

  • If the word starts with >, the value is taken as the value of the xref attribute.

  • If the word starts with ., the value is taken as the value of the style attribute.

  • If the word starts with #, the value is taken as the value of the id attribute.

  • Otherwise, the value is taken as the value of the type attribute.

For the type and style attributes only, multiple values are joined with a space character.
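The sigil rules above amount to a prefix lookup followed by a merge step. The following is a minimal sketch; the names classify_bare_word and collect are hypothetical, and letting a later value overwrite an earlier one for attributes other than type and style is an assumption, since the text does not say what happens in that case.

```python
# Longest sigil first, so ">>" is tested before ">".
SIGILS = [(">>", "href"), (">", "xref"), (".", "style"), ("#", "id")]
JOINED = {"type", "style"}  # multiple values are joined with a space


def classify_bare_word(word):
    """Map one bare word to an (attribute name, value) pair."""
    for sigil, name in SIGILS:
        if word.startswith(sigil):
            return name, word[len(sigil):]
    return "type", word


def collect(words):
    """Fold a sequence of bare words into an attribute dictionary."""
    attrs = {}
    for word in words:
        name, value = classify_bare_word(word)
        if name in JOINED and name in attrs:
            attrs[name] += " " + value
        else:
            # Assumption: for other attributes the last value wins.
            attrs[name] = value
    return attrs
```

With this sketch, the bare words note, .tip, .compact, and #intro produce type "note", style "tip compact", and id "intro".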

Attribute Values

Attribute values are parsed to allow character escapes and entities. If the character $ is followed by one of $, *, =, -, @, [, ], (, ), ", or ', that two-character sequence is replaced by the second character.
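As a rough illustration of the escape rule, the sketch below decodes two-character $ escapes in a raw value. The function name unescape is hypothetical, and entity references are deliberately left untouched here.

```python
# The characters listed above as valid after "$".
ESCAPABLE = set('$*=-@[]()"\'')


def unescape(value):
    """Replace each "$" + escapable character with that character."""
    out = []
    i = 0
    while i < len(value):
        if value[i] == "$" and i + 1 < len(value) and value[i + 1] in ESCAPABLE:
            out.append(value[i + 1])  # keep only the second character
            i += 2
        else:
            out.append(value[i])      # anything else is copied through
            i += 1
    return "".join(out)
```

Scanning left to right matters: in $$] the first two characters form an escaped $, so the result is $] rather than an escaped bracket.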

If the character $ is followed by a QName immediately followed by a semicolon (;), then that sequence is replaced according to the following rules:

  • If the name is a user-defined entity, the content of that entity is parsed as attribute value content, and the result is used. Note that each entity is parsed in its own subcontext, so you cannot span an entity reference across multiple entities. Parsers must detect cycles in entity references and produce an error.

  • Otherwise, if the name is defined in XML Entity Definitions for Characters (2nd Edition), the corresponding value is used.

  • Otherwise, if the name consists only of characters in the ranges 0-9, a-f, and A-F, the name is interpreted as a hexadecimal number, and the Unicode character with that code point is used.

  • Otherwise, the entity is unrecognized, and parsers must produce an error.
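The lookup order above, including recursive expansion and cycle detection, could be sketched as follows. Several parts are assumptions made for illustration: the names expand and EntityError are hypothetical, the regular expression only approximates a QName (and also admits names that start with a digit, so that hexadecimal names match), and Python's html.entities.html5 table stands in for the XML Entity Definitions for Characters, which it overlaps with but does not reproduce exactly.

```python
import re
from html.entities import html5  # stand-in table; keys look like "amp;"

ESCAPABLE = set('$*=-@[]()"\'')
# Approximation of "$" + QName + ";".
ENTITY = re.compile(r"\$([0-9A-Za-z_][\w.:-]*);")
HEX = re.compile(r"[0-9A-Fa-f]+\Z")


class EntityError(ValueError):
    """Raised for entity cycles and unrecognized entities."""


def expand(value, user_entities, active=()):
    """Decode escapes and entity references in an attribute value.

    user_entities maps names to unparsed entity content; active holds the
    names currently being expanded, which is how cycles are detected.
    """
    out = []
    i = 0
    while i < len(value):
        if value[i] == "$" and i + 1 < len(value) and value[i + 1] in ESCAPABLE:
            out.append(value[i + 1])
            i += 2
            continue
        match = ENTITY.match(value, i)
        if match is None:
            out.append(value[i])
            i += 1
            continue
        name = match.group(1)
        if name in user_entities:
            if name in active:
                raise EntityError("entity cycle through %r" % name)
            # Each entity is parsed in its own subcontext.
            out.append(expand(user_entities[name], user_entities, active + (name,)))
        elif name + ";" in html5:
            out.append(html5[name + ";"])
        elif HEX.match(name):
            out.append(chr(int(name, 16)))
        else:
            raise EntityError("unrecognized entity %r" % name)
        i = match.end()
    return "".join(out)
```

Because each user-defined entity is expanded by its own recursive call, a reference cannot begin in one entity and end in another, which mirrors the subcontext rule above.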

© 2015 Shaun McCance

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.

As a special exception, the copyright holders give you permission to copy, modify, and distribute the example code contained in this document under the terms of your choosing, without restriction.
