MEP-0010

The type Attribute on the code Element

This page proposes deprecating the mime attribute on the block and inline code elements in favor of a simpler type attribute.

Authors: Shaun McCance
Created: 2017-08-31
Status: final (2019-01-31)
Target: 1.1
Issue: https://github.com/projectmallard/projectmallard.org/issues/38
History:
2017-08-31 1.1 proposed
2019-01-05 1.1 implemented
2019-01-31 1.1 final

Background

Mallard 1.0 provides the mime attribute on a few elements, including on block and inline code elements. On code elements, the mime attribute can specify the specific programming language or other syntax in use. This can help syntax highlighting, as well as automated code extraction and testing.

The mime attribute takes a MIME type. This seemed like a reasonable way to specify a file format, but in practice, there are very few standard, registered MIME types for programming languages. As a result, people need to memorize arbitrary and inconsistent extension types.

Proposal

This page proposes adding a type attribute to the code element that would take a simple string identifying the type of code, such as c, python, or xml. These are the same types of identifiers used in many other document formats, as well as by many syntax highlighters.

This proposal is only for the block code element, the block screen element, the inline code element, and the inline cmd element. It does not replace the mime attribute on the block or inline media element. That may also be worth doing, but should be done in a separate proposal.

The short identifiers are also somewhat arbitrary, and the recognized strings may differ from those of other formats, or even between Mallard implementations. However, this is already the case for non-standard MIME types. Short identifier strings are at least easier to type and easier to remember. Just as with MIME types, implementations may recognize multiple identifiers for a single format.

The type attribute should take a space-separated list of values, with implementations choosing the first or best value they recognize. This would allow authors to provide very specific values for certain uses, while still providing something that would be recognized by most syntax highlighters.
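The selection rule above can be sketched as follows. This is a minimal illustration in Python, not part of the proposal: the function name pick_type and the RECOGNIZED set are hypothetical, it implements only the "first recognized value" strategy (not "best"), and it assumes exact, case-sensitive matching.

```python
# Hypothetical set of identifiers that one particular highlighter understands.
RECOGNIZED = {"c", "python", "xml"}


def pick_type(type_attr, recognized=RECOGNIZED):
    """Return the first recognized identifier in a space-separated list, or None."""
    for token in type_attr.split():
        if token in recognized:
            return token
    return None


print(pick_type("mallard xml"))  # xml
print(pick_type("python"))       # python
print(pick_type("cobol"))        # None
```

With this approach, an author can list a specific identifier first and a more general one after it, and a tool that knows only the general one still finds a match.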

For backwards compatibility, the mime attribute would have to remain in the Mallard schema. Implementations that make use of the mime attribute may continue to recognize it as a fallback when the type attribute is not used.
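One way a tool might implement this fallback is sketched below in Python. Everything named here is an assumption for illustration: the resolve_syntax function, the MIME_TO_TYPE table, and the particular MIME strings in it are hypothetical and are not defined by Mallard. The sketch reads the fallback rule literally, so mime is consulted only when the type attribute is absent.

```python
import xml.etree.ElementTree as ET

# Hypothetical table mapping MIME types to short identifiers.
# A real implementation would maintain its own list.
MIME_TO_TYPE = {
    "text/x-csrc": "c",
    "text/x-python": "python",
    "application/xml": "xml",
}
RECOGNIZED = set(MIME_TO_TYPE.values())


def resolve_syntax(code_el):
    """Prefer the type attribute; use mime only when type is absent."""
    type_attr = code_el.get("type")
    if type_attr is not None:
        # Take the first recognized value from the space-separated list.
        for token in type_attr.split():
            if token in RECOGNIZED:
                return token
        return None
    return MIME_TO_TYPE.get(code_el.get("mime", ""))


NS = "http://projectmallard.org/1.0/"
old_style = ET.fromstring(f'<code xmlns="{NS}" mime="text/x-csrc">int x;</code>')
new_style = ET.fromstring(
    f'<code xmlns="{NS}" type="mallard xml" mime="application/xml"/>'
)
print(resolve_syntax(old_style))  # c
print(resolve_syntax(new_style))  # xml
```

The second element carries both attributes, which is how a writer could serve older tools that read only mime alongside newer tools that read type.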

Examples

Use the type attribute on a code block example from the Code and Commands tutorial.

<code type="c">
if (bean_is_magic(bean)) {
  bean_grow(bean);
}
</code>

Put the first example page from the Ten Minute Tour into a code block with a CDATA section, and use a list of types to specify that the content is specifically Mallard and more generally XML.

<code type="mallard xml"><![CDATA[
<page xmlns="http://projectmallard.org/1.0/"
      type="guide"
      id="index">
<title>Beanstalk Help</title>
</page>
]]></code>

Alternatives

We considered using the style attribute, since syntax highlighting or other use of the content type is not required. However, the style attribute generally carries no set semantics, and the large set of content types could interfere with site-specific uses of the style attribute. Using a prefix could help avoid conflicts, such as style="lang-xml", but this seemed cumbersome.

There was also a suggestion to use a lang attribute, which is used in some lightweight formats for content type, but this could be confused with the xml:lang attribute used for human languages, or even the non-namespaced lang attribute used in some XML vocabularies for human languages. Also, there is an advantage to using attributes that have a shorthand syntax in Ducktype, such as type or style.

Compatibility and Fallback

This proposal deprecates an existing attribute, which introduces a potential compatibility issue. Pages written with the mime attribute might not be optimally processed by newer tools that implement this proposal. However, there has never been any requirement for tools to do anything with the mime attribute, and tools that do support the mime attribute may continue to do so.

The fallback behavior on older tools for pages using just the type attribute is that they may miss out on special features like syntax highlighting, which is not mandatory for tools to support. If older tools are important, writers may continue to use the mime attribute as well as the type attribute.

Comparison to Other Formats

DocBook uses a language attribute on the programlisting element to specify the programming language or other syntax. Like this proposal, DocBook uses short identifier strings. The supported strings may vary between implementations. There is no recommendation to support a space-separated list in the language attribute, although it is valid.

DITA doesn’t provide a specific attribute to specify the language in codeblock elements. The general practice is to use the outputclass attribute, which is similar to the Mallard style attribute. Language identifiers are short strings, but prefixed by convention with the string language-. There is no recommendation to support a space-separated list of languages in the outputclass attribute, although it is valid.
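To make the three conventions concrete, the following Python sketch reads the language identifier from each kind of element. It is illustrative only: the code_language function is hypothetical, elements are matched by local name without checking namespaces, and for Mallard it simply returns the first value in the list without testing whether a tool recognizes it.

```python
import xml.etree.ElementTree as ET


def code_language(el):
    """Return the language identifier using each format's convention."""
    name = el.tag.rsplit("}", 1)[-1]  # drop any namespace prefix
    if name == "programlisting":      # DocBook: language attribute
        return el.get("language")
    if name == "codeblock":           # DITA: outputclass with language- prefix
        for cls in el.get("outputclass", "").split():
            if cls.startswith("language-"):
                return cls[len("language-"):]
        return None
    if name == "code":                # Mallard: type attribute proposed here
        tokens = el.get("type", "").split()
        return tokens[0] if tokens else None
    return None


print(code_language(ET.fromstring('<programlisting language="python"/>')))
print(code_language(ET.fromstring('<codeblock outputclass="language-xml"/>')))
print(code_language(ET.fromstring('<code type="c"/>')))
```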

© 2017-2018 Shaun McCance
cc-by-sa 3.0 (us)

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.

As a special exception, the copyright holders give you permission to copy, modify, and distribute the example code contained in this document under the terms of your choosing, without restriction.
