Ducktype Directives

Shaun McCance <shaunm at gnome.org>
Thu Dec 4 11:47:42 EST 2014

There's a basically functional Ducktype parser available here:

https://gitorious.org/projectmallard/duck/

Or just install with pip:

$ pip-python3 install duck

(It's py3 only, but PyPI lets you install it with py2. I don't know how
to fix that. Help from more experienced Python packagers welcome.)

The last major feature I'm working on is directives. These are outside
the normal markup, similar to XML processing instructions. The syntax
uses double square brackets, like so:

[[some_directive directive content]]

Directives that start with "duck" are special. Outside that, you can
make up whatever you want, and they'll probably get turned into XML
processing instructions. Not sure if that's a MUST or not. I have three
special directives planned. Similar to the XML declaration on the first
line:

[[duck/1.0 encoding="utf-8"]]

This lets us specify an encoding and the parser version. For syntax
extensions, we can specify those too:

[[duck/1.0 my-special-syntax/1.0 encoding="utf-8"]]

Specify namespaces:

[[duck:ns if="http://projectmallard.org/if/1.0/"]]

Declare entities:

[[duck:entity app="$app(My Application)"]]

Here's where I'd like feedback. XML defines processing instructions to
just have text content, even though people often use them as if they
were attribute lists. I'd kind of like to reinforce that, while still
allowing bare words. This matters to the parser, e.g.:

[[foo bar="does this ]] close the directive or not?"]]

So my thinking is that directive content is a sequence of items, where
each item is either a bare word or a key/value pair. Start reading a
token. If you hit a space or ]], it's a bare word. If you hit an =
followed by a quote (" or '), it's a key/value pair: keep reading until
the matching quote.
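
To make the rule above concrete, here's a rough Python sketch of the
tokenizer. It is not the code in the duck repository: the function name
parse_directive and its return shape are made up for illustration, and
error handling is minimal.

```python
def parse_directive(text):
    """Parse one '[[...]]' directive at the start of text.

    Hypothetical helper, not part of the duck package. Returns
    (items, rest): items is a list of bare words (str) and key/value
    pairs (2-tuples); rest is whatever follows the closing ']]'.
    """
    if not text.startswith('[['):
        raise ValueError('directive must start with [[')
    items = []
    pos = 2
    while True:
        # Skip spaces between tokens.
        while text.startswith(' ', pos):
            pos += 1
        if text.startswith(']]', pos):
            return items, text[pos + 2:]
        start = pos
        while pos < len(text):
            if text[pos] == ' ' or text.startswith(']]', pos):
                # Hit a space or ]]: this token is a bare word.
                items.append(text[start:pos])
                break
            if text[pos] == '=' and text[pos + 1:pos + 2] in ('"', "'"):
                # Hit = followed by a quote: read to the matching quote.
                # Nothing inside the value is interpreted, not even ]].
                quote = text[pos + 1]
                end = text.find(quote, pos + 2)
                if end < 0:
                    raise ValueError('unterminated value')
                items.append((text[start:pos], text[pos + 2:end]))
                pos = end + 1
                break
            pos += 1
        else:
            # Ran off the end of the input without seeing ]].
            raise ValueError('unterminated directive')


items, rest = parse_directive(
    '[[foo bar="does this ]] close the directive or not?"]]')
print(items)
# [('foo' is a bare word), then the pair:]
# ['foo', ('bar', 'does this ]] close the directive or not?')]
```

With this rule, the ]] inside the quoted value does not close the
directive. Only a ]] seen while reading a bare word or between tokens
does.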

The contents of bare words and values are NOT PARSED, not even for
escape sequences or entities. So you can get a quote in a value like
this:

[[foo bar="this has a $quot; quote character"]]

But the literal value is "this has a $quot; quote character". For the
entity definitions, we define them to get parsed on use, but not when
parsing the directive. XML does the same thing.
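
A small Python sketch of what "parsed on use" could look like. The names
here (expand, BUILTIN_ENTITIES) are hypothetical, I'm assuming $quot; is
a predefined entity and that references look like $name;, and this only
handles entity references, not inline markup like $app(...).

```python
import re

# Assumption: $quot; is predefined. The real set of built-in entities
# is not settled here.
BUILTIN_ENTITIES = {'quot': '"'}

# Assumption: an entity reference is '$', a name, then ';'.
ENTITY_REF = re.compile(r'\$([A-Za-z_][\w.-]*);')


def expand(text, entities, depth=0):
    """Expand $name; references in text at the point of use.

    entities maps names to the raw, unparsed strings taken from
    directive values. Unknown names are left untouched.
    """
    if depth > 10:
        raise ValueError('entity definitions nest too deeply')

    def replace(match):
        name = match.group(1)
        if name in entities:
            # The stored definition is raw, so expand it now.
            return expand(entities[name], entities, depth + 1)
        if name in BUILTIN_ENTITIES:
            return BUILTIN_ENTITIES[name]
        return match.group(0)

    return ENTITY_REF.sub(replace, text)


# The directive parser stores this value exactly as written.
raw = 'this has a $quot; quote character'
print(raw)              # this has a $quot; quote character
print(expand(raw, {}))  # this has a " quote character
```

The point is that the directive parser never calls anything like expand.
Expansion only happens later, wherever the value or entity gets used.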

Question: is it worthwhile to allow bare strings that can contain spaces
and ]]? We could do it with quotes, like so:

[[foo "This is a string, not eight bare words."]]

One other restriction: right now they can only appear at the top, above
the page title. I don't want to allow them everywhere, to make parsing
easier. But it might be worthwhile to let them appear in some block
contexts. Perhaps only after a block declaration:

[comment]
[[mal2html.show_comment]]

Dunno. Feedback welcome.

--
Shaun