Conditional Processing

Hi Shaun,

On Fri, 2010-04-02 at 12:10 -0400, Kyle Nitzsche wrote:
> Hi Shaun,
> 
> I am curious how the run-time conditional logic would discover the 
> current value(s) of keys (mal:key('somekey')='somevalue').
>  * Is there some file that needs to be present (and is correct for the 
> context) that holds the keys?
>  * What process will consult the key and use it? Yelp?

Key values are expected to be provided by the processing tool.
In the case of installing Mallard documents for Yelp to view,
that means Yelp has to understand them. As for how Yelp gets
that information, that depends on the keys. We can get some
stuff automatically from uname. For keys like distro, though,
Yelp will probably have to be built with compile-time options.
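
To make that concrete, here is a minimal Python sketch of how
a processing tool might assemble its key values. The key names
"os" and "arch" and both function names are made up for
illustration; this is not how Yelp does it.

```python
import platform


def automatic_keys():
    """Collect key values that can be read from the running system."""
    info = platform.uname()
    return {
        "os": info.system,     # hypothetical key name, e.g. "Linux"
        "arch": info.machine,  # hypothetical key name, e.g. "x86_64"
    }


def all_keys(compiled_in):
    """Merge values fixed at build time with values detected at run time."""
    keys = dict(compiled_in)       # e.g. {"distro": "..."} set by the packager
    keys.update(automatic_keys())  # detected values are added on top
    return keys
```

A distributor would pass something like {"distro": "ubuntu"}
as compiled_in, and everything else would come from uname.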

If you want to do static builds with build-time conditional
processing, then you'll want a way to define key values for
the processing tool. I'd prefer to keep that out of the main
specification. If people want a tool-neutral format for this,
we can always create an extension.
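
As a rough idea of what such a build-time tool could look
like, here is a Python sketch that takes key values and strips
elements whose condition fails. It is only an assumption about
how one might do it: it understands nothing but conditions of
the exact form mal:key('name') = 'value', whereas the real if
attribute can hold any XPath expression. The function names
and the sample page are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: only conditions shaped like mal:key('name') = 'value' are
# handled. This is a tiny subset, not an XPath evaluator.
CONDITION = re.compile(r"^\s*mal:key\('([^']*)'\)\s*=\s*'([^']*)'\s*$")


def keep(element, keys):
    """Return True if the element has no condition or its condition holds."""
    condition = element.get("if")
    if condition is None:
        return True
    match = CONDITION.match(condition)
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    name, expected = match.groups()
    # A key without a value compares as the empty string.
    return keys.get(name, "") == expected


def strip_conditional(parent, keys):
    """Remove failing children in place, then recurse into the survivors."""
    for child in list(parent):
        if keep(child, keys):
            child.attrib.pop("if", None)  # condition is resolved, drop it
            strip_conditional(child, keys)
        else:
            parent.remove(child)


def build(source, keys):
    """Parse a page, apply the key values, and return the stripped XML."""
    root = ET.fromstring(source)
    strip_conditional(root, keys)
    return ET.tostring(root, encoding="unicode")


PAGE = """<page id="foo">
<title>A page</title>
<p if="mal:key('distro') = 'ubuntu'">Ubuntu only</p>
<p if="mal:key('distro') = 'fedora'">Fedora only</p>
<p>Everyone</p>
</page>"""
```

Calling build(PAGE, {"distro": "ubuntu"}) keeps the first and
last paragraphs and drops the Fedora one.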

>  * Is there a limited set of keys or can keys be arbitrarily added as 
> desired?

The way I specified it, mal:key returns '' for any key it
doesn't have a value for. We'd have a standard set of keys
that are recommended, but people could define their own
keys, assuming their tools support it. We might want to
have a recommendation to avoid key name clashes.
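
In other words, an unknown key is not an error; a test against
it is simply false. A small Python sketch of that lookup
behavior, where the KeyTable class and its method names are
invented for the example:

```python
class KeyTable:
    """Hypothetical store of key values held by a processing tool."""

    def __init__(self, values=None):
        self._values = dict(values or {})

    def define(self, name, value):
        """Add or override a key, such as one a project defines itself."""
        self._values[name] = value

    def key(self, name):
        """Mirror mal:key: return '' for any key that has no value."""
        return self._values.get(name, "")


table = KeyTable({"distro": "ubuntu"})
table.define("myproject-edition", "pro")  # made-up custom key
```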

> I am not speaking for ubuntu-docs, but I would *guess* they do not 
> currently use conditional processing. However, it occurs to me that 
> since ubuntu-docs does have two output formats currently (localized xml 
> and html), and that currently the content is identical (another 
> *guess*), it may be useful to be able to add some content that is 
> targeted at the output format. That is, perhaps some extra text/etc to 
> display in the html, and some that's targeted at run-time docs.

My guess is they don't, because Yelp doesn't support it for
DocBook, unless they have a DocBook->DocBook tool that strips
content based on profiling attributes. That's doable, but I
don't think they're doing it. (I seem to recall somebody at
WritersUA talking about doing this with DITA.)

> Such things could be supported by tagging text for conditions as you 
> describe below.
> 
> Setting aside specifics, I would say that in *general* the ability to 
> easily support "profiling" is important, if only to handle future 
> potential needs.
> 
> And since I am writing, but on a different topic, I am wondering (again) 
> about the performance hit of run-time processing. It now takes more than 
> 60 seconds to display the Ubuntu Help Center (Karmic) the first time in 
> every boot cycle. This is so long that I guess many users conclude the 
> application is broken and close it before the help even displays. Does 
> mallard/yelp address this somehow (perhaps by only processing/rendering 
> the pages that are actually going to be displayed only when they are 
> going to be displayed instead of doing all of them)?

The largest performance problem in Yelp is that it scans your
file system to locate all installed help (including man and
info) before it will show you a single document. Yelp 3 does
not do this. Yelp 3 is seriously fast.

For DocBook, Yelp does do a single XSLT run and pushes chunks
as they become available. So for very large DocBook documents,
you'll get the first chunk pretty quickly (after the startup
penalty is over), but the last chunk might take a while. Try
the Gnumeric manual. I'm planning to make Yelp 3 do chunks
on demand, which will be much nicer.

For Mallard, however, Yelp only processes the page you asked
for. So transformation time is very low. It does have to look
at all the installed pages first to do dynamic linking. This
has yet to be a real performance blocker, but it will become
an issue with larger documents. I plan to let Yelp read cache
files, which we'd install alongside the documents. This will
help considerably.

To answer the general question, there will be some performance
penalty to conditional processing, but for the simplest cases
it won't be significant. Because the if attribute is any XPath
expression, you could intentionally write a document that will
be slow to process. Like this:

<p if="//*[//*[//*[//*[//*[//*[//*[//*]]]]]]]">slow...</p>

So don't do that. :)

> I will also note that run-time profiling has the disadvantage that 
> content that is not applicable to the current context is *physically* 
> present, even though it may never be used on a particular system. If 
> used for a lot of content, this wastes disk space, causes packages to be 
> larger than necessary, and prevents use of the mechanism for excluding 
> information that cannot be included.

Absolutely. I think run-time and build-time both have their
uses. In some cases, I think the system I outlined can still
be used for build-time. But if you want to mix build-time and
run-time, you might be better off using a different system
for build-time.

One option is to define extension attributes, and use a model
more like DITA's. Another option is to write your page files
as XSLT. Something like this:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
<xsl:param name="some_var" select="true()"/>
<xsl:template match="/">
<page xmlns="http://projectmallard.org/1.0/" id="foo">
<title>A page</title>
<p>You can just write a page as normal inside the xsl:template
element above. Use XSLT when you want to do profiling.</p>
<xsl:if test="$some_var">
<p>Only when some_var is true</p>
</xsl:if>
</page>
</xsl:template>
</xsl:stylesheet>

Process with `xsltproc --param some_var 1 foo.page foo.page`
or `xsltproc --param some_var 0 foo.page foo.page`. The page
file is passed twice because it serves as both the stylesheet
and the input document.

So there are options for build-time conditional processing.
Since it's not as critical an interchange issue, I don't
think it needs to be addressed in the core specification.
Though, as I said, it could be a good extension.

I don't expect run-time to be perfect for everything people
want to do, but I do think it's a useful option to have.

Thanks for the comments. They're very helpful.

--
Shaun