Referencing glossary entries

Shaun McCance <shaunm at gnome.org>
Fri Jul 8 12:31:13 EDT 2011

On Fri, 2011-07-08 at 17:06 +0200, Aurélien Naldi wrote:
> Hi Shaun,
> 
> On Fri, Jul 8, 2011 at 4:51 PM, Shaun McCance <shaunm at gnome.org> wrote:
> > Hi all,
> >
> > I've been working on a glossaries extension. See my recent blog
> > post for more information:
> >
> > http://blogs.gnome.org/shaunm/2011/07/07/mallard-glossaries/
> >
> > These are dynamic, the Mallard way. Any page can declare terms,
> > and glossary pages collect terms from throughout the document.
> > There's basic support for filtering and segmenting, so you can
> > put only some entries in certain glossary pages or sections.
> 
> I saw your post and the new code in git for this, but could not get it
> to work in my quick tests. Is there a command-line switch (I don't
> think so, as the new xsl files are included in other ones) or some
> trick to using it?
> A working example might help with getting started, but according to this
> message it may just be that you are still working out the details and
> it is not yet fully working.

Showing terms on a glossary page works right now. Referencing
terms inline does not yet work. My blog post is pretty short
on details. Here's a more complete example.

On any page in your document, add a gloss:term element to the
info element, like so:

<page xmlns="http://projectmallard.org/1.0/"
      xmlns:gloss="http://projectmallard.org/experimental/gloss/"
      id="some_page_id">
  <info>
    <gloss:term>
      <title>The term</title>
      <p>The definition</p>
    </gloss:term>
  </info>
  ...
</page>

Then create a new page, say glossary.page, that looks like this:

<page xmlns="http://projectmallard.org/1.0/"
      xmlns:gloss="http://projectmallard.org/experimental/gloss/"
      type="gloss:glossary"
      id="glossary">
  <title>Glossary</title>
</page>

Build the HTML with yelp-build, or view it in Yelp. (I haven't
actually tested this in Yelp yet, but it should just work if
you have yelp-xsl installed from git master.)

> [...]
> > Then there's the possibility of using explicit IDs.
> >
> > <gloss:term id="top_bar"><title>Top bar</title>...</gloss:term>
> >
> > <p>... on the <gloss:term idref="top_bar">top bar</gloss:term></p>
> >
> > This is more to type, but sometimes explicit is good. (We don't
> > implicitly make IDs from section titles, for example.) A slight
> > downside is that IDs in translated documents will still be in the
> > source language. They aren't really exposed to users, except that
> > you may see them as a fragment identifier in a URL. But that's no
> > different than page and section IDs right now.
> >
> > So, is being explicit worth the required extra typing?
> 
> I think using explicit IDs is better. Does it mean that they only have
> to be unique among glossary entries or also among other IDs from their
> page?

IDs would not need to be distinct from page and section IDs. They'd
be their own ID namespace. (In fact, pages and sections are each
their own ID namespace. Page IDs only have to be distinct from all
other page IDs in the document, and section IDs only have to be
distinct from all other section IDs in the same page.)
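
To make those scoping rules concrete, here is a minimal Python
sketch. It is only an illustration, not how yelp-xsl checks anything
(that code is XSLT). The function name and the sample IDs are made
up for the example.

```python
from collections import Counter

def duplicate_ids(page_ids, section_ids_by_page):
    """Return (duplicate page IDs, {page ID: duplicate section IDs})."""
    # Page IDs must be distinct across the whole document.
    dup_pages = sorted(i for i, n in Counter(page_ids).items() if n > 1)
    # Section IDs only have to be distinct within their own page.
    dup_sections = {}
    for page, ids in section_ids_by_page.items():
        dups = sorted(i for i, n in Counter(ids).items() if n > 1)
        if dups:
            dup_sections[page] = dups
    return dup_pages, dup_sections

pages = ["index", "glossary", "intro"]
sections = {
    "index": ["intro", "usage"],   # "intro" matches a page ID: allowed
    "intro": ["usage", "usage"],   # repeated within one page: a clash
}
result = duplicate_ids(pages, sections)
```

Here the section ID "usage" may appear on both pages, and a section
may share an ID with a page. Only the repeat inside the "intro" page
is reported. Glossary term IDs would be a third, separate namespace.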

The IDs would actually not even have to be distinct from each other.
In my blog post, I mention how multiple links/definitions can be
collected together for the same term. We'd do the same thing, just
on the id attribute instead of the title.

By the way, explicit IDs would help with the aggregation thing as
well. A related problem with the current approach that I didn't
mention is that if two pages each declare a term in their info
element, but they do so with just slightly different spellings,
the links/definitions won't be merged.
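
Here is a small Python sketch of the difference between merging on
the title and merging on the id attribute. It assumes the proposed
id attribute on gloss:term, which is not settled syntax, and the
function and sample pages are invented for the illustration. The
real aggregation happens in the XSLT, not in anything like this.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

MAL = "{http://projectmallard.org/1.0/}"
GLOSS = "{http://projectmallard.org/experimental/gloss/}"

# Two hypothetical pages declaring the same term, spelled differently.
PAGE_TEMPLATE = """<page xmlns="http://projectmallard.org/1.0/"
      xmlns:gloss="http://projectmallard.org/experimental/gloss/"
      id="{page_id}">
  <info>
    <gloss:term id="top_bar">
      <title>{title}</title>
      <p>The definition</p>
    </gloss:term>
  </info>
</page>"""

PAGE_A = PAGE_TEMPLATE.format(page_id="page_a", title="Top bar")
PAGE_B = PAGE_TEMPLATE.format(page_id="page_b", title="Top Bar")

def collect_terms(pages, key):
    """Group declaring page IDs by term title or by term id."""
    groups = defaultdict(list)
    for source in pages:
        page = ET.fromstring(source)
        # Only look at declarations in info, not inline references.
        for term in page.findall(MAL + "info/" + GLOSS + "term"):
            title = term.findtext(MAL + "title")
            term_key = term.get("id") if key == "id" else title
            groups[term_key].append(page.get("id"))
    return dict(groups)

by_title = collect_terms([PAGE_A, PAGE_B], key="title")
by_id = collect_terms([PAGE_A, PAGE_B], key="id")
```

Keyed on the title, "Top bar" and "Top Bar" end up as two separate
entries. Keyed on the id, both pages land under top_bar.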

> Maybe a middle ground is possible: allow fully explicit links as you
> just proposed, but also allow linking using only the ID:
> 
> <p>... on the <gloss:term idref="top_bar" /></p>
> 
> Then the title of the entry can be used as link text for the lazy
> among us who know that this title is and will stay properly formatted.
> It only adds extra typing in the entry definition, which is fine. (The
> title could be made optional and be replaced with the ID, but that
> doesn't fit with the clean state of Mallard core.)

That's entirely possible. A potential hiccup is that the term
can be defined in multiple places, and each of them could provide
a different title. I'd also like to allow term definitions to
provide multiple titles, like an item in a terms element can.
Think of a glossary of recommended terms, with an entry that
discusses the difference between "log in" and "login".

So the behavior could be undefined in some cases. (Concretely,
the spec would almost certainly say something like "Processing
tools should select one of the available term titles. Which
one is undefined.")
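
A rough Python sketch of what a tool might do with an empty term
reference. Picking the first title encountered is an arbitrary
choice made for this example, which is exactly the part a spec
would leave undefined. The function name and the data layout are
hypothetical.

```python
def link_text(idref, explicit_text, definitions):
    """Pick the text for a term reference.

    definitions is a list of (term id, list of titles) pairs, one
    pair per place the term is defined.
    """
    # An explicit reference always uses the text the author wrote.
    if explicit_text:
        return explicit_text
    # Otherwise take the first title of the first matching definition.
    for term_id, titles in definitions:
        if term_id == idref and titles:
            return titles[0]
    # No definition found: leave the decision to the caller.
    return None

definitions = [
    ("login", ["log in", "login"]),   # one definition, two titles
    ("top_bar", ["Top bar"]),         # same term defined on two pages
    ("top_bar", ["Top Bar"]),
]

implicit = link_text("top_bar", None, definitions)
explicit = link_text("top_bar", "top bar", definitions)
multi = link_text("login", None, definitions)
missing = link_text("no_such_term", None, definitions)
```

With this ordering the empty reference to top_bar gets "Top bar",
but a tool that read the pages in a different order could just as
well produce "Top Bar".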

There are also translation difficulties, because translators may
want to be explicit about the text in places where you used the
implicit form. Think of case declensions, capitalization rules,
etc. But since the term reference is an inline, translators can
change that with tools like xml2po or itstool. They just might
not realize they should.