Referencing glossary entries

Shaun McCance <shaunm at gnome.org>
Sat Jul 9 12:37:49 EDT 2011

On Sat, 2011-07-09 at 13:35 +0100, Phil Bull wrote:
> Hi Shaun,
> 
> Looks good!
> 
> On Fri, 2011-07-08 at 10:51 -0400, Shaun McCance wrote:
> [...]
> > I've been working on a glossaries extension. See my recent blog
> > post for more information:
> > 
> > http://blogs.gnome.org/shaunm/2011/07/07/mallard-glossaries/
> 
> > These are dynamic, the Mallard way. Any page can declare terms,
> > and glossary pages collect terms from throughout the document.
> > There's basic support for filtering and segmenting, so you can
> > put only some entries in certain glossary pages or sections.
> 
> Part of me really likes that you can define glossary terms on any page.
> This lets you keep definitions together with relevant material, so when
> one is updated it's easy to update the other to reflect the changes.
> 
> That said, I wonder if having definitions scattered throughout a
> document will actually make them more difficult to maintain, since you
> have to hunt down which file a particular glossary item is defined in.

Right, so you can just define all the terms on the glossary page
itself, to keep them in one place. There's some flexibility here,
and I guess it's up to authors to decide what's easier for them
to maintain.

Now, each entry on a glossary page has both definitions and links
to pages that declared the term. So if you only declare/define
terms on the glossary page, you get no links. (You could always
link inline in the definition, of course.) What you could do is
put the full definitions on the glossary page, and then declare
the terms on other pages to get the links.

On the glossary page:

<info>
  <gloss:term id="mallard">
    <title>Mallard</title>
    <p>A dynamic, topic-oriented help markup language.</p>
  </gloss:term>
</info>

Then on any other page you want the glossary entry to link to:

<info>
  <gloss:term id="mallard"/>
</info>

Think of glossary terms as miniature guides. Declaring a term
on a page is then like declaring a guide link.

> With pages, we've been giving the .page files the same name as the page
> ID. I think this 1:1 correspondence between file and page entity has
> worked to make documents easier to maintain. (Need to link to a given
> page? Just find its name in the file browser.) Because glossary entries
> are actually independent entities "piggybacking" on a page, and aren't
> necessarily associated with the page itself, the naming structure becomes
> more convoluted. There's also a possibility of breaking links when
> mallard pages are swapped in and out.

Breaking links when you swap pages out is already a potential
problem with just normal links. (Was it you or Jim that was
suggesting "soft" inline links?) I don't have a magic bullet
for that, but test tools like yelp-check help. I'd certainly
write a glossary term check into yelp-check.

> Of course, there are ways of handling this - a short script can output a
> list of glossary items and which files they are located in, and you can
> define the glossary items on a special glossary page to avoid breaking
> links. But it's not as elegant, in my opinion.

No script necessary, really. Terms always link back to the
pages that declare them (unless that page is the glossary
page itself). So you can immediately see where terms are
defined by just looking at the rendered glossary.
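
That said, if somebody does want a listing outside the rendered
glossary, the script Phil describes would be short. Below is a rough
Python sketch, not anything that ships with the tools. The names
term_ids and index_terms are made up for illustration, and the
namespace URI for the gloss extension is an assumption. Inline
references that use idref are deliberately skipped, since they don't
declare anything.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the experimental glossary extension.
GLOSS_NS = "http://projectmallard.org/experimental/gloss/"
TERM_TAG = "{%s}term" % GLOSS_NS


def term_ids(page_xml):
    """Return the ids of every gloss:term in a page that has an id.

    Elements with only an idref (inline references) are ignored.
    """
    root = ET.fromstring(page_xml)
    return [t.get("id") for t in root.iter(TERM_TAG) if t.get("id")]


def index_terms(pages):
    """Map each term id to the sorted names of the pages declaring it.

    `pages` maps a page name to that page's XML source as a string.
    """
    index = {}
    for name, xml in pages.items():
        for term_id in term_ids(xml):
            index.setdefault(term_id, set()).add(name)
    return {term_id: sorted(names) for term_id, names in index.items()}
```

To run it over a real document, you'd build the pages mapping by
reading each .page file in the directory and keying it on the file
name.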

> [...]
> > Then there's the possibility of using explicit IDs.
> > 
> > <gloss:term id="top_bar"><title>Top bar</title>...</gloss:term>
> > 
> > <p>... on the <gloss:term idref="top_bar">top bar</gloss:term></p>
> > 
> > This is more to type, but sometimes explicit is good. (We don't
> > implicitly make IDs from section titles, for example.) A slight
> > downside is that IDs in translated documents will still be in the
> > source language. They aren't really exposed to users, except that
> > you may see them as a fragment identifier in a URL. But that's no
> > different than page and section IDs right now.
> > 
> > So, is being explicit worth the required extra typing?
> 
> In this case, definitely! Using explicit IDs seems more flexible (in
> terms of being able to use different wording to link to the same
> glossary item), and it's much more robust. Requiring identical strings
> to identify a glossary item is a recipe for inflicting confusing
> "undefined id" errors on writers making subtle changes like adding an
> apostrophe or pluralising a term. Spotting that <term>Mallard</term> and
> <term>mallard</term> are different is difficult when the tags are
> embedded in a paragraph. Relying on the <title> has similar issues,
> though they're not quite as bad.
> 
> In general, I think the identifier of an element should look like an
> identifier ("some-id-like-this") rather than text ("Some ID like this")
> because then it's obvious what's going on.

OK, it sounds like people are favoring explicit IDs, and I'm
leaning in that direction myself. I'll try implementing it.
Collecting and merging should be easy enough, but I'll have
to figure out what to do when titles are different for terms
with the same ID. Show them both, surely, but I'm not sure
what to do about sorting.
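
To make the merging question concrete, here is one possible shape
for it as a small Python sketch. It is only an illustration, not the
implementation. The function name merge_terms and the tuple layout
are hypothetical, and the sorting rule (case-insensitive, by the
first title seen for an ID) is just one assumed answer to the open
question above.

```python
def merge_terms(declarations):
    """Merge (term_id, title, page) declarations into glossary entries.

    A title of None stands for a bare declaration such as
    <gloss:term id="mallard"/>, which contributes only a link.
    """
    entries = {}
    for term_id, title, page in declarations:
        entry = entries.setdefault(
            term_id, {"id": term_id, "titles": [], "pages": []}
        )
        # Keep every distinct title, in the order it was first seen.
        if title is not None and title not in entry["titles"]:
            entry["titles"].append(title)
        if page not in entry["pages"]:
            entry["pages"].append(page)

    # Assumed rule: sort case-insensitively on the first title seen,
    # falling back to the id when no page supplied a title.
    def sort_key(entry):
        label = entry["titles"][0] if entry["titles"] else entry["id"]
        return label.casefold()

    return sorted(entries.values(), key=sort_key)
```

With this rule, an entry titled both "Top bar" and "Top panel" shows
both titles but sorts under "Top bar" if that was declared first.
Sorting it under each title separately would be the obvious
alternative.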

--
Shaun