MEP-0014

Informational keywords Element

This page proposes adding a keywords element to the info element to help document search.

Authors: Shaun McCance
Created: 2018-11-10
Status: final (2019-01-31)
Target: 1.1
Issue: https://github.com/projectmallard/projectmallard.org/issues/10
History:
2018-11-10 1.1 proposed
2018-12-30 1.1 implemented
2019-01-31 1.1 final

Background

Search is an important method of finding pages in Mallard documents. Search could be provided in a dedicated help app, by a built-in site search on a documentation site, or by a search engine. Some systems will search the entire text of a page, while others will only use metadata like the title and desc. In the case of Yelp, both are true. The quick search results in the drop-down only use metadata, while the full search results pages perform full-text search.

Users frequently search for terms other than those used in the documentation. To improve search results, writers sometimes stuff synonyms into page descs. Forcing synonyms into user-visible text is far from ideal.

Proposal

This page proposes adding a keywords element to the info element. The keywords element would take text as content, preferably as a comma-separated list so that writers can use multi-word terms.
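As a rough illustration of the comma-separated convention, the following Python sketch reads the keywords element from a page and splits it into terms. The function name extract_keywords is hypothetical and not part of any Mallard tool, and the multi-word keyword in the sample page is invented for the illustration. The sketch uses only the Python standard library.

```python
import xml.etree.ElementTree as ET

# The Mallard namespace, as declared on the page element.
MALLARD_NS = {"mal": "http://projectmallard.org/1.0/"}


def extract_keywords(page_xml):
    """Return the keywords of a Mallard page as a list of strings.

    Hypothetical helper: splits the text of each info/keywords element
    on commas, so multi-word terms stay intact.
    """
    root = ET.fromstring(page_xml)
    terms = []
    for element in root.findall("mal:info/mal:keywords", MALLARD_NS):
        text = "".join(element.itertext())
        for term in text.split(","):
            # Collapse runs of whitespace and drop empty entries.
            term = " ".join(term.split())
            if term:
                terms.append(term)
    return terms


PAGE = """\
<page xmlns="http://projectmallard.org/1.0/" id="wifi">
  <info>
    <keywords>wireless, internet, wireless network, wpa2</keywords>
  </info>
  <title>Connect to WiFi</title>
</page>
"""

print(extract_keywords(PAGE))
```

Splitting on commas rather than whitespace is what keeps a term such as "wireless network" together as a single keyword.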

Custom-built search systems would be expected to use the keywords in addition to other data, and preferably to use the keywords even when only searching on metadata like title and desc.
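A minimal sketch of what a metadata-only search might look like once keywords are included. The PageMetadata class, the metadata_search function, and the sample desc text are all assumptions made for illustration; real search systems will rank and tokenize results rather than do plain substring matching.

```python
from dataclasses import dataclass, field


@dataclass
class PageMetadata:
    """Metadata a search tool might index for one page (hypothetical model)."""
    page_id: str
    title: str
    desc: str = ""
    keywords: list = field(default_factory=list)


def metadata_search(pages, query):
    """Return the IDs of pages whose title, desc, or keywords contain the query."""
    needle = query.casefold().strip()
    if not needle:
        return []
    matches = []
    for page in pages:
        # Keywords are searched alongside the user-visible metadata.
        haystack = [page.title, page.desc, *page.keywords]
        if any(needle in text.casefold() for text in haystack):
            matches.append(page.page_id)
    return matches


PAGES = [
    PageMetadata("wifi", "Connect to WiFi", "Get on a network without cables.",
                 ["wireless", "internet", "wep", "wpa", "wpa2"]),
    PageMetadata("printing", "Print a document", "Send a file to a printer."),
]

# Neither the title nor the desc of the wifi page contains "wireless";
# only the keywords make the page findable for that query.
print(metadata_search(PAGES, "wireless"))
```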

When converting to HTML, the keywords can be placed in the HTML meta element. Be aware, however, that most internet search engines do not give such keywords much weight.
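One possible way to produce that meta element is sketched below in Python. The function keywords_meta_tag is a hypothetical helper, not taken from any Mallard stylesheet. It assumes the keywords text has already been read from the page.

```python
from html import escape


def keywords_meta_tag(keywords_text):
    """Build an HTML meta element from the text of a keywords element.

    Hypothetical helper for illustration only.
    """
    # Normalize to a clean comma-separated list before escaping.
    terms = [" ".join(term.split()) for term in keywords_text.split(",")]
    content = ", ".join(term for term in terms if term)
    # Escape the value so quotes and ampersands cannot break the attribute.
    return '<meta name="keywords" content="{}">'.format(escape(content, quote=True))


print(keywords_meta_tag("wireless, internet, wep, wpa, wpa2"))
```

Because both formats use a comma-separated list, the text can be carried over almost unchanged; only whitespace cleanup and attribute escaping are needed.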

Examples

Add keywords for common search terms to a page on connecting to WiFi:

<page xmlns="http://projectmallard.org/1.0/" id="wifi">
  <info>
    <keywords>wireless, internet, wep, wpa, wpa2</keywords>
  </info>
  <title>Connect to WiFi</title>
</page>

Internationalization

Keywords should be translated. Translation tools should ensure that the keywords element is translatable. When translating keywords, translators should provide all the keywords that are suitable in their own language, rather than translating each keyword from the source language.

Alternatives

We considered an element with child elements for each keyword, similar to DocBook and DITA. There was no clear benefit to that method, other than being able to have keyword phrases with commas. Child elements would be more work for writers and translators.

We considered allowing multiple keywords elements with type attributes, following the approach taken for desc elements in MEP-0008: Multiple desc Elements, but there was no concrete use case.

Compatibility and Fallback

This proposal makes no backwards-incompatible changes. Any page written in a version prior to the implementation of this proposal will work exactly the same in a processing tool that implements this proposal.

The fallback behavior for a new informational element is that it is ignored. If a page with keywords is processed by a tool that does not support the new element, the additional keywords will not be used for search or other features.

Comparison to Other Formats

DocBook provides the keywordset element for keywords. The keywordset element contains any number of keyword elements, rather than a comma-separated list.

DITA provides the keywords element for keywords. The keywords element contains any number of keyword elements, rather than a comma-separated list.

HTML uses the meta element with the name attribute set to keywords. The keywords are listed in the content attribute. As in this proposal, the keywords are a comma-separated list.

© 2018 Shaun McCance
cc-by-sa 3.0 (us)

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.

As a special exception, the copyright holders give you permission to copy, modify, and distribute the example code contained in this document under the terms of your choosing, without restriction.
