dl elements and non-dt, non-dd content

T.J. Crowder
To my non-spec-reading eyes, the current spec disallows any content other than dt and dd elements within dl elements (although there's some slight ambiguity; more below, that's not my main point/question).

In some documents, one needs to label each term definition with a reference number or similar (I've seen this in specifications on large software projects, for instance). The number isn't a term, so it doesn't make sense to make it a dt, but it's not a definition either, so it's not a dd.

Is there a valid way to do that per current spec? It seems kludgy to put each definition in its own dl purely in order to label it, although of course that does allow the label to apply to the dt+dd pair as a pair.

If there isn't, should there be? Or is the idea that separate dl elements are the appropriate choice in that situation?

On the slight ambiguity I mentioned above, the header for dl says:

Content model:
Zero or more groups each consisting of one or more dt elements followed by one or more dd elements.

Quite clear, but then the text below says:

If a dl element contains non-whitespace text nodes, or elements other than dt and dd, then those elements or text nodes do not form part of any groups in that dl.

...which softens that a bit. The validator seems to agree with the former, disallowing (say) a p or span. Is the latter simply giving an indication of how invalid content should be treated?

Best,
--
T.J. Crowder

Re: dl elements and non-dt, non-dd content

Jukka K. Korpela-2
T.J. Crowder wrote:

> To my non-spec-reading eyes, the current spec disallows any content
> other than dt and dd elements within dl elements (although there's
> some slight ambiguity; more below, that's not my main point/question).

I don't see any ambiguity here:

"Content model:
    Zero or more groups each consisting of one or more
    dt elements followed by one or more dd elements."

"Content model" specifies everything that is allowed (and/or required)
inside an element.

The requirement is stricter than in HTML 4, which allows any mixture of dt
and dd elements, even one beginning with dd (which does not make much sense
semantically). It is looser in that the content may be empty, apparently so
that <dl></dl> can serve as a placeholder to be filled in by a script.
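
To make the difference concrete, here is a minimal Python sketch that checks a
list of child tag names against both content models. The function names and the
one-letter encoding are hypothetical, chosen only for illustration; the HTML 4
pattern assumes the DTD declaration (DT|DD)+, and whitespace-only text nodes are
assumed to have been dropped already.

```python
import re

# Encode each child as one letter so a content model becomes a regex:
# 't' = dt, 'd' = dd, 'x' = anything else (p, span, stray text, ...).
def encode(children):
    return "".join({"dt": "t", "dd": "d"}.get(name, "x") for name in children)

# HTML5 draft: zero or more groups of (one or more dt, then one or more dd).
HTML5_DL = re.compile(r"(?:t+d+)*")

# HTML 4 (assumed from the DTD): one or more dt or dd, in any order.
HTML4_DL = re.compile(r"[td]+")

def valid_html5(children):
    return HTML5_DL.fullmatch(encode(children)) is not None

def valid_html4(children):
    return HTML4_DL.fullmatch(encode(children)) is not None
```

With these definitions, an empty list passes only the HTML5 check, while
["dd", "dt"] passes only the HTML 4 check; a p or span child fails both.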

> In some documents, one needs to label each term definition with a
> reference number or similar

That's understandable, but numbering of items is generally not supported in
HTML as marked-up _elements_. Even in an "ordered" (read: numbered) list,
i.e. an ol element, the numbering is implied, though you can set the numbers
explicitly - but in attributes, e.g. <li value="42">, not as elements.
Moreover, you cannot use the numbers directly in links - you need to assign
id attributes to the <li> elements, or use scripting, or some special way of
referring to a specific <li> element.

I guess the need for labeling dt elements, though real, is not common enough
to justify added complexity, especially since it would be rather illogical
to add it for dl but not ul, ol, and menu.

> The number isn't a term, so
> it doesn't make sense to make it a dt, but it's not a definition
> either, so it's not a dd.

Logically and semantically, you are quite right. In practice, I suppose you
could just make it part of the dt element contents _or_ use something like

<dt id="42">

together with, say,

dt[id]:before { content: attr(id) " "; }

in CSS (though this won't work in old versions of IE, which don't support
generated content). You may need to use different id values (as id values
must be unique in a document), but basically this would seem to solve most
of the problem, with no added HTML features.

> The header for dl says:
>
> Content model:

That's normative.

> Quite clear, but then the text below says:
>
> If a dl element contains non-whitespace text nodes, or elements other
> than dt and dd, then those elements or text nodes do not form part of
> any groups in that dl.
>
> ...which softens that a bit.

Well, not really. It does not change the rules (for authors and documents).
It just adds some rules (for browsers) on dealing with documents that
violate the rules.
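
As a rough illustration of such a processing rule, the following Python sketch
groups the children of a dl while skipping anything that is not a dt or dd. The
function name and the (tag, text) input format are hypothetical, and the rule
for where one group ends and the next begins (a dt that follows a dd) is my
assumption about a reasonable reading, not a quotation from the draft.

```python
def name_value_groups(children):
    """Group the dt/dd children of a dl; every other child is skipped.

    `children` is a list of (tag, text) pairs in document order.
    """
    groups = []
    current = None
    for tag, text in children:
        if tag not in ("dt", "dd"):
            continue  # invalid child: it forms part of no group
        # Assumed rule: open a new group at the first dt/dd seen,
        # and again whenever a dt follows at least one dd.
        if current is None or (tag == "dt" and current["values"]):
            current = {"names": [], "values": []}
            groups.append(current)
        key = "names" if tag == "dt" else "values"
        current[key].append(text)
    return groups
```

The stray p in a dl such as dt, p, dd, dt, dd is simply passed over, so the
result is still two groups; the document remains invalid all the same.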

Today I noticed a somewhat similar issue with the title element: only text
content is allowed (no markup), but the rule for this is followed by the
definition of the IDL attribute text in terms of picking up just the text
content of nodes - as if non-text nodes were allowed.
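
A small Python sketch of that kind of definition, for illustration only: it
concatenates the Text node children of an element and ignores everything else.
The helper name title_text is hypothetical, and the XML parser from the standard
library is only a stand-in for a browser DOM.

```python
from xml.dom import minidom, Node

def title_text(title_element):
    """Concatenate the data of the element's Text node children, in order.

    Child elements and comments are ignored, as are their contents.
    """
    return "".join(
        child.data
        for child in title_element.childNodes
        if child.nodeType == Node.TEXT_NODE
    )

# A title that (invalidly) contains markup and a comment.
doc = minidom.parseString("<title>A<b>B</b>C<!-- note --></title>")
result = title_text(doc.documentElement)
```

Here result is "AC": the b element and the comment are skipped rather than
rejected, which is the same pattern as with stray content in dl.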

> The validator seems to agree with the
> former, disallowing (say) a p or span.

In general, the validator(s) for HTML5 do not correspond to current HTML5
drafts in every detail. This is rather understandable, as HTML5 is a moving
target. But in this case, as in most cases, the validator(s) reflect(s) the
rules.

> Is the latter simply giving an
> indication of how invalid content should be treated?

Rather, how invalid content of a certain type _must_ be treated - I gather it
is a requirement on user agents, not just a recommendation.

--
Yucca, http://www.cs.tut.fi/~jkorpela/