> The first would provide me with an easy mechanism to check whether it
> has been provided or not, the second seems more natural. Using the first
> option is more consistent, though, I can think of no reason why the
> description should be treated differently from the other attributes.
One approach is to make sure all your documents are clearly marked up to
tell you which version of your DTD you are using (even if you never actually
use a DTD, at least have a DOCTYPE declaration, processing instruction, or
top-level attribute to give you some kind of version numbering!). If you
also have the freedom to rewrite your software, this gives you the ability
to try attributes now, and shift over to elements later. Why should you
care? Let experience with your particular data and applications be your
guide.
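
As a rough sketch of what version marking buys you, here is one way the
reading code could dispatch on a top-level version attribute. The element
name "item", the attribute names "version" and "description", and the rule
that version 1 uses an attribute while version 2 uses a child element are
all assumptions made up for illustration, not anything from your DTD.

```python
import xml.etree.ElementTree as ET


def read_description(xml_text):
    """Return the description, wherever this document version keeps it."""
    root = ET.fromstring(xml_text)
    # Hypothetical convention: documents with no version marker count as "1".
    version = root.get("version", "1")
    if version == "1":
        # Version 1: the description is a plain attribute.
        return root.get("description")
    # Later versions: the description is a child element, which may
    # itself contain markup, so gather all of its text.
    elem = root.find("description")
    return None if elem is None else "".join(elem.itertext())


V1 = '<item version="1" description="A plain widget"/>'
V2 = ('<item version="2"><description>A <em>fancy</em> widget'
      '</description></item>')

print(read_description(V1))
print(read_description(V2))
```

The point is only that callers of read_description() never need to know
which form a given document uses, so old and new documents can coexist
while you find out which form suits your data.
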
People often underestimate how complicated their data is, and they
prematurely put things into attributes. But that is not bad, just natural.
The opposite mistake is to put everything into elements across the board. There is a kind
of metadata which is not "data about data" but "data about elements" (i.e.
data about markup) which frequently is exactly what an attribute should be
used for. Marking these up with elements would IMHO probably mess up the
structure of the data, or at least make a whole lot of ugly elements. ID
attributes and the xml:lang attribute are particular examples. If you are using DTDs,
then NOTATION attributes and ENTITY attributes fit in there too. If you are
not using any of these, then you may find little reason to use many
attributes.
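
To make "data about elements" a little more concrete, here is a small
sketch using xml:lang. The "book" and "title" elements are invented for
the example. The language code says nothing about the book; it describes
the title element's own content, and it stays out of the way of that
content.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

DOC = """<book id="b1">
  <title xml:lang="en">Markup</title>
  <title xml:lang="fr">Balisage</title>
</book>"""


def titles_by_language(xml_text):
    """Map each title's xml:lang value to the title text."""
    root = ET.fromstring(xml_text)
    return {t.get(XML_LANG): t.text for t in root.iter("title")}


print(titles_by_language(DOC))
```
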
Attributes are also available at the end of the start tag, which provides a
nice and natural breathing point for setting up how to process the
subsequent contents of the element. This is essential for practical stream
processing.
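
A minimal sketch of that breathing point, using Python's xml.sax. The
"value" element and its "type" attribute are assumptions for the example.
The handler sees the attributes in startElement(), before any content
arrives, so it can decide how to treat the content up front.

```python
import xml.sax


class TypedValueHandler(xml.sax.ContentHandler):
    """Collect <value> contents, converted according to a type attribute."""

    def __init__(self):
        super().__init__()
        self.values = []
        self._convert = None   # set while inside a <value> element
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "value":
            # The attribute is known here, before the content is read.
            self._convert = int if attrs.get("type") == "int" else str
            self._chunks = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self._convert is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "value":
            self.values.append(self._convert("".join(self._chunks)))
            self._convert = None


def parse_values(xml_text):
    handler = TypedValueHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.values


DOC = ('<values><value type="int">42</value>'
       '<value type="text">forty-two</value></values>')
print(parse_values(DOC))
```

Had the type been carried by a child element that came after the content,
the handler would have had to buffer the content until it found out what
to do with it.
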
Rick Jelliffe