The raw and the cooked (was: there's empty ...)

John Cowan (cowan@locke.ccil.org)
Fri, 30 Oct 1998 10:05:53 -0500


Henry S. Thompson writes:

> In order
> to make sense of a claim that the two entity references in my example
> are valid, but the two CDATA sections are not, we are left in the
> difficult position of saying that the validity constraint in question
> applies AFTER entity expansion but BEFORE CDATA section
> interpretation,

I don't know what "CDATA section interpretation" means. CDATA
sections are not "interpreted", but merely scanned to find the
end.
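
To make "scanned to find the end" concrete, here is a minimal sketch in
Python. The function name read_cdata_section is hypothetical and comes from
no parser; it assumes the whole document is already in memory as a string.
Nothing between the opening delimiter and the terminator is examined as
markup.

```python
CDATA_START = "<![CDATA["
CDATA_END = "]]>"


def read_cdata_section(text, pos):
    """Return (body, next_pos) for the CDATA section starting at pos.

    Hypothetical helper: the body is never inspected for markup; the
    only thing searched for is the terminator.
    """
    if not text.startswith(CDATA_START, pos):
        raise ValueError("no CDATA section starts at this position")
    body_start = pos + len(CDATA_START)
    end = text.find(CDATA_END, body_start)
    if end == -1:
        raise ValueError("unterminated CDATA section")
    return text[body_start:end], end + len(CDATA_END)
```

The "&amp;" and "<b>" in a body such as "<b>&amp;</b>" come back verbatim,
since no entity or tag recognition takes place inside the section.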

In fact, CDATA sections can be returned to the application as
specialized blocks (the DOM allows it, though SAX does not do so);
an XML parser *never* needs to look inside a CDATA section after
its terminator has been found.
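
As an illustration only, the difference can be seen with the Python
standard library, which is used here as a stand-in and is not one of the
parsers under discussion. The sketch assumes xml.dom.minidom for the DOM
side and a plain xml.sax ContentHandler for the SAX side; the helper names
are made up for the example.

```python
import xml.sax
from xml.dom import Node, minidom

DOC = b"<a><![CDATA[x < y]]></a>"


def dom_first_child_type(doc):
    """Parse with minidom and report the node type of the first child."""
    root = minidom.parseString(doc).documentElement
    return root.firstChild.nodeType


class TextCollector(xml.sax.ContentHandler):
    """Collect character data; a plain ContentHandler cannot tell
    whether the text came from a CDATA section."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def sax_text(doc):
    """Parse with SAX and return all character data joined together."""
    handler = TextCollector()
    xml.sax.parseString(doc, handler)
    return "".join(handler.chunks)
```

Here dom_first_child_type(DOC) reports Node.CDATA_SECTION_NODE, while
sax_text(DOC) yields only the characters, with no trace of the section
boundaries.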

> which is really weird, because wrt e.g. mixed content
> models, the constraint clearly applies AFTER CDATA section
> interpretation.

I don't understand which constraint you are referring to in the
case of mixed content models. Elements with element content can
only have S between the tags, and CDATA sections aren't S.
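
A rough sketch of that reading, again using Python's xml.dom.minidom purely
for illustration: S is taken to be one or more of space, tab, CR and LF,
and a CDATA section is rejected even when its body is all whitespace. The
function name only_s_between_tags is hypothetical, and the check is not a
validator; it does not consult any DTD.

```python
import re
from xml.dom import Node, minidom

# The S production: one or more of space, tab, carriage return, line feed.
S_RE = re.compile(r"[ \t\r\n]+")

# Child node kinds that may sit between the tags without being S.
ALLOWED_MARKUP = (
    Node.ELEMENT_NODE,
    Node.COMMENT_NODE,
    Node.PROCESSING_INSTRUCTION_NODE,
)


def only_s_between_tags(element):
    """Return True if every non-markup child of element is plain S.

    Assumption: CDATA sections are treated as not matching S,
    regardless of what they contain.
    """
    for child in element.childNodes:
        if child.nodeType in ALLOWED_MARKUP:
            continue
        if child.nodeType == Node.TEXT_NODE and S_RE.fullmatch(child.data):
            continue
        return False  # CDATA sections and non-whitespace text land here
    return True


def check(doc):
    """Convenience wrapper: parse a document and check its root element."""
    return only_s_between_tags(minidom.parseString(doc).documentElement)
```

Under this reading check("<a> <b/> </a>") passes, while
check("<a><![CDATA[ ]]><b/></a>") does not.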

> So my conclusion is that in fact consistency requires that CDATA
> sections containing nothing but whitespace SHOULD be valid as part of
> the content of element-only content element types. In any case, I
> think this issue needs to be clarified in any corrigendum which may be
> forthcoming.
>
> ht
>
> Resource note:
>
> I tried all the online validators listed at
> http://www.oasis-open.org/cover/check-xml.html:
>
> * The STG one worked as discussed above;
>
> * the Koala one showed me blank pages no matter how I
> invoked it;
>
> * the xml.t2000.co.kr one offers a bewildering array of (to me
> confused) choices, including validation with or without a DTD (?), but
> in any case rejected both the entity references and both the CDATA
> sections;
>
> * the WebTech validator appears to be using SP, but set up incorrectly for XML.
>
> So from four 'validation services', four different answers. I rest my
> case.
>
> ht
> --
> Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
> 2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
> Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
> URL: http://www.ltg.ed.ac.uk/~ht/

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)