RE: CDATA by any other name... (was The raw and the cooked)

Chris Lovett (clovett@microsoft.com)
Fri, 30 Oct 1998 19:04:12 -0800


For what it's worth, the MSXML parser also fails to validate the CDATA
example, because it takes the view that '<![CDATA[ ]]>' is not a
replacement-text entity and therefore the characters '<![' do NOT match the
whitespace production. If the CDATA section is replaced with the entity
reference &ws;, where ws is defined as <!ENTITY ws " ">, MSXML will validate
the document. The reason is that we figured we have to expand entities
anyway for validation purposes, because you can also do the following:

<!DOCTYPE a [
<!ELEMENT a (b, c)>
<!ELEMENT b EMPTY>
<!ELEMENT c EMPTY>
<!ENTITY ent "<b/><c/>">
]>
<a>&ent;</a>
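
As a rough illustration of why the entity has to be expanded before the content model (b, c) can be checked, here is a minimal sketch using Python's standard xml.etree.ElementTree. That module is an assumption of this example, not MSXML: it is expat-based and non-validating, so it only shows what the expanded tree looks like, not the validation step itself.

```python
import xml.etree.ElementTree as ET

# The document from the example above, unchanged.
DOC = """<!DOCTYPE a [
<!ELEMENT a (b, c)>
<!ELEMENT b EMPTY>
<!ELEMENT c EMPTY>
<!ENTITY ent "<b/><c/>">
]>
<a>&ent;</a>"""

# The parser replaces &ent; with its replacement text and parses that
# text as markup, so <a> ends up with real child elements.
root = ET.fromstring(DOC)
child_tags = [child.tag for child in root]
print(child_tags)
```

The children of <a> should come out as b followed by c, which is the sequence a validator would then compare against (b, c).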

It doesn't make any sense to think of the contents of a CDATA section in
terms of "replacement-text", because then you'd have to wonder about
validating <![CDATA[<b/><c/>]]> against the content model, which was NOT
the point of CDATA sections. The point of CDATA was to treat <b/><c/> as
UNPARSED text.
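
The same point can be sketched from the CDATA side, again assuming Python's standard xml.etree.ElementTree (non-validating, and not MSXML) purely as a convenient way to inspect the parsed result:

```python
import xml.etree.ElementTree as ET

# The markup-looking characters sit inside a CDATA section here.
root = ET.fromstring("<a><![CDATA[<b/><c/>]]></a>")

# Nothing inside the CDATA section is parsed as markup: <a> has no
# child elements, only character data.
child_count = len(root)
text = root.text
print(child_count, repr(text))
```

Here <a> has no element children at all, and its text is the literal string <b/><c/>, so there is nothing for a (b, c) content model to match.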