Re: XObjects (was TLAa was SOX)

Peter Murray-Rust (peter@ursus.demon.co.uk)
Thu, 08 Oct 1998 19:34:25


At 10:09 08/10/98 -0700, David Brownell wrote:
>I believe we need two modes of document creation: "empty",
>so they can be constructed programmatically; and "parsed",
>where they're built from a stream of XML text.

I agree completely. I found I needed something like this while I was
constructing my subclasses of ElementNode (I call it XNode in JUMBO), so we
also have to consider it for ElementNodes.

Typically I find the following modes of operation:
- read in a complete XML file and construct the subclassed ElementNodes at
document creation. This may involve special storage, validation,
inter-Element relationships, etc.
- create an empty tree, build on empty nodes, then create the node
internals. The result must then be strictly compatible with the DTD (if
any). Help in regulating this would be useful :-)
- read in (or create) a tree with non-empty elements and then modify these
interactively (edit). This is similar to the previous mode but might
require additional support.
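
To make these modes concrete, here is a minimal Java sketch. It assumes
the later standard javax.xml.parsers (JAXP) and org.w3c.dom interfaces
rather than JUMBO's XNode or any API under discussion here. The class name
DocumentModes, its method names and the molecule/atom markup are invented
for illustration, and nothing in it checks compatibility with a DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class; the names are illustrative only.
public class DocumentModes {

    private static DocumentBuilder newBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no usable XML parser configured", e);
        }
    }

    /** "Empty" mode: a document with no nodes, to be built programmatically. */
    public static Document createEmpty() {
        return newBuilder().newDocument();
    }

    /** "Parsed" mode: the tree is built from a stream of XML text. */
    public static Document createParsed(String xml) {
        try {
            return newBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML text", e);
        }
    }

    public static void main(String[] args) {
        // Empty tree first, node internals afterwards.
        Document built = createEmpty();
        Element molecule = built.createElement("molecule");
        built.appendChild(molecule);
        Element atom = built.createElement("atom");
        atom.setAttribute("elementType", "C");
        molecule.appendChild(atom);

        // Complete XML text parsed in one go.
        Document parsed =
            createParsed("<molecule><atom elementType=\"C\"/></molecule>");

        // Edit: modify an already populated tree.
        Element first = (Element) parsed.getDocumentElement().getFirstChild();
        first.setAttribute("elementType", "N");
        System.out.println(first.getAttribute("elementType"));
    }
}
```

The second and third modes use the same DOM calls; the difference is only
whether the tree starts out empty or populated, which is why the editing
case "might require additional support" beyond what is shown.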

>
[... useful code snipped ...]
>
>Re the latter, I agree that a "Builder" API with "build"
>method makes much sense ... in fact, Sun's current (and
>changeable!) APIs adopt that exact naming convention.

I am really pleased to hear that Sun's APIs are changeable. This is exactly
the time, XML-DEVers, to try to get as much commonality as possible before
the concrete sets.
>
>The structure of such a builder is a tricky problem. I
>think of it as a module that's separate from the DOM and
>separate from the (SAX) parser, but might need to be coupled
>to either or to both. Reason: neither API is sufficient
>for supporting all DOM semantics without some more coupling.
>
>Example: fully conforming to DOM means having access to more
>information than SAX exposes. Which attributes were defaulted?
>What are the default values of attributes? Where do entities
>start/stop? What parsed entities exist? And more. While
>it's easy to write a builder that takes a SAX event stream
>and a document, then populates a DOM Document from it, it can't
>possibly offer full DOM semantics. (Then there's efficiency...)

We always knew that SAX was an 80/20 solution. It seems we have the
following options:
- ignore those bits that SAX doesn't manage at present. At least this
gives us a working system (and you can see how keen I am on working
systems...) For the sort of thing I do, entities, default attributes etc.
are not essential. Maybe I'm being selfish... But I'd hate us to spend
months discussing what we should put in.
- extend SAX so it does (al)most what we want. I can't comment.
- base everything on DOM. Do we have enough stuff in DOM 1.0 and do we
have sufficient XML parsers interfaced to DOM? [I'm rather ignorant here].
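
As a rough illustration of the first option, the sketch below populates a
DOM Document from a SAX event stream and simply ignores what SAX does not
report. It assumes the later namespace-style SAX callbacks and the JAXP
factories; the class name SaxDomBuilder and its build method are
hypothetical, not any agreed "Builder" API. Defaulted attributes reach
the handler looking exactly like specified ones, and entity boundaries,
comments and the DOCTYPE are lost.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical builder: turns SAX events into DOM nodes, nothing more.
public class SaxDomBuilder extends DefaultHandler {
    private final Document doc;
    private Node current; // node that receives the next child

    public SaxDomBuilder(Document target) {
        this.doc = target;
        this.current = target;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        Element e = doc.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            // Specified and defaulted attributes are indistinguishable here.
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        current.appendChild(e);
        current = e;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        current = current.getParentNode();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks; they are merged in build().
        current.appendChild(doc.createTextNode(new String(ch, start, length)));
    }

    /** Parses XML text with SAX and returns the populated DOM Document. */
    public static Document build(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)),
                           new SaxDomBuilder(doc));
            doc.normalize(); // merge adjacent text nodes
            return doc;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not build DOM from SAX", e);
        }
    }

    public static void main(String[] args) {
        Document d = build("<molecule><atom elementType=\"C\">c1</atom></molecule>");
        System.out.println(d.getDocumentElement().getTagName());
    }
}
```

This gives a working tree for documents where entities and default
attributes do not matter, which is all the first option claims.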

Peter Murray-Rust, Director Virtual School of Molecular Sciences,
domestic net connection
VSMS http://www.nottingham.ac.uk/vsms
Virtual Hyperglossary http://www.venus.co.uk/vhg