Re: XML representation of a Table

Steve Muench (SMUENCH@us.oracle.com)
21 Oct 98 12:53:12 -0700


--=_ORCL_10273127_0_0
Content-Transfer-Encoding:quoted-printable
Content-Type:text/plain; charset="iso-8859-1"

You can tag things up in a generic row/column way, or in a way that makes
the natural relationships speak themselves more directly.

Pardon my ASCII art below, but if I have some tables in a
database like:

AUTHOR
\|/
|
/|\
BOOK
\|/
|
COURSE
\|/
|
TEACHER

If I'm thinking of Courses, then I might want my data in XML
to look like:

<COURSE>
<NAME>XML Basics</NAME>
<TEACHER>
<NAME>Joe Tags</NAME>
</TEACHER>
<BOOKLIST>
<BOOK>
<PRICE>50.00</PRICE>
<NAME>Go, XML, Go!</NAME>
<AUTHORLIST>
<AUTHOR>
<NAME>Dr. Seuss</NAME>
</AUTHOR>
</AUTHORLIST>
</BOOK>
</BOOKLIST>
</COURSE>

But if I'm thinking of Teachers I might want:

<TEACHER>
<NAME>Joe Tags</NAME>
<COURSELIST>
<COURSE>
<NAME>XML Basics</NAME>
<BOOKLIST>
<BOOK>
<PRICE>50.00</PRICE>
<NAME>Go, XML, Go!</NAME>
<AUTHORLIST>
<AUTHOR>
<NAME>Dr. Seuss</NAME>
</AUTHOR>
</AUTHORLIST>
</BOOK>
</BOOKLIST>
</COURSE>
<COURSE>
<NAME>History of Markup</NAME>
<BOOKLIST>
<BOOK>
<PRICE>18.00</PRICE>
<NAME>SGML, a History</NAME>
<AUTHORLIST>
<AUTHOR>
<NAME>Sammy Jones</NAME>
</AUTHOR>
</AUTHORLIST>
</BOOK>
<BOOK>
<PRICE>28.00</PRICE>
<NAME>Zen and the Art of Markup</NAME>
<AUTHORLIST>
<AUTHOR>
<NAME>Tao Bo</NAME>
</AUTHOR>
</AUTHORLIST>
</BOOK>
</BOOKLIST>
</COURSE>
</COURSELIST>
</TEACHER>

Of course there's no strict need for the <XXXLIST> enclosing tags, but
a browser trying to parse such a "datagram" might appreciate knowing
that a group of something was about to start...

Object relational databases have the ability to predefine a
nested structure as an object and then make metadata about
the structure of that object available to client programs
via data dictionary tables so that they might be written
to generically query and XML-format the data in an intelligent
way. Oracle8 has a feature called Object Views which lets
relational data be viewed as an object structure, so you
could combine these ideas to have your data structuring and
your logical data manipulation in whatever "shapes" make
sense for your app.

____________________________________________________________________________=

Steve | Consulting PM & XML Technology Evangelist | smuench@oracle.com
Muench | Java Business Objects Dev't Team | geocities.com/~smuench=

--=_ORCL_10273127_0_0
Content-Type:message/rfc822

Date: 21 Oct 98 09:03:34
From:petsa@us.ibm.com
To:xml-dev@ic.ac.uk
Subject:Re: XML representation of a Table
Reply-to:UNX07.US.ORACLE.COM:petsa@us.ibm.com
Return-Path:<owner-xml-dev@ic.ac.uk>
Received:from mailsun3 by usmail04 with SMTP (SMI-8.6/37.9) id JAA11423; Wed, 21 Oct 1998 09:10:32 -0700
Received:from inet16.us.oracle.com by mailsun3 with ESMTP (SMI-8.6/37.9) id JAA23380; Wed, 21 Oct 1998 09:10:32 -0700
Received:from bowmore.cc.ic.ac.uk (bowmore.cc.ic.ac.uk [155.198.5.22]) by inet16.us.oracle.com (8.8.5/8.8.5) with SMTP id JAA25862; Wed, 21 Oct 1998 09:10:36 -0700 (PDT)
Received:from majordom by bowmore.cc.ic.ac.uk with smtp (Exim 1.58 #2) id 0zW0mu-00064y-00; Wed, 21 Oct 1998 17:07:04 +0100
Received:by ic.ac.uk (bulk_mailer for ic.ac.uk v1.7); Wed, 21 Oct 1998 17:06:12 +0100
Received:from majordom by bowmore.cc.ic.ac.uk with local (Exim 1.58 #2) id 0zW0lm-00060w-00; Wed, 21 Oct 1998 17:05:54 +0100
Received:from punch.ic.ac.uk [155.198.5.17] by bowmore.cc.ic.ac.uk with smtp (Exim 1.58 #2) id 0zW0kw-0005yo-00; Wed, 21 Oct 1998 17:05:02 +0100
Received:from smtp3.ny.us.ibm.com [198.133.22.42] by punch.ic.ac.uk with esmtp (Exim 1.62 #1) id 0zW0ks-0002E0-00; Wed, 21 Oct 1998 17:04:58 +0100
Received:from relay1.server.ibm.com (relay1.server.ibm.com [9.14.2.98]) by smtp3.ny.us.ibm.com (8.8.7/8.8.7) with ESMTP id LAA21152 for <xml-dev@ic.ac.uk>; Wed, 21 Oct 1998 11:47:03 -0400
Received:from us.ibm.com (d51mta01.pok.ibm.com [9.117.30.75]) by relay1.server.ibm.com (8.8.7/8.8.7) with SMTP id MAA17900 for <xml-dev@ic.ac.uk>; Wed, 21 Oct 1998 12:00:17 -0400
Received:by us.ibm.com(Lotus SMTP MTA v4.6.1 (569.2 2-6-1998)) id 852566A4.0057872B ; Wed, 21 Oct 1998 11:56:02 -0400
X-Lotus-FromDomain:IBMUS
Message-ID:<852566A4.00577E4E.00@us.ibm.com>
Sender:owner-xml-dev@ic.ac.uk
Precedence: bulk
MIME-Version: 1.0
Content-Type:multipart/mixed;
boundary="=_ORCL_10273126_0r0"

--=_ORCL_10273126_0r0
Content-Type:text/plain; charset=us-ascii
Content-Disposition:inline
Content-Transfer-Encoding:7bit

Two points re. translating from relational tables to XML and back.
1. The example seems reasonable but the fun really starts when you want to
represent 2 or more joined tables as an XML document. In general,
you want to map a SQL view into a XML document and if you have more
than 1 table in the view its not always clear how to structure the
hierarchy.
2. Omitting a value is *not* the right way to represent a NULL value.
SQL distinguishes between these 2 cases. A string column can have an
empty string as a value or its value can be NULL. In DCD we recommend
that you use a special attribute to represent that the value is NULL.

Ashok Malhotra


(Embedded
image moved to "Michael Kay" <M.H.Kay@eng.icl.co.uk>
file: 10/21/98 11:28 AM
pic25949.pcx)

Please respond to "Michael Kay" <M.H.Kay@eng.icl.co.uk>

To: xml-dev@ic.ac.uk
cc: (bcc: Ashok Malhotra/Watson/IBM)
Subject: Re: XML representation of a Table

> I just wrote an application which creates XML file from the table. All
you
>hava to do is to to give the table name and it will generate the XML. My
>Question, i just wanted to know, whether the XML generated is correct.

Sounds a useful application, I'd like to know more about it (especially if
you can do the reverse as well!). You can check whether the XML is
"correct"
(i.e. valid and well-formed) by putting it through any xml parser, I think
xp is one of the strictest. Some test cases you need to check are your
handling of non-ASCII characters and special characters such as "<" in your
data. You also need to consider whether CR/LF characters in your data are
significant: XML treats CR=LF=CRLF which may not be what you want.

One observation, in your DTD all the columns of the table are declared
mandatory, this gives you no way of handling null values (the obvious
representation of a null value is to omit the relevant element).

Another issue you may need to address is that not every SQL table and
column
identifier is a valid XML name. For example, SQL identifiers can contain
spaces.

You will also have to think about how to encode binary (blob) fields.

For large tables your representation is very inefficient in space terms.
Often we don't worry about this in XML work, but relational tables can
reach
gigabytes in size even without all these tags. An alternative I would
consider for large tables is:

<TABLEDEF NAME="ACTION">
<COLUMNS>
<COL NAME="ACTION_ID"/><COL NAME="ACTION_DESC"/> etc
</COLUMNS>
</TABLEDEF>
<TABLE NAME="ACTION">
<ROW>A<C/>Activate<C/>A<C/>PASSTEST<C/>1998-02-23 09:44:00.000</ROW>
<ROW>...</ROW>
</TABLE>

I don't think the gurus would recommend using empty elements as separators
like this, but it is a perfectly legitimate use of XML.

Finally, a transfer format for relational tables also needs to be able to
represent the metadata; my example heads in this direction.

Regards,
Mike Kay

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following
message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

--=_ORCL_10273126_0r0
Content-Type:application/octet-stream; name=pic25949.pcx
Content-Disposition:attachment; filename=pic25949.pcx
Content-Transfer-Encoding:base64

CgUBCAAAAABoACwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAABaQABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAD1E9sTzRPHE8MTwhP1E9sTzRPHE8MTwhP1E9sTzRPHE8MT
whP1E9sTzRPHE8MTwhP1E9sTzRPHE8MTwhP1E9sTzRPHE8MTwhP1E9sTzRPHE8MTwhP1E9sT
zRPHE8MTwhP1E9sTzRPHE8MTwhPwEwzIBgzYE8wTxhPDE8IT7hPOBtcTzBPGE8MTE+wTwgbC
BwbCEgbCEgbCEsUG1hPLE8YTwxMT6hMMwgYHwgLCAwISwgfEEsMCwwbVE8sTxRPDExPpE8MG
AwcCBwMCwhLDB8ISwgISwgLDBtUTyhPFE8MTE+gTwgIHA8ICEw4DDgLDE8USwwLCEMIG1BPK
E8UTwxMT5xMCAwcDAg4TDgITwgISD8ISD8ISBRICEcICwwbUE8oTxRPCExPmEwYCBwMCDgIO
wgLDExITEhPCEg8GxgLDBtMMDAfJE8QTwhMT5hMGwwITBgMCDhLFEw8SE8ISBgIDwhIDEsMG
B9MDxwwHxRPDExPlEwYHAhESAg8CwhMPwhMPxBMPxRIQwgIDAgMCBtMDxwPEDAfDE8IT4RMH
wwzCBgLCEhMCDxLIE8MSD8MSwwIQAwIDBgfSDMkDwgPCDAfCExPbEwfGDMIDDAIHERITEhMS
wxMPwxMPwxPDEgIDAgMCwwMCBgzREwfHDMYDDMITE9YTB8UMyAMGB8ICBhLDAsYTEhMSExIP
whIHAgcCAwUQAgYRBgfSE8UTB8QMwgMMwhMT0hMHxAzLA8IMBsISDxESExITAw4DxBMSExIT
wxICBwPCAsMDDMIGB9ITyRMHwwzCExPPEwfDDMkDxQwHwhMGBxITAhECEwMOAg7DExITDxMP
wxIDAgMCBwMCDAYRBgfSE8kTwhPCDMITE8wTB8MMxwPEDMIHxxMGxBLDAg4DDgIGwg/IEgID
wgIDAgwCEMIGB9ITyRMHDAcMwhMTyhMHwgzGA8MMwgfMEwYHwhLCEAIOAg4CDhDDAhIPxhIF
AgXDAgUCEQYH0hPHEwfCDAcPDMITE8gTB8IMxQPDDAfQEwbDEhDEAhAOEA4QwgLGEgcSBhIG
BcMCBcIGB9ATB8UMEwfCDA8HDwwHwhMTxhMHwgzEA8MMB9MTBgfCEhADEMICDhAOEMICEQID
xxIGBwbCAgUCEQYHyxMHxAwHwhMHEwzCEwcPBw8MB8MTE8UTBwzEA8IMB9YTBsQSEAMCA8UC
EQIDAgPDEgcSBgfCBgUQAhDCBgfGEwfEDAfGE8INEwzCEw8HwgwHwxPCE8QTBwzDA8IMB9gT
BgfEEhACEMYCEQIDAsQSBhLDBsICEALCBgfCEwfDDAfKEwfCDRMHwhPCDAfEE8ITE8MTBwzC
A8IMB9oTDBIHwxLDDBEDxQIDAgPDEgYSBgfCBgIQAhAGDAfCEwzDE8MHyRMHwhPCBxMHxRPD
ExPDEwzCAwwH3RMGxxICEQPDAgMCA8MSBhIGBwYMBhACEAIGDMMTDBPCB8YTwwfHEwfGE8MT
whPDEwwDDAfeEwYHxxICEQPDAgMCwhIGEgYHBgwGEAIQAsIGB8MTDMYTwwfKEwzGE8MTwhPD
E8IMB98TDBLCB8USAgMRxAISB8ISBgcGDAYQBhAGEAYMB8MMB8kTwwfHEwzGE8MTwhPDEwwP
wgzfEwYSB8ISB8ISAhECAwIDEgcSBwYHBgwGEAYQxgzDD8IHxRPDB8kTBwzGE8MTwhPDEwzD
D8QM3BPCBhIGwxIGAhECAwIHBgcGyAzJDxMHzRMHwwwHxxPDE8ITwxMHDMYPxwwH1BMGEgYS
BhLLDM4PwwwTDMcTwgfEDAfJE8QTwhMTxBMHwgzLD9sM0w/GDAfDEwzDEwfEDAfLE8YTwxMT
xhMHxAztD8gMBgfIE8QMB84TxxPDE8ITyhMHxwzbD8sMEAUMBcIMwgYH1RPKE8UTwxMT0RMH
2wwGEAYQBhACBQwFDAUMBgwHBgfWE8sTxRPDExPuEwYMBhAGEAIGDAYMwwYH1xPLE8YTwxMT
8BPKBgfYE8wTxhPDExP1E9sTzRPHE8MTwhP1E9sTzRPHE8MTwhMMAAAAgAAAAIAAgIAAAACA
gACAAICAwMDAwNzApsrw//vwoKCkgICA/wAAAP8A//8AAAD//wD/AP//////AAAAgAAAAIAA
gIAAAACAgACAAICAwMDAwNzApsrw//vwoKCkgICA/wAAAP8A//8AAAD//wD/AP//////AAAA
gAAAAIAAgIAAAACAgACAAICAwMDAwNzApsrw//vwoKCkgICA/wAAAP8A//8AAAD//wD/AP//
////AAAAgAAAAIAAgIAAAACAgACAAICAwMDAwNzApsrw//vwoKCkgICA/wAAAP8A//8AAAD/
/wD/AP//////AAAAgAAAAIAAgIAAAACAgACAAICAwMDAwNzApsrw//vwoKCkgICA/wAAAP8A
//8AAAD//wD/AP//////AAAAgAAAAIAAgIAAAACAgACAAICAwMDAwNzApsrw//vwoKCkgICA
/wAAAP8A//8AAAD//wD/AP//////AAAAgAAAAIAAgIAAAACAgACAAICAwMDAwNzApsrw//vw
oKCkgICA/wAAAP8A//8AAAD//wD/AP//////AAAAgAAAAIAAgIAAAACAgACAAICAwMDAwNzA
psrw//vwoKCkgICA/wAAAP8A//8AAAD//wD/AP//////AAAAgAAAAIAAgIAAAACAgACAAICA
wMDAwNzApsrw//vwoKCkgICA/wAAAP8A//8AAAD//wD/AP//////AAAAgAAAAIAAgIAAAACA
gACAAICAwMDAwNzApsrw//vwoKCkgICA/wAAAP8A//8AAAD//wD/AP//////AAAAgAAAAIAA
gIAAAACAgACAAICAwMDAwNzApsrw//vwoKCkgICA/wAAAP8A//8AAAD//wD/AP//////AAAA
gAAAAIAAgIAAAACAgACAAICAwMDAwNzApsrw//vwoKCkgICA/wAAAP8A//8AAAD//wD/AP//
////AAAAgAAAAIAAgIAAAACAgACA//vwoKCkgICA/wAAAP8A//8AAAD//wD/AP//////

--=_ORCL_10273126_0r0--

--=_ORCL_10273127_0_0--