Re: XML representation of a Table

Michael Kay (M.H.Kay@eng.icl.co.uk)
Wed, 21 Oct 1998 16:28:10 +0100


> I just wrote an application which creates an XML file from a table. All
> you have to do is give the table name and it will generate the XML. My
> question: I just wanted to know whether the XML generated is correct.

Sounds a useful application; I'd like to know more about it (especially if
you can do the reverse as well!). You can check whether the XML is
well-formed by putting it through any XML parser (I think xp is one of the
strictest); checking that it is also valid against your DTD needs a
validating parser. Some test cases you need to check are your handling of
non-ASCII characters and of special characters such as "<" and "&" in your
data. You also need to consider whether CR/LF characters in your data are
significant: an XML parser reports CR, LF and CRLF all as a single LF, which
may not be what you want.
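
As a rough sketch of such test cases (Python and its standard
xml.etree.ElementTree and xml.sax.saxutils modules are my choice for
illustration, not something your application has to use; ElementTree checks
well-formedness only, not validity against a DTD):

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Sample data containing special and non-ASCII characters.
raw = 'size < 10 & "caf\u00e9"'

# Escaped data parses; unescaped data does not.
good = "<ACTION_DESC>" + escape(raw) + "</ACTION_DESC>"
bad = "<ACTION_DESC>" + raw + "</ACTION_DESC>"

# The parser normalizes a CRLF pair in the data to a single LF.
crlf_text = ET.fromstring("<D>line1\r\nline2</D>").text
```

If the CR characters matter to you, they would have to be written as
character references (&#13;) or encoded some other way.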

One observation: in your DTD all the columns of the table are declared
mandatory, which gives you no way of handling null values (the obvious
representation of a null value is to omit the relevant element).
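
A minimal sketch of that representation, assuming one child element per
column inside a ROW element (the function name row_to_element is made up for
this example, and the DTD would have to declare nullable columns as optional,
e.g. ACTION_DESC?):

```python
import xml.etree.ElementTree as ET


def row_to_element(columns, values):
    """Build a ROW element, leaving out any column whose value is NULL."""
    row = ET.Element("ROW")
    for name, value in zip(columns, values):
        if value is None:
            continue  # NULL: omit the element altogether
        ET.SubElement(row, name).text = str(value)
    return row


row = row_to_element(["ACTION_ID", "ACTION_DESC"], ["A", None])
xml_text = ET.tostring(row, encoding="unicode")
```

This keeps a null distinct from an empty string, which would still be written
as an empty element.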

Another issue you may need to address is that not every SQL table and column
identifier is a valid XML name. For example, SQL identifiers can contain
spaces.
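
One possible mapping, purely as an illustration (the function to_xml_name and
the underscore substitution are my own assumptions, and it deliberately sticks
to a conservative ASCII subset of what XML names allow):

```python
import re


def to_xml_name(sql_identifier):
    """Map a SQL identifier onto a conservative ASCII XML name."""
    # Replace anything outside letters, digits, '_', '.', '-' with '_'.
    name = re.sub(r"[^A-Za-z0-9_.-]", "_", sql_identifier)
    # An XML name may not start with a digit, '.' or '-'.
    if not re.match(r"[A-Za-z_]", name):
        name = "_" + name
    return name


examples = [to_xml_name(s) for s in ("ACTION_ID", "ORDER DETAILS", "2ND_QTR")]
```

A mapping like this is lossy ("A B" and "A_B" collide), so the original
identifier still needs to be recorded somewhere, for example in an attribute
value, where spaces are allowed.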

You will also have to think about how to encode binary (blob) fields.
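
Base64 is one common choice for this. A small sketch, in which the element
name PHOTO and the ENCODING attribute are invented for the example:

```python
import base64
import xml.etree.ElementTree as ET

# Arbitrary binary data, including bytes that are not legal XML characters.
blob = bytes([0, 255, 60, 38, 13, 10])

# Store the blob as base64 text, flagging the encoding in an attribute.
elem = ET.Element("PHOTO", {"ENCODING": "base64"})
elem.text = base64.b64encode(blob).decode("ascii")
xml_text = ET.tostring(elem, encoding="unicode")

# Reading it back recovers the original bytes exactly.
decoded = base64.b64decode(ET.fromstring(xml_text).text)
```

The cost is roughly a one-third increase in size over the raw bytes.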

For large tables your representation is very inefficient in space terms.
Often we don't worry about this in XML work, but relational tables can reach
gigabytes in size even without all these tags. An alternative I would
consider for large tables is:

<TABLEDEF NAME="ACTION">
<COLUMNS>
<COL NAME="ACTION_ID"/><COL NAME="ACTION_DESC"/> etc
</COLUMNS>
</TABLEDEF>
<TABLE NAME="ACTION">
<ROW>A<C/>Activate<C/>A<C/>PASSTEST<C/>1998-02-23 09:44:00.000</ROW>
<ROW>...</ROW>
</TABLE>

I don't think the gurus would recommend using empty elements as separators
like this, but it is a perfectly legitimate use of XML.
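
To show that the format is easy to read back, here is a sketch of a reader
for a cut-down, two-column version of the example. The DATA wrapper element
is my addition (a document needs a single root element), and row_values is a
hypothetical helper:

```python
import xml.etree.ElementTree as ET

doc = """<DATA>
<TABLEDEF NAME="ACTION">
<COLUMNS>
<COL NAME="ACTION_ID"/><COL NAME="ACTION_DESC"/>
</COLUMNS>
</TABLEDEF>
<TABLE NAME="ACTION">
<ROW>A<C/>Activate</ROW>
<ROW>D<C/></ROW>
</TABLE>
</DATA>"""


def row_values(row):
    """Split a ROW into its values, using the empty C elements as separators."""
    values = [row.text or ""]         # text before the first <C/>
    for sep in row.findall("C"):
        values.append(sep.tail or "")  # text following each <C/>
    return values


root = ET.fromstring(doc)
columns = [col.get("NAME") for col in root.find("TABLEDEF/COLUMNS")]
rows = [dict(zip(columns, row_values(r))) for r in root.find("TABLE")]
```

Note that in this layout a null and an empty string look the same, so nulls
would need some extra convention.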

Finally, a transfer format for relational tables also needs to be able to
represent the metadata; my example heads in this direction.

Regards,
Mike Kay