Subject Re: [Firebird-general] Re: IBM moves the database goalposts - xml related
Author Martijn Tonies
> > Like what? XML does nothing that a relational DBMS can't do.
>
> OO does nothing that relational DBMS can't. Do you claim that OO
> concept should be deprecated?

OO is no basis for storage, is it?

OODBMS: yes, please, drop them :-)

I'd rather see a decent OO <-> DBMS bridge.

> > XPath/XQuery is nothing but patches to something that
> > should be used for data exchange. Not storage.
>
> Wrong. XML introduces a [new?] data model and provides appropriate
> methods for its manipulation.

Does XML introduce a data model? If so, please show me the basis
for this model.

The relational model has math and logic as its basis.

> >> Don't think about relational data. XML is used to store structured and
> >> semi-structured data of a different nature.
> > The same with relational databases: you can throw anything at them (a
> > whole filesystem, for example, movies, etc.)
>
> Not really. In the relational model you can only have data that
> conforms to one schema.

Can it? :-)

> In XML you can merge two schemas together without
> additional effort and the result would still be a valid XML document.
> Example:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <document>
> <title>SuperBook</title>
> <chapter>
> <header>Chapter 1.</header>
> <para>
> ....
> </para>
> </chapter>
> ....
> </document>
>
> Now your database is filled with "instances" of such XML files.
>
> Now assume that a manager comes to you and says that he wants to add
> some additional semi-structured information to some instances in your
> database of documents, though he cannot define which information will
> be added to which instance. That additional information must be
> queryable and accessible with a standard query language.
>
> How are you going to solve that in relational model? Define a generic
> schema? Extend your schema for each case? Design additional database?
>
> In XML you define an additional namespace. Now you can have the
> following XML file:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <document xmlns="urn::document">
> <auth:author xmlns:auth="urn::author">
> <auth:firstName>John</auth:firstName>
> <auth:lastName>Doe</auth:lastName>
> </auth:author>
> <title>SuperBook</title>
> <chapter>
> <header>Chapter 1.</header>
> <para>
> ....
> </para>
> </chapter>
> ....
> </document>
>
> And you can add information about the author to one document and
> reviews to another one, or you can add both. An application that adds
> this information does not need to understand the original data schema,
> only its own. All applications that worked with the previous
> document structure will continue to work with it - they simply do not
> see the additions. New applications can handle the additional
> information together with the main one, or just process the information
> they understand. And all additional information is available with the
> "old" query language; no changes or extensions are needed.

One of the primary reasons to create the relational model is
data independence. How can the above not be done in a
(R)DBMS? I don't see the benefit of XML here.
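
To make the question concrete, here is a minimal sketch in Python with SQLite standing in for any SQL DBMS. The table and column names (document, document_author and so on) are made up for this example and do not come from the thread. Author information is added for some documents only, and the query used by the "old" application keeps returning what it returned before.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original schema: the "old" application only knows about this table.
conn.executescript("""
    CREATE TABLE document (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    INSERT INTO document (id, title) VALUES (1, 'SuperBook'), (2, 'OtherBook');
""")

old_query = "SELECT title FROM document ORDER BY id"
before = conn.execute(old_query).fetchall()

# Later: author information is added, for some documents only.
# (Hypothetical table; the existing table is not altered.)
conn.executescript("""
    CREATE TABLE document_author (
        document_id INTEGER NOT NULL REFERENCES document (id),
        first_name  TEXT NOT NULL,
        last_name   TEXT NOT NULL,
        PRIMARY KEY (document_id, first_name, last_name)
    );
    INSERT INTO document_author VALUES (1, 'John', 'Doe');
""")

# The old query is unaffected by the new table.
after = conn.execute(old_query).fetchall()

# A new application can combine both; documents without an author
# simply come back with NULLs.
new_rows = conn.execute("""
    SELECT d.title, a.first_name, a.last_name
    FROM document d
    LEFT JOIN document_author a ON a.document_id = d.id
    ORDER BY d.id
""").fetchall()
```

Whether this counts as "extending your schema for each case" is exactly the point under discussion; the sketch only shows that existing queries do not have to change.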

> Where do you need this?
>
> For example your email client. If each email is an XML file, its
> header and body are represented by standard structure. However
> currently attachments in emails are represented by a "universal"
> structure that contains only a name of the file, its MIME type and
> base64 encoded binary content. Now assume that each attachment is
> represented by appropriate XML structure from an appropriate
> namespace. Word files would have document title, keywords, etc.
> Images would have their size in pixels, resolution, etc. Executables
> would have signature. You can add other emails as attachments. And all
> this information is available with query language:
>
> //email[attachment/format='application/msword' and
> contains(attachment/word:keywords, 'Firebird')]
>
> will select all emails that have attachments in Word format which
> contain "Firebird" in their keywords. And no MS Word document parser is
> needed. (This query will not work, since there is no function
> "contains", I invented it for this example, but most XPath/XQuery
> implementations allow extending the set of available functions).
>
> How are you going to implement this in relational model without
> designing a schema that would handle all available cases? What will
> you do with your data model and application if emails with some new
> attachment type have to be processed?

1) user-defined types (e.g. domains)
2) being able to create the basic operators on the domains
3) a keyword extractor
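
A rough sketch of how the quoted email query could look on the relational side, again using Python with SQLite as a stand-in. SQLite has no domains, so plain tables are used here, and the output of the keyword extractor is assumed to be stored in a hypothetical attachment_keyword table. All names and sample rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE email (
        id      INTEGER PRIMARY KEY,
        subject TEXT NOT NULL
    );
    CREATE TABLE attachment (
        id       INTEGER PRIMARY KEY,
        email_id INTEGER NOT NULL REFERENCES email (id),
        format   TEXT NOT NULL              -- MIME type
    );
    -- Filled by a (hypothetical) keyword extractor when a mail is stored.
    CREATE TABLE attachment_keyword (
        attachment_id INTEGER NOT NULL REFERENCES attachment (id),
        keyword       TEXT NOT NULL,
        PRIMARY KEY (attachment_id, keyword)
    );

    INSERT INTO email VALUES (1, 'Release notes'), (2, 'Holiday photos');
    INSERT INTO attachment VALUES
        (10, 1, 'application/msword'),
        (20, 2, 'image/jpeg');
    INSERT INTO attachment_keyword VALUES (10, 'Firebird'), (10, 'SQL');
""")

# Emails that have a Word attachment with the keyword 'Firebird'.
matches = conn.execute("""
    SELECT DISTINCT e.id, e.subject
    FROM email e
    JOIN attachment a         ON a.email_id = e.id
    JOIN attachment_keyword k ON k.attachment_id = a.id
    WHERE a.format = 'application/msword'
      AND k.keyword = 'Firebird'
""").fetchall()
```

A new attachment type with keywords needs new rows here, not new tables; properties that are specific to one type (pixel size, signature) are where user-defined types would come in.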

> Some things that are natural in XML are completely unnatural in the
> relational model. For example, the relation between document and author
> you have to implement by introducing synthetic keys. If you have two
> authors, in the relational model you have to add a position field. In
> XML it is just there.

A position field? Why? If one author is more important than another,
then yes, there had better be a position field. But if the position is
based upon the "position" in the XML document, then that's plain silly.
You should store data: if position is part of the data, store it. If the
representation of your data inside an XML document contains information
about the data, you're doing things the wrong way.
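
As a small illustration (Python with SQLite as a stand-in; the table, columns and sample names are invented here): when the order of authors is meaningful, it is stored as an ordinary column and queried like any other data. No synthetic key is needed in this sketch either, since the composite key is built from the data itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document_author (
        document_id INTEGER NOT NULL,
        author_name TEXT    NOT NULL,
        position    INTEGER NOT NULL,   -- stored because the order is data
        PRIMARY KEY (document_id, author_name),
        UNIQUE (document_id, position)
    );
    -- Rows are inserted in "wrong" order on purpose: the order in which
    -- rows happen to be stored carries no meaning.
    INSERT INTO document_author VALUES
        (1, 'Jane Roe', 2),
        (1, 'John Doe', 1);
""")

ordered = [
    row[0]
    for row in conn.execute(
        "SELECT author_name FROM document_author "
        "WHERE document_id = 1 ORDER BY position"
    )
]
```

If the order is not meaningful, the position column simply goes away and nothing else changes.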

> You can argue that this is bad, not performant, etc. But nobody argues
> that XML should replace relational databases. Just compare apples to
> apples, oranges to oranges. XML is a completely different data model,

Then, I ask you: what is the solid basis for the model? How is it even a
model?

> when it is used the same way the relational model is used, it is
> extremely inconvenient, slow and resource-expensive. But when it is
> used appropriately it is much more convenient than SQL.

SQL <> Relational Model.

> Also, Marius and Martijn, your replies that XML is only an
> information exchange format suggest that you simply do not understand
> the applicability of XML. That is similar to a situation where I
> would claim that triggers, SPs and referential integrity are not needed,
> since the J2EE container-managed persistence specification (also the JDO
> specification, Hibernate, etc.) does not support them.

Triggers and Stored Procedures are nice - and I very much appreciate
them, but constraints are part of the basis of the relational model and
a decent (R)DBMS. It's the best we can do to ensure that our DBMS
knows what the data means and enforces integrity. How does this work
with XML?
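
For comparison, a minimal sketch of what "the DBMS enforces integrity" means in practice, using Python with SQLite as a stand-in. The schema and the try_insert helper are hypothetical and exist only for this example. The constraints are declared once, and the DBMS rejects data that violates them regardless of which application sends it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.executescript("""
    CREATE TABLE author (
        id        INTEGER PRIMARY KEY,
        last_name TEXT NOT NULL
    );
    CREATE TABLE document (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL CHECK (length(title) > 0),
        author_id INTEGER NOT NULL REFERENCES author (id)
    );
    INSERT INTO author VALUES (1, 'Doe');
""")


def try_insert(title, author_id):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO document (title, author_id) VALUES (?, ?)",
                (title, author_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_insert("SuperBook", 1)    # valid row
bad_fk = try_insert("Orphan", 99)  # no such author: foreign key violation
bad_check = try_insert("", 1)      # empty title: CHECK violation
```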

With regards,

Martijn Tonies
Database Workbench - developer tool for InterBase, Firebird, MySQL & MS SQL
Server
Upscene Productions
http://www.upscene.com