[geeks] Any opinion on "Exist": XML Database

Jonathan C. Patschke jp at celestrion.net
Mon Jun 10 05:17:44 CDT 2002


On Mon, 10 Jun 2002, William S. wrote:

> I am exploring several alternatives in terms of providing
> a web based database server. I am using Sablotron with
> php now to transform content in XSL and XML files. Here is
> what I have working so far:
> 
> http://213.84.71.105/

That looks pretty cool.  Do you have a PHP interface to add data to the
file, or are you editing it manually?

> I recently came across this item and was wondering if
> anyone has tried it out or applied it and has an opinion.
> 
> http://exist-db.org/index.html

#ifdef SKEPTICAL_PROGRAMMER

To me, it looks like a solution in search of a problem.  It looks very
elegant and fairly simple to use, but I can't imagine that it would come
close to the speed of a well-indexed real database.  Just remember that if
you can express what you want in set-theory, you can express it in SQL.

For instance, the example they give could be expressed in SQL as a table of
plays, a table of speakers (where each row relates to a play), a table of
acts (where each row relates to a play), a table of scenes (where
each row relates to an act), and a table of lines (where each row relates
to a speaker and a scene, and has a line number).  If you wanted speeches
set apart from the rest of the play, you would create a table of speeches
where each speech relates to a scene.
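
A rough sketch of that layout, using Python's sqlite3 module only because
it needs no server to try out.  Every table and column name here is made
up for illustration; the post only describes the relations, not a schema.

```python
import sqlite3

# Hypothetical names throughout.  REFERENCES documents each relation;
# SQLite only enforces it when PRAGMA foreign_keys=ON is set.
SCHEMA = """
CREATE TABLE plays    (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE speakers (id INTEGER PRIMARY KEY,
                       play_id INTEGER NOT NULL REFERENCES plays(id),
                       name TEXT NOT NULL);
CREATE TABLE acts     (id INTEGER PRIMARY KEY,
                       play_id INTEGER NOT NULL REFERENCES plays(id),
                       act_no INTEGER NOT NULL);
CREATE TABLE scenes   (id INTEGER PRIMARY KEY,
                       act_id INTEGER NOT NULL REFERENCES acts(id),
                       scene_no INTEGER NOT NULL);
CREATE TABLE lines    (id INTEGER PRIMARY KEY,
                       speaker_id INTEGER NOT NULL REFERENCES speakers(id),
                       scene_id INTEGER NOT NULL REFERENCES scenes(id),
                       line_no INTEGER NOT NULL,
                       body TEXT NOT NULL);
"""

def build_play_db():
    """Create an empty in-memory database with the tables above."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db

def lines_by_speaker(db, play_title, speaker_name):
    """Return (act_no, scene_no, line_no, body) rows for one speaker."""
    return db.execute(
        """
        SELECT acts.act_no, scenes.scene_no, lines.line_no, lines.body
        FROM lines
        JOIN speakers ON speakers.id = lines.speaker_id
        JOIN scenes   ON scenes.id   = lines.scene_id
        JOIN acts     ON acts.id     = scenes.act_id
        JOIN plays    ON plays.id    = acts.play_id
        WHERE plays.title = ? AND speakers.name = ?
        ORDER BY acts.act_no, scenes.scene_no, lines.line_no
        """,
        (play_title, speaker_name),
    ).fetchall()
```

With an index on lines(speaker_id), a query like "everything this speaker
says" would not need to scan the whole play, which is the speed argument
made above.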

The real win for XML is that it's easy to create, import, and edit
data--you don't have to worry about the relations in specific, as they're
implied; you also don't have to worry about the care and feeding of a
DBMS.  The real win for SQL is (typically) speedy access and easy
description of what data you want to retrieve; you also can draw relations
that might not've been conceived earlier, if you plan your database
correctly.  The biggest win for SQL is being able to tag a datum with an
ID so that you know when identical pieces of data are references to the
same thing--without that, you just have to guess.
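
To make the ID point concrete, here is a small sketch.  The XML snippet,
element names, and table layout are all invented for illustration: two
plays each have a speaker called MESSENGER, and only the database rows
say whether those are the same entity.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical markup: the speaker is nothing more than a string.
PLAYS_XML = """
<plays>
  <play title="Play A"><speech><speaker>MESSENGER</speaker></speech></play>
  <play title="Play B"><speech><speaker>MESSENGER</speaker></speech></play>
</plays>
"""

def speaker_names_from_xml(xml_text):
    """All the XML gives us is the text of each <speaker> element."""
    root = ET.fromstring(xml_text)
    return [s.text for s in root.iter("speaker")]

def speaker_rows_from_sql():
    """Each speaker row carries its own ID, so identical names stay distinct."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE speakers (id INTEGER PRIMARY KEY, play TEXT, name TEXT);
        INSERT INTO speakers (play, name) VALUES ('Play A', 'MESSENGER');
        INSERT INTO speakers (play, name) VALUES ('Play B', 'MESSENGER');
    """)
    return db.execute("SELECT id, name FROM speakers ORDER BY id").fetchall()
```

The XML side yields two identical strings and leaves you guessing; the
SQL side yields two rows with different IDs, so a line can point at
exactly one of them.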

Combining the two sounds like a real lose for general-purpose work,
especially if you get a -lot- of queries.  It would make much more sense
to craft a database, import the XML, and export it, as needed.  XML, if
I'm remembering correctly, was initially created as an interchange
mechanism, not a storage/retrieval mechanism.
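
A minimal sketch of that import/export round trip, assuming a made-up
XML layout where each <speech> element has a speaker attribute.  The
function names and the speeches table are hypothetical, not anything
from eXist or Sablotron.

```python
import sqlite3
import xml.etree.ElementTree as ET

def import_xml(db, xml_text):
    """Load every <speech speaker="..."> element into a speeches table."""
    db.execute("CREATE TABLE IF NOT EXISTS speeches "
               "(id INTEGER PRIMARY KEY, speaker TEXT, body TEXT)")
    root = ET.fromstring(xml_text)
    rows = [(sp.get("speaker"), (sp.text or "").strip())
            for sp in root.iter("speech")]
    db.executemany("INSERT INTO speeches (speaker, body) VALUES (?, ?)", rows)
    return len(rows)

def export_xml(db, speaker):
    """Query the database, then serialize just the matching rows as XML."""
    root = ET.Element("speeches")
    query = "SELECT speaker, body FROM speeches WHERE speaker = ? ORDER BY id"
    for name, body in db.execute(query, (speaker,)):
        el = ET.SubElement(root, "speech", speaker=name)
        el.text = body
    return ET.tostring(root, encoding="unicode")
```

The queries run against the database, and XML only appears at the edges,
where it is handed to or received from something else.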

There are very valid reasons for wanting to query SGML or XML, but I think
it'd be a bad idea to get in the habit of relying upon it as a permanent
form of storage, unless you absolutely cannot get anything else working
predictably (example: an app that runs on both Unix (which typically only
has BDB) and Windows (which typically only has Jet or ODBC), but doesn't
use Java).

My personal opinion is that XML is like the "Object-Oriented" of the early
90s, the "Web-enabled" of 1996, and the "Java-powered" of 1997--it's new
and everyone wants to use it because it's new.  It has very real potential
in a few areas, but it's not a panacea, and it's actually worse for some
applications than more traditional technologies are.

The things I would use XML for would be news stories on a CNN-like site,
recipes, or any other project that meets the following criteria:

  1) High amount of implied metadata (flour is a dry ingredient, for
     example).
  2) Content should be auto-formatted, rather than having formatting
     embedded, and there is a lot of formatting to be done (what if CNN
     wants to alter the positioning of the ads between paragraphs?).
  3) Searches are basically limited to full-text and a -tiny- bit of
     metadata (You'll probably want to search by author, text, and title
     at CNN, but you really don't care about most of the other metadata).
  4) You -don't- need many-many relationships.  Doing this in SQL is ugly
     enough.  I can't see how you'd do it in XML without emulating the
     way you'd do it in SQL (thereby killing your usability bonus).
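
For reference, the usual SQL way to do many-many (criterion 4) is a third
"junction" table.  A sketch using the recipe example, with table and
column names that are purely illustrative:

```python
import sqlite3

def build_recipe_db():
    """Recipes and ingredients, linked many-many through a junction table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE recipes     (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE ingredients (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- One row per (recipe, ingredient) pairing.
        CREATE TABLE recipe_ingredients (
            recipe_id     INTEGER NOT NULL REFERENCES recipes(id),
            ingredient_id INTEGER NOT NULL REFERENCES ingredients(id),
            PRIMARY KEY (recipe_id, ingredient_id)
        );
    """)
    return db

def recipes_using(db, ingredient):
    """Names of all recipes that include the given ingredient."""
    rows = db.execute(
        """
        SELECT recipes.name
        FROM recipes
        JOIN recipe_ingredients ON recipe_ingredients.recipe_id = recipes.id
        JOIN ingredients ON ingredients.id = recipe_ingredients.ingredient_id
        WHERE ingredients.name = ?
        ORDER BY recipes.name
        """,
        (ingredient,),
    )
    return [r[0] for r in rows]
```

Doing the same in XML would presumably mean ID/IDREF-style cross
references between elements, which is the same junction idea by hand.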

But I've been told that people should take what I say with a grain of
salt.  I still primarily code in C.  I still use LaTeX for word
processing.  I'm still not sold on top-down design.  I still use HTML 3.2
for web-markup.  I don't -think- I'm afraid of new technology, but I
certainly don't seem to use a lot of it.

#endif /* SKEPTICAL_PROGRAMMER */

--Jonathan
SQL bigot for over 3 years.


