Apache Web Server: Article

David & Goliath: A Comparison Of XML-Enabled And Native XML Data Management Techniques

Due to the great increase of data in XML format, companies increasingly have to face the issue of how to manage this data efficiently. To do this it is important to take advantage of XML's potential, and to integrate with applications that access data stored in relational database management systems (RDBMS).

Currently, no tools are available to analyze whether XML data is best stored in a conventional RDBMS or in a native XML database system, particularly for applications that involve legacy or new systems. The most appropriate solution depends on factors such as the process type (read-only or read-write), the XML document's structure (single or multiple hierarchies, or dynamic structure changes), the data volume, and the application's priorities (fast queries, concurrent updates).

In this article we present, in graphical form, the response times for some common operations using SQL and XPath. We store XML data using two different storage and management solutions: Oracle ("Goliath"), an XML-enabled database, and eXist ("David"), a native XML database (NXD). eXist is a very recent database that still needs to mature but has many virtues; Oracle is a mature, robust, secure, and reliable RDBMS. We bring these two databases face to face, like "big titans," to show which performs better in different development scenarios.

The main advantage that we have found in the eXist NXD is the direct XML management it offers as a native XML database. Another advantage is that eXist is open source software, which also provides all of the requisite libraries. This means that it offers an inexpensive technology option, since only training costs will be incurred in its use.

In XML data management, companies wonder how to manage XML data and integrate it with a legacy RDBMS. Database queries must follow the same structure as the XML data. An RDBMS is accessed directly through SQL statements, while an NXD gives direct access to the XML data itself. Figure 1 shows the three-tier architecture used in this experiment.
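
To make the two access paths concrete, here is a minimal sketch of the same lookup expressed once as an SQL statement and once as an XPath expression. It is not the experiment's code: Python's built-in sqlite3 and xml.etree.ElementTree modules stand in for Oracle and eXist, and the clientes data, the ciudad field, and both function names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same two clients, once as XML, once as rows.
XML_DOC = (
    "<clientes>"
    "<cliente id='1'><nombre>Ainhoa</nombre><ciudad>Donostia</ciudad></cliente>"
    "<cliente id='2'><nombre>Jon</nombre><ciudad>Bilbao</ciudad></cliente>"
    "</clientes>"
)
ROWS = [(1, "Ainhoa", "Donostia"), (2, "Jon", "Bilbao")]


def names_via_sql(city):
    """Relational path: the query is an SQL statement over table columns."""
    conn = sqlite3.connect(":memory:")  # stand-in for the RDBMS
    conn.execute(
        "CREATE TABLE clientes (id INTEGER PRIMARY KEY, nombre TEXT, ciudad TEXT)"
    )
    conn.executemany("INSERT INTO clientes VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(
        "SELECT nombre FROM clientes WHERE ciudad = ? ORDER BY id", (city,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


def names_via_xpath(city):
    """XML path: the query follows the document's own element hierarchy."""
    root = ET.fromstring(XML_DOC)  # stand-in for a document held in an NXD
    # Fine for this fixed sample; do not build paths from untrusted input.
    path = f".//cliente[ciudad='{city}']/nombre"
    return [node.text for node in root.findall(path)]


if __name__ == "__main__":
    print(names_via_sql("Bilbao"))
    print(names_via_xpath("Bilbao"))
```

Both functions answer the same question; the difference is that the SQL version needs the data to be laid out in columns first, whereas the XPath version navigates the document as it is.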

XML Data Management
In this article we investigate two different approaches to XML data management: one that uses conventional database techniques to deal with XML-formatted data (often called XML-enabled database systems), and an alternative that uses XML as its basic storage format (native XML database systems). For the first approach, XML-enabled databases, we used Oracle, a widely used commercial DBMS. For the second approach, we used eXist, an open source native XML database.

Oracle (XML-enabled)
Oracle (like other major RDBMS vendors) has extended its basic RDBMS functionality to include XML data management capabilities. This kind of database is called XML-enabled. With an XML-enabled database we can transfer data in both directions between relational tables and XML documents.
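
The two-way transfer can be sketched as follows. This is only an illustration of the idea, not Oracle's mechanism: sqlite3 and xml.etree.ElementTree from the Python standard library are used as stand-ins, and the clientes table layout and the function names make_db, rows_to_xml, and xml_to_rows are assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_db():
    """Create an empty in-memory table (hypothetical two-column layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clientes (id INTEGER PRIMARY KEY, nombre TEXT)")
    return conn


def rows_to_xml(conn):
    """Relational to XML: each row becomes a <cliente> element."""
    root = ET.Element("clientes")
    for cid, nombre in conn.execute("SELECT id, nombre FROM clientes ORDER BY id"):
        cliente = ET.SubElement(root, "cliente", id=str(cid))
        ET.SubElement(cliente, "nombre").text = nombre
    return ET.tostring(root, encoding="unicode")


def xml_to_rows(conn, xml_text):
    """XML to relational: each <cliente> element is mapped back to columns."""
    root = ET.fromstring(xml_text)
    rows = [
        (int(cliente.get("id")), cliente.findtext("nombre"))
        for cliente in root.findall("cliente")
    ]
    conn.executemany("INSERT INTO clientes VALUES (?, ?)", rows)


if __name__ == "__main__":
    source = make_db()
    source.executemany(
        "INSERT INTO clientes VALUES (?, ?)", [(1, "Ainhoa"), (2, "Jon")]
    )
    document = rows_to_xml(source)
    print(document)

    target = make_db()
    xml_to_rows(target, document)
    print(target.execute("SELECT * FROM clientes ORDER BY id").fetchall())
```

The mapping only round-trips cleanly here because the document has a fixed, flat structure; a document with a variable or deeply nested hierarchy would need a more elaborate mapping.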

There are two main solutions for storing XML in an Oracle XML-enabled database:

  1. The whole XML document is stored in an XMLType column as a CLOB (character large object) [8].
  2. The XML document's structure is mapped to database columns (XML SQL Utility).

eXist (native XML database)
Why store XML documents somewhere that has no XML structure (an RDBMS)? Why adapt the XML structure to an RDBMS when there are more natural ways to store XML documents? The open source eXist NXD will be the best solution in certain cases, which gives us a third solution:

  3. Open source eXist, a native XML database (NXD).

Let's analyze each solution's pros and cons.

XMLType:CLOB
The Oracle XMLType:CLOB stores the whole document, which is good because we don't lose context or the data hierarchy, and XPath queries are allowed. In addition, it allows us to use XPath syntax to update specific elements and attributes without rewriting the document. This solution is not as good as the object-relational solution at querying and updating data. Later, we will present the response times obtained with the XMLType query engine.

In previous database versions (Oracle 8i, for example) there was no XMLType; the only choice was a plain CLOB, and there was no way to update part of an XML document. It was necessary to replace the whole document, and XPath queries were not allowed. More recent Oracle RDBMS versions, such as 9i and 10g, support this kind of XML management much better. This has also become the most common solution for unstructured (document-centric) XML documents.

XMLType has predefined member functions to extract XML nodes and fragments. We use various functions such as extract( ) (which uses XPath to return fragments as XMLType) and getStringVal( ) (which returns the XMLType value as a string). These functions are embedded in SQL statements to provide XML functionality. For example:

Create table:
Create table clientes of xmltype;

Insert document into a table:
Insert into clientes values ('

Update a node in a document:
update clientes set object_value =
  updatexml(object_value, '//nombre/text()', 'Ainhoa');
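
For readers without an Oracle instance at hand, the behavior of extract( ), getStringVal( ), and updatexml( ) can be approximated with Python's standard xml.etree.ElementTree module. This is a rough analogue under stated assumptions, not Oracle's implementation: the helper names extract and update_text and the sample document are hypothetical, and because ElementTree's XPath subset has no text() step, the update targets the nombre element and sets its text.

```python
import xml.etree.ElementTree as ET

# Hypothetical stored document; ciudad is an invented sibling element.
DOC = "<cliente><nombre>Jon</nombre><ciudad>Bilbao</ciudad></cliente>"


def extract(doc, path):
    """Return the nodes matching `path` as serialized XML fragments.

    Loosely comparable to extract( ) followed by getStringVal( ).
    """
    root = ET.fromstring(doc)
    return [
        ET.tostring(node, encoding="unicode").strip()
        for node in root.findall(path)
    ]


def update_text(doc, path, value):
    """Set the text of every node matching `path`; other nodes are untouched.

    Loosely comparable to updatexml( ) with a '.../text()' target.
    """
    root = ET.fromstring(doc)
    for node in root.findall(path):
        node.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(extract(DOC, ".//nombre"))
    print(update_text(DOC, ".//nombre", "Ainhoa"))
```

Only the addressed node changes; the rest of the document, including its hierarchy, is carried over as it was, which is the property that makes XPath-based updates attractive for whole-document storage.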

More Stories By Ainhoa Serna

Ainhoa Serna is a lecturer in the Department of Informatics at ETEO (Mondragon Unibertsitatea). She has also worked as a project leader on engineering and research projects since 1995. Her research work is focused on XML.

More Stories By Jon Kepa Gerrikagoitia

Jon Kepa Gerrikagoitia is a lecturer in the Department of Informatics at ETEO (Mondragon Unibertsitatea). He has participated in a European research project and in engineering projects over the last 10 years. His research work is focused on XML Web Services.
