SQLX

This article was originally published in VSJ, which is now part of Developer Fusion.
SQLX is simple, highly intuitive and, in some development scenarios, is an ideal way of returning relational data. Another feature that most vendors support is XSL transformation of the XML result set prior to returning it to the requesting agent.

As the de facto standard for the interchange of data, XML has radically altered the information technology landscape in the last couple of years. It has left virtually nothing untouched, from tiny embedded devices that sport XML-RPC interfaces to large web servers, XML is now ubiquitous, and database servers are no exception. This should come as no surprise; given that both XML and databases are essentially ways of organising data in a structured manner, some way of bridging the two needed to be found:

“One of the most intriguing and urgent requirements to arise from the appearance of XML is a well-defined relationship between XML and SQL. Vast quantities of business data are currently stored in SQL database systems and great demand exists for the ability to present that data in XML form to various client applications.”
Special Interest Group on Management of Data, ACM

Why return XML?

There are several compelling reasons to return queries as XML rather than as traditional result sets. For one, XML provides an unequivocally high level of abstraction from any particular data source. Consumers of XML data could just as easily have received that data from a local file as from a web service located half way around the world, or in the context of this article, from an SQLX statement issued to the database.

Another compelling reason to return XML is that it is hierarchical, whereas normal result sets are not. Hierarchical data communicates the inter-relationships inherent in data much better than flat result sets. After factoring in what you can do with the returned XML in the context of supporting tools and technologies, having your database layer produce XML becomes a very interesting proposition.

Brief history of SQLX

“The SQLX Group” initially worked for some time to bridge the gap between the relational and XML worlds. As an industry consortium, it voted in October of 2000 to formalise its work with the International Committee for Information Technology Standards (INCITS) which maintains the SQL standard. The newly formed group became Task Group H2.3, whose mission became to extend the SQL standard to provide broad support for the production of XML from database queries. Member organizations include Oracle and IBM.

Not to be confused: XML DB and SQLX

A common source of confusion when first investigating SQLX is where exactly it fits into XML DB, as the two appear to be closely related at first glance. XML DB and SQLX are not to be confused. XML DB basically solves the problem of storing and retrieving XML documents to and from a relational database, which prior to XML DB had many different home-brewed solutions, none of which were optimal for all stakeholders. SQLX is simply a way of dressing up query results as XML, and also provides the facility to present results hierarchically.

XML DB puts a database face on XML-ised data for DBAs, enabling them to manage and configure the database as usual. For developers, XML DB provides both sophisticated query capabilities (using XPath and SQL semantics), and an XML-centric view of the relational data. Possibly the most interesting interface to XML DB is that provided to end-users, for whom it creates a transparent file/directory metaphor using WebDAV. This equates to end users being able to store and edit XML documents in what appears to be a folder. When an XML document is saved to it, the database decomposes the document and stores it relationally. When editing these virtual documents using their favourite editor, XML DB transparently propagates the edits back to the underlying schema. The best of all worlds is achieved for DBAs, developers and end users.

XML DB requires that the database be specially set up to store and retrieve XML documents and uses schema-to-column mappings to achieve this. SQLX on the other hand is totally transparent: SQLX statements are parsed by the database engine and the resulting XML is produced on the fly. No special preparation is required, and the database otherwise operates completely as normal.

The big difference between XML DB and SQLX is that SQLX statements may be issued against extant relational databases, so its use is totally unobtrusive. If you work with a database that supports SQLX then you can start to leverage these capabilities immediately.

SQLX enjoys a large support base which includes Sybase ASE 12.5, IBM’s DB2 8.1 and Oracle 9i R2, among others. If you work with SQL Server, you’ll have to investigate the FOR XML clause which is outside the scope of this article.

Impact on the bottom line

Returning query results as XML begs the question of how this impacts performance. Of course, definitively stating whether or not returning result sets as XML will have an impact on the performance of your particular system is pretty much impossible: there are too many design factors and scenarios to consider. The performance bottlenecks in SQLX statements are the same as that of their non-marked up cousins: expensive and unnecessary joins, poor indexing, lack of query tuning and database normalisation / denormalisation issues remain the top performance culprits.

The ability to return hierarchical data also subtly alters the playing field because it is possible to return more in one network round trip than would otherwise be sensible to return in a single trip as a flat result set. That is, network latency may impact performance more using a number of faster queries than using fewer, albeit slower ones. Either way, good architecture and design will always be paramount to ensuring the performance of a system irrespective of these details. I was however pleasantly surprised to learn just how performant SQLX is, especially when deployed server side wrapped in stored procedures.

My advice is to use SQLX where it makes sense and to address performance issues early. Using SQLX is not a panacea to development, it is another arrow for your quiver. If preliminary testing of an architectural mechanism yields unacceptable performance in the context of your requirements, change strategies as you would in any other development scenario. As usual, performance test early and regularly in your development cycle.

SQLX Syntax 101

The following example is based on Oracle 9i R2. If you use SQL*Plus (or better yet, SQL*Plus running within Emacs, which I highly recommend) then it’s a good idea to set pagesize to 0 to prevent XML results from being paged, and setting long to 500000 to ensure that your largest queries are returned without any concatenation.

SQLX syntax is intuitive and familiar because it is simply an extension to good old SQL. You indicate that you would like XML-ized result sets by using SQLX functions within normal SQL statements. We’ll begin with the canonical “Hello, World!”

This SQLX statement:

select
	XMLElement("SQLX", 'Hello, World!')
from
	dual;
produces this:
<SQLX>Hello, World!</SQLX>
This first example illustrates a couple of basics. Firstly, the structure of an SQLX statement is identical to a normal SQL select statement. Secondly, the only peculiarity is the XMLElement function (or more properly, “scalar constructor function”). In this instance, XMLElement takes two arguments, the name of an element to create (“SQLX”) and text for its child element, “Hello, World!”

XMLElement is an overloaded function, meaning that it can take different arguments depending on the kind of output you wish to produce. One such overload enables us to nest XMLElements within one another in order to produce hierarchical content. This behaviour may be illustrated as follows:

select
	XMLElement("Parent", XMLElement("Child", 'Child text'))
from
	dual;
Which produces this output:
<Parent>
	<Child>Child text</Child>
</Parent>
Similarly,
select
	XMLElement("Parent", XMLElement("Child",
		XMLElement("GrandChild", 'Grandchild text')))
from
	dual;
Produces:
<Parent>
	<Child>
		<GrandChild>Grandchild
		text</GrandChild>
	</Child>
</Parent>
It quickly becomes evident how nesting in SQLX statements mimics the structure and conventions of XML.

Scott and his pet tiger

Let’s move onto a more realistic application of these principles using the SCOTT example schema from Oracle. Suppose we wanted to return the list of departments in the DEPT table. Given our first investigations, we might be tempted to write something like the following:
select
	XMLElement("Departments",
		XMLElement("Dept",
			XMLElement("Number",
				DEPTNO),
			XMLElement("Name", DNAME),
			XMLElement("Location",
				LOC)))
from
	DEPT;
Unfortunately, that statement produces the following output:
<Departments>
	<Dept>
		<Number>10</Number>
		<Name>ACCOUNTING</Name>
		<Location>NEW YORK</Location>
	</Dept>
</Departments>
<Departments>
	<Dept>
		<Number>20</Number>
		<Name>RESEARCH</Name>
		<Location>DALLAS</Location>
	</Dept>
</Departments>
...
Look carefully at the resulting XML: what is the matter with it? The problem is that we have produced invalid XML: there are multiple instances of our document element, Departments! This is of course illegal in XML, and serves to illustrate that SQLX engines generally do not perform any error checking or validation on the output they produce. Indeed, it is incumbent upon you, dear programmer, to ensure that the SQLX statements that you author produce valid XML. Furthermore, if you are using schemas, then you are also responsible for returning well-formed XML (meaning that it conforms to the structure specified by the schema).

So, how do we aggregate all of the Dept elements under a single Departments document element in order to produce valid XML? By using XMLAgg:

select
	XMLElement("Departments", XMLAgg(
		XMLElement("Dept",
			XMLElement("Code", DCODE),
			XMLElement("Name", DNAME),
			XMLElement("Location",
				LOC))))
from
	DEPT;
The above statement produces the desired output:
<Departments>
	<Dept>
		<Number>10</Number>
		<Name>ACCOUNTING</Name>
		<Location>NEW YORK</Location>
	</Dept>
	<Dept>
		<Number>20</Number>
		<Name>RESEARCH</Name>
		<Location>DALLAS</Location>
	</Dept>
	...
</Departments>
So, XMLAgg takes multiple records and merges them. In this case, each retrieved department is emitted as a child of the Departments (document) element. That collection is returned as and considered by the SQLX engine as a single row. Remember this if you get a “Single row subquery returns more than one row” message from the DB: you may be missing an XMLAgg.

There are a couple of other things that might bother you about this last result set. Firstly, having to type in XMLElement for each column in a table is a nuisance. This isn’t so much of an issue with only three columns in the DEPT table, but if there were twenty columns to output then this statement would become rather verbose. XMLForest mitigates this by enabling multiple child elements to be emitted in a syntactically cleaner fashion than repeating XMLElement for each column. Contrast the following statement, which produces exactly the same output, with our previous effort:

select
	XMLElement("Departments", XMLAgg(
		XMLElement("Dept", XMLForest(
		DEPTNO "Number",
		DNAME "Name",
		LOC "Location"))))
from
	DEPT;
Another potential issue with the above statement is that it emits DEPTNO as a child element of Dept. You might construe DEPTNO to be metadata and thus want to emit it as an attribute child of the Dept node rather than as an element child. You can achieve this by using the XMLAttributes function:
select
	XMLElement("Departments", XMLAgg(
		XMLElement("Dept",
		XMLAttributes(DEPTNO "Number"),
		XMLForest(
		DNAME "Name",
		LOC "Location"))))
from
	DEPT;
So that we get:
<Departments>
	<Dept Number="10">
		<Name>ACCOUNTING</Name>
		<Location>NEW YORK</Location>
	</Dept>
	<Dept Number="20">
		<Name>RESEARCH</Name>
		<Location>DALLAS</Location>
	</Dept>
	...
</Departments>
XMLAttributes is named in the plural because you can specify multiple attributes in a single call; just separate each name/value pair with a comma.

As I stated earlier, one of the most powerful things about SQLX is the ability to return hierarchical data, which is not possible using regular SQL statements. Suppose that we wanted to return a list of employees by department; we could achieve that using the following query:

select
	XMLElement("EmployeesByDepartment",
		XMLAgg(XMLElement("Dept",
		XMLAttributes(DEPTNO "DeptNo"),
		XMLForest(
		D.DNAME "Name",
		D.LOC "Location"),
	(select
		XMLElement("Employees", XMLAgg(
		XMLElement("Employee",
		XMLAttributes(EMPNO "EmpNo"),
		XMLForest(
		E.ENAME "Name",
		E.JOB "Job",
		E.MGR "Manager",
		E.SAL "Salary",
		E.COMM "Comm"))))
	from
		EMP E
	where
		E.DEPTNO = D.DEPTNO))))
from
	DEPT D;
Which produces the following output:
<EmployeesByDepartment>
	<Dept DeptNo="10">
		<Name>ACCOUNTING</Name>
		<Location>NEW YORK</Location>
		<Employees>
		<Employee EmpNo="7782">
			<Name>CLARK</Name>
			<Job>MANAGER</Job>
			<Manager>7839</Manager>
			<Salary>2450</Salary>
		</Employee>
		<Employee EmpNo="7839">
			<Name>KING</Name>
			<Job>PRESIDENT</Job>
			<Salary>5000</Salary>
		</Employee>
		...
		</Employees>
	</Dept>
	...
<EmployeesByDepartment>
Notice again how the nested statement and bracketing naturally describes the structure of the output that we wish to produce. Also note the join condition that selects only those employees that work for the “current” outer enclosing department. So, in order to produce arbitrarily complex XML, you simply use nested queries.

Building complex hierarchies

The ability to arbitrarily structure hierarchical result sets incurs additional complexity. SQLX is bracket intensive and your first attempts at constructing complex levels of nesting can be quite frustrating. I have found that a good process pattern to construct complex statements is to work from the innermost level outwards. Begin by writing the SQLX to return the innermost level of nesting and build layers successively outwards. When you successfully add to the structure of your desired output, I suggest saving a version of the file as a single misplaced bracket can cause a lot of frustration. Also be sure to completely digest the preceding EmployeesByDepartment example as it illustrates how to nest these statements to any level and can serve as a template in that regard.

Consuming SQLX from stored procedures

Both Oracle and DB2 support marshalling XML-ized result sets back as CLOB fields. In Oracle, simply declare a parameter as in out CLOB and use getClobVal() to populate the variable within the procedure as in the following example:
procedure GetParentChild(
	XmlRes in out CLOB) is
begin
	select
		XMLElement("Parent",
		XMLElement("Child",
		'Child text')).getClobVal()
			into XmlRes
	from
		dual;
end;
Typecase the return value as a string and use it to populate a SOAP response, DOM Document, SAX parse it, transform it, dump it out to an ASP.NET web page… you get the idea. This flexibility is what the joy of SQLX is all about!

Conclusion

Having your database layer produce XML is a compelling proposition. Although not appropriate for all development scenarios, it is unrivalled for some. SQLX is an unobtrusive means of producing XML from your database layer and requires no special tools or techniques other than SQLX support. Happy SQLX-ing!


Edward Garson is a Senior Consultant with Dunstan Thomas, a company that transforms software development at a grassroots level through technology evangelism, architecture, design and process best practices. Edward specialises in OOA/D, UML, XML and uses both .NET and J2EE platforms. He may be reached at [email protected].

References

You might also like...

Comments

Contribute

Why not write for us? Or you could submit an event or a user group in your area. Alternatively just tell us what you think!

Our tools

We've got automatic conversion tools to convert C# to VB.NET, VB.NET to C#. Also you can compress javascript and compress css and generate sql connection strings.

“The generation of random numbers is too important to be left to chance.” - Robert R. Coveyou