Understanding XML Namespaces

Namespaces & Validation

The role namespaces play is most obvious when you are validating an XML document against an XML Schema Definition (XSD) schema. In case you are not familiar with XSD, it is an XML-based grammar for defining the document structures and data types that you use in your documents. You can think of XSD as a more expressive successor to Document Type Definitions (DTDs). You don’t need to know much about XSD for this article, and I’ll explain the little bit you do need to know.

Imagine you have an XML document that contains employee information for a human resources application:

<employees>
<employee>
    <id>49982</id>
    <name>Bart Simpson</name>
    <hireDate>2000-07-04</hireDate>
</employee>
<employee>
    <id>12345</id>
    <name>Hugo Simpson</name>
    <hireDate>2000-05-29</hireDate>
</employee>    
</employees>

You might create a schema for this document that defines a data type for the employee element like this:

<xsd:complexType name="employeeType">
<xsd:sequence>
    <xsd:element name="id" type="xsd:int"/>    
    <xsd:element name="name" type="xsd:string"/>            
    <xsd:element name="hireDate" type="xsd:date"/>
</xsd:sequence>
</xsd:complexType>
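<!-- This declaration belongs inside the content model of <employees>; maxOccurs is not allowed on a top-level element declaration -->
<xsd:element name="employee" type="employeeType" maxOccurs="unbounded"/>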

Without going into the details of XSD, the above snippet does two things: First, it defines a data type called employeeType that contains three elements: id, name and hireDate. Second, it declares an element called employee of the type employeeType. This is the XSD way of saying “the <employee> element will contain three elements in this order: <id>, <name>, and <hireDate>”.
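If it helps to see that rule spelled out procedurally, here is a minimal Python sketch of the same check. Python’s standard library does not include an XSD validator, so this is a hand-written approximation of employeeType and not real schema validation. The names EMPLOYEE_TYPE, check_employee and check_employees are ones I made up for illustration, and int, str and date.fromisoformat only roughly stand in for xsd:int, xsd:string and xsd:date.

```python
import xml.etree.ElementTree as ET
from datetime import date

EMPLOYEES_XML = """\
<employees>
<employee>
    <id>49982</id>
    <name>Bart Simpson</name>
    <hireDate>2000-07-04</hireDate>
</employee>
<employee>
    <id>12345</id>
    <name>Hugo Simpson</name>
    <hireDate>2000-05-29</hireDate>
</employee>
</employees>
"""

# (child element name, text converter) pairs, in the order of the xsd:sequence.
# Assumption: these converters only approximate the XSD built-in types.
EMPLOYEE_TYPE = [("id", int), ("name", str), ("hireDate", date.fromisoformat)]


def check_employee(employee):
    """Return True if one <employee> element matches the employeeType sketch."""
    children = list(employee)
    expected_names = [name for name, _ in EMPLOYEE_TYPE]
    if employee.tag != "employee" or [c.tag for c in children] != expected_names:
        return False
    try:
        # Each child's text must be convertible to its declared type.
        for child, (_, convert) in zip(children, EMPLOYEE_TYPE):
            convert(child.text or "")
    except ValueError:
        return False
    return True


def check_employees(xml_text):
    """Return True if every child of <employees> passes check_employee."""
    root = ET.fromstring(xml_text)
    return root.tag == "employees" and all(check_employee(e) for e in root)


print(check_employees(EMPLOYEES_XML))  # True
```

A document whose <employee> elements carry an extra child, or list the three children in a different order, fails this check, which is the same strictness a schema-aware validator applies to an xsd:sequence.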

You can use the schema snippet above to validate the employees document, and it’ll work just fine (provided you have a schema-aware validator such as XML Spy V3.5). Now imagine that the payroll application wants to share this employee information and add to it. For example, the payroll application wants to keep track of each employee’s salary and the taxes being deducted (this is way oversimplified, but who wants to learn the 2001 payroll tax laws?):

<employees>
<employee>
    <id>49982</id>
    <name>Bart Simpson</name>
    <hireDate>2000-07-04</hireDate>
    <salary>4000765.00</salary>
    <taxes>3980765.27</taxes>
</employee>
<employee>
    <id>12345</id>
    <name>Hugo Simpson</name>
    <hireDate>2000-05-29</hireDate>
    <salary>82000.00</salary>
    <taxes>16567.87</taxes>    
</employee>
</employees>

How should you handle this? Do you update the HR schema to reflect the new <salary> and <taxes> elements? That might seem like a good choice at first, but it leaves two separate applications sharing one schema document, which is likely to cause ownership and maintenance problems. It would be much better if you could separate the data types that belong to HR from the data types that belong to payroll, so that each application’s team controls its own data types with no risk of messing up the other team’s schema.

You can do that by simply placing those data types in different buckets when you define them. Those buckets are called namespaces. Let’s say you define a bucket, or namespace, called HRData and another one called payrollData. You can then put the payroll application team in charge of maintaining the data types in the payrollData namespace and the HR application team in charge of maintaining the types in the HRData namespace. You will need a way to indicate that the <salary> and <taxes> elements belong to the payrollData namespace while all other elements belong to the HRData namespace. To do this, you prefix each element name with the namespace name and a colon, like this:

<HRData:employees>
<HRData:employee>
    <HRData:id>49982</HRData:id>
    <HRData:name>Bart Simpson</HRData:name>
    <HRData:hireDate>2000-07-04</HRData:hireDate>
    <payrollData:salary>4000765.00</payrollData:salary>
    <payrollData:taxes>3980765.27</payrollData:taxes>
</HRData:employee>
<HRData:employee>
    <HRData:id>12345</HRData:id>
    <HRData:name>Hugo Simpson</HRData:name>
    <HRData:hireDate>2000-05-29</HRData:hireDate>
    <payrollData:salary>82000.00</payrollData:salary>
    <payrollData:taxes>16567.87</payrollData:taxes>        
</HRData:employee>
</HRData:employees>

Don’t be fooled by the apparent complexity of this snippet. All I did was add the HRData and payrollData prefixes before each element name. The snippet is not complete yet, though: a namespace-aware parser will reject a prefix that has not been declared. Declaring the prefixes is also a chance to shorten them, and I don’t know about you, but I’d rather keep namespace prefixes as short as possible. To do this, you come up with a short prefix, possibly as short as one letter, and map that prefix to the real namespace name. For example, you might decide to use py for payrollData and hr for HRData:

<hr:employees xmlns:hr="HRData" xmlns:py="payrollData">
<hr:employee>
    <hr:id>49982</hr:id>
    <hr:name>Bart Simpson</hr:name>
    <hr:hireDate>2000-07-04</hr:hireDate>
    <py:salary>4000765.00</py:salary>
    <py:taxes>3980765.27</py:taxes>
</hr:employee>
<hr:employee>
    <hr:id>12345</hr:id>
    <hr:name>Hugo Simpson</hr:name>
    <hr:hireDate>2000-05-29</hr:hireDate>
    <py:salary>82000.00</py:salary>
    <py:taxes>16567.87</py:taxes>    
</hr:employee>
</hr:employees>

The syntax for defining a namespace-prefix mapping is xmlns:prefix="namespace", where prefix is the short prefix you’ll use in the document and namespace is the actual namespace name that the prefix refers to. The mapping applies to the element that carries it and to everything inside that element, which is why the example declares both prefixes on the root element. Once you’ve defined the prefix, you can use it in your document instead of writing out the entire namespace name in front of each element name. When using namespaces, element and attribute names have two parts: the prefix (e.g., hr or py) and the local name (e.g., employee or salary). The two parts together form the qualified name, or QName (e.g., hr:employee or py:salary).
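To see that the prefix really is just shorthand, here is a small Python sketch using the standard library’s xml.etree.ElementTree, which reports each element as {namespace}localName. It parses a trimmed version of the document twice, once with the short prefixes and once with the long ones, and compares the resolved names. The xmlns declarations in the long-prefix version and the helper name expanded_names are my additions for illustration.

```python
import xml.etree.ElementTree as ET

SHORT_PREFIXES = """\
<hr:employees xmlns:hr="HRData" xmlns:py="payrollData">
<hr:employee>
    <hr:id>49982</hr:id>
    <py:salary>4000765.00</py:salary>
</hr:employee>
</hr:employees>
"""

# Same content with the long prefixes; the declarations are added here
# (an assumption) so that a namespace-aware parser accepts the prefixes.
LONG_PREFIXES = """\
<HRData:employees xmlns:HRData="HRData" xmlns:payrollData="payrollData">
<HRData:employee>
    <HRData:id>49982</HRData:id>
    <payrollData:salary>4000765.00</payrollData:salary>
</HRData:employee>
</HRData:employees>
"""


def expanded_names(xml_text):
    """List every element, in document order, as '{namespace}localName'."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


print(expanded_names(SHORT_PREFIXES))
# ['{HRData}employees', '{HRData}employee', '{HRData}id', '{payrollData}salary']
print(expanded_names(SHORT_PREFIXES) == expanded_names(LONG_PREFIXES))  # True
```

Both documents resolve to the same names because the parser replaces each prefix with the namespace name it maps to. The prefix itself carries no meaning beyond that mapping.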

Now you can easily create two different schemas, one that defines the HR types in the HRData namespace, and one that defines the payroll types in the payrollData namespace. The syntax you use to do this is part of XSD and is beyond the scope of this article.
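The same separation shows up when the document is processed: each application can address only the elements in its own namespace and ignore the rest. The following Python sketch, again using xml.etree.ElementTree, is one way that might look. The function names employee_names and total_salary and the NAMESPACES dictionary are hypothetical; the prefixes in that dictionary only need to match the paths in the code, not the prefixes in the document.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Prefix-to-namespace map used by the path expressions below (illustrative).
NAMESPACES = {"hr": "HRData", "py": "payrollData"}

SHARED_DOCUMENT = """\
<hr:employees xmlns:hr="HRData" xmlns:py="payrollData">
<hr:employee>
    <hr:id>49982</hr:id>
    <hr:name>Bart Simpson</hr:name>
    <hr:hireDate>2000-07-04</hr:hireDate>
    <py:salary>4000765.00</py:salary>
    <py:taxes>3980765.27</py:taxes>
</hr:employee>
<hr:employee>
    <hr:id>12345</hr:id>
    <hr:name>Hugo Simpson</hr:name>
    <hr:hireDate>2000-05-29</hr:hireDate>
    <py:salary>82000.00</py:salary>
    <py:taxes>16567.87</py:taxes>
</hr:employee>
</hr:employees>
"""


def employee_names(xml_text):
    """HR view: read only elements from the HRData namespace."""
    root = ET.fromstring(xml_text)
    return [name.text for name in root.iterfind("hr:employee/hr:name", NAMESPACES)]


def total_salary(xml_text):
    """Payroll view: sum the salary elements from the payrollData namespace."""
    root = ET.fromstring(xml_text)
    salaries = root.iterfind("hr:employee/py:salary", NAMESPACES)
    return sum((Decimal(salary.text) for salary in salaries), Decimal("0"))


print(employee_names(SHARED_DOCUMENT))  # ['Bart Simpson', 'Hugo Simpson']
print(total_salary(SHARED_DOCUMENT))    # 4082765.00
```

Neither function needs to change if the other team adds elements to its own namespace, which is the maintenance benefit the two-schema arrangement is after.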

About the author

Yasser Shohoud United States

Yasser started programming at the age of 12 when he wrote his first text-based game on a Commodore PET. He's since moved to IBM mainframes then to Microsoft technologies and has worked as System...
