OpenXML - Open for business

Introduction

This article was originally published on DNJ Online

The official name for the most recent version of Microsoft’s office productivity suite is not ‘Microsoft Office 2007’ but ‘the 2007 Microsoft Office system’. The new name is rather more cumbersome, but there is a reason for the change: with this release, Microsoft’s primary focus has been to provide a set of tools not just for producing stand-alone documents, but for helping you automate your office as a system.

In most offices documents are not just written: they are drafted, compiled, corrected, updated, checked, approved, authenticated, tracked, delivered, archived and retrieved. Standard paragraphs are pulled in from other documents; data is inserted from databases or spreadsheets; comments are added, accepted or rejected and then cleaned out. Documents, spreadsheets and slide presentations have always had a lifecycle and the 2007 Microsoft Office system has the management of this lifecycle at its heart.

A central component here is the Ecma Office Open XML file format that is now native to Office Word, Excel and PowerPoint 2007. As we have seen in our earlier articles, this is an open format, based on established industry standards in XML and ZIP, which has been designed around the notion that third-party applications should be able to access and edit parts within the document that are specific to their needs, without having to parse the whole document.
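To make the idea of reaching a single part concrete, here is a minimal sketch using Python's standard zipfile module. The part names and the tiny package built here are stand-ins for illustration, not a complete Word document; the point is only that one part can be decompressed and read without touching the others.

```python
import io
import zipfile


def build_demo_package() -> bytes:
    """Create a tiny stand-in package in memory (not a complete Word document)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("word/document.xml", "<document>large main part</document>")
        # Hypothetical application-specific part.
        z.writestr("customXml/item1.xml", '<order id="42"/>')
    return buf.getvalue()


def list_parts(package: bytes) -> list[str]:
    """Return the names of all parts, read from the ZIP directory only."""
    with zipfile.ZipFile(io.BytesIO(package)) as z:
        return z.namelist()


def read_part(package: bytes, part_name: str) -> bytes:
    """Decompress and return a single part; the other parts are not read."""
    with zipfile.ZipFile(io.BytesIO(package)) as z:
        return z.read(part_name)
```

A third-party tool that only cares about its own data could call read_part(package, "customXml/item1.xml") and never open word/document.xml at all.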

Until recently, Microsoft Office has used a proprietary data format that can only be manipulated by the associated Office application. Office 2003 did introduce WordprocessingML and SpreadsheetML, but these essentially saved the document as a single flat file of XML data, without the internal structure of Open XML. Furthermore, they had yet to be ratified as industry standards, which meant third parties were reluctant to commit fully to their use for fear Microsoft might later make changes that could damage their applications.

This was the thinking behind the European Commission’s call in May 2004 for Microsoft to submit its XML formats “to an international standards body of their choice”, stating: “... standardisation initiatives will ensure not only a fair and competitive market but will also help safeguard the interoperability of implementing solutions whilst preserving competition and innovation.” Now that Open XML is an industry standard, developers can work with it with greater confidence.

OpenXML relations

The internal structure of the Open XML format makes it possible to change the content of a document without having to parse the main document part itself. By editing the appropriate relationship we can swap Part1.xml content for Part2.xml within the document.
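The swap described above can be sketched with nothing more than a ZIP library and an XML parser. The following is an illustration under stated assumptions rather than a production tool: the relationship Id "rId1", the relationship type URI and the demo package contents are all hypothetical, and only Part1.xml and Part2.xml come from the example above. Note that the main document part is copied byte for byte and never parsed.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by relationship parts in the packaging conventions.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def build_demo_package() -> bytes:
    """Build a stand-in package whose main part is related to Part1.xml."""
    rels = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<Relationships xmlns="{REL_NS}">'
        # Hypothetical Id and Type, for illustration only.
        '<Relationship Id="rId1" Type="http://example.com/fragment" '
        'Target="Part1.xml"/>'
        "</Relationships>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("word/document.xml", "<document/>")
        z.writestr("word/_rels/document.xml.rels", rels)
        z.writestr("word/Part1.xml", "<fragment>one</fragment>")
        z.writestr("word/Part2.xml", "<fragment>two</fragment>")
    return buf.getvalue()


def retarget_relationship(package: bytes, rels_part: str,
                          rel_id: str, new_target: str) -> bytes:
    """Return a copy of the package with one relationship retargeted."""
    ET.register_namespace("", REL_NS)  # keep the default namespace on output
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            data = src.read(name)
            if name == rels_part:
                # Only the small relationship part is parsed and edited.
                root = ET.fromstring(data)
                for rel in root.findall(f"{{{REL_NS}}}Relationship"):
                    if rel.get("Id") == rel_id:
                        rel.set("Target", new_target)
                data = ET.tostring(root, encoding="utf-8",
                                   xml_declaration=True)
            dst.writestr(name, data)
    return out.getvalue()
```

Calling retarget_relationship(package, "word/_rels/document.xml.rels", "rId1", "Part2.xml") yields a new package in which the main part now pulls in Part2.xml. ZIP archives cannot be edited in place with this module, so the sketch copies every entry into a fresh archive.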

Direct access

The ability to work with Office documents without having to load the associated application can bring huge benefits in itself. Microsoft Program Manager Brian Jones, whose blog is well worth reading, cites a bank that uses Word 2000 to generate the paperwork for loan agreements. These agreements are constructed from a set of document fragments according to a set of rules. The bank currently has an installation of some 70 servers, each running Word 2000, which between them churn out thousands of such documents a year.

Running Word or Excel unattended on a server so that you can manipulate Office documents through their APIs using COM Interop has become common practice, although it is fraught with problems. Such a solution is resource-intensive, and the applications can crash, so they need monitoring and automatic restarts. Furthermore, it is not a practice that is supported by Microsoft.

Using Open XML, the bank’s requirements could be satisfied with a relatively small .NET application. Thanks to the use of relationship parts by Open XML, the application would not necessarily have to know anything about WordprocessingML to generate documents that are perfectly readable from Word 2007 or, if the Compatibility Pack has been installed, any version from Word 97 onwards. Furthermore, the same throughput could probably be handled by a single machine.
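As a rough illustration of how little is needed to produce a document without Word on the server, the sketch below assembles a minimal package from text fragments chosen by rules. It is written in Python rather than .NET purely for brevity. The loan rules, field names and fragment wording are invented for the example, and for simplicity it does emit a few lines of WordprocessingML directly rather than relying on relationships alone. The three parts written here are, to the best of our knowledge, the minimum a word-processing package needs, but treat this as a sketch and not a reference implementation.

```python
import io
import zipfile
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)

ROOT_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns='
    '"http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    "</Relationships>"
)


def select_fragments(loan: dict) -> list[str]:
    """Hypothetical business rules that pick the paragraphs to include."""
    chosen = ["Loan agreement", f"Amount borrowed: {loan['amount']}"]
    if loan.get("secured"):
        chosen.append("This loan is secured against the borrower's property.")
    chosen.append("Terms & conditions apply.")
    return chosen


def build_docx(paragraphs: list[str]) -> bytes:
    """Write one plain paragraph per fragment into a minimal package."""
    body = "".join(
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>"
        for text in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("[Content_Types].xml", CONTENT_TYPES)
        z.writestr("_rels/.rels", ROOT_RELS)
        z.writestr("word/document.xml", document)
    return buf.getvalue()
```

Writing the result of build_docx(select_fragments({"amount": 1000, "secured": True})) to a file with a .docx extension should give something Word 2007 can open, although we have only checked the structure here, not every Word version.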

BPM Suite 4.5 is the latest version of Bluespring Software’s .NET application for designing, managing and monitoring business processes. Thanks to Open XML, this version no longer requires Microsoft Office to be installed alongside BPM Engine on the server, and users no longer need to include proprietary Bluespring tags in their documents. The suite uses Open XML to create Excel 2007 and Word 2007 documents on the fly and tailor them appropriately.

Mindjet makes MindManager, a package that you use to create visual ‘maps’ for brainstorming and visualising business processes. MindManager has long had the ability to export its maps as Microsoft Word documents or PowerPoint presentations. However, it has been limited in its ability to take changes made in Word or PowerPoint and incorporate them back into its maps.

The Open XML format makes this much easier as MindManager can directly generate and read the necessary documents – indeed Richard Barber of Mindjet told us that tests had shown they could create Word 2007 documents ten times faster using Open XML than using the previous COM interface. Furthermore, MindManager Pro 6 is able to use Open XML to create ‘roundtrip’ buttons that are added to the Office 2007 ribbon menu when the document is opened (see ‘Customising the user interface’).

The Word 2007 Map Editor for Mindjet MindManager actually embeds the MindManager map as a custom XML part within a macro-enabled Open XML document. In addition to this are parts defining the new controls that are to appear in the Word 2007 ribbon, the macros needed to make them work, and an XSL transformation that updates the embedded map by merging it with the Word document.
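Embedding application data as a custom XML part can be sketched along the following lines. This is not Mindjet's implementation: the part name, the payload and the demo package are assumptions made for illustration. The sketch assumes the package already declares a default content type for the .xml extension, and it leaves out the ribbon definition, macros and XSL transformation mentioned above, as well as the properties part that Word itself normally writes alongside a custom XML item.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
CUSTOM_XML_REL = (
    "http://schemas.openxmlformats.org/officeDocument/2006/"
    "relationships/customXml"
)
DOC_RELS = "word/_rels/document.xml.rels"


def add_custom_xml_part(package: bytes, payload: bytes,
                        part_name: str = "customXml/item1.xml") -> bytes:
    """Return a copy of the package with payload added as a custom XML part
    and a relationship to it from the main document part."""
    ET.register_namespace("", REL_NS)  # keep the default namespace on output
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        names = src.namelist()
        if part_name in names:
            raise ValueError(f"{part_name} already exists in the package")

        # Load the main part's relationships, or start an empty set.
        if DOC_RELS in names:
            root = ET.fromstring(src.read(DOC_RELS))
        else:
            root = ET.Element(f"{{{REL_NS}}}Relationships")

        # Pick the first relationship Id that is not already in use.
        used = {rel.get("Id") for rel in root}
        n = 1
        while f"rId{n}" in used:
            n += 1
        ET.SubElement(root, f"{{{REL_NS}}}Relationship", {
            "Id": f"rId{n}",
            "Type": CUSTOM_XML_REL,
            # Targets are relative to the folder of the source part (word/).
            "Target": "../" + part_name,
        })

        # Copy every other entry unchanged, then write the two new ones.
        for name in names:
            if name != DOC_RELS:
                dst.writestr(name, src.read(name))
        dst.writestr(DOC_RELS, ET.tostring(root, encoding="utf-8",
                                           xml_declaration=True))
        dst.writestr(part_name, payload)
    return out.getvalue()
```

Given the bytes of an existing package, add_custom_xml_part(package, b"<map><topic>Launch plan</topic></map>") returns a new package carrying the map data alongside the document, where an add-in or external tool can find it through the relationship.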
