Domain-specific modeling for generative software development

This article was originally published in VSJ, which is now part of Developer Fusion.
Generative programming is about bringing the benefits of automation to software development. Currently, software development is too often regarded as a generic, labour-intensive process in which applications are basically coded from scratch using tools and methods that can be used for any type of software. The outcome of this approach is often characterised by missed deadlines, defects, unwanted functionality and completely failed projects, which shows that in most organizations software development is still a very immature process. This article introduces domain-specific modeling (DSM), a new approach that uses generative programming to improve the quality of the software development process and drastically increase developer productivity.

The largest productivity boost software developers have seen was the step from assembly language to third-generation languages like FORTRAN and C. Since then, newer programming languages have only had a minor impact on the quality and speed of the development process[1]. Key to the 400% productivity improvement achieved with the move towards 3GL was automation. These newer languages allowed developers to specify their solution on a higher level of abstraction. A compiler automatically generates the lower level assembler. Domain-specific modeling uses a similar approach to improve the software or system development process.

Domain-specific modeling for full code generation

Normally, the first step in application or system development is to sketch the solution, using and thinking in concepts that relate directly to the problem domain. After that it is common these days to model the solution in more detail in UML, a visual modeling language that uses generic programming-language concepts to make a blueprint of an application. Programmers then take these specifications and specify the application once more, this time in a programming language. Double work, you could say, and more importantly, difficult work, since programming languages do not lend themselves very well to this task: they offer only a limited, rigid syntax.

One way of further raising the level of abstraction is to move away from specifying an application in programming-language concepts. Instead, each company could specify applications using the concepts and rules of the products it makes, i.e. by basing the design method directly on the underlying problem domain. Take mobile phone applications, for example. Developing applications in this problem domain with domain concepts like “menu”, “notification” and “send SMS” is much easier and faster than doing it with classes, attributes and methods.

Figure 1: Example of defining a mobile phone application with a domain-specific modeling language. The model only uses concepts from the mobile phone domain, i.e. all coding concepts are hidden. Still the model is complete enough to generate the full application code automatically.

Figure 1 shows a design of an application that uses domain concepts from the mobile phone world. Those who are familiar with these concepts can easily understand what the application does: it allows someone to register for the DevWeek 2005 conference via a mobile phone. In this design all code elements are hidden; in fact, one does not even have to know what code is being generated or what the rules of the underlying platform are! Only familiarity with the problem domain is required to build applications or systems this way. The same holds for developing applications or configuring systems in many other domains. For example, IT workflow management systems could be modeled, or configured, with a higher-abstraction business process modeling language, and data security design could be defined using concepts like “Right”, “Role” and “Authentication”.

This approach of using so-called ‘domain concepts’ directly as programming-language concepts is called domain-specific modeling (DSM). With DSM the lower-level program code or configuration files can be generated automatically from the higher-level specification, which eliminates the double work. When applying DSM, a company’s domain expert creates a modeling language that is more suitable for developing applications in the company’s problem domain than an all-purpose notation. The expert encapsulates knowledge about the domain concepts, their properties and the rules that apply to them in the modeling language. The result is a domain-specific but very expressive modeling language, which makes it easier to specify an application completely with less modeling. The model presented in Figure 1, for example, contains enough information to generate the complete application automatically. Additionally, the encapsulation of the expert’s domain knowledge prevents developers from making illegal or unwanted designs. By comparison, UML, like current programming languages and assembly language, contains no rules that are specific to any product or system. Ten years of UML have also shown that full code generation is not possible with it; it is simply too generic for this purpose.
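As a rough illustration of the idea, the following Python sketch pairs a toy modeling language for the mobile phone domain with a generator. Everything in it is an assumption made up for this example: the element types (`Menu`, `Notification`, `SendSMS`), the rules checked by `validate`, the imaginary `phone` API targeted by `generate`, and the sample model itself. It does not reflect the model in Figure 1, MetaEdit+ or any real phone platform.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Notification:
    """Shows a text to the user, then continues to `next` (if any)."""
    name: str
    text: str
    next: Optional[str] = None


@dataclass
class SendSMS:
    """Sends a text message to a number, then continues to `next`."""
    name: str
    number: str
    message: str
    next: Optional[str] = None


@dataclass
class Menu:
    """Lets the user pick one choice; each choice leads to another element."""
    name: str
    choices: Dict[str, str] = field(default_factory=dict)  # label -> target name


Element = Union[Notification, SendSMS, Menu]


def validate(model: List[Element]) -> List[str]:
    """Check domain rules that the language enforces (assumed for illustration)."""
    names = {e.name for e in model}
    errors = []
    for e in model:
        if isinstance(e, Menu):
            if not e.choices:
                errors.append(f"menu '{e.name}' has no choices")
            targets = list(e.choices.values())
        else:
            targets = [e.next] if e.next else []
            if isinstance(e, SendSMS) and not e.number.isdigit():
                errors.append(f"SMS '{e.name}' has an invalid number")
        for t in targets:
            if t not in names:
                errors.append(f"'{e.name}' points to unknown element '{t}'")
    return errors


def generate(model: List[Element]) -> str:
    """Emit code for an imaginary `phone` API; refuses invalid models."""
    errors = validate(model)
    if errors:
        raise ValueError("; ".join(errors))
    lines = []
    for e in model:
        lines.append(f"def {e.name}():")
        if isinstance(e, Notification):
            lines.append(f"    phone.show({e.text!r})")
        elif isinstance(e, SendSMS):
            lines.append(f"    phone.send_sms({e.number!r}, {e.message!r})")
        else:
            lines.append(f"    choice = phone.menu({list(e.choices)!r})")
            for label, target in e.choices.items():
                lines.append(f"    if choice == {label!r}: return {target}()")
        if not isinstance(e, Menu) and e.next:
            lines.append(f"    return {e.next}()")
        lines.append("")
    return "\n".join(lines)


# A hypothetical registration flow expressed only in domain concepts.
registration = [
    Menu("start", {"Register": "register", "Cancel": "bye"}),
    SendSMS("register", "12345", "REG DevWeek", next="done"),
    Notification("done", "Registration sent"),
    Notification("bye", "Cancelled"),
]
print(generate(registration))
```

The modeler never writes a function or a call to the platform. The rules live in `validate`, so a menu without choices or a link to a missing element is rejected before any code is produced.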

Total control over code generation

Generating full code from high-level models is a matter of tool support and domain expertise. Currently there is a large offering of modeling tools. Most of these support UML and can translate and synchronize between program code and diagrams in both directions. Often the produced code is only partial and lacks quality because these tools support an all-purpose modeling language that was never designed for generative purposes (UML was developed for documentation). Furthermore, UML tools offer fixed code generators that try to fit all situations, so you end up compromising on quality. Code is written differently in different domains and companies for good reasons. A vendor-proprietary code generator that targets many situations is thus not an optimal solution.

To generate full, high-quality code from high-level design models we need to work the problem from both sides. DSM tools let the domain experts design both the modeling language and the code generators, so that both fit the requirements the situation imposes. An environment that supports both allows developers to design applications and generate code in the way the domain experts have defined to be correct. This approach takes away a common concern regarding automatic code generation: losing control of the produced code. Domain-specific code generators can produce high-quality, highly readable code that can be reviewed in the traditional way when necessary. The goal, however, is to optimise generated code or configuration files by tweaking the generators and the modeling language. Manually editing generated code and then reverse engineering it is not a workable solution. Have you ever seen anyone hand-edit assembler code and then try to keep the C code in sync with it? Maybe you are one of the few who have, but do you also consider it an optimal solution?
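To make the point about tuning the generator rather than its output more concrete, here is a minimal sketch using Python's standard `string.Template`. The template contents, the C-like target functions (`log_event`, `ui_show_text`) and the tiny model are all hypothetical, chosen only to show the mechanism.

```python
from string import Template

# Template owned by the company's domain expert, not by a tool vendor.
# The coding conventions it encodes (logging call, naming) are assumptions.
NOTIFY_TEMPLATE = Template(
    "void ${name}(void)\n"
    "{\n"
    "    log_event(\"${name}\");\n"
    "    ui_show_text(\"${text}\");\n"
    "}\n"
)

# A later revision of the generator: the expert decides logging is not needed.
LEAN_TEMPLATE = Template(
    "void ${name}(void)\n"
    "{\n"
    "    ui_show_text(\"${text}\");\n"
    "}\n"
)


def generate_notification(name: str, text: str,
                          template: Template = NOTIFY_TEMPLATE) -> str:
    """Fill the expert's template from one model element."""
    return template.substitute(name=name, text=text)


# The model stays the same across both generation runs.
model = [("welcome", "Welcome to DevWeek"), ("done", "Registration sent")]

# First generation run with the original template.
print("".join(generate_notification(n, t) for n, t in model))

# Regenerating with the revised template updates every function consistently;
# nobody edits the generated output by hand.
print("".join(generate_notification(n, t, LEAN_TEMPLATE) for n, t in model))
```

Because the change is made once in the template, every generated function follows the new convention after the next run, and there is no hand-edited output to keep in sync with the models.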

Not the end of programming

So where does this leave the skilled programmer? Have the skills obtained over time suddenly become obsolete? No, the programmer’s role will improve as routine and repetitive tasks are minimised, leaving room to focus on the actual problem. We should apply the skills of the few very good programmers to make it easier for the rest! One of the tasks that skilled programmers are well qualified for is creating and maintaining the platform, code generators and domain framework. Such a domain framework aims at making it easier for the generated code to interface with the platform by raising the level of abstraction from the platform side upwards[2, 3] (Figure 2). Additionally, it works as a buffer between the DSM language and generator on one side and the platform on the other, making the former less sensitive to changes in the latter.
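The buffering role of a domain framework can be sketched as a thin layer that offers domain-level operations and hides the platform behind them. In the Python sketch below, `PlatformV1`, `PlatformV2`, `DomainFramework` and `generated_app` are hypothetical stand-ins; no real platform API is implied.

```python
class PlatformV1:
    """Stand-in for a vendor platform API (hypothetical)."""

    def __init__(self):
        self.calls = []

    def draw_dialog(self, title, body):
        self.calls.append(("draw_dialog", title, body))


class PlatformV2:
    """A later platform release with a changed API (hypothetical)."""

    def __init__(self):
        self.calls = []

    def show_popup(self, body, *, title=""):
        self.calls.append(("show_popup", body, title))


class DomainFramework:
    """Offers domain-level operations and hides platform details."""

    def __init__(self, platform):
        self.platform = platform

    def notify(self, text):
        # Only this method knows which platform call to use.
        if hasattr(self.platform, "show_popup"):
            self.platform.show_popup(text, title="Info")
        else:
            self.platform.draw_dialog("Info", text)


def generated_app(fw):
    """What generator output might look like: domain-level calls only."""
    fw.notify("Registration sent")


# The same generated code runs unchanged against both platform versions.
for platform in (PlatformV1(), PlatformV2()):
    generated_app(DomainFramework(platform))
    print(platform.calls)
```

When the platform API changes, only `DomainFramework.notify` has to be adapted; the generator and the models that feed it are untouched.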

Figure 2: With DSM a code generation framework allows working on higher levels of abstraction and focuses on full code generation. MDA focuses on model-to-model-to-code translations.

Over time, changes will happen, since a company’s domain, platform, and product and system requirements evolve. As a consequence, tools, languages, models and code must evolve with them. Keeping the automation mechanism up to date with these changes is a challenge for programmers and experts.

If there are changes in the underlying platform or environment, these may well have an effect on the code generation and possibly on the modeling language. In this case only the expert needs to make the changes; the developers’ models and code will update automatically. This formalisation of development makes the process a lot safer and more consistent. Working on a higher abstraction level makes the process more agile at the same time.

How does this differ from MDA?

The Object Management Group (OMG) promotes Model-Driven Architecture (MDA), a generative method that comes down to transforming one UML model into another, possibly several times and possibly automatically. At each stage you edit the models in more detail and in the end you generate substantial code from the final model (see Figure 2). These translations are largely in the hands of the tool vendor and will therefore not be optimal for each specific case. The main difference, however, concerns when to apply which method. DSM requires domain expertise, a capability a company can achieve only by working consistently in the same application domain. Such companies are typically product or system development houses rather than project houses. Devices with embedded software in the automotive and medical segments come to mind immediately, but the configuration of CRM, ETL, Business Process Management and workflow systems is, like many others, also a good candidate domain. In these situations, companies get significantly more benefit by adopting DSM instead of MDA. Industrial experiences consistently show productivity improvements of between 500% and 1000%[4]. Some reports from early implementations of MDA quote 35% at most[5]. MDA requires a profound knowledge of its methodology, something which is external to the company and has to be gained by experience. Whereas domain expertise is usually already available and applied in an organization, MDA expertise usually has to be found outside it.

Why full code generation?

DSM advocates full code generation since this concept has always been behind every successful shift of productivity in software development. We use compilers today only because they generate full assembler code. If they did not, we would be forced to maintain the same information in two places, which is always a recipe for trouble. Yet this is the idea the OMG is selling with MDA! MDA requires reverse engineering to keep code, models and documentation synchronized (Figure 2), not only in the design phase but also during the maintenance phases. This creates a large amount of overhead, which diminishes the benefits it promises. If domain expertise is available in an organization, due to repetitive development of product or system variants, the choice between MDA and DSM really is a no-brainer.

DSM adoption getting easier

Since the late 1990s, innovative companies from different market segments have implemented DSM with remarkable results: Nokia, Lucent and USAF report productivity improvements of 5 to 10 times compared to manual programming. Building tools that support DSM languages and code generators has been a barrier that limited DSM to those with enough resources to invest a few man-years in it. More recently, open and customizable modeling environments that support DSM have appeared, such as meta-CASE tools and IDEs like Eclipse EMF & GEF. MetaEdit+, from the company I work with, has been used by a large number of companies to implement DSM for several years already. Most notably, Microsoft has invested heavily in domain-specific modeling languages with the announcement of its Software Factories approach. These tools make it a lot easier for companies to adopt DSM.

Generative programming with DSM re-uses the domain knowledge of experts and the skills of the better developers in an organisation to help the rest deliver better quality in less time. As with the introduction of third-generation programming languages, it uses automation to obtain an order-of-magnitude improvement in developer productivity. It’s not about turning developers into robots, as the rather poor choice of the name ‘Software Factories’ for DSM by the guys from Redmond might make you think. Instead it is about giving them a better, higher-quality way to build software. Additionally, it improves the maturity of the software development process with more consistency and quality.


Martijn Iseger works with MetaCase as sales manager, and is responsible for evangelising DSM.

[1] Capers Jones, Software Productivity Research, 2001
[2] Jack Greenfield and Keith Short, Software Factories: Assembling Applications with Patterns, Frameworks, Models & Tools
[3] Juha-Pekka Tolvanen, Making model-based code generation work, Embedded Systems Europe, August/September 2004
[4] DSMforum
[5] Compuware press room
