richpaige — The JOT Blog

2013/01/25

Lies, Damned Lies and UML2Java

Filed under: Column — richpaige @ 11:40

We review far too many research papers for journals and conferences. (Admittedly, we probably write too many papers as well, but that’s another story.) We regularly encounter misunderstandings, misconceptions, misrepresentations and plain old-fashioned errors related to Model-Driven Engineering (MDE): what it is, how it works, what it really means, what’s wrong with it, and why it’s yet another overhyped, oversold, overheated idea. Some of these misunderstandings are annoyingly common for us to want to put them down on the digital page and try to address them here. Perhaps this will help improve research papers, or it will make reviewing easier; perhaps it will lead to debate and argument; perhaps this list will be consigned to an e-bin somewhere.

Our modest list of the ten leading misconceptions — which is of course incomplete — is as follows.

1. MDE = UML

At least once a year we read an article or blog post or paper that assumes that MDE is equivalent to using UML for some kind of systems engineering. This is both incorrect and monotonously boring. The reality is that MDE neither depends on, or implies the use of UML: the engineering tasks that you carry out with MDE can be supported by any modelling language that (a) has a metamodel/grammar/well-defined structure; and (b) has automated tools that allow the construction and manipulation of models. Using UML does not mean you are doing MDE — you might be drawing UML diagrams as rough sketches, or to enable simulation/analysis, or for conceptual modelling. Doing MDE does not mean you must be using UML: you could be using your own awesome domain-specific languages, or another general-purpose language that has nothing to do with UML.

We have noticed that this misunderstanding appears less frequently today than it did five years ago; perhaps the message is slowly getting through. The misunderstandings might have started because of the way in which we often introduce MDE to students: conceptual or design modelling with UML is often the first kind of modelling that students see.

So, the good news is that if you’re doing MDE, you don’t have to use UML; and if you are using UML, you don’t have to do MDE. The bad news is that there are many other misconceptions out there waiting to pounce. We are just getting started.

2. MDE = UML2Java

Code generation is often the first use case that’s thought of, mentioned, dissected and criticised in any technical debate about MDE. “You can generate code from your models!” is the cry of the tool vendor. This is usually followed by the even more thrilling: “you can generate Java code from your UML models!” As exciting a prospect as this is, the overemphasis of code generation in discussions of MDE has led to the myth of the UML-to-Java transformation, and that it is the sole way of doing MDE. Without doubt, this is a legitimate MDE scenario that has been applied successfully many times. But as we mentioned earlier, you do not have to use UML to do MDE. Similarly, you don’t have to target Java via code generation to do MDE. Indeed, there is a veritable medley of programming languages you can choose! C#, Objective-C, Delphi, C++, Visual Basic, Cobol, Haskell, Smalltalk. All of these exciting languages can be targeted from your modelling languages using code generators.

It would be much more interesting to read about MDE scenarios that don’t involve the infamous UML2Java transformation — there are undoubtedly countless good examples that are out there. It’s always helpful to have a standard example that everyone can understand, but eventually a field of research has to move beyond the standard, trivial examples to something more sophisticated that pushes the capabilities of the tools and theories.

3. MDE ⇒ code generation

But what if you don’t care about code generation? Clearly you are a twisted individual: if you’re doing MDE you must be generating code, right? Wrong! Code generation — a specific type of model-to-text transformation — from (UML, DSML) models is just another legitimate MDE scenario. Code may not be a desirable visible output from your engineering process. You may be interested in constructing and assessing the models themselves — producing a textual output may not deliver any value to you. You may be interested in generating text from your models, but not executable code (e.g., HTML reports, input to verification tools). You may be interested in serialising your models so as to persist them in a database or repository.

However, if you are generating code from models, you are probably applying a form of MDE (the nuance is really whether your models have a precisely defined structure [metamodel] and whether or not your code generators are externalised — and can be reused).

4. MDE ⇒ transformation.

We’ve established that MDE is more than code generation. MDE is also about more than transformation.

Some problems cannot be easily solved with transformation. As advocates of MDE do we pack our bags and look for furrows that we can plough with model transformation techniques? Or can MDE still be of use?

Supporting decision making — helping stakeholders to reason about trade-offs between competing and equally attractive solutions to a problem — is an area in which models and MDE are increasingly used. (See the wonderful world of enterprise architecture modelling for examples). Code, software or computer systems are not necessarily central to these domains, and transformation does little more for us than produce a nicely formatted report. Instead, we need to consider exploiting other state-of-the-art software engineering techniques alongside typical MDE fare. Perhaps search-based software engineering (i.e. describing what a solution looks like) is preferable to model transformation (i.e. describing how an ideal solution is constructed) in some cases. We have done work in this area at our university [DOI: 10.1007/978-3-642-31491-9_32], and there is growing interest in this topic.

Transformation is powerful. Refactoring, merging, weaving, code generating and many other exciting verb-ings would not be possible without transformation theory and tools. However, models are ripe for other types of analysis and decision support and for these tasks, transformation is often not the right approach. In 2003 model transformation was characterised as the heart-and-soul of MDE. In 2013 we believe that a more well-rounded view is preferable.

5. “The MDE process is inflexible.”

This was an actual quote from a paper we once had to review for a conference. It was both a strange sentence and an interesting one, because we didn’t know what it meant. Just what is “the MDE process”? Did we miss the fanfare associated with its announcement? Arguably “process” and MDE are orthogonal: if you are constructing well-defined models (with metamodels) and using automated tools to manipulate your models (e.g., for code generation) then you are carrying out MDE; the process via which you construct your models and metamodels and manipulate your models is largely independent. You could apply the spiral model, or V-model, or waterfall. You could embed, within one of these processes, the platform-independent/platform-specific style of development inherent in approaches like Model-Driven Architecture (MDA). There is no MDE process, but by carrying out MDE you are likely to follow a process, which may or may not be made explicit.

6. MDE = MOF/Ecore/EMF

You must conform to the Eclipse world. Or the OMG world. You must define your models and metamodels with MOF or Ecore. You will be assimilated.

This is, of course, nonsense. MOF and Ecore are perfectly lovely and useful metamodelling technologies that have served numerous organisations well. But there are other perfectly lovely and useful metamodelling technologies that work equally well, such as GOPRR, or MetaDepth, or even (shock horror) pure XML. Arguably, the humble spreadsheet is the most widely used and most intuitive metamodelling tool in the world.

MDE has nothing to do with how you encode your models and metamodels; it has everything to do with what you do with them (manipulate them using automated tools; build them with stakeholders). Arguably, you should be able to do MDE without worrying about how your models are encoded — a principle that we have taken to heart in the Epsilon toolset that we have developed at our university.

7. Model transformation = Refinement

Refinement is a well-studied notion in formal methods of software engineering: starting from an abstract specification, you successively “transform” your specification into a more concrete one that is still semantics-preserving. In some formal methods, the transformations that you apply are taken from a catalogue of so-called refinement rules (which provably preserve semantics). Their application ultimately results in a specification that is semantically equivalent to an executable program. The refinement process thus produces a program that is “correct-by-construction”.

You can follow the logical (mis-)deduction behind this misconception quite easily:

Refinement rules transform specifications.
Specifications are models (see earlier misconceptions).
Model transformations are a set of transformation rules.
Transformation rules transform models.
Therefore, refinement rules are transformation rules.
Therefore, refinement is transformation.

This is actually OK. Refinement is a perfectly legitimate form of model transformation. The problem is with the reverse inference, i.e., that a transformation rule is a refinement rule. If you assume that transformations must be semantics preserving, then this is not an unreasonable conclusion to draw. But model transformations need not preserve semantics.

Heretical statements like this usually generate one of several possible responses:

“This is crazy: why would I want to transform a model (which I have lovingly crafted and bestowed with valid properties and attributes) into something that is manifestly different, where information is lost?”
“OK, I can see that you might write a transformation that does not preserve semantics, but they must be dangerous, so we just need to be able to identify them and isolate them so that they never get deployed in the wild.”
“I don’t have to preserve semantics? That’s a relief! Semantics preserving transformations are a pain to construct anyway!”

These responses are all variants of misunderstandings we have seen previously: this idea that MDE is equated to a specific scenario or instance of application.

The first misunderstanding is, of course, confusing a specific category of model transformation — those that preserve semantics — with all model transformations. What are some examples of non-semantics preserving transformations? They are legion: measurement applied to UML diagrams is a classic example, where we transform a UML diagram into a number. The transformation process calculates some kind of (probably object-oriented) metric. Another example is from model migration: updating a model because its metamodel has changed. In some scenarios, a metamodel changes by deleting constructs; the model migration transformation likely needs to delete all instances of those constructs. This is clearly not semantics preserving.

The second misunderstanding is the classical “Well, you can do it but don’t expect me to like it” response. Unfortunately, in many real model transformation scenarios, you have to break semantics, and you probably need to enjoy it too. Consider a transformation scenario where we want to transform a very large model (e.g., consisting of several hundred thousand elements) conforming to a very large metamodel (like MARTE, AUTOSAR, SysML etc) into another very large model conforming to a different very large metamodel. Because we are good software engineers, we are likely to want to break this probably very large and complicated transformation problem into a number of smaller ones (see, for example, Jim Cordy’s excellent keynote at GPCE/SLE 2009 in Denver), which then need to be chained together. Each of the individual (smaller) transformations need not preserve semantics — indeed, some of the transformations may be to intermediate convenience languages that exist solely to make complex processing easier.

8. MDE can’t possibly work for real systems engineering because it doesn’t work well in complex domains where there is domain uncertainty.

In systems engineering we often have to cope with domain uncertainty — we don’t fully understand the threats and risks associated with a domain until we have got a certain way along the path towards developing a system. If there is domain uncertainty then the modelling languages that have been chosen, and the operations that we apply to our models (e.g., transformations, model differencing, mergings) are liable to change, and this becomes expensive and time-consuming to deal with. Domain uncertainty is a real problem — for any systems engineering technique, whether it is model-based, code-based or otherwise. Domain uncertainty will always lead to change in systems engineering. The question is: does MDE make handling the change associated with domain uncertainty any worse? Perhaps it does. If you’re using domain-specific modelling languages, then changes will often result in modifications to your modelling languages (and thereafter corresponding changes to your model transformations, constraints etc). If you are using code throughout development, changes due to domain uncertainty will be reflected in changes to your architecture, detailed modular design, protocols, algorithms, etc. Arguably, these are problems of similar conceptual complexity — it’s hard to see how MDE makes things worse, or indeed better: it’s an essentially hard problem of system engineering.

9. Metamodels never change

As we saw in the first misconception, MDE is not only about UML, but also about defining and using other modelling languages. However, when we (or you, or the OMG) design a modelling language, even a small one, we rarely get it right the first time. Or the fifth time. Or the ninth time. Like all forms of domain modelling, constructing a metamodel is difficult and requires consideration of many trade-offs. Language evolution is the norm, not the exception.

Despite this, we often encounter work that:

Does not consider or discuss tradeoffs made in language design. These kinds of papers often leave us wondering why a domain was modelled in a particular way (e.g. “why model X as a class, and Y as an association?” “why model with classes and associations at all?”).
Presents the product of language design, but not the process itself. How was the language designed? Did it arrive fully formed in the brain of a developer, or were their interesting stories and lessons to be learnt about its construction?
Proposes standardisation of domain X because “there is a metamodel.” A metamodel is often necessary for standardisation, it is not sufficient. (For example, does your favourite transformation language implement all of the QVT specification? We bet it doesn’t — and shame on you, of course!)
Contributes extensions to — or changes to — existing languages with little regard for the impact of these changes on models, transformations or other artefacts. Even in UML specifications, the impact of language evolution is not made apparent: there are no clear migration paths from one version to another, as we discovered at the 2010 Transformation Tool Contest (see also the forum discussion on UML migration).

Misconceptions about language evolution might stem from the way in which we typically go about defining a modelling language with contemporary MDE tools. We normally begin by defining a metamodel/grammar, then construct models that use (conform to) that metamodel/grammar, and then write model transformations or other model management operations. The linearity in this workflow is reminiscent of Big Design Up Front, and evokes painful memories of waterfall processes for software development.

However, we have found that designing a modelling language — like many other software engineering activities — is often best achieved in an iterative and incremental manner. We are not alone in this observation. Several recent modelling and MDE workshops (XM, ME, FlexiTools) have included work on inferring metamodels/grammars from example models; relaxing the conformance relationship (typing) of metamodels; and propagating metamodel changes to models automatically and semi-automatically. These are promising first steps towards introducing incrementality and flexibility into our domain-specific modelling tools, but the underlying issue is rather more systematic. As a community, we need to acknowledge that changing metamodels are the norm, and to better prepare to embrace change.

10. Modelling ≠ Programming

There is a tendency in many papers that we read to put a brick wall between modelling and programming — to treat them as conceptually different things that can only be bridged via transformations (created by these magical wizards, or transformation engineers). We’ve seen this type of thing before, in the 1980s, with programming and specification languages in formal methods. Some specification languages like Z were perfectly useful for specifying and reasoning, but were difficult to use for transition to code. Wide-spectrum languages, that unified programs and specifications in one linguistic framework (e.g., Carroll Morgan’s specification statements, Eric Hehner’s predicative programming, Ralph Back’s refinement calculus), did not have these difficulties. Treating models and programs in a unified framework — as artefacts that enable system engineering — would seem to have conceptual and technical benefits, and would allow us to have fewer academic arguments about their differences (and more arguments down at the pub).

Well, we lied when we said there were only ten misconceptions.

11. MDE = MDA

We end with a real chestnut: that MDE is the same thing as MDA.

MDA first appeared via the OMG back in 2001. It is a set of standards — including MOF, CWM and UML — as well as a particular approach to systems development, where business and application logic are separated from platform technology — the infamous PIM/PSM separation. MDE is more general than MDA: it does not require use of MOF, UML or CWM, nor for platform-specific and platform-independent logic and concerns to be kept separate. MDE does require the construction, manipulation and management of well-defined and structured models — but you don’t have to make use of OMG standards, or a particular style of development to do it.

So, for you authors out there: when you say that you have an MDA-based approach, please be sure that you really mean it. Are you using MOF and UML? Are you reliant on a PIM/PSM separation? If so, great! Carry on! If not, please think again, and prevent us from complaining loudly and publicly on Twitter.

The End

We have to stop somewhere. These are just a few of the misconceptions, myths, and misunderstandings related to MDE we’ve encountered. Do send us your own!

About the Authors

Richard Paige is a professor at the University of York, and complains bitterly about everything MDE on Twitter (@richpaige). He also likes really bad films. His website is http://www.cs.york.ac.uk/~paige

Louis Rose (@louismrose) is a lecturer at the University of York. He wrangles Java into the Epsilon MDE platform, tortures undergraduate students with tales of enterprise architecture, and is regularly defeated at chess. His research interests include software evolution, MDE and — in collaboration with Richard — evaluating the effects of caffeine on unsuspecting research students. His website is http://www.cs.york.ac.uk/~louis

Comments (5)