2015/08/11

VOLT 2012 / 2013 Special Edition

Filed under: Editorial — admin @ 13:17

This JOT special section contains three extended and peer-reviewed papers from the first and second editions of the International Workshop on Verification Of modeL Transformation (VOLT). The first edition of VOLT was held on April 21st, 2012 in Montreal, Canada as a satellite event of the 5th International Conference on Software Testing, Verification and Validation (ICST 2012). The second edition was held on June 17th, 2013 in Budapest, Hungary as a satellite event of the Federated Conferences on Software Technologies: Applications and Foundations (STAF 2013).

Model transformations are everywhere in software development, implicitly or explicitly. They became first-class citizens with the advent of Model-Driven Engineering (MDE). Despite some recent activity in the field, the work on the verification of model transformations remains scattered and a clear perspective on the subject is still not in sight. Moreover, current model transformation tools often lack verification techniques to support such activities. The goal of VOLT is to offer researchers a dedicated forum to classify, discuss, propose, and advance verification techniques dedicated to model transformations. VOLT promotes discussions between theoreticians and practitioners from academia and industry. A significant part of each workshop edition is a forum for discussing practical applications of model transformations and their verification, including interesting properties to verify and efficient techniques to actually compute those properties.

For this special section, we selected three papers by means of at least two rounds of reviews. All papers were refereed by four well-known experts in the field. The selected papers are the following:

  • Moussa Amrani, Benoit Combemale, Levi Lucio, Gehan Selim, Juergen Dingel, Yves Le Traon, Hans Vangheluwe and James Cordy in their paper entitled “Formal Verification Techniques for Model Transformations: A Tridimensional Classification” discuss the evolution, trends, and current practices in model transformation verification found in the literature from three viewpoints: the transformations, their properties, and the verification techniques.
  • David Lindecker, Gabor Simko, Tihamer Levendovszky, István Madari and Janos Sztipanovits in their paper entitled “Validating Transformations for Semantic Anchoring” present a technique to validate that a domain-specific language satisfies the intentions that the designer had in mind when engineering the language. The approach consists of validating the consistency between a formalization of the language designer's intentions and the semantic mapping of the language, the latter being expressed as a formal model transformation.
  • Rick Salay, Marsha Chechik, Michalis Famelis and Jan Gorzny in their paper entitled “A Methodology for Verifying Refinements of Partial Models” present a technique to verify how uncertainty present in models and transformations is reduced after refining models and model transformations.

We would like to thank everyone who has made this special section possible. In particular, we are obliged to all past VOLT organizers, to the reviewers for giving of their time to thoroughly and thoughtfully review papers multiple times, to the authors for contributing to VOLT and JOT with high-quality papers, and to the JOT editorial board for making this special issue possible.

Eugene Syriani, University of Montreal (Canada)
Manuel Wimmer, Vienna University of Technology (Austria)

2015/04/09

Volume 14 issue 1 now live

Filed under: Announcement — admin @ 10:37

The first issue of volume 14 is now online at the JOT website.

Colin Atkinson, Philipp Bostan, Dirk Draheim, Foundational MDA Patterns for Service-Oriented Computing, pp. 1:1-30
Stefan Mutke, Christoph Augenstein, Martin Roth, André Ludwig, Bogdan Franczyk, Real-time information acquisition in a model-based integrated planning environment for logistics contracts, pp. 2:1-25
David Naranjo, Mario Sánchez, Jorge Villalobos, Evaluating the capabilities of Enterprise Architecture modeling tools for Visual Analysis, pp. 3:1-32

2015/03/18

Popularity will NOT bring more contributions to your OSS project

Filed under: Column — admin @ 17:34

The vitality and success of Open Source Software (OSS) projects depend on their ability to attract, absorb and retain new developers [1] who decide to commit some of their time to the project. In recent years, new code hosting platforms like GitHub have appeared with the goal of promoting OSS projects and easing collaboration around them, thanks to their integration of social following, team management and issue-tracking features around an implementation of the pull-based model.

Roughly speaking, GitHub enables a distributed development model based on Git (though with some extensions). In GitHub there are two main development strategies aimed at (1) the project team members and (2) external developers. Team members have direct access to the source code, which they modify by means of pushes. External developers follow a pull-based model, where any developer can work in isolation on a clone of the original source code (facilitated by means of forks in GitHub). Later, developers can send back their changes and request that those changes be integrated into the project codebase. This is called sending a pull request. Finally, pull requests are evaluated by project team members, who can either approve the pull request and incorporate the changes, or reject it and propose improvements which can be addressed by the proponent. Beyond the project creator, other developers can be promoted to the status of official project collaborators and get most of the same rights project owners have, so that they can help not only with the development (by means of pushes, as said above) but also with management tasks (e.g., answering issues or providing support to other developers). Issue-tracking support helps both external developers and team members to request new features and report bugs, and therefore fosters participation in the development process. People interested in the project can also become watchers to follow the project's evolution.

What makes some projects more successful than others?

Since there is still very limited understanding of why some projects advance faster than others, we asked ourselves whether using all these new collaboration features available in code hosting platforms like GitHub actually has a positive influence on the advancement of a project. Are popular projects (i.e., projects with more watchers, more issues added, more people trying to become collaborators…) really more successful?

This blog post reports our answer to this question, based on a quantitative analysis of all the GitHub projects created in the last two and a half years. As the metric for project success we chose the number of commits (not necessarily adding code, also removing it). We believe this reflects better than other metrics the fact that the project is alive and improving. Several works have performed qualitative analysis of GitHub samples ([2, 3, 4] among others) but none has tried to determine criteria for project success.

Methodology for our quantitative analysis

To perform our study we took all GitHub projects created from the beginning of 2012 onwards and collected a few relevant attributes for each of them.

GitHub Project Attributes

For each project we were interested in getting insights regarding the following characteristics:

  1. General information. We consider basic project information such as whether the project is a fork of another and the programming language used in its development.
  2. Development. We measure the development status of GitHub projects in terms of commits (totalCommits attribute) since their creation. As GitHub projects can receive commits from pushes (i.e., source code contributions coming from team members) and pull requests (i.e., source code contributions coming from accepted pull requests), we distinguish commitsPush and commitsPR attributes, respectively.
  3. Interest. Being a social coding site, GitHub projects can also be monitored, tracked and forked by users. We therefore focus on two main facilities provided by GitHub: watchers (watchers attribute) and forks (forks attribute). The former is the number of people interested in following the evolution of the project; they are notified when the project status changes (e.g., new releases, new issues, etc.). The latter is the number of people that made a fork. Both attributes can provide good insights on the project popularity [5].
  4. Collaborators. We consider the number of collaborators (collabs attribute) who have joined a project to help in its development.
  5. Contributions. We focus on contributions coming from (1) pull requests (PRs attribute) and (2) issues (issues attribute). In particular, we are interested in collecting the number of pull requests and issues that have been proposed (i.e., opened) for each project.

Mining GitHub

The mining process is illustrated in Figure 1 and is composed of three phases: (1) extracting the data, (2) aggregating the data to calculate and import the attribute values for each project into a database, and (3) filtering the database to build the subset of projects used for analysis (see Filter). Next, we describe each phase of the process.

Figure 1: Mining process.

Extractor. GitHub data has been obtained from GitHub Archive, which has tracked every public event triggered by GitHub since February 2011. GitHub events describe individual actions performed on GitHub projects, for instance, the creation of a pull request or a push. Events are represented in JSON format. There are 22 types of events but we focus on 7 of them from which we can get the data needed to calculate the project attributes described before. The considered event types are presented in Table 1.

Table 1: Events considered in the GitHub Archive extractor.

Event Type | Triggering condition | Attributes involved
MemberEvent | A user is added as a collaborator to a repository | collabs
PushEvent | A user performs a push | commitsPush
WatchEvent | A user stars a repository | watchers
PullRequestEvent | A pull request is created, closed, reopened or synchronized | PRs, commitsPR
ForkEvent | A user forks (i.e., clones) a repository | forks
IssuesEvent | An issue is created, closed or reopened | issues

Events are stored in GitHub Archive hourly. Our process collected all the events triggered per day since January 1st 2012 (starting date for our analyzed period).
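As a rough illustration of the extraction phase, the sketch below reads one hourly archive and keeps only the event types listed in Table 1. It is not our actual extractor: it assumes that each archive is a gzip-compressed file holding one JSON event per line, and the sample events and the flat "repo" field are made up for the example.

```python
import gzip
import io
import json

# Event types listed in Table 1.
CONSIDERED_TYPES = frozenset({
    "MemberEvent", "PushEvent", "WatchEvent",
    "PullRequestEvent", "ForkEvent", "IssuesEvent",
})


def read_events(archive):
    """Yield the considered events from one gzip-compressed hourly archive.

    `archive` is a path or a binary file object; the file is assumed to
    hold one JSON event per line.
    """
    with gzip.open(archive, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            if line.strip():
                event = json.loads(line)
                if event.get("type") in CONSIDERED_TYPES:
                    yield event


# Tiny in-memory stand-in for an hourly archive (made-up events).
sample = "\n".join(json.dumps(e) for e in [
    {"type": "PushEvent", "repo": "alice/demo"},
    {"type": "GollumEvent", "repo": "alice/demo"},  # wiki edit: ignored
    {"type": "ForkEvent", "repo": "alice/demo"},
])
archive = io.BytesIO(gzip.compress(sample.encode("utf-8")))
kept = [event["type"] for event in read_events(archive)]
print(kept)
```

Reading the archives as a stream keeps memory usage flat, which matters when processing every hour of several years of events.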

Aggregator. This component aggregates the events extracted in the previous step and calculates the attributes for each project.
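The aggregation step can be pictured as a fold over the event stream that updates one record of counters per project. The sketch below is only an approximation under stated assumptions: the event layout ("repo", "payload", "action", "merged", "commits") is a simplified, hypothetical schema rather than the real GitHub Archive one, and computing totalCommits as commitsPush plus commitsPR is our reading of the attribute definitions above.

```python
from collections import defaultdict


def empty_attributes():
    """Counters for one project, named after the attributes in the text."""
    return {"collabs": 0, "commitsPush": 0, "commitsPR": 0,
            "watchers": 0, "forks": 0, "PRs": 0, "issues": 0}


def aggregate(events):
    """Fold a stream of (simplified, hypothetical) events into project attributes."""
    projects = defaultdict(empty_attributes)
    for event in events:
        attrs = projects[event["repo"]]
        kind = event["type"]
        payload = event.get("payload", {})
        if kind == "MemberEvent":
            attrs["collabs"] += 1
        elif kind == "PushEvent":
            attrs["commitsPush"] += payload.get("commits", 0)
        elif kind == "WatchEvent":
            attrs["watchers"] += 1
        elif kind == "ForkEvent":
            attrs["forks"] += 1
        elif kind == "IssuesEvent":
            if payload.get("action") == "opened":
                attrs["issues"] += 1
        elif kind == "PullRequestEvent":
            if payload.get("action") == "opened":
                attrs["PRs"] += 1
            elif payload.get("action") == "closed" and payload.get("merged"):
                # Only accepted pull requests contribute commits.
                attrs["commitsPR"] += payload.get("commits", 0)
    for attrs in projects.values():
        attrs["totalCommits"] = attrs["commitsPush"] + attrs["commitsPR"]
    return dict(projects)


events = [
    {"type": "PushEvent", "repo": "alice/demo", "payload": {"commits": 3}},
    {"type": "PullRequestEvent", "repo": "alice/demo", "payload": {"action": "opened"}},
    {"type": "PullRequestEvent", "repo": "alice/demo",
     "payload": {"action": "closed", "merged": True, "commits": 2}},
    {"type": "PullRequestEvent", "repo": "alice/demo", "payload": {"action": "opened"}},
    {"type": "PullRequestEvent", "repo": "alice/demo",
     "payload": {"action": "closed", "merged": False, "commits": 5}},
    {"type": "WatchEvent", "repo": "alice/demo"},
    {"type": "IssuesEvent", "repo": "alice/demo", "payload": {"action": "opened"}},
    {"type": "IssuesEvent", "repo": "alice/demo", "payload": {"action": "closed"}},
]
result = aggregate(events)
print(result["alice/demo"])
```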

The resulting dataset contains 7,760,221 projects. This dataset was curated to eliminate projects with missing information or that were former private projects (which would prevent us from getting the full picture of the project). The curated dataset contained 7,365,622 projects.

Filter. This component allows building subsets of the previous dataset in order to perform a more focused analysis. The filter takes as input the dataset from the previous step and creates a new filtered dataset containing only those elements fulfilling a particular condition.

In the context of our study, we built a new filtered dataset including only those projects that are not a fork of another project and that explicitly state that they are repositories with code in a given programming language. GitHub is used for many other tasks beyond software development (e.g., writing books) and we wanted to focus only on original software development projects. The resulting filtered dataset contained 2,126,093 projects and was the one used in all the other analyses presented in this blog post.
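The filtering condition itself is simple. A minimal sketch follows, assuming each project record carries two hypothetical fields, "is_fork" and "language" (the latter empty or missing when no programming language is declared):

```python
def keep_original_code_projects(projects):
    """Keep projects that are not forks and that declare a programming language."""
    return [p for p in projects
            if not p.get("is_fork", False) and p.get("language")]


# Made-up project records for illustration.
projects = [
    {"name": "alice/compiler", "is_fork": False, "language": "Java"},
    {"name": "bob/compiler", "is_fork": True, "language": "Java"},   # fork
    {"name": "carol/my-book", "is_fork": False, "language": None},   # no code
]
filtered = keep_original_code_projects(projects)
print([p["name"] for p in filtered])
```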

First of all, are projects in GitHub really using collaboration features?

Before we try to answer the question of whether using those features helps in the project advancement, we should check whether these features are used at all. To answer this question, we will characterize GitHub projects according to the attributes presented before and specifically study the use of collaboration facilities in them. Table 2 reveals that in fact they are not widely used.

Table 2: Project attributes results of the GitHub dataset.

Attribute | Min. | Q1 | Median | Mean | Q3 | Max.
Development attributes
totalCommits | 0.00 | 2.00 | 7.00 | 43.00 | 19.00 | 5545441.00
commitsPush | 0.00 | 2.00 | 7.00 | 41.00 | 19.00 | 5545441.00
commitsPR | 0.00 | 0.00 | 0.00 | 1.31 | 0.00 | 38242.00
Interest attributes
watchers | 0.00 | 0.00 | 0.00 | 2.26 | 1.00 | 14607.00
forks | 0.00 | 0.00 | 0.00 | 0.68 | 0.00 | 2913.00
Collaborators and Contribution attributes
collabs | 0.00 | 0.00 | 0.00 | 0.05 | 0.00 | 7.00
PRs | 0.00 | 0.00 | 0.00 | 0.96 | 0.00 | 8337.00
issues | 0.00 | 0.00 | 0.00 | 0.29 | 0.00 | 1540.00

The results for development attributes such as totalCommits are strongly influenced by the fact that a considerable number of projects have a small number of commits. Thus, 1,259,822 (59.26% of the total number of projects) have between 0-10 commits from pushes (commitsPush) and 2,092,685 (98.47% of the total number of projects) have only between 0-10 commits from pull requests (commitsPR). Figure 2 illustrates this situation by showing the number of projects (vertical axis) per group of commits (horizontal axis).
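To see why a handful of very active projects pulls the mean far above the median, consider the following sketch. It computes the same six summary values reported in Table 2 plus the share of projects within a commit range. The sample data is invented, and the "inclusive" quantile method is an assumption of this example, not a statement about the tooling used in our study.

```python
import statistics


def summarize(values):
    """Return min, quartiles, mean and max for one attribute."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "mean": statistics.mean(values), "q3": q3, "max": max(values)}


def share_in_range(values, low, high):
    """Percentage of values v with low <= v <= high."""
    hits = sum(1 for v in values if low <= v <= high)
    return 100.0 * hits / len(values)


# Invented commit counts for ten projects: most are small, one is very large.
commits = [0, 1, 2, 2, 3, 5, 7, 9, 19, 400]
summary = summarize(commits)
print(summary)
print(share_in_range(commits, 0, 10))
```

In this toy sample a single project with 400 commits is enough to push the mean above the third quartile, which is the same pattern visible in the totalCommits row of Table 2.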

Figure 2: Comparison between number of projects and number of commits coming from pull requests (commitsPR) and pushes (commitsPush).

Regarding the interest attributes, 1,433,042 projects (67.40% of the total number of projects) have 0 watchers and 1,614,556 projects (75.94% of the total number of projects) have never been forked. These results suggest that the use of GitHub is far from what would be expected of a social coding site.

The results for collaborator and contribution attributes also reveal very poor usage. Thus, 2,017,911 projects (94.91% of the total number of projects) do not use the collaborator role; 1,953,977 projects (91.90% of the total number of projects) have never received a pull request; and 1,949,644 projects (91.70% of the total number of projects) have never received an issue.

Therefore we can conclude that most projects do not make any use of GitHub features and use it purely as a kind of backup mechanism. The great majority of projects show a low activity (i.e., totalCommits, commitsPush and commitsPR) and attract low interest (i.e., forks and watchers).

But those that do, do they get any benefits?

If so, this would be a good reason for the other projects to follow suit. Let’s see then if popular projects that attract a lot of interest (plenty of forks and watchers) and manage to involve a large community (that opens issues, becomes collaborators, submits pull requests) end up having more commits in the repository than others.

To answer this question, we will perform a correlation analysis among the involved attributes. More specifically, we resort to Spearman’s rho (ρ) correlation coefficient to confirm the existence of a correlation. This coefficient is used in statistics as a non-parametric measure of statistical dependence between two variables. The values of ρ are in the range [-1, +1], where a perfect correlation is represented either by a -1 or a +1, meaning that the variables are perfectly monotonically related (a decreasing or an increasing relationship, respectively). Thus, the closer ρ is to 0, the more independent the variables are.
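For readers unfamiliar with the coefficient, the following sketch computes Spearman’s ρ as the Pearson correlation of the ranks, giving tied values their average rank (relevant here, since so many projects share the value 0). It is a plain-Python illustration with invented data, not the code used for our analysis.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Invented data: watchers and commits for five projects.
watchers = [0, 0, 1, 5, 40]
commits_up = [1, 1, 4, 9, 1000]    # grows with watchers, but not linearly
commits_down = [90, 90, 30, 8, 2]  # shrinks as watchers grow
print(spearman_rho(watchers, commits_up))
print(spearman_rho(watchers, commits_down))
```

Because only the ordering of the values matters, the extreme outliers that dominate the means in Table 2 do not distort ρ the way they would distort a linear correlation.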

Table 3 shows the ρ values for each combination of attributes we wanted to evaluate. The first three rows focus on the correlation between the number of collaborators, pull requests and issues and the number of commits of the project. As you can see there is no correlation (except for the somewhat obvious correlation between the number of pull requests and the commits derived from accepting them, as long as they are accepted, but with basically no impact on the global number of commits) among them. The last rows show there is no correlation either between the number of people following the project and the commits.

Table 3: Correlation analysis between the considered attributes.

Attribute | Success attributes: totalCommits | commitsPush | commitsPR
collabs | 0.09 | 0.09 | 0.06
PRs | 0.27 | 0.25 | 0.88
issues | 0.25 | 0.25 | 0.34
watchers | 0.11 | 0.10 | 0.24
forks | 0.08 | 0.07 | 0.36

It is important to note that during our study we also calculated the correlation values among all these attributes when grouping the projects according to several dimensions, especially their size and the language used. None of those groupings revealed different results from those shown above.

Threats to Validity

In this section we describe the threats to validity we have identified in our study.

External Validity. Our study considers a large dataset of GitHub projects; however, it may not represent the universe of all real-world projects. In particular, as GitHub allows users to create open source repositories without any expense, our dataset might include mock or personal projects that are not focused on attracting contributions and that have been open sourced only to avoid paying membership fees to keep them private.

Internal Validity. Our study only considers GitHub data and therefore does not take into account external tools used by some GitHub projects (e.g., to manage the team and issues; for instance, people attaching patches to an external Bugzilla bug tracking tool, later manually merged into the project by the project owner), which can bias our study (i.e., in the previous example, that patch would not count as a pull request). Finally, using the language attribute to filter out non-software projects may result in the elimination of relevant projects since some software projects do not set the programming language used.

If popularity is not a good indicator, what determines the success of a project?

Honestly, we think by now it’s clear that we have no idea. We have learnt about quite a few things that do not correlate with success but have yet to find one that does. This is probably because there is no single reason for success, or at least not one that is simple enough to be easily measured. Still, being able to shed some light on this issue, even if partial, would be very beneficial for the OSS community and thus it’s worth continuing to try.

To try to get some more insights on this we have complemented this quantitative analysis with a more qualitative one where we conducted a manual inspection of the 50 most successful GitHub projects in our dataset (success measured in terms of the number of commits of the project coming from pull requests, i.e., from external contributors). We noticed that 92% of them (i.e., 46 projects) included a description file (i.e. readme), with, often, a link to complementary information in wikis (46%) and/or external websites (50%). A further manual inspection of these three kinds of project information sources revealed that they were not purely “decorative” but that instead included precise information on the process to follow for all those willing to contribute to the project (e.g., how to submit a pull request, the decision process followed to accept a pull request or an issue, etc.). We have compared these numbers with random samples of projects to confirm they are not just average values for the GitHub population.

This hints at the reasonable possibility that having a clear description of the contribution process is a significant factor in attracting new contributions. Unfortunately, existing GitHub APIs and services do not provide direct support to automatically check our hypothesis on the whole population of GitHub projects, so further research on this requires conducting other kinds of empirical analysis, such as interviews with contributors and project managers. If this is confirmed, it would open plenty of other interesting questions, like whether some kinds of contribution processes (also known as governance rules) attract more contributors than others (e.g., a dictatorship approach versus a more open process to accept pull requests). This could help project owners to decide whether to have a more transparent governance process in order to advance faster in the project development. See [6] for a deeper discussion on this.

About the Authors

Javier Luis Cánovas Izquierdo is a postdoctoral fellow at IN3, UOC, Barcelona, Spain.

Valerio Cosentino is a postdoctoral fellow at EMN, Nantes, France.

Jordi Cabot is an ICREA Research Professor at IN3, UOC, Barcelona, Spain.

References

[1]     C. Bird, A. Gourley, P. Devanbu, A. Swaminathan, and G. Hsu. Open Borders? Immigration in Open Source Projects. In MSR conf., 2007.

[2]     J. Choi, J. Moon, J. Hahn, and J. Kim. Herding in open source software development: an exploratory study. In CSCW conf., pages 129–133, 2013.

[3]     G. Gousios, M. Pinzger, and A. V. Deursen. An Exploratory Study of the Pull-based Software Development Model. In ICSE conf., pages 345–355, 2014.

[4]     F. Thung, T. F. Bissyande, D. Lo, and L. Jiang. Network Structure of Social Coding in GitHub. In CSMR conf., pages 323–326, 2013.

[5]     T. F. Bissyande, D. Lo, L. Jiang, L. Reveillere, J. Klein, and Y. L. Traon. Got issues? Who cares about it? A large scale investigation of issue trackers from GitHub. In ISSRE symp., pages 188–197, 2013.

[6]     J. Cánovas, J. Cabot. Enabling the Definition and Enforcement of Governance Rules in Open Source Systems. In ICSE – Software Engineering in Society (ICSE-SEIS), to appear.

2014/09/01

The common good

Filed under: Editorial — Laurence Tratt @ 13:37

We asked. You said. We listened.

From this issue onwards, all JOT articles will be licensed under a Creative Commons licence. Currently, authors can choose either Attribution 4.0 International (CC BY 4.0) or Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0) as their paper’s license (depending on feedback, we may extend these options over time). The author instructions have been updated accordingly.

In doing this, we’re giving back rights to authors and stating explicitly: JOT is on your side. Practically speaking, this move will make authors’ lives easier, and ultimately that of readers. We hope you enjoy the result!

2014/07/01

Extreme Modeling 2012 Special Edition

Filed under: Editorial — Laurence Tratt @ 14:27

This JOT special section contains four extended and peer-reviewed papers from the first edition of the Extreme Modeling Workshop (XM2012) held on October 1st, 2012 in Innsbruck, Austria as a satellite event of the 15th International Conference on Model Driven Engineering Languages & Systems (MODELS2012).

The goal of XM 2012 was to bring together researchers in the area of modeling and model management in order to discuss more disciplined techniques and engineering tools to support flexibility in several forms in a wide range of modeling activities, including metamodel, model, and model transformation definition processes. The workshop aimed at a) better identifying the difficulties in the current practices of MDE related to the lack of flexibility and b) soliciting contributions of ideas, concepts, and techniques also from other areas of software engineering, such as that of specific language communities (e.g. the Smalltalk and Haskell communities, and the dynamic languages community). These contributions could be useful to revise certain fundamental concepts of Model Driven Engineering (MDE), such as the conformance relation.

From 8 initial submissions we selected 4 papers by means of at least two rounds of reviews. All papers were refereed by three well-known experts in the field. The selected papers are the following:

  • Vadim Zaytsev in his paper entitled “Negotiated Grammar Evolution” presents a study about the adaptability of metamodel transformations. In particular, some metamodel transformation paradigms, like unidirectional programmable grammar transformation, are rather rigid. They are written to work with one input grammar, and are not easily adapted if the grammar changes. In the paper, the author proposes a solution that isolates the applicability assertions into a component separate from the rest of the transformation engine, and enhances the simple accept-and-proceed vs. reject-and-halt scheme into one that proposes a list of valid alternative arguments and allows the other transformation participant to choose from it and negotiate the intended level of adaptability and robustness.
  • Paola Gómez, Mario Sánchez, Héctor Florez and Jorge Villalobos in their paper entitled “An approach to the co-creation of models and metamodels in Enterprise Architecture Projects” discuss the problems related to the lack of dynamicity of model editors and the impossibility of loading new metamodels at runtime. In the paper, they present an approach that addresses such problems by separating ontological and linguistic aspects of metamodels. The GraCoT tool, an implementation of the approach based on GMF, is also discussed in the paper.
  • Konstantinos Barmpis and Dimitrios S. Kolovos in their paper entitled “Evaluation of Contemporary Graph Databases for Efficient Persistence of Large-Scale Models” compare the commonly used persistence mechanisms in MDE with novel approaches such as the use of graph-based NoSQL databases. Prototype integrations of Neo4J and OrientDB with EMF are compared with relational database, XMI and document-based NoSQL database persistence mechanisms. The paper also benchmarks two approaches for querying models persisted in graph databases to measure and compare their relative performance in terms of memory usage and execution time.
  • Zoe Zarwin, Marija Bjekovic, Jean-Marie Favre, Jean-Sébastien Sottet, and Henderik A. Proper in their paper entitled “Natural Modelling” motivate the need for instruments that enable a wider adoption of modeling technologies. To this end, such technologies must be perceived to be as natural as possible. After having defined the natural modeling concept, the authors discuss how human aspects of modeling could be better instrumented in the future by using modern technologies.

We would like to thank everyone who has made this special section possible. In particular, we are obliged to the referees for giving of their time to thoroughly and thoughtfully review and re-review papers, to the authors for their hard work on several revisions of their papers, from workshop submission to journal acceptance, and to the JOT editorial board for organising this special issue.

Davide Di Ruscio, University of L’Aquila (Italy)
Alfonso Pierantonio, University of L’Aquila (Italy)
Juan de Lara, Universidad Autónoma de Madrid (Spain)

2014/06/04

The Song Remains (Almost) The Same

Filed under: Editorial — Laurence Tratt @ 11:47

For me, taking over as Editor-in-Chief of JOT is no small matter. The most recent editors — Oscar Nierstrasz and Jan Vitek — have done sterling work in establishing JOT as a well-read reference for substantial computing research, a job that Bertrand Meyer and Richard Wiener began before them. JOT continues to fill an important role in computing: an open-access journal with rigorous standards. In most senses, my job is to strive to continue Oscar and Jan’s sterling work. After all, when the JOT formula isn’t broken, why break it?

Of course, no such formula can be perfect, because the world around us changes: habits change, needs change, and attitudes change. It is the latter aspect which I wish to address in this, my first editorial. Research, at its best, is intended to benefit mankind: when, instead, it is hidden behind paywalls, its purpose is obstructed. JOT is therefore an open-access journal: whoever you are, whatever your status is, wherever you are in the world, you can read the research we publish in JOT without hindrance.

But JOT has one vestige shared with traditional journals: when authors publish their research in JOT we ask them to transfer the copyright of their paper over to us. What this means is that JOT is then the legal guardian of the paper: anyone who wishes to distribute or alter it — even the original authors — has to ask JOT for permission to do so. This was done with the aim of ensuring that JOT maintained the definitive home of the research and JOT has the legal right to prevent people duplicating (or, worse, plagiarising) the research we publish.

Attitudes in recent years have shifted. Authors want to publish copies of their papers on their homepages, in university paper repositories, and other online paper repositories. It is reasonable for them to ask why, if they put in the effort to perform and write up the research, they should lose the legal right to post copies of their paper where they wish to.

In consultation with the JOT Steering Committee, I therefore believe that JOT should move to a world where we no longer require authors to transfer copyright to us. There are several possible models for how we might go about this, and we are opening up this discussion to the JOT community, seeding it with an initial proposal. With luck, we will put the new process into place later in the (northern hemisphere) summer.

Our initial proposal is as follows, based in part on the approach taken by similar journals such as PLOSOne and LMCS. Instead of requiring authors to transfer copyright to us, we propose that authors whose papers have passed JOT’s peer-review process are required to place their papers under a Creative Commons license before their paper will be published. Doing so will give everyone, including JOT, the right to host copies of their paper. We intend to give authors the freedom to choose between the Attribution CC BY or Attribution-NoDerivs CC BY-ND licenses. Broadly speaking, the former would allow anyone to distribute (possibly altered versions of) the paper; the latter would allow anyone to distribute, but not alter, a paper. In both cases, the right to distribute the specific version of the paper accepted by JOT is irrevocable: it will be publicly available for all time. We would request that all copies the authors place on other sites use the JOT template so that JOT is properly credited as the publication that put the effort into reviewing and publishing the paper, but this will rely on authors’ goodwill, rather than any legal mechanism.

Please feel free to leave your suggestions in the comments below or by contacting me directly. I would like whatever process we come up with to be as good as it can be, and that is most likely to happen when the JOT community puts its collective brain to the task!

2013/08/14

TOOLS Europe 2012 Special Section

Filed under: Editorial — Jan Vitek @ 10:40

Carlo A. Furia and Sebastian Nanz

The 50th International Conference on Objects, Models, Components, Patterns (TOOLS Europe 2012) was the closing event in a series of symposia devoted to object technology and its applications. The conference program included 24 paper presentations covering a broad range of topics, from programming languages to models and development practices. This variety, typical of the TOOLS conferences, is a sign of the vast success of object technology and of its theoretical underpinnings.

This Special Section of the Journal of Object Technology (JOT) consists of the extended versions of two contributions selected among those presented at TOOLS Europe 2012. We picked these two pieces of work among those receiving the most positive reviews before the conference, raising substantial interest at the conference, and passing muster in additional thorough refereeing for this Special Section after the conference. Besides being mature and high-quality research work in their own right, the two papers target topics that are indicative of the vitality of object technology even now that it has become commonplace.

Lilis and Savidis’s paper “An Integrated Approach to Source Level Debugging and Compile Error Reporting in Metaprograms” discusses techniques and tools to improve the readability and understandability of error reporting with metaprograms — that is, programs that generate other programs, such as the template programming constructs available in C++. Their solution is capable of tracing errors along the complete sequence of compilation stages and also targets aspects of IDE integration. It is also fully implemented and available for download: note the demonstration video linked to at the end of the article.

Wernli, Lungu, and Nierstrasz’s paper “Incremental Dynamic Updates with First-class Contexts” tackles a difficult problem frequently present in complex software systems that must be highly available: how to reduce the downtime required to perform system updates. Their solution hinges on turning contexts into first-class entities. Their Theseus system is thus capable of performing updates incrementally, with different threads running in parallel on different versions of the same class. The conference version of this paper also won the TOOLS 2012 Best Paper Award sponsored by the European Association for Programming Languages and Systems (EAPLS).

We are glad to be able to offer such an interesting Special Section to readers of JOT. We thank Antonio Vallecillo for suggesting this Special Section. We thank the anonymous referees for their punctual and dedicated work, instrumental in guaranteeing high quality presentations; and we thank the authors for choosing TOOLS Europe and JOT to present some of their most interesting research work.

2013/06/20

Changing of the guard

Filed under: Editorial — Jan Vitek @ 04:04

The Journal of Object Technology is the only open access academic publication dedicated to object-orientation in all its forms. Objects have been with me for my entire scientific career, so it is an honor to take over from outgoing editor in chief Oscar Nierstrasz. My goal as the next editor of JOT is first and foremost to continue on the path blazed by Oscar, strengthening the scientific quality and increasing the readership of JOT. One challenge that a journal like JOT faces is to find its proper place in the changing landscape of scientific publishing. Why should authors submit to JOT rather than to a conference or to another journal? Unlike most conferences, journals allow a dialogue between authors and reviewers, one that leads to improved papers rather than simple binary decisions. As to why JOT, I believe that our editorial board is unique in its composition and ensures that papers on topics related to object technology will receive some of the best and most helpful reviews from world-renowned experts who share a passion for objects.

Jan Vitek

2013/01/25

Farewell editorial

Filed under: Editorial — Oscar Nierstrasz @ 11:41

It is my great pleasure to welcome Jan Vitek as incoming Editor-in-Chief of JOT. Jan is a long-time contributor to the object-oriented community and is well known for his research in various aspects of programming languages and software engineering, more specifically in the areas of dynamic languages, mobile computation, transactional memory and embedded systems.

It has been nearly three years since Bertrand Meyer invited me to take over as Editor-in-Chief from Richard Wiener, who had done an amazing job of building up JOT’s readership and providing a steady flow of provocative articles on a variety of topics.

There have been two main kinds of changes to JOT since then. The first is visible to readers: JOT has a new look, with the web site being driven largely by meta-data. This makes it much easier to keep the web site up-to-date and consistent, and makes it easier to add new features. The second is visible to authors: the review process is formalized and more rigorous. Despite the added rigor, the review process is very competitive with other journals, with accepted papers typically appearing within six months to a year of initial submission.

In order to make this work, JOT relies on a dedicated team of associate editors (listed in the Masthead), and a large pool of anonymous reviewers who contribute their time to carefully reviewing submissions. In addition to regular articles, JOT has a strong tradition of publishing special issues and special sections of revised, selected papers from workshops and conferences related to object technology. These are prepared by invited editors, usually the PC Chairs of the original event. Of course, there would be nothing to review without a steady stream of submissions. I would therefore like to sincerely thank all the authors, anonymous reviewers and associate and invited editors who contributed to JOT over the past three years!

Finally, I would like to offer my best wishes to Jan Vitek and encourage him to explore new ways for JOT to serve the OO community.

Oscar Nierstrasz
2013-01-25

Lies, Damned Lies and UML2Java

Filed under: Column — richpaige @ 11:40

We review far too many research papers for journals and conferences. (Admittedly, we probably write too many papers as well, but that’s another story.) We regularly encounter misunderstandings, misconceptions, misrepresentations and plain old-fashioned errors related to Model-Driven Engineering (MDE): what it is, how it works, what it really means, what’s wrong with it, and why it’s yet another overhyped, oversold, overheated idea. Some of these misunderstandings are annoyingly common, common enough that we want to put them down on the digital page and try to address them here. Perhaps this will help improve research papers, or it will make reviewing easier; perhaps it will lead to debate and argument; perhaps this list will be consigned to an e-bin somewhere.

Our modest list of the ten leading misconceptions — which is of course incomplete — is as follows.

1. MDE = UML

At least once a year we read an article or blog post or paper that assumes that MDE is equivalent to using UML for some kind of systems engineering. This is both incorrect and monotonously boring. The reality is that MDE neither depends on nor implies the use of UML: the engineering tasks that you carry out with MDE can be supported by any modelling language that (a) has a metamodel/grammar/well-defined structure; and (b) has automated tools that allow the construction and manipulation of models. Using UML does not mean you are doing MDE: you might be drawing UML diagrams as rough sketches, or to enable simulation/analysis, or for conceptual modelling. Doing MDE does not mean you must be using UML: you could be using your own awesome domain-specific languages, or another general-purpose language that has nothing to do with UML.
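
To make conditions (a) and (b) concrete, here is a minimal sketch in Python with no UML in sight. Everything in it is hypothetical and invented for illustration: the `METAMODEL` dictionary, the `conforms` function and the tiny state-machine language are not taken from any real MDE tool.

```python
# Hypothetical metamodel for a tiny state-machine language (not UML).
# Each type maps attribute names to the Python type its values must have.
METAMODEL = {
    "State": {"name": str},
    "Transition": {"source": str, "target": str, "event": str},
}


def conforms(model, metamodel):
    """Return True if every element has a known type and well-typed attributes."""
    for element in model:
        attributes = metamodel.get(element.get("type"))
        if attributes is None:
            return False  # element type is not defined in the metamodel
        for attr, expected in attributes.items():
            if not isinstance(element.get(attr), expected):
                return False  # missing or ill-typed attribute
    return True


# A model of a door, expressed in the hypothetical language above.
door = [
    {"type": "State", "name": "open"},
    {"type": "State", "name": "closed"},
    {"type": "Transition", "source": "open", "target": "closed", "event": "push"},
]
```

Calling `conforms(door, METAMODEL)` is the kind of automated manipulation that condition (b) asks for; the metamodel itself satisfies condition (a).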

We have noticed that this misunderstanding appears less frequently today than it did five years ago; perhaps the message is slowly getting through. The misunderstandings might have started because of the way in which we often introduce MDE to students: conceptual or design modelling with UML is often the first kind of modelling that students see.

So, the good news is that if you’re doing MDE, you don’t have to use UML; and if you are using UML, you don’t have to do MDE. The bad news is that there are many other misconceptions out there waiting to pounce. We are just getting started.

2. MDE = UML2Java

Code generation is often the first use case that’s thought of, mentioned, dissected and criticised in any technical debate about MDE. “You can generate code from your models!” is the cry of the tool vendor. This is usually followed by the even more thrilling: “you can generate Java code from your UML models!” As exciting a prospect as this is, the overemphasis on code generation in discussions of MDE has led to the myth that the UML-to-Java transformation is the sole way of doing MDE. Without doubt, this is a legitimate MDE scenario that has been applied successfully many times. But as we mentioned earlier, you do not have to use UML to do MDE. Similarly, you don’t have to target Java via code generation to do MDE. Indeed, there is a veritable medley of programming languages you can choose from: C#, Objective-C, Delphi, C++, Visual Basic, Cobol, Haskell, Smalltalk. All of these exciting languages can be targeted from your modelling languages using code generators.

It would be much more interesting to read about MDE scenarios that don’t involve the infamous UML2Java transformation — there are undoubtedly countless good examples that are out there. It’s always helpful to have a standard example that everyone can understand, but eventually a field of research has to move beyond the standard, trivial examples to something more sophisticated that pushes the capabilities of the tools and theories.

3. MDE ⇒ code generation

But what if you don’t care about code generation? Clearly you are a twisted individual: if you’re doing MDE you must be generating code, right? Wrong! Code generation — a specific type of model-to-text transformation — from (UML, DSML) models is just another legitimate MDE scenario. Code may not be a desirable visible output from your engineering process. You may be interested in constructing and assessing the models themselves — producing a textual output may not deliver any value to you. You may be interested in generating text from your models, but not executable code (e.g., HTML reports, input to verification tools). You may be interested in serialising your models so as to persist them in a database or repository.
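
As a small illustration of a model-to-text transformation whose output is not executable code, the following Python sketch turns a model into an HTML report. The model structure and the `to_html_report` function are assumptions made up for this example.

```python
from html import escape


def to_html_report(model):
    """Model-to-text transformation whose output is a report, not executable code."""
    rows = "\n".join(
        f"  <li>{escape(e['type'])}: {escape(e['name'])}</li>" for e in model
    )
    return f"<ul>\n{rows}\n</ul>"


# Hypothetical model: a flat list of typed, named elements.
model = [
    {"type": "Component", "name": "Billing"},
    {"type": "Component", "name": "Audit <legacy>"},
]
report = to_html_report(model)
```

The generated text is meant for a human reader or a browser; nothing in it is compiled or run.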

However, if you are generating code from models, you are probably applying a form of MDE (the nuance is really whether your models have a precisely defined structure [metamodel] and whether or not your code generators are externalised — and can be reused).

4. MDE ⇒ transformation

We’ve established that MDE is more than code generation. MDE is also about more than transformation.

Some problems cannot be easily solved with transformation. As advocates of MDE do we pack our bags and look for furrows that we can plough with model transformation techniques? Or can MDE still be of use?

Supporting decision making (helping stakeholders to reason about trade-offs between competing and equally attractive solutions to a problem) is an area in which models and MDE are increasingly used. (See the wonderful world of enterprise architecture modelling for examples.) Code, software or computer systems are not necessarily central to these domains, and transformation does little more for us than produce a nicely formatted report. Instead, we need to consider exploiting other state-of-the-art software engineering techniques alongside typical MDE fare. Perhaps search-based software engineering (i.e., describing what a solution looks like) is preferable to model transformation (i.e., describing how an ideal solution is constructed) in some cases. We have done work in this area at our university [DOI: 10.1007/978-3-642-31491-9_32], and there is growing interest in this topic.
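
The contrast between describing what a solution looks like and describing how it is constructed can be sketched in a few lines of Python. This is a toy under stated assumptions: the architecture options, their costs and benefits, and the budget are all invented, and the exhaustive enumeration stands in for the heuristic search that real search-based tools would use.

```python
from itertools import combinations

# Hypothetical candidate architecture options: (name, cost, benefit).
OPTIONS = [("cache", 3, 5), ("replica", 5, 6), ("cdn", 4, 4)]
BUDGET = 8


def is_acceptable(candidate):
    """Declarative part: what a solution looks like (it fits the budget)."""
    return sum(cost for _, cost, _ in candidate) <= BUDGET


def benefit(candidate):
    """Objective used to rank acceptable candidates."""
    return sum(b for _, _, b in candidate)


def search():
    """Exhaustive search over all subsets of OPTIONS (fine for a toy example)."""
    candidates = (
        subset
        for size in range(len(OPTIONS) + 1)
        for subset in combinations(OPTIONS, size)
    )
    return max(filter(is_acceptable, candidates), key=benefit)


best = search()
```

Note that nothing here says how to build the chosen combination step by step; the code only states the constraint and the objective and lets the search find a candidate.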

Transformation is powerful. Refactoring, merging, weaving, code generating and many other exciting verb-ings would not be possible without transformation theory and tools. However, models are ripe for other types of analysis and decision support and for these tasks, transformation is often not the right approach. In 2003 model transformation was characterised as the heart-and-soul of MDE. In 2013 we believe that a more well-rounded view is preferable.

5. “The MDE process is inflexible.”

This was an actual quote from a paper we once had to review for a conference. It was both a strange sentence and an interesting one, because we didn’t know what it meant. Just what is “the MDE process”? Did we miss the fanfare associated with its announcement? Arguably “process” and MDE are orthogonal: if you are constructing well-defined models (with metamodels) and using automated tools to manipulate your models (e.g., for code generation) then you are carrying out MDE; the process via which you construct your models and metamodels and manipulate your models is largely independent. You could apply the spiral model, or V-model, or waterfall. You could embed, within one of these processes, the platform-independent/platform-specific style of development inherent in approaches like Model-Driven Architecture (MDA). There is no MDE process, but by carrying out MDE you are likely to follow a process, which may or may not be made explicit.

6. MDE = MOF/Ecore/EMF

You must conform to the Eclipse world. Or the OMG world. You must define your models and metamodels with MOF or Ecore. You will be assimilated.

This is, of course, nonsense. MOF and Ecore are perfectly lovely and useful metamodelling technologies that have served numerous organisations well. But there are other perfectly lovely and useful metamodelling technologies that work equally well, such as GOPRR, or MetaDepth, or even (shock horror) pure XML. Arguably, the humble spreadsheet is the most widely used and most intuitive metamodelling tool in the world.

MDE has nothing to do with how you encode your models and metamodels; it has everything to do with what you do with them (manipulate them using automated tools; build them with stakeholders). Arguably, you should be able to do MDE without worrying about how your models are encoded — a principle that we have taken to heart in the Epsilon toolset that we have developed at our university.

7. Model transformation = Refinement

Refinement is a well-studied notion in formal methods of software engineering: starting from an abstract specification, you successively “transform” your specification into a more concrete one that is still semantics-preserving. In some formal methods, the transformations that you apply are taken from a catalogue of so-called refinement rules (which provably preserve semantics). Their application ultimately results in a specification that is semantically equivalent to an executable program. The refinement process thus produces a program that is “correct-by-construction”.

You can follow the logical (mis-)deduction behind this misconception quite easily:

  • Refinement rules transform specifications.
  • Specifications are models (see earlier misconceptions).
  • A model transformation is a set of transformation rules.
  • Transformation rules transform models.
  • Therefore, refinement rules are transformation rules.
  • Therefore, refinement is transformation.

This is actually OK. Refinement is a perfectly legitimate form of model transformation. The problem is with the reverse inference, i.e., that a transformation rule is a refinement rule. If you assume that transformations must be semantics preserving, then this is not an unreasonable conclusion to draw. But model transformations need not preserve semantics.

Heretical statements like this usually generate one of several possible responses:

  • “This is crazy: why would I want to transform a model (which I have lovingly crafted and bestowed with valid properties and attributes) into something that is manifestly different, where information is lost?”
  • “OK, I can see that you might write a transformation that does not preserve semantics, but they must be dangerous, so we just need to be able to identify them and isolate them so that they never get deployed in the wild.”
  • “I don’t have to preserve semantics? That’s a relief! Semantics preserving transformations are a pain to construct anyway!”

These responses are all variants of a misunderstanding we have seen previously: the idea that MDE can be equated with one specific scenario or instance of application.

The first misunderstanding is, of course, confusing a specific category of model transformation (those that preserve semantics) with all model transformations. What are some examples of non-semantics-preserving transformations? They are legion: measurement applied to UML diagrams is a classic example, where we transform a UML diagram into a number. The transformation process calculates some kind of (probably object-oriented) metric. Another example is from model migration: updating a model because its metamodel has changed. In some scenarios, a metamodel changes by deleting constructs; the model migration transformation likely needs to delete all instances of those constructs. This is clearly not semantics preserving.
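
Both examples can be sketched in Python. The model encoding and the function names `count_classes` and `migrate` are hypothetical, chosen only to show that neither transformation preserves the meaning of its input.

```python
def count_classes(model):
    """Measurement as transformation: the 'target model' is just a number."""
    return sum(1 for e in model if e["type"] == "Class")


def migrate(model, removed_types):
    """Model migration after the metamodel dropped some constructs:
    instances of the removed constructs are deleted, so information is lost."""
    return [e for e in model if e["type"] not in removed_types]


# Hypothetical model conforming to an older metamodel that still had "Note".
model = [
    {"type": "Class", "name": "Order"},
    {"type": "Class", "name": "Customer"},
    {"type": "Note", "name": "TODO: rename"},
]
```

Here `count_classes(model)` collapses the whole model into a single integer, and `migrate(model, {"Note"})` returns a model from which the note cannot be recovered.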

The second misunderstanding is the classical “Well, you can do it but don’t expect me to like it” response. Unfortunately, in many real model transformation scenarios, you have to break semantics, and you probably need to enjoy it too. Consider a transformation scenario where we want to transform a very large model (e.g., consisting of several hundred thousand elements) conforming to a very large metamodel (like MARTE, AUTOSAR, SysML, etc.) into another very large model conforming to a different very large metamodel. Because we are good software engineers, we are likely to want to break this probably very large and complicated transformation problem into a number of smaller ones (see, for example, Jim Cordy’s excellent keynote at GPCE/SLE 2009 in Denver), which then need to be chained together. Each of the individual (smaller) transformations need not preserve semantics; indeed, some of the transformations may be to intermediate convenience languages that exist solely to make complex processing easier.
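
A chain of small transformations through an intermediate convenience representation might look like the following Python sketch. The two steps, the `chain` helper and the class-to-column scenario are assumptions made up for illustration; they are not drawn from the keynote or from any particular tool.

```python
from functools import reduce


def flatten(model):
    """Step 1: source model -> intermediate list of (owner, attribute) pairs."""
    return [(c["name"], a) for c in model for a in c["attributes"]]


def to_columns(pairs):
    """Step 2: intermediate pairs -> target 'table.column' names."""
    return [f"{owner.lower()}.{attr}" for owner, attr in pairs]


def chain(*steps):
    """Compose transformations left to right into a single transformation."""
    return lambda model: reduce(lambda m, step: step(m), steps, model)


classes_to_columns = chain(flatten, to_columns)

# Hypothetical source model: classes with attribute names.
source = [{"name": "Order", "attributes": ["id", "total"]}]
```

The intermediate list of pairs exists only to make the second step easy to write. It discards the grouping of attributes into classes, so the first step on its own is not semantics preserving, yet the chain as a whole does its job.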

8. MDE can’t possibly work for real systems engineering because it doesn’t work well in complex domains where there is domain uncertainty.

In systems engineering we often have to cope with domain uncertainty: we don’t fully understand the threats and risks associated with a domain until we have got a certain way along the path towards developing a system. If there is domain uncertainty then the modelling languages that have been chosen, and the operations that we apply to our models (e.g., transformations, model differencing, merging), are liable to change, and this becomes expensive and time-consuming to deal with. Domain uncertainty is a real problem for any systems engineering technique, whether it is model-based, code-based or otherwise. Domain uncertainty will always lead to change in systems engineering. The question is: does MDE make handling the change associated with domain uncertainty any worse? Perhaps it does. If you’re using domain-specific modelling languages, then changes will often result in modifications to your modelling languages (and thereafter corresponding changes to your model transformations, constraints, etc.). If you are using code throughout development, changes due to domain uncertainty will be reflected in changes to your architecture, detailed modular design, protocols, algorithms, etc. Arguably, these are problems of similar conceptual complexity. It’s hard to see how MDE makes things worse, or indeed better: it’s an essentially hard problem of systems engineering.

9. Metamodels never change

As we saw in the first misconception, MDE is not only about UML, but also about defining and using other modelling languages. However, when we (or you, or the OMG) design a modelling language, even a small one, we rarely get it right the first time. Or the fifth time. Or the ninth time. Like all forms of domain modelling, constructing a metamodel is difficult and requires consideration of many trade-offs. Language evolution is the norm, not the exception.

Despite this, we often encounter work that:

  • Does not consider or discuss tradeoffs made in language design. These kinds of papers often leave us wondering why a domain was modelled in a particular way (e.g. “why model X as a class, and Y as an association?” “why model with classes and associations at all?”).
  • Presents the product of language design, but not the process itself. How was the language designed? Did it arrive fully formed in the brain of a developer, or were there interesting stories and lessons to be learnt about its construction?
  • Proposes standardisation of domain X because “there is a metamodel.” A metamodel is often necessary for standardisation, but it is not sufficient. (For example, does your favourite transformation language implement all of the QVT specification? We bet it doesn’t, and shame on you, of course!)
  • Contributes extensions to — or changes to — existing languages with little regard for the impact of these changes on models, transformations or other artefacts. Even in UML specifications, the impact of language evolution is not made apparent: there are no clear migration paths from one version to another, as we discovered at the 2010 Transformation Tool Contest (see also the forum discussion on UML migration).

Misconceptions about language evolution might stem from the way in which we typically go about defining a modelling language with contemporary MDE tools. We normally begin by defining a metamodel/grammar, then construct models that use (conform to) that metamodel/grammar, and then write model transformations or other model management operations. The linearity in this workflow is reminiscent of Big Design Up Front, and evokes painful memories of waterfall processes for software development.

However, we have found that designing a modelling language, like many other software engineering activities, is often best achieved in an iterative and incremental manner. We are not alone in this observation. Several recent modelling and MDE workshops (XM, ME, FlexiTools) have included work on inferring metamodels/grammars from example models; relaxing the conformance relationship (typing) of metamodels; and propagating metamodel changes to models automatically and semi-automatically. These are promising first steps towards introducing incrementality and flexibility into our domain-specific modelling tools, but the underlying issue is rather more systemic. As a community, we need to acknowledge that changing metamodels are the norm, and to better prepare to embrace change.

10. Modelling ≠ Programming

There is a tendency in many papers that we read to put a brick wall between modelling and programming: to treat them as conceptually different things that can only be bridged via transformations (created by those magical wizards, the transformation engineers). We’ve seen this type of thing before, in the 1980s, with programming and specification languages in formal methods. Some specification languages like Z were perfectly useful for specifying and reasoning, but were difficult to use for transition to code. Wide-spectrum languages that unified programs and specifications in one linguistic framework (e.g., Carroll Morgan’s specification statements, Eric Hehner’s predicative programming, Ralph Back’s refinement calculus) did not have these difficulties. Treating models and programs in a unified framework, as artefacts that enable systems engineering, would seem to have conceptual and technical benefits, and would allow us to have fewer academic arguments about their differences (and more arguments down at the pub).

Well, we lied when we said there were only ten misconceptions.

11. MDE = MDA

We end with a real chestnut: that MDE is the same thing as MDA.

MDA first appeared via the OMG back in 2001. It is a set of standards — including MOF, CWM and UML — as well as a particular approach to systems development, where business and application logic are separated from platform technology — the infamous PIM/PSM separation. MDE is more general than MDA: it does not require use of MOF, UML or CWM, nor for platform-specific and platform-independent logic and concerns to be kept separate. MDE does require the construction, manipulation and management of well-defined and structured models — but you don’t have to make use of OMG standards, or a particular style of development to do it.

So, for you authors out there: when you say that you have an MDA-based approach, please be sure that you really mean it. Are you using MOF and UML? Are you reliant on a PIM/PSM separation? If so, great! Carry on! If not, please think again, and prevent us from complaining loudly and publicly on Twitter.

The End

We have to stop somewhere. These are just a few of the misconceptions, myths, and misunderstandings related to MDE we’ve encountered. Do send us your own!

About the Authors

Richard Paige is a professor at the University of York, and complains bitterly about everything MDE on Twitter (@richpaige). He also likes really bad films. His website is http://www.cs.york.ac.uk/~paige

Louis Rose (@louismrose) is a lecturer at the University of York. He wrangles Java into the Epsilon MDE platform, tortures undergraduate students with tales of enterprise architecture, and is regularly defeated at chess. His research interests include software evolution, MDE and — in collaboration with Richard — evaluating the effects of caffeine on unsuspecting research students. His website is http://www.cs.york.ac.uk/~louis
