2010/08/26

Ten Things I Hate About Object-Oriented Programming

Filed under: Editorial — Tags: — Oscar Nierstrasz @ 17:25

Boy, some days I really hate object-oriented programming.

Apparently I’m not the only one. In the immortal words of Edsger Dijkstra: “Object-oriented programming is an exceptionally bad idea which could only have originated in California.”

Well, I’m not normally one to complain, but I think it is time to step back and take a serious look at what is wrong with OOP. In this spirit, I have prepared a modest list of Ten Things I Hate About Object-Oriented Programming.

1. Paradigm

What is the object-oriented paradigm anyway? Can we get a straight story on this? I have heard so many different versions of this that I really don’t know myself what it is.

If we go back to the origins of Smalltalk, we encounter the mantra, “Everything is an object”. Except variables. And packages. And primitives. And numbers and classes are also not really objects, and so on. Clearly “Everything is an object” cannot be the essence of the paradigm.

What is fundamental to OOP? Peter Wegner once proposed that objects + classes + inheritance were essential to object-oriented languages [http://doi.acm.org/10.1145/38807.38823]. Every programming language, however, supports these features differently, and some do not even support them as built-in features at all, so that is also clearly not the paradigm of OOP.

Others argue convincingly that OOP is really about Encapsulation, Data Abstraction and Information Hiding. The problem is that some sources will tell you that these are just different words for the same concepts. Yet other sources tell us that the three are fundamentally different in subtle ways.

Since the mid-eighties, several myths have been propagated about OOP. One of these is the Myth of Reuse, which says that OOP makes you more productive because instead of developing your code from scratch, you can just inherit from existing code and extend it. Another is the Myth of Design, which implies that analysis, design and implementation follow seamlessly from one another because it’s objects all the way down. Obviously neither of these candidates could really be the OO paradigm.

Let’s look at other paradigms which offer a particular way to solve programming problems. Procedural programming is often described as programs = data + algorithms. Logic programming says programs = facts + rules. Functional programming might be programs = functions + functions. This suggests that OOP means programs = objects + messages. Nice try, but this misses the point, I think.

For me the point of OOP is that it isn’t a paradigm like procedural, logic or functional programming. Instead, OOP says “for every problem you should design your own paradigm”. In other words, the OO paradigm really is: Programming is Modeling

2. Object-Oriented Programming Languages

Another thing I hate is the way that everybody loves to hate the other guy’s programming language. We like to divide the world into curly brackets vs square brackets vs round brackets.

Here are some of the nice things that people have said about some of our favorite OOPLs:

“C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do, it blows away your whole leg.”

It was Bjarne Stroustrup who said that, so that’s ok, I guess.

“Actually I made up the term ‘object-oriented’, and I can tell you I did not have C++ in mind.” — Alan Kay

“There are only two things wrong with C++: The initial concept and the implementation.” — Bertrand Meyer

“Within C++, there is a much smaller and cleaner language struggling to get out.” — Bjarne Stroustrup

“C++ is history repeated as tragedy. Java is history repeated as farce.” — Scott McKay

“Java, the best argument for Smalltalk since C++.” — Frank Winkler

“If Java had true garbage collection, most programs would delete themselves upon execution.” — Robert Sewell

But perhaps the best blanket condemnation is the following:

“There are only two kinds of languages: the ones people complain about and the ones nobody uses.” — Bjarne Stroustrup

3. Classes

Classes drive me crazy. That might seem strange, so let me explain why.

Clearly classes should be great. Our brain excels at classifying everything around us. So it seems natural to classify everything in OO programs too.

However, in the real world, there are only objects. Classes exist only in our minds. Can you give me a single real-world example of a class that is a true, physical entity? No, I didn’t think so.

Now, here’s the problem. Have you ever considered why it is so much harder to understand OO programs than procedural ones?

Well, in procedural programs procedures call other procedures. Procedural source code shows us … procedures calling other procedures. That’s nice and easy, isn’t it?

In OO programs, objects send messages to other objects. OO source code shows us … classes inheriting from classes. Oops. There is a complete disconnect in OOP between the source code and the runtime entities. Our tools don’t help us because our IDEs show us classes, not objects.

I think that’s probably why Smalltalkers like to program in the debugger. The debugger lets us get our hands on the running objects and program them directly.

Here is my message for tool designers: please give us an IDE that shows us objects instead of classes!

4. Methods

To be fair, I hate methods too.

As we have all learned, methods in good OO programs should be short and sweet. Lots of little methods are good for development, understanding, reuse, and so on. Well, what’s the problem with that?

Well, consider that we actually spend more time reading OO code than writing it. This is what is known as productivity. Instead of spending many hours writing a lot of code to add some new functionality, we only have to write a few lines of code to get the new functionality in there, but we spend many hours trying to figure out which few lines of code to write!

One of the reasons it takes us so long is that we spend much of our time bouncing back and forth between … lots of little methods.

This is sometimes known as the Lost in Space syndrome. It has been reported since the early days of OOP. To quote Adele Goldberg, “In Smalltalk, everything happens somewhere else.”

I believe that the code-oriented view of today’s IDEs is largely to blame — given that OO code does not accurately reflect the running application, the IDE gets in our way instead of helping us to bridge the gap. Another reason I believe that Smalltalkers like to develop in the debugger is that it lets them clearly see which objects are communicating with which other objects. I am guessing that one of the reasons that Test-Driven Development is popular is that it also exposes object interactions during development.

It is not OOP that is broken — we just haven’t figured out (after over 40 years) how best to develop with it. We need to ask ourselves: Why should the source code be the dominant view in the IDE?

I want an IDE that lets me jump from the running application to the code and back again. (For a demonstration of this idea, have a look at the Seaside web development platform which allows you to navigate directly from a running web application to the editable source code. [http://seaside.st])

5. Types

OK, I admit it. I am an impatient guy, and I hate having to say everything twice. Types force me to do that.

I’m sure some of you are thinking — “Oh, how could you program in an untyped language. You could never be sure your code is correct.”

Of course there is no such thing as an “untyped” programming language — there are just statically and dynamically typed ones. Static types just prevent you from writing certain kinds of code. There is nothing wrong with that, in principle.

There are several problems, however, with types as we know them. First of all they tend to lead to a false sense of security. Just because your Java program compiles does not mean it has no errors (even type errors).

Second of all, and much more evil, is that type systems assume the world is consistent, but it isn’t! This makes it harder to write certain useful kinds of programs (especially reflective ones). Type systems cannot deal well with the fact that programs change, and that different bits of complex systems may not be consistent.

Finally, type systems don’t cope well with the fact that there are different useful notions of types. There is no one type system to rule them all. Recall the pain we experienced to extend Java with generics. These days there are many interesting and useful type systems being developed, but we cannot extend Java to accommodate them all. Gilad Bracha has proposed that type systems should not only be optional, in the sense that we should be able to run programs even if the type system is unhappy, but that they should be pluggable, meaning that we can plug multiple type systems into different parts of our programs. [http://bracha.org/pluggableTypesPosition.pdf] We need to take this proposal seriously and explore how our languages and development tools can be more easily adapted to diverse type systems.

6. Change

“Change is inevitable — except from a vending machine.” — Robert C. Gallagher

We all hate change, right? So, if everyone hates change, why do we all complain when things don’t get better? We know that useful programs must change, or they degrade over time.

(Incidentally, you know the difference between hardware and software? Hardware degrades if you don’t maintain it.)

Given that real programs must change, you would think that languages and their IDEs would support this. I challenge you, however, to name a single programming language mechanism that supports change. Those mechanisms that do deal with change restrict and control it rather than enable it.

The world is not consistent, but we can cope with that just fine. Context is a great tool for managing change and inconsistency. We are perfectly comfortable adapting our expectations and our behavior in our daily lives depending on the context in which we find ourselves, but the programs we write break immediately if their context changes.

I want to see context as a first-class concept in OO languages and IDEs. Both source code and running software should be able to adapt to changing context. I believe that many design patterns and idioms (such as visitors, and dependency injection) are simply artifacts of the lack of support for context, and would disappear if context were available as a first-class construct.

7. Design Patterns

Patterns. Can’t live with ’em, can’t live without ’em.

Every single design pattern makes your design more complicated.

Visitors. I rest my case.

8. Methodologies

“All methodologies are based on fear.” — Kent Beck

Evidently some of my students follow the Chuck Norris school of Agile Development:

“Chuck Norris pairs alone.”

“Chuck Norris doesn’t do iterative development. It’s right the first time, every time.”

“Chuck Norris doesn’t do documentation. He stares down the code until it tells him everything he wants to know.”

9. UML

Bertrand Meyer tells a story about long wondering why diagrammatic modeling languages were so popular, until one day it hit him: “Bubbles don’t crash.” I believe his point is that OO languages are modeling languages. (AKA “All you need is code”)

There similarly appears to be something fundamentally wrong with model-driven development as it is usually understood — instead of generating code from models, the model should be the code.

By analogy, when FORTRAN was invented, it was sold as a high-level language from which source code would be generated. Nowadays we think of the high-level languages as being the source code.

I like to think that one day, when we grow up, perhaps we will think of the model as being the source code.

10. The Next New Thing

Finally, I hate the catchphrase: “Objects are not enough. We need …” Over the years we have needed frameworks, components, aspects, services (which, curiously, seems to bring us back to procedural programming!).

Given the fact that objects clearly never were enough, isn’t it odd that they have served us so well over all these years?

Conclusion?

25 years ago we did not expect object-oriented programming to last as a “new” phenomenon for so long. We thought that OO conferences like ECOOP, OOPSLA and TOOLS would last for 4 or 5 years and then fade into the mainstream. It is too soon to dismiss OOP as just being part of the mainstream. Obviously we cannot feel passionately about something that does not interest us. The fact that academic and industrial research is still continuing suggests that there is something deep and important going on that we do not yet fully understand.

OOP is about taming complexity through modeling, but we have not mastered this yet, possibly because we have difficulty distinguishing real and accidental complexity.

I believe that to make further progress we must focus on change and how OOP can facilitate change. After all these years, we are still in the early days of OOP and understanding what it has to offer us.

Oscar Nierstrasz
[Banquet speech given at ECOOP 2010. Maribor, June 24, 2010]

2010/08/01

The Trouble with Configuration Management

Filed under: Column — John McGregor @ 02:19

The trouble with configuration management in some large technical organizations is that it is not just configuration management. An organization often assigns to a single role responsibility for software builds, configuration management (which often includes change management), and releases of products to customers. While these responsibilities are mutually dependent they are distinct roles, have different goals, and require very different skills. When we evaluate a product line organization this role is often identified as a source of problems. One or more of the dimensions is either neglected or incorrectly executed by the personnel assigned to the position.

There are several issues related to this approach. People in this integrated role are often not familiar with how software development will be carried out on a specific project and as a result they will often design a repository structure that does not permit, or at least hinders, parallel development. This has to do with the grain size of each versioned blob and the dependencies among blobs. A poor structure results in the possibility of two people needing to work on the same file at the same time. Developers who are unaware of the structure of the repository, or who do not understand how that structure relates to the structure of the products, will not be able to write build scripts that are sufficiently modular to be reused at every level up the aggregation hierarchy, and may codify dependencies incorrectly, resulting in an improperly linked module.

The CM staff will also define one process that is applied to both development activities and delivery activities. While this makes the release part of the CM job simpler, it slows down the development process which needs quick, frequent access to repositories to commit, update, and retrieve program elements. Some companies address this by having separate development and release CM processes without sufficiently differentiating between the processes.

First I will talk about the goals of each role, then the required skills, and finally dependencies among the roles. I will describe a company doing it right and will briefly describe how they perform these three functions.

Goals

Each of these roles has a specific goal that distinguishes it from the other roles.

The goal of the build role is to provide an executable that reflects the current development state of the product. Often legacy modules are not recompiled during a build. This is sometimes because of the time it would take to build the module but more often it is because there is a risk that the module will not build in the current environment. The result of partial recompilation can be that other modules that have changed and should be rebuilt aren’t.  Building is a sufficiently complex task that many organizations require an independent auditor to witness the actual building to affirm that the correct modules were used to produce an executable with no errors.

The goal of the (real) configuration management role is to allow controlled changes to the source code by multiple people at the same time while protecting the integrity of previous work. This sounds simple, but in an environment such as a software product line it is not so simple. The structure of the repository of versions of the code facilitates specific development patterns more than others. The CM role must anticipate the eventual complexity of a development effort that will involve multiple players for multiple products. In a software product line the emphasis is on multiple references from multiple products to single copies of assets.

The goal of release management is to deliver to the customer a complete and consistent set of modules and resources that represent a correct product at a specific point in time. The product release and related resources such as the installation mechanism must be tested and the release manager determines whether particular features meet quality standards. Each release represents a build that has a permanent life in the CM system.
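The risk of partial recompilation described under the build role can be made concrete with a small sketch. The module names and the dependency table below are hypothetical, and real build tools track far more than this; the only point is that once one module changes, everything that depends on it (directly or indirectly) is stale, including any legacy module the build chooses to skip.

```python
from collections import defaultdict, deque


def modules_to_rebuild(depends_on, changed):
    """Return every module that is stale once the modules in `changed` are modified."""
    # Invert the graph: for each module, record which modules depend on it.
    dependents = defaultdict(set)
    for module, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(module)

    # Breadth-first walk from the changed modules up through their dependents.
    stale = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for user in dependents[current]:
            if user not in stale:
                stale.add(user)
                queue.append(user)
    return stale


# Hypothetical product layout, for illustration only.
DEPENDS_ON = {
    "app": ["ui", "legacy_codec"],
    "ui": ["core"],
    "legacy_codec": ["core"],
    "core": [],
}

stale = modules_to_rebuild(DEPENDS_ON, ["core"])
skipped = {"legacy_codec"}  # a legacy module the build chooses not to recompile
print("must rebuild:", sorted(stale))
print("stale but skipped:", sorted(stale & skipped))
```

Any module that appears in the second list ends up in the executable in a state that no longer matches the code it depends on.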

Skills

The build role needs an understanding of the product architecture, compile and link time variation points, and, obviously, the build tools. The build manager assures that the final build and installation package includes all the required resources including appropriate licenses. In a software product line, the build script for a product specifies a configuration of the product with compile and link time variations resolved. The script is placed under management because it captures the variant choices. The build role provides build automation to the development and test staff. In many projects every commit triggers a compile and test run before the commit is confirmed. If every developer commits every day the organization has “always running” code. I have developed for many years using this style and it is a very efficient approach, but it is only feasible if the builds and tests are automatic.

The configuration management role requires knowledge of design, variation mechanisms, and the development process. The structure of the repository must support the approach taken to development. Some parallelism in work is required. The CM process should support concurrent work in a baseline. One way this happens is by having assigned ownership of assets so that only one person does work in a specific asset, or at least by having a separate trunk for each asset with only one person allowed in the asset at a time. Newer build tools allow logical references to files, removing the need to physically copy code, but many organizations still do copy and are then faced with a diverging code base.

Release management requires an individual who understands the development process, the needs of the customer, the install time variations, and the paperwork required to support a release.  In many companies the release manager is responsible for negotiating with the customer and development team for what will be ready for delivery in the next release and by what date. The manager determines that every module in the release has been through the full life cycle and is ready for release. The release manager is also responsible for assembling the installation package. Automation is critical to be able to exactly repeat a build and to quickly get ready for the next build.

Every project should establish and maintain a regular schedule of releases. This steady rhythm will establish a positive expectation on the part of clients. Many large open source projects release nightly builds, which are less well tested than the periodic stable builds. Software product line organizations often make both core asset and product releases on a regular schedule. Eclipse creates a stable, deliverable build for a large percentage of its projects twice a year with nightly builds in between. While the rhythms may vary from one organization to another, this is still a useful approach.

Dependencies

The three roles all manipulate the source code for the products. Essentially CM manages a set of files, some subset of which are selected to be built together, and the result of that build is the key element in a release. While the build process could take place with almost any physical organization of the code, the CM role can facilitate the build process through the structures that are chosen. An organization of the code that reflects some logical structure can reduce the chances of build errors. Clear labeling of versions allows for the automatic generation of new build scripts. The Build role needs to understand the content of a release and know any specific constraints for resolving any compile/link time variations.

Figure 1 Interdependencies

CM needs to know logical dependencies among the source files. The CM role captures a complete snapshot of its database for each release. This does not have to be a separate copy. It can be a versioned script that specifies a specific version for every element in the release. This goes beyond the scope of a “build” which is limited to the elements needed to compile and link the code to produce an executable. The copy includes all tests, test results, the development environment, and more.

The release role needs to know the exact contents of a release, which means the contents of one of many builds and the structure of the repository. If build scripts are managed in the repository by the CM function then the release role can retrieve a specific one at any time.

A Modest Proposal

Much of this confusion can be clarified by automating the build process and treating the build script, which may be more than a single text file, as a first-class object. In fact, the build script should be thought of as the product. The script is placed under management and is versioned as changes are made to the contents of the product. The CM role creates a structure for the product that will include the build script and any product unique code. The build role creates the script and submits it to the configuration management system. The release of a product simply requires that the build script be baselined, that references into the core asset base be changed from relative to absolute addresses, and that a separate branch be created to assemble the elements of the product.

Even if the three roles are assigned to a single person, this process is simple enough to be handled by that person. Many tools such as Eclipse and Visual Studio support this approach with integrated build tools. Eclipse has several projects such as Buckminster and b3 that manage the build process (and much more) and consist of artifacts that can be managed in the configuration management system.
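As a rough sketch of what baselining a build script might look like, assume the script is nothing more than a table mapping each asset to a version, where a floating reference is written with a marker such as "latest". That representation, the marker, and all asset names and version numbers below are assumptions made for illustration; they do not describe any particular tool.

```python
LATEST = "latest"  # hypothetical marker for a floating (relative) reference


def baseline(build_script, current_versions):
    """Return a copy of the script with every floating reference pinned."""
    pinned = {}
    for asset, version in build_script.items():
        if version == LATEST:
            # Replace the floating reference with the version current right now.
            # Raises KeyError if the repository does not know the asset.
            version = current_versions[asset]
        pinned[asset] = version
    return pinned


# Development script: core assets float, product-unique code is already fixed.
dev_script = {
    "core/logging": LATEST,
    "core/codec": LATEST,
    "product/ui": "1.4.0",
}
# What the repository reports as current at release time (made-up numbers).
repo_versions = {"core/logging": "2.3.1", "core/codec": "5.0.2"}

release_script = baseline(dev_script, repo_versions)
print(release_script)
```

The development script is left untouched, so day-to-day work keeps following the latest core assets while the baselined copy, committed on its own branch, can reproduce the release exactly.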

Case in Point

My example company will have to remain anonymous but it is a major software development house specializing in a single domain but producing a software product line of consumer and professional products in that domain. This company separates the build, CM, and release roles and is large enough to have several people in each role.

Build – Building a product begins with builds of individual executables. These builds are initiated by development team members.  The owner of a module develops a build script for that module. Successive layers of integration also produce build scripts that utilize the lower level scripts to build from the ground up. Finally each product has a script that automates the instantiation of the lower level pieces. Since a fundamental responsibility of the build role is to ensure that the correct code is compiled and linked, this bottom up aggregation of scripts provides a natural traceability mechanism.

CM – The configuration management is largely automated at the start of a project. The structure of the core asset base has been fixed for some time and each new product sets up to utilize that base in their scheme. The developer sets up a directory structure for their code and includes references to each core asset used rather than copying the asset into their project. The CM role establishes and evolves the basic structure and is responsible for “baselining” deliveries, which only requires creating a branch for the build scripts and converting the variable references in the script, which point to the latest versions of assets, into fixed references to specific versions of those assets.

Release management – The role of release manager carries both responsibility and authority.  Once a delivery date is set and the scope of the release is negotiated with all development streams, the release manager has the authority to remove a feature from a product if including that feature puts the delivery date at risk.   The manager also has the responsibility to ensure that any included features have been appropriately tested and have satisfied quality requirements. The release itself is based on the baselined build scripts that have been used from initial development through testing and now to release.
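The bottom-up aggregation of build scripts described under Build can be sketched as a small tree in which each script first runs the scripts of its lower-level parts and then its own step. The class, its fields, and the module names are hypothetical; the sketch records what would be built rather than invoking a compiler, and it makes no attempt to avoid building a shared part twice.

```python
from dataclasses import dataclass, field


@dataclass
class BuildScript:
    name: str
    parts: list = field(default_factory=list)  # lower-level BuildScript objects

    def build(self, trace=None, depth=0):
        """Build lower-level parts first, then this level; return the trace."""
        if trace is None:
            trace = []
        for part in self.parts:
            part.build(trace, depth + 1)
        # A real script would compile and link here; the sketch only records it.
        trace.append((depth, self.name))
        return trace


# Module owners write the leaf scripts; each integration layer reuses them.
parser = BuildScript("parser")
renderer = BuildScript("renderer")
engine = BuildScript("engine", [parser, renderer])
product = BuildScript("consumer_edition", [engine])

for depth, name in product.build():
    print("  " * depth + name)
```

The returned trace is the traceability record: it lists, in build order, exactly which lower-level pieces went into the product.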

Summary

These infrastructure roles are important to the success of any software development effort but they are particularly critical in a software product line organization. The role definitions and training for each role need to be sharply focused and we need to push for trained personnel in each role. In the event that two or more of these roles are assigned to the same person, controls should be put in place to ensure even-handed and competent treatment in each of the three areas.

Running through this entire discussion is the notion of automation. There are many tools that can be used to make building, change management, and product release much more repeatable and robust. Most of these tools support a variety of process structures and styles but most also make the lines between the three roles much more discernible.
