Archive for the ‘Software process’ Category.

Understanding and assessing Agile: free ACM webinar next Wednesday

ACM is offering this coming Wednesday a one-hour webinar entitled Agile Methods: The Good, the Hype and the Ugly. It will air on February 18 at 1 PM New York time (10 AM West Coast, 18 London, 19 Paris, see here for more cities). The event is free and the registration link is here.

The presentation is based on my recent book with an almost identical title [1]. It will be a general discussion of agile methods, analyzing both their impressive contributions to software engineering and their excesses, some of them truly damaging. It is often hard to separate the beneficial from the indifferent and the plain harmful, because most of the existing presentations are of the hagiographical kind, gushing in admiration of the sacred word. A bit of critical distance does not hurt.

As you can see from the Amazon page, the first readers (apart from a few dissenters, not a surprise for such a charged topic) have relished this unprejudiced, no-nonsense approach to the presentation of agile methods.

Another characteristic of the standard agile literature is that it exaggerates the contrast with classic software engineering. This slightly adolescent attitude is not helpful; in reality, many of the best agile ideas are the direct continuation of the best classic ideas, even when they correct or adapt them, a normal phenomenon in technology evolution. In the book I tried to re-place agile ideas in this long-term context, and the same spirit will also guide the webinar. Ideological debates are of little interest to software practitioners; what they need to know is what works and what does not.


[1] Bertrand Meyer, Agile! The Good, the Hype and the Ugly, Springer, 2014, see Amazon page here, publisher’s page here and my own book page here.

VN:F [1.9.10_1130]
Rating: 8.8/10 (13 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Awareness and merge conflicts in distributed development (new paper)

Actually not that new: this paper [1] was published in August of last year. It is part of Christian Estler’s work for this PhD thesis, defended a few weeks ago, and was pursued in collaboration with Martin Nordio and Carlo Furia. It received the best paper award at the International Conference on Global Software Engineering; in fact this was the third time in a row that this group received the ICGSE award, so it must have learned a few things about collaborative development.

The topic is an issue that affects almost all software teams: how to make sure that people are aware of each other’s changes to a shared software base, in particular to avoid the dreaded case of a merge conflict: you and I are working on the same piece of code, but we find out too late, and we have to undergo the painful process of reconciling our conflicting changes.

The paper builds once again on the experience of our long-running “Distributed and Outsourced Software Engineering” course project, where students from geographically spread universities collaborate on a software development [2]. It relies on data from 105 student developers making up twelve development teams located in different countries.

The usual reservations about using data from students apply, but the project is substantial and the conditions not entirely different from those of an industrial development.

The study measured the frequency and impact of merge conflicts, the effect of insufficient awareness (no one told me that you are working on the same module that I am currently modifying) and the consequences for the project: timeliness, developer morale, productivity.

Among the results: distribution does not matter that much (people are not necessarily better informed about their local co-workers’ developments than about remote collaborators); lack of awareness occurs more often than merge conflicts, and causes more damage.



[1] H-Christian Estler, Martin Nordio, Carlo A. Furia and Bertrand Meyer: Awareness and Merge Conflicts in Distributed Software Development, in proceedings of ICGSE 2014, 9th International Conference on Global Software Engineering, Shanghai, 18-21 August 2014, IEEE Computer Society Press (best paper award), see here.

[2] Distributed and Outsourced Software Engineering course and project, see here. (The text mentions “DOSE 2013” but the concepts remains applicable and it will be updated.)

VN:F [1.9.10_1130]
Rating: 9.8/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

The Eiffel Documentation Drive

EiffelStudio releases are semi-annual, end of May and end of November. Release 14-05 just came out. The next release (14-11) is entirely devoted to documentation. We are hoping for extensive community involvement in this first-time Eiffel Documentation Drive.

Many people regularly comment that there is not enough Eiffel and EiffelStudio documentation, and some of what exists is not good enough. We have decided to tackle the problem seriously, hence the dedication of an entire release cycle to documentation. The term is taken here in a broad sense: “documentation” means what is at, but also everything else that can help understand Eiffel, for example updating Wikipedia entries on topics for which Eiffel has something to offer.

Anyone with an understanding of an Eiffel-related topic can help. We particularly need help from two (non-disjoint) categories of contributors

  • Those with a good understanding of one or more Eiffel-related topics.
  • Those with good writing skills.

The process will involve reviewing, so if you are an Eiffelist with moderate taste for writing, or a good writer with incomplete knowledge of Eiffel, we need your help anyway; someone else will compensate for the missing side. In particular, a common criticism is that some of the documentation was written by developers who do not have English as their mother tongue; if you can help improve it everyone will benefit. Of course if you are good at both technology and writing it’s even better.

We are mentioning English because it is the first target, but documentation in other languages, either original or a translation of existing English pages, is needed too.

Here is how the Eiffel Documentation Drive works:

  • Here you will find a form to report missing or unsatisfactory documentation. Please fill it on every applicable occasion.
  • The entries will be read by a member of the Eiffel Software team, who in applicable cases will add a row to the Eiffel Documentation Drive spreadsheet here. You can not only read that spreadsheet but also edit it yourself, so as to keep it as accurate and up-to-date as possible.
  • An email will be sent to the user list, with “Eiffel Documentation Drive” in the header (so that people not interested in the topic can filter them out), requesting help.
  • Those willing to help can enter their names in the corresponding row, indicating a planned date of completion.

Each row includes among its fields the following: topic, link to existing documentation, volunteer writer(s), planned completion, volunteer reviewer(s).

The full Eiffel Software team will participate – as noted above, improving the documentation is the strategic goal for the release – but we hope for considerable community participation. Please help make EiffelStudio documentation shine as much as the environment itself.

VN:F [1.9.10_1130]
Rating: 9.0/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Accurately Analyzing Agility

Book announcement:

Agile! The Good, the Hype and the Ugly
Bertrand Meyer
Springer, 2014 (just appeared)
Book page: here.
Amazon page: here.
Publisher’s page: here

A few years ago I became fascinated with agile methods: with the unique insights they include; with the obvious exaggerations and plainly wrong advice they also promote; and perhaps most of all with the constant intermingling of these two extremes.

I decided to play the game seriously: I read a good part of the agile literature, including all the important books; I sang the song, became a proud certified Scrum Master; I applied many agile techniques in my own work.

The book mentioned above is the result of that study and experience. It is both a tutorial and a critique.

The tutorial component was, I felt, badly needed. Most of the agile presentations I have seen are partisan texts, exhorting you to genuflect and adopt some agile method as the secret to a better life. Such preaching has a role but professionals know there is no magic in software development.  Agile! describes the key agile ideas objectively, concretely, and as clearly as I could present them. It does not introduce them in a vacuum, like the many agile books that pretend software engineering did not exist before (except for a repulsive idea, the dreaded “waterfall”). Instead, it relates them to many other concepts and results of software engineering, to which they bring their own additions and improvements.

Unfortunately, not all the additions are improvements. Up to now, the field has largely been left (with the exception of Boehm’s and Turner’s 2005 “Guide for the Perplexed“) to propaganda pieces and adoring endorsements. I felt that software developers would benefit more from a reasoned critical analysis. All the more so that agile methods are a remarkable mix of the best and the worst; the book carefully weeds out — in the terminology of the title — the ugly from the hype and the truly good.

Software developers and managers need to know about the “ugly”: awful agile advice that is guaranteed to harm your project. The “hype” covers ideas that have been widely advertised as shining agile contributions but have little relevance to the core goals of software development. The reason it was so critical to identify agile ideas belonging to these two categories is that they detract from the “good”, some of it remarkably good. I would not have devoted a good part of the last five years to studying agile methods if I did not feel they included major contributions to software engineering. I also found that some of these contributions do not get, in the agile literature itself, the explanations and exposure they deserve; I made sure they got their due in the book. An example is the “closed-window rule”, a simple but truly brilliant idea, of immediate benefit to any project.

Software methodology is a difficult topic, on which we still have a lot to learn. I expect some healthy discussions, but I hope readers will appreciate the opportunity to discuss agile ideas in depth for the greater benefit of quality software development.

I also made a point of writing a book that (unlike my last two) is short: 190 pages, including preface, index and everything else.

The table of contents follows; more details and sample chapters can be found on the book page listed above.

     1.1 VALUES
          Organizational principles
          Technical principles
     1.3 ROLES
     1.4 PRACTICES
          Organizational practices
          Technical practices
     1.5 ARTIFACTS
          Virtual artifacts
          Material artifacts
          Not new and not good
          New and not good
          Not new but good
          New and good!

          Proof by anecdote
          When writing beats speaking
          Discovering the gems
          Agile texts: reader beware!
          Proof by anecdote
          Slander by association
          Unverifiable claims
          Postscript: you have been ill-served by the software industry!

          Requirements engineering techniques
          Agile criticism of upfront requirements
          The waste criticism
          The change criticism
          The domain and the machine
          Is design separate from implementation?
          Agile methods and design
          CMMI in plain English
          The Personal Software Process
          CMMI/PSP and agile methods
          An agile maturity scale

     4.3 A USABLE LIST
          Put the customer at the center
          Let the team self-organize
          Maintain a sustainable pace
          Develop minimal software
          Accept change
          Develop iteratively
          Treat tests as a key resource
          Do not start any new development until all tests pass
          Test first
          Express requirements through scenarios

     5.1 MANAGER
     5.3 TEAM
     5.5 CUSTOMER

     6.1 SPRINT
          Sprint basics
          The closed-window rule
          Sprint: an assessment
     6.6 OPEN SPACE
          The code ownership debate
          Collective ownership and cross-functionality

          Pair programming concepts
          Pair programming versus mentoring
          Mob programming
          Pair programming: an assessment
          The refactoring concept
          Benefits and limits of refactoring
          Incidental and essential changes
          Combining a priori and a posteriori approaches
          The TDD method of software development
          An assessment of TFD and TDD

     8.1 CODE
     8.2 TESTS
     8.5 VELOCITY
     8.12 IMPEDIMENT

          The fox and the hedgehog
          Lean Software’s Big Idea
          Lean Software’s principles
          Lean Software: an assessment
          XP’s Big Idea
          XP: the unadulterated source
          Key XP techniques
          Extreme Programming: an assessment
     9.4 SCRUM
          Scrum’s Big Idea
          Key Scrum practices
          Scrum: an assessment
     9.5 CRYSTAL
          Crystal’s Big Idea
          Crystal principles
          Crystal: an assessment


          Deprecation of upfront tasks
          User stories as a basis for requirements
          Feature-based development and ignorance of dependencies
          Rejection of dependency tracking tools
          Rejection of traditional manager tasks
          Rejection of upfront generalization
          Embedded customer
          Coach as a separate role
          Test-driven development
          Deprecation of documents
     11.2 THE HYPED
     11.3 THE GOOD


VN:F [1.9.10_1130]
Rating: 8.4/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

New article: contracts in practice

For almost anyone programming in Eiffel, contracts are just a standard part of daily life; Patrice Chalin’s pioneering study of a few years ago [1] confirmed this impression. A larger empirical study is now available to understand how developers actually use contracts when available. The study, to published at FM 2014 [2] covers 21 programs, not just in Eiffel but also in JML and in Code Contracts for C#, totaling 830,000 lines of code, and following the program’s revision history for a grand total of 260 million lines of code over 7700 revisions. It analyzes in detail whether programmers use contracts, how they use them (in particular, which kinds, among preconditions, postconditions and invariants), how contracts evolve over time, and how inheritance interacts with contracts.

The paper is easy to read so I will refer you to it for the detailed conclusions, but one thing is clear: anyone who thinks contracts are for special development or special developers is completely off-track. In an environment supporting contracts, especially as a native part of the language, programmers understand their benefits and apply them as a matter of course.


[1] Patrice Chalin: Are practitioners writing contracts?, in Fault-Tolerant System, eds. Butler, Jones, Romanovsky, Troubitsyna, Springer LNCS, vol. 4157, pp. 100–113, 2006.

[2] H.-Christian Estler, Carlo A. Furia, Martin Nordio, Marco Piccioni and Bertrand Meyer: Contracts in Practice, to appear in proceedings of 19th International Symposium on Formal Methods (FM 2014), Singapore, May 2014, draft available here.

VN:F [1.9.10_1130]
Rating: 8.4/10 (11 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 6 votes)

The biggest software-induced disaster ever


In spite of the brouhaha surrounding the Affordable Care Act, the US administration and its partisans seem convinced that “the Web site problems will be fixed”.

That is doubtful. All reports suggest that the problem is not to replace a checkbox by a menu, or buy a few more servers. The analysis, design and implementation are wrong, and the sites will not work properly any time soon.

Barring sabotage (for which we have seen no evidence), this can only be the result of incompetence. An insurance exchange? Come on. Any half-awake group of developers could program it over breakfast.

Who chose the contractors?

When the problems first surfaced a few weeks ago, anyone with experience and guts would have done the right thing: fire all the companies responsible for  the mess and start from scratch with a dedicated, competent and well-managed team.

The latest promises published are that by the end of the month “four out of five” of the people trying to register will manage to do it. Nice. Imagine that when trying to make a purchase at Amazon you would succeed 80% of the time.

And that is only an optimistic goal.

The people building the site do not have infinite time. In fact, the process is crucially time-driven: if people do not get health coverage in time, they will be fined. But what if they cannot get coverage because the Web sites do not respond, or mess up?

Consider for a second another example of another strictly time-driven project: on January 1, 2002, twelve countries switched to a common currency, with the provision that their current legal tender would lose its status only a bare two months later. The IT infrastructure had to work on the appointed day. It did. How come Europe could implement the Euro in time and the US cannot get a basic health exchange to work?

Here is a possible scenario: the sites do not work (cannot handle the load, give inconsistent results). A massive wave of protests ensues, boosted by those who were against universal health coverage in the first place. Faced with popular revolt and with the evidence, the administration announces that the implementation of the universal mandate — the enforcement of the fines — is delayed by a year. In a year much can happen; opposition grows and the first exchanges are an economic disaster since the “young healthy adults” feel no pressure to enroll. The law fades into oblivion. Americans do not get universal health care for another generation. Show me it is not going to happen.

The software engineering lessons here are clear: hire competent companies; faced with a complicated system, implement the essential functions first, but stress-test them; deploy step by step, with the assurance that whatever is deployed works.

The exact reverse strategy was applied. As a result, we face the prospect of a software disaster that will dwarf Y2K and other famous mishaps; a disaster that software engineering textbooks will feature for decades to come.


VN:F [1.9.10_1130]
Rating: 9.1/10 (17 votes cast)
VN:F [1.9.10_1130]
Rating: +6 (from 10 votes)

The laws of branching (part 2): Tichy and Joy

Recently I mentioned the first law of branching (see earlier article) to Walter Tichy, famed creator of RCS, the system that established modern configuration management. He replied with the following anecdote, which is worth reproducing in its entirety (in his own words):

I started work on RCS in 1980, because I needed an alternative for SCCS, for which the license cost would have been prohibitive. Also, I wanted to experiment with reverse deltas. With reverse deltas, checking out the latest version is fast, because it is stored intact. For older ones, RCS applied backward deltas. So the older revisions took longer to extract, but that was OK, because most accesses are to the newest revision anyway.

At first, I didn’t know how to handle branches in this scheme. Storing each branch tip in full seemed like a waste. So I simply left out the branches.

It didn’t take long an people were using RCS. Bill Joy, who was at Berkeley at the time and working on Berkeley Unix, got interested. He gave me several hints about unpleasant features of SCCS that I should correct. For instance, SCCS didn’t handle identification keywords properly under certain circumstances, the locking scheme was awkward, and the commands too. I figured out a way to solve these issue. Bill was actually my toughest critic! When I was done with all the modifications, Bill cam back and said that he was not going to use RCS unless I put in branches. So I figured out a way. In order to reconstruct a branch tip, you start with the latest version on the main trunk, apply backwards deltas up to the branch point, and then apply forward deltas out to the branch tip. I also implemented a numbering scheme for branches that is extensible.

When discussing the solution, Bill asked me whether this scheme meant that it would take longer to check in and out on branches. I had to admit that this was true. With the machines at that time (VAXen) efficiency was important. He thought about this for a moment and then said that that was actually great. It would discourage programmers from using branches! He felt they were a necessary evil.

VN:F [1.9.10_1130]
Rating: 5.8/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: -2 (from 4 votes)

New course partners sought: a DOSE of software engineering education


Since 2007 we have conducted, as part of a course at ETH, the DOSE project, Distributed and Outsourced Software Engineering, developed by cooperating student teams from a dozen universities around the world. We are finalizing the plans for the next edition, October to December 2013, and will be happy to welcome a few more universities.

The project consists of building a significant software system collaboratively, using techniques of distributed software development. Each university contributes a number of “teams”, typically of two or three students each; then “groups”, each made up of three teams from different universities, produce a version of the project.

The project’s theme has varied from year to year, often involving games. We make sure that the development naturally divides into three subsystems or “clusters”, so that each group can quickly distribute the work among its teams. An example of division into clusters, for a game project, is: game logic; database and player management; user interface. The page that describes the setup in more detail [1] has links enabling you to see the results of some of the best systems developed by students in recent years.

The project is a challenge. Students are in different time zones, have various backgrounds (although there are minimum common requirements [1]), different mother tongues (English is the working language of the project). Distributed development is always hard, and is harder in the time-constrained context of a university course. (In industry, while we do not like that a project’s schedule slips, we can often survive if it does; in a university, when the semester ends, we have to give students a grade and they go away!) It is typical, after the initial elation of meeting new student colleagues from exotic places has subsided and the reality of interaction sets in, that some groups will after a month, just before the first or second deadline, start to panic — then take matters into their own hands and produce an impressive result. Students invariably tell us that they learn a lot through the course; it is a great opportunity to practice the principles of modern software engineering and to get prepared for the realities of today’s developments in industry, which are in general distributed.

For instructors interested in software engineering research, the project is also a great way to study issues of distributed development in  a controlled setting; the already long list of publications arising from studies performed in earlier iterations [3-9] suggests the wealth of available possibilities.

Although the 2013 project already has about as many participating universities as in previous years, we are always happy to consider new partners. In particular it would be great to include some from North America. Please read the requirements on participating universities given in [1]; managing such a complex process is a challenge in itself (as one can easily guess) and all teaching teams must share goals and commitment.


[1] General description of DOSE, available here.

[2] Bertrand Meyer: Offshore Development: The Unspoken Revolution in Software Engineering, in Computer (IEEE), January 2006, pages 124, 122-123, available here.

[3] Bertrand Meyer and Marco Piccioni: The Allure and Risks of a Deployable Software Engineering Project: Experiences with Both Local and Distributed Development, in Proceedings of IEEE Conference on Software Engineering & Training (CSEE&T), Charleston (South Carolina), 14-17 April 2008, available here.

[4] Martin Nordio, Roman Mitin, Bertrand Meyer, Carlo Ghezzi, Elisabetta Di Nitto and Giordano Tamburelli: The Role of Contracts in Distributed Development, in Proceedings of Software Engineering Advances For Offshore and Outsourced Development, Lecture Notes in Business Information Processing 35, Springer-Verlag, 2009, available here.

[5] Martin Nordio, Roman Mitin and Bertrand Meyer: Advanced Hands-on Training for Distributed and Outsourced Software Engineering, in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering – Volume 1, ACM. 2010 available here.

[6] Martin Nordio, Carlo Ghezzi, Bertrand Meyer, Elisabetta Di Nitto, Giordano Tamburrelli, Julian Tschannen, Nazareno Aguirre and Vidya Kulkarni: Teaching Software Engineering using Globally Distributed Projects: the DOSE course, in Collaborative Teaching of Globally Distributed Software Development – Community Building Workshop (CTGDSD — an ICSE workshop), ACM, 2011, available here.

[7] Martin Nordio, H.-Christian Estler, Bertrand Meyer, Julian Tschannen, Carlo Ghezzi, and Elisabetta Di Nitto: How do Distribution and Time Zones affect Software Development? A Case Study on Communication, in Proceedings of the 6th International Conference on Global Software Engineering (ICGSE), IEEE, pages 176–184, 2011, available here.

[8] H.-Christian Estler, Martin Nordio, Carlo A. Furia, and Bertrand Meyer: Distributed Collaborative Debugging, to appear in Proceedings of 7th International Conference on Global Software Engineering (ICGSE), 2013.

[9] H.-Christian Estler, Martin Nordio, Carlo A. Furia, and Bertrand Meyer: Unifying Configuration Management with Awareness Systems and Merge Conflict Detection, to appear in Proceedings of the 22nd Australasian Software Engineering Conference (ASWEC), 2013.


VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

What is wrong with CMMI


The CMMI model of process planning and assessment has been very successful in some industry circles, essentially as a way for software suppliers to establish credibility. It is far, however, from having achieved the influence it deserves. It is, for example, not widely taught in universities, which in turn limits its attractiveness to industry. The most tempting explanation involves the substance of CMMI: that it prescribes processes that are too heavy. But in fact the basic ideas are elegant, they are not so complicated, and they have been shown to be compatible with flexible approaches to development, such as agile methods.

I think there is a simpler reason, of form rather than substance: the CMMI defining documents are atrociously written.  Had they benefited from well-known techniques of effective technical writing, the approach would have been adopted much more widely. The obstacles to comprehension discourage many people and companies which could benefit from CMMI.

Defining the concepts

One of the first qualities you expect from a technical text is that it defines the basic notions. Take one of the important concepts of CMMI, “process area”. Not just important, but fundamental; you cannot understand anything about CMMI if you do not understand what a process area is. The glossary of the basic document ([1], page 449) defines it as

A cluster of related practices in an area that, when implemented collectively, satisfies a set of goals considered important for making improvement in that area.

The mangled syntax is not a good omen: contrary to what the sentence states, it is not the area that should be “implemented collectively”, but the practices. Let us ignore it and try to understand the intended definition. A process area is a collection of practices? A bit strange, but could make sense, provided the notion of “practice” is itself well defined. Before we look at that, we note that these are practices “in an area”. An area of what? Presumably, a process area, since no other kind of area is ever mentioned, and CMMI is about processes. But then a process area is… a collection of practices in a process area? Completely circular! (Not recursive: a meaningful recursive definition is one that defines simple cases directly and builds complex cases from them. A circular definition defines nothing.)

All that this is apparently saying is that if we already know what a process area is, CMMI adds the concept that a process area is characterized by a set of associated practices. This is actually a useful idea, but it does not give us a definition.

Let’s try to see if the definition of “practice” helps. The term itself does not have an entry in the glossary, a bit regrettable but not too worrying given that in CMMI there are two relevant kinds of practices: specific and generic. “Specific practice” is defined (page 461) as

An expected model component that is considered important in achieving the associated specific goal. (See also “process area” and “specific goal.”)

This statement introduces the important observation that in CMMI a practice is always related to a “goal” (another one of the key CMMI concepts); it is one of the ways to achieve that goal. But this is not a definition of “practice”! As to the phrase “an expected model component”, it simply tells us that practices, along with goals, are among the components of CMMI (“the model”), but that is a side remark, not a definition: we cannot define “practice” as meaning “model component”.

What is happening here is that the glossary does not give a definition at all; it simply relies on the ordinary English meaning of “practice”. Realizing this also helps us understand the definition of “process area”: it too is not a definition, but assumes that the reader already understands the words “process” and “area” from their ordinary meanings. It simply tells us that in CMMI a process area has a set of associated practices. But that is not what a glossary is for: the reader expects it to give precise definitions of the technical terms used in the document.

This misuse of the glossary is typical of what makes CMMI documents so ineffective. In technical discourse it is common to hijack words from ordinary language and give them a special meaning: the mathematical use of such words as “matrix” or “edge” (of a graph) denotes well-defined objects. But you have to explain such technical terms precisely, and be clear at each step whether you are using the plain-language meaning or the technical meaning. If you mix them up, you completely confuse the reader.

In fact any decent glossary should make the distinction explicit by underlining, in definitions, terms that have their own entries (as in: a cluster or related practices, assuming there is an entry for “practice”); then it is clear to the reader whether a word is used in its ordinary or technical sense. In an electronic version the underlined words can be links to the corresponding entries. It is hard to understand why the CMMI documents do not use this widely accepted convention.

Towards suitable definitions

Let us try our hand at what suitable definitions could have been for the two concepts just described; not a vacuous exercise since process area and practice are among the five or six core concepts of CMMI. (Candidates for the others are process, goal, institutionalization and assessment).

“Practice” is the more elementary concept. In fact it retains its essential meaning from ordinary language, but in the CMMI context a practice is related to a process and, as noted, is intended to satisfy a goal. What distinguishes a practice from a mere activity is that the practice is codified and repeatable. If a project occasionally decides to conduct a  design review that is not a practice; a systematically observed daily Scrum meeting is a practice. Here is my take on defining “practice” in CMMI:

Practice: A process-related activity, repeatable as part of a plan, that helps achieve a stated goal.

CMMI has both generic practices, applicable to the process as a whole, and specific practices, applicable to a particular process area. From this definition we can easily derive definitions for both variants.

Now for “process area”. In discussing this concept above, there is one point I did not mention: the reason the CMMI documents can get away with the bizarre definition (or rather non-definition) cited is that if you ask what a process area really is you will immediately be given examples: configuration management, project planning, risk management, supplier agreement management… Then  you get it. But examples are not a substitute for a definition, at least in a standard that is supposed to be precise and complete. Here is my attempt:

Process area: An important aspect of the process, characterized by a clearly identified set of issues and activities, and in CMMI by a set of applicable practices.

The definition of “specific practice” follows simply: a practice that is associated with a particular process area. Similarly, a “generic practice” is a practice not associated with any process area.

I’ll let you be the judge: which definitions do you prefer, these or the ones in the CMMI documents?

By the way, I can hear the cynical explanation: that the jargon and obscurity are intentional, the goal being to justify the need for experts that will interpret the sacred texts. Similar observations have been made to explain the complexity of certain programming languages. Maybe. But bad writing is common enough that we do not need to invoke a conspiracy in this case.

Training for the world championship of muddy writing

The absence of clear definitions of basic concepts is inexcusable. But the entire documents are written in government-committee-speak that erect barriers against comprehension. As an example among hundreds, take the following extract, the entire description of the generic practice GP2.2, “Establish and maintain the plan for performing the organizational training process“” , part of the Software CMM (a 729-page document!), [2], page 360:

This plan for performing the organizational training process differs from the tactical plan for organizational training described in a specific practice in this process area. The plan called for in this generic practice would address the comprehensive planning for all of the specific practices in this process area, from the establishment of strategic training needs all the way through to the assessment of the effectiveness of the organizational training effort. In contrast, the organizational training tactical plan called for in the specific practice would address the periodic planning for the delivery of individual training offerings.

Even to a good-willed reader the second and third sentences sound like gibberish. One can vaguely intuit that the practice just introduced is being distinguished from another, but which one, and how? Why the conditional phrases, “would address”? A plan either does or does not address its goals. And if it addresses them, what does it mean that a plan addresses a planning? Such tortured tautologies, in a high-school essay, would result in a firm request to clean up and resubmit.

In fact the text is trying to say something simple, which should have been expressed like this:

This practice is distinct from practice SP1.3, “Establish an Organizational Training Tactical Plan” (page 353). The present practice is strategic: it is covers planning the overall training process. SP1.3 is tactical: it covers the periodic planning of individual training activities.

(In the second sentence we could retain “from defining training needs all the way to assessing the effectiveness of training”, simplified of course from the corresponding phrase in the original, although I do not think it adds much.)

Again, which version do you prefer?

The first step in producing something decent was not even a matter of style but simple courtesy to the reader. The text compares a practice to another, but it is hard to find the target of the comparison: it is identified as the “tactical plan for organizational training” but that phrase does not appear anywhere else in the document!  You have to guess that it is listed elsewhere as the “organizational training tactical plan”, search for that string, and cycle through its 14 occurrences to see which one is the definition.  (To make things worse, the phrase “training tactical plan” also appears in the document — not very smart for what is supposed to be a precisely written standard.)

The right thing to do is to use the precise name, here SP1.3 — what good is it to introduce such code names throughout a document if it does not use them for reference? — and for good measure list the page number, since this is so easy to do with text processing tools.

In the CMMI chapter of my book Touch of Class (yes, an introductory programming textbook does contain an introduction to CMMI) I cited another extract from [2] (page 326):

The plan for performing the organizational process focus process, which is often called “the process-improvement plan,” differs from the process action plans described in specific practices in this process area. The plan called for in this generic practice addresses the comprehensive planning for all of the specific practices in this process area, from the establishment of organizational process needs all the way through to the incorporation of process-related experiences into the organizational process assets.

In this case the translation into text understandable by common mortals is left as an exercise for the reader.

With such uncanny ability to say in fifty words what would better be expressed in ten, it is not surprising that basic documents run into 729 pages and that understanding of CMMI by companies who feel compelled  to adopt it requires an entire industry of commentators of the sacred word.

Well-defined concepts have simple names

The very name of the approach, “Capability Maturity Model Integration”, is already a sign of WMD (Word-Muddying Disease) at the terminal stage. I am not sure if anyone anywhere knows how to parse it correctly: is it the integration of a model of maturity of capability (right-associative interpretation)? Of several models? These and other variants do not make much sense, if only because in CMMI “capability” and “maturity” are alternatives, used respectively for the Continuous and Staged versions.

In fact the only word that seems really useful is “model”; the piling up of tetrasyllabic words with very broad meanings does not help suggest what the whole thing is about. “Integration”, for example, only makes sense if you are aware of the history of CMMI, which generalized the single CMM model to a family of models, but this history is hardly interesting to a newcomer. A name, especially a long one, should carry the basic notion.

A much better name would have been “Catalog of Assessable Process Practices”, which is even pronounceable as an acronym, and defines the key elements: the approach is based on recognized best practices; these practices apply to processes (of an organization); they must be subject to assessment (the most visible part of CMMI — the famous five levels — although not necessarily the most important one); and they are collected into a catalog. If “catalog” is felt too lowly, “collection” would also do.

Catalog of Assessable Process Practices: granted, it sounds less impressive than the accumulation of pretentious words making up the actual acronym. As is often the case in software engineering methods and tools, once you remove the hype you may be disappointed at first (“So that’s all that it was after all!”), and then you realize that the idea, although brought back down to more modest size, remains a good idea, and can be put to effective use.

At least if you explain it in English.


[1] CMMI Product Team: CMMI for Development, Version 1.3, Improving processes for developing better products and services, Technical Report CMU/SEI-2010-TR-033, Software Engineering Institute, Carnegie Mellon University, November 2010, available here.

[2] CMMI Product Team, ; CMMI for Systems Engineering/Software Engineering/Integrated Product and Process Development/Supplier Sourcing, Version 1.1, Staged Representation (CMMI-SE/SW/IPPD/SS, V1.1, Staged) (CMU/SEI-2002-TR-012). Software Engineering Institute, Carnegie Mellon University, 2002, available here.

VN:F [1.9.10_1130]
Rating: 9.6/10 (14 votes cast)
VN:F [1.9.10_1130]
Rating: +12 (from 12 votes)

Bringing C code to the modern world

The C2Eif translator developed by Marco Trudel takes C code and translates it into Eiffel; it produces not just a literal translation but a re-engineering version exhibiting object-oriented properties. Trudel defended his PhD thesis last Friday at ETH (the examiners were Hausi Muller from Victoria University, Manuel Oriol from ABB, Richard Paige from the University of York,  and me as the advisor). The thesis is not yet available online but earlier papers describing C2Eif are, all reachable from the project’s home page [1].

At issue is what we do with legacy code. “J’ai plus de souvenirs que si j’avais mille ans”, wrote Charles Baudelaire in Les Fleurs du Mal (“Spleen de Paris”). The software industry is not a thousand years old, but has accumulated even more “souvenirs” than

A heavy chest of drawers cluttered with balance-sheets,
Poems, love letters, lawsuits, romances
And heavy locks of hair wrapped in invoices

We are suffocating under layers of legacy code heaped up by previous generations of programmers using languages that no longer meet our scientific and engineering standards. We cannot get rid of this heritage; how do we bring it to the modern world? We need automatic tools to wrap it in contemporary code, or, better, translate it into contemporary code. The thesis and the system offer a way out through translation to a modern object-oriented language. It took courage to choose such a topic, since there have been many attempts in the past, leading to conventional wisdom consisting of two strongly established opinions:

  • Plain translation: it has been tried, and it works. Not interesting for a thesis.
  • Object-oriented reengineering: it has been tried, and it does not work. Not realistic for a thesis.

Both are wrong. For translation, many of the proposed solutions “almost work”: they are good enough to translate simple programs, or even some large programs but on the condition that the code avoids murky areas of C programming such as signals, exceptions (setjmp/longjmp) and library mechanisms. In practice, however, most useful C programs need these facilities, so any tool that ignores them is bound to be of conceptual value only. The basis for Trudel’s work has been to tackle C to OO translation “beyond the easy stuff” (as stated in the title of one of the published papers). This effort has been largely successful, as demonstrated by the translation of close to a million lines of actual C code, including some well-known and representative tools such as the Vim editor.

As to OO reengineering, C2Eif makes a serious effort to derive code that exhibits a true object-oriented design and hence resembles, in its structure at least, what a programmer in the target language might produce. The key is to identify the right data abstractions, yielding classes, and specialization properties, yielding inheritance. In this area too, many people have tried to come up with solutions, with little success. Trudel has had the good sense of avoiding grandiose goals and sticking to a number of heuristics that work, such as looking at the signatures of a set of functions to see if they all involve a common argument type. Clearly there is more to be done in this direction but the result is already significant.

Since Eiffel has a sophisticated C interface it is also possible to wrap existing code; some tools are available for that purpose, such as Andreas Leitner’s EWG (Eiffel Wrapper Generator). Wrapping and translating each have their advantages and limitations; wrapping may be more appropriate for C libraries that someone else is still actively updating  (so that you do not have to redo a translation with every new release), and translation for legacy code that you want to take over and bring up to par with the rest of your software. C2Eif is engineered to support both. More generally, this is a practitioner’s tool, devoting considerable attention to the many details that make all the difference between a nice idea and a tool that really works. The emphasis is on full automation, although more parametrization has been added in recent months.

C2Eif will make a big mark on the Eiffel developer community. Try it yourself — and don’t be shy about telling its author about the future directions in which you think the tool should evolve.


[1] C2Eif project page, here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (13 votes cast)
VN:F [1.9.10_1130]
Rating: +8 (from 8 votes)

LASER summer school: Software for the Cloud and Big Data

The 2013 LASER summer school, organized by our chair at ETH, will take place September 8-14, once more in the idyllic setting of the Hotel del Golfo in Procchio, on the island of Elba in Italy. This is already the 10th conference; the roster of speakers so far reads like a who’s who of software engineering.

The theme this year is Software for the Cloud and Big Data and the speakers are Roger Barga from Microsoft, Karin Breitman from EMC,  Sebastian Burckhardt  from Microsoft,  Adrian Cockcroft from Netflix,  Carlo Ghezzi from Politecnico di Milano,  Anthony Joseph from Berkeley,  Pere Mato Vila from CERN and I.

LASER always has a strong practical bent, but this year it is particularly pronounced as you can see from the list of speakers and their affiliations. The topic is particularly timely: exploring the software aspects of game-changing developments currently redefining the IT scene.

The LASER formula is by now well-tuned: lectures over seven days (Sunday to Saturday), about five hours in the morning and three in the early evening, by world-class speakers; free time in the afternoon to enjoy the magnificent surroundings; 5-star accommodation and food in the best hotel of Elba, made affordable as we come towards the end of the season (and are valued long-term customers). The group picture below is from last year’s school.

Participants are from both industry and academia and have ample opportunities for interaction with the speakers, who typically attend each others’ lectures and engage in in-depth discussions. There is also time for some participant presentations; a free afternoon to discover Elba and brush up on your Napoleonic knowledge; and a boat trip on the final day.

Information about the 2013 school can be found here.

LASER 2012, Procchio, Hotel del Golvo

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Doing it right or doing it over?

(Adapted from an article in the Communications of the ACM blog.)

I have become interested in agile methods because they are all the rage now in industry and, upon dispassionate examination, they appear to be a pretty amazing mix of good and bad ideas. I am finishing a book that tries to sort out the nuggets from the gravel [1].

An interesting example is the emphasis on developing a system by successive increments covering expanding slices of user functionality. This urge to deliver something that can actually be shown — “Are we shipping yet?” — is excellent. It is effective in focusing the work of a team, especially once the foundations of the software have been laid. But does it have to be the only way of working? Does it have to exclude the time-honored engineering practice of building the infrastructure first? After all, when a building gets constructed, it takes many months before any  “user functionality” becomes visible.

In a typical exhortation [2], the Poppendiecks argue that:

The right the first time approach may work for well-structured problems, but the try-it, test-it, fix-it approach is usually the better approach for ill-structured problems.

Very strange. It is precisely ill-structured problems that require deeper analysis before you jump in into wrong architectural decisions which may require complete rework later on. Doing prototypes to try possible solutions can be a great way to evaluate potential solutions, but a prototype is an experiment, something quite different from an increment (an early version of the future system).

One of the problems with the agile literature is that its enthusiastic admonitions to renounce standard software engineering practices are largely based on triumphant anecdotes of successful projects. I am willing to believe all these anecdotes, but they are only anecdotes. In the present case systematic empirical evidence does not seem to support the agile view. Boehm and Turner [3] write:

Experience to date indicates that low-cost refactoring cannot be depended upon as projects scale up.


The only sources of empirical data we have encountered come from less-expert early adopters who found that even for small applications the percentage of refactoring and defect-correction effort increases with [the size of requirements].

They do not cite references here, and I am not aware of any empirical study that definitely answers the question. But their argument certainly fits my experience. In software as in engineering of any kind, experimenting with various solutions is good, but it is critical to engage in the appropriate Big Upfront Thinking to avoid starting out with the wrong decisions. Some of the worst project catastrophes I have seen were those in which the customer or manager was demanding to see something that worked right away — “it doesn’t matter if it’s not the whole thing, just demonstrate a piece of it! — and criticized the developers who worked on infrastructure that did not produce immediately visible results (in other words, were doing their job of responsible software professionals). The inevitable result: feel-good demos throughout the project, reassured customer, and nothing to deliver at the end because the difficult problems have been left to rot. System shelved and never to be heard of again.

When the basis has been devised right, perhaps with nothing much to show for months, then it becomes critical to insist on regular visible releases. Doing it prematurely is just sloppy engineering.

The problem here is extremism. Software engineering is a difficult balance between conflicting criteria. The agile literature’s criticism of teams that spend all their time on design or on foundations and never deliver any usable functionality is unfortunately justified. It does not mean that we have to fall into the other extreme and discard upfront thinking.

In the agile tradition of argument by anecdote, here is an extract from James Surowiecki’s  “Financial Page” article in last month’s New Yorker. It’s not about software but about the current Boeing 787 “Dreamliner” debacle:

Determined to get the Dreamliners to customers quickly, Boeing built many of them while still waiting for the Federal Aviation Administration to certify the plane to fly; then it had to go back and retrofit the planes in line with the FAA’s requirements. “If the saying is check twice and build once, this was more like build twice and check once”, [an industry analyst] said to me. “With all the time and cost pressures, it was an alchemist’s recipe for trouble.”

(Actually, the result is “build twice and check twice”, or more, since every time you rebuild you must also recheck.) Does that ring a bell?

Erich Kästner’s ditty about reaching America, cited in a previous article [5], is once again the proper commentary here.


[1] Bertrand Meyer: Agile! The Good, the Hype and the Ugly, Springer, 2013, to appear.

[2] Mary and Tom Poppendieck: Lean Software Development — An Agile Toolkit, Addison-Wesley, 2003.

[3] Barry W. Boehm and Richard Turner: Balancing Agility with Discipline — A Guide for the Perplexed, Addison-Wesley, 2004. (Second citation slightly abridged.)

[4] James Surowiecki, in the New Yorker, 4 February 2013, available here.

[5] Hitting on America, article from this blog, 5 December 2012, available here.

VN:F [1.9.10_1130]
Rating: 8.9/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 5 votes)

ESEC/FSE 2013: 18-26 August, Saint Petersburg, Russia

The European Software Engineering Conference takes place every two years in connection with the ACM Foundations of Software Engineering symposium (which in even years is in the US). The next ESEC/FSE  will be held for the first time in Russia, where it will be the first major international software engineering conference ever. It comes at a time when the Russian software industry is ever more present through products and services offered worldwide. See the conference site here. The main conference will be held 21-23 August 2013, with associated events before and after so that the full dates are August 18 to 26. (I am the general chair.)

Other than ICSE, ESEC/FSE is second to none in the quality of the program. We already have four outstanding keynote speakers:  Georges Gonthier from Microsoft Research, Paola Inverardi from L’Aquila in Italy, David Notkin from U. of Washington (in whose honor a symposium will be held as an associated event of ESEC/FSE, chaired by Michael Ernst), and Moshe Vardi of Rice and of course Communications of the ACM.

Saint Petersburg is one of the most beautiful cities in the world, strewn with gilded palaces, canals, world-class museums (not just the Hermitage), and everywhere mementos of the great poets, novelists, musicians and scientists who built up its fame.

Hosted by ITMO National Research University, the conference will be held in the magnificent building of the Razumovsky Palace on the banks of the Moika river; see here.

The Call for Papers has a deadline of March 1st, so there is still plenty of time to polish your best paper and send it to ESEC/FSE. There is also still time to propose worskhops and other associated events. ESEC/FSE will be a memorable moment for the community and we hope to see many of the readers there.

VN:F [1.9.10_1130]
Rating: 9.7/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Why so many features?


It is a frequent complaint that production software contains too many features: “I use only  maybe 5% of Microsoft Word!” , with the implication that the other 95% are useless, and apparently without the consideration that maybe someone else needs them; how do you know that what is good enough for you is good enough for everyone?

The agile literature frequently makes this complaint against “software bloat“, and has turned it into a principle: build minimal software.

Is software really bloated? Rather than trying to answer this question it is useful to analyze where features come from. In my experience there are three sources: internal ideas; suggestions from the field; needs of key customers.

1. Internal ideas

A software system is always devised by a person or group, who have their own views of what it should offer. Many of the more interesting features come from these inventors and developers, not from the market. A competent group does not wait for users or prospects to propose features, but comes up with its own suggestions all the time.

This is usually the source of the most innovative ideas. Major breakthroughs do not arise from collecting customer wishes but from imagining a new product that starts from a new basis and proposing it to the market without waiting for the market to request it.

2. Suggestions from the field

Customers’ and prospects’ wishes do have a crucial role, especially for improvements to an existing product. A good marketing department will serve as the relay between the field’s wishes and the development team. Many such suggestions are of the “Check that box!” kind: customers and particularly prospects look at the competition and want to make sure that your product does everything that the others do. These suggestions push towards me-too features; they are necessary to keep up with the times, but must be balanced with suggestions from the other two sources, since if they were the only inspiration they would lead to a product that has the same functionality as everyone else’s, only delivered a few months later, not the best recipe for success.

3. Key customers

Every company has its key customers, those who give you so much business that you have to listen to them very carefully. If it’s Boeing calling, you pay more attention than to an unknown individual who has just acquired a copy. I suspect that many of the supposedly strange features, of products the ones that trigger “why would anyone ever need this?” reactions, simply come from a large customer who, at some point in the product’s history, asked for a really, truly, absolutely indispensable facility. And who are we — this includes Microsoft and Adobe and just about everyone else — to say that it is not required or not important?

It is easy to complain about software bloat, and examples of needlessly complex system abound. But your bloat may be my lifeline, and what I dismiss as superfluous may for you be essential. To paraphrase a comment by Ichbiah, the designer of Ada, small systems solve small problems. Outside of academic prototypes it is inevitable that  a successful software system will grow in complexity if it is to address the variety of users’ needs and circumstances. What matters is not size but consistency: maintaining a well-defined architecture that can sustain that growth without imperiling the system’s fundamental solidity and elegance.

VN:F [1.9.10_1130]
Rating: 8.5/10 (11 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Salad requirements, requirements salad


You know what salad is.

Salad is made of green leaves. Actually no, there are lots of other colors, lots of other kinds; and many, such as rice salad, pasta salad, potato salad, include no leaves at all.

In any case, salad is made of vegetables. Actually no: fruit salad.

I meant vegetal, as in non-animal. Actually no: salads often contain cheese, meat, fish, seafood.

In any case, salad is a cold dish. Actually no: did you never try a warm goat cheese salad?

Salad has dressing. Actually no: I know quite a few people who shun dressings.

Salads are consumed at the beginning of a meal. Actually no: in France, the normal place of a salad is after the main course.

At least they are only part of a meal. Actually no: have you not heard of the dinner salad?

Salads have something to do with salt. Actually no: although you are right etymologically, as the word comes through the French salade from the Latin saleta, salty, in our blood-pressure-conscious world the cook often does not put any salt.

Salads are only consumed at lunch. At dinner too. And maybe… I take that back.

I know a salad when I see one. Or maybe when I taste one. Although I have never tried blindfolded.

Then explain to us what it is.

Well, if it says “salad” on the menu it must be a salad.

Can you do better?

I will have to come back to you on that one.

If it is so hard to come up with a convincing definition for such a banal notion (and it is real fun to look at good dictionaries and see the contortions they go through in trying to make some sense of it), no wonder software requirements specifications (SRS) are so hard. One of the obligatory steps in a requirements process —  “agile” or not — is to build up a glossary for the project [1]: a set of definitions for the terms of the trade, those words from the problem domain that the stakeholders throw in assuredly all the time in discussions, with the assumption that everyone else understands, except that when you try to understand too you realize there is no clear definition and even, in some cases, different people understand them in different ways.

If definitions are so hard, are requirements then impossible? The trick is that we often do not need a dictionary-style definition of what things are; we only need to know what they have, in other words what are their properties and operations. This is the abstract data type approach, also known as object technology. But it is still hard to convince the stakeholders to explain what they mean.

The German language has one more use of salads: the affectionate term to describe the jumble of wires that mars the back of your desk (I am guessing) and also the front of mine (in this case I know) is Kabelsalat, cable salad [2]. More than a few SRS are like that too: requirements salads.


[1] IEEE: Standard 830-1998, Recommended Practice for Software Requirements Specifications, available (for a fee) here.

[2] German Wikipedia: Kabelsalat entry, available here.


VN:F [1.9.10_1130]
Rating: 10.0/10 (10 votes cast)
VN:F [1.9.10_1130]
Rating: +7 (from 9 votes)

Domain Theory: precedents

Both Gary Leavens and Jim Horning commented (partly here, partly on Facebook) about my Domain Theory article [1] to mention that Larch had mechanisms for domain modeling and specification reuse. As Horning writes:

The Larch Shared Language was really all about creating reusable domain theories, including theorems about the domains.  See, for example [2] and [3].

I am honored that they found the time to write about the article and happy to acknowledge Larch, one of the most extensive efforts, over several decades, to provide serious notations and tools for specification. Leavens’s and Horning’s messages gave me the opportunity to re-read some Larch papers and discover a couple I did not know.

My article did not try to provide exhaustive references; if it had, Larch would have been among them. I would probably have cited my own paper on M [4], earlier than [3], which introduces a notation for composing specifications; see section 1.4 (“Features of the M method and the associated notation have thus been devised to allow for modular descriptions of systems. A system description may include an interface paragraph that describes the connection of the current specification with others, existing or yet to be written”) and the  presentation of these mechanisms in section 5.

Larch traits, described in [3], pursue a similar aim, but the earlier article cited by Horning [2] is a general, informal discussion of formal specification; it does not mention traits, and in fact does not cite Larch, stating instead “We have experimented with the use of two very different tools, PIE and Affirm, in constructing modest sized algebraic specifications”. Its general observations about the specification task remain useful today, and it does mention reuse in passing.

If we were to look for precedents, the basic source would have to be the Clear specification language of Goguen and Burstall, for which the citations [5, 6, 7] all appear in my M paper [4] and go back further: 1977-1981. Clear made a convincing case for modularizing specifications, and defined supporting language constructs.

Since these early publications, many people have come to realize that reuse and composition can be as useful on the specification side as they are for programming. Typical specification and verification techniques, however, do not take advantage of this idea and tend to make us restart every time from the lowest level. Domain Theory, as outlined in [1], is intended to bring abstraction, which has proved so beneficial in other parts of software engineering, to the world of specification.


[1] Domain Theory: The Forgotten step in program verification, an article in this blog, see here.

[2] John V. Guttag, James J. Horning, Jeannette M. Wing: Some Notes on Putting Formal Specifications to Productive Use, in Science of Computer Programming, vol. 2, no. 1, 1982, pages 53-68. (BM note: I found a copy here.)

[3] John V. Guttag, James J. Horning: A Larch Shared Language Handbook, in Science of Computer Programming, vol. 6, no. 2, 1986, pages 135-157. (BM note: I found a copy here, which also has a link to the Larch report.)

[4] Bertrand Meyer: M: A System Description Method, Technical Report TR CS 85-15, University of California, Santa Barbara, 1985, available here.

[5] Rod M. Burstall and Joe A. Goguen: Putting Theories Together to Make Specifications, in Proceedings of 5th International Joint Conference on Artificial Intelligence, Cambridge (Mass.), 1977, pages 1045- 1058.

[6] Rod M. Burstall and Joe A. Goguen: “The Semantics of Clear, a Specification Language,” in Proceedings of Advanced Course on Abstract Software Specifications, Copenhagen, Lecture Notes on Computer Science 86, Copenhagen, Springer-Verlag, 1980, pages 292-332, available here.

[7] Rod M. Burstall and Joe A. Goguen: An Informal Introduction to Specifications using Clear, in The Correctness Problem in Computer Science, eds R. S. Boyer and JJ. S. Moore, Springer-Verlag, 1981, pages 185-213.

VN:F [1.9.10_1130]
Rating: 8.8/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 5 votes)

Domain Theory: the forgotten step in program verification


Program verification is making considerable progress but is hampered by a lack of abstraction in specifications. A crucial step is, almost always, absent from the process; this omission is the principal obstacle to making verification a standard component of everyday software development.

Steps in software verification

In the first few minutes of any introduction to program verification, you will be told that the task requires two artifacts: a program, and a specification. The program describes what executions will do; the specification, what they are supposed to do. To verify software is to ascertain that the program matches the specification: that it does is what it should.

The consequence usually drawn is that verification consists of three steps: write a specification, write a program, prove that the program satisfies the specification. The practical process is of course messier, if only because the first two steps may occur in the reverse order and, more generally, all three steps are often intertwined: the specification and the program influence each other, in particular through the introduction of “verification conditions” into the program; and initial proof attempts will often lead to changes in both the specification and the program. But by and large these are the three accepted steps.

Such a description misses a fourth step, a prerequisite to specification that is essential to a scalable verification process: Domain Theory. Any program addresses a specific domain of discourse, be it the domain of network access and communication for a mobile phone system, the domain of air travel for a flight control system, of companies and shares for a stock exchange system and so on. Even simple programs with a limited scope, such as the computation of the maximum of an array, use a specific domain beyond elementary mathematics. In this example, it is the domain of arrays, with their specific properties: an array has a range, a minimum and maximum indexes in that range, an associated sequence of values; we may define a slice a [i..j], ask for the value associated with a given index, replace an element at a given index and so on. The Domain Theory provides a formal model for any such domain, with the appropriate mathematical operations and their properties. In the example the operations are the ones just mentioned, and the properties will include the axiom that if we replace an element at a certain index i with a value v then access the element at an index j, the value we get is v if i = j, and otherwise the earlier value at j.

The role of a Domain Theory

The task of devising a Domain Theory is to describe such a domain of reference, in the spirit of abstract data types: by listing the applicable operations and their properties. If we do not treat this task as a separate step, we end up with the kind of specification that works for toy examples but quickly becomes unmanageable for real-life applications. Most of the verification literature, unfortunately, relies on such specifications. They lack abstraction since they keep using the lowest-level mathematical objects and constructs, such as numbers and quantified expressions. They are to specification what assembly language is to modern programming.

Dines Bjørner has for a long time advocated a closely related idea, domain engineering; see for example his book in progress [1]. Unfortunately, he does not take advantage of modularization through abstract data types; the book is an example of always-back-to-the-basics specification, resorting time and again to fully explicit specifications based on a small number of mathematical primitives, and as a consequence making formal specification look difficult.

Maximum computed from both ends

As a simple example of modeling through an abstract theory consider an algorithm for computing the maximum of an array. We could use the standard technique that goes through the array one-way, but for variety let us take the algorithm that works from both ends, moving two integer cursors towards each other until they meet.  (This example was used in a verification competition at a recent conference, I forgot which one.) The code looks like this:

Two-way maximum

The specification, expressed by the postcondition (ensure) should state that Result is the maximum of the array; the loop invariant will be closely related to it. How do we express these properties? The obvious way is not the right way. It states the postcondition as something like

k: Z | (ka.lowerka.upper) a [k] ≤ Result

k: Z | ka.lowerka.upper a [k] = Result

In words, Result is at least as large as every element of the array, and is equal to at least one of the elements of the array. The invariant can also be expressed in this style (try it).

The preceding specification expresses the desired property, but it is of an outrageously lower level than called for. The notion of maximum is a general one for arrays over an ordered type. It can be computed through many different algorithms in addition to the one shown above, and exists independently of these algorithms. The detailed, assembly-language-like definition of its properties should not have to be repeated in every case. It should be part of the Domain Theory for the underlying notion, arrays.

A specification at the right level of abstraction

In a Domain Theory for arrays of elements from an ordered set, one of the principal operations is maximum, satisfying the above properties. The definition of maximum through these properties belongs at the Domain Theory level. The Domain Theory should include that definition, independent of any particular computational technique such as two_way_max. Then the routine’s postcondition, relying on this notion from the Domain Theory, becomes simply

Result = a.maximum

The application of this approach to the loop invariant is particularly interesting. If you tried to write it at the lowest level, as suggested above, you should have produced something like this:


k: Z | kikj ∧ (∀ l: Z | l a.lowerl a.upper a [l] ≤ a [k])

The first clause is appropriate but the rest is horrible! With its nested quantified expressions it gives an impression of great complexity for a property that is in fact straightforward, simple enough in fact to be explained to a 10-year-old: the maximum of the entire array can be found between indexes i and j. In other words, it is also the maximum of the array slice going from i to j. The Domain Theory will define the notion of slice and enable us to express the invariant as just

a.lowerij a.upper — This bounding clause remains

a.maximum = (a [i..j ]).maximum

(where we will write the slice a [i..j ] as a.slice (i, j ) if we do not have mechanisms for defining special syntax). To verify the routine becomes trivial: on loop exit the invariant still holds and i = j, so the maximum of the entire array is given by the maximum of the single-element slice a [i..i ], which is the value of its single element a [i ]. This last property — the maximum of a single-element array is its single value — is independent of the verification of any particular program and should be proved as a little theorem of the Domain Theory for arrays.

The comparison between the two versions is striking: without Domain Theory, we are back to the most tedious mathematical manipulations again and again; simple, clear properties look complicated and obscure. This just for a small example on basic data structures; now think what it will be for a complex application domain. Without a first step of formal modeling to develop a Domain Theory, no realistic specification and verification process is realistic.

Although the idea is illustrated here through examples of individual routines, the construction of a Domain Theory should usually occur, in an object-oriented development process, at the level of a class: the embodiment of an abstract data type, which is at the appropriate level of granularity. The theory applies to objects of a given type, and hence will be used for the verification of all operations of that type. This observation justifies the effort of devising a Domain Theory, since it will benefit a whole set of software elements.

Components of a Domain Theory

The Domain Theory should include the three ingredients illustrated in the example:

  • Operations, modeled as mathematical functions (no side effects of course, we are in the world of specification).
  • Axioms characterizing the defining properties of these operations.
  • Theorems, characterizing other important properties.

This approach is of course nothing else than abstract data types (the same thing, although few people realize it, as object-oriented analysis). Even though ADTs are a widely popularized notion, supported for example by tools such as CafeOBJ [2] and Maude [3], it is generally not taken to its full conclusions; in particular there is too often a tendency to define every new ADT from scratch, rather than building up libraries of reusable high-level mathematical components in the O-O spirit of reuse.

Results, not just definitions

In devising a Domain Theory with the three kinds of ingredient listed above, we should not forget the last one, the theorems! The most depressing characteristic of much of the work on formal specification is that it is long on definitions and short on results, while good mathematics is supposed to be the reverse. I think people who have seriously looked at formal methods and do not adopt them are turned off not so much by the need to use mathematics but by the impression they get little value for it.

That is why Eiffel contracts do get adopted: even if it’s just for testing and debugging, people see immediate returns. It suffices for a programmer to have caught one bug as the violation of a simple postcondition to be convinced for life and lose any initial math-phobia.

Quantifiers are evil

As we go beyond simple contract properties — this argument must be positive, this reference will not be void — the math needs to be at the same level of abstraction to which, as modern programmers, we are accustomed. For example, one should always be wary of program specifications relying directly on quantified expressions, as in the low-level variants of the postcondition and loop invariant of the two_way_max routine.

This is not just a matter of taste, as in the choice in logic [4] between lambda expressions (more low-level but also more immediately understandable) and combinators (more abstract but, for many, more abstruse). We are talking here about the fundamental software engineering problem of scalability; more generally, of the understandability, extendibility and reusability of programs, and the same criteria for their specification and verification. Quantifiers are of course needed to express fundamental properties of a structure but in general should not directly appear in program assertions: as the example illustrated, their level of abstraction is lower than the level of discourse of a modern object-oriented program. If the rule — Quantifiers Considered Harmful — is not absolute, it must be pretty close.

Quantified expressions, “All elements of this structure possess this property” and “Some element of this structure possesses this property” — belong in the description of the structure and not in the program. They should appear in the Domain Theory, not in the verification. If you want to express that a hash table search found an element of key K, you should not write

(Result = Void ∧ (∀ i: Z | i a.loweri a.upper a.item (i).key ≠ K))

(ResultVoid ∧ (∀ i: Z | i a.loweri a.upper a.item (i).key = K ∧ Result = a.item (i))


Result /= Void     (Result a.elements_of_key (K))

The quantified expressions will appear in the Domain Theory for the corresponding structure, in the definition of such domain properties as elements_of_key. Then the program’s specification — the contracts to be verified — can rely on concepts that make sense to the programmer; the verification will take advantage of theorems that have been proved independently since they belong to the Domain Theory and do not depend on individual programs.

Even the simplest examples…

Practical software verification requires Domain Theory even in the simplest cases, including those often used as purely academic examples. Perhaps the most common (and convenient) way to explain the notion of loop invariant is Euclid’s algorithm to compute the greatest common divisor (gcd) of two numbers (with a structure remarkably similar to that of two_way_max):

I have expressed the postcondition using a concept from an assumed Domain Theory for the underlying problem: gcd, the mathematical function that yields the greatest common divisor of two integers. Many specifications I have seen go back to the basics, with something like this (using \\ for integer remainder):

a \\ Result = 0 b \\ Result = 0   ∀ i: N | (a \\ i = 0) ∧ (b \\ i = 0)  i Result

This is indeed the definition of what it means for Result to be the gcd of a and b (it divides a, it divides b, and is greater than any other integer that also has these two properties). But it makes no sense to include such a detailed mathematical property in the specification of a program element. It belongs in the domain theory, where it will serve as the definition of a function gcd, which we can then use directly in the specification of the program.

Note how the invariant makes the necessity of the Domain Theory approach even more clear: try to express it in the basic mathematical form, not using the function gcd, It can be done, but the result is typical of the high complexity to usefulness ratio of traditional formal specifications mentioned above. Instead, the invariant that I have included in the program text above says exactly what there is to say, clearly and concisely: at each iteration, the gcd of our two temporary values, i and j, is the result that we are seeking, the gcd of the original values a and b. On exit from the loop, when i and j are equal, their common value is that result.

It is also thanks to the Domain Theory modeling that the verification of the program — consisting of proving that the stated property is indeed invariant — will be so simple: as part of the theory, we should have the two little theorems

i > j > 0 gcd (i, j) = gcd (ij, j)
(i, i) = i

which immediately show the implementation to be correct.

Inside of any big, fat, messy, quantifier-ridden specification there is a simple, elegant and clear Domain-Theory-based specification desperately trying to get out. Find it and use it.

From Domain Theory to domain library

One of the reasons most people working on program verification have not used the division into levels of discourse described here, with a clear role for developing a Domain Theory, is that they lack the appropriate notational support. Mathematical notation is of course available, but we are talking about programs a general verification framework cannot resort to a new special notation for every new application domain.

This is one of the places where Eiffel provides a consistent solution, through its seamless approach to integrating programs and specifications in a single notation. Thanks to mechanisms such as deferred classes (classes that describe concepts through detailed specifications without committing to an implementation), Eiffel is as much for specification as for design and implementation; a Domain Theory can be expressed though a set of deferred Eiffel classes, which we may call a domain library. The classes in a domain library should not just be deferred, meaning devoid of implementation; they should in addition describe stateless operations only — queries, not commands — since they are modeling purely mathematical concepts.

An earlier article in this blog [5] outlined the context of our verification work: the EVE project (Eiffel Verification Environment), a practical approach to integrating software verification in the day-to-day practice of modern software development, with the slogan ““Verification As a Matter Of Course”. In this project we have applied the idea of Domain Theory by building a domain library covering fundamental concepts of set theory, including functions and relations. This is the Mathematical Model Library (MML) [6, 7], which we use to verify the new data structure library EiffelBase 2 using specifications at the appropriate level of abstraction.

MML is in fact useful for the specification of a wide variety of programs, since almost every application area can benefit from the general concepts of set, subset, relation and such. But to cover a specific application domain, say flight traffic control, MML will generally not suffice; you will need to devise a Domain Theory that mathematically models the target domain, and may express it in the form of a domain library written in the same general spirit as MML: all deferred, stateless, focused on high-level abstractions.

It is one of the attractions of Eiffel that you can express such a theory and library in the same notation as the programs that will use it — more precisely in a subset of that notation, since the specification classes do not need the imperative constructs of the language such as instructions and attributes. Then both the development process and the verification use a seamlessly integrated set of notations and techniques, and all use the same tools from a modern IDE, in our case EiffelStudio, for browsing, editing, working with graphical repreentation, metrics etc.

DSL libraries for specifications

A mechanism to express Domain Theories is to a general specification mechanism essentially like a Domain Specific Language (DSL) is to a general programming language: a specialization for a particular domain. Domain libraries make the approach practical by:

  • Embedding the specification language in the programming language.
  • Fundamentally relying on reuse, in the best spirit of object technology.

This approach is in line with the one I presented for handling DSLs in an earlier article of this blog [8] (thanks, by the way, for the many comments received, some of them posted here and some on Facebook and LinkedIn where the post triggered long discussions). It is usually a bad idea to invent a new language for a new application domain. A better solution is to rely on libraries, by taking advantage of the power of object-oriented mechanisms to model (in domain libraries) and implement (for DSLs) the defining features of such a domain, and to make the result widely reusable. The resulting libraries are purely descriptive in the case of a domain library expressing a Domain Theory, and directly usable by programs in the case of a library embodying a DSL, but the goal is the same.

A sound and necessary engineering practice

Many ideas superficially look similar to Domain Theory: domain engineering as mentioned above, “domain analysis” as widely discussed in the requirements literature, model-driven development, abstract data type specification… They all start from some of the same observations, but  Domain Theory as described in this article is something different: a systematic approach to modeling an arbitrary application domain mathematically, which:

  • Describes the concepts through applicable operations, axioms and (most importantly) theorems.
  • Expresses these elements in an applicative (side-effect free, i.e. equivalent to pure mathematics) subset of the programming language, for direct embedding in program specifications.
  • Relies on the class mechanism to structure the results.
  • Collects the specifications into specification libraries and promotes the reuse of specifications in the same way we promote software reuse.
  • Uses the combination of these techniques to ensure that program specifications are at a high level of abstraction, compatible with the programmers’ view of their software.
  • Promotes a clear and effective verification process.

The core idea is in line with standard engineering practices in disciplines other than software: to build a bridge, a car or a chip you need first to develop a sound model of the future system and its environment, using any useful models developed previously rather than always going back to elementary textbook mathematics.

It seems in fact easier to justify doing Domain Analysis than to justify not doing it. The power of expression and abstraction of our programs has grown by leaps and bounds; it’s time for our specifications to catch up.


[1] Dines Bjørner: From Domains to Requirements —The Triptych Approach to Software Engineering, “to be submitted to Springer”, available here.

[2] Kokichi Futatsugi and others: CafeObj page, here.

[3] José Meseguer and others: Maude publication page, here.

[4] J. Roger Hindley, J. P. Seldin: Introduction to Combinators and l-calculus, Cambridge University Press, 1986.

[5] Verification As a Matter Of Course, earlier article on this blog (March 2010), available here.

[6] Bernd Schoeller, Tobias Widmer and Bertrand Meyer. Making specifications complete through models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, pages 48-70, available here.

[7] Nadia Polikarpova, Carlo A. Furia and Bertrand Meyer: Specifying Reusable Components, in VSTTE’10: Verified Software: Theories, Tools and Experiments, Edinburgh, August 2010, Lecture Notes in Computer Science, Springer-Verlag, available here.

[8] Never Design a Language, earlier article on this blog (January 2012), available here.

VN:F [1.9.10_1130]
Rating: 9.3/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +7 (from 7 votes)

TOOLS 2012, “The Triumph of Objects”, Prague in May: Call for Workshops

Workshop proposals are invited for TOOLS 2012, The Triumph of, to be held in Prague May 28 to June 1. TOOLS is a federated set of conferences:

  • TOOLS EUROPE 2012: 50th International Conference on Objects, Models, Components, Patterns.
  • ICMT 2012: 5th International Conference on Model Transformation.
  • Software Composition 2012: 10th International Conference.
  • TAP 2012: 6th International Conference on Tests And Proofs.
  • MSEPT 2012: International Conference on Multicore Software Engineering, Performance, and Tools.

Workshops, which are normally one- or two-day long, provide organizers and participants with an opportunity to exchange opinions, advance ideas, and discuss preliminary results on current topics. The focus can be on in-depth research topics related to the themes of the TOOLS conferences, on best practices, on applications and industrial issues, or on some combination of these.


Submission proposal implies the organizers’ commitment to organize and lead the workshop personally if it is accepted. The proposal should include:

  •  Workshop title.
  • Names and short bio of organizers .
  • Proposed duration.
  •  Summary of the topics, goals and contents (guideline: 500 words).
  •  Brief description of the audience and community to which the workshop is targeted.
  • Plans for publication if any.
  • Tentative Call for Papers.

Acceptance criteria are:

  • Organizers’ track record and ability to lead a successful workshop.
  •  Potential to advance the state of the art.
  • Relevance of topics and contents to the topics of the TOOLS federated conferences.
  •  Timeliness and interest to a sufficiently large community.

Please send the proposals to me (Bertrand.Meyer AT, with a Subject header including the words “TOOLS WORKSHOP“. Feel free to contact me if you have any question.


  •  Workshop proposal submission deadline: 17 February 2012.
  • Notification of acceptance or rejection: as promptly as possible and no later than February 24.
  • Workshops: 28 May to 1 June 2012.


VN:F [1.9.10_1130]
Rating: 7.3/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

The story of our field, in a few short words


(With all dues to [1], but going up from four to five as it is good to be brief yet not curt.)

At the start there was Alan. He was the best of all: built the right math model (years ahead of the real thing in any shape, color or form); was able to prove that no one among us can know for sure if his or her loops — or their code as a whole — will ever stop; got to crack the Nazis’ codes; and in so doing kind of saved the world. Once the war was over he got to build his own CPUs, among the very first two or three of any sort. But after the Brits had used him, they hated him, let him down, broke him (for the sole crime that he was too gay for the time or at least for their taste), and soon he died.

There was Ed. Once upon a time he was Dutch, but one day he got on a plane and — voilà! — the next day he was a Texan. Yet he never got the twang. The first topic that had put him on  the map was the graph (how to find a path, as short as can be, from a start to a sink); he also wrote an Algol tool (the first I think to deal with all of Algol 60), and built an OS made of many a layer, which he named THE in honor of his alma mater [2]. He soon got known for his harsh views, spoke of the GOTO and its users in terms akin to libel, and wrote words, not at all kind, about BASIC and PL/I. All this he aired in the form of his famed “EWD”s, notes that he would xerox and send by post along the globe (there was no Web, no Net and no Email back then) to pals and foes alike. He could be kind, but often he stung. In work whose value will last more, he said that all we must care about is to prove our stuff right; or (to be more close to his own words) to build it so that it is sure to be right, and keep it so from start to end, the proof and the code going hand in hand. One of the keys, for him, was to use as a basis for ifs and loops the idea of a “guard”, which does imply that the very same code can in one case print a value A and in some other case print a value B, under the watch of an angel or a demon; but he said this does not have to be a cause for worry.

At about that time there was Wirth, whom some call Nick, and Hoare, whom all call Tony. (“Tony” is short for a list of no less than three long first names, which makes for a good quiz at a party of nerds — can you cite them all from rote?) Nick had a nice coda to Algol, which he named “W”; what came after Algol W was also much noted, but the onset of Unix and hence of C cast some shade over its later life. Tony too did much to help the field grow. Early on, he had shown a good way to sort an array real quick. Later he wrote that for every type of unit there must be an axiom or a rule, which gives it an exact sense and lets you know for sure what will hold after every run of your code. His fame also comes from work (based in part on Ed’s idea of the guard, noted above) on the topic of more than one run at once, a field that is very hot today as the law of Moore nears its end and every maker of chips has moved to  a mode where each wafer holds more than one — and often many — cores.

Dave (from the US, but then at work under the clime of the North) must not be left out of this list. In a paper pair, both from the same year and both much cited ever since,  he told the world that what we say about a piece of code must only be a part, often a very small part, of what we could say if we cared about every trait and every quirk. In other words, we must draw a clear line: on one side, what the rest of the code must know of that one piece; on the other, what it may avoid to know of it, and even not care about. Dave also spent much time to argue that our specs must not rely so much on logic, and more on a form of table.  In a later paper, short and sweet, he told us that it may not be so bad that you do not apply full rigor when you chart your road to code, as long as you can “fake” such rigor (his own word) after the fact.

Of UML, MDA and other such TLAs, the less be said, the more happy we all fare.

A big step came from the cold: not just one Norse but two, Ole-J (Dahl) and Kris, came up with the idea of the class; not just that, but all that makes the basis of what today we call “O-O”. For a long time few would heed their view, but then came Alan (Kay), Adele and their gang at PARC, who tied it all to the mouse and icons and menus and all the other cool stuff that makes up a good GUI. It still took a while, and a lot of hit and miss, but in the end O-O came to rule the world.

As to the math basis, it came in part from MIT — think Barb and John — and the idea, known as the ADT (not all TLAs are bad!), that a data type must be known at a high level, not from the nuts and bolts.

There also is a guy with a long first name (he hates it when they call him Bert) but a short last name. I feel a great urge to tell you all that he did, all that he does and all that he will do, but much of it uses long words that would seem hard to fit here; and he is, in any case, far too shy.

It is not all about code and we must not fail to note Barry (Boehm), Watts, Vic and all those to whom we owe that the human side (dear to Tom and Tim) also came to light. Barry has a great model that lets you find out, while it is not yet too late, how much your tasks will cost; its name fails me right now, but I think it is all in upper case.  At some point the agile guys — Kent (Beck) and so on — came in and said we had got it all wrong: we must work in pairs, set our goals to no more than a week away, stand up for a while at the start of each day (a feat known by the cool name of Scrum), and dump specs in favor of tests. Some of this, to be fair, is very much like what comes out of the less noble part of the male of the cow; but in truth not all of it is bad, and we must not yield to the urge to throw away the baby along with the water of the bath.

I could go on (and on, and on); who knows, I might even come back at some point and add to this. On the other hand I take it that by now you got the idea, and even on this last day of the week I have other work to do, so ciao.


[1] Al’s Famed Model Of the World, In Words Of Four Signs Or Fewer (not quite the exact title, but very close): find it on line here.

[2] If not quite his alma mater in the exact sense of the term, at least the place where he had a post at the time. (If we can trust this entry, his true alma mater would have been Leyde, but he did not stay long.)

VN:F [1.9.10_1130]
Rating: 10.0/10 (14 votes cast)
VN:F [1.9.10_1130]
Rating: +11 (from 11 votes)

Agile methods: the good, the bad and the ugly

It was a bit imprudent last Monday to announce the continuation of the SCOOP discussion for this week; with the TOOLS conference happening now, with many satellite events such as the Eiffel Design Feast of the past week-end and today’s “New Eiffel Technology Community” workshop, there is not enough time for a full article. Next week might also be problematic. The SCOOP series will resume, but in the meantime I will report on other matters.

As something that can be conveniently typed in while sitting in the back of the TOOLS room during fascinating presentations, here is a bit of publicity for the next round of one-day seminars for industry — “Compact Course” is the official terminology — that I will be teaching at ETH in Zurich next November (one in October), some of them with colleagues. It’s the most extensive session that we have ever done; you can see the full programs and registration information here.

  • Software Engineering for Outsourced and Distributed Development, 27 October 2011
    Taught with Peter Kolb and Martin Nordio
  • Requirements Engineering, 17 November
  • Software Testing and Verification: state of the art, 18 November
    With Carlo Furia and Sebastian Nanz
  • Agile Methods: the Good, the Bad and the Ugly, 23 November
  • Concepts and Constructs of Concurrent Computation, 24 November
    With Sebastian Nanz
  • Design by Contract, 25 November

The agile methods course is new; its summary reads almost like a little blog article, so here it is.

Agile methods: the Good, the Bad and the Ugly

Agile methods are wonderful. They’ll give you software in no time at all, turn your customers and users into friends, catch bugs before they catch you, change the world, and boost your love life. Do you believe these claims (even excluding the last two)? It’s really difficult to form an informed opinion, since most of the presentations of eXtreme Programming and other agile practices are intended to promote them (and the consultants to whom they provide a living), not to deliver an objective assessment.

If you are looking for a guru-style initiation to the agile religion, this is not the course for you. What it does is to describe in detail the corpus of techniques covered by the “agile” umbrella (so that you can apply them effectively to your developments), and assess their contribution to software engineering. It is neither “for” nor “against” agile methods but fundamentally descriptive, pedagogical, objective and practical. The truth is that agile methods include some demonstrably good ideas along with some whose benefits are at best dubious. In addition (and this should not be a surprise) they cannot make the fundamental laws of software engineering go away.

Agile methods have now been around for more than a decade, during which many research teams, applying proven methods of experimental science, have performed credible empirical studies of how well the methods really work and how they compare to more traditional software engineering practices. This important body of research results, although not widely known, is critical to managers and developers in industry for deciding whether and how to use agile development. The course surveys these results, emphasizing the ones most directly relevant to practitioners.

A short discussion session will enable participants with experience in agile methods to share their results.

Taking this course will give you a strong understanding of agile development, and a clear view of when, where and how to apply them.


Morning session: A presentation of agile methods

  • eXtreme Programming, pair programming, Scrum, Test-Driven Development, continuous integration, refactoring, stakeholder involvement, feature-driven development etc.
  • The agile lifecycle.
  • Variants: lean programming etc.

Afternoon session (I): Assessment of agile methods

  • The empirical software engineering literature: review of available studies. Assessment of their value. Principles of empirical software engineering.
  • Agile methods under the scrutiny of empirical research: what helps, what harms, and what has no effect? How do agile methods fare against traditional techniques?
  • Examples: pair programming versus code reviews; tests versus specifications; iterative development versus “Big Upfront Everything”.

Afternoon session (II): Discussion and conclusion

This final part of the course will present, after a discussion session involving participants with experience in agile methods, a summary of the contribution of agile methods to software engineering.

It will conclude with advice for organizations involved in software development and interested in applying agile methods in their own environment.

Target groups

CIOs; software project leaders; software developers; software testers and QA engineers.

VN:F [1.9.10_1130]
Rating: 8.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 5 votes)

The Professor Smith syndrome: Part 2

As stated in the Quiz of a few days ago (“Part 1 ”), we consider the following hypothetical report in experimental software engineering ([1], [2]):

Professor Smith has developed a new programming technique, “Suspect-Oriented Programming” (SOP). To evaluate SOP, he directs half of the students in his “Software Methodology” class to do the project using traditional techniques, and the others to use SOP.

He finds that projects by the students using SOP have, on the average, 15% fewer bugs than the others, and reports that SOP increases software reliability.

What’s wrong with this story?

Professor Smith’s attempt at empirical software engineering is problematic for at least four reasons. Others could arise, but we do not need to consider them if Professor Smith has applied the expected precautions: the number of students should be large enough (standard statistical theory will tell us how much to trust the result for various sample sizes); the students should be assigned to one of the two groups on a truly random basis; the problem should be amenable to both SOP and non-SOP techniques; and the assessment of the number of bugs should in the results should be based on fair and if possible automated evaluation. Some respondents to the quiz cited these problems, but they would apply to any empirical study and we can assume they are being taken care of.

The first problem to consider is that the evaluator and the author of the concept under evaluation are the same person. This is an approach fraught with danger. We have no reason to doubt Professor Smith’s integrity, but he is human. Deep down, he wants SOP to be better than the alternative. That is bound to affect the study. It would be much more credible if someone else, with no personal stake in SOP, had performed it.

The second problem mirrors the first on the students’ side. The students from group 1 were told that they used Professor Smith’s great idea, those from group 2 that they had to use old, conventional, boring stuff. Did both groups apply the same zeal to their work? After all, the students know that Professor Smith created SOP, and maybe he is an convincing advocate, so group 1 students will (consciously or not) do their best; those from group 2 have less incentive to go the extra mile. What we may have at play here is a phenomenon known as the Hawthorne effect [3]: if you know you are being tested for a new technique, you typically work harder and better — and may produce better results even if the technique is worthless! Experiments dedicated to studying this effect show that even  a group that is in reality using the same technique as another does better, at least at the beginning, if it is told that it is using a new, sexy technique.

The first and second problems arise in all empirical studies, software-related or not. They are the reason why medical experiments use placebos and double-blind techniques (where neither the subjects nor the experimenters themselves know who is using which variant). These techniques often do not directly transpose to software experiments, but we should all the same be careful about empirical studies of assessments of one’s own work and about possible Hawthorne effects.

The third problem, less critical, is the validity of a study relying on students. To what extent can we extrapolate from the results to a situation in industry? Software engineering students are on their way to becoming software professionals, but they are not professionals yet. This is a difficult issue because universities, rather than industry, are usually and understandably the place where experiments take place, an sometimes there is no other choice than using students. But then one can question the validity of the results. It depends on the nature of the questions being asked: if the question under study is whether a certain idea is easy to learn, using students is reasonable. But if it is, for example, whether a technique produces less buggy programs, the results can depend significantly on the subjects’ experience, which is different for students and professionals.

The last problem does not by itself affect the validity of the results, but it is a show-stopper nonetheless: Professor Smith’s experiment is unethical! If is is indeed true that SOP is better than the alternative, he is harming students from group 2; in the reverse case, he is harming students from group 1. Only in the case of the null hypothesis (using SOP makes no statistically significant difference) is the experiment ethical, but then it is also profoundly uninteresting. The rule in course-related experiments is a variant of the Hippocratic oath: before all, do not harm. The first purpose of a course is to enrich the students’ knowledge and skills; secondary aims, such as helping the professor’s research, are definitely acceptable, but must never impede the first. The setup described above is all the less acceptable that the project results presumably count towards the course grade, so the students who were forced to use the less good technique, if there demonstrably was one, have grounds to complain.

Note that Professor Smith could partially address this fairness problem by letting students choose their group, instead of assigning them randomly to group 1 or group 2 (based for example on the first letter of their names). But then the results would lose credibility, because this technique introduces self-selection and hence bias: the students who choose SOP may be the more intellectually curious students, and hence possibly the ones who do better anyway.

If Professor Smith cannot ensure fairness, he can still use students for his experiment, but has to run it outside of a course, for example by paying students, or running the experiment as a competition with some prizes for those who produce the programs with fewest bugs. This technique can work, although it introduces further dangers of self-selection. As part of a course, however, you just cannot assign students, on your own authority, to different techniques that might have an different effect on the core goal of the course: the learning experience.

So Professor Smith has a long way to go before he can run experiments that will convey a significant argument in favor of SOP.

Over the years I have seen, as a reader and sometimes as a referee, many Professor Smith papers: “empirical” evaluation of a technique by its own authors, using questionable techniques and not applying the necessary methodological precautions.

A first step is, whenever possible, to use experimenters who are from a completely different group from the developers of the ideas, as in two studies [4] [5] about the effectiveness of pair programming.

And yet! Sometimes no one else is available, and you do want to obtain objective empirical evidence about the merits of your own ideas. You are aware of the risk, and ready to face the cold reality, including if the results are unfavorable. Can you do it?

A recent attempt of ours seems to suggest that this is possible if you exert great care. It will presented in a paper at the next ESEM (Empirical Software Engineering and Measurement) and even though it discusses assessing some aspects of our own designs, using students, as part of the course project which counts for grading, and separating them into groups, we feel it was fair and ethical, and </modesty_filter_on>an ESEM referee wrote: “this is one of the best designed, conducted, and presented empirical studies I have read about recently”<modesty_filter_on>.

How did we proceed? How would you have proceeded? Think about it; feel free to express your ideas as comments to this post. In the next installment of this blog (The Professor Smith Syndrome: Part 3), I will describe our work, and let you be the judge.


[1] Bertrand Meyer: The rise of empirical software engineering (I): the good news, this blog, 30 July 2010, available here.
[2] Bertrand Meyer: The rise of empirical software engineering (II): what we are still missing, this blog, 31 July 2010, available here.

[3] On the Hawthorne effect, there is a good Wikipedia entry. Acknowledgment: I first heard about the Hawthorne effect from Barry Boehm.

[4] Jerzy R. Nawrocki, Michal Jasinski, Lukasz Olek and Barbara Lange: Pair Programming vs. Side-by-Side Programming, in EuroSPI 2005, pages 28-38. I do not have a URL for this article.

[5] Matthias Müller: Two controlled Experiments concerning the Comparison of Pair Programming to Peer Review, in  Journal of Systems and Software, vol. 78, no. 2, pages 166-179, November 2005; and Are Reviews an Alternative to Pair Programming ?, in  Journal of Empirical Software Engineering, vol. 9, no. 4, December 2004. I don’t have a URL for either version. I am grateful to Walter Tichy for directing me to this excellent article.

VN:F [1.9.10_1130]
Rating: 10.0/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

About Watts Humphrey

Watts Humphrey, 2007

At FOSE (see previous post [1]) we will honor the memory of Watts Humphrey, the pioneer of disciplined software engineering, who left us in October. A blog entry on my Communications of the ACM blog [2] briefly recalls some of Humphrey’s main contributions.


[1] The Future Of Software Engineering: previous entry of this blog.
[2] Watts Humphrey: In Honor of a Pioneer, in CACM blog.

VN:F [1.9.10_1130]
Rating: 8.5/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

The rise of empirical software engineering (I): the good news


RecycledIn the next few days I will post a few comments about a topic of particular relevance to the future of our field: empirical software engineering. I am starting by reposting two entries originally posted in the CACM blog. Here is the first. Let me use this opportunity to mention the LASER summer school [1] on this very topic — it is still possible to register.

Empirical software engineering papers, at places like ICSE (the International Conference on Software Engineering), used to be terrible.

There were exceptions, of course, most famously papers by Basili, Zelkowitz, Rombach, Tichy, Berry, Humphrey, Gilb, Boehm, Lehmann, Belady and a few others, who kept hectoring the community about the need to base our opinions and practices on evidence rather than belief. But outside of these cases the typical ICSE empirical paper — I sat through a number of them — was depressing: we made these measurements in our company, found these results, just believe us. A question here in the back? Can you reproduce our results? Access our code? We’d love you to, but unfortunately we work for a company — the Call for Papers said industry contributions were welcome, didn’t it? — and we can’t give you the details. So sorry. But trust us, we checked our results.

Actually, there was another kind of empirical paper, which did not suffer from such secrecy: the university study. Hi, I am professor Bright, the well-known author of the Bright method of software development. Everyone knows it’s the best, but we wanted to assess it scientifically through a rigorous empirical study. I gave the same programming problem to two groups of third-year undergraduates; one group was told to use the Bright method, the other not. Guess what? The Bright group performed 67.94% better! I see the session chair wanting to move to the next speaker; see the details in the paper.

For years, this was most of what we had: unverifiable industry reports and unconvincing student experiments.

And suddenly the scene has changed. Empirical software engineering studies are in full bloom; the papers are flowing, and many are good!

What triggered this radical change is the availability of open-source repositories. Projects such as Linux, Eclipse, Apache, EiffelStudio and many others have records going back 10, 15, sometimes 20 years. These records contain the true history of the project: commits (into the configuration management system), bug reports, bug fixes, test runs and their results, developers involved, and many more elements of project data. All of a sudden empirical research has what any empirical science needs: a large corpus of objects to analyze.

Open-source projects have given the decisive jolt, but now we can rely on industrial data as well: Microsoft and other companies have started making their own records selectively available to researchers. In the work of authors such as Zeller from Sarrebruck, Gall from Uni. Zurich or Nagappan from Microsoft, systematic statistical techniques yield answers, sometimes surprising, to questions on which we could only speculate. Do novices or experts cause more bugs? Does test coverage correlate with software quality, and if so, positively or negatively? Little by little, we are learning about the true properties of software products and processes, based not on fantasies but on quantitative analysis of meaningful samples.

The trend is unmistakable, and irreversible.

Not all is right yet; in the second installment of this post I will describe some of what still needs to be improved for empirical software engineering to achieve full scientific rigor.


[1] LASER summer school 2010, at

VN:F [1.9.10_1130]
Rating: 4.5/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 4 votes)

Another DOSE of distributed software development

The software world is not flat; it is multipolar. Gone are the days of one-site, one-team developments. The increasingly dominant model today is a distributed team; the place where the job gets done is the place where the appropriate people reside, even if it means that different parts of the job get done in different places.

This new setup, possibly the most important change to have affected the practice of software engineering in this early part of the millennium,  has received little attention in the literature; and even less in teaching techniques. I got interested in the topic several years ago, initially by looking at the phenomenon of outsourcing from a software engineering perspective [1]. At ETH, since 2004, Peter Kolb and I, aided by Martin Nordio and Roman Mitin, have taught a course on the topic [2], initially called “software engineering for outsourcing”. As far as I know it was the first course of its kind anywhere; not the first course about outsourcing, but the first to explore the software engineering implications, rather than business or political issues. We also teach an industry course on the same issues [3], attended since 2005 by several hundred participants, and started, with Mathai Joseph from Tata Consulting Services, the SEAFOOD conference [4], Software Engineering Advances For Outsourced and Offshore Development, whose fourth edition starts tomorrow in Saint Petersburg.

After a few sessions of the ETH course we realized that the most important property of the mode of software development explored in the course is not that it involves outsourcing but that it is distributed. In parallel I became directly involved with highly distributed development in the practice of Eiffel Software’s development. In 2007 we renamed the ETH course “Distributed and Outsourced Software Engineering” (DOSE) to acknowledge the broadened scope. The topic is still new; each year we learn a little more about what to teach and how to teach it.

The 2007 session saw another important addition. We felt it was no longer sufficient to talk about distributed development, but that students should practice it. Collaboration between groups in Zurich and other groups in Zurich was not good enough. So we contacted colleagues around the world interested in similar issues, and received an enthusiastic response. The DOSE project is itself distributed: teams from students in different universities collaborate in a single development. Typically, we have two or three geographically distributed locations in each project group. The participating universities have been Politecnico di Milano (where our colleagues Carlo Ghezzi and Elisabetta di Nitto have played a major role in the current version of the project), University of Nijny-Novgorod in Russia, University of Debrecen in Hungary, Hanoi University of Technology in Vietnam, Odessa National Polytechnic in the Ukraine and (across town for us) University of Zurich. For the first time in 2010 a university from the Western hemisphere will join: University of Rio Cuarto in Argentina.

We have extensively studied how the projects actually fare (see publications [4-8]). For students, the job is hard. Often, after a couple of weeks, many want to give up: they have trouble reaching their partner teams, understanding their accents on Skype calls, agreeing on modes of collaboration, finalizing APIs, devising a proper test plan. Yet they hang on and, in most cases, succeed. At the end of the course they tell us how much they have learned about software engineering. For example I know few better way of teaching the importance of carefully documented program interfaces — including contracts — than to ask the students to integrate their modules with code from another team halfway around the globe. This is exactly what happens in industrial software development, when you can no longer rely on informal contacts at the coffee machine or in the parking lot to smooth out misunderstandings: software engineering principles and techniques come in full swing. With DOSE, students learn and practice these fundamental techniques in the controlled environment of a university project.

An example project topic, used last year, was based on an idea by Martin Nordio. He pointed out that in most countries there are some card games played in that country only. The project was to program such a game, where the team in charge of the game logic (what would be the “business model” in an industrial project) had to explain enough of their country’s game, and abstractly enough, to enable the other team to produce the user interface, based on a common game engine started by Martin. It was tough, but some of the results were spectacular, and these are students who will not need more preaching on the importance of specifications.

We are currently preparing the next session of DOSE, in collaboration with our partner universities. The more the merrier: we’d love to have other universities participate, including from the US. Adding extra spice to the project, the topic will be chosen among those from the ICSE SCORE competition [9], so that winning students have the opportunity to attend ICSE in Hawaii. If you are teaching a suitable course, or can organize a student group that will fit, please read the project description [10] and contact me or one of the other organizers listed on the page. There is a DOSE of madness in the idea, but no one, teacher or student,  ever leaves the course bored.


[1] Bertrand Meyer: Offshore Development: The Unspoken Revolution in Software Engineering, in Computer (IEEE), January 2006, pages 124, 122-123. Available here.

[2] ETH course page: see here for last year’s session (description of Fall 2010 session will be added soon).

[3] Industry course page: see here for latest (June 2010( session (description of November 2010 session will be added soon).

[4] SEAFOOD 2010 home page.

[5] Bertrand Meyer and Marco Piccioni: The Allure and Risks of a Deployable Software Engineering Project: Experiences with Both Local and Distributed Development, in Proceedings of IEEE Conference on Software Engineering & Training (CSEE&T), Charleston (South Carolina), 14-17 April 2008, ed. H. Saiedian, pages 3-16. Preprint version  available online.

[6] Bertrand Meyer:  Design and Code Reviews in the Age of the Internet, in Communications of the ACM, vol. 51, no. 9, September 2008, pages 66-71. (Original version in Proceedings of SEAFOOD 2008 (Software Engineering Advances For Offshore and Outsourced Development,  Lecture Notes in Business Information Processing 16, Springer Verlag, 2009.) Available online.

[7] Martin Nordio, Roman Mitin, Bertrand Meyer, Carlo Ghezzi, Elisabetta Di Nitto and Giordano Tamburelli: The Role of Contracts in Distributed Development, in Proceedings of SEAFOOD 2009 (Software Engineering Advances For Offshore and Outsourced Development), Zurich, June-July 2009, Lecture Notes in Business Information Processing 35, Springer Verlag, 2009. Available online.

[8] Martin Nordio, Roman Mitin and Bertrand Meyer: Advanced Hands-on Training for Distributed and Outsourced Software Engineering, in ICSE 2010: Proceedings of 32th International Conference on Software Engineering, Cape Town, May 2010, IEEE Computer Society Press, 2010. Available online.

[9] ICSE SCORE 2011 competition home page.

[10] DOSE project course page.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

The other impediment to software engineering research

In the decades since structured programming, many of the advances in software engineering have come out of non-university sources, mostly of four kinds:

  • Start-up technology companies  (who played a large role, for example, in the development of object technology).
  • Industrial research labs, starting with Xerox PARC and Bell Labs.
  • Independent (non-university-based) author-consultants. 
  • Independent programmer-innovators, who start open-source communities (and often start their own businesses after a while, joining the first category).

 Academic research has had its part, honorable but limited.

Why? In earlier posts [1] [2] I analyzed one major obstacle to software engineering research: the absence of any obligation of review after major software disasters. I will come back to that theme, because the irresponsible attitude of politicial authorities hinders progress by depriving researchers of some of their most important potential working examples. But for university researchers there is another impediment: the near-impossibility of developing serious software.

If you work in theory-oriented parts of computer science, the problem is less significant: as part of a PhD thesis or in preparation of a paper you can develop a software prototype that will support your research all the way to the defense or the publication, and can be left to wither gracefully afterwards. But software engineering studies issues that arise for large systems, where  “large” encompasses not only physical size but also project duration, number of users, number of changes. A software engineering researcher who only ever works on prototypes will be denied the opportunity to study the most significant and challenging problems of the field. The occasional consulting job is not a substitute for this hands-on experience of building and maintaining large software, which is, or should be, at the core of research in our field.

The bodies that fund research in other sciences understood this long ago for physics and chemistry with their huge labs, for mechanical engineering, for electrical engineering. But in computer science or any part of it (and software engineering is generally viewed as a subset of computer science) the idea that we would actually do something , rather than talk about someone else’s artifacts, is alien to the funding process.

The result is an absurd situation that blocks progress. Researchers in experimental physics or mechanical engineering employ technicians: often highly qualified personnel who help researchers set up experiments and process results. In software engineering the equivalent would be programmers, software engineers, testers, technical writers; in the environments that I have seen, getting financing for such positions from a research agency is impossible. If you have requested a programmer position as part of a successful grant request, you can be sure that this item will be the first to go. Researchers quickly understand the situation and learn not even to bother including such requests. (I have personally never seen a counter-example. If you have a different experience, I will be interested to learn who the enlightened agency is. )

The result of this attitude of funding bodies is a catastrophe for software engineering research: the only software we can produce, if we limit ourselves to official guidelines, is demo software. The meaningful products of software engineering (large, significant, usable and useful open-source software systems) are theoretically beyond our reach. Of course many of us work around the restrictions and do manage to produce working software, but only by spending considerable time away from research on programming and maintenance tasks that would be far more efficiently handled by specialized personnel.

The question indeed is efficiency. Software engineering researchers should program as part of their normal work:  only by writing programs and confronting the reality of software development can we hope to make relevant contributions. But in the same way that an experimental physicist is helped by professionals for the parts of experimental work that do not carry a research value, a software engineering researcher should not have to spend time on porting the software to other architectures, performing configuration management, upgrading to new releases of the operating system, adapting to new versions of the libraries, building standard user interfaces, and all the other tasks, largely devoid of research potential, that software-based innovation requires.

Until  research funding mechanisms integrate the practical needs of software engineering research, we will continue to be stymied in our efforts to produce a substantial effect on the quality of the world’s software.


[1] The one sure way to advance software engineering: this blog, see here.
[2] Dwelling on the point: this blog, see here.

VN:F [1.9.10_1130]
Rating: 8.3/10 (18 votes cast)
VN:F [1.9.10_1130]
Rating: +6 (from 8 votes)

Verification As a Matter Of Course

At the ACM Symposium on Applied Computing (SAC) in Sierre last week, I gave a talk entitled “How you will be programming in 10 years”, describing a number of efforts by various people, with a special emphasis on our work at both ETH and Eiffel Software, which I think point to the future of software development. Several people have asked me for the slides, so I am making them available [1].

It occurred to me after the talk that the slogan “Verification As a Matter Of Course” (VAMOC) characterizes the general idea well. The world needs verified software, but the software development community is reluctant  to use traditional heavy-duty verification techniques. While some of the excuses are unacceptable, others sources of resistance are justified and it is our job to make verification part of the very fabric of everyday software development.

My bet, and the basis of large part of both Eiffel and the ETH verification work, is that it is possible to bring verification to practicing developers as a natural, unobtrusive component of the software development process, through the tools they use.

The talk also broaches on concurrency, where many of the same ideas apply; CAMOC is the obvious next slogan.


[1] Slides of “How you will be programming in 10 years” talk (PDF).

VN:F [1.9.10_1130]
Rating: 8.8/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Dwelling on the point

Once again, and we are not learning!

La Repubblica of last Thursday [1] and other Italian newspapers have reported on a “computer” error that temporarily brought thousands of accounts at the national postal service bank into the red. It is a software error, due to a misplacement of the decimal points in some transactions.

As usual the technical details are hazy; La Repubblica writes that:

Because of a software change that did not succeed, the computer system did not always read the decimal point during transactions”.

As a result, it could for example happen that a 15.00-euro withdrawal was understood as 1500 euros.
I have no idea what “reading the decimal point ” means. (There is no mention of OCR, and the affected transactions seem purely electronic.) Only some of the 12 million checking or “Postamat” accounts were affected; the article cites a number of customers who could not withdraw money from ATMs because the system wrongly treated their accounts as over-drawn. It says that this was the only damage and that the postal service will send a letter of apology. The account leaves many questions unanswered, for example whether the error could actually have favored some customers, by allowing them to withdraw money they did not have, and if so what will happen.

The most important unanswered question is the usual one: what was the software error? As usual, we will probably never know. The news items will soon be forgotten, the postal service will somehow fix its code, life will go on. Nothing will be learned; the next time around similar causes will produce similar effects.

I criticized this lackadaisical attitude in an earlier column [2] and have to hammer its conclusion again: any organization using public money should be required, when it encounters a significant software malfunction, to let experts investigate the incident in depth and report the results publicly. As long as we keep forgetting our errors we will keep repeating them. Where would airline safety be in the absence of thorough post-accident reports? That a software error did not kill anyone is not a reason to ignore it. Whether it is the Italian post messing up, a US agency’s space vehicle crashing on the moon or any other software fault causing systems to fail, it is not enough to fix the symptoms: we must have a professional report and draw the lessons for the future.


[1] Luisa Grion: Poste in tilt per una virgola — conti gonfiati, stop ai prelievi. In La Repubblica, 26 November 2009, page 18 of the print version. (At the time of writing it does not appear at,  but see  the TV segment also titled “Poste in tilt per una virgola” on Primocanale Web TV here, and other press articles e.g. in Il Tempo here.)

[2] On this blog: The one sure way to advance software engineering (post of 21 August 2009).

VN:F [1.9.10_1130]
Rating: 9.5/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 5 votes)

Specifying user interfaces

Many blogs including this one rely on the WordPress software. In previous states of the present page you may have noticed a small WordPress bug, which I find interesting.

“Tags” are a nifty WordPress feature. When you post a message, you can specify one or more informative “tags”. The tags of all messages appear in the right sidebar, each with a smaller or bigger font size depending on the number of messages that specified it. You can click such a tag in the sidebar and get, on the left, a page containing all the associated messages.

Now assume that many posts use a particular tag; in our example it is “Design by Contract”, not unexpected for this blog. Assume further that the tag name is long. It is indeed in this case: 18 characters. As a side note, no problem would arise if I used normal spaces in the name, which would then appear on two or three lines; precisely to avoid this  I use HTML “non-breaking spaces”. This is probably not in the WordPress spirit, but any other long tag without spaces would create the same problem. That problem is a garbled display:


The long tag overflows the bluish browser area assigned to tags, producing an ugly effect. This behavior is hard to defend: either the tag should have been rejected as too long when the poster specified it or it should fit in its zone, whether by truncation or by applying a smaller font.

I quickly found a workaround, not nice but good enough: make sure that some short tag  (such as “Hoare”) appears much more often than the trouble-making tag. Since font size indicates the relative frequency of tags, the long one will be scaled down to a smaller font which fits.

Minor as it is, this WordPress glitch raises some general questions. First, is it really a bug? Assume, by a wild stretch of imagination, that a jury had to resolve this question; it could easily find an expert to answer positively, by stating that the behavior does not satisfy reasonable user expectations, and another who notes that it is not buggy behavior since it does not appear to violate any expressly stated property of the specification. (At least I did not find in the WordPress documentation any mention of either the display size of tags or a limit on tag length; if I missed it please indicate the reference.)

Is it a serious matter? Not in this particular example; uncomely Web display does not kill.   But the distinction between “small matter of esthetics” and software fault can be tenuous. We may note in particular that the possibility for large data to overflow its assigned area is a fundamental source of security risks; and even pure user interface issues can become life-threatening in the case of critical applications such as air-traffic control.

Our second putative expert is right, however: no behavior is buggy unless it contradicts a specification. Where will the spec be in such an example? There are three possibilities, each with its limitations.

The first solution is to expect that in a carefully developed system every such property will have to be specified. This is conceivable, but hard, and the question arises of how to make sure nothing has been forgotten. Past  some threshold of criticality and effort, the only specifications that count are formal; there is not that much literature on specifying user interfaces formally, since much of the work on formal specifications has understandably concentrated on issues thought to be more critical.

Because of the tediousness of specifying such general properties again and again for each case, it might be better  — this is the second solution — to specify them globally, for an entire system, or for the user interfaces of an entire class of systems. Like any serious effort at specification, if it is worth doing, it is worth doing formally.

In either of these approaches the question remains of how we know we have specified everything of interest. This question, specification completeness, is not as hopeless as most people think; I plan to write an entry about it sometime (hint:  bing for “guttag horning”). Still, it is hard to be sure you did not miss anything relevant. Remember this the next time formal methods advocates — who should know better — tell you that with their techniques there “no longer is a need to test”, or when you read about the latest OS kernel that is “guaranteed correct and secure”. However important formal methods and proofs are, they can only guarantee satisfaction of the properties that the specifier has considered and stated. To paraphrase someone [1], I would venture that

Proofs can only show the absence of envisaged bugs, never rule out the presence of unimagined ones.

This is one of the reasons why tests will always, regardless of the progress of proofs, remain an indispensable part of the software development landscape [2]. Whatever you have specified and proved, you will still want to run the system (or, for certain classes of embedded software, some simulation of it) and see the results for yourself.

What then if we do not explicitly specify the desired property (as we did in the two approaches considered so far) but testing or actual operation still reveals some behavior that is clearly unsatisfactory? On what basis do we complain to the software’s producer? A solution here, the third in our list, might be to rely on generally accepted standards of professional development. This is common in other kinds of engineering: if you commission someone to build you a house, the contract may not explicitly state that the roof should not fall on your head while you are asleep; when this happens, you will still sue and accuse the builder of malpractice. Such remedies can work for software too, but the rules are murkier because we have not accepted, or even just codified, a set of general professional practices that would cover such details as “no display of information should overflow its assigned area”.

Until then I will remember to use one short tag a lot.


[1] Edsger W. Dijksra, Notes on Structured Programming, in Dahl, Dijkstra, Hoare, Structured Programming, Academic Press, 1972.

[2]  See Tests And Proofs (TAP) conference series, since 2007. The next conference, program-chaired by Angelo Gargantini and Gordon Fraser, will take place in the week of the TOOLS Federated Conferences in Málaga, Spain, in the week of June 28, 2010.

VN:F [1.9.10_1130]
Rating: 5.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)