Archive for the ‘Software engineering’ Category.

PhD positions in concurrency/distribution/verification at ETH

As part of our “Concurrency Made Easy” ERC Advanced Investigator Grant project (2012-2017), we are offering PhD positions at the Chair of Software Engineering of ETH Zurich. The goal of the project is to build a sophisticated programming and verification architecture to make concurrent and distributed programming simple and reliable, based on the ideas of Eiffel and particularly the SCOOP concurrency model. Concurrency in its various forms (particularly multithreading) as well as distributed computing are required for most of today’s serious programs, but programming concurrent applications remains a challenge. The CME project is determined to break this complexity barrier.  Inevitably, achieving simplicity for users (in this case, application programmers) requires, under the hood, a sophisticated infrastructure, both conceptual (theoretical models) and practical (the implementation). We are building that infrastructure.

ETH offers an outstanding research and education environment and competitive salaries for “assistants” (PhD students), who are generally expected in addition to their research to participate in teaching, in particular introductory programming, and other activities of the Chair.  The candidates we seek have: a master’s degree in computer science or related field from a recognized institution (as required by ETH); a strong software engineering background, both practical and theoretical, and more generally a strong computer science and mathematical culture; a good knowledge of verification techniques (e.g. Hoare-style, model-checking, abstract interpretation); some background in concurrency or distribution; and a passion for high-quality software development. Prior publications, and experience with Eiffel, are pluses. In line with ETH policy, particular attention will be given to female candidates.

Before applying, you should become familiar with our work; see in particular the research pages at se.ethz.ch including the full description of the CME project at cme.ethz.ch.

Candidates should send (in PDF or text ) to se-open-positions@lists.inf.ethz.ch a CV and a short cover letter describing their view of the CME project and ideas about their possible contribution.

VN:F [1.9.10_1130]
Rating: 5.7/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 4 votes)

Negative variables: new version

I have mentioned this paper before (see the earlier blog entry here) but it is now going to be published [1] and has been significantly revised, both to take referee comments into account and because we found better ways to present the concepts.

We have  endeavored to explain better than in the draft why the concept of negative variable is necessary and why the usual techniques for modeling object-oriented programs do not work properly for the fundamental OO operation, qualified call x.r (…). These techniques are based on substitution and are simply unable to express certain properties (let alone verify them). The affected properties are those involving properties of the calling context or the global project structure.

The basic idea (repeated in part from the earlier post) is as follows. In modeling OO programs, we have to take into account the unique “general relativity” property of OO programming: all the operations you write are expressed relative to a “current object” which changes repeatedly during execution. More precisely at the start of a call x.r (…) and for the duration of that call the current object changes to whatever x denotes — but to determine that object we must again interpret x in the context of the previous current object. This raises a challenge for reasoning about programs; for example in a routine the notation f.some_reference, if f is a formal argument, refers to objects in the context of the calling object, and we cannot apply standard rules of substitution as in the non-OO style of handling calls.

We introduced a notion of negative variable to deal with this issue. During the execution of a call x.r (…) the negation of x , written x’, represents a back pointer to the calling object; negative variables are characterized by axiomatic properties such as x.x’= Current and x’.(old x)= Current.

Negative variable as back pointer

The paper explains why this concept is necessary, describes the associated formal rules, and presents applications.

Reference

[1] Bertrand Meyer and Alexander Kogtenkov: Negative Variables and the Essence of Object-Oriented Programming, to appear in Specification, Algebra, and Software, eds. Shusaku Iida, Jose Meseguer and Kazuhiro Ogata, Springer Lecture Notes in Computer Science, 2014, to appear. See text here.

VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 4 votes)

Saint Petersburg Software Engineering Seminar: 14 January 2014 (6 PM)

There will be two talks in the Software Engineering Seminar at ITMO, 18:00 local time, Tuesday, January 14, 2014. Please arrive 10 minutes early for registration.

Place: ITMO, Sytninskaya Ulitsa, Saint Petersburg.

Andrey Terekhov (SPBGU): Programming crystals

(I do not know whether this talk will be in Russian or English. An abstract follows but the talk is meant as the start of a discussion rather than a formal lecture.)

В течение последних 20-30 лет основными языками программирования кристаллов были VHDL и Verilog. Эти языки изначально проектировались как средства создания проектной документации, потом они стали использоваться в качестве инструмента моделирования и только сравнительно недавно для этих языков появились средства генерации кода уровня RTL (Register Transfer Language). Тексты на  VHDL и Verilog очень громоздки, трудно читаемые, плохо стандартизованы (одна и та же программа может синтезироваться на одном инструменте и не поддаваться синтезу на другом. Лет 10 назад появился язык SystemC – это С++ с огромным набором библиотек. С одной стороны, любая программа на SystemC может транслироваться стандартными трансляторами С++ , есть удобные средства потактного моделирования и приличные средства генгерации RTL, с другой стороны, требование совместимости с С++ не прошло даром – если в базовом языке нет средств описания параллелизма и конвейеризации, их приходится добавлять весьма искусственными приемами через приставные библиотеки. Буквально в прошлом году фирма Xilinx выпустила продукт Vivado, в рекламе которого утверждается, что он способен автоматически транслировать обычные программы на С/C++ в RTL промышленного качества.

Мы выполнили несколько экспериментов по использованию этого продукта, оказалось, что обещанной автоматизации там нет, пользователь должен писать на С, постоянно думая о том, как его код будет выглядеть в финальном RTL,  расставлять огромное количество прагм, причем не всегда очевидных.

Основной тезис доклада – такая важная область, как проектирование кристаллов, нуждается в специализированных языковых и инструментальных средствах, обеспечивающих  создание компактных и  легко читаемых программ, которые могут быть использованы как для симуляции, так и для генерации эффективного RTL. В докладе будут приведены примеры программ на языке HaSCoL (Hardware and Software Codesign Language), разработанном на кафедре системного программирования СПбГУ, и даны некоторые сравнительные характеристики.

Sergey Velder (ITMO): Alias graphs

(My summary – BM.) In the ITMO SEL work on automatic alias analysis, a new model has been developed: alias graphs, an abstraction of the object structure. This short talk will compare it to previously used approaches.

VN:F [1.9.10_1130]
Rating: 5.5/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 2 votes)

New paper: alias calculus and frame inference

For a while now I have  been engaged in  a core problem of software verification: the aliasing problem. As with many difficult problems in science, it is easy to state the basic question: can we determine automatically whether at a program point p the values of two reference expressions e and f can ever denote the same object?

Alias analysis lies at the core of many problems in software analysis and verification.

Earlier work [2] I introduced an “alias calculus”. The calculus is a set of rules, attached to the constructs of the programming language, to compute the “alias relation”: the set of possibly aliased expression pairs. A new paper [1] with Sergey Velder and Alexander Kogtenkov improves the model (correcting in particular an error in the axiom for assignment, whose new version has been proved sound using Coq) and applies it to the inference of frame properties. Here the abstract:

Alias analysis, which determines whether two expressions in a program may reference to the same object, has many potential applications in program construction and verification. We have developed a theory for alias analysis, the “alias calculus”, implemented its application to an object-oriented language, and integrated the result into a modern IDE. The calculus has a higher level of precision than many existing alias analysis techniques. One of the principal applications is to allow automatic change analysis, which leads to inferring “modifies clauses”, providing a significant advance towards addressing the Frame Problem. Experiments were able to infer the “modifies” clauses of an existing formally specied library. Other applications, in particular to concurrent programming, also appear possible. The article presents the calculus, the application to frame inference including experimental results, and other projected applications. The ongoing work includes building more efficient model capturing aliasing properties and soundness proof for its essential elements.

This is not the end of the work, as better models and implementations are needed, but an important step.

References

[1] Sergey Velder, Alexander Kogtenkovand Bertrand Meyer: Alias Calculus, Frame Calculus and Frame Inference, in Science of Computer Programming, to appear in 2014 (appeared online 26 November 2013); draft available here, published version here.
[2] Bertrand Meyer: Steps Towards a Theory and Calculus of Aliasing, in International Journal of Software and Informatics, Chinese Academy of Sciences, 2011, pages 77-116, available here.

 

VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 4 votes)

The biggest software-induced disaster ever

 

In spite of the brouhaha surrounding the Affordable Care Act, the US administration and its partisans seem convinced that “the Web site problems will be fixed”.

That is doubtful. All reports suggest that the problem is not to replace a checkbox by a menu, or buy a few more servers. The analysis, design and implementation are wrong, and the sites will not work properly any time soon.

Barring sabotage (for which we have seen no evidence), this can only be the result of incompetence. An insurance exchange? Come on. Any half-awake group of developers could program it over breakfast.

Who chose the contractors?

When the problems first surfaced a few weeks ago, anyone with experience and guts would have done the right thing: fire all the companies responsible for  the mess and start from scratch with a dedicated, competent and well-managed team.

The latest promises published are that by the end of the month “four out of five” of the people trying to register will manage to do it. Nice. Imagine that when trying to make a purchase at Amazon you would succeed 80% of the time.

And that is only an optimistic goal.

The people building the site do not have infinite time. In fact, the process is crucially time-driven: if people do not get health coverage in time, they will be fined. But what if they cannot get coverage because the Web sites do not respond, or mess up?

Consider for a second another example of another strictly time-driven project: on January 1, 2002, twelve countries switched to a common currency, with the provision that their current legal tender would lose its status only a bare two months later. The IT infrastructure had to work on the appointed day. It did. How come Europe could implement the Euro in time and the US cannot get a basic health exchange to work?

Here is a possible scenario: the sites do not work (cannot handle the load, give inconsistent results). A massive wave of protests ensues, boosted by those who were against universal health coverage in the first place. Faced with popular revolt and with the evidence, the administration announces that the implementation of the universal mandate — the enforcement of the fines — is delayed by a year. In a year much can happen; opposition grows and the first exchanges are an economic disaster since the “young healthy adults” feel no pressure to enroll. The law fades into oblivion. Americans do not get universal health care for another generation. Show me it is not going to happen.

The software engineering lessons here are clear: hire competent companies; faced with a complicated system, implement the essential functions first, but stress-test them; deploy step by step, with the assurance that whatever is deployed works.

The exact reverse strategy was applied. As a result, we face the prospect of a software disaster that will dwarf Y2K and other famous mishaps; a disaster that software engineering textbooks will feature for decades to come.

 

VN:F [1.9.10_1130]
Rating: 9.1/10 (17 votes cast)
VN:F [1.9.10_1130]
Rating: +6 (from 10 votes)

Informatics education in Europe: Just the facts

 

In 2005 a number of us started Informatics Europe [1], the association of university departments and industrial research labs in computer science in Europe. The association has now grown to 80 members across the entire continent; it organizes the annual European Computer Science Summit and has published a number of influential reports. The last one just came out: Informatics Education in Europe: Institutions, Degrees, Students, Positions, Salaries — Key Data 2008-2012 [2]. The principal author is Cristina Pereira, who collected and organized the relevant data over more than a year; I helped with the preparation of the final text.

At the beginning of Informatics Europe we considered with particular attention  the model of the Computing Research Association [3], which played a crucial role in giving computer science (informatics) its due place in the US academic landscape. Several past and current officers of the CRA,  such as Willy Zwaenepoel, Ed Lazowska, Bob Constable, Andy Bernat, Jeannette Wing, Moshe Vardi and J Strother Moore gave keynotes at our early conferences and we of course asked them for the secrets of their organization’s success. One answer that struck us was the central role played by data collection. Just gathering the  facts, such as degrees and salaries, established for the first time a solid basis for serious discussions. We took this advice to heart and the report is the first result.

Gathering the information is particularly difficult for Europe given the national variations and the absence of centralized statistical data. Even the list of names under which institutions teach informatics in Europe fills a large table in the report. Cristina’s decision was, from the start, to favor quality over quantity: to focus on impeccable data for countries for which we could get it, rather than trying to cover the whole continent with data of variable credibility.

The result is the first systematic repository of basic information on informatics education in Europe: institutions, degrees offered and numbers awarded, student numbers, position titles and definitions, and (a section which will not please everyone) salaries for PhD students, postdocs and professors of various ranks.

The report is a first step; it only makes sense if we can regularly continue to update it and particularly extend it to other countries. But even in its current form (and with the obvious observation that my opinion is not neutral)  I see it as a major step forward for the discipline in Europe. We need an impeccable factual basis to convince the public at large and political decision-makers to give informatics the place it deserves in today’s educational systems.

References

[1] Informatics Europe site, see here.

[2] Cristina Pereira and Bertrand Meyer: Informatics Education in Europe: Institutions, Degrees, Students, Positions, Salaries — Key Data 2008-2012, Informatics Europe report, 30 September 2013, available here.

[3] Computing Research Association (US), see here.

.

VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 2 votes)

The secret of success

In the process of finishing a book right now, it occurred to me that writing is really a simple matter. There are only three issues to address:

  1. How to start.
  2. How to finish.
  3. How to take care of the stuff in-between.

In the early stages  (1) you face the anguish of the “empty paper, defended by its whiteness” (Mallarmé) and do not know where to begin. If you overcome it, you have all the grunt work to do (3), step after step. At the end (2), while deep down you feel you have done enough and should be permitted to  click “Publish”, you still have to fill the holes, which may be small in number but have remained holes for a reason: you have again and again delayed filling them because in reality you do not know what to write there.

All things considered, then, it is not such a big deal.

VN:F [1.9.10_1130]
Rating: 8.8/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 4 votes)

The laws of branching (part 2): Tichy and Joy

Recently I mentioned the first law of branching (see earlier article) to Walter Tichy, famed creator of RCS, the system that established modern configuration management. He replied with the following anecdote, which is worth reproducing in its entirety (in his own words):

I started work on RCS in 1980, because I needed an alternative for SCCS, for which the license cost would have been prohibitive. Also, I wanted to experiment with reverse deltas. With reverse deltas, checking out the latest version is fast, because it is stored intact. For older ones, RCS applied backward deltas. So the older revisions took longer to extract, but that was OK, because most accesses are to the newest revision anyway.

At first, I didn’t know how to handle branches in this scheme. Storing each branch tip in full seemed like a waste. So I simply left out the branches.

It didn’t take long an people were using RCS. Bill Joy, who was at Berkeley at the time and working on Berkeley Unix, got interested. He gave me several hints about unpleasant features of SCCS that I should correct. For instance, SCCS didn’t handle identification keywords properly under certain circumstances, the locking scheme was awkward, and the commands too. I figured out a way to solve these issue. Bill was actually my toughest critic! When I was done with all the modifications, Bill cam back and said that he was not going to use RCS unless I put in branches. So I figured out a way. In order to reconstruct a branch tip, you start with the latest version on the main trunk, apply backwards deltas up to the branch point, and then apply forward deltas out to the branch tip. I also implemented a numbering scheme for branches that is extensible.

When discussing the solution, Bill asked me whether this scheme meant that it would take longer to check in and out on branches. I had to admit that this was true. With the machines at that time (VAXen) efficiency was important. He thought about this for a moment and then said that that was actually great. It would discourage programmers from using branches! He felt they were a necessary evil.

VN:F [1.9.10_1130]
Rating: 6.1/10 (10 votes cast)
VN:F [1.9.10_1130]
Rating: -2 (from 4 votes)

Another displaced business

Front-page notice in yesterday’s Tages Anzeiger (one of the principal Swiss newspapers):

Dear Readers:

From today the employment-ads section will no longer appear as a separate supplement, but directly as a section of the Tuesdays and Thursday paper. The reason is the ever smaller number of position offerings.

It seems clear that what has decreased is not the number of positions offered — the job market in Switzerland is healthy — but the number of those offered through newspapers. End of an era.

VN:F [1.9.10_1130]
Rating: 7.5/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: -1 (from 1 vote)

The laws of branching (part 1)

 

The first law of branching is: don’t. There is no other law.

The only sane way to develop software in a group, whether collocated or distributed, is to have a single branch (“trunk”) to which everyone commits changes, with constant running of the regression test suite to make sure that any breaking change is detected and corrected right away.

To allow branching, that is to say the emergence of separate lines of development with the expectation that they will be merged back later on, is to guarantee disaster. It is easy to diverge, but hard to converge; not only hard, but unpredictable. It can take days, weeks or more to reconcile changes and resolve conflicts, when the reason for the changes is no longer fresh in the developers’ memories, and the developers themselves may even no longer be there. Conflicts should be detected right away, and corrected immediately.

The EiffelStudio development never uses branches. A related development, EVE (Eiffel Verification Environment), maintained at ETH, includes all research tools, and is reconciled every Friday with the EiffelStudio trunk, so it is never allowed to diverge into a separate branch. Most other successful teams I know apply the first law of branching too. Solve conflicts before they have had the time to become conflicts.

VN:F [1.9.10_1130]
Rating: 5.1/10 (28 votes cast)
VN:F [1.9.10_1130]
Rating: -10 (from 20 votes)

Swiss railway prowess

Yesterday, wanting to find out the platform of an arriving  train (stations are good at displaying departures, but don’t seem to consider arrival information worth showing), I fired up the “SBB Business” mobile application of the Swiss railways which I purchased some time ago:

SBB schedule

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

The results are impressive: one can, they tell us, ride from Paris to Zurich — about 500 kilometers or 300 miles — in 56 minutes by leaving at 17:02. Wow! No wonder the entry displays (on the right) the graphical symbol for the highest possible expected passenger load, in both first and second class. With such a speed I would flock to that train too!

Nonsense of course. The TGV is fast, at least in the French part (from Basel to Zurich they seem to make it as slow as possible to prove some point), but it still takes four hours and three minutes. I have no idea where the program got the information it displays.

Trying to tap the “earlier” or “later” buttons does not help much, since what you get (consistently repeated if you keep trying, and confirmed again one day later) is this screen:

SBB error message

 

 

 

 

 

 

 

 

 

 

 

 

 

 

No clue what “F1” means.

Whatever software solutions the SBB uses, I am not impressed. Of course, they are welcome to ask for my suggestions.

VN:F [1.9.10_1130]
Rating: 7.5/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 5 votes)

Empirical answers to fundamental software engineering questions

This is a slightly reworked version of an article in the CACM blog, which also served as the introduction to a panel which I moderated at ESEC/FSE 2013 last week; the panelists were Harald Gall, Mark Harman, Giancarlo Succi (position paper only) and Tony Wasserman.

For all the books on software engineering, and the articles, and the conferences, a remarkable number of fundamental questions, so fundamental indeed that just about every software project runs into them, remain open. At best we have folksy rules, some possibly true, others doubtful, and others — such as “adding people to a late software project delays it further” [1] — wrong to the point of absurdity. Researchers in software engineering should, as their duty to the community of practicing software practitioners, try to help provide credible answers to such essential everyday questions.

The purpose of this panel discussion is to assess what answers are already known through empirical software engineering, and to define what should be done to get more.

Empirical software engineering” applies the quantitative methods of the natural sciences to the study of software phenomena. One of its tasks is to subject new methods — whose authors sometimes make extravagant and unsupported claims — to objective scrutiny. But the benefits are more general: empirical software engineering helps us understand software construction better.

There are two kinds of target for empirical software studies: products and processes. Product studies assess actual software artifacts, as found in code repositories, bug databases and documentation, to infer general insights. Project studies assess how software projects proceed and how their participants work; as a consequence, they can share some properties with studies in other fields that involve human behavior, such as sociology and psychology. (It is a common attitude among computer scientists to express doubts: “Do you really want to bring us down to the standards of psychology and sociology?” Such arrogance is not justified. These sciences have obtained many results that are both useful and sound.)

Empirical software engineering has been on a roll for the past decade, thanks to the availability of large repositories, mostly from open-source projects, which hold information about long-running software projects and can be subjected to data mining techniques to identify important properties and trends. Such studies have already yielded considerable and often surprising insights about such fundamental matters as the typology of program faults (bugs), the effectiveness of tests and the value of certain programming language features.

Most of the uncontested successes, however, have been from the product variant of empirical software engineering. This situation is understandable: when analyzing a software repository, an empirical study is dealing with a tangible and well-defined artifact; if any of the results seems doubtful, it is possible and sometimes even easy for others to reproduce the study, a key condition of empirical science. With processes, the object of study is more elusive. If I follow a software project working with Scrum and another using a more traditional lifecycle, and find that one does better than the other, how do I know what other factors may have influenced the outcome? And even if I bring external factors under control how do I compare my results with those of another researcher following other teams in other companies? Worse, in a more realistic scenario I do not always have the luxury of tracking actual industry projects since few companies are enlightened enough to let researchers into their developments; how do I know that I can generalize to industry the conclusions of experiments made with student groups?

Such obstacles do not imply that sound results are impossible; studies involving human behavior in psychology and sociology face many of the same difficulties and yet do occasionally yield insights. But these obstacles explain why there are still few incontrovertible results on process aspects of software engineering. This situation is regrettable since it means that projects large and small embark on specific methods, tools and languages on the basis of hearsay, opinions and sometimes hype rather than solid knowledge.

No empirical study is going to give us all-encompassing results of the form “Agile methods yield better products” or “Object-oriented programming is better than functional programming”. We are entitled to expect, however, that they help practitioners assess some of the issues that await every project. They should also provide a perspective on the conventional wisdom, justified or not, that pervades the culture of software engineering. Here are some examples of general statements and questions on which many people in the field have opinions, often reinforced by the literature, but crying for empirical backing:

  • The effect of requirements faults: the famous curve by Boehm is buttressed by old studies on special kinds of software (large mission-critical defense projects). What do we really lose by not finding an error early enough?
  • The cone of uncertainty: is that idea just folklore?
  • What are the successful techniques for shortening delivery time by adding manpower?
  • The maximum compressibility factor: is there a nominal project delivery time, and how much can a project decrease it by throwing in money and people?
  • Pair programming: when does it help, when does it hurt? If it has any benefits, are there in quality or in productivity (delivery time)?
  • In iterative approaches, what is the ideal time for a sprint under various circumstances?
  • How much requirements analysis should be done at the beginning of a project, and how much deferred to the rest of the cycle?
  • What predictors of size correlate best with observed development effort?
  • What predictors of quality correlate best with observed quality?
  • What is the maximum team size, if any, beyond which a team should be split?
  • Is it better to use built-in contracts or just to code assertions in tests?

When asking these and other similar questions relating to core aspects of practical software development, I sometimes hear “Oh, but we know the answer conclusively, thanks to so-and-so’s study“. This may be true in some cases, but in many others one finds, in looking closer, that the study is just one particular experiment, fraught with the same limitations as any other.

The principal aim of the present panel is to find out, through the contributions of the panelists which questions have useful and credible empirical answers already available, whether or not widely known. The answers must indeed be:

  • Empirical: obtained through objective quantitative studies of projects.
  • Useful: providing answers to questions of interest to practitioners.
  • Credible: while not necessarily absolute (a goal difficult to reach in any matter involving human behavior), they must be backed by enough solid evidence and confirmation to be taken as a serious input to software project decisions.

An auxiliary outcome of the panel should be to identify fundamental questions on which credible, useful empirical answers do not exist but seem possible, providing fuel for researchers in the field.

To mature, software engineering must shed the folkloric advice and anecdotal evidence that still pervade the field and replace them with convincing results, established with all the limitations but also the benefits of quantitative, scientific empirical methods.

Note

[1] From Brooks’s Mythical Man-Month.

VN:F [1.9.10_1130]
Rating: 8.7/10 (10 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 5 votes)

Concurrency video

Our Concurrency Made Easy project, the result of an ERC Advanced Investigator Grant, is trying to solve the problem of making concurrent programming simple, reliable and effective. It has spurred related efforts, in particular the Roboscoop project applying concurrency to robotics software.

Sebastian Nanz and other members of the CME project at ETH have just produced a video that describes the aims of the project and presents some of the current achievements. The video is available on the CME project page [1] (also directly on YouTube [2]).

References

[1] Concurrency Made Easy project, here.

[2] YouTube CME video, here.

VN:F [1.9.10_1130]
Rating: 9.5/10 (17 votes cast)
VN:F [1.9.10_1130]
Rating: +13 (from 15 votes)

Smaller, better textbook

A new version of my Touch of Class [1] programming textbook is available. It is not quite a new edition but more than just a new printing. All the typos that had been reported as of a few months ago have been corrected.

The format is also significantly smaller. This change is more than a trifle. When а  reader told me for the first time “really nice book, pity it is so heavy!”, I commiserated and did not pay much attention. After twenty people said that, and many more after them, including professors looking for textbooks for their introductory programming classes, I realized it was a big deal. The reason the book was big and heavy was not so much the size of the contents (876 is not small, but not outrageous for a textbook introducing all the fundamental concepts of programming). Rather, it is a technical matter: the text is printed in color, and Springer really wanted to do a good job, choosing thick enough paper that the colors would not seep though. In addition I chose a big font to make the text readable, resulting in a large format. In fact I overdid it; the font is bigger than necessary, even for readers who do not all have the good near-reading sight of a typical 19-year-old student.

We kept the color and the good paper,  but reduced the font size and hence the length and width. The result is still very readable, and much more portable. I am happy to make my contribution to reducing energy consumption (at ETH alone, think of the burden on Switzerland’s global energy bid of 200+ students carrying the book — as I hope they do — every morning on the buses, trains and trams crisscrossing the city!).

Springer also provides electronic access.

Touch of Class is the textbook developed on the basis of the Introduction to Programming course [2], which I have taught at ETH Zurich for the last ten years. It provides a broad overview of programming, starting at an elementary level (zeros and ones!) and encompassing topics not usually covered in introductory courses, even a short introduction to lambda calculus. You can get an idea of the style of coverage of such topics by looking up the sample chapter on recursion at touch.ethz.ch. Examples of other topics covered include a general introduction to software engineering and software tools. The presentation uses full-fledged object-oriented concepts (including inheritance, polymorphism, genericity) right from the start, and Design by Contract throughout. Based on the “inverted curriculum” ideas on which I published a number of articles, it presents students with a library of reusable components, the Traffic library for graphical modeling of traffic in a city, and builds on this infrastructure both to teach students abstraction (reusing code through interfaces including contracts) and to provide them models of high-quality code for imitation and inspiration.

For more details see the article on this blog that introduced the book when it was first published [3].

References

[1] Bertrand Meyer, Touch of Class: An Introduction to Programming Well Using Objects and Contracts, Springer Verlag, 2nd printing, 2013. The Amazon page is here. See the book’s own page (with slides and other teaching materials, sample chapter etc.) here. (Also available in Russian, see here.)

[2] Einführung in die Programmierung (Introduction to Programming) course, English course page here.

[3] Touch of Class published, article on this blog, 11 August 2009, see [1] here.

VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 7 votes)

Barenboim = Rubinstein?

 

I have always admired Daniel Barenboim, both as a pianist and as a conductor — and not just because years ago, from pictures on disk covers, we looked strikingly alike, see e.g. [1] which could almost be me at that time (Then I went to see him in concert and realized that he was a good 15 centimeters shorter; the pictures were only head-and-shoulders. Since that time the difference of our physical appearances has considerably increased, not compensated, regrettably, by any decrease of the difference of our musical abilities.)

Nowadays you can find lots of good music, an unbelievable quantity in fact, on YouTube. Like this excellent performance of Beethoven’s Emperor Concerto [2] by Barenboim with a Danish orchestra.

If you go to that page and expand the “about” tab an interesting story unfolds. If I parse it right (I have no direct information) it is a record of a discussion between the person who uploaded the video and the YouTube copyright police. It seems YouTube initially rejected the upload on the basis that it violated no fewer than three different copyrights, all apparently for recordings of the concerto: one by the Berlin Philharmonic (pianist not named), one by Arthur Rubinstein (orchestra not named), and one by Ivan Szekely (orchestra not named). The uploader contested these copyright claims, pointing out that the performers are different in all four cases. It took a little more than a month before YouTube accepted the explanation and released the video on 22 April 2012.

Since the page clearly listed  the performers’ names and contained a full video, the initial copyright complaints must have been made on the basis of the audio track alone. Further, the detection must have been automatic, as it is hard to imagine that either YouTube or the copyright owners employ a full staff of music experts to listen all day to recordings on the web and,  once in a while, write an email of the form “I just heard something at http://musicsite.somewhere that sounds suspiciously close to bars 37-52  of what I remember from the `Adagio un poco mosso’ in the 1964 Rubinstein performance, or possibly his 1975 performance, of Beethoven’s Emperor“. (The conductor in Rubinstein’s 1975 recording, by the way, is… Daniel Barenboim.) Almost certainly, the check is done by a program which scours the Web for clones.

It seems, then, that the algorithm used by YouTube or whoever runs these checks can, reasonably enough, detect that a recording is from a certain piece of music, but — now the real scandal — cannot distinguish between Rubinstein and Barenboim.

If this understanding is correct one would like to think that some more research can solve the problem. That would assume that humans can always distinguish performers. On French radio there used to be a famous program, the “Tribune of Record Critics“, where for several hours on every Sunday the moderator would play excerpts of a given piece in various interpretations, and the highly opinionated star experts on the panel would praise some to the sky and excoriate others (“This Karajan guy — does he even know what music is about?“).  One day, probably an April 1st,  they broadcast a parody of themselves, pretending to fight over renditions of Beethoven’s The Ruins of Athens overture while all were actually the same recording being played again and again. After that I always wondered whether in normal instances of the program the technicians were not tempted once in a while to switch recordings to fool the experts. (The version of the program that runs today, which is much less fun, relies on blind tasting, if I may call it that way.) Presumably no professional listener would ever confuse the playing of Barenboim (the pianist) with that of Rubinstein. Presumably… and yet reading about the very recent Joyce Batto scandal [3], in which a clever fraudster  tricked the whole profession  for a decade about more than a hundred recordings, is disturbing.

If my understanding of the situation regarding the Barenboim video is correct, then it remarkable that any classical music recordings can appear at all on YouTube without triggering constant claims of copyright infringement; specifically, any multiple recordings of the same piece. In classical music, interpretation is crucial, and one never tires of comparing performances of the same piece by different artists, with differences that can be subtle at times and striking at others. Otherwise, why would we go hear Mahler’s 9th or see Cosi Fan Tutte after having been there, done that so many times? And now we can perform even more comparisons without leaving home, just by browsing YouTube. Try for example Schumann’s Papillons by Arrau, Kempf, Argerich and — my absolute favorite for many years — Richter. Perhaps a reader with expertise on the topic can tell us about the current state of plagiarism detection for music: how finely can it detect genuine differences of interpretation, without being fooled by simple tricks as were used in the Batto case?

Still. To confuse Barenboim with Rubinstein!

References and notes

[1] A photograph of the young Barenboim: see here.

[2] Video recording of performance of Beethoven’s 5th piano concert by Daniel Barenboim and Det kongelige kapel conducted by Michael Schønvandt on the occasion of the Sonning Prize award, 2009, uploaded to YouTube by “mugge62” and available here.

[3] Wikipedia entry on Joyce Hatto and the Barrington-Coupe fraud, here.

VN:F [1.9.10_1130]
Rating: 8.8/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 5 votes)

The invariants of key algorithms (new paper)

 

I have mentioned this paper before but as a draft. It has now been accepted by ACM’s Computing Surveys and is scheduled to appear in September 2014; the current text, revised from the previous version, is available [1].

Here is the abstract:

Software verification has emerged as a key concern for ensuring the continued progress of information technology. Full verification generally requires, as a crucial step, equipping each loop with a “loop invariant”. Beyond their role in verification, loop invariants help program understanding by providing fundamental insights into the nature of algorithms. In practice, finding sound and useful invariants remains a challenge. Fortunately, many invariants seem intuitively to exhibit a common flavor. Understanding these fundamental invariant patterns could therefore provide help for understanding and verifying a large variety of programs.

We performed a systematic identification, validation, and classification of loop invariants over a range of fundamental algorithms from diverse areas of computer science. This article analyzes the patterns, as uncovered in this study,governing how invariants are derived from postconditions;it proposes a taxonomy of invariants according to these patterns, and presents its application to the algorithms reviewed. The discussion also shows the need for high-level specifications based on “domain theory”. It describes how the invariants and the corresponding algorithms have been mechanically verified using an automated program prover; the proof source files are available. The contributions also include suggestions for invariant inference and for model-based specification.

Reference

[1] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, to appear in September 2014, preliminary text available here.

VN:F [1.9.10_1130]
Rating: 7.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Informatics: catch them early

 

Recycled[I occasionally post on the Communications of the ACM blog. It seems that there is little overlap between readers of that blog and the present one, so — much as I know, from software engineering, about the drawbacks of duplication — I will continue to repost articles here when relevant.]

Some call it computer science, others informatics, but they face the same question: when do we start teaching the subject? In many countries where high schools began to introduce it in the seventies, they actually retreated since then; sure, students are shown how to use word processors and spreadsheets, but that’s not the point.

Should we teach computer science in secondary (and primary) school? In a debate at SIGCSE a few years ago, Bruce Weide said in strong words that we should not: better give students the strong grounding in mathematics and especially logic that they will need to become good at programming and CS in general. I found the argument convincing: I teach first-semester introductory programming to 200 entering CS students every year, and since many have programming experience, of highly diverse nature but usually without much of a conceptual basis, I find myself unteaching a lot. In a simple world, high-school teachers would teach students to reason, and we would teach them to program. The world, however, is not simple. The arguments for introducing informatics earlier are piling up:

  • What about students who do not enter CS programs?
  • Many students will do some programming anyway. We might just as well teach them to do it properly, rather than let them develop bad practices and try to correct them at the university level.
  • Informatics is not just a technique but an original scientific discipline, with its own insights and paradigms (see [2] and, if I may include a self-citation, [3]). Its intellectual value is significant for all educated citizens, not just computer scientists.
  • Countries that want to be ahead of the race rather than just consumers of IT products need their population to understand the basic concepts, just as they want everyone, and not just future mathematicians, to master the basics of arithmetics, algebra and geometry.

These and other observations led Informatics Europe and ACM Europe, two years ago, to undertake the writing of a joint report, which has now appeared [1]. The report is concise and makes strong points, emphasizing in particular the need to distinguish education in informatics from a mere training in digital literacy (the mastery of basic IT tools, the Web etc.). The distinction is often lost on the general public and decision-makers (and we will surely have to emphasize it again and again).

The report proposes general principles for both kinds of programs, emphasizing in particular:

  • For digital literacy, the need to teach not just how-tos but also safe, ethical and effective use of IT resources and tools.
  • For informatics, the role of this discipline as a cross-specialty subject, like mathematics.

The last point is particularly important since we should make it clear that we are not just pushing (out of self-interest, as members of any discipline could) for schools to give our specialty a share, but that informatics is a key educational, scientific and economic resource for the citizens of any modern country.

The report is written from a European perspective, but the analysis and conclusions will, I think, be useful in any country.

It does not include any detailed curriculum recommendation, first because of the wide variety of educational contexts, but also because that next task is really work for another committee, which ACM Europe and Informatics Europe are in the process of setting up. The report also does not offer a magic solution to the key issue of bootstrapping the process — by finding teachers to make the courses possible, and courses to justify training the teachers — but points to successful experiences in various countries that show a way to break the deadlock.

The introduction of informatics as a full-fledged discipline in the K-12 curriculum is clearly where the winds of history are blowing. Just as the report was being finalized, the UK announced that it was making CS one of the choices of required scientific topics would become a topic in the secondary school exam on a par with traditional sciences. The French Academy of Sciences recently published its own report on the topic, and many other countries have similar recommendations in progress. The ACM/IE report is a major milestone which should provide a common basis for all these ongoing efforts.

References

[1] Informatics education: Europe Cannot Afford to Miss the Boat,  Report of the joint Informatics Europe & ACM Europe Working Group on Informatics Education,  April 2013,  available here.

[2] Jeannette Wing: Computational Thinking, in Communications of the ACM, vol. 49, no. 3, March 2006, pages 33-35, available here.

[3] Bertrand Meyer: Software Engineering in the Academy, in Computer (IEEE), vol. 34, no. 5, May 2001, pages 28-35, available here.

VN:F [1.9.10_1130]
Rating: 8.2/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 2 votes)

Reading notes: strong specifications are well worth the effort

 

This report continues the series of ICSE 2013 article previews (see the posts of these last few days, other than the DOSE announcement), but is different from its predecessors since it talks about a paper from our group at ETH, so you should not expect any dangerously delusional,  disingenuously dubious or downright deceptive declaration or display of dispassionate, disinterested, disengaged describer’s detachment.

The paper [1] (mentioned on this blog some time ago) is entitled How good are software specifications? and will be presented on Wednesday by Nadia Polikarpova. The basic result: stronger specifications, which capture a more complete part of program functionality, cause only a modest increase in specification effort, but the benefits are huge; in particular, automatic testing finds twice as many faults (“bugs” as recently reviewed papers call them).

Strong specifications are specifications that go beyond simple contracts. A straightforward example is a specification of a push operation for stacks; in EiffelBase, the basic Eiffel data structure library, the contract’s postcondition will read

item =                                          /A/
count = old count + 1

where x is the element being pushed, item the top of the stack and count the number of elements. It is of course sound, since it states that the element just pushed is now the new top of the stack, and that there is one more element; but it is also  incomplete since it says nothing about the other elements remaining as they were; an implementation could satisfy the contract and mess up with these elements. Using “complete” or “strong” preconditions, we associate with the underlying domain a theory [2], or “model”, represented by a specification-only feature in the class, model, denoting a sequence of elements; then it suffices (with the convention that the top is the first element of the model sequence, and that “+” denotes concatenation of sequences) to use the postcondition

model = <x> + old model         /B/

which says all there is to say and implies the original postconditions /A/.

Clearly, the strong contracts, in the  /B/ style, are more expressive [3, 4], but they also require more specification effort. Are they worth the trouble?

The paper explores this question empirically, and the answer, at least according to the criteria used in the study, is yes.  The work takes advantage of AutoTest [5], an automatic testing framework which relies on the contracts already present in the software to serve as test oracles, and generates test cases automatically. AutoTest was applied to both to the classic EiffelBase, with classic partial contracts in the /A/ style, and to the more recent EiffelBase+ library, with strong contracts in the /B/ style. AutoTest is for Eiffel programs; to check for any language-specificity in the results the work also included testing a smaller set of classes from a C# library, DSA, for which a student developed a version (DSA+) equipped with strong model-based contracts. In that case the testing tool was Microsoft Research’s Pex [7]. The results are similar for both languages: citing from the paper, “the fault rates are comparable in the C# experiments, respectively 6 . 10-3 and 3 . 10-3 . The fault complexity is also qualitatively similar.

The verdict on the effect of strong specifications as captured by automated testing is clear: the same automatic testing tools applied to the versions with strong contracts yield twice as many real faults. The term “real fault” comes from excluding spurious cases, such as specification faults (wrong specification, right implementation), which are a phenomenon worth studying but should not count as a benefit of the strong specification approach. The paper contains a detailed analysis of the various kinds of faults and the corresponding empirically determined measures. This particular analysis is for the Eiffel code, since in the C#/Pex case “it was not possible to get an evaluation of the faults by the original developers“.

In our experience the strong specifications are not that much harder to write. The paper contains a precise measure: about five person-weeks to create EiffelBase+, yielding an “overall benefit/effort ratio of about four defects detected per person-day“. Such a benefit more than justifies the effort. More study of that effort is needed, however, because the “person” in the person-weeks was not just an ordinary programmer. True, Eiffel experience has shown that most programmers quickly get the notion of contract and start applying it; as the saying goes in the community, “if you can write an if-then-else, you can write a contract”. But we do not yet have significant evidence of whether that observation extends to model-based contracts.

Model-based contracts (I prefer to call them “theory-based” because “model” means so many other things, but I do not think I will win that particular battle) are, in my opinion, a required component of the march towards program verification. They are the right compromise between simple contracts, which have proved to be attractive to many practicing programmers but suffer from incompleteness, and full formal specification à la Z, which say everything but require too much machinery. They are not an all-or-nothing specification technique but a progressive one: programmers can start with simple contracts, then extend and refine them as desired to yield exactly the right amount of precision and completeness appropriate in any particular context. The article shows that the benefits are well worth the incremental effort.

According to the ICSE program the talk will be presented in the formal specification session, Wednesday, May 22, 13:30-15:30, Grand Ballroom C.

References

[1] Nadia Polikarpova, Carlo A. Furia, Yu Pei, Yi Wei and Bertrand Meyer: What Good Are Strong Specifications?, to appear in ICSE 2013 (Proceedings of 35th International Conference on Software Engineering), San Francisco, May 2013, draft available here.

[2] Bertrand Meyer: Domain Theory: the forgotten step in program verification, article on this blog, see here.

[3] Bernd Schoeller, Tobias Widmer and Bertrand Meyer: Making Specifications Complete Through Models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, available here.

[4] Nadia Polikarpova, Carlo Furia and Bertrand Meyer: Specifying Reusable Components, in Verified Software: Theories, Tools, Experiments (VSTTE ‘ 10), Edinburgh, UK, 16-19 August 2010, Lecture Notes in Computer Science, Springer Verlag, 2010, available here.

[5] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, also available here.

[6] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, in IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, also available here.

[7] Nikolai Tillman and Peli de Halleux, Pex: White-Box Generation for .NET, in Tests And Proofs (TAP 2008), pp. 134-153.

 

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

New course partners sought: a DOSE of software engineering education

 

Since 2007 we have conducted, as part of a course at ETH, the DOSE project, Distributed and Outsourced Software Engineering, developed by cooperating student teams from a dozen universities around the world. We are finalizing the plans for the next edition, October to December 2013, and will be happy to welcome a few more universities.

The project consists of building a significant software system collaboratively, using techniques of distributed software development. Each university contributes a number of “teams”, typically of two or three students each; then “groups”, each made up of three teams from different universities, produce a version of the project.

The project’s theme has varied from year to year, often involving games. We make sure that the development naturally divides into three subsystems or “clusters”, so that each group can quickly distribute the work among its teams. An example of division into clusters, for a game project, is: game logic; database and player management; user interface. The page that describes the setup in more detail [1] has links enabling you to see the results of some of the best systems developed by students in recent years.

The project is a challenge. Students are in different time zones, have various backgrounds (although there are minimum common requirements [1]), different mother tongues (English is the working language of the project). Distributed development is always hard, and is harder in the time-constrained context of a university course. (In industry, while we do not like that a project’s schedule slips, we can often survive if it does; in a university, when the semester ends, we have to give students a grade and they go away!) It is typical, after the initial elation of meeting new student colleagues from exotic places has subsided and the reality of interaction sets in, that some groups will after a month, just before the first or second deadline, start to panic — then take matters into their own hands and produce an impressive result. Students invariably tell us that they learn a lot through the course; it is a great opportunity to practice the principles of modern software engineering and to get prepared for the realities of today’s developments in industry, which are in general distributed.

For instructors interested in software engineering research, the project is also a great way to study issues of distributed development in  a controlled setting; the already long list of publications arising from studies performed in earlier iterations [3-9] suggests the wealth of available possibilities.

Although the 2013 project already has about as many participating universities as in previous years, we are always happy to consider new partners. In particular it would be great to include some from North America. Please read the requirements on participating universities given in [1]; managing such a complex process is a challenge in itself (as one can easily guess) and all teaching teams must share goals and commitment.

References

[1] General description of DOSE, available here.

[2] Bertrand Meyer: Offshore Development: The Unspoken Revolution in Software Engineering, in Computer (IEEE), January 2006, pages 124, 122-123, available here.

[3] Bertrand Meyer and Marco Piccioni: The Allure and Risks of a Deployable Software Engineering Project: Experiences with Both Local and Distributed Development, in Proceedings of IEEE Conference on Software Engineering & Training (CSEE&T), Charleston (South Carolina), 14-17 April 2008, available here.

[4] Martin Nordio, Roman Mitin, Bertrand Meyer, Carlo Ghezzi, Elisabetta Di Nitto and Giordano Tamburelli: The Role of Contracts in Distributed Development, in Proceedings of Software Engineering Advances For Offshore and Outsourced Development, Lecture Notes in Business Information Processing 35, Springer-Verlag, 2009, available here.

[5] Martin Nordio, Roman Mitin and Bertrand Meyer: Advanced Hands-on Training for Distributed and Outsourced Software Engineering, in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering – Volume 1, ACM. 2010 available here.

[6] Martin Nordio, Carlo Ghezzi, Bertrand Meyer, Elisabetta Di Nitto, Giordano Tamburrelli, Julian Tschannen, Nazareno Aguirre and Vidya Kulkarni: Teaching Software Engineering using Globally Distributed Projects: the DOSE course, in Collaborative Teaching of Globally Distributed Software Development – Community Building Workshop (CTGDSD — an ICSE workshop), ACM, 2011, available here.

[7] Martin Nordio, H.-Christian Estler, Bertrand Meyer, Julian Tschannen, Carlo Ghezzi, and Elisabetta Di Nitto: How do Distribution and Time Zones affect Software Development? A Case Study on Communication, in Proceedings of the 6th International Conference on Global Software Engineering (ICGSE), IEEE, pages 176–184, 2011, available here.

[8] H.-Christian Estler, Martin Nordio, Carlo A. Furia, and Bertrand Meyer: Distributed Collaborative Debugging, to appear in Proceedings of 7th International Conference on Global Software Engineering (ICGSE), 2013.

[9] H.-Christian Estler, Martin Nordio, Carlo A. Furia, and Bertrand Meyer: Unifying Configuration Management with Awareness Systems and Merge Conflict Detection, to appear in Proceedings of the 22nd Australasian Software Engineering Conference (ASWEC), 2013.

 

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Reading notes: misclassified bugs

 

(Please note the general disclaimer [1].)

How Misclassification Impacts Bug Prediction [2], an article to be presented on Thursday at ICSE, is the archetype of today’s successful empirical software engineering research, deriving significant results from the mining of publicly available software project repositories — in this case Tomcat5 and three others from Apache, as well as Rhino from Mozilla. The results are in some sense meta-results, because many studies have already mined the bug records of such repositories to draw general lessons about bugs in software development; what Herzig, Just and Zeller now tell us is that the mined data is highly questionable: many problems classified as bugs are not bugs.

The most striking results (announced in a style a bit stentorian to my taste, but indeed striking) are that: every third bug report does not describe a bug, but a request for a new feature, an improvement, better documentation or tests, code cleanup or refactoring; and that out of five program files marked as defective, two do not in fact contain any bug.

These are both false positive results. The repositories signal very few misclassifications the other way: only a small subset of enhancement and improvement requests (around 5%) should have been classified as bugs, and even fewer faulty files are missed (8%, but in fact less than 1% if one excludes an outlier, tomcat5 with 38%, a discrepancy that the paper does not discuss).

The authors have a field day, in the light of this analysis, of questioning the validity of the many studies in recent years — including some, courageously cited, by Zeller himself and coauthors — that start from bug repositories to derive general lessons about bugs and their properties.

The methodology is interesting if a bit scary. The authors (actually, just the two non-tenured authors, probably just a coincidence) analyzed 7401 issue reports manually; more precisely, one of them analyzed all of them and the second one took a second look at the reports that came out from the first step as misclassified, without knowing what the proposed reclassification was, then the results were merged. At 4 minutes per report this truly stakhanovite effort took 90 working days. I sympathize, but I wonder what the rules are in Saarland for experiments involving living beings, particularly graduate students.

Precise criteria were used for the reclassification; for example a report describes a bug, in the authors’ view, if it mentions a null pointer exception (I will skip the opportunity of a pitch for Eiffel’s void safety mechanism), says that the code has to be corrected to fix the semantics, or if there is a “memory issue” or infinite loop. These criteria are reasonable if a bit puzzling (why null pointer exceptions and not other crashes such as arithmetic overflows?); but more worryingly there is no justification for them. I wonder  how much of the huge discrepancy found by the authors — a third or reported bugs are not bugs, and 40% of supposedly defective program files are not defective — can be simply explained by different classification criteria applied by the software projects under examination. The authors give no indication that they interacted with the people in charge of these projects. To me this is the major question hovering over this paper and its spectacular results. If you are in the room and get the chance, don’t hesitate to ask this question on my behalf or yours!

Another obvious question is how much the results depend on the five projects selected. If there ever was room for replicating a study (a practice whose rarity in software engineering we lament, but whose growth prospects are limited by the near-impossibility of convincing selective software engineering venues to publish confirmatory empirical studies), this would be it. In particular it would be good to see some of the results for commercial products.

The article offers an explanation for the phenomena it uncovered: in its view, the reason why so many bug reports end up misclassified is the difference of perspective between users of the software, who complain about the problems they encounter,  and the software professionals  who prepare the actual bug reports. The explanation is plausible but I was surprised not to see any concrete evidence that supports it. It is also surprising that the referees did not ask the authors to provide more solid arguments to buttress that explanation. Yet another opportunity to raise your hand and ask a question.

This (impressive) paper will call everyone’s attention to the critical problem of data quality in empirical studies. It is very professionally prepared, and could, in addition to its specific contributions, serve as a guide on how to get an empirical software engineering paper accepted at ICSE: take a critical look at an important research area; study it from a viewpoint that has not been considered much so far; perform an extensive study, with reasonable methodological assumptions; derive a couple of striking results, making sure they are both visibly stated and backed by the evidence; and include exactly one boxplot.

Notes and references

[1] This article review is part of the “Reading Notes” series. General disclaimer here.

[2] Kim Herzig, Sascha Just and Andreas Zeller: It’s not a Bug, it’s a Feature: How Misclassification Impacts Bug Prediction, in ICSE 2013, available here. According to the ICSE program the paper will be presented on May 23 in the Bug Prediction session, 16 to 17:30.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Reading notes: the design of bug fixes

 

To inaugurate the “Reading Notes” series [1] I will take articles from the forthcoming International Conference on Software Engineering. Since I am not going to ICSE this year I am instead spending a little time browsing through the papers, obligingly available on the conference site. I’ll try whenever possible to describe a paper before it is presented at the conference, to alert readers to interesting sessions. I hope in July and August to be able to do the same for some of the papers to be presented at ESEC/FSE [2].

Please note the general disclaimer [1].

The Design of Bug Fixes [3] caught my attention partly for selfish reasons, since we are working, through the AutoFix project [3], on automatic bug fixing, but also out of sheer interest and because I have seen previous work by some of the authors. There have been article about bug patterns before, but not so much is known with credible empirical evidence about bug fixes (corrections of faults). When a programmer encounters a fault, what strategies does he use to correct it? Does he always produce the best fix he can, and if so, why not? What is the influence of the project phase on such decisions (e.g. will you fix a bug the same way early in the process and close to shipping)? These are some of the questions addressed by the paper.

The most interesting concrete result is a list of properties of bug fixes, classified along two criteria: nature of a fix (the paper calls it “design space”), and reasoning behind the choice of a fix. Here are a few examples of the “nature” classification:

  • Data propagation: the bug arises in a component, fix it in another, for example a library class.
  • More or less accuracy: are we fixing the symptom or the cause?
  • Behavioral alternatives: rather than directly correcting the reported problem, change the user-experienced behavior (evoking the famous quip that “it’s not a bug, it’s a feature”). The authors were surprised to see that developers (belying their geek image) seem to devote a lot of effort trying to understand how users actually use the products, but also found that even so developers do not necessarily gain a solid, objective understanding of these usage patterns. It would be interesting to know if the picture is different for traditional locally-installed products and for cloud-based offerings, since in the latter case it is possible to gather more complete, accurate and timely usage data.

On the “reasoning” side, the issue is why and how programmers decide to adopt a particular approach. For example, bug fixes tend to be more audacious (implying redesign if appropriate) at the beginning of a project, and more conservative as delivery nears and everyone is scared of breaking something. Another object of the study is how deeply developers understand the cause rather than just the symptom; the paper reports that 18% “did not have time to figure out why the bug occurred“. Surprising or not, I don’t know, but scary! Yet another dimension is consistency: there is a tension between providing what might ideally be the best fix and remaining consistent with the design decisions that underlie a software system throughout its architecture.

I was more impressed by the individual categories of the classification than by that classification as a whole; some of the categories appear redundant (“interface breakage“, “data propagation” and “internal vs external“, for example, seem to be pretty much the same; ditto for “cause understanding” and “accuracy“). On the other hand the paper does not explicitly claim that the categories are orthogonal. If they turn this conference presentation into a journal article I am pretty sure they will rework the classification and make it more robust. It does not matter that it is a bit shaky at the moment since the main insights are in the individual kinds of fix and fix-reasoning uncovered by the study.

The authors are from Microsoft Research (one of them was visiting faculty) and interviewed numerous programmers from various Microsoft product groups to find out how they fix bugs.

The paper is nicely written and reads easily. It includes some audacious syntax, as in “this dimension” [internal vs external] “describes how much internal code is changed versus external code is changed as part of a fix“. It has a discreet amount of humor, some of which may escape non-US readers; for example the authors explain that when approaching programmers out of the blue for the survey they tried to reassure them through the words “we are from Microsoft Research, and we are here to help“, a wry reference to the celebrated comment by Ronald Reagan (or his speechwriter) that the most dangerous words in the English language are “I am from the government, and I am here to help“. To my taste the authors include too many details about the data collection process; I would have preferred the space to be used for a more detailed discussion of the findings on bug fixes. On the other hand we all know that papers to selective conferences are written for referees, not readers, and this amount of methodological detail was probably the minimum needed to get past the reviewers (by avoiding the typical criticism, for empirical software engineering research, that the sample is too small, the questions biased etc.). Thankfully, however, there is no pedantic discussion of statistical significance; the authors openly present the results as dependent on the particular population surveyed and on the interview technique. Still, these results seem generalizable in their basic form to a large subset of the industry. I hope their publication will spawn more detailed studies.

According to the ICSE program the paper will be presented on May 23 in the Debugging session, 13:30 to 15:30.

Notes and references

[1] This article review is part of the “Reading Notes” series. General disclaimer here.

[2] European Software Engineering Conference 2013, Saint Petersburg, Russia, 18-24 August, see here.

[3] Emerson Murphy-Hill, Thomas Zimmerman, Christian Bird and Nachiappan Nagapan: The Design of Bug Fixes, in ICSE 2013, available here.

[4] AutoFix project at ETH Zurich, see project page here.

[5] Ronald Reagan speech extract on YouTube.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Adult entertainment

 

I should occasionally present examples of the strange reasons people sometimes invoke for not using Eiffel. In an earlier article [1] I gave the basic idea common to all these reasons, but there are many variants, in the general style “I am responsible for IT policy and purchases for IBM, the US Department of Defense and Nikke, and was about to sign the PO for the triple site license when I noticed that an article about Eiffel was published in 1997. How dare you! I had a tooth removed that year and it hurt a lot. I would really have liked to use Eiffel but you just made it impossible“.

While going through old email I found one of these carefully motivated strategic policy decisions: a missing “L” in a class name. Below is, verbatim [2], a message posted on the EiffelStudio developers list in 2006, and my answer. Also provides an interesting glimpse of what supposedly grown-up people find it worthwhile to spend their days on.

Original message

From: es-devel-bounces@origo.ethz.ch [mailto:es-devel-bounces@origo.ethz.ch] On Behalf Of Peter Gummer
Sent: Tuesday, 29 August, 2006 14:01
To: es-devel@origo.ethz.ch
Subject: [Es-devel] Misspelling as a naming convention
From: es-devel-bounces@origo.ethz.ch [mailto:es-devel-bounces@origo.ethz.ch] On Behalf Of Peter Gummer

Today I submitted a problem report that one of the EiffelVision classes has misspelt “tabbable” as “tabable“. Manu replied that the EiffelVision naming convention is that class or feature names ending in “able” will not double the preceding consonant, regardless of whether this results in wrong spelling.

Looking at the latest Es-changes Digest email, I see various changes implementing this naming convention. For example, the comment for revision 63043 is, “Changed from controllable to controlable to meet naming convention‘.

This is lunacy! “Controlable” (implying the existence of some verb “to controle“) might look quite ok to French eyes, but it looks utterly unprofessional to me. It does have a sort of Chaucerian, Middle English, pre-Gutenberg charm I suppose. Is this part of a plot for a Seconde Invasion Normande of the Langue Anglaise?

We are about to embark on some GUI work. Although we are probably going to use .NET WinForms, EiffelVision was a possible choice. But bad spelling puts me in a bad mood. I’d be very reluctant to work with EiffelVision because of this ridiculous naming convention.

– Peter Gummer

Answer

From: Bertrand Meyer
Sent: Wednesday, 30 August, 2006 00:52
To: Peter Gummer
Cc: es-devel@origo.ethz.ch
Subject: Re: [Es-devel] Misspelling as a naming convention

This has nothing to do with French. If anything, French practices the doubling of consonants before a suffix more than English does; an example (extracted from reports of users’ attitudes towards EiffelVision) is English “passionate“, French “passionné“. For the record, there’s no particular French dominance in the Eiffel development team, either at Eiffel Software or elsewhere. The recent discussion on EiffelVision’s “-able” class names involved one native English speaker out of three people, invalidating at the 33% level Kristen Nygaard’s observation that the language of science is English as spoken by foreigners.

The problem in English is that the rules defining which consonants should be doubled before a suffix such as “able” are not obvious. See for example this page from the University of Ottawa:

Double the final consonant before a suffix beginning with a vowel if both of the following are true: the consonant ends a stressed syllable or a one-syllable word, and the consonant is preceded by a single vowel.

Now close your eyes and repeat this from memory. I am sure that won’t be hard because you knew the rule all along, but can we expect this from all programmers using EiffelVision?

Another Web page , from a school in Oxfordshire, England, says:

Rule: Double the last consonant when adding a vowel suffix to a single syllable word ending in one vowel and one consonant.

Note that this is not quite the same rule; it doesn’t cover multi-syllable words with the stress (tonic accent) on the last syllable; and it would suggest “GROUPPABLE” (“group” is a one-syllable word ending in one vowel and one consonant), whereas the first rule correctly prescribes “GROUPABLE“. But apparently this is what is being taught to Oxfordshire pupils, whom we should stand ready to welcome as Eiffel programmers in a few years.

Both rules yield “TRANSFERABLE” because the stress is on the first syllable of “transfer“. But various dictionaries we have consulted also list “TRANSFERRABLE” and “TRANSFERRIBLE“.

As another example consider “FORMATING“. Both rules suggest a single “t“. The Solaris spell checker indeed rejects the form with two “t“s and accepts the version with one; but — a case of Unix-Windows incompatibility that seems so far to have escaped the attention of textbook authors — Microsoft Word does the reverse! In fact in default mode if you type “FORMATING” it silently corrects it to “FORMATTING“. It’s interesting in this example to note a change of tonic accent between the original and derived words: “fórmat” (both noun and verb) but “formáting“. Maybe the Word convention follows the “Ottawa” rule but by considering the stress in the derivation rather than the root? There might be a master’s thesis topic in this somewhere.

Both rules imply “MIXXABLE” and “FIXXABLE“, but we haven’t found a dictionary that accepts either of these forms.

Such rules cannot cover all cases anyway (they are “UNFATHOMMABLE“) because “consonant” vs “vowel” is a lexical distinction that doesn’t reflect the subtleties of English pronunciation. For example either rule would lead to “DRAWWABLE” because the word “draw” ends with “w“, a letter that you’ll find everywhere characterized as a consonant. Lexically it is a consonant, but phonetically it is sometimes a consonant and sometimes not, in particular at the end of a word. In “Wow“, the first “w” is a consonant, but not the second one. A valid rule would need to take into account not only spelling but also pronunciation. This is probably the reason behind the examples involving words ending in “x“: phonetically “X” can be considered two consonants, “KS“. But then the rule becomes more tricky, forcing the inquirer, who is understandably getting quite “PERPLEXXED” at this stage, to combine lexical and phonetic reasoning in appropriate doses.

No wonder then a page from the Oxford Dictionaries site states:

One of the most common types of spelling error is a mistake over whether a word is spelled with a double or a single consonant.

and goes on to list many examples.

You can find a list of of words ending in “ablehere . Here are a few cases involving derivations from a word ending with “p“:

Single consonant
DEVELOPABLE
GRASPABLE
GROUPABLE
HELPABLE
KEEPABLE
REAPABLE
RECOUPABLE

Doubled consonant
DIPPABLE
DROPPABLE (but: DRAPABLE)
FLOPPABLE
MAPPABLE
RECAPPABLE (but: CAPABLE)
RIPPABLE (but: ROPABLE)
SHIPPABLE
SKIPPABLE
STOPPABLE
STRIPPABLE
TIPPABLE

There are also differences between British and American usage.

True, the “Ottawa” rule could be amended to take into account words ending in “w“, “x“, “h” and a few other letters, and come reasonably close to matching dictionary practice. But should programmers have to remember all this? Will they?

Since we are dealing in part with artificial words, there is also some doubt as to what constitutes a “misspelt” word as you call it (or a “misspelled” one as Eiffel conventions — based on American English — would have it). Applying the rule yields “MAPPABLE“, which is indeed found in dictionaries. But in the world of graphics we have the term “bitmap“, where the stress is on the first syllable. The rule then yields “BITMAPABLE“. That’s suspicious but “GOOGLABLE“; a search produces 31 “BITMAPPABLE” and two “BITMAPABLE“, one of which qualified by “(Is that a word?)”. So either EiffelVision uses something that looks inconsistent (“BITMAPABLE” vs “MAPPABLE“) but follows the rule; or we decide for consistency.

In our view this case can be generalized. The best convention is the one that doesn’t require programmers to remember delicate and sometimes fuzzy rules of English spelling, but standardizes on a naming convention that will be as easy to remember as possible. To achieve this goal the key is consistency. A simple rule for EiffelVision classes is:

  • For an “-able” name derived from a word ending with “e“, drop the “e“: REUSABLE. There seems to be no case of words ending with another vowel.
  • If the name is derived from a word ending with a consonant, just add “able“: CONTROLABLE, TOOLTIPABLE, GROUPABLE.

Some of these might look strange the first couple of times but from then on you will remember the convention.

While we are flattered that EiffelVision should be treated as literature, we must admit that there are better recommendations for beach reading, and that Eiffel is not English (or French).

The above rule is just a convention and someone may have a better suggestion.

With best regards,

— Bertrand Meyer, Ian King, Emmanuel Stapf

Reference and note

[1] Habit, happiness and programming languages, article in this blog, 22 October 2012, see here.

[2] I checked the URLs, found that two pages have disappeared since 2006, and replaced them with others having the same content. The formatting (fonts, some of the indentation) is added. Peter Gummer asked me to make sure that his name always appears with two “m“.

VN:F [1.9.10_1130]
Rating: 10.0/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Apocalypse no! (Part 2)

 

Recycled(Revised from an article originally published in the CACM blog. Part 2 of a two-part article.)

Part 1 of this article (to be found here, please read it first) made fun of authors who claim that software engineering is a total failure — and, like everyone else, benefit from powerful software at every step of their daily lives.

Catastrophism in talking about software has a long history. Software engineering started around 1966 [1] with the recognition of a “software crisis“. For many years it was customary to start any article on any software topic by a lament about the terrible situation of the field, leaving in your reader’s mind the implicit suggestion that the solution to the “crisis” lay in your article’s little language or tool.

After the field had matured, this lugubrious style went out of fashion. In fact, it is hard to sustain: in a world where every device we use, every move we make and every service we receive is powered by software, it seems a bit silly to claim that in software development everyone is wrong and everything is broken.

The apocalyptic mode has, however, made a comeback of late in the agile literature, which is fond in particular of citing the so-called “Chaos” reports. These reports, emanating from Standish, a consulting firm, purport to show that a large percentage of projects either do not produce anything or do not meet their objectives. It was fashionable to cite Standish (I even included a citation in a 2003 article), until the methodology and the results were thoroughly debunked starting in 2006 [2, 3, 4]. The Chaos findings are not replicated by other studies, and the data is not available to the public. Capers Jones, for one, publishes his sources and has much more credible results.

Yet the Chaos results continue to be reverently cited as justification for agile processes, including, at length, in the most recent book by the creators of Scrum [5].

Not long ago, I raised the issue with a well-known software engineering author who was using the Standish findings in a talk. Wasn’t he aware of the shakiness of these results? His answer was that we don’t have anything better. It did not sound like the kind of justification we should use in a mature discipline. Either the results are sound, or we should not rely on them.

Software engineering is hard enough and faces enough obstacles, so obvious to everyone in the industry and to every user of software products, that we do not need to conjure up imaginary scandals and paint a picture of general desolation and universal (except for us, that is) incompetence. Take Schwaber and Sutherland, in their introductory chapter:

You have been ill served by the software industry for 40 years—not purposely, but inextricably. We want to restore the partnership.

No less!

Pretending that the whole field is a disaster and no one else as a clue may be a good way to attract attention (for a while), but it is infantile as well as dishonest. Such gross exaggerations discredit their authors, and beyond them, the ideas they promote, good ones included.

As software engineers, we can in fact feel some pride when we look at the world around us and see how much our profession has contributed to it. Not just the profession as a whole, but, crucially, software engineering research: advances in programming methodology, software architecture, programming languages, specification techniques, software tools, user interfaces, formal modeling of software concepts, process management, empirical analysis and human aspects of computing have, step after step, belied the dismal picture that may have been painfully accurate in the early days.

Yes, challenges and unsolved problems face us at every corner of software engineering. Yes, we are still at the beginning, and on many topics we do not even know how to proceed. Yes, there are lots of things to criticize in current practices (and I am not the least vocal of the critics, including in this blog). But we need a sense of measure. Software engineering has made tremendous progress over the last five decades; neither the magnitude of the remaining problems nor the urge to sell one’s products and services justifies slandering the rest of the discipline.

Notes and references

[1] Not in 1968 with the NATO conference, as everyone seems to believe. See an earlier article in this blog.

[2] Robert L. Glass: The Standish report: does it really describe a software crisis?, in Communications of the ACM, vol. 49, no. 8, pages 15-16, August 2006; see here.

[3] J. Laurens Eveleens and Chris Verhoef: The Rise and Fall of the Chaos Report Figures, in IEEE Software, vol. 27, no. 1, Jan-Feb 2010, pages 30-36; see here.

[4] S. Aidane, The “Chaos Report” Myth Busters, 26 March 2010, see here.

[5] Ken Schwaber and Jeff Sutherland: Software in 30 Days: How Agile Managers Beat the Odds, Delight Their Customers, And Leave Competitors In the Dust, Wiley, 2012.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 6 votes)

Specify less to prove more

Software verification is progressing slowly but surely. Much of that progress is incremental: making the fundamental results applicable to real programs as they are built every day by programmers working in standard circumstances. A key condition is to minimize the amount of annotations that they have to provide.

The article mentioned in my previous post, “Program Checking With Less Hassle” [1], to be presented at VSTTE in San Francisco on Friday by its lead author, Julian Tschannen, introduces several interesting contributions in this direction. One of the surprising conclusions is that sometimes it pays to specify less. That goes against intuition: usually, the more specification information (correctness annotations) you provide the more you help the prover. But in fact partial specifications can hurt rather than help. Consider for example a swap routine with a partial specification, which actually stands in the way of a proof. If modularity is not a concern, for example if the routine is part of the code being verified rather than of a library, it may be more effective to ignore the specification and use the routine’s implementation. This is particularly appropriate for smallhelper routines such as the swap example.

This inlining technique is applicable in other cases, for example to make up for a missing precondition: assume that a helper routine will only work for x > 0 but does not state that precondition, or maybe states only the weaker one x ≥ 0 ; in the code, however, it is only called with positive arguments. If we try to verify the code modularly we will fail, as indeed we should since the routine is incorrect as a general-purpose primitive. But within the context of the code there is nothing wrong with it. Forgetting the contract of the routine if any, and instead using its actual implementation, we may be able to show that everything is fine.

Another component of the approach is to fill in preconditions that programmers have omitted because they are somehow obvious to them. For example it is tempting and common to write just a [1] > 0 rather than a /= Void and then a [1] > 0 for a detachable array a. The tool takes care of  interpreting the simpler precondition as the more complete one.

The resulting “two-step verification”, integrated into the AutoProof verification tool for Eiffel, should turn out to be an important simplification towards the goal of “Verification As a Matter Of Course” [2].

References

[1] Julian Tschannen, Carlo A. Furia, Martin Nordio and Bertrand Meyer: Program Checking With Less Hassle, in VSTTE 2013, Springer LNCS, to appear, draft available here; presentation on May 17 in the 15:30-16:30  session.

[2] Verification As a Matter Of Course, article in this blog, 29 March 2010, see here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Presentations at ICSE and VSTTE

 

The following presentations from our ETH group in the ICSE week (International Conference on Software Engineering, San Francisco) address important issues of software specification and verification, describing new techniques that we have recently developed as part of our work building EVE, the Eiffel Verification Environment. One is at ICSE proper and the other at VSTTE (Verified Software: Tools, Theories, Experiments). If you are around please attend them.

Julian Tschannen will present Program Checking With Less Hassle, written with Carlo A. Furia, Martin Nordio and me, at VSTTE on May 17 in the 15:30-16:30 session (see here in the VSTTE program. The draft is available here. I will write a blog article about this work in the coming days.

Nadia Polikarpova will present What Good Are Strong Specifications?, written with , Carlo A. Furia, Yu Pei, Yi Wei and me at ICSE on May 22 in the 13:30-15:30 session (see here in the ICSE program). The draft is available here. I wrote about this paper in an earlier post: see here. It describes the systematic application of theory-based modeling to the full specification and verification of advanced software.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

What is wrong with CMMI

 

The CMMI model of process planning and assessment has been very successful in some industry circles, essentially as a way for software suppliers to establish credibility. It is far, however, from having achieved the influence it deserves. It is, for example, not widely taught in universities, which in turn limits its attractiveness to industry. The most tempting explanation involves the substance of CMMI: that it prescribes processes that are too heavy. But in fact the basic ideas are elegant, they are not so complicated, and they have been shown to be compatible with flexible approaches to development, such as agile methods.

I think there is a simpler reason, of form rather than substance: the CMMI defining documents are atrociously written.  Had they benefited from well-known techniques of effective technical writing, the approach would have been adopted much more widely. The obstacles to comprehension discourage many people and companies which could benefit from CMMI.

Defining the concepts

One of the first qualities you expect from a technical text is that it defines the basic notions. Take one of the important concepts of CMMI, “process area”. Not just important, but fundamental; you cannot understand anything about CMMI if you do not understand what a process area is. The glossary of the basic document ([1], page 449) defines it as

A cluster of related practices in an area that, when implemented collectively, satisfies a set of goals considered important for making improvement in that area.

The mangled syntax is not a good omen: contrary to what the sentence states, it is not the area that should be “implemented collectively”, but the practices. Let us ignore it and try to understand the intended definition. A process area is a collection of practices? A bit strange, but could make sense, provided the notion of “practice” is itself well defined. Before we look at that, we note that these are practices “in an area”. An area of what? Presumably, a process area, since no other kind of area is ever mentioned, and CMMI is about processes. But then a process area is… a collection of practices in a process area? Completely circular! (Not recursive: a meaningful recursive definition is one that defines simple cases directly and builds complex cases from them. A circular definition defines nothing.)

All that this is apparently saying is that if we already know what a process area is, CMMI adds the concept that a process area is characterized by a set of associated practices. This is actually a useful idea, but it does not give us a definition.

Let’s try to see if the definition of “practice” helps. The term itself does not have an entry in the glossary, a bit regrettable but not too worrying given that in CMMI there are two relevant kinds of practices: specific and generic. “Specific practice” is defined (page 461) as

An expected model component that is considered important in achieving the associated specific goal. (See also “process area” and “specific goal.”)

This statement introduces the important observation that in CMMI a practice is always related to a “goal” (another one of the key CMMI concepts); it is one of the ways to achieve that goal. But this is not a definition of “practice”! As to the phrase “an expected model component”, it simply tells us that practices, along with goals, are among the components of CMMI (“the model”), but that is a side remark, not a definition: we cannot define “practice” as meaning “model component”.

What is happening here is that the glossary does not give a definition at all; it simply relies on the ordinary English meaning of “practice”. Realizing this also helps us understand the definition of “process area”: it too is not a definition, but assumes that the reader already understands the words “process” and “area” from their ordinary meanings. It simply tells us that in CMMI a process area has a set of associated practices. But that is not what a glossary is for: the reader expects it to give precise definitions of the technical terms used in the document.

This misuse of the glossary is typical of what makes CMMI documents so ineffective. In technical discourse it is common to hijack words from ordinary language and give them a special meaning: the mathematical use of such words as “matrix” or “edge” (of a graph) denotes well-defined objects. But you have to explain such technical terms precisely, and be clear at each step whether you are using the plain-language meaning or the technical meaning. If you mix them up, you completely confuse the reader.

In fact any decent glossary should make the distinction explicit by underlining, in definitions, terms that have their own entries (as in: a cluster or related practices, assuming there is an entry for “practice”); then it is clear to the reader whether a word is used in its ordinary or technical sense. In an electronic version the underlined words can be links to the corresponding entries. It is hard to understand why the CMMI documents do not use this widely accepted convention.

Towards suitable definitions

Let us try our hand at what suitable definitions could have been for the two concepts just described; not a vacuous exercise since process area and practice are among the five or six core concepts of CMMI. (Candidates for the others are process, goal, institutionalization and assessment).

“Practice” is the more elementary concept. In fact it retains its essential meaning from ordinary language, but in the CMMI context a practice is related to a process and, as noted, is intended to satisfy a goal. What distinguishes a practice from a mere activity is that the practice is codified and repeatable. If a project occasionally decides to conduct a  design review that is not a practice; a systematically observed daily Scrum meeting is a practice. Here is my take on defining “practice” in CMMI:

Practice: A process-related activity, repeatable as part of a plan, that helps achieve a stated goal.

CMMI has both generic practices, applicable to the process as a whole, and specific practices, applicable to a particular process area. From this definition we can easily derive definitions for both variants.

Now for “process area”. In discussing this concept above, there is one point I did not mention: the reason the CMMI documents can get away with the bizarre definition (or rather non-definition) cited is that if you ask what a process area really is you will immediately be given examples: configuration management, project planning, risk management, supplier agreement management… Then  you get it. But examples are not a substitute for a definition, at least in a standard that is supposed to be precise and complete. Here is my attempt:

Process area: An important aspect of the process, characterized by a clearly identified set of issues and activities, and in CMMI by a set of applicable practices.

The definition of “specific practice” follows simply: a practice that is associated with a particular process area. Similarly, a “generic practice” is a practice not associated with any process area.

I’ll let you be the judge: which definitions do you prefer, these or the ones in the CMMI documents?

By the way, I can hear the cynical explanation: that the jargon and obscurity are intentional, the goal being to justify the need for experts that will interpret the sacred texts. Similar observations have been made to explain the complexity of certain programming languages. Maybe. But bad writing is common enough that we do not need to invoke a conspiracy in this case.

Training for the world championship of muddy writing

The absence of clear definitions of basic concepts is inexcusable. But the entire documents are written in government-committee-speak that erect barriers against comprehension. As an example among hundreds, take the following extract, the entire description of the generic practice GP2.2, “Establish and maintain the plan for performing the organizational training process“” , part of the Software CMM (a 729-page document!), [2], page 360:

This plan for performing the organizational training process differs from the tactical plan for organizational training described in a specific practice in this process area. The plan called for in this generic practice would address the comprehensive planning for all of the specific practices in this process area, from the establishment of strategic training needs all the way through to the assessment of the effectiveness of the organizational training effort. In contrast, the organizational training tactical plan called for in the specific practice would address the periodic planning for the delivery of individual training offerings.

Even to a good-willed reader the second and third sentences sound like gibberish. One can vaguely intuit that the practice just introduced is being distinguished from another, but which one, and how? Why the conditional phrases, “would address”? A plan either does or does not address its goals. And if it addresses them, what does it mean that a plan addresses a planning? Such tortured tautologies, in a high-school essay, would result in a firm request to clean up and resubmit.

In fact the text is trying to say something simple, which should have been expressed like this:

This practice is distinct from practice SP1.3, “Establish an Organizational Training Tactical Plan” (page 353). The present practice is strategic: it is covers planning the overall training process. SP1.3 is tactical: it covers the periodic planning of individual training activities.

(In the second sentence we could retain “from defining training needs all the way to assessing the effectiveness of training”, simplified of course from the corresponding phrase in the original, although I do not think it adds much.)

Again, which version do you prefer?

The first step in producing something decent was not even a matter of style but simple courtesy to the reader. The text compares a practice to another, but it is hard to find the target of the comparison: it is identified as the “tactical plan for organizational training” but that phrase does not appear anywhere else in the document!  You have to guess that it is listed elsewhere as the “organizational training tactical plan”, search for that string, and cycle through its 14 occurrences to see which one is the definition.  (To make things worse, the phrase “training tactical plan” also appears in the document — not very smart for what is supposed to be a precisely written standard.)

The right thing to do is to use the precise name, here SP1.3 — what good is it to introduce such code names throughout a document if it does not use them for reference? — and for good measure list the page number, since this is so easy to do with text processing tools.

In the CMMI chapter of my book Touch of Class (yes, an introductory programming textbook does contain an introduction to CMMI) I cited another extract from [2] (page 326):

The plan for performing the organizational process focus process, which is often called “the process-improvement plan,” differs from the process action plans described in specific practices in this process area. The plan called for in this generic practice addresses the comprehensive planning for all of the specific practices in this process area, from the establishment of organizational process needs all the way through to the incorporation of process-related experiences into the organizational process assets.

In this case the translation into text understandable by common mortals is left as an exercise for the reader.

With such uncanny ability to say in fifty words what would better be expressed in ten, it is not surprising that basic documents run into 729 pages and that understanding of CMMI by companies who feel compelled  to adopt it requires an entire industry of commentators of the sacred word.

Well-defined concepts have simple names

The very name of the approach, “Capability Maturity Model Integration”, is already a sign of WMD (Word-Muddying Disease) at the terminal stage. I am not sure if anyone anywhere knows how to parse it correctly: is it the integration of a model of maturity of capability (right-associative interpretation)? Of several models? These and other variants do not make much sense, if only because in CMMI “capability” and “maturity” are alternatives, used respectively for the Continuous and Staged versions.

In fact the only word that seems really useful is “model”; the piling up of tetrasyllabic words with very broad meanings does not help suggest what the whole thing is about. “Integration”, for example, only makes sense if you are aware of the history of CMMI, which generalized the single CMM model to a family of models, but this history is hardly interesting to a newcomer. A name, especially a long one, should carry the basic notion.

A much better name would have been “Catalog of Assessable Process Practices”, which is even pronounceable as an acronym, and defines the key elements: the approach is based on recognized best practices; these practices apply to processes (of an organization); they must be subject to assessment (the most visible part of CMMI — the famous five levels — although not necessarily the most important one); and they are collected into a catalog. If “catalog” is felt too lowly, “collection” would also do.

Catalog of Assessable Process Practices: granted, it sounds less impressive than the accumulation of pretentious words making up the actual acronym. As is often the case in software engineering methods and tools, once you remove the hype you may be disappointed at first (“So that’s all that it was after all!”), and then you realize that the idea, although brought back down to more modest size, remains a good idea, and can be put to effective use.

At least if you explain it in English.

References

[1] CMMI Product Team: CMMI for Development, Version 1.3, Improving processes for developing better products and services, Technical Report CMU/SEI-2010-TR-033, Software Engineering Institute, Carnegie Mellon University, November 2010, available here.

[2] CMMI Product Team, ; CMMI for Systems Engineering/Software Engineering/Integrated Product and Process Development/Supplier Sourcing, Version 1.1, Staged Representation (CMMI-SE/SW/IPPD/SS, V1.1, Staged) (CMU/SEI-2002-TR-012). Software Engineering Institute, Carnegie Mellon University, 2002, available here.

VN:F [1.9.10_1130]
Rating: 9.6/10 (14 votes cast)
VN:F [1.9.10_1130]
Rating: +12 (from 12 votes)

Ershov lecture

 

On April 11 I gave the “Ershov lecture” in Novosibirsk. I talked about concurrency; a video recording is available here.

The lecture is given annually in memory of Andrey P. Ershov, one of the founding fathers of Russian computer science and originator of many important concepts such as partial evaluation. According to Wikipedia, Knuth considers Ershov to be the inventor of hashing. I was fortunate to make Ershov’s acquaintance in the late seventies and to meet him regularly afterwards. He invited me to his institute in Novosibirsk for a two-month stay where I learned a lot. He had a warm, caring personality, and set many young computer scientists in their tracks. His premature death in 1988 was a shock to all and his memory continues to be revered; it was gratifying to be able to give the lecture named in his honor.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Bringing C code to the modern world

The C2Eif translator developed by Marco Trudel takes C code and translates it into Eiffel; it produces not just a literal translation but a re-engineering version exhibiting object-oriented properties. Trudel defended his PhD thesis last Friday at ETH (the examiners were Hausi Muller from Victoria University, Manuel Oriol from ABB, Richard Paige from the University of York,  and me as the advisor). The thesis is not yet available online but earlier papers describing C2Eif are, all reachable from the project’s home page [1].

At issue is what we do with legacy code. “J’ai plus de souvenirs que si j’avais mille ans”, wrote Charles Baudelaire in Les Fleurs du Mal (“Spleen de Paris”). The software industry is not a thousand years old, but has accumulated even more “souvenirs” than

A heavy chest of drawers cluttered with balance-sheets,
Poems, love letters, lawsuits, romances
And heavy locks of hair wrapped in invoices
.

We are suffocating under layers of legacy code heaped up by previous generations of programmers using languages that no longer meet our scientific and engineering standards. We cannot get rid of this heritage; how do we bring it to the modern world? We need automatic tools to wrap it in contemporary code, or, better, translate it into contemporary code. The thesis and the system offer a way out through translation to a modern object-oriented language. It took courage to choose such a topic, since there have been many attempts in the past, leading to conventional wisdom consisting of two strongly established opinions:

  • Plain translation: it has been tried, and it works. Not interesting for a thesis.
  • Object-oriented reengineering: it has been tried, and it does not work. Not realistic for a thesis.

Both are wrong. For translation, many of the proposed solutions “almost work”: they are good enough to translate simple programs, or even some large programs but on the condition that the code avoids murky areas of C programming such as signals, exceptions (setjmp/longjmp) and library mechanisms. In practice, however, most useful C programs need these facilities, so any tool that ignores them is bound to be of conceptual value only. The basis for Trudel’s work has been to tackle C to OO translation “beyond the easy stuff” (as stated in the title of one of the published papers). This effort has been largely successful, as demonstrated by the translation of close to a million lines of actual C code, including some well-known and representative tools such as the Vim editor.

As to OO reengineering, C2Eif makes a serious effort to derive code that exhibits a true object-oriented design and hence resembles, in its structure at least, what a programmer in the target language might produce. The key is to identify the right data abstractions, yielding classes, and specialization properties, yielding inheritance. In this area too, many people have tried to come up with solutions, with little success. Trudel has had the good sense of avoiding grandiose goals and sticking to a number of heuristics that work, such as looking at the signatures of a set of functions to see if they all involve a common argument type. Clearly there is more to be done in this direction but the result is already significant.

Since Eiffel has a sophisticated C interface it is also possible to wrap existing code; some tools are available for that purpose, such as Andreas Leitner’s EWG (Eiffel Wrapper Generator). Wrapping and translating each have their advantages and limitations; wrapping may be more appropriate for C libraries that someone else is still actively updating  (so that you do not have to redo a translation with every new release), and translation for legacy code that you want to take over and bring up to par with the rest of your software. C2Eif is engineered to support both. More generally, this is a practitioner’s tool, devoting considerable attention to the many details that make all the difference between a nice idea and a tool that really works. The emphasis is on full automation, although more parametrization has been added in recent months.

C2Eif will make a big mark on the Eiffel developer community. Try it yourself — and don’t be shy about telling its author about the future directions in which you think the tool should evolve.

Reference

[1] C2Eif project page, here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (13 votes cast)
VN:F [1.9.10_1130]
Rating: +8 (from 8 votes)

The origin of “software engineering”

 

(October 2018: see here for a follow-up.)

RecycledEveryone and her sister continues to repeat the canard that the term “software engineering” was coined on the occasion of the eponymous 1968 NATO conference. A mistake repeated in every software engineering textbook remains a mistake. Below is a note I published twenty years ago on the topic in a newsgroup discussion. I found it in an archive here, where you can read the longer exchange of which it was part.

All textbooks on software engineering that I know, and many articles in the field, claim (that is to say, repeat someone else’s claim) that the term “software engineering” itself was coined on the occasion of the Fall 1968 Garmisch-Partenkirchen conference on S.E., organized by the NATO Science Affairs committee. (See [1] for the proceedings, published several years later.)

The very term, it is said, was a challenge to the software community to get its act together and start rationalizing the software production process.

This common wisdom may need to be revised. The August, 1966 issue of Communications of the ACM (Volume 9, number 8) contains an interesting “letter to the ACM membership” by Anthony A. Oettinger, then ACM President. I must confess I don’t know much about the author; he is identified (in the announcement of his election in the June 1966 issue) as Professor of Applied Mathematics and Linguistics, Harvard University, and from his picture looks like a nice fellow. The sentence of interest appears on page 546 at the end of a long paragraph, which I have reproduced below in its entirety because by looking at the full context it appears clearly that Professor Oettinger did not just use two words together by accident, as it were, but knew exactly what he was talking about. Here is the paragraph (italics in original):

“A concern with the science of computing and information processing, while undeniably of the utmost importance and an historic root of our organization [i.e. the ACM – BM] is, alone, too exclusive. While much of what we do is or has its root in not only computer and information science, but also many older and better defined sciences, even more is not at all scientific but of a professional and engineering nature. We must recognize ourselves – not necessarily all of us and not necessarily any one of us all the time – as members of an engineering profession, be it hardware engineering or software engineering, a profession without artificial and irrelevant boundaries like that between ‘scientific’ and ‘business’ applications.”

(The last point would still be worth making today. The end of the second sentence would seem to indicate that the writer viewed engineering as being remote from science, but this would not be accurate; in the paragraph following the one reproduced above, Prof. Oettinger discussed in more detail his view of the close relation between science and engineering.)

The above quotation is clear evidence that the term “software engineering” was used significantly earlier than commonly thought – more than two years before the Garmisch conference.

What I don’t know is whether Prof. Oettinger created the term, or whether it had been in use before. In the latter case, does anyone know of an older reference? Is Prof. Oettinger still around to enlighten us? (For all I know he could be reading this!)

Switching now our theme from the past to the future: does anyone have an idea of when some actual semantics might become attached to the expression “software engineering”? Is 2025 too optimistic?

Reference

[1] J. M. Buxton, P. Naur, B. Randell: Software Engineering Concepts and Techniques (Proceeedings of 1968 NATO Conference on Software Engineering), Van Nostrand Reinhold, 1976.

The last sentence’s sarcasm is, by the way, no longer warranted today.

VN:F [1.9.10_1130]
Rating: 8.2/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 2 votes)

LASER summer school: Software for the Cloud and Big Data

The 2013 LASER summer school, organized by our chair at ETH, will take place September 8-14, once more in the idyllic setting of the Hotel del Golfo in Procchio, on the island of Elba in Italy. This is already the 10th conference; the roster of speakers so far reads like a who’s who of software engineering.

The theme this year is Software for the Cloud and Big Data and the speakers are Roger Barga from Microsoft, Karin Breitman from EMC,  Sebastian Burckhardt  from Microsoft,  Adrian Cockcroft from Netflix,  Carlo Ghezzi from Politecnico di Milano,  Anthony Joseph from Berkeley,  Pere Mato Vila from CERN and I.

LASER always has a strong practical bent, but this year it is particularly pronounced as you can see from the list of speakers and their affiliations. The topic is particularly timely: exploring the software aspects of game-changing developments currently redefining the IT scene.

The LASER formula is by now well-tuned: lectures over seven days (Sunday to Saturday), about five hours in the morning and three in the early evening, by world-class speakers; free time in the afternoon to enjoy the magnificent surroundings; 5-star accommodation and food in the best hotel of Elba, made affordable as we come towards the end of the season (and are valued long-term customers). The group picture below is from last year’s school.

Participants are from both industry and academia and have ample opportunities for interaction with the speakers, who typically attend each others’ lectures and engage in in-depth discussions. There is also time for some participant presentations; a free afternoon to discover Elba and brush up on your Napoleonic knowledge; and a boat trip on the final day.

Information about the 2013 school can be found here.

LASER 2012, Procchio, Hotel del Golvo

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)