Archive for the ‘Software verification’ Category.

AutoProof workshop: Verification As a Matter of Course

The AutoProof technology pursues the goal of “Verification As a Matter Of Course”, integrated into the EVE development environment. (The AutoProof  project page here; see particularly the online interactive tutorial.) A one-day workshop devoted to the existing AutoProof and current development will take place on October 1 near Toulouse in France. It is an informal event (no proceedings planned at this point, although based on the submissions we might decide to produce a volume), on a small scale, designed to bring together people interested in making the idea of practical verification a reality.

The keynote will be given by Rustan Leino from Microsoft Research, the principal author of the Boogie framework on which the current implementation of AutoProof relies.

For submissions (or to attend without submitting) see the workshop page here. You are also welcome to contact me for more information.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Design by Contract: ACM Webinar this Thursday

A third ACM webinar this year (after two on agile methods): I will be providing a general introduction to Design by Contract. The date is this coming Thursday, September 17, and the time is noon New York (18 Paris/Zurich, 17 London, 9 Los Angeles, see here for hours elsewhere). Please tune in! The event is free but requires registration here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

New paper: Theory of Programs

Programming, wrote Dijkstra many years ago, is a branch of applied mathematics. That is only half of the picture: the other half is engineering, and this dual nature of programming is part of its attraction.

Descriptions of the mathematical side are generally, in my view, too complicated. This article [1] presents a mathematical theory of programs and programming based on concepts taught in high school: elementary set theory. The concepts covered include:

  • Programming.
  • Specification.
  • Refinement.
  • Non-determinism.
  • Feasibility.
  • Correctness.
  • Programming languages.
  • Kinds of programs: imperative, functional, object-oriented.
  • Concurrency (small-step and large-step)
  • Control structures (compound, if-then-else and Dijkstra-style conditional, loop).
  • State, store and environment.
  • Invariants.
  • Notational conventions for building specifications and programs incrementally.
  • Loop invariants and variants.

One of the principal ideas is that a program is simply the description of a mathematical relation. The program text is a rendering of that relation. As a consequence, one may construct programming languages simply as notations to express certain kinds of mathematics. This approach is the reverse of the usual one, where the program text and its programming languages are the starting point and the center of attention: theoreticians develop techniques to relate them to mathematical concepts. It is more effective to start from the mathematics (“unparsing” rather than parsing).

All the results (74 properties expressed formally, a number of others in the text) are derived as theorems from rules of elementary set theory; there are no new axioms whatsoever.

The paper also has a short version [2], omitting proofs and many details.


[1] Theory of Programs, available here.
[2] Theory of Programs, short version of [1] (meant for quick understanding of the ideas, not for publication), available here.


VN:F [1.9.10_1130]
Rating: 6.4/10 (18 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 9 votes)

Attached by default?


Opinions requested! See at end.

A void call, during the execution of an object-oriented program, is a call of the standard OO form

x·some_routine (…)                                                /CALL/

where x, a reference, happens to be void (null) instead of denoting, as expected, an object. The operation is not possible; it leads to an exception and, usually, a crash of the program. Void calls are also called “null pointer dereferencing”.

One of the major advances in Eiffel over the past years has been the introduction of attached types, entirely removing the risk of void calls. The language mechanisms, extending the type system, make void-call avoidance a static property, part of type checking: just as the compiler will prevent you from assigning a boolean value to an integer variable, so will it flag your program if it sees a risk of void call. Put the other way around, if your program passes compilation, you have the guarantee that its executions will never produce a void call. Attached types thus remove one of the major headaches of programming, what Tony Hoare [1] called his “one-billion-dollar mistake”:

I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W) [2]. My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty year

Thanks to attached types, Eiffel programmers can sleep at night: their programs will not encounter void calls.

To benefit from this advance, you must declare variables accordingly, as either attached (never void after initialization) or detachable (possibly void). You must also write the program properly:

  • If you declare x attached, you must ensure in the rest of the program that before its first use x will have been attached to an object, for example through a creation instruction create x.
  • If you declare x detachable, you must make sure that any call of the above form /CALL/ happens in a context where x is guaranteed to be non-void; for example, you could protect it by a test if x /= Void then or, better, an “object test”.

Code satisfying these properties is called void-safe.

Void safety is the way to go: who wants to worry about programs, even after they have been thoroughly tested and have seemingly worked for a while, crashing at unpredictable times? The absence of null-pointer-dereferencing can be a statically  enforced property, as the experience of Eiffel now demonstrates; and that what it should be. One day, children will think void-safely from the most tender age, and their great-grandparents will tell them, around the fireplace during long and scary winter nights, about the old days when not everyone was programming in Eiffel and even those who did were worried about the sudden null-pointer-derefencing syndrome. To get void safety through ordinary x: PERSON declarations, you had (children, hold your breath) to turn on a compiler option!

The transition to void safety was neither fast nor easy; in fact, it has taken almost ten years. Not everyone was convinced from the beginning, and we have had to improve and simplify the mechanism along the way to make void-safe programming practical. Compatibility has been a key issue throughout: older classes are generally not void-safe, but in a language that has been around for many years and has a large code base of operational software it is essential to ensure a smooth transition. Void safety has, from its introduction, been controlled by a compiler option:

  • With the option off, old code will compile as it used to do, but you do not get any guarantee of void safety. At execution time, a void call can still cause your program to go berserk.
  • With the option on, you get the guarantee: no void calls. To achieve this goal, you have to make sure the classes obey the void safety rules; if they do not, the compiler will reject them until you fix the problem.

In the effort to reconcile the compatibility imperative with the inexorable evolution to void safety, the key decisions have affected default values for compiler options and language conventions. Three separate decisions, in fact. Two of the defaults have already been switched; the question asked at the end of this article addresses the switching of the last remaining one.

The first default governed the void-safety compiler option. On its introduction, void-safety was off by default; the mechanism had to be turned on explicitly, part of the “experimental” option that most EiffelStudio releases offer for new, tentative mechanisms. That particular decision changed a year ago, with version 7.3 (May 2013): now void safety is the default. To include non-void-safe code you must mark  it explicitly.

The second default affects a language convention: the meaning of a standard declaration. A typical declaration, such as

x: PERSON                                                                                      /A/

says that at run time x denotes a reference which, if not void, will be attached to an object of type PERSON.  In pre-void-safety Eiffel, as in today’s other typed OO languages,  the reference could occasionally become void at run time; in other words, x was detachable. With the introduction of void safety, you could emphasize this property by specifying it explicitly:

x: detachable PERSON                                                             /B/

You could also specify that x would never be void by declaring it attached, asking the compiler to guarantee this property for you (through its application of the void-safety rules to all operations involving x). The explicit form in this case is

x: attached PERSON                                                               /C/

In practical programming, of course, you do not want to specify attached or detachable all the time: you want to use the simple form /A/ as often as possible. Originally, since we were starting from a non-void-safe language, compatibility required /A/ to mean /B/ by default. But it turns out that “attached” really is the dominant case: most references should remain attached at all times and Void values should be reserved for important but highly specialized cases such as terminating linked data structures. So the simple form should, in the final state of the language, mean /C/. That particular default was indeed switched early (version 7.0, November 2011) for people using the void-safety compiler option. As a result, the attached keyword is no longer necessary for declarations such as the above, although it remains available. Everything is attached by default; when you want a reference that could be void (and are prepared to bear the responsibility for convincing the compiler that it won’t when you actually use it in a call), you declare it as detachable; that keyword remains necessary.

There remains one last step in the march to all-aboard-for-void-safety: removing the “detachable by default” option, that is to say, the compiler option that will make /A/ mean /B/ (rather than /C/). It is only an option, and not the default; but still it remains available. Do we truly need it? The argument for removing it  is that it simplifies the specification (the fewer options the better) and encourages everyone, even more than before, to move to the new world. The argument against is to avoid disturbing existing projects, including their compiler control files (ECFs).

The question looms: when do we switch the defaults? Some of us think the time is now; specifically, the November release (14.11) [4].

Do you think the option should go? We would like your opinion. Please participate in the Eiffelroom poll [5].


References and note

[1] C.A.R. Hoare: Null References: The Billion Dollar Mistake , abstract of talk at QCon London, 9-12 March 2009, available here.

[2] (BM note) As a consolation, before Algol W, LISP already had NIL, which is the null pointer.

[3] Bertrand Meyer, Alexander Kogtenkov and Emmanuel Stapf: Avoid a Void: The Eradication of Null Dereferencing, in Reflections on the Work of C.A.R. Hoare, eds. C. B. Jones, A.W. Roscoe and K.R. Wood, Springer-Verlag, 2010, pages 189-211, available here.

[4] EiffelStudio version numbering changed in 2014: from a classic major_number.minor_number to a plain year.month, with two principal releases, 5 and 11 (May and November).

[5] Poll on switching the attachment defaults: at the bottom of the Eiffelroom page here (direct access here).

VN:F [1.9.10_1130]
Rating: 8.6/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 4 votes)

Code matters

(Adapted from an article previously published on the CACM blog.)

Often, you will be told that programming languages do not matter much. What actually matters more is not clear; maybe tools, maybe methodology, maybe process. It is a pretty general rule that people arguing that language does not matter are partisans of bad languages.

Let us consider the Apple bug of a few weeks ago. Only a few weeks; the world has already moved to Heartbleed (to be discussed in a subsequent article), but that is not a reason to sweep away the memory of the Apple bug and the language design that it reflects.

In late February, users of  iPhones, iPads and iPods were enjoined to upgrade their devices immediately because  “an attacker with a privileged network position may capture or modify data in sessions protected by SSL/TLS.” The bug was traced [1] to code of the following form:

if (error_of_first_kind)
goto fail;
if (error_of_second_kind)
goto fail;
if (error_of_third_kind)
goto fail;
if (error_of_fourth_kind)
goto fail;
if (error_of_fifth_kind)
goto fail;
goto fail;
if (error_of_sixth_kind)
goto fail;

In other words: just a duplicated line! (The extra line is highlighted above.) But the excess “goto” is beyond the scope of the preceding “if“, so it is executed unconditionally: all executions go directly to the “fail” label, so that The_truly_important_code_handling_non_erroneous_case never gets executed.

Critics have focused their ire on the  goto instruction, but it is of little relevance. What matters, language-wise, is the C/C++-Java-C# convention of delimiting the scope of conditional instructions, loops and other kinds of composite structures. Every component of such structures in these languages is syntactically a single instruction, so that:

  • If you want the branch to consist of an atomic instruction, you write that instruction by itself, as in: if (c) a = b;
  • If you want a sequence of instructions, you write it as a compound, enclosed by the ever so beautiful braces: if (c) {a = b; x = y;}

Although elegant in principle (after all, it comes from Algol), this convention is disastrous from a software engineering perspective because software engineering means understanding that programs change. One day, a branch of a conditional or loop has one atomic instruction; sometime later, a maintainer realizes that the corresponding case requires more sophisticated treatment, and adds an instruction, but fails to add the braces.

The proper language solution is to do away with the notion of compound instruction as a separate concept, but simply expect all branches of composite instructions to consist of a sequence, which could consist of several instructions, just one, or none at all. In Eiffel, you will write

if  c then
   x := y


 if  c then
   a := b
   x := y
   u := v


from i := 1 until c loop
   a := b
   i := i + 1


across my_list as l loop
   l.add (x)

and so on. This syntax also gets rid of all the noise that pollutes programs in languages retaining C’s nineteen-sixties conventions: parentheses around the conditions, semicolons for instructions on different lines; these small distractions accumulate into serious impediments to program readability.

With such a modern language design, the Apple bug could not have arisen. A duplicated line is either:

  • A keyword such as end, immediately caught as a syntax error.
  • An actual instruction such as an assignment, whose duplication causes either no effect or an effect limited to the particular case covered by the branch, rather than catastrophically disrupting all cases, as in the Apple bug.

Some people, however, find it hard to accept the obvious responsibility of language design. Take this comment derisively entitled  “the goto squirrel” by Dennis Hamilton in the ACM Risks forum [2]:

It is amazing to me that, once the specific defect is disclosed (and the diff of the actual change has also been published), the discussion has devolved into one of coding style and whose code is better.  I remember similar distractions around the Ariane 501 defect too, although in that case there was nothing wrong with the code—the error was that it was being run when it wasn’t needed and it was not simulation tested with new launch parameters under the mistaken assumption that if the code worked for Ariane 4, it should work for Ariane 5.

It is not about the code.  It is not about the code.  It is not about goto. It is not about coming up with ways to avoid introducing this particular defect by writing the code differently.

Such certainty! Repeating a wrong statement ( “it is not about the code“) does not make it  right. Of course “it” is about the code! If the code had been different the catastrophe would not have happened, so one needs some gall to state that the code is not the issue — and just as much gall, given that the catastrophe would also not have happened if the programming language had been different, to state that it is not about the programming language.

When Mr. Hamilton dismisses as “distractions” the explanations pointing to programming-related causes for the Ariane-5 disaster, I assume he has in mind the analysis which I published at the time with Jean-Marc Jézéquel [3], which explained in detail how the core issue was the absence of proper specifications (contracts). At that time too, we heard dismissive comments; according to one of the critics, the programming aspects did not count, since the whole thing was really a social problem: the French engineers in Toulouse did not communicate properly with their colleagues in England! What is great with such folk explanations is that they sound just right and please people because they reinforce existing stereotypes. They are by nature as impossible to refute as they are impossible to prove. And they avoid raising the important but disturbing questions: were the teams using the right programming language, the right specification method (contracts, as our article suggested), appropriate tools? In both the Ariane-5 and Apple cases, they were not.

If you want to be considered polite, you are not supposed to point out that the use of programming languages designed for the PDP-8 or some other long-gone machine is an invitation to disaster. The more terrible the programming language people use, and the more they know it is terrible (even if they will not admit it), the more scandalized they will be that you point out that it is, indeed, terrible. It is as if you had said something about their weight or the pimples on their cheeks. Such reactions do not make the comment less true. The expression of outrage is particularly inappropriate when technical choices are not just matters for technical argument, but have catastrophic consequences on society.

The usual excuse, in response to language criticisms, is that better tools, better quality control (the main recommendation of the Ariane-5 inquiry committee back in 1997), better methodology would also have avoided the problem. Indeed, a number of the other comments in the comp.risks discussion that includes Hamilton’s dismissal of code [2] point in this direction, noting for example that static analyzers could have detected code duplication and unreachable instructions. These observations are all true, but change nothing to the role of programming languages and coding issues.  One of the basic lessons from the study of software and other industrial disasters — see for example the work of Nancy Leveson — is that a disaster results from a combination of causes. This property is in fact easy to understand: a disaster coming from a single cause would most likely have been avoided. Consider the hypothetical example of a disastrous flaw in Amazon’s transaction processing. It seems from various sources that Amazon processes something like 300 transactions a second. Now let us assume three independent factors, each occurring with a probability of a thousandth (10-3), which could contribute to a failure. Then:

  • It is impossible that one of the factors could cause failure just by itself: that means it would make a transaction after around 3 seconds, and would be caught even in the most trivial unit testing. No one but the developer would ever know about it.
  • If two of the factors together cause failure, they will occur every million transactions, meaning about once an hour. Any reasonable testing will discover the problem before a release is ever deployed.
  • If all three factors are required, the probability is 10-9, meaning that a failure will occur about once a year. Only in that case will a real problem exist: a flaw that goes undetected for a long time, during which everything seems normal, until disaster strikes.

These observations explain why post-mortem examinations of catastrophes always point to a seemingly impossible combination of unfortunate circumstances. The archduke went to Sarajevo and he insisted on seeing the wounded and someone forgot to tell the drivers about the prudent decision to bypass the announced itinerary and the convoy stalled  and the assassin saw it and he hit Franz-Ferdinand right in the neck and there was nationalistic resentment in various countries and the system of alliances required countries to declare war [4]. Same thing for industrial accidents. Same thing for the Apple bug: obviously, there were no good code reviews and no static analysis tools applied and no good management; and, obviously, a programming language that blows out innocent mistakes into disasters of planetary import.

So much for the accepted wisdom, heard again and again in software engineering circles, that code does not matter, syntax does not count, typos are caught right away, and that all we should care about is process or agility or requirements or some other high-sounding concern more respectable than programming. Code? Programming languages? Did we not take care of those years ago? I remember similar distractions.”

There is a  positive conclusion to the “and” nature (in probabilistic terms, the multiplicative nature) of causes necessary to produce a catastrophe in practice: it suffices to get rid of one of the operands of the “and” to falsify its result, hence avoiding the catastrophe. When people tell you that code does not matter or that language does not matter, just understand the comment for what it really means, “I am ashamed of the programming language and techniques I use but do not want to admit it so I prefer to blame problems on the rest of the world“, and make the correct deduction: use a good programming language.


[1] Paul Duckline:  Anatomy of a “goto fail” – Apple’s SSL bug explained, plus an unofficial patch for OS X!, Naked Security blog (Sophos), 24 February 2014, available here.

[2] Dennis E. Hamilton: The Goto Squirrel, ACM Risks Forum, 28 February 2014, available here.

[3] Jean-Marc Jézéquel and Bertrand Meyer: Design by Contract: The Lessons of Ariane, in Computer (IEEE), vol. 30, no. 1, January 1997, pages 129-130, available online here and, with reader responses here.

[4] Assassination of Ferdinand of Autria: here.

VN:F [1.9.10_1130]
Rating: 8.1/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 6 votes)

New article: contracts in practice

For almost anyone programming in Eiffel, contracts are just a standard part of daily life; Patrice Chalin’s pioneering study of a few years ago [1] confirmed this impression. A larger empirical study is now available to understand how developers actually use contracts when available. The study, to published at FM 2014 [2] covers 21 programs, not just in Eiffel but also in JML and in Code Contracts for C#, totaling 830,000 lines of code, and following the program’s revision history for a grand total of 260 million lines of code over 7700 revisions. It analyzes in detail whether programmers use contracts, how they use them (in particular, which kinds, among preconditions, postconditions and invariants), how contracts evolve over time, and how inheritance interacts with contracts.

The paper is easy to read so I will refer you to it for the detailed conclusions, but one thing is clear: anyone who thinks contracts are for special development or special developers is completely off-track. In an environment supporting contracts, especially as a native part of the language, programmers understand their benefits and apply them as a matter of course.


[1] Patrice Chalin: Are practitioners writing contracts?, in Fault-Tolerant System, eds. Butler, Jones, Romanovsky, Troubitsyna, Springer LNCS, vol. 4157, pp. 100–113, 2006.

[2] H.-Christian Estler, Carlo A. Furia, Martin Nordio, Marco Piccioni and Bertrand Meyer: Contracts in Practice, to appear in proceedings of 19th International Symposium on Formal Methods (FM 2014), Singapore, May 2014, draft available here.

VN:F [1.9.10_1130]
Rating: 8.4/10 (11 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 6 votes)

PhD positions in concurrency/distribution/verification at ETH

As part of our “Concurrency Made Easy” ERC Advanced Investigator Grant project (2012-2017), we are offering PhD positions at the Chair of Software Engineering of ETH Zurich. The goal of the project is to build a sophisticated programming and verification architecture to make concurrent and distributed programming simple and reliable, based on the ideas of Eiffel and particularly the SCOOP concurrency model. Concurrency in its various forms (particularly multithreading) as well as distributed computing are required for most of today’s serious programs, but programming concurrent applications remains a challenge. The CME project is determined to break this complexity barrier.  Inevitably, achieving simplicity for users (in this case, application programmers) requires, under the hood, a sophisticated infrastructure, both conceptual (theoretical models) and practical (the implementation). We are building that infrastructure.

ETH offers an outstanding research and education environment and competitive salaries for “assistants” (PhD students), who are generally expected in addition to their research to participate in teaching, in particular introductory programming, and other activities of the Chair.  The candidates we seek have: a master’s degree in computer science or related field from a recognized institution (as required by ETH); a strong software engineering background, both practical and theoretical, and more generally a strong computer science and mathematical culture; a good knowledge of verification techniques (e.g. Hoare-style, model-checking, abstract interpretation); some background in concurrency or distribution; and a passion for high-quality software development. Prior publications, and experience with Eiffel, are pluses. In line with ETH policy, particular attention will be given to female candidates.

Before applying, you should become familiar with our work; see in particular the research pages at including the full description of the CME project at

Candidates should send (in PDF or text ) to a CV and a short cover letter describing their view of the CME project and ideas about their possible contribution.

VN:F [1.9.10_1130]
Rating: 5.7/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 4 votes)

Negative variables: new version

I have mentioned this paper before (see the earlier blog entry here) but it is now going to be published [1] and has been significantly revised, both to take referee comments into account and because we found better ways to present the concepts.

We have  endeavored to explain better than in the draft why the concept of negative variable is necessary and why the usual techniques for modeling object-oriented programs do not work properly for the fundamental OO operation, qualified call x.r (…). These techniques are based on substitution and are simply unable to express certain properties (let alone verify them). The affected properties are those involving properties of the calling context or the global project structure.

The basic idea (repeated in part from the earlier post) is as follows. In modeling OO programs, we have to take into account the unique “general relativity” property of OO programming: all the operations you write are expressed relative to a “current object” which changes repeatedly during execution. More precisely at the start of a call x.r (…) and for the duration of that call the current object changes to whatever x denotes — but to determine that object we must again interpret x in the context of the previous current object. This raises a challenge for reasoning about programs; for example in a routine the notation f.some_reference, if f is a formal argument, refers to objects in the context of the calling object, and we cannot apply standard rules of substitution as in the non-OO style of handling calls.

We introduced a notion of negative variable to deal with this issue. During the execution of a call x.r (…) the negation of x , written x’, represents a back pointer to the calling object; negative variables are characterized by axiomatic properties such as x.x’= Current and x’.(old x)= Current.

Negative variable as back pointer

The paper explains why this concept is necessary, describes the associated formal rules, and presents applications.


[1] Bertrand Meyer and Alexander Kogtenkov: Negative Variables and the Essence of Object-Oriented Programming, to appear in Specification, Algebra, and Software, eds. Shusaku Iida, Jose Meseguer and Kazuhiro Ogata, Springer Lecture Notes in Computer Science, 2014, to appear. See text here.

VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 4 votes)

Saint Petersburg Software Engineering Seminar: 14 January 2014 (6 PM)

There will be two talks in the Software Engineering Seminar at ITMO, 18:00 local time, Tuesday, January 14, 2014. Please arrive 10 minutes early for registration.

Place: ITMO, Sytninskaya Ulitsa, Saint Petersburg.

Andrey Terekhov (SPBGU): Programming crystals

(I do not know whether this talk will be in Russian or English. An abstract follows but the talk is meant as the start of a discussion rather than a formal lecture.)

В течение последних 20-30 лет основными языками программирования кристаллов были VHDL и Verilog. Эти языки изначально проектировались как средства создания проектной документации, потом они стали использоваться в качестве инструмента моделирования и только сравнительно недавно для этих языков появились средства генерации кода уровня RTL (Register Transfer Language). Тексты на  VHDL и Verilog очень громоздки, трудно читаемые, плохо стандартизованы (одна и та же программа может синтезироваться на одном инструменте и не поддаваться синтезу на другом. Лет 10 назад появился язык SystemC – это С++ с огромным набором библиотек. С одной стороны, любая программа на SystemC может транслироваться стандартными трансляторами С++ , есть удобные средства потактного моделирования и приличные средства генгерации RTL, с другой стороны, требование совместимости с С++ не прошло даром – если в базовом языке нет средств описания параллелизма и конвейеризации, их приходится добавлять весьма искусственными приемами через приставные библиотеки. Буквально в прошлом году фирма Xilinx выпустила продукт Vivado, в рекламе которого утверждается, что он способен автоматически транслировать обычные программы на С/C++ в RTL промышленного качества.

Мы выполнили несколько экспериментов по использованию этого продукта, оказалось, что обещанной автоматизации там нет, пользователь должен писать на С, постоянно думая о том, как его код будет выглядеть в финальном RTL,  расставлять огромное количество прагм, причем не всегда очевидных.

Основной тезис доклада – такая важная область, как проектирование кристаллов, нуждается в специализированных языковых и инструментальных средствах, обеспечивающих  создание компактных и  легко читаемых программ, которые могут быть использованы как для симуляции, так и для генерации эффективного RTL. В докладе будут приведены примеры программ на языке HaSCoL (Hardware and Software Codesign Language), разработанном на кафедре системного программирования СПбГУ, и даны некоторые сравнительные характеристики.

Sergey Velder (ITMO): Alias graphs

(My summary – BM.) In the ITMO SEL work on automatic alias analysis, a new model has been developed: alias graphs, an abstraction of the object structure. This short talk will compare it to previously used approaches.

VN:F [1.9.10_1130]
Rating: 5.5/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 2 votes)

New paper: alias calculus and frame inference

For a while now I have  been engaged in  a core problem of software verification: the aliasing problem. As with many difficult problems in science, it is easy to state the basic question: can we determine automatically whether at a program point p the values of two reference expressions e and f can ever denote the same object?

Alias analysis lies at the core of many problems in software analysis and verification.

Earlier work [2] I introduced an “alias calculus”. The calculus is a set of rules, attached to the constructs of the programming language, to compute the “alias relation”: the set of possibly aliased expression pairs. A new paper [1] with Sergey Velder and Alexander Kogtenkov improves the model (correcting in particular an error in the axiom for assignment, whose new version has been proved sound using Coq) and applies it to the inference of frame properties. Here the abstract:

Alias analysis, which determines whether two expressions in a program may reference to the same object, has many potential applications in program construction and verification. We have developed a theory for alias analysis, the “alias calculus”, implemented its application to an object-oriented language, and integrated the result into a modern IDE. The calculus has a higher level of precision than many existing alias analysis techniques. One of the principal applications is to allow automatic change analysis, which leads to inferring “modifies clauses”, providing a significant advance towards addressing the Frame Problem. Experiments were able to infer the “modifies” clauses of an existing formally specied library. Other applications, in particular to concurrent programming, also appear possible. The article presents the calculus, the application to frame inference including experimental results, and other projected applications. The ongoing work includes building more efficient model capturing aliasing properties and soundness proof for its essential elements.

This is not the end of the work, as better models and implementations are needed, but an important step.


[1] Sergey Velder, Alexander Kogtenkovand Bertrand Meyer: Alias Calculus, Frame Calculus and Frame Inference, in Science of Computer Programming, to appear in 2014 (appeared online 26 November 2013); draft available here, published version here.
[2] Bertrand Meyer: Steps Towards a Theory and Calculus of Aliasing, in International Journal of Software and Informatics, Chinese Academy of Sciences, 2011, pages 77-116, available here.


VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 4 votes)

Swiss railway prowess

Yesterday, wanting to find out the platform of an arriving  train (stations are good at displaying departures, but don’t seem to consider arrival information worth showing), I fired up the “SBB Business” mobile application of the Swiss railways which I purchased some time ago:

SBB schedule
















The results are impressive: one can, they tell us, ride from Paris to Zurich — about 500 kilometers or 300 miles — in 56 minutes by leaving at 17:02. Wow! No wonder the entry displays (on the right) the graphical symbol for the highest possible expected passenger load, in both first and second class. With such a speed I would flock to that train too!

Nonsense of course. The TGV is fast, at least in the French part (from Basel to Zurich they seem to make it as slow as possible to prove some point), but it still takes four hours and three minutes. I have no idea where the program got the information it displays.

Trying to tap the “earlier” or “later” buttons does not help much, since what you get (consistently repeated if you keep trying, and confirmed again one day later) is this screen:

SBB error message















No clue what “F1” means.

Whatever software solutions the SBB uses, I am not impressed. Of course, they are welcome to ask for my suggestions.

VN:F [1.9.10_1130]
Rating: 7.5/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 5 votes)

The invariants of key algorithms (new paper)


I have mentioned this paper before but as a draft. It has now been accepted by ACM’s Computing Surveys and is scheduled to appear in September 2014; the current text, revised from the previous version, is available [1].

Here is the abstract:

Software verification has emerged as a key concern for ensuring the continued progress of information technology. Full verification generally requires, as a crucial step, equipping each loop with a “loop invariant”. Beyond their role in verification, loop invariants help program understanding by providing fundamental insights into the nature of algorithms. In practice, finding sound and useful invariants remains a challenge. Fortunately, many invariants seem intuitively to exhibit a common flavor. Understanding these fundamental invariant patterns could therefore provide help for understanding and verifying a large variety of programs.

We performed a systematic identification, validation, and classification of loop invariants over a range of fundamental algorithms from diverse areas of computer science. This article analyzes the patterns, as uncovered in this study,governing how invariants are derived from postconditions;it proposes a taxonomy of invariants according to these patterns, and presents its application to the algorithms reviewed. The discussion also shows the need for high-level specifications based on “domain theory”. It describes how the invariants and the corresponding algorithms have been mechanically verified using an automated program prover; the proof source files are available. The contributions also include suggestions for invariant inference and for model-based specification.


[1] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, to appear in September 2014, preliminary text available here.

VN:F [1.9.10_1130]
Rating: 7.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Reading notes: strong specifications are well worth the effort


This report continues the series of ICSE 2013 article previews (see the posts of these last few days, other than the DOSE announcement), but is different from its predecessors since it talks about a paper from our group at ETH, so you should not expect any dangerously delusional,  disingenuously dubious or downright deceptive declaration or display of dispassionate, disinterested, disengaged describer’s detachment.

The paper [1] (mentioned on this blog some time ago) is entitled How good are software specifications? and will be presented on Wednesday by Nadia Polikarpova. The basic result: stronger specifications, which capture a more complete part of program functionality, cause only a modest increase in specification effort, but the benefits are huge; in particular, automatic testing finds twice as many faults (“bugs” as recently reviewed papers call them).

Strong specifications are specifications that go beyond simple contracts. A straightforward example is a specification of a push operation for stacks; in EiffelBase, the basic Eiffel data structure library, the contract’s postcondition will read

item =                                          /A/
count = old count + 1

where x is the element being pushed, item the top of the stack and count the number of elements. It is of course sound, since it states that the element just pushed is now the new top of the stack, and that there is one more element; but it is also  incomplete since it says nothing about the other elements remaining as they were; an implementation could satisfy the contract and mess up with these elements. Using “complete” or “strong” preconditions, we associate with the underlying domain a theory [2], or “model”, represented by a specification-only feature in the class, model, denoting a sequence of elements; then it suffices (with the convention that the top is the first element of the model sequence, and that “+” denotes concatenation of sequences) to use the postcondition

model = <x> + old model         /B/

which says all there is to say and implies the original postconditions /A/.

Clearly, the strong contracts, in the  /B/ style, are more expressive [3, 4], but they also require more specification effort. Are they worth the trouble?

The paper explores this question empirically, and the answer, at least according to the criteria used in the study, is yes.  The work takes advantage of AutoTest [5], an automatic testing framework which relies on the contracts already present in the software to serve as test oracles, and generates test cases automatically. AutoTest was applied to both to the classic EiffelBase, with classic partial contracts in the /A/ style, and to the more recent EiffelBase+ library, with strong contracts in the /B/ style. AutoTest is for Eiffel programs; to check for any language-specificity in the results the work also included testing a smaller set of classes from a C# library, DSA, for which a student developed a version (DSA+) equipped with strong model-based contracts. In that case the testing tool was Microsoft Research’s Pex [7]. The results are similar for both languages: citing from the paper, “the fault rates are comparable in the C# experiments, respectively 6 . 10-3 and 3 . 10-3 . The fault complexity is also qualitatively similar.

The verdict on the effect of strong specifications as captured by automated testing is clear: the same automatic testing tools applied to the versions with strong contracts yield twice as many real faults. The term “real fault” comes from excluding spurious cases, such as specification faults (wrong specification, right implementation), which are a phenomenon worth studying but should not count as a benefit of the strong specification approach. The paper contains a detailed analysis of the various kinds of faults and the corresponding empirically determined measures. This particular analysis is for the Eiffel code, since in the C#/Pex case “it was not possible to get an evaluation of the faults by the original developers“.

In our experience the strong specifications are not that much harder to write. The paper contains a precise measure: about five person-weeks to create EiffelBase+, yielding an “overall benefit/effort ratio of about four defects detected per person-day“. Such a benefit more than justifies the effort. More study of that effort is needed, however, because the “person” in the person-weeks was not just an ordinary programmer. True, Eiffel experience has shown that most programmers quickly get the notion of contract and start applying it; as the saying goes in the community, “if you can write an if-then-else, you can write a contract”. But we do not yet have significant evidence of whether that observation extends to model-based contracts.

Model-based contracts (I prefer to call them “theory-based” because “model” means so many other things, but I do not think I will win that particular battle) are, in my opinion, a required component of the march towards program verification. They are the right compromise between simple contracts, which have proved to be attractive to many practicing programmers but suffer from incompleteness, and full formal specification à la Z, which say everything but require too much machinery. They are not an all-or-nothing specification technique but a progressive one: programmers can start with simple contracts, then extend and refine them as desired to yield exactly the right amount of precision and completeness appropriate in any particular context. The article shows that the benefits are well worth the incremental effort.

According to the ICSE program the talk will be presented in the formal specification session, Wednesday, May 22, 13:30-15:30, Grand Ballroom C.


[1] Nadia Polikarpova, Carlo A. Furia, Yu Pei, Yi Wei and Bertrand Meyer: What Good Are Strong Specifications?, to appear in ICSE 2013 (Proceedings of 35th International Conference on Software Engineering), San Francisco, May 2013, draft available here.

[2] Bertrand Meyer: Domain Theory: the forgotten step in program verification, article on this blog, see here.

[3] Bernd Schoeller, Tobias Widmer and Bertrand Meyer: Making Specifications Complete Through Models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, available here.

[4] Nadia Polikarpova, Carlo Furia and Bertrand Meyer: Specifying Reusable Components, in Verified Software: Theories, Tools, Experiments (VSTTE ‘ 10), Edinburgh, UK, 16-19 August 2010, Lecture Notes in Computer Science, Springer Verlag, 2010, available here.

[5] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, also available here.

[6] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, in IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, also available here.

[7] Nikolai Tillman and Peli de Halleux, Pex: White-Box Generation for .NET, in Tests And Proofs (TAP 2008), pp. 134-153.


VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Reading notes: misclassified bugs


(Please note the general disclaimer [1].)

How Misclassification Impacts Bug Prediction [2], an article to be presented on Thursday at ICSE, is the archetype of today’s successful empirical software engineering research, deriving significant results from the mining of publicly available software project repositories — in this case Tomcat5 and three others from Apache, as well as Rhino from Mozilla. The results are in some sense meta-results, because many studies have already mined the bug records of such repositories to draw general lessons about bugs in software development; what Herzig, Just and Zeller now tell us is that the mined data is highly questionable: many problems classified as bugs are not bugs.

The most striking results (announced in a style a bit stentorian to my taste, but indeed striking) are that: every third bug report does not describe a bug, but a request for a new feature, an improvement, better documentation or tests, code cleanup or refactoring; and that out of five program files marked as defective, two do not in fact contain any bug.

These are both false positive results. The repositories signal very few misclassifications the other way: only a small subset of enhancement and improvement requests (around 5%) should have been classified as bugs, and even fewer faulty files are missed (8%, but in fact less than 1% if one excludes an outlier, tomcat5 with 38%, a discrepancy that the paper does not discuss).

The authors have a field day, in the light of this analysis, of questioning the validity of the many studies in recent years — including some, courageously cited, by Zeller himself and coauthors — that start from bug repositories to derive general lessons about bugs and their properties.

The methodology is interesting if a bit scary. The authors (actually, just the two non-tenured authors, probably just a coincidence) analyzed 7401 issue reports manually; more precisely, one of them analyzed all of them and the second one took a second look at the reports that came out from the first step as misclassified, without knowing what the proposed reclassification was, then the results were merged. At 4 minutes per report this truly stakhanovite effort took 90 working days. I sympathize, but I wonder what the rules are in Saarland for experiments involving living beings, particularly graduate students.

Precise criteria were used for the reclassification; for example a report describes a bug, in the authors’ view, if it mentions a null pointer exception (I will skip the opportunity of a pitch for Eiffel’s void safety mechanism), says that the code has to be corrected to fix the semantics, or if there is a “memory issue” or infinite loop. These criteria are reasonable if a bit puzzling (why null pointer exceptions and not other crashes such as arithmetic overflows?); but more worryingly there is no justification for them. I wonder  how much of the huge discrepancy found by the authors — a third or reported bugs are not bugs, and 40% of supposedly defective program files are not defective — can be simply explained by different classification criteria applied by the software projects under examination. The authors give no indication that they interacted with the people in charge of these projects. To me this is the major question hovering over this paper and its spectacular results. If you are in the room and get the chance, don’t hesitate to ask this question on my behalf or yours!

Another obvious question is how much the results depend on the five projects selected. If there ever was room for replicating a study (a practice whose rarity in software engineering we lament, but whose growth prospects are limited by the near-impossibility of convincing selective software engineering venues to publish confirmatory empirical studies), this would be it. In particular it would be good to see some of the results for commercial products.

The article offers an explanation for the phenomena it uncovered: in its view, the reason why so many bug reports end up misclassified is the difference of perspective between users of the software, who complain about the problems they encounter,  and the software professionals  who prepare the actual bug reports. The explanation is plausible but I was surprised not to see any concrete evidence that supports it. It is also surprising that the referees did not ask the authors to provide more solid arguments to buttress that explanation. Yet another opportunity to raise your hand and ask a question.

This (impressive) paper will call everyone’s attention to the critical problem of data quality in empirical studies. It is very professionally prepared, and could, in addition to its specific contributions, serve as a guide on how to get an empirical software engineering paper accepted at ICSE: take a critical look at an important research area; study it from a viewpoint that has not been considered much so far; perform an extensive study, with reasonable methodological assumptions; derive a couple of striking results, making sure they are both visibly stated and backed by the evidence; and include exactly one boxplot.

Notes and references

[1] This article review is part of the “Reading Notes” series. General disclaimer here.

[2] Kim Herzig, Sascha Just and Andreas Zeller: It’s not a Bug, it’s a Feature: How Misclassification Impacts Bug Prediction, in ICSE 2013, available here. According to the ICSE program the paper will be presented on May 23 in the Bug Prediction session, 16 to 17:30.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Reading notes: the design of bug fixes


To inaugurate the “Reading Notes” series [1] I will take articles from the forthcoming International Conference on Software Engineering. Since I am not going to ICSE this year I am instead spending a little time browsing through the papers, obligingly available on the conference site. I’ll try whenever possible to describe a paper before it is presented at the conference, to alert readers to interesting sessions. I hope in July and August to be able to do the same for some of the papers to be presented at ESEC/FSE [2].

Please note the general disclaimer [1].

The Design of Bug Fixes [3] caught my attention partly for selfish reasons, since we are working, through the AutoFix project [3], on automatic bug fixing, but also out of sheer interest and because I have seen previous work by some of the authors. There have been article about bug patterns before, but not so much is known with credible empirical evidence about bug fixes (corrections of faults). When a programmer encounters a fault, what strategies does he use to correct it? Does he always produce the best fix he can, and if so, why not? What is the influence of the project phase on such decisions (e.g. will you fix a bug the same way early in the process and close to shipping)? These are some of the questions addressed by the paper.

The most interesting concrete result is a list of properties of bug fixes, classified along two criteria: nature of a fix (the paper calls it “design space”), and reasoning behind the choice of a fix. Here are a few examples of the “nature” classification:

  • Data propagation: the bug arises in a component, fix it in another, for example a library class.
  • More or less accuracy: are we fixing the symptom or the cause?
  • Behavioral alternatives: rather than directly correcting the reported problem, change the user-experienced behavior (evoking the famous quip that “it’s not a bug, it’s a feature”). The authors were surprised to see that developers (belying their geek image) seem to devote a lot of effort trying to understand how users actually use the products, but also found that even so developers do not necessarily gain a solid, objective understanding of these usage patterns. It would be interesting to know if the picture is different for traditional locally-installed products and for cloud-based offerings, since in the latter case it is possible to gather more complete, accurate and timely usage data.

On the “reasoning” side, the issue is why and how programmers decide to adopt a particular approach. For example, bug fixes tend to be more audacious (implying redesign if appropriate) at the beginning of a project, and more conservative as delivery nears and everyone is scared of breaking something. Another object of the study is how deeply developers understand the cause rather than just the symptom; the paper reports that 18% “did not have time to figure out why the bug occurred“. Surprising or not, I don’t know, but scary! Yet another dimension is consistency: there is a tension between providing what might ideally be the best fix and remaining consistent with the design decisions that underlie a software system throughout its architecture.

I was more impressed by the individual categories of the classification than by that classification as a whole; some of the categories appear redundant (“interface breakage“, “data propagation” and “internal vs external“, for example, seem to be pretty much the same; ditto for “cause understanding” and “accuracy“). On the other hand the paper does not explicitly claim that the categories are orthogonal. If they turn this conference presentation into a journal article I am pretty sure they will rework the classification and make it more robust. It does not matter that it is a bit shaky at the moment since the main insights are in the individual kinds of fix and fix-reasoning uncovered by the study.

The authors are from Microsoft Research (one of them was visiting faculty) and interviewed numerous programmers from various Microsoft product groups to find out how they fix bugs.

The paper is nicely written and reads easily. It includes some audacious syntax, as in “this dimension” [internal vs external] “describes how much internal code is changed versus external code is changed as part of a fix“. It has a discreet amount of humor, some of which may escape non-US readers; for example the authors explain that when approaching programmers out of the blue for the survey they tried to reassure them through the words “we are from Microsoft Research, and we are here to help“, a wry reference to the celebrated comment by Ronald Reagan (or his speechwriter) that the most dangerous words in the English language are “I am from the government, and I am here to help“. To my taste the authors include too many details about the data collection process; I would have preferred the space to be used for a more detailed discussion of the findings on bug fixes. On the other hand we all know that papers to selective conferences are written for referees, not readers, and this amount of methodological detail was probably the minimum needed to get past the reviewers (by avoiding the typical criticism, for empirical software engineering research, that the sample is too small, the questions biased etc.). Thankfully, however, there is no pedantic discussion of statistical significance; the authors openly present the results as dependent on the particular population surveyed and on the interview technique. Still, these results seem generalizable in their basic form to a large subset of the industry. I hope their publication will spawn more detailed studies.

According to the ICSE program the paper will be presented on May 23 in the Debugging session, 13:30 to 15:30.

Notes and references

[1] This article review is part of the “Reading Notes” series. General disclaimer here.

[2] European Software Engineering Conference 2013, Saint Petersburg, Russia, 18-24 August, see here.

[3] Emerson Murphy-Hill, Thomas Zimmerman, Christian Bird and Nachiappan Nagapan: The Design of Bug Fixes, in ICSE 2013, available here.

[4] AutoFix project at ETH Zurich, see project page here.

[5] Ronald Reagan speech extract on YouTube.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Specify less to prove more

Software verification is progressing slowly but surely. Much of that progress is incremental: making the fundamental results applicable to real programs as they are built every day by programmers working in standard circumstances. A key condition is to minimize the amount of annotations that they have to provide.

The article mentioned in my previous post, “Program Checking With Less Hassle” [1], to be presented at VSTTE in San Francisco on Friday by its lead author, Julian Tschannen, introduces several interesting contributions in this direction. One of the surprising conclusions is that sometimes it pays to specify less. That goes against intuition: usually, the more specification information (correctness annotations) you provide the more you help the prover. But in fact partial specifications can hurt rather than help. Consider for example a swap routine with a partial specification, which actually stands in the way of a proof. If modularity is not a concern, for example if the routine is part of the code being verified rather than of a library, it may be more effective to ignore the specification and use the routine’s implementation. This is particularly appropriate for smallhelper routines such as the swap example.

This inlining technique is applicable in other cases, for example to make up for a missing precondition: assume that a helper routine will only work for x > 0 but does not state that precondition, or maybe states only the weaker one x ≥ 0 ; in the code, however, it is only called with positive arguments. If we try to verify the code modularly we will fail, as indeed we should since the routine is incorrect as a general-purpose primitive. But within the context of the code there is nothing wrong with it. Forgetting the contract of the routine if any, and instead using its actual implementation, we may be able to show that everything is fine.

Another component of the approach is to fill in preconditions that programmers have omitted because they are somehow obvious to them. For example it is tempting and common to write just a [1] > 0 rather than a /= Void and then a [1] > 0 for a detachable array a. The tool takes care of  interpreting the simpler precondition as the more complete one.

The resulting “two-step verification”, integrated into the AutoProof verification tool for Eiffel, should turn out to be an important simplification towards the goal of “Verification As a Matter Of Course” [2].


[1] Julian Tschannen, Carlo A. Furia, Martin Nordio and Bertrand Meyer: Program Checking With Less Hassle, in VSTTE 2013, Springer LNCS, to appear, draft available here; presentation on May 17 in the 15:30-16:30  session.

[2] Verification As a Matter Of Course, article in this blog, 29 March 2010, see here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Presentations at ICSE and VSTTE


The following presentations from our ETH group in the ICSE week (International Conference on Software Engineering, San Francisco) address important issues of software specification and verification, describing new techniques that we have recently developed as part of our work building EVE, the Eiffel Verification Environment. One is at ICSE proper and the other at VSTTE (Verified Software: Tools, Theories, Experiments). If you are around please attend them.

Julian Tschannen will present Program Checking With Less Hassle, written with Carlo A. Furia, Martin Nordio and me, at VSTTE on May 17 in the 15:30-16:30 session (see here in the VSTTE program. The draft is available here. I will write a blog article about this work in the coming days.

Nadia Polikarpova will present What Good Are Strong Specifications?, written with , Carlo A. Furia, Yu Pei, Yi Wei and me at ICSE on May 22 in the 13:30-15:30 session (see here in the ICSE program). The draft is available here. I wrote about this paper in an earlier post: see here. It describes the systematic application of theory-based modeling to the full specification and verification of advanced software.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

How good are strong specifications? (New paper, ICSE 2013)


A core aspect of our verification work is the use of “strong” contracts, which express sophisticated specification properties without requiring a separate specification language: even for advanced properties, there is no need for a separate specification language, with special notations such as those of first-order logic; instead, one can continue to rely, in the tradition of Design by Contract, on the built-in notations of the programming language, Eiffel.

This is the idea of domain theory, as discussed in earlier posts on this blog, in particular [1]. An early description of the approach, part of Bernd Schoeller’s PhD thesis work, was [2]; the next step was [3], presented at VSTTE in 2010.

A new paper to be presented at ICSE in May [3], part of an effort led by Nadia Polikarpova for her own thesis in progress, shows new advances in using strong specifications, demonstrating their expressive power and submitting them to empirical evaluation. The results show in particular that strong specifications justify the extra effort; in particular they enable automatic tests to find significantly more bugs.

A byproduct of this work is to show again the complementarity between various forms of verification, including not only proofs but (particularly in the contribution of two of the co-authors, Yi Wei and Yu Pei, as well as Carlo Furia) tests.


[1] Bertrand Meyer: Domain Theory: the forgotten step in program verification, article on this blog, see here.

[2] Bernd Schoeller, Tobias Widmer and Bertrand Meyer: Making Specifications Complete Through Models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, available here.

[3] Nadia Polikarpova, Carlo Furia and Bertrand Meyer: Specifying Reusable Components, in Verified Software: Theories, Tools, Experiments (VSTTE ‘ 10), Edinburgh, UK, 16-19 August 2010, Lecture Notes in Computer Science, Springer Verlag, 2010, available here.

[4] Nadia Polikarpova, Carlo A. Furia, Yu Pei, Yi Wei and Bertrand Meyer: What Good Are Strong Specifications?, to appear in ICSE 2013 (Proceedings of 35th International Conference on Software Engineering), San Francisco, May 2013, draft available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

ESEC/FSE 2013: 18-26 August, Saint Petersburg, Russia

The European Software Engineering Conference takes place every two years in connection with the ACM Foundations of Software Engineering symposium (which in even years is in the US). The next ESEC/FSE  will be held for the first time in Russia, where it will be the first major international software engineering conference ever. It comes at a time when the Russian software industry is ever more present through products and services offered worldwide. See the conference site here. The main conference will be held 21-23 August 2013, with associated events before and after so that the full dates are August 18 to 26. (I am the general chair.)

Other than ICSE, ESEC/FSE is second to none in the quality of the program. We already have four outstanding keynote speakers:  Georges Gonthier from Microsoft Research, Paola Inverardi from L’Aquila in Italy, David Notkin from U. of Washington (in whose honor a symposium will be held as an associated event of ESEC/FSE, chaired by Michael Ernst), and Moshe Vardi of Rice and of course Communications of the ACM.

Saint Petersburg is one of the most beautiful cities in the world, strewn with gilded palaces, canals, world-class museums (not just the Hermitage), and everywhere mementos of the great poets, novelists, musicians and scientists who built up its fame.

Hosted by ITMO National Research University, the conference will be held in the magnificent building of the Razumovsky Palace on the banks of the Moika river; see here.

The Call for Papers has a deadline of March 1st, so there is still plenty of time to polish your best paper and send it to ESEC/FSE. There is also still time to propose worskhops and other associated events. ESEC/FSE will be a memorable moment for the community and we hope to see many of the readers there.

VN:F [1.9.10_1130]
Rating: 9.7/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Negative variables and the essence of object-oriented programming (new paper)

In modeling object-oriented programs, for purposes of verification (proofs) or merely for a better understanding, we are faced with the unique “general relativity” property of OO programming: all the operations you write (excluding non-OO mechanisms such as static functions) are expressed relative to a “current object” which changes repeatedly during execution. More precisely at the start of a call x.r (…) and for the duration of that call the current object changes to whatever x denotes — but to determine that object we must again interpret x in the context of the previous current object. This raises a challenge for reasoning about programs; for example in a routine the notation f.some_reference, if f is a formal argument, refers to objects in the context of the calling object, and we cannot apply standard rules of substitution as in the non-OO style of handling calls.

In earlier work [1, 2] initially motivated by the development of the Alias Calculus, I introduced a notion of negative variable to deal with this issue. During the execution of a call x.r (…) the negation of x , written x’, represents a back pointer to the calling object; negative variables are characterized by axiomatic properties such as x.x’= Current and x’.(old x)= Current. Alexander Kogtenkov has implemented these ideas and refined them.

Negative variable as back pointer

In a recent paper under submission [3], we review the concepts and applications of negative variables.


[1] Bertrand Meyer: Steps Towards a Theory and Calculus of Aliasing, in International Journal of Software and Informatics, 2011, available here.

[2] Bertrand Meyer: Towards a Calculus of Object Programs, in Patterns, Programming and Everything, Judith Bishop Festschrift, eds. Karin Breitman and Nigel Horspool, Springer-Verlag, 2012, pages 91-128, available here.

[3] Bertrand Meyer and Alexander Kogtenkov: Negative Variables and the Essence of Object-Oriented Programming, submitted for publication, 2012. [Updated 13 January 2014: I have removed the link to the draft mentioned in this post since it is now superseded by the new version, soon to be published, and available here.]

VN:F [1.9.10_1130]
Rating: 9.5/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 5 votes)

A fundamental duality of software engineering

A couple of weeks ago I proposed a small quiz. (I also stated that the answer would come “on Wednesday” — please understand any such promise as “whenever I find the time”. Sorry.) Here is the answer.

The quiz was:

I have a function:

  • For 0 it yields 0.
  • For 1 it yields 1.
  • For 2 it yields 4.
  • For 3 it yields 9.
  • For 4 it yields 16.

What is the value for 5?

Shortly thereafter I added a hint: the value for 5 is 25, and changed the question to: “What is the value for 6?”. For good measure we can also ask about the value for 1000. Now compare your answer to  what follows.

A good answer for the value at 6 is: 34 . The function in this case is -10 + 5 x + |2 x – 3| + |2 x -7|. It matches the values for the given inputs.

Linear, small values










The value for 1000 is 8980:

Linear function, full range










Another good answer at position 6 is 35.6. It comes up if we assume the function is over reals rather than integers; then a possible formula, which correlates very well (R-square of 0.9997) with the values at the given inputs, is:

869.42645566111 (1 – 0.4325853145802 e-.0467615868913719  (x – 17.7342512233011))2.3116827277657443

Exponential function, initial range











with a quite different asymptotic behavior, giving the value 869.4 at position 1000:

Exponential, full range











Some readers might have thought of another possibility, the square function x2, which again matches all the given values:

Square function, initial range











So which of these answers is right? Each is as good as the others, and as bad. There is in particular no reason to believe that the values given in the quiz’s statement suggest the square function. Any function that fits the given values, exactly (if we stick to integers) or approximately (with reals as simulated on a computer) is an equally worthy candidate. Six inputs, or six thousand, do not resolve the question. At best they are hints.

This difference between a hint and a solution is at the core of software engineering. It is, for example, the difference between a test and a specification. A test tells us that the program works for some values; as Dijkstra famously pointed out, and anyone who has developed a serious program has experienced, it does not tell us that it will work for others. The more successful tests, the more hints; but they are still only hints. I have always wondered whether Dijkstra was explicitly thinking of the Popperian notion of falsifiability: no number of experiments will prove a physical theory (although a careful experiment may boost the confidence in the theory, especially if competing theories fail to explain it, as the famous Eddington expedition did for relativity in 1919 [1]); but a single experiment can disprove a theory. Similarly, being told that our function’s value at 6 is 34 disqualifies the square function and the last one (the exponential), but does not guarantee that the first function (the linear combination) is the solution.

The specification-testing duality is the extension to computer science of the basic duality of logic. It starts with the elementary boolean operators: to prove a or b it suffices to establish a or to establish b; and to disprove a and b it suffices to show that a does not hold or to show that b does not hold. The other way round, to disprove a or b we have to show that a does not hold and to show that b does not hold; to prove that a and b holds, we have to show that a holds and to show that b holds.

Predicate calculus generalizes or to , “there exists”, and and to , “for all”. To prove ∃ x | p (x) (there is an x of which p holds) it suffices to find one value a such that p (a); let’s be pretentious and say we have “skolemized” x. To disprove∀ x | p (x) (p holds of all x) it suffices to find one value for which p does not hold.

In software engineering the corresponding duality is between proofs and tests, or (equivalently) specifications and use cases. A specification is like a “for all”: it tells us what must happen for all envisioned inputs. A test is like a “there exists”: it tells us what happens for a particular input and hence, as in predicate calculus, it is interesting as a disproof mechanism:

  • A successful test brings little information (like learning the value for 5 when trying to figure out what a function is, or finding one true value in trying to prove a or a false value in trying to prove a ).
  • An unsuccessful test brings us decisive information (like a false value for a ): the program is definitely not correct. It skolemizes incorrectness.

A proof, for its part, brings the discussion to an end when it is successful. In practice, testing may still be useful in this case, but only testing that addresses issues not covered by the proof:

  • Correctness of the compiler and platform, if not themselves proved correct.
  • Correctness the proof tools themselves, since most practical proofs require software support.
  • Aspects not covered by the specification such as, typically, performance and usability.

But for the properties it does cover the proof is final.

It is as foolish, then, to use tests in lieu of specifications as it would be to ignore the limitations of a proof. Agile approaches have caused much confusion here; as often happens in the agile literature [2], the powerful insight is mixed up with harmful advice. The insight, which has significantly improved the practice of software development, is that the regression test suite is a key asset of a project and that tests should be run throughout. The bad advice is to ditch upfront requirements and specifications in favor of tests. The property that tests lack and specifications possess is generality. A test is an instance; a thousand tests can never be more than a thousand instances. As I pointed out in a short note in EiffelWorld (the precursor to this blog) a few years ago [3], the relationship is not symmetric: one can generate tests from a specification, but not the other way around.

The same relationship holds between use cases and requirements. It is stunning to see how many people think that use cases (scenarios) are a form of requirements. As requirements they are as useless as one or ten values are to defining a function. Use cases are a way to complement the requirements by describing the system’s behavior in selected important cases. A kind of reality check, to ensure that whatever abstract aims have been defined for the system it still covers the cases known to be of immediate interest. But to rely on use cases as requirements means that you will get a system that will satisfy the use cases — and possibly little else.

When I use systems designed in recent years, in particular Web-based systems, I often find myself in a stranglehold: I am stuck with the cases that the specifiers thought of. Maybe it’s me, but my needs tend, somehow, to fall outside of these cases. Actually it is not just me. Not long ago, I was sitting close to a small-business owner who was trying to find her way through an insurance site. Clearly the site had a planned execution path for employees, and another for administrators. Problem: she was both an employee and the administrator. I do not know how the session ended, but it was a clear case of misdesign: a system built in terms of standard scenarios. Good specification performs an extra step of abstraction (for example using object-oriented techniques and contracts, but this is for another article). Skipping this step means forsaking the principal responsibility of the requirements phase: to generalize from an analysis of the behavior in known cases to a definition of the desired behaviors in all relevant cases.

Once more, as everywhere else in computer science [4], abstraction is the key to solid results that stand the test of time. Definitely better than judging a book by its cover, inferring a function by its first few values, verifying a program by its tests, or specifying a system by its use cases.


[1] See e.g. a blog article: Einstein and Eddington, here.

[2] Bertrand Meyer: Agile! The Good, the Hype and the Ugly, 2013, to appear.

[3] Bertrand Meyer: Test or spec? Test and spec? Test from spec!, EiffelWorld column, 2004 available here.

[4] Jeff Kramer: Is Abstraction the Key to Computer Science?, in Communications of the ACM, vol. 50, no. 4, April 2007, pages 36-42,  available from CiteSeer here

VN:F [1.9.10_1130]
Rating: 9.5/10 (32 votes cast)
VN:F [1.9.10_1130]
Rating: +13 (from 13 votes)

The manhood test


I came across an obscure and surprisingly interesting article by Cliff Jones [1], about the history of rely-guarantee but with the following extract:

It was perhaps not fully appreciated at the time of [Hoare’s 1969 axiomatic semantics paper] that the roles of pre and post conditions differ in that a pre condition gives permission to a developer to ignore certain possibilities; the onus is on a user to prove that a component will not be initiated in a state that does not satisfy its pre condition. In contrast a post condition is an obligation on the code that is created according to the specification. This Deontic view carries over [to rely-guarantee reasoning].

I use words more proletarian than “deontic”, but this view is exactly what stands behind the concepts of Design by Contract and has been clearly emphasized in all Eiffel literature ever since the first edition of OOSC. It remains, however, misunderstood outside of the Eiffel community; many people confuse Design by Contract with its opposite, defensive programming. The criterion is simple: if you have a precondition to a routine, are you willing entirely to forsake the corresponding checks (conditionals, exceptions…) in the routine body? If not, you may be using the word “contract” as a marketing device, but that’s all. The courage to remove the checks is the true test of adulthood.

The application of Microsoft’s “Code Contracts” mechanism to the .NET libraries fails that test: a precondition may say “buffer not full” or “insertions allowed”, but the code still checks the condition and triggers an exception. The excuse I have heard is that one cannot trust those unwashed developers. But the methodological discipline is lost. Now let me repeat this using clearer terminology: it’s not deontic.


[1] Cliff Jones: The role of auxiliary variables in the formal development of concurrent programs, in Reflections on the work of C. A. R. Hoare, eds. Jones, Roscoe and Wood, Springer Lecture Notes in Computer Science,  2009, technical report version available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Domain Theory: precedents

Both Gary Leavens and Jim Horning commented (partly here, partly on Facebook) about my Domain Theory article [1] to mention that Larch had mechanisms for domain modeling and specification reuse. As Horning writes:

The Larch Shared Language was really all about creating reusable domain theories, including theorems about the domains.  See, for example [2] and [3].

I am honored that they found the time to write about the article and happy to acknowledge Larch, one of the most extensive efforts, over several decades, to provide serious notations and tools for specification. Leavens’s and Horning’s messages gave me the opportunity to re-read some Larch papers and discover a couple I did not know.

My article did not try to provide exhaustive references; if it had, Larch would have been among them. I would probably have cited my own paper on M [4], earlier than [3], which introduces a notation for composing specifications; see section 1.4 (“Features of the M method and the associated notation have thus been devised to allow for modular descriptions of systems. A system description may include an interface paragraph that describes the connection of the current specification with others, existing or yet to be written”) and the  presentation of these mechanisms in section 5.

Larch traits, described in [3], pursue a similar aim, but the earlier article cited by Horning [2] is a general, informal discussion of formal specification; it does not mention traits, and in fact does not cite Larch, stating instead “We have experimented with the use of two very different tools, PIE and Affirm, in constructing modest sized algebraic specifications”. Its general observations about the specification task remain useful today, and it does mention reuse in passing.

If we were to look for precedents, the basic source would have to be the Clear specification language of Goguen and Burstall, for which the citations [5, 6, 7] all appear in my M paper [4] and go back further: 1977-1981. Clear made a convincing case for modularizing specifications, and defined supporting language constructs.

Since these early publications, many people have come to realize that reuse and composition can be as useful on the specification side as they are for programming. Typical specification and verification techniques, however, do not take advantage of this idea and tend to make us restart every time from the lowest level. Domain Theory, as outlined in [1], is intended to bring abstraction, which has proved so beneficial in other parts of software engineering, to the world of specification.


[1] Domain Theory: The Forgotten step in program verification, an article in this blog, see here.

[2] John V. Guttag, James J. Horning, Jeannette M. Wing: Some Notes on Putting Formal Specifications to Productive Use, in Science of Computer Programming, vol. 2, no. 1, 1982, pages 53-68. (BM note: I found a copy here.)

[3] John V. Guttag, James J. Horning: A Larch Shared Language Handbook, in Science of Computer Programming, vol. 6, no. 2, 1986, pages 135-157. (BM note: I found a copy here, which also has a link to the Larch report.)

[4] Bertrand Meyer: M: A System Description Method, Technical Report TR CS 85-15, University of California, Santa Barbara, 1985, available here.

[5] Rod M. Burstall and Joe A. Goguen: Putting Theories Together to Make Specifications, in Proceedings of 5th International Joint Conference on Artificial Intelligence, Cambridge (Mass.), 1977, pages 1045- 1058.

[6] Rod M. Burstall and Joe A. Goguen: “The Semantics of Clear, a Specification Language,” in Proceedings of Advanced Course on Abstract Software Specifications, Copenhagen, Lecture Notes on Computer Science 86, Copenhagen, Springer-Verlag, 1980, pages 292-332, available here.

[7] Rod M. Burstall and Joe A. Goguen: An Informal Introduction to Specifications using Clear, in The Correctness Problem in Computer Science, eds R. S. Boyer and JJ. S. Moore, Springer-Verlag, 1981, pages 185-213.

VN:F [1.9.10_1130]
Rating: 8.8/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 5 votes)

Domain Theory: the forgotten step in program verification


Program verification is making considerable progress but is hampered by a lack of abstraction in specifications. A crucial step is, almost always, absent from the process; this omission is the principal obstacle to making verification a standard component of everyday software development.

Steps in software verification

In the first few minutes of any introduction to program verification, you will be told that the task requires two artifacts: a program, and a specification. The program describes what executions will do; the specification, what they are supposed to do. To verify software is to ascertain that the program matches the specification: that it does is what it should.

The consequence usually drawn is that verification consists of three steps: write a specification, write a program, prove that the program satisfies the specification. The practical process is of course messier, if only because the first two steps may occur in the reverse order and, more generally, all three steps are often intertwined: the specification and the program influence each other, in particular through the introduction of “verification conditions” into the program; and initial proof attempts will often lead to changes in both the specification and the program. But by and large these are the three accepted steps.

Such a description misses a fourth step, a prerequisite to specification that is essential to a scalable verification process: Domain Theory. Any program addresses a specific domain of discourse, be it the domain of network access and communication for a mobile phone system, the domain of air travel for a flight control system, of companies and shares for a stock exchange system and so on. Even simple programs with a limited scope, such as the computation of the maximum of an array, use a specific domain beyond elementary mathematics. In this example, it is the domain of arrays, with their specific properties: an array has a range, a minimum and maximum indexes in that range, an associated sequence of values; we may define a slice a [i..j], ask for the value associated with a given index, replace an element at a given index and so on. The Domain Theory provides a formal model for any such domain, with the appropriate mathematical operations and their properties. In the example the operations are the ones just mentioned, and the properties will include the axiom that if we replace an element at a certain index i with a value v then access the element at an index j, the value we get is v if i = j, and otherwise the earlier value at j.

The role of a Domain Theory

The task of devising a Domain Theory is to describe such a domain of reference, in the spirit of abstract data types: by listing the applicable operations and their properties. If we do not treat this task as a separate step, we end up with the kind of specification that works for toy examples but quickly becomes unmanageable for real-life applications. Most of the verification literature, unfortunately, relies on such specifications. They lack abstraction since they keep using the lowest-level mathematical objects and constructs, such as numbers and quantified expressions. They are to specification what assembly language is to modern programming.

Dines Bjørner has for a long time advocated a closely related idea, domain engineering; see for example his book in progress [1]. Unfortunately, he does not take advantage of modularization through abstract data types; the book is an example of always-back-to-the-basics specification, resorting time and again to fully explicit specifications based on a small number of mathematical primitives, and as a consequence making formal specification look difficult.

Maximum computed from both ends

As a simple example of modeling through an abstract theory consider an algorithm for computing the maximum of an array. We could use the standard technique that goes through the array one-way, but for variety let us take the algorithm that works from both ends, moving two integer cursors towards each other until they meet.  (This example was used in a verification competition at a recent conference, I forgot which one.) The code looks like this:

Two-way maximum

The specification, expressed by the postcondition (ensure) should state that Result is the maximum of the array; the loop invariant will be closely related to it. How do we express these properties? The obvious way is not the right way. It states the postcondition as something like

k: Z | (ka.lowerka.upper) a [k] ≤ Result

k: Z | ka.lowerka.upper a [k] = Result

In words, Result is at least as large as every element of the array, and is equal to at least one of the elements of the array. The invariant can also be expressed in this style (try it).

The preceding specification expresses the desired property, but it is of an outrageously lower level than called for. The notion of maximum is a general one for arrays over an ordered type. It can be computed through many different algorithms in addition to the one shown above, and exists independently of these algorithms. The detailed, assembly-language-like definition of its properties should not have to be repeated in every case. It should be part of the Domain Theory for the underlying notion, arrays.

A specification at the right level of abstraction

In a Domain Theory for arrays of elements from an ordered set, one of the principal operations is maximum, satisfying the above properties. The definition of maximum through these properties belongs at the Domain Theory level. The Domain Theory should include that definition, independent of any particular computational technique such as two_way_max. Then the routine’s postcondition, relying on this notion from the Domain Theory, becomes simply

Result = a.maximum

The application of this approach to the loop invariant is particularly interesting. If you tried to write it at the lowest level, as suggested above, you should have produced something like this:


k: Z | kikj ∧ (∀ l: Z | l a.lowerl a.upper a [l] ≤ a [k])

The first clause is appropriate but the rest is horrible! With its nested quantified expressions it gives an impression of great complexity for a property that is in fact straightforward, simple enough in fact to be explained to a 10-year-old: the maximum of the entire array can be found between indexes i and j. In other words, it is also the maximum of the array slice going from i to j. The Domain Theory will define the notion of slice and enable us to express the invariant as just

a.lowerij a.upper — This bounding clause remains

a.maximum = (a [i..j ]).maximum

(where we will write the slice a [i..j ] as a.slice (i, j ) if we do not have mechanisms for defining special syntax). To verify the routine becomes trivial: on loop exit the invariant still holds and i = j, so the maximum of the entire array is given by the maximum of the single-element slice a [i..i ], which is the value of its single element a [i ]. This last property — the maximum of a single-element array is its single value — is independent of the verification of any particular program and should be proved as a little theorem of the Domain Theory for arrays.

The comparison between the two versions is striking: without Domain Theory, we are back to the most tedious mathematical manipulations again and again; simple, clear properties look complicated and obscure. This just for a small example on basic data structures; now think what it will be for a complex application domain. Without a first step of formal modeling to develop a Domain Theory, no realistic specification and verification process is realistic.

Although the idea is illustrated here through examples of individual routines, the construction of a Domain Theory should usually occur, in an object-oriented development process, at the level of a class: the embodiment of an abstract data type, which is at the appropriate level of granularity. The theory applies to objects of a given type, and hence will be used for the verification of all operations of that type. This observation justifies the effort of devising a Domain Theory, since it will benefit a whole set of software elements.

Components of a Domain Theory

The Domain Theory should include the three ingredients illustrated in the example:

  • Operations, modeled as mathematical functions (no side effects of course, we are in the world of specification).
  • Axioms characterizing the defining properties of these operations.
  • Theorems, characterizing other important properties.

This approach is of course nothing else than abstract data types (the same thing, although few people realize it, as object-oriented analysis). Even though ADTs are a widely popularized notion, supported for example by tools such as CafeOBJ [2] and Maude [3], it is generally not taken to its full conclusions; in particular there is too often a tendency to define every new ADT from scratch, rather than building up libraries of reusable high-level mathematical components in the O-O spirit of reuse.

Results, not just definitions

In devising a Domain Theory with the three kinds of ingredient listed above, we should not forget the last one, the theorems! The most depressing characteristic of much of the work on formal specification is that it is long on definitions and short on results, while good mathematics is supposed to be the reverse. I think people who have seriously looked at formal methods and do not adopt them are turned off not so much by the need to use mathematics but by the impression they get little value for it.

That is why Eiffel contracts do get adopted: even if it’s just for testing and debugging, people see immediate returns. It suffices for a programmer to have caught one bug as the violation of a simple postcondition to be convinced for life and lose any initial math-phobia.

Quantifiers are evil

As we go beyond simple contract properties — this argument must be positive, this reference will not be void — the math needs to be at the same level of abstraction to which, as modern programmers, we are accustomed. For example, one should always be wary of program specifications relying directly on quantified expressions, as in the low-level variants of the postcondition and loop invariant of the two_way_max routine.

This is not just a matter of taste, as in the choice in logic [4] between lambda expressions (more low-level but also more immediately understandable) and combinators (more abstract but, for many, more abstruse). We are talking here about the fundamental software engineering problem of scalability; more generally, of the understandability, extendibility and reusability of programs, and the same criteria for their specification and verification. Quantifiers are of course needed to express fundamental properties of a structure but in general should not directly appear in program assertions: as the example illustrated, their level of abstraction is lower than the level of discourse of a modern object-oriented program. If the rule — Quantifiers Considered Harmful — is not absolute, it must be pretty close.

Quantified expressions, “All elements of this structure possess this property” and “Some element of this structure possesses this property” — belong in the description of the structure and not in the program. They should appear in the Domain Theory, not in the verification. If you want to express that a hash table search found an element of key K, you should not write

(Result = Void ∧ (∀ i: Z | i a.loweri a.upper a.item (i).key ≠ K))

(ResultVoid ∧ (∀ i: Z | i a.loweri a.upper a.item (i).key = K ∧ Result = a.item (i))


Result /= Void     (Result a.elements_of_key (K))

The quantified expressions will appear in the Domain Theory for the corresponding structure, in the definition of such domain properties as elements_of_key. Then the program’s specification — the contracts to be verified — can rely on concepts that make sense to the programmer; the verification will take advantage of theorems that have been proved independently since they belong to the Domain Theory and do not depend on individual programs.

Even the simplest examples…

Practical software verification requires Domain Theory even in the simplest cases, including those often used as purely academic examples. Perhaps the most common (and convenient) way to explain the notion of loop invariant is Euclid’s algorithm to compute the greatest common divisor (gcd) of two numbers (with a structure remarkably similar to that of two_way_max):

I have expressed the postcondition using a concept from an assumed Domain Theory for the underlying problem: gcd, the mathematical function that yields the greatest common divisor of two integers. Many specifications I have seen go back to the basics, with something like this (using \\ for integer remainder):

a \\ Result = 0 b \\ Result = 0   ∀ i: N | (a \\ i = 0) ∧ (b \\ i = 0)  i Result

This is indeed the definition of what it means for Result to be the gcd of a and b (it divides a, it divides b, and is greater than any other integer that also has these two properties). But it makes no sense to include such a detailed mathematical property in the specification of a program element. It belongs in the domain theory, where it will serve as the definition of a function gcd, which we can then use directly in the specification of the program.

Note how the invariant makes the necessity of the Domain Theory approach even more clear: try to express it in the basic mathematical form, not using the function gcd, It can be done, but the result is typical of the high complexity to usefulness ratio of traditional formal specifications mentioned above. Instead, the invariant that I have included in the program text above says exactly what there is to say, clearly and concisely: at each iteration, the gcd of our two temporary values, i and j, is the result that we are seeking, the gcd of the original values a and b. On exit from the loop, when i and j are equal, their common value is that result.

It is also thanks to the Domain Theory modeling that the verification of the program — consisting of proving that the stated property is indeed invariant — will be so simple: as part of the theory, we should have the two little theorems

i > j > 0 gcd (i, j) = gcd (ij, j)
(i, i) = i

which immediately show the implementation to be correct.

Inside of any big, fat, messy, quantifier-ridden specification there is a simple, elegant and clear Domain-Theory-based specification desperately trying to get out. Find it and use it.

From Domain Theory to domain library

One of the reasons most people working on program verification have not used the division into levels of discourse described here, with a clear role for developing a Domain Theory, is that they lack the appropriate notational support. Mathematical notation is of course available, but we are talking about programs a general verification framework cannot resort to a new special notation for every new application domain.

This is one of the places where Eiffel provides a consistent solution, through its seamless approach to integrating programs and specifications in a single notation. Thanks to mechanisms such as deferred classes (classes that describe concepts through detailed specifications without committing to an implementation), Eiffel is as much for specification as for design and implementation; a Domain Theory can be expressed though a set of deferred Eiffel classes, which we may call a domain library. The classes in a domain library should not just be deferred, meaning devoid of implementation; they should in addition describe stateless operations only — queries, not commands — since they are modeling purely mathematical concepts.

An earlier article in this blog [5] outlined the context of our verification work: the EVE project (Eiffel Verification Environment), a practical approach to integrating software verification in the day-to-day practice of modern software development, with the slogan ““Verification As a Matter Of Course”. In this project we have applied the idea of Domain Theory by building a domain library covering fundamental concepts of set theory, including functions and relations. This is the Mathematical Model Library (MML) [6, 7], which we use to verify the new data structure library EiffelBase 2 using specifications at the appropriate level of abstraction.

MML is in fact useful for the specification of a wide variety of programs, since almost every application area can benefit from the general concepts of set, subset, relation and such. But to cover a specific application domain, say flight traffic control, MML will generally not suffice; you will need to devise a Domain Theory that mathematically models the target domain, and may express it in the form of a domain library written in the same general spirit as MML: all deferred, stateless, focused on high-level abstractions.

It is one of the attractions of Eiffel that you can express such a theory and library in the same notation as the programs that will use it — more precisely in a subset of that notation, since the specification classes do not need the imperative constructs of the language such as instructions and attributes. Then both the development process and the verification use a seamlessly integrated set of notations and techniques, and all use the same tools from a modern IDE, in our case EiffelStudio, for browsing, editing, working with graphical repreentation, metrics etc.

DSL libraries for specifications

A mechanism to express Domain Theories is to a general specification mechanism essentially like a Domain Specific Language (DSL) is to a general programming language: a specialization for a particular domain. Domain libraries make the approach practical by:

  • Embedding the specification language in the programming language.
  • Fundamentally relying on reuse, in the best spirit of object technology.

This approach is in line with the one I presented for handling DSLs in an earlier article of this blog [8] (thanks, by the way, for the many comments received, some of them posted here and some on Facebook and LinkedIn where the post triggered long discussions). It is usually a bad idea to invent a new language for a new application domain. A better solution is to rely on libraries, by taking advantage of the power of object-oriented mechanisms to model (in domain libraries) and implement (for DSLs) the defining features of such a domain, and to make the result widely reusable. The resulting libraries are purely descriptive in the case of a domain library expressing a Domain Theory, and directly usable by programs in the case of a library embodying a DSL, but the goal is the same.

A sound and necessary engineering practice

Many ideas superficially look similar to Domain Theory: domain engineering as mentioned above, “domain analysis” as widely discussed in the requirements literature, model-driven development, abstract data type specification… They all start from some of the same observations, but  Domain Theory as described in this article is something different: a systematic approach to modeling an arbitrary application domain mathematically, which:

  • Describes the concepts through applicable operations, axioms and (most importantly) theorems.
  • Expresses these elements in an applicative (side-effect free, i.e. equivalent to pure mathematics) subset of the programming language, for direct embedding in program specifications.
  • Relies on the class mechanism to structure the results.
  • Collects the specifications into specification libraries and promotes the reuse of specifications in the same way we promote software reuse.
  • Uses the combination of these techniques to ensure that program specifications are at a high level of abstraction, compatible with the programmers’ view of their software.
  • Promotes a clear and effective verification process.

The core idea is in line with standard engineering practices in disciplines other than software: to build a bridge, a car or a chip you need first to develop a sound model of the future system and its environment, using any useful models developed previously rather than always going back to elementary textbook mathematics.

It seems in fact easier to justify doing Domain Analysis than to justify not doing it. The power of expression and abstraction of our programs has grown by leaps and bounds; it’s time for our specifications to catch up.


[1] Dines Bjørner: From Domains to Requirements —The Triptych Approach to Software Engineering, “to be submitted to Springer”, available here.

[2] Kokichi Futatsugi and others: CafeObj page, here.

[3] José Meseguer and others: Maude publication page, here.

[4] J. Roger Hindley, J. P. Seldin: Introduction to Combinators and l-calculus, Cambridge University Press, 1986.

[5] Verification As a Matter Of Course, earlier article on this blog (March 2010), available here.

[6] Bernd Schoeller, Tobias Widmer and Bertrand Meyer. Making specifications complete through models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, pages 48-70, available here.

[7] Nadia Polikarpova, Carlo A. Furia and Bertrand Meyer: Specifying Reusable Components, in VSTTE’10: Verified Software: Theories, Tools and Experiments, Edinburgh, August 2010, Lecture Notes in Computer Science, Springer-Verlag, available here.

[8] Never Design a Language, earlier article on this blog (January 2012), available here.

VN:F [1.9.10_1130]
Rating: 9.3/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +7 (from 7 votes)

Aliasing and framing: Saint Petersburg seminar next week

In  last Thursday’s session of the seminar, Kokichi Futatsugi’s talk took longer than planned (and it would have been a pity to stop him), so I postponed my own talk on Automatic inference of frame conditions through the alias calculus to next week (Thursday local date). As usual it will be broadcast live.

Seminar page: here, including the link to follow the webcast.

Time and date: 5 April 2012, 18 Saint Petersburg time; you can see the local time at your location here.


Frame specifications, the description of what does not change in a routine call, are one of the most annoying components of verification, in particular for object-oriented software. Ideally frame conditions should be inferred automatically. I will present how the alias calculus, described in recent papers, can address this need.

There may be a second talk, on hybrid systems, by Sergey Velder.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Seminar sessions in Saint Petersburg: CafeOBJ and the frame issue

The Saint Petersburg software engineering seminar has two sessions today (29 March 2012, 18 local time, see here for the date and time in your area), broadcast live:

  • By Kokichi Futatsugi from KAIST (Japan): Combining Inference and Search in Verification with CafeOBJ.
  • By me: Automatic inference of frame conditions through the alias calculus.

See details including the link for the live webcast on the seminar page. The page also includes links to video recordings of recent sessions.

VN:F [1.9.10_1130]
Rating: 8.5/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

New LASER proceedings

Springer has just published in the tutorial sub-series of Lecture Notes in Computer Science a new proceedings volume for the LASER summer school [1]. The five chapters are notes from the 2008, 2009 and 2010 schools (a previous volume [2] covered earlier schools). The themes range over search-based software engineering (Mark Harman and colleagues), replication of software engineering experiments (Natalia Juristo and Omar Gómez), integration of testing and formal analysis (Mauro Pezzè and colleagues), and, in two papers by our ETH group, Is branch coverage a good measure of testing effectiveness (with Yi Wei and Manuel Oriol — answer: not really!) and a formal reference for SCOOP (with Benjamin Morandi and Sebastian Nanz).

The idea of these LASER tutorial books — which are now a tradition, with the volume from the 2011 school currently in preparation — is to collect material from the presentations at the summer school, prepared by the lecturers themselves, sometimes in collaboration with some of the participants. Reading them is not quite as fun as attending the school, but it gives an idea.

The 2012 school is in full preparation, on the theme of “Advanced Languages for Software Engineering” and with once again an exceptional roster of speakers, or should I say an exceptional roster of exceptional speakers: Guido van Rossum (Python), Ivar Jacobson (from UML to Semat), Simon Peyton-Jones (Haskell), Roberto Ierusalimschy (Lua), Martin Odersky (Scala), Andrei Alexandrescu (C++ and D),Erik Meijer (C# and LINQ), plus me on the design and evolution of Eiffel.

The preparation of LASER 2012 is under way, with registration now open [3]; the school will take place from Sept. 2 to Sept. 8 and, like its predecessors, in the wonderful setting on the island of Elba, off the coast of Tuscany, with a very dense technical program but time for enjoying the beach, the amenities of a 4-star hotel and the many treasures of the island. On the other hand not everyone likes Italy, the sun, the Mediterranean etc.; that’s fine too, you can wait for the 2013 proceedings.


[1] Bertrand Meyer and Martin Nordio (eds): Empirical Software Engineering and Verification, International Summer Schools LASER 2008-2010, Elba Island, Italy, Revised Tutorial Lectures, Springer Verlag, Lecture Notes in Computer Science 7007, Springer-Verlag, 2012, see here.

[2] Peter Müller (ed.): Advanced Lectures on Software Engineering, LASER Summer School 2007-2008, Springer Verlag, Lecture Notes in Computer Science 7007, Springer-Verlag, 2012, see here.

[3] LASER summer school information and registration form,

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

ERC Advanced Investigator Grant: Concurrency Made Easy

In April we will be starting the  “Concurrency Made Easy” research project, the result of a just announced Advanced Investigator Grant from the European Research Council. Such ERC grants are awarded to a specific person, rather than a consortium of research organizations as in the usual EU funding scheme. The usual amount, which applies in my case, is 2.5 million euros (currently almost 3 .3 million dollars) over five years, on a specific theme. According to the ERC’s own description [1],

ERC Advanced Grants allow exceptional established research leaders of any nationality and any age to pursue ground-breaking, high-risk projects that open new directions in their respective research fields or other domains.

This is the most sought-after research funding instrument of the EU, with a success rate of about 12% [2], out of a group already preselected by the host institutions. What makes ERC Advanced Investigator Grants so coveted is the flexibility of the scheme (no constraints on the topic, light administrative baggage) and the trust that an award implies in a particular researcher and his ability to carry out advanced research.

The name of the CME project clearly signals its ambition: to turn concurrent programming into a normal, unheroic part of programming. Today adding concurrency to a program, usually in the form of multithreading, is very hard, complexity and risk of all kinds. Everyone is telling us that we must rethink programming, retrain programmers and revamp curricula to put the specific reasoning modes of concurrent programming at the center. I don’t think this can work; thinking concurrently is just too hard to become the default mode. Instead, we should adapt programming languages, theories and tools so that programmers can continue to apply the reasoning schemes that have proved so successful in classical programming, especially object-oriented programming with the benefit of Design by Contract.

The starting point is the SCOOP model, to which I started an introduction in an earlier article of this blog [3], with a sequel yet to come. SCOOP is a minimal extension to the O-O framework to support concurrency, yielding very simple (the S in the acronym) solutions to concurrent programming problems. As part of the CME project we plan to develop it in many different directions and establish a sound and effective formal basis.

I have put the project description — the scientific part of the actual proposal text accepted by the ERC — online [4].

In the next few weeks I will be publishing here specific announcements for the positions we are seeking to fill very quickly; they include postdocs, PhD students, and one research engineer. We are looking for candidates with excellent knowledge and practice of concurrency, Eiffel, formal techniques etc. The formal application procedure will be Web-based and is not in place yet but you can contact me if you fit the profile and are interested.

We can defeat the curse: concurrent programming (an obligatory condition of any path towards a successful future for information technology) does not have to be black magic. It can be made simple and efficient. Such is the challenge of the CME project.


[1] European Research Council: Advanced Grants, available here.

[2] European Research Council: Press release on 2011 Advanced Investigator Grants, 24 January 2012, available here.

[3] Concurrent Programming is Easy, article from this blog, available here.

[4] CME Advanced Investigator Grant project description, available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

R&D positions in software engineering

In the past couple of weeks, three colleagues separately sent me announcements of PhD and postdoc positions, asking me to circulate them. This blog is as good a place as any other (as a matter of fact when I asked Oge de Moor if it was appropriate  he replied: “Readers of your blog are just the kind of applicants we look for…” — I hope you appreciate the compliment). So here they are:

  • Daniel Jackson, for work on Alloy, see here. (“Our plan now is to take Alloy in new directions, and dramatically improve the usability and scalability of the analyzer. We are looking for someone who is excited by these possibilities and will be deeply involved not only in design and implementation but also in strategic planning. There are also opportunities to co-advise students and participate in research proposals.”)
  • University College London (Anthony Finkelstein), a position as lecturer, similar to assistant professor), see here. The title is “Lecturer in software testing” (which, by the way, seems a bit restrictive, as I am more familiar with an academic culture in which the general academic discipline is set, say software engineering or possibly software verification, rather than a narrow specialty, but I hope the scope can be broadened).
  • Semmle, Oge de Moor’s company, see here.

These are all excellent environments, so if you are very, very good, please consider applying. If you are very, very, very good don’t, as I am soon going to post a whole set of announcements for PhD and postdoc positions at ETH in connection with a large new grant.


VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

TOOLS 2012, “The Triumph of Objects”, Prague in May: Call for Workshops

Workshop proposals are invited for TOOLS 2012, The Triumph of, to be held in Prague May 28 to June 1. TOOLS is a federated set of conferences:

  • TOOLS EUROPE 2012: 50th International Conference on Objects, Models, Components, Patterns.
  • ICMT 2012: 5th International Conference on Model Transformation.
  • Software Composition 2012: 10th International Conference.
  • TAP 2012: 6th International Conference on Tests And Proofs.
  • MSEPT 2012: International Conference on Multicore Software Engineering, Performance, and Tools.

Workshops, which are normally one- or two-day long, provide organizers and participants with an opportunity to exchange opinions, advance ideas, and discuss preliminary results on current topics. The focus can be on in-depth research topics related to the themes of the TOOLS conferences, on best practices, on applications and industrial issues, or on some combination of these.


Submission proposal implies the organizers’ commitment to organize and lead the workshop personally if it is accepted. The proposal should include:

  •  Workshop title.
  • Names and short bio of organizers .
  • Proposed duration.
  •  Summary of the topics, goals and contents (guideline: 500 words).
  •  Brief description of the audience and community to which the workshop is targeted.
  • Plans for publication if any.
  • Tentative Call for Papers.

Acceptance criteria are:

  • Organizers’ track record and ability to lead a successful workshop.
  •  Potential to advance the state of the art.
  • Relevance of topics and contents to the topics of the TOOLS federated conferences.
  •  Timeliness and interest to a sufficiently large community.

Please send the proposals to me (Bertrand.Meyer AT, with a Subject header including the words “TOOLS WORKSHOP“. Feel free to contact me if you have any question.


  •  Workshop proposal submission deadline: 17 February 2012.
  • Notification of acceptance or rejection: as promptly as possible and no later than February 24.
  • Workshops: 28 May to 1 June 2012.


VN:F [1.9.10_1130]
Rating: 7.3/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)