Archive for the ‘Design by Contract’ Category.

AutoProof workshop: Verification As a Matter of Course

The AutoProof technology pursues the goal of “Verification As a Matter Of Course”, integrated into the EVE development environment. (The AutoProof  project page here; see particularly the online interactive tutorial.) A one-day workshop devoted to the existing AutoProof and current development will take place on October 1 near Toulouse in France. It is an informal event (no proceedings planned at this point, although based on the submissions we might decide to produce a volume), on a small scale, designed to bring together people interested in making the idea of practical verification a reality.

The keynote will be given by Rustan Leino from Microsoft Research, the principal author of the Boogie framework on which the current implementation of AutoProof relies.

For submissions (or to attend without submitting) see the workshop page here. You are also welcome to contact me for more information.

VN:F [1.9.10_1130]
Rating: 8.2/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Design by Contract: ACM Webinar this Thursday

A third ACM webinar this year (after two on agile methods): I will be providing a general introduction to Design by Contract. The date is this coming Thursday, September 17, and the time is noon New York (18 Paris/Zurich, 17 London, 9 Los Angeles, see here for hours elsewhere). Please tune in! The event is free but requires registration here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

New paper: Theory of Programs

Programming, wrote Dijkstra many years ago, is a branch of applied mathematics. That is only half of the picture: the other half is engineering, and this dual nature of programming is part of its attraction.

Descriptions of the mathematical side are generally, in my view, too complicated. This article [1] presents a mathematical theory of programs and programming based on concepts taught in high school: elementary set theory. The concepts covered include:

  • Programming.
  • Specification.
  • Refinement.
  • Non-determinism.
  • Feasibility.
  • Correctness.
  • Programming languages.
  • Kinds of programs: imperative, functional, object-oriented.
  • Concurrency (small-step and large-step)
  • Control structures (compound, if-then-else and Dijkstra-style conditional, loop).
  • State, store and environment.
  • Invariants.
  • Notational conventions for building specifications and programs incrementally.
  • Loop invariants and variants.

One of the principal ideas is that a program is simply the description of a mathematical relation. The program text is a rendering of that relation. As a consequence, one may construct programming languages simply as notations to express certain kinds of mathematics. This approach is the reverse of the usual one, where the program text and its programming languages are the starting point and the center of attention: theoreticians develop techniques to relate them to mathematical concepts. It is more effective to start from the mathematics (“unparsing” rather than parsing).

All the results (74 properties expressed formally, a number of others in the text) are derived as theorems from rules of elementary set theory; there are no new axioms whatsoever.

The paper also has a short version [2], omitting proofs and many details.

References

[1] Theory of Programs, available here.
[2] Theory of Programs, short version of [1] (meant for quick understanding of the ideas, not for publication), available here.

 

VN:F [1.9.10_1130]
Rating: 6.4/10 (18 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 9 votes)

Framing the frame problem (new paper)

Among the open problems of verification, particularly the verification of object-oriented programs, one of the most vexing is framing: how to specify and verify what programs element do not change. Continuing previous work, this article presents a “double frame inference” method, automatic on both sides the specification and verification sides. There is no need to write frame specifications: they will be inferred from routine postconditions. For verification, the method computes the set of actually changed properties through a “change calculus”, itself based on the previously developed alias calculus.

Some verification techniques, such as Hoare-style proofs, require significant annotation effort and potentially yield full functional verification; others, such as model checking and abstract interpretation, have more limited goals but seek full automation. Framing, in my opinion, should be automatic, freeing the programmer-verifier to devote the annotation effort to truly interesting properties.

Reference

[1] Bertrand Meyer: Framing the Frame Problem, in Dependable Software Systems, Proceedings of August 2014 Marktoberdorf summer school, eds. Alexander Pretschner, Manfred Broy and Maximilian Irlbeck, NATO Science for Peace and Security, Series D: Information and Communication Security, Springer, 2015 (to appear), pages 174-185; preprint available here.

VN:F [1.9.10_1130]
Rating: 8.8/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Lampsort

 

In support of his view of software methodology, Leslie Lamport likes to use the example of non-recursive Quicksort. Independently of the methodological arguments, his version of the algorithm should be better known. In fact, if I were teaching “data structures and algorithms” I would consider introducing it first.

As far as I know he has not written down his version in an article, but he has presented it in lectures; see [1]. His trick is to ask the audience to give a non-recursive version of Quicksort, and of course everyone starts trying to remove the recursion, for example by making the stack explicit or looking for invertible functions in calls. But his point is that recursion is not at all fundamental in Quicksort. The recursive version is a specific implementation of a more general idea.

Lamport’s version — let us call it Lampsort —is easy to express in Eiffel. We may assume the following context:

a: ARRAY [G -> COMPARABLE]        — The array to be sorted.
pivot: INTEGER                                      —  Set by partition.
picked: INTEGER_INTERVAL            — Used by the sorting algorithm, see below.
partition (i, j: INTEGER)
……..require      — i..j is a sub-interval of the array’s legal indexes:
……..……..i < j
……..……..i >= a.lower
……..……..j <= a.upper
……..do
……..……..… Usual implementation of partition
……..ensure     — The expected effect of partition:
……..……..pivot >= i
……..……..pivot < j
……..……..a [i..j] has been reshuffled so that elements in i..pivot are less than
……..……..or equal to those in pivot+1 .. j.
……..end

We do not write the implementation of partition since the point of the present discussion is the overall algorithm. In the usual understanding, that algorithm consists of doing nothing if the array has no more than one element, otherwise performing a partition and then recursively calling itself on the two resulting intervals. The implementation can take advantage of parallelism by forking the recursive calls out to different processors. That presentation, says Lamport, describes only a possible implementation. The true Quicksort is more general. The algorithm works on a set not_sorted of integer intervals i..j such that the corresponding array slices a [i..j] are the only ones possibly not sorted; the goal of the algorithm is to make not_sorted empty, since then we know the entire array is sorted. In Eiffel we declare this set as:

not_sorted: SET [INTEGER_INTERVAL]

The algorithm initializes not_sorted to contain a single element, the entire interval; at each iteration, it removes an interval from the set, partitions it if that makes sense (i.e. the interval has more than one element), and inserts the resulting two intervals into the set. It ends when not_sorted is empty. Here it is:

……..from                                 — Initialize interval set to contain a single interval, the array’s entire index range:
……..…..create not_sorted.make_one (a.lower |..| a.upper)….         ..……..
……..invariant
……..…..— See below
……..until
……..…..not_sorted.is_empty                                                            — Stop when there are no more intervals in set
……..loop
……..…..picked := not_sorted.item                                                     — Pick an interval from (non-empty) interval set.
……..……if picked.count > 1 then                                                      — (The precondition of partition holds, see below.)
……..……..…..partition (picked.lower, picked.upper)                    — Split it, moving small items before and large ones after pivot.
……..……..…..not_sorted.extend (picked.lower |..| pivot)            — Insert new intervals into the set of intervals: first
……..……....not_sorted.extend (pivot + 1 |..| picked.upper)     — and second.
……..……end
……..…...not_sorted.remove (picked)                                               — Remove interval that was just partitioned.
…….end

Eiffel note: the function yielding an integer interval is declared in the library class INTEGER using the operator |..| (rather than just  ..).

The query item from SET, with the precondition not is_empty,  returns an element of the set. It does not matter which element. In accordance with the Command-Query Separation principle, calling item does not modify the set; to remove the element you have to use the command remove. The command extend adds an element to the set.

The abstract idea behind Lampsort, explaining why it works at all, is the following loop invariant (see [2] for a more general discussion of how invariants provide the basis for understanding loop algorithms). We call “slice” of an array a non-empty contiguous sub-array; for adjacent slices we may talk of concatenation; also, for slices s and t s <= t means that every element of s is less than or equal to every element of t. The invariant is:

a is the concatenation of the members of a set slices of disjoint slices, such that:
– The elements of a are a permutation of its original elements.
– The index range of any member  of slices having more than one element is in not_sorted.
– For any adjacent slices s and t (with s before t), s <= t.

The first condition (conservation of the elements modulo permutation) is a property of partition, the only operation that can modify the array. The rest of the invariant is true after initialization (from clause) with slices made of a single slice, the full array. The loop body maintains it since it either removes a one-element interval from not_sorted (slices loses the corresponding slice) or performs partition with the effect of partitioning one slice into two adjacent ones satisfying s <= t, whose intervals replace the original one in not_sorted. On exit, not_sorted is empty, so slices is a set of one-element slices, each less than or equal to the next, ensuring that the array is sorted.

The invariant also ensures that the call to partition satisfies that routine’s precondition.

The Lampsort algorithm is a simple loop; it does not use recursion, but relies on an interesting data structure, a set of intervals. It is not significantly longer or more difficult to understand than the traditional recursive version

sort (i, j: INTEGER)
……..require
……..……..i <= j
……..……..i >= a.lower
……..……..j <= a.upper
……..do
……..……if j > i then                    — Note that precondition of partition holds.
……..……..…..partition (i, j)         — Split into two slices s and t such that s <= t.
……..……..…..sort (i, pivot)          — Recursively sort first slice.
……..……..…..sort (pivot+1, j)      — Recursively sort second slice.
……..……end……..…..
……..end

Lampsort, in its author’s view, captures the true idea of Quicksort; the recursive version, and its parallelized variants, are only examples of possible implementations.

I wrote at the start that the focus of this article is Lampsort as an algorithm, not issues of methodology. Let me, however, give an idea of the underlying methodological debate. Lamport uses this example to emphasize the difference between algorithms and programs, and to criticize the undue attention being devoted to programming languages. He presents Lampsort in a notation which he considers to be at a higher level than programming languages, and it is for him an algorithm rather than a program. Programs will be specific implementations guided in particular by efficiency considerations. One can derive them from higher-level versions (algorithms) through refinement. A refinement process may in particular remove or restrict non-determinism, present in the above version of Lampsort through the query item (whose only official property is that it returns an element of the set).

The worldview underlying the Eiffel method is almost the reverse: treating the whole process of software development as a continuum; unifying the concepts behind activities such as requirements, specification, design, implementation, verification, maintenance and evolution; and working to resolve the remaining differences, rather than magnifying them. Anyone who has worked in both specification and programming knows how similar the issues are. Formal specification languages look remarkably like programming languages; to be usable for significant applications they must meet the same challenges: defining a coherent type system, supporting abstraction, providing good syntax (clear to human readers and parsable by tools), specifying the semantics, offering modular structures, allowing evolution while ensuring compatibility. The same kinds of ideas, such as an object-oriented structure, help on both sides. Eiffel as a language is the notation that attempts to support this seamless, continuous process, providing tools to express both abstract specifications and detailed implementations. One of the principal arguments for this approach is that it supports change and reuse. If everything could be fixed from the start, maybe it could be acceptable to switch notations between specification and implementation. But in practice specifications change and programs change, and a seamless process relying on a single notation makes it possible to go back and forth between levels of abstraction without having to perform repeated translations between levels. (This problem of change is, in my experience, the biggest obstacle to refinement-based approaches. I have never seen a convincing description of how one can accommodate specification changes in such a framework without repeating the whole process. Inheritance, by the way, addresses this matter much better.)

The example of Lampsort in Eiffel suggests that a good language, equipped with the right abstraction mechanisms, can be effective at describing not only final implementations but also abstract algorithms. It does not hurt, of course, that these abstract descriptions can also be executable, at the possible price of non-optimal performance. The transformation to an optimal version can happen entirely within the same method and language.

Quite apart from these discussions of software engineering methodology, Lamport’s elegant version of Quicksort deserves to be known widely.

References

[1] Lamport video here, segment starting at 0:32:34.
[2] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, September 2014, preliminary text here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +8 (from 8 votes)

New article: contracts in practice

For almost anyone programming in Eiffel, contracts are just a standard part of daily life; Patrice Chalin’s pioneering study of a few years ago [1] confirmed this impression. A larger empirical study is now available to understand how developers actually use contracts when available. The study, to published at FM 2014 [2] covers 21 programs, not just in Eiffel but also in JML and in Code Contracts for C#, totaling 830,000 lines of code, and following the program’s revision history for a grand total of 260 million lines of code over 7700 revisions. It analyzes in detail whether programmers use contracts, how they use them (in particular, which kinds, among preconditions, postconditions and invariants), how contracts evolve over time, and how inheritance interacts with contracts.

The paper is easy to read so I will refer you to it for the detailed conclusions, but one thing is clear: anyone who thinks contracts are for special development or special developers is completely off-track. In an environment supporting contracts, especially as a native part of the language, programmers understand their benefits and apply them as a matter of course.

References

[1] Patrice Chalin: Are practitioners writing contracts?, in Fault-Tolerant System, eds. Butler, Jones, Romanovsky, Troubitsyna, Springer LNCS, vol. 4157, pp. 100–113, 2006.

[2] H.-Christian Estler, Carlo A. Furia, Martin Nordio, Marco Piccioni and Bertrand Meyer: Contracts in Practice, to appear in proceedings of 19th International Symposium on Formal Methods (FM 2014), Singapore, May 2014, draft available here.

VN:F [1.9.10_1130]
Rating: 8.4/10 (11 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 6 votes)

Smaller, better textbook

A new version of my Touch of Class [1] programming textbook is available. It is not quite a new edition but more than just a new printing. All the typos that had been reported as of a few months ago have been corrected.

The format is also significantly smaller. This change is more than a trifle. When а  reader told me for the first time “really nice book, pity it is so heavy!”, I commiserated and did not pay much attention. After twenty people said that, and many more after them, including professors looking for textbooks for their introductory programming classes, I realized it was a big deal. The reason the book was big and heavy was not so much the size of the contents (876 is not small, but not outrageous for a textbook introducing all the fundamental concepts of programming). Rather, it is a technical matter: the text is printed in color, and Springer really wanted to do a good job, choosing thick enough paper that the colors would not seep though. In addition I chose a big font to make the text readable, resulting in a large format. In fact I overdid it; the font is bigger than necessary, even for readers who do not all have the good near-reading sight of a typical 19-year-old student.

We kept the color and the good paper,  but reduced the font size and hence the length and width. The result is still very readable, and much more portable. I am happy to make my contribution to reducing energy consumption (at ETH alone, think of the burden on Switzerland’s global energy bid of 200+ students carrying the book — as I hope they do — every morning on the buses, trains and trams crisscrossing the city!).

Springer also provides electronic access.

Touch of Class is the textbook developed on the basis of the Introduction to Programming course [2], which I have taught at ETH Zurich for the last ten years. It provides a broad overview of programming, starting at an elementary level (zeros and ones!) and encompassing topics not usually covered in introductory courses, even a short introduction to lambda calculus. You can get an idea of the style of coverage of such topics by looking up the sample chapter on recursion at touch.ethz.ch. Examples of other topics covered include a general introduction to software engineering and software tools. The presentation uses full-fledged object-oriented concepts (including inheritance, polymorphism, genericity) right from the start, and Design by Contract throughout. Based on the “inverted curriculum” ideas on which I published a number of articles, it presents students with a library of reusable components, the Traffic library for graphical modeling of traffic in a city, and builds on this infrastructure both to teach students abstraction (reusing code through interfaces including contracts) and to provide them models of high-quality code for imitation and inspiration.

For more details see the article on this blog that introduced the book when it was first published [3].

References

[1] Bertrand Meyer, Touch of Class: An Introduction to Programming Well Using Objects and Contracts, Springer Verlag, 2nd printing, 2013. The Amazon page is here. See the book’s own page (with slides and other teaching materials, sample chapter etc.) here. (Also available in Russian, see here.)

[2] Einführung in die Programmierung (Introduction to Programming) course, English course page here.

[3] Touch of Class published, article on this blog, 11 August 2009, see [1] here.

VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 7 votes)

The invariants of key algorithms (new paper)

 

I have mentioned this paper before but as a draft. It has now been accepted by ACM’s Computing Surveys and is scheduled to appear in September 2014; the current text, revised from the previous version, is available [1].

Here is the abstract:

Software verification has emerged as a key concern for ensuring the continued progress of information technology. Full verification generally requires, as a crucial step, equipping each loop with a “loop invariant”. Beyond their role in verification, loop invariants help program understanding by providing fundamental insights into the nature of algorithms. In practice, finding sound and useful invariants remains a challenge. Fortunately, many invariants seem intuitively to exhibit a common flavor. Understanding these fundamental invariant patterns could therefore provide help for understanding and verifying a large variety of programs.

We performed a systematic identification, validation, and classification of loop invariants over a range of fundamental algorithms from diverse areas of computer science. This article analyzes the patterns, as uncovered in this study,governing how invariants are derived from postconditions;it proposes a taxonomy of invariants according to these patterns, and presents its application to the algorithms reviewed. The discussion also shows the need for high-level specifications based on “domain theory”. It describes how the invariants and the corresponding algorithms have been mechanically verified using an automated program prover; the proof source files are available. The contributions also include suggestions for invariant inference and for model-based specification.

Reference

[1] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, to appear in September 2014, preliminary text available here.

VN:F [1.9.10_1130]
Rating: 7.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Reading notes: strong specifications are well worth the effort

 

This report continues the series of ICSE 2013 article previews (see the posts of these last few days, other than the DOSE announcement), but is different from its predecessors since it talks about a paper from our group at ETH, so you should not expect any dangerously delusional,  disingenuously dubious or downright deceptive declaration or display of dispassionate, disinterested, disengaged describer’s detachment.

The paper [1] (mentioned on this blog some time ago) is entitled How good are software specifications? and will be presented on Wednesday by Nadia Polikarpova. The basic result: stronger specifications, which capture a more complete part of program functionality, cause only a modest increase in specification effort, but the benefits are huge; in particular, automatic testing finds twice as many faults (“bugs” as recently reviewed papers call them).

Strong specifications are specifications that go beyond simple contracts. A straightforward example is a specification of a push operation for stacks; in EiffelBase, the basic Eiffel data structure library, the contract’s postcondition will read

item =                                          /A/
count = old count + 1

where x is the element being pushed, item the top of the stack and count the number of elements. It is of course sound, since it states that the element just pushed is now the new top of the stack, and that there is one more element; but it is also  incomplete since it says nothing about the other elements remaining as they were; an implementation could satisfy the contract and mess up with these elements. Using “complete” or “strong” preconditions, we associate with the underlying domain a theory [2], or “model”, represented by a specification-only feature in the class, model, denoting a sequence of elements; then it suffices (with the convention that the top is the first element of the model sequence, and that “+” denotes concatenation of sequences) to use the postcondition

model = <x> + old model         /B/

which says all there is to say and implies the original postconditions /A/.

Clearly, the strong contracts, in the  /B/ style, are more expressive [3, 4], but they also require more specification effort. Are they worth the trouble?

The paper explores this question empirically, and the answer, at least according to the criteria used in the study, is yes.  The work takes advantage of AutoTest [5], an automatic testing framework which relies on the contracts already present in the software to serve as test oracles, and generates test cases automatically. AutoTest was applied to both to the classic EiffelBase, with classic partial contracts in the /A/ style, and to the more recent EiffelBase+ library, with strong contracts in the /B/ style. AutoTest is for Eiffel programs; to check for any language-specificity in the results the work also included testing a smaller set of classes from a C# library, DSA, for which a student developed a version (DSA+) equipped with strong model-based contracts. In that case the testing tool was Microsoft Research’s Pex [7]. The results are similar for both languages: citing from the paper, “the fault rates are comparable in the C# experiments, respectively 6 . 10-3 and 3 . 10-3 . The fault complexity is also qualitatively similar.

The verdict on the effect of strong specifications as captured by automated testing is clear: the same automatic testing tools applied to the versions with strong contracts yield twice as many real faults. The term “real fault” comes from excluding spurious cases, such as specification faults (wrong specification, right implementation), which are a phenomenon worth studying but should not count as a benefit of the strong specification approach. The paper contains a detailed analysis of the various kinds of faults and the corresponding empirically determined measures. This particular analysis is for the Eiffel code, since in the C#/Pex case “it was not possible to get an evaluation of the faults by the original developers“.

In our experience the strong specifications are not that much harder to write. The paper contains a precise measure: about five person-weeks to create EiffelBase+, yielding an “overall benefit/effort ratio of about four defects detected per person-day“. Such a benefit more than justifies the effort. More study of that effort is needed, however, because the “person” in the person-weeks was not just an ordinary programmer. True, Eiffel experience has shown that most programmers quickly get the notion of contract and start applying it; as the saying goes in the community, “if you can write an if-then-else, you can write a contract”. But we do not yet have significant evidence of whether that observation extends to model-based contracts.

Model-based contracts (I prefer to call them “theory-based” because “model” means so many other things, but I do not think I will win that particular battle) are, in my opinion, a required component of the march towards program verification. They are the right compromise between simple contracts, which have proved to be attractive to many practicing programmers but suffer from incompleteness, and full formal specification à la Z, which say everything but require too much machinery. They are not an all-or-nothing specification technique but a progressive one: programmers can start with simple contracts, then extend and refine them as desired to yield exactly the right amount of precision and completeness appropriate in any particular context. The article shows that the benefits are well worth the incremental effort.

According to the ICSE program the talk will be presented in the formal specification session, Wednesday, May 22, 13:30-15:30, Grand Ballroom C.

References

[1] Nadia Polikarpova, Carlo A. Furia, Yu Pei, Yi Wei and Bertrand Meyer: What Good Are Strong Specifications?, to appear in ICSE 2013 (Proceedings of 35th International Conference on Software Engineering), San Francisco, May 2013, draft available here.

[2] Bertrand Meyer: Domain Theory: the forgotten step in program verification, article on this blog, see here.

[3] Bernd Schoeller, Tobias Widmer and Bertrand Meyer: Making Specifications Complete Through Models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, available here.

[4] Nadia Polikarpova, Carlo Furia and Bertrand Meyer: Specifying Reusable Components, in Verified Software: Theories, Tools, Experiments (VSTTE ‘ 10), Edinburgh, UK, 16-19 August 2010, Lecture Notes in Computer Science, Springer Verlag, 2010, available here.

[5] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, also available here.

[6] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, in IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, also available here.

[7] Nikolai Tillman and Peli de Halleux, Pex: White-Box Generation for .NET, in Tests And Proofs (TAP 2008), pp. 134-153.

 

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Presentations at ICSE and VSTTE

 

The following presentations from our ETH group in the ICSE week (International Conference on Software Engineering, San Francisco) address important issues of software specification and verification, describing new techniques that we have recently developed as part of our work building EVE, the Eiffel Verification Environment. One is at ICSE proper and the other at VSTTE (Verified Software: Tools, Theories, Experiments). If you are around please attend them.

Julian Tschannen will present Program Checking With Less Hassle, written with Carlo A. Furia, Martin Nordio and me, at VSTTE on May 17 in the 15:30-16:30 session (see here in the VSTTE program. The draft is available here. I will write a blog article about this work in the coming days.

Nadia Polikarpova will present What Good Are Strong Specifications?, written with , Carlo A. Furia, Yu Pei, Yi Wei and me at ICSE on May 22 in the 13:30-15:30 session (see here in the ICSE program). The draft is available here. I wrote about this paper in an earlier post: see here. It describes the systematic application of theory-based modeling to the full specification and verification of advanced software.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

The ABC of software engineering

Lack of a precise context can render discussions of software engineering and particularly of software quality meaningless. Take for example the (usually absurd) statement “We cannot expect that programmers will equip their programs with contracts”. Whom do you mean? A physicist who writes 50 lines of Matlab code to produce a graph illustrating his latest experiment? A member of the maintenance team for Microsoft Word? A programmer on the team for a flight control system? These are completely different constituencies, and the answer is also different. In the last case, the answer is probably that we do not care what the programmers like and do not like. When you buy an electrical device that malfunctions, would you accept from the manufacturer the excuse that differential equations are, really, you see, too hard for our electrical engineers?

In discussing the evolution of software methods and tools we must first specify what and whom we are talking about. The following ABC characterization is sufficient for most cases.

C is for Casual. Programs in that category do all kinds of useful things, and like anything else they should work properly, but if they are not ideal in software engineering terms of reliability, reusability, extendibility and so on — if sometimes they crash, sometimes produce not-quite-right results,  cannot be easily understood or maintained by anyone other than their original developers, target just one platform, run too slowly, eat up too much memory, are not easy to change, include duplicated code — it is not the end of the world. I do not have any scientific figures, but I suspect that most of the world’s software is actually in that category, from JavaScript or Python code that runs web sites to spreadsheet macros. Obviously it has to be good enough to serve its needs, but “good enough” is good enough.

B is for Business. Programs in that category run key processes in the organization. While often far from impeccable, they must satisfy strict quality constraints; if they do not, the organization will suffer significantly.

A is for Acute. This is life-critical software: if it does not work — more precisely, if it does not work exactly right — someone will get killed, someone will lose huge amounts of money, or something else will go terribly wrong. We are talking transportation systems, software embedded in critical devices, make-or-break processes of an organization.

Even in a professional setting, and even within a single company, the three categories usually coexist. Take for example a large engineering or scientific organization.  Some programs are developed to support experiments or provide an answer to a specific technical question. Some programs run the organization, both on the information systems side (enterprise management) and on the technical side (large scientific simulations, experiment set-up). And some programs play a critical role in making strategy decisions, or run the organization’s products.

The ABC classification is independent of the traditional division between enterprise and technical computing. Organizations often handle these two categories separately, whereas in fact they raise issues of similar difficulty and are subject to solutions of a similar nature. It is more important to assess the criticality of each software projects, along the ABC scale.

It is surprising that few organizations make that scale explicit.  It is partly a consequence of that neglect that many software quality initiatives and company-wide software engineering policies are ineffective: they lump everything together, and since they tend to be driven by A-grade applications, for which the risk of bad quality is highest, they create a burden that can be too high for C- and even B-grade developments. People resent the constraints where they are not justified, and as a consequence ignore them where they would be critical. Whether your goal for the most demanding projects is to achieve CMMI qualification or to establish an effective agile process, you cannot impose the same rules on everyone. Sometimes the stakes are high; and sometimes a program is just a program.

The first step in establishing a successful software policy is to separate levels of criticality, and require every development to position itself along the resulting scale. The same observation qualifies just about any discussion of software methodology. Acute, Business or Casual: you must know your ABC.

VN:F [1.9.10_1130]
Rating: 9.8/10 (11 votes cast)
VN:F [1.9.10_1130]
Rating: +9 (from 9 votes)

How good are strong specifications? (New paper, ICSE 2013)

 

A core aspect of our verification work is the use of “strong” contracts, which express sophisticated specification properties without requiring a separate specification language: even for advanced properties, there is no need for a separate specification language, with special notations such as those of first-order logic; instead, one can continue to rely, in the tradition of Design by Contract, on the built-in notations of the programming language, Eiffel.

This is the idea of domain theory, as discussed in earlier posts on this blog, in particular [1]. An early description of the approach, part of Bernd Schoeller’s PhD thesis work, was [2]; the next step was [3], presented at VSTTE in 2010.

A new paper to be presented at ICSE in May [3], part of an effort led by Nadia Polikarpova for her own thesis in progress, shows new advances in using strong specifications, demonstrating their expressive power and submitting them to empirical evaluation. The results show in particular that strong specifications justify the extra effort; in particular they enable automatic tests to find significantly more bugs.

A byproduct of this work is to show again the complementarity between various forms of verification, including not only proofs but (particularly in the contribution of two of the co-authors, Yi Wei and Yu Pei, as well as Carlo Furia) tests.

References

[1] Bertrand Meyer: Domain Theory: the forgotten step in program verification, article on this blog, see here.

[2] Bernd Schoeller, Tobias Widmer and Bertrand Meyer: Making Specifications Complete Through Models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, available here.

[3] Nadia Polikarpova, Carlo Furia and Bertrand Meyer: Specifying Reusable Components, in Verified Software: Theories, Tools, Experiments (VSTTE ‘ 10), Edinburgh, UK, 16-19 August 2010, Lecture Notes in Computer Science, Springer Verlag, 2010, available here.

[4] Nadia Polikarpova, Carlo A. Furia, Yu Pei, Yi Wei and Bertrand Meyer: What Good Are Strong Specifications?, to appear in ICSE 2013 (Proceedings of 35th International Conference on Software Engineering), San Francisco, May 2013, draft available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Multirequirements (new paper)

 

As part of a Festschrift volume for Martin Glinz of the university of Zurich I wrote a paper [1] describing a general approach to requirements that I have been practicing and developing for a while, and presented in a couple of talks. The basic idea is to rely on object-oriented techniques, including contracts for the semantics, and to weave several levels of discourse: natural-language, formal and graphical.

Reference

[1] Bertrand Meyer: Multirequirements, to appear in Martin Glinz Festschrift, eds. Anne Koziolek and Norbert Scheyff, 2013, available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Loop invariants: the musical

 

Actually it is not a musical but an extensive survey. I have long been fascinated by the notion of loop invariant, which describes the essence of a loop. Considering a loop without its invariant is like conducting an orchestra without a score.

In this submitted survey paper written with Sergey Velder and Carlo Furia [1], we study loop invariants in depth and describe many algorithms from diverse areas of computer science through their invariants. For simplicity and clarity, the specification technique uses the Domain Theory technique described in an earlier article on this blog [2] (see also [3]). The invariants were verified mechanically using Boogie, a sign of how much more realistic verification technology has become in recent years.

The survey was a major effort (we worked on it for a year and a half); it is not perfect but we hope it will prove useful in the understanding, teaching and verification of important algorithms.

Here is the article’s abstract:

At the heart of every loop, and hence of all significant algorithms, lies a loop invariant: a property ensured by the initialization and maintained by every iteration so that, when combined with the exit condition, it yields the loop’s final effect. Identifying the invariant of every loop is not only a required step for software verification, but also a key requirement for understanding the loop and the program to which it belongs. The systematic study of loop invariants of important algorithms can, as a consequence, yield insights into the nature of software.

We performed this study over a wide range of fundamental algorithms from diverse areas of computer science. We analyze the patterns according to which invariants are derived from postconditions, propose a classification of invariants according to these patterns, and present its application to the algorithms reviewed. The discussion also shows the need for high-level specification and invariants based on “domain theory”. The included invariants and the corresponding algorithms have been mechanically verified using an automatic program prover. Along with the classification and applications, the conclusions include suggestions for automatic invariant inference and general techniques for model-based specification.

 

References

[1] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: analysis, classification, and examples, submitted for publication, December 2012, draft available here.

[2] Domain Theory: the Forgotten Step in Program Verification, article from this blog, 11 April 2012, available here.

[3] Domain Theory: Precedents, article from this blog, 11 April 2012, available here

VN:F [1.9.10_1130]
Rating: 9.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

A fundamental duality of software engineering

A couple of weeks ago I proposed a small quiz. (I also stated that the answer would come “on Wednesday” — please understand any such promise as “whenever I find the time”. Sorry.) Here is the answer.

The quiz was:

I have a function:

  • For 0 it yields 0.
  • For 1 it yields 1.
  • For 2 it yields 4.
  • For 3 it yields 9.
  • For 4 it yields 16.

What is the value for 5?

Shortly thereafter I added a hint: the value for 5 is 25, and changed the question to: “What is the value for 6?”. For good measure we can also ask about the value for 1000. Now compare your answer to  what follows.

A good answer for the value at 6 is: 34 . The function in this case is -10 + 5 x + |2 x – 3| + |2 x -7|. It matches the values for the given inputs.

Linear, small values

 

 

 

 

 

 

 

 

 

The value for 1000 is 8980:

Linear function, full range

 

 

 

 

 

 

 

 

 

Another good answer at position 6 is 35.6. It comes up if we assume the function is over reals rather than integers; then a possible formula, which correlates very well (R-square of 0.9997) with the values at the given inputs, is:

869.42645566111 (1 – 0.4325853145802 e-.0467615868913719  (x – 17.7342512233011))2.3116827277657443

Exponential function, initial range

 

 

 

 

 

 

 

 

 

 

with a quite different asymptotic behavior, giving the value 869.4 at position 1000:

Exponential, full range

 

 

 

 

 

 

 

 

 

 

Some readers might have thought of another possibility, the square function x2, which again matches all the given values:

Square function, initial range

 

 

 

 

 

 

 

 

 

 

So which of these answers is right? Each is as good as the others, and as bad. There is in particular no reason to believe that the values given in the quiz’s statement suggest the square function. Any function that fits the given values, exactly (if we stick to integers) or approximately (with reals as simulated on a computer) is an equally worthy candidate. Six inputs, or six thousand, do not resolve the question. At best they are hints.

This difference between a hint and a solution is at the core of software engineering. It is, for example, the difference between a test and a specification. A test tells us that the program works for some values; as Dijkstra famously pointed out, and anyone who has developed a serious program has experienced, it does not tell us that it will work for others. The more successful tests, the more hints; but they are still only hints. I have always wondered whether Dijkstra was explicitly thinking of the Popperian notion of falsifiability: no number of experiments will prove a physical theory (although a careful experiment may boost the confidence in the theory, especially if competing theories fail to explain it, as the famous Eddington expedition did for relativity in 1919 [1]); but a single experiment can disprove a theory. Similarly, being told that our function’s value at 6 is 34 disqualifies the square function and the last one (the exponential), but does not guarantee that the first function (the linear combination) is the solution.

The specification-testing duality is the extension to computer science of the basic duality of logic. It starts with the elementary boolean operators: to prove a or b it suffices to establish a or to establish b; and to disprove a and b it suffices to show that a does not hold or to show that b does not hold. The other way round, to disprove a or b we have to show that a does not hold and to show that b does not hold; to prove that a and b holds, we have to show that a holds and to show that b holds.

Predicate calculus generalizes or to , “there exists”, and and to , “for all”. To prove ∃ x | p (x) (there is an x of which p holds) it suffices to find one value a such that p (a); let’s be pretentious and say we have “skolemized” x. To disprove∀ x | p (x) (p holds of all x) it suffices to find one value for which p does not hold.

In software engineering the corresponding duality is between proofs and tests, or (equivalently) specifications and use cases. A specification is like a “for all”: it tells us what must happen for all envisioned inputs. A test is like a “there exists”: it tells us what happens for a particular input and hence, as in predicate calculus, it is interesting as a disproof mechanism:

  • A successful test brings little information (like learning the value for 5 when trying to figure out what a function is, or finding one true value in trying to prove a or a false value in trying to prove a ).
  • An unsuccessful test brings us decisive information (like a false value for a ): the program is definitely not correct. It skolemizes incorrectness.

A proof, for its part, brings the discussion to an end when it is successful. In practice, testing may still be useful in this case, but only testing that addresses issues not covered by the proof:

  • Correctness of the compiler and platform, if not themselves proved correct.
  • Correctness the proof tools themselves, since most practical proofs require software support.
  • Aspects not covered by the specification such as, typically, performance and usability.

But for the properties it does cover the proof is final.

It is as foolish, then, to use tests in lieu of specifications as it would be to ignore the limitations of a proof. Agile approaches have caused much confusion here; as often happens in the agile literature [2], the powerful insight is mixed up with harmful advice. The insight, which has significantly improved the practice of software development, is that the regression test suite is a key asset of a project and that tests should be run throughout. The bad advice is to ditch upfront requirements and specifications in favor of tests. The property that tests lack and specifications possess is generality. A test is an instance; a thousand tests can never be more than a thousand instances. As I pointed out in a short note in EiffelWorld (the precursor to this blog) a few years ago [3], the relationship is not symmetric: one can generate tests from a specification, but not the other way around.

The same relationship holds between use cases and requirements. It is stunning to see how many people think that use cases (scenarios) are a form of requirements. As requirements they are as useless as one or ten values are to defining a function. Use cases are a way to complement the requirements by describing the system’s behavior in selected important cases. A kind of reality check, to ensure that whatever abstract aims have been defined for the system it still covers the cases known to be of immediate interest. But to rely on use cases as requirements means that you will get a system that will satisfy the use cases — and possibly little else.

When I use systems designed in recent years, in particular Web-based systems, I often find myself in a stranglehold: I am stuck with the cases that the specifiers thought of. Maybe it’s me, but my needs tend, somehow, to fall outside of these cases. Actually it is not just me. Not long ago, I was sitting close to a small-business owner who was trying to find her way through an insurance site. Clearly the site had a planned execution path for employees, and another for administrators. Problem: she was both an employee and the administrator. I do not know how the session ended, but it was a clear case of misdesign: a system built in terms of standard scenarios. Good specification performs an extra step of abstraction (for example using object-oriented techniques and contracts, but this is for another article). Skipping this step means forsaking the principal responsibility of the requirements phase: to generalize from an analysis of the behavior in known cases to a definition of the desired behaviors in all relevant cases.

Once more, as everywhere else in computer science [4], abstraction is the key to solid results that stand the test of time. Definitely better than judging a book by its cover, inferring a function by its first few values, verifying a program by its tests, or specifying a system by its use cases.

References

[1] See e.g. a blog article: Einstein and Eddington, here.

[2] Bertrand Meyer: Agile! The Good, the Hype and the Ugly, 2013, to appear.

[3] Bertrand Meyer: Test or spec? Test and spec? Test from spec!, EiffelWorld column, 2004 available here.

[4] Jeff Kramer: Is Abstraction the Key to Computer Science?, in Communications of the ACM, vol. 50, no. 4, April 2007, pages 36-42,  available from CiteSeer here

VN:F [1.9.10_1130]
Rating: 9.5/10 (32 votes cast)
VN:F [1.9.10_1130]
Rating: +13 (from 13 votes)

The manhood test

 

I came across an obscure and surprisingly interesting article by Cliff Jones [1], about the history of rely-guarantee but with the following extract:

It was perhaps not fully appreciated at the time of [Hoare’s 1969 axiomatic semantics paper] that the roles of pre and post conditions differ in that a pre condition gives permission to a developer to ignore certain possibilities; the onus is on a user to prove that a component will not be initiated in a state that does not satisfy its pre condition. In contrast a post condition is an obligation on the code that is created according to the specification. This Deontic view carries over [to rely-guarantee reasoning].

I use words more proletarian than “deontic”, but this view is exactly what stands behind the concepts of Design by Contract and has been clearly emphasized in all Eiffel literature ever since the first edition of OOSC. It remains, however, misunderstood outside of the Eiffel community; many people confuse Design by Contract with its opposite, defensive programming. The criterion is simple: if you have a precondition to a routine, are you willing entirely to forsake the corresponding checks (conditionals, exceptions…) in the routine body? If not, you may be using the word “contract” as a marketing device, but that’s all. The courage to remove the checks is the true test of adulthood.

The application of Microsoft’s “Code Contracts” mechanism to the .NET libraries fails that test: a precondition may say “buffer not full” or “insertions allowed”, but the code still checks the condition and triggers an exception. The excuse I have heard is that one cannot trust those unwashed developers. But the methodological discipline is lost. Now let me repeat this using clearer terminology: it’s not deontic.

Reference

[1] Cliff Jones: The role of auxiliary variables in the formal development of concurrent programs, in Reflections on the work of C. A. R. Hoare, eds. Jones, Roscoe and Wood, Springer Lecture Notes in Computer Science,  2009, technical report version available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

EIS: Putting into Practice the Single Model Principle

Since release 6.2 (November 2008) EiffelStudio has included the EIS system, Eiffel Information System. It has been regularly revised, and significantly improved for the recent 7.1 release.

For us EIS is a key contribution with far-reaching software engineering implications, but many users seem unaware of it, perhaps because we have not been explicit enough about why we think it is important. We would love to have more people try it and give us their feedback. (Please make sure to use the 7.1 version.) Information on EIS can be found in the documentation [1] and also in a blog entry by Tao Feng [2].

EIS connects an Eiffel system with external documents in arbitrary formats; examples of formats currently supported are Microsoft Word and PDF, but you can easily add protocols. Such a connection links an element of the Eiffel text, such as a feature, with an element of the external document, such as a paragraph. Then clicking the Eiffel element in EiffelStudio will open the document at the corresponding place in the external tool (Word, Acrobat etc.); this is the EIS “outgoing” mechanism. Conversely the external element has a back link: clicking in the external tool will open EiffelStudio at the right place; this is the EIS “incoming” mechanism.

For the outgoing mechanism, the link will appear as part of a note clause (with attributes filled by default, you need only edit the URL and any option that you wish to change):

EIS incoming note

The fundamental idea behind EIS is to support the seamless form of software development promoted and permitted by Eiffel, where all phases of a project’s lifecycle are closely linked and the code provides the ultimate reference. Since other documents are often involved, in particular a requirements document (SRS, Software Requirements Specification), it is essential to record their precise associations with elements of the software text. For example a paragraph in the SRS could state that “Whenever the tank temperature reaches 50 degrees, the valve shall be closed”. In the software text, there will be some feature, for example monitor_temperature in the class TANK, reflecting this requirement. The two elements should be linked, in particular to ensure that dependencies appear clearly and that any change in either the requirements or the code triggers the corresponding update to the other side. This is what EIS provides.

We envision further tools to track dependencies and in particular to warn users if an element of a connection (e.g. requirement or code) changes, alerting them to the need to check the linked elements on the other side. One of the key goals here is traceability: effective project management, particular during the evolution of a system, requires that all dependencies between the project’s artifact are properly recorded so that it is possible to find out the consequences of any change, proposed or carried out.

The general approach reflects the essential nature of Eiffel development, with its Single Product Principle linking all elements of a software system and minimizing, rather than exaggerating, the inevitable differences of levels of abstraction between requirements, design, code, test plans, test logs, schedules and all the other products of a software project. The core problem of software engineering is change: if we use different tools and notations at each step, and keep the documents separate, we constantly run the risk of divergence between intent and reality. Eiffel by itself offers a good part of the solution by providing a single method (with all its principles, from Design by Contract to open-closed etc.), a single notation (the Eiffel language itself) and a single integrated set of tools (the EiffelStudio IDE) supporting the entire lifecycle; the language, in particular is meant for requirements and design as much as for implementation. The graphical forms (BON and UML, as produced by the Diagram Tool of EiffelStudio in a roundtrip style, i.e. changes to the diagram immediately generate code and changes to the code are reflected in the diagram) directly support these ideas. Of course documents in other formalisms, for example SRS, remain necessary for human consumption; but they should be closely linked to the core project asset, the Eiffel code; hence the need for EIS and its connection mechanisms.

This approach, as I have often noted when presenting it in public, is hard to convey to people steeped in the mindset of the past (UML as separate from code, model-driven development) which magnify the differences between software levels, hence introducing the risk of divergence and making change painful. The Eiffel approach is innovative enough to cause incomprehension or even rejection. (“What, you are not model-driven, but everyone says model-driven is good!” – well, models are bad if they are inaccurate. In the Eiffel approach the model and the program are the same thing, or more precisely the model is the abstract view of the program, obtained through abstraction mechanisms such as deferred classes with contracts and the “contract view” tool of EiffelStudio.)

To be effective, these ideas require proper tool support, for which EIS is a start. But we would like to know if we are on the right track and hence need feedback. We would be grateful if you could try out EIS and tell us what you think, both about the current state of the mechanism and its long-term prospects in the general framework of high-quality, sustainable software development.

References

[1] EIS documentation, here.

[2] Tao Feng, Start using Eiffel Information System, Eiffelroom blog entry of 17 April 2008, available here.

VN:F [1.9.10_1130]
Rating: 9.8/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Aliasing and framing: Saint Petersburg seminar next week

In  last Thursday’s session of the seminar, Kokichi Futatsugi’s talk took longer than planned (and it would have been a pity to stop him), so I postponed my own talk on Automatic inference of frame conditions through the alias calculus to next week (Thursday local date). As usual it will be broadcast live.

Seminar page: here, including the link to follow the webcast.

Time and date: 5 April 2012, 18 Saint Petersburg time; you can see the local time at your location here.

Abstract:

Frame specifications, the description of what does not change in a routine call, are one of the most annoying components of verification, in particular for object-oriented software. Ideally frame conditions should be inferred automatically. I will present how the alias calculus, described in recent papers, can address this need.

There may be a second talk, on hybrid systems, by Sergey Velder.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

A carefully designed Result

 

In the Eiffel user discussion group [1], Ian Joyner recently asked:

A lot of people are now using Result as a variable name for the return value in many languages. I believe this first came from Eiffel, but can’t find proof. Or was it adopted from an earlier language?

Proof I cannot offer, but certainly my recollection is that the mechanism was an original design and not based on any previous language. (Many of Eiffel’s mechanisms were inspired by other languages, which I have always acknowledged as precisely as I could, but this is not one of them. If there is any earlier language with this convention — in which case a reader will certainly tell me — I was and so far am not aware of it.)

The competing conventions are a return instruction, as in C and languages based on it (C++, Java, C#), and Fortran’s practice, also used in Pascal, of using the function name as a variable within the function body. Neither is satisfactory. The return instruction suffers from two deficiencies:

  • It is an extreme form of goto, jumping out of a function from anywhere in its control structure. The rest of the language sticks to one-entry, one-exit structures, as I think all languages should.
  • In most non-trivial cases the return value is not just a simple formula but has to be computed through some algorithm, requiring the declaration of a local variable just to denote that result. In every case the programmer must invent a name for that variable and, in a typed language, include a declaration. This is tedious and suggests that the language should take care of the declaration for the programmer.

The Fortran-Pascal convention does not combine well with recursion (which Fortran for a long time did not support). In the body of the function, an occurrence of the function’s name can denote the result, or it can denote a recursive call; conventions can be defined to remove the ambiguity, but they are messy, especially for a function without arguments: in function f, does the instruction

f := f + 1

add one to the value of the function’s result as computed so far, as it would if f were an ordinary variable, or to the result of calling f recursively?

Another problem with the Fortran-Pascal approach is that in the absence of a language-defined rule for variable initialization a function can return an undefined result, if some path has failed to initialize the corresponding variable.

The Eiffel design addresses these problems. It combines several ideas:

  • No nesting of routines. This condition is essential because without it the name Result would be ambiguous. In all Algol- and Pascal-like languages it was considered really cool to be able to declare routines within routines, without limitation on the depth of recursion. I realized that in an object-oriented language such a mechanism was useless and in fact harmful: a class should be a collection of features — services offered to the rest of the world — and it would be confusing to define features within features. Simula 67 offered such a facility; I wrote an analysis of inter-module relations in Simula, including inheritance and all the mechanisms retained from Algol such as nesting (I am trying to find that document, and if I do I will post it in this blog); my conclusion was the result was too complicated and that the main culprit was nesting. Requiring classes to be flat structures was, in my opinion, one of the most effective design decisions for Eiffel.
  • Language-defined initialization. Even a passing experience with C and C++ shows that uninitialized variables are one of the major sources of bugs. Eiffel introduced a systematic rule for all variables, including Result, and it is good to see that some subsequent languages such as Java have retained that convention. For a function result, it is common to ignore the default case, relying on the standard initialization, as in if “interesting case” then Result:= “interesting value” end without an else clause (I like this convention, but some people prefer to make all cases explicit).
  • One-entry, one-exit blocks; no goto in overt or covert form (break, continue etc.).
  • Design by Contract mechanisms: postconditions usually need to refer to the result computed by a function.

The convention is then simple: in any function, you can use a language-defined local variable Result for you, of the type that you declared for the function result; you can use it as a normal variable, and the result returned by any particular call will be the final value of the variable on exit from the function body.

The convention has been widely imitated, starting with Delphi and most recently in Microsoft’s “code contracts”, a kind of poor-man’s Design by Contract emulation, achieved through libraries; it requires a Result notation to denote the function result in a postcondition, although this notation is unrelated to the mechanisms in the target languages such as C#. As the example of Eiffel’s design illustrates, a programming language is a delicate construction where all elements should fit together; the Result convention relies on many other essential concepts of the language, and in turn makes them possible.

Reference

[1] Eiffel Software discussion group, here.

VN:F [1.9.10_1130]
Rating: 8.3/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 9 votes)

Never design a language

It is a common occurrence in software development. Someone says: “We should design a language”. The usual context is that some part of the development requires a rich functionality set, and it appears appropriate to provide a flexible solution through a specialized language. As an example, in the development of an airline’s frequent flyer program on which I once worked the suggestion came to design a “Flyer Award Language” , with instructions appropriate for that application domain: record a trip, redeem an award, provide a statement of available miles and so on. A common term for such notations is DSL, for Domain-Specific Language.

Designing a language in such a context is almost always a bad idea (and I am not sure why I wrote “almost”). Languages are endless objects of discussion, usually on the least important aspects, which are also the most visible and those on which everyone has a strong opinion: concrete syntactic properties. People might pretend otherwise (“let’s not get bogged down on syntax, this is just one possible form”) but syntax is what the discussions will get bogged down to — keywords or symbols, this order or that order of operands, one instruction with several variants vs. several instructions… — at the expense of discussing the fundamental issues of functionality.

Worse yet, even if a language will be part of the solution it is usually just one facet to the solution. As was already explained in detail in [1], any useful functionality set will naturally be useful through several interfaces: a textual notation with concrete syntax may be one of them, but other possible ones include an API (Abstract Program Interface) for use from other software elements, a Graphical User Interface, a web user interface, yet another for web services (typically WSDL or some other XML or JSON format).

In such cases, starting with a concrete textual language is pretty silly, since it cannot yield the others directly (it would have to be parsed and further analyzed, which does not make sense). Of all the kinds of interface listed, the most fundamental one is the API: it describes the raw functionality, excluding any choice of syntax but including, thanks to contracts, elements of semantics. For example, a class AWARD in our frequent flyer application might include the feature


             redeem_for_upgrade (c: CUSTOMER; f : FLIGHT)
                                     — Upgrade c to next class of service on f.
                       require
                                    c /= holder
implies holder.allowed_substitute (c)
                                    f.permitted_for_upgrade
(Current)
                                    c.booked
( f )
                       
ensure
                                    c.class_of_service
( f ) =  old c.class_of_service ( f ) + 1

There is of course no implementation as this declaration only specifies an interface, but it says what needs to be said: to redeem the award for an upgrade, the intended customer must be either the holder of the award or an allowed substitute; the flight must be available for an upgrade with the current award (including the availability of enough miles); the intended customer must already be booked on the flight; and the upgrade will be for the next class of service.

These details are the kind of things that need to be discussed and agreed before the API is finalized. Then one can start discussing about a textual form (a DSL), a graphical interface, a web services interface. They all consist of relatively simple layers to be superimposed on a solidly defined and precisely specified basis. Once you have that basis, you can have all the fun you like arguing over everyone’s favorite forms of concrete syntax; it cannot hurt the project any more. Having these discussions early, at the expense of the more fundamental issues, is a great danger.

One of the key rules for successful software construction — as for many other ventures of course, especially in science and technology — is to distinguish the essential from the auxiliary, and consequently to devote proper attention to the essential issues while avoiding disputations of auxiliary issues. To define functionality, API is essential; language is auxiliary.

So when should you design a language? Never. Well, hardly ever.

Reference

[1] Bertrand Meyer: Introduction to the Theory of Programming Languages, Prentice Hall, 1990.

VN:F [1.9.10_1130]
Rating: 7.9/10 (18 votes cast)
VN:F [1.9.10_1130]
Rating: +8 (from 16 votes)

Webinar today: the Varieties of Loop Invariants

I did not have time to complete my Monday post this week; it will be for next Monday (title: Never design a language). In the meantime, here is the announcement for today’s Saint Petersburg Software Engineering seminar , which can be followed live at http://sel.ifmo.ru/seminar/live (19:30 Saint Petersburg time, meaning 16:30 Zurich/Paris, 7:30 PDT on 12 January 2012), duration about one hour.

I will be talking today; the topic is “The varieties of loop invariants”, reporting on joint work with Carlo Furia and Sergey Velder. The abstract appears below.

A recording of previous talks, starting from those of last week, will soon be available on the seminar page.

 

Abstract

The key practical issue in verifying software is to come up with the right loop invariants. We are performing an extensive analysis of loop invariants in important algorithms across all major areas of computer science, and have developed a taxonomy. I will present some of the results of this ongoing work, performed with Sergey Velder (ITMO) and Carlo Furia (ETH).

VN:F [1.9.10_1130]
Rating: 9.0/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

All Bugs Great and Small

(Acknowledgment: this article came out of a discussion with Manuel Oriol, Carlo Furia and Yi Wei. The material is largely theirs but the opinions are mine.)

A paper on automatic testing, submitted some time ago, received the following referee comment:

The case study seems unrealistic and biased toward the proposed technique. 736 unique faults found in 92 classes means at least 8 unique faults per class at the same time. I have never seen in all my life a published library with so many faults …

This would be a good start for a discussion of what is wrong with refereeing in computer science today (on the negativism of our field see [1]); we have a referee who mistakes experience for expertise, prejudice for truth, and refuses to accept carefully documented evidence because “in all his life”, presumably a rich and rewarding life, he has never seen anything of the sort. That is not the focus of the present article, however; arrogant referees eventually retire and good papers eventually get published. The technical problems are what matters. The technical point here is about testing.

Specifically, what bugs are worth finding, and are high bug rates extraordinary?

The paper under review was a step in the work around the automatic testing tool AutoTest (see [2] for a slightly older overall description and [3] for the precise documentation). AutoTest applies a fully automatic strategy, exercising classes and their routines without the need to provide test cases or test oracles. What makes such automation possible is the combination of  random generation of tests and reliance on contracts to determine the success of tests.

For several years we have regularly subjected libraries, in particular the EiffelBase data structure library, to long AutoTest sessions, and we keep finding bugs (the better term is faults). The fault counts are significant; here they caught the referee’s eye. In fact we have had such comments before: I don’t believe your fault counts for production software; your software must be terrible!

Well, maybe.

My guess is that in fact EiffelBase has no more bugs, and possibly far fewer bugs, than other “production” code. The difference is that the  AutoTest framework performs far more exhaustive tests than usually practiced.

This is only a conjecture; unlike the referee I do not claim any special powers that make my guesses self-evident. Until we get test harnesses comparable to AutoTest for environments other than Eiffel and, just as importantly, libraries that are fully equipped with contracts, enabling the detection of bugs that otherwise might not come to light, we will not know whether the explanation is the badness of EiffelBase or the goodness of AutoTest.

What concrete, incontrovertible evidence demonstrates is that systematic random testing does find faults that human testers typically do not. In a 2008 paper [4] with Ilinca Ciupa, Manuel Oriol and Alexander Pretschner, we ran AutoTest on some classes and compared the results with those of human testers (as well as actual bug reports from the field, since this was released software). We found that the two categories are complementary: human testers find faults that are still beyond the reach of automated tools, but they typically never find certain faults that AutoTest, with its stubborn dedication to leaving no stone unturned, routinely uncovers. We keep getting surprised at bugs that AutoTest detects and which no one had sought to test before.

A typical set of cases that human programmers seldom test, but which frequently lead to uncovering bugs, involves boundary values. AutoTest, in its “random-plus” strategy, always exercises special values of every type, such as MAXINT, the maximum representable integer. Programmers don’t. They should — all testing textbooks tell them so — but they just don’t, and perhaps they can’t, as the task is often too tedious for a manual process. It is remarkable how many routines using integers go bezerk when you feed them MAXINT or its negative counterpart. Some of the fault counts that seem so outrageous to our referee directly come from trying such values.

Some would say the cases are so extreme as to be insignificant. Wrong. Many documented software failures and catastrophes are due to untested extreme values. Perhaps the saddest is the case of the Patriot anti-missile system, which at the beginning of the first Gulf war was failing to catch Scud missiles, resulting in one case in the killing of twenty-eight American soldiers in an army barrack. It was traced to a software error [5]. To predict the position of the incoming missile, the computation multiplied time by velocity. The time computation used multiples of the time unit, a tenth of a second, stored in a 24-bit register and hence approximated. After enough time, long enough to elapse on the battlefield, but longer than what the tests had exercised, the accumulated error became so large as to cause a significant — and in the event catastrophic — deviation. The unique poser of automatic testing is that unlike human testers it is not encumbered by a priori notions of a situation being extreme or unlikely. It tries all the possibilities it can.

The following example, less portentous in its consequences but just as instructive, is directly related to AutoTest. For his work on model-based contracts [6] performed as part of his PhD completed in 2008 at ETH, Bernd Schoeller developed classes representing the mathematical notion of set. There were two implementations; it turned out that one of them, say SET1, uses data structures that make the subset operation easy to program efficiently; in the corresponding class, the superset operation, ab, is then simply implemented as ba. In the other implementation, say SET2, it is the other way around: is directly implemented, and ab, is implemented as ba. This all uses a nice object-oriented structure, with a general class SET defining the abstract notion and the two implementations inheriting from it.

Now you may see (if you have developed a hunch for automated testing) where this is heading: AutoTest knows about polymorphism and dynamic binding, and tries all the type combinations that make sense. One of the generated test cases has two variables s1 and s2 of type SET, and tries out s2s1; in one of the combinations that AutoTest tries, s1 is dynamically and polymorphically of type SET1 and s2 of type SET2. The version of that it will use is from SET2, so it actually calls s1s2; but this tests the SET1 version of , which goes back to SET2. The process would go on forever, were it not for a timeout in AutoTest that uncovers the fault. Bernd Schoeller had tried AutoTest on these classes not in the particular expectation of finding bugs, but more as a favor to the then incipient development of AutoTest, to see how well the tool could handle model-based contracts. The uncovering of the fault, testament to the power of relentless, systematic automatic testing, surprised us all.

In this case no contract was violated; the problem was infinite recursion, due to a use of O-O techniques that for all its elegance had failed to notice a pitfall. In most cases, AutoTest finds the faults through violated postconditions or class invariants. This is one more reason to be cautious about sweeping generalizations of the kind “I do not believe these bug rates, no serious software that I have seen shows anything of the sort!”. Contracts express semantic properties of the software, which the designer takes care of stating explicitly. In run-of-the-mill code that does not benefit from such care, lots of things can go wrong but remain undetected during testing, only to cause havoc much later during some actual execution.

When you find such a fault, it is irrelevant that the case is extreme, or special, or rare, or trivial. When a failure happens it no longer matter that the fault was supposed to be rare; and you will only know how harmful it is when you deal with the consequences. Testing, single-mindedly  devoted to the uncovering of faults [7], knows no such distinction: it hunts all bugs large and small.

References

[1] The nastiness problem in computer science, article on the CACM blog, 22 August 2011, available here.

[2] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, also available here.

[3] Online AutoTest documentation, available here at docs.eiffel.com.

[4] Ilinca Ciupa, Bertrand Meyer, Manuel Oriol and Alexander Pretschner: Finding Faults: Manual Testing vs. Random+ Testing vs. User Reports, in ISSRE ’08, Proceedings of the 19th IEEE International Symposium on Software Reliability Engineering, Redmond, November 2008, available here.

[5] US General Accounting Office: GAO Report: Patriot Missile Defense– Software Problem Led to System Failure at Dhahran, Saudi Arabia, February 4, 1992, available here.

[6] Bernd Schoeller, Tobias Widmer and Bertrand Meyer: Making Specifications Complete Through Models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, available here.

[7] Bertrand Meyer: Seven Principles of Software testing, in IEEE Computer, vol. 41, no. 10, pages 99-101, August 2008available here.

VN:F [1.9.10_1130]
Rating: 9.1/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +6 (from 6 votes)

Towards a Calculus of Object Programs

I posted here a draft of a new article, Towards a Calculus of Object Programs.

Here is the abstract:

Verifying properties of object-oriented software requires a method for handling references in a simple and intuitive way, closely related to how O-O programmers reason about their programs. The method presented here, a Calculus of Object Programs, combines four components: compositional logic, a framework for describing program semantics and proving program properties; negative variables to address the specifics of O-O programming, in particular qualified calls; the alias calculus, which determines whether reference expressions can ever have the same value; and the calculus of object structures, a specification technique for the structures that arise during the execution of an object-oriented program.
The article illustrates the Calculus by proving the standard algorithm for reversing a linked list.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Testing insights

Lionel Briand and his group at the Simula Research Laboratory in Oslo have helped raise the standard for empirical research in testing and other software engineering practices by criticizing work that in their opinion relies on wrong assumptions or insufficiently supported evidence. In one of their latest papers [1] they take aim at “Adaptive Random Testing” (ART); one of the papers they criticize is from our group at ETH, on the ARTOO extension [2] to this testing method. Let’s examine the criticism!

We need a bit of background on random testing, ART, and ARTOO:

  • Random testing tries inputs based on a random process rather than attempting a more sophisticated strategy; it was once derided as silly [3], but has emerged in recent years as a useful technique. Our AutoTest tool [4], now integrated in EiffelStudio, has shown it to be particularly effective when applied to code equipped with contracts, which provide built-in test oracles. As a result of this combination, testing can be truly automatic: the two most tedious tasks of traditional testing, test case preparation and test oracle definition, can be performed without human intervention.
  • ART, developed by Chen and others [5], makes random testing not entirely random by ensuring that the inputs are spread reasonably evenly in the input domain.
  • ARTOO, part of Ilinca Ciupa’s PhD thesis on testing defended in 2008,   generalized ART to object-oriented programs, by defining a notion of distance between objects; the ARTOO strategy  avoids choosing objects that are too close to each other. The distance formula, which you can find in[2], combines three elementary distances: between the types of the objects involved,  the values in their primitive fields (integers etc.), and, recursively, the objects to which they have references.

Arcuri and Briand dispute the effectiveness of ART and criticize arguments that various papers have used to show its effectiveness. About the ARTOO paper they write

The authors concluded that ART was better than random testing since it needed to sample less test cases before finding the first failure. However, ART was also reported as taking on average 1.6 times longer due to the distance calculations!

To someone not having read our paper the comment and the exclamation mark would seem to suggest that the paper somehow downplays this property of random testing, but in fact it stresses it repeatedly. The property appears for example in boldface as part of the caption to Table 2: In most cases ARTOO requires significantly less tests to find a fault, but entails a time overhead, and again in boldface in the caption to Table 3: The overhead that the distance calculations introduce in the testing process causes ARTOO to require on average 1.6 times more time than RAND to find the first fault.

There is no reason, then, to criticize the paper on this point. It reports the results clearly and fairly.

If we move the focus from the paper to the method, however, Arcuri and Briand have a point. As they correctly indicate, the number of tests to first fault is not a particularly useful criterion. In fact I argued against it in another paper on testing [6]

The number of tests is not that useful to managers, who need help deciding when to stop testing and ship, or to customers, who need an estimate of fault densities. More relevant is the testing time needed to uncover the faults. Otherwise we risk favoring strategies that uncover a failure quickly but only after a lengthy process of devising the test; what counts is total time. This is why, just as flies get out faster than bees, a seemingly dumb strategy such as random testing might be better overall.

(To understand the mention of flies and bees you need to read [6].) The same article states, as its final principle:

Principle 7: Assessment criteria A testing strategy’s most important property is the number of faults it uncovers as a function of time.

The ARTOO paper, which appeared early in our testing work, used “time to first failure” because it has long been a standard criterion in the testing literature, but it should have applied our own advice and focused on more important properties of testing strategies.

The “principles” paper [6] also warned against a risk awaiting anyone looking for new test strategies:

Testing research is vulnerable to a risky thought process: You hit upon an idea that seemingly promises improvements and follow your intuition. Testing is tricky; not all clever ideas prove helpful when submitted to objective evaluation.

The danger is that the clever ideas may result in so much strategy setup time that any benefit on the rest of the testing process is lost. This danger threatens testing researchers, including those who are aware of it.

The idea of ARTOO and object distance remains attractive, but more work is needed to make it an effective contributor to automated random testing and demonstrate that effectiveness. We can be grateful to Arcuri and Briand for their criticism, and I hope they continue to apply their iconoclastic zeal to empirical software engineering work, ours included.

I have objections of my own to their method. They write that “all the work in the literature is based either on simulations or case studies with unreasonably high failure rates”. This is incorrect for our work, which does not use simulations, relying instead on actual, delivered software, where AutoTest routinely finds faults in an automatic manner.

In contrast, however, Arcuri and Briand rely on fault seeding (also known as fault introduction or fault injection):

To obtain more information on how shapes appear in actual SUT, we carried out a large empirical analysis on 11 programs. For each program, a series of mutants were generated to introduce faults in these programs in a systematic way. Faults generated through mutation [allow] us to generate a large number of faults, in an unbiased and varied manner. We generated 3727 mutants and selected the 780 of them with lower detection probabilities to carry out our empirical analysis of faulty region shapes.

In the absence of objective evidence attesting to the realism of fault seeding, I do not believe any insights into testing obtained from such a methodology. In fact we adopted, from the start of our testing work, the principle that we would never rely on fault seeding. The problem with seeded faults is that there is no guarantee they reflect the true faults that programmers make, especially the significant ones. Techniques for fault seeding are understandably good at introducing typographical mistakes, such as a misspelling or the replacement of a “+” by a “-”; but these are not interesting kinds of fault, as they are easily caught by the compiler, by inspection, by low-tech static tools, or by simple tests. Interesting faults are those resulting from a logical error in the programmer’s mind, and in my experience (I do not know of good empirical studies on this topic) seeding techniques do not generate them.

For these reasons, all our testing research has worked on real software, and all the faults that AutoTest has found were real faults, resulting from a programmer’s mistake.

We can only apply this principle because we work with software equipped with contracts, where faults will be detected through the automatic oracle of a violated assertion clause. It is essential, however, to the credibility and practicality of any testing strategy; until I see evidence to the contrary, I will continue to disbelieve any testing insights resulting from studies based on artificial fault injection.

References

[1] Andrea Arcuri and Lionel Briand: Adaptive Random Testing: An Illusion of Effectiveness, in ISSTA 2011 (International Symposium on Software Testing and Analysis), available here.

[2] Ilinca Ciupa, Andreas Leitner, Manuel Oriol and Bertrand Meyer: ARTOO: Adaptive Random Testing for Object-Oriented Software, in ICSE 2008: Proceedings of 30th International Conference on Software Engineering, Leipzig, 10-18 May 2008, IEEE Computer Society Press, 2008, also available here.

[3] Glenford J. Myers. The Art of Software Testing. Wiley, New York, 1979. Citation:

Probably the poorest methodology of all is random-input testing: the process of testing a program by selecting, at random, some subset of all possible input values. In terms of the probability of detecting the most errors, a randomly selected collection of test cases has little chance of being an optimal, or close to optimal, subset. What we look for is a set of thought processes that allow one to select a set of test data more intelligently. Exhaustive black-box and white-box testing are, in general, impossible, but a reasonable testing strategy might use elements of both. One can develop a reasonably rigorous test by using certain black-box-oriented test-case-design methodologies and then supplementing these test cases by examining the logic of the program (i.e., using white-box methods).

[4] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, available here. For practical uses of AutoTest within EiffelStudio see here.

[5] T. Y. Chen, H Leung and I K Mak: Adaptive Random Testing, in  Advances in Computer Science, ASIAN 2004, Higher-Level Decision Making,  ed. M.J. Maher,  Lecture Notes in Computer Science 3321, Springer-Verlag, pages 320-329, 2004, available here.

[6] Bertrand Meyer: Seven Principles of Software testing, in IEEE Computer, vol. 41, no. 10, pages 99-101, August 2008, also available here.

VN:F [1.9.10_1130]
Rating: 8.5/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 7 votes)

If I’m not pure, at least my functions are

..

If I’m not pure, at least my jewels are [1].

..

We often need to be reassured that a routine, usually a function, is “pure”, meaning that it does not change the state of the computation. For example, a function used in a contract element (precondition, postcondition, class invariant, loop invariant) should be purely descriptive, since it is a specification element; evaluating it, typically for testing and debugging, should not create a change of behavior — a “Heisenberg effect” — in the very computation that it is intended to assess. Another application is in a concurrency context, particularly in SCOOP (see earlier posts and forthcoming ones): if one or more functions are pure, several of their executions can run  concurrently on the same object.

The notion of purity admits variants. The usual notion is what  [2] calls weak purity, which precludes changes to previously existing objects but allow creating new objects. In the EiffelBase library we also encounter routines that have another form of purity, which we may call “relative” purity: they can leave the same state on exit as they found on entry, but in-between they might change the state.  For the rest of this discussion we will rely on the standard notion of weak purity: no changes permitted on existing objects.

It is often suggested that the programming language should support specifying that a routine is pure; many people have indeed proposed the addition of a keyword such as pure to Eiffel. One of the reasons this is not — in my opinion — such a great idea is that purity is just a special case of the more general problem of framing: specifying and verifying what a routine does not change. If we can specify an arbitrary frame property, then we can, as a special case covered by the general mechanism, specify that a routine changes nothing.

To see why framing is so important, consider a class ACCOUNT with a routine deposit that has the postcondition

balance = old balance + sum………..— Where sum is the argument of deposit

Framing here means stating that nothing else than balance changes; for example the account’s owner and its number should remain the same. It is not practical to write all individual postcondition clauses such as

owner= old owner
number=
old number

and so on. But we do need to specify these properties and enforce them, if only to avoid that a descendant class (maybe MAFIA_ACCOUNT) distort the rules defined by the original.

One technique is to add a so-called “modifies clause”, introduced by verification tools such as ESC-Java [3] and JML [4]. Modifies clauses raise some theoretical issues; in particular, the list of modified expressions is often infinite, so we must restrict ourselves to an abstract-data-type view where we characterize a class by commands and queries and the modifies clause only involves queries of the class. Many people find this hard to accept at first, since anything that is not talked about can change, but it is the right approach. A modifies clause of sorts, included in the postcondition, appeared in an earlier iteration of the Eiffel specification, with the keyword only (which is indeed preferable to modifies, which in the Eiffel style favoring grammatically simple keywords would be modify, since what we want to express is not that the routine must change anything at all  but that it may only change certain specified results!). The convention worked well with inheritance, since it included the rule that a clause such as only balance, in class  ACCOUNT, prescribes that the routine may not, in its modifies version as well as in any version redefined in descendants, change any other query known at the level of ACCOUNT; but a descendant version may change, subject to its own ACCOUNT clauses, any new query introduced by a descendant.

To declare a routine as pure, it would suffice to use an empty only clause (not very elegant syntactically — “only” what? — but one can get used to it).

This construct has been discarded, as it places too heavy a burden on the programmer-specifier. Here the key observation resulted from a non-scientific but pretty extensive survey I made of all the JML code I could get my hands on. I found that every time a query appeared in a modifies clause it was also listed in the postcondition! On reflection, this seems reasonable: if you are serious about specification, as anyone bothering to write such a clause surely is, you will not just express that something changes and stop there; you will write something about how it may change. Not necessarily the exact result, as in

my_query = precise_final_value

but at least some property of that result, as in

some_property (my_query)

This observation has, however, an inescapable consequence for language design: modifies or only clauses should be inferred by the compiler from the postcondition, not imposed on the programmer as an extra burden. The convention, which we may call the Implicit Framing Rule, is simple:

A routine may change the value of a query only if the query is specified in the routine’s postcondition

(or, if you like double negation, “no routine may change the value of a query other than those specified in its postcondition”). Here we say that a feature is “specified” in a postcondition if it appears there outside of the scope of an old expression. (Clearly, an occurrence such as old balance does not imply that balance can be modified, hence this restriction to occurrences outside of an old.)

With this convention the only clause becomes implicit: it would simply list all the queries specified in the postcondition, so there is no need for the programmer to write it. For the rare case of wanting to specify that a query q may change, but not wanting to specify how, it is easy to provide a library function, say involved, that always return true and can be used in postconditions, as in involved (q).

The convention is of course not just a matter of programming methodology but, in an IDE supporting verification, such as the EVE “Verification As a Matter Of Course” environment which we are building for Eiffel [5], the compiler will enforce the definition above — no change permitted to anything not specified in the postcondition — and produce an error in case of a violation. This also means that we can easily specify that a routine is pure: it must simply not specify any query in its postcondition. It may still list it in an old clause, as happens often in practice, e.g.

Result = old balance – Minimum_balance………..— In the postcondition of a function available_funds

Note the need to use old here. Apart from this addition of old to some postconditions, a considerable advantage of the convention is that most existing code using pure functions will be suitable to the new purity enforcement without any need to provide new annotations.

I believe that this is the only sustainable convention. It does not, of course, solve the frame problem by itself (f orattempts in this direction, see [6, 7]), but it is a necessary condition for a solution that is simple, easily taught, fairly easily implemented, and effective. It goes well with model-based specifications [8, 9], which I believe are the technique of most promise for usable  specifications of object-oriented software. And it provides a straightforward, no-frills way to enforce purity where prescribed by the Command-Query Separation principle [10, 11]: if I’m not pure, at least my functions must be.

References

[1] From the lyrics of the aria Glitter and Be Gay in Leonard Bernstein’s Candide, text by Lillian Hellman and others. Youtube offers several performances, including  by Diana Damrau (here) anddsi Natalie Dessay (here) . For the text see e.g. here.

[2] Adam Darvas and Peter Müller: Reasoning About Method Calls in Interface Specifications, in Journal of Object Technology, Volume 5, no. 5, jUNE 2006, pages 59-85, doi:10.5381/jot.2006.5.5.a3, available here.

[3] C. Flanagan, K.R.M. Leino, M. Lillibridge, G. Nelson, J. B. Saxe and R. Stata: Extended static checking for Java, in PLDI 2002 (Programming Language Design and Implementation), pages 234–245, 2002.

[4] Gary Leavens et al.: Java Modeling Language, see references here.

[5] Julian Tschannen, Carlo A. Furia, Martin Nordio, and Bertrand Meyer: Verifying Eiffel Programs with Boogie, to appear in Boogie 2011, First International Workshop on Intermediate Verification Languages, Wroclaw, August 2011. See documentation about the EVE project on the project page.

[6] Ioannis Kassios: Dynamic Frames: Support for Framing, Dependencies and Sharing Without Restrictions, in Formal Methods 2006, eds. J. Misra, T. Nipkow and E. Sekerinski, Lecture Notes in Computer Science 4085, Springer Verlag, 2006, pages 268-283.

[7] Bernd Schoeller: Making Classes Provable through Contracts, Models and Frames, PhD thesis, ETH Zurich, 2007, available here

[8] Bernd Schoeller, Tobias Widmer and Bertrand Meyer: Making Specifications Complete Through Models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, available here.

[9] Nadia Polikarpova, Carlo Furia and Bertrand Meyer: Specifying Reusable Components, in Verified Software: Theories, Tools, Experiments (VSTTE ’10), Edinburgh, UK, 16-19 August 2010, Lecture Notes in Computer Science, Springer Verlag, 2010, available here.

[10] Bertrand Meyer: Object-Oriented Software Construction, first (1988) and second (1997) editions, Prentice Hall.

[11] Bertrand Meyer: Touch of Class: An Introduction to Programming Well, Using Objects and Contracts, Springer Verlag, 2009.

VN:F [1.9.10_1130]
Rating: 10.0/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 5 votes)

Another DOSE of distributed software development

The software world is not flat; it is multipolar. Gone are the days of one-site, one-team developments. The increasingly dominant model today is a distributed team; the place where the job gets done is the place where the appropriate people reside, even if it means that different parts of the job get done in different places.

This new setup, possibly the most important change to have affected the practice of software engineering in this early part of the millennium,  has received little attention in the literature; and even less in teaching techniques. I got interested in the topic several years ago, initially by looking at the phenomenon of outsourcing from a software engineering perspective [1]. At ETH, since 2004, Peter Kolb and I, aided by Martin Nordio and Roman Mitin, have taught a course on the topic [2], initially called “software engineering for outsourcing”. As far as I know it was the first course of its kind anywhere; not the first course about outsourcing, but the first to explore the software engineering implications, rather than business or political issues. We also teach an industry course on the same issues [3], attended since 2005 by several hundred participants, and started, with Mathai Joseph from Tata Consulting Services, the SEAFOOD conference [4], Software Engineering Advances For Outsourced and Offshore Development, whose fourth edition starts tomorrow in Saint Petersburg.

After a few sessions of the ETH course we realized that the most important property of the mode of software development explored in the course is not that it involves outsourcing but that it is distributed. In parallel I became directly involved with highly distributed development in the practice of Eiffel Software’s development. In 2007 we renamed the ETH course “Distributed and Outsourced Software Engineering” (DOSE) to acknowledge the broadened scope. The topic is still new; each year we learn a little more about what to teach and how to teach it.

The 2007 session saw another important addition. We felt it was no longer sufficient to talk about distributed development, but that students should practice it. Collaboration between groups in Zurich and other groups in Zurich was not good enough. So we contacted colleagues around the world interested in similar issues, and received an enthusiastic response. The DOSE project is itself distributed: teams from students in different universities collaborate in a single development. Typically, we have two or three geographically distributed locations in each project group. The participating universities have been Politecnico di Milano (where our colleagues Carlo Ghezzi and Elisabetta di Nitto have played a major role in the current version of the project), University of Nijny-Novgorod in Russia, University of Debrecen in Hungary, Hanoi University of Technology in Vietnam, Odessa National Polytechnic in the Ukraine and (across town for us) University of Zurich. For the first time in 2010 a university from the Western hemisphere will join: University of Rio Cuarto in Argentina.

We have extensively studied how the projects actually fare (see publications [4-8]). For students, the job is hard. Often, after a couple of weeks, many want to give up: they have trouble reaching their partner teams, understanding their accents on Skype calls, agreeing on modes of collaboration, finalizing APIs, devising a proper test plan. Yet they hang on and, in most cases, succeed. At the end of the course they tell us how much they have learned about software engineering. For example I know few better way of teaching the importance of carefully documented program interfaces — including contracts — than to ask the students to integrate their modules with code from another team halfway around the globe. This is exactly what happens in industrial software development, when you can no longer rely on informal contacts at the coffee machine or in the parking lot to smooth out misunderstandings: software engineering principles and techniques come in full swing. With DOSE, students learn and practice these fundamental techniques in the controlled environment of a university project.

An example project topic, used last year, was based on an idea by Martin Nordio. He pointed out that in most countries there are some card games played in that country only. The project was to program such a game, where the team in charge of the game logic (what would be the “business model” in an industrial project) had to explain enough of their country’s game, and abstractly enough, to enable the other team to produce the user interface, based on a common game engine started by Martin. It was tough, but some of the results were spectacular, and these are students who will not need more preaching on the importance of specifications.

We are currently preparing the next session of DOSE, in collaboration with our partner universities. The more the merrier: we’d love to have other universities participate, including from the US. Adding extra spice to the project, the topic will be chosen among those from the ICSE SCORE competition [9], so that winning students have the opportunity to attend ICSE in Hawaii. If you are teaching a suitable course, or can organize a student group that will fit, please read the project description [10] and contact me or one of the other organizers listed on the page. There is a DOSE of madness in the idea, but no one, teacher or student,  ever leaves the course bored.

References

[1] Bertrand Meyer: Offshore Development: The Unspoken Revolution in Software Engineering, in Computer (IEEE), January 2006, pages 124, 122-123. Available here.

[2] ETH course page: see here for last year’s session (description of Fall 2010 session will be added soon).

[3] Industry course page: see here for latest (June 2010( session (description of November 2010 session will be added soon).

[4] SEAFOOD 2010 home page.

[5] Bertrand Meyer and Marco Piccioni: The Allure and Risks of a Deployable Software Engineering Project: Experiences with Both Local and Distributed Development, in Proceedings of IEEE Conference on Software Engineering & Training (CSEE&T), Charleston (South Carolina), 14-17 April 2008, ed. H. Saiedian, pages 3-16. Preprint version  available online.

[6] Bertrand Meyer:  Design and Code Reviews in the Age of the Internet, in Communications of the ACM, vol. 51, no. 9, September 2008, pages 66-71. (Original version in Proceedings of SEAFOOD 2008 (Software Engineering Advances For Offshore and Outsourced Development,  Lecture Notes in Business Information Processing 16, Springer Verlag, 2009.) Available online.

[7] Martin Nordio, Roman Mitin, Bertrand Meyer, Carlo Ghezzi, Elisabetta Di Nitto and Giordano Tamburelli: The Role of Contracts in Distributed Development, in Proceedings of SEAFOOD 2009 (Software Engineering Advances For Offshore and Outsourced Development), Zurich, June-July 2009, Lecture Notes in Business Information Processing 35, Springer Verlag, 2009. Available online.

[8] Martin Nordio, Roman Mitin and Bertrand Meyer: Advanced Hands-on Training for Distributed and Outsourced Software Engineering, in ICSE 2010: Proceedings of 32th International Conference on Software Engineering, Cape Town, May 2010, IEEE Computer Society Press, 2010. Available online.

[9] ICSE SCORE 2011 competition home page.

[10] DOSE project course page.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Programming on the cloud?

I am blogging live from the “Cloud Futures” conference organized by Microsoft in Redmond [1]. We had two excellent keynotes today, by Ed Lazowska [1] and David Patterson.

Lazowska emphasized the emergence of a new kind of science — eScience — based on analysis of enormous amounts of data. His key point was that this approach is a radical departure from “computational science” as we know it, based mostly on large simulations. With the eScience paradigm, the challenge is to handle the zillions of bytes of data that are available, often through continuous streams, in such fields as astronomy, oceanography or biology. It is unthinkable in his view to process such data through super-computing architectures specific to an institution; the Cloud is the only solution. One of the reasons (developed more explicitly in Patterson’s talk) is that cloud computing supports scaling down as well as scaling up. If your site experiences sudden bursts of popularity — say you get slashdotted — followed by downturns, you just cannot size the hardware right.

Lazowska also noted that it is impossible to convince your average  university president that Cloud is the way to go, as he will get his advice from the science-by-simulation  types. I don’t know who the president is at U. of Washington, but I wonder if the comment would apply to Stanford?

The overall argument for cloud computing is compelling. Of course the history of IT is a succession of swings of the pendulum between centralization and delocalization: mainframes, minis, PCs, client-server, “thin clients”, “The Network Is The Computer” (Sun’s slogan in the late eighties), smart clients, Web services and so on. But this latest swing seems destined to define much of the direction of computing for a while.

Interestingly, no speaker so far has addressed issues of how to program reliably for the cloud, even though cloud computing seems only to add orders of magnitude to the classical opportunities for messing up. Eiffel and contracts have a major role to play here.

More generally the opportunity to improve quality should not be lost. There is a widespread feeling (I don’t know of any systematic studies) that a non-negligible share of results generated by computational science are just bogus, the product of old Fortran programs built by generations of graduate students with little understanding of software principles. At the very least, moving to cloud computing should encourage the use of 21-th century tools, languages and methods. Availability on the cloud should also enhance a critical property of good scientific research: reproducibility.

Software engineering is remarkably absent from the list of scientific application areas that speaker after speaker listed for cloud computing. Maybe software engineering researchers are timid, and do not think of themselves as deserving large computing resources; consider, however, all the potential applications, for example in program verification and empirical software engineering. The cloud is a big part of our own research in verification; in particular the automated testing paradigm pioneered by AutoTest [3] fits ideally with the cloud and we are actively working in this direction.

Lazowska mentioned that development environments are the ultimate application of cloud computing. Martin Nordio at ETH has developed, with the help of Le Minh Duc, a Master’s student at Hanoi University of Technology, a cloud-based version of EiffelStudio: CloudStudio, which I will present in my talk at the conference tomorrow. I’ll write more about it in later posts; just one note for the moment: no one should ever be forced again to update or commit.

References

[1] Program of the Cloud Futures conference.

[2] Keynote by Ed Lazowska. You can see his slides here.

[3] Bertrand Meyer, Arno Fiva, Ilinca Ciupa, Andreas Leitner, Yi Wei, Emmanuel Stapf: Programs That Test Themselves. IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009; online version here.

VN:F [1.9.10_1130]
Rating: 7.7/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Reflexivity, and other pillars of civilization

Let me start, dear reader of this blog, by probing your view of equality, and also of assignment. Two questions:  

  • Is a value x always equal to itself? (For the seasoned programmers in the audience: I am talking about a value, as in mathematics, not an expression, as in programming, which could have a side effect.)
  • In programming, if we consider an assignment

       x := y

and v is the value of y before that assignment (again, this little detour is to avoid bothering with side effects), is the value of x always equal to v after the assignment?  

Maybe I should include here one of these Web polls that one finds on newspaper sites, so that you can vote and compare your answer to the Wisdom of Crowds. My own vote is clear: yes to both. Equality is reflexive (every value is equal to itself, at any longitude and temperature, no excuses and no exceptions); and the purpose of assignment is to make the value of the target equal to the value of the source. Such properties are some of the last ramparts of civilization. If they go away, what else is left?  

754 enters the picture

Now come floating-point numbers and the famous IEEE “754” floating-point standard [1]. Because not all floating point operations yield a result usable as a floating-point number, the standard introduces a notion of “NaN”, Not a Number; certain operations involving floating-point numbers may yield a NaN. The term NaN does not denote a single value but a set of values, distinguished by their “payloads”.  

Now assume that the value of x is a NaN. If you use a programming language that supports IEEE 754 (as they all do, I think, today) the test in  

        if x = x then …  

is supposed to yield False. Yes, that is specified in the standard: NaN is never equal to NaN (even with the same payload); nor is it equal to anything else; the result of an equality comparison involving NaN will always be False.  

Assignment behavior is consistent with this: if y (a variable, or an expression with no side effect) has a NaN value, then after  

        x := y  

the test xy will yield False. 

Before commenting further let me note the obvious: I am by no means a numerics expert; I know that IEEE 754 was a tremendous advance, and that it was designed by some of the best minds in the field, headed by Velvel Kahan who received a Turing Award in part for that success. So it is quite possible that I am about to bury myself in piles of ridicule. On the other hand I have also learned that (1) ridicule does not kill, so I am game; and more importantly (2) experts are often right but not always, and it is always proper to question their reasoning. So without fear let me not stop at the arguments that “the committee must have voted on this point and they obviously knew what they were doing” and “it is the standard and implemented on zillions of machines, you cannot change it now”. Both are true enough, but not an excuse to censor questions.  

What are the criteria?

The question is: compatibility with an existing computer standard is great, but what about compatibility with a few hundred years of mathematics? Reflexivity of equality  is something that we expect for any data type, and it seems hard to justify that a value is not equal to itself. As to assignment, what good can it be if it does not make the target equal to the source value?  

The question of assignment is particularly vivid in Eiffel because we express the expected abstract properties of programs in the form of contracts. For example, the following “setter” procedure may have a postcondition (expressed by the ensure clause):  

        set_x (v: REAL)
                        — Set the value of x (an attribute, also of type REAL) the value of v.
                do
                        …
                        x := v  
                ensure
                        x = v
                end  

   
If you call this procedure with a NaN argument for a compiler that applies IEEE 754 semantics, and monitor contracts at run time for testing and debugging, the execution will report a contract violation. This is very difficult for a programmer to accept.  

A typical example arises when you have an assignment to an item of an array of REAL values. Assume you are executing a [i] := x. In an object-oriented view of the world (as in Eiffel), this is considered simplified syntax  for the routine call a.put (x, i). The postcondition is that a [i] = x. It will be violated!  

The experts’ view

I queried a number of experts on the topic. (This is the opportunity to express my gratitude to members of the IFIP working group 2.5 on numerical software [2], some of the world’s top experts in the field, for their willingness to respond quickly and with many insights.) A representative answer, from Stuart Feldman, was:  

If I remember the debate correctly (many moons ago), NaN represents an indefinite value, so there is no reason to believe that the result of one calculation with unclear value should match that of another calculation with unclear value. (Different orders of infinity, different asymptotic approaches toward 0/0, etc.)  

Absolutely correct! Only one little detail, though: this is an argument against using the value True as a result of the test; but it is not an argument for using the value False! The exact same argument can be used to assert that the result should not be False:  

… there is no reason to believe that the result of one calculation with unclear value should not match that of another calculation with unclear value.  

Just as convincing! Both arguments complement each other: there is no compelling reason for demanding that the values be equal; and there is no compelling argument either to demand that they be different. If you ignore one of the two sides, you are biased.  

What do we do then?

The conclusion is not that the result should be False. The rational conclusion is that True and False are both unsatisfactory solutions. The reason is very simple: in a proper theory (I will sketch it below) the result of such a comparison should be some special undefined below; the same way that IEEE 754 extends the set of floating-point numbers with NaN, a satisfactory solution would extend the set of booleans with some NaB (Not a Boolean). But there is no NaB, probably because no one (understandably) wanted to bother, and also because being able to represent a value of type BOOLEAN on a single bit is, if not itself a pillar of civilization, one of the secrets of a happy life.  

If both True and False are unsatisfactory solutions, we should use the one that is the “least” bad, according to some convincing criterion . That is not the attitude that 754 takes; it seems to consider (as illustrated by the justification cited above) that False is somehow less committing than True. But it is not! Stating that something is false is just as much of a commitment as stating that it is true. False is no closer to NaB than True is. A better criterion is: which of the two possibilities is going to be least damaging to time-honored assumptions embedded in mathematics? One of these assumptions is the reflexivity of equality:  come rain or shine, x is equal to itself. Its counterpart for programming is that after an assignment the target will be equal to the original value of the source. This applies to numbers, and it applies to a NaN as well. 

Note that this argument does not address equality between different NaNs. The standard as it is states that a specific NaN, with a specific payload, is not equal to itself.  

What do you think?

A few of us who had to examine the issue recently think that — whatever the standard says at the machine level — a programming language should support the venerable properties that equality is reflexive and that assignment yields equality.

Every programming language should decide this on its own; for Eiffel we think this should be the specification. Do you agree?  

Some theory

For readers who like theory, here is a mathematical restatement of the observations above. There is nothing fundamentally new in this section, so if you do not like strange symbols you can stop here.  

The math helps explain the earlier observation that neither True nor False is more“committing” than the other. A standard technique (coming from denotational semantics) for dealing with undefinedness in basic data types, is to extend every data type into a lattice, with a partial order relation meaning “less defined than”. The lattice includes a bottom element, traditionally written “” (pronounced “Bottom”) and a top element written (“Top”). represents an unknown value (not enough information) and an error value (too much information). Pictorially, the lattice for natural numbers would look like this:  

Integer lattice

The lattice of integers

On basic types, we always use very simple lattices of this form, with three kinds of element: , less than every other element; , larger than all other elements; and in-between all the normal values of the type, which for the partial order of interest are all equal. (No, this is not a new math in which all integers are equal. The order in question simply means “is less defined than”. Every integer is as defined as all other integers, more defined than , and less defined than .) Such lattices are not very exciting, but they serve as a starting point; lattices with more interesting structures are those applying to functions on such spaces — including functions of functions —, which represent programs.  

The modeling of floating-point numbers with NaN involves such a lattice; introducing NaN means introducing a value. (Actually, one might prefer to interpret NaN as , but the reasoning transposes immediately.)  More accurately, since there are many NaN values, the lattice will look more like this:

Float lattice

The lattice of floats in IEEE 754

For simplicity we can ignore the variety of NaNs and consider a single .

Functions on lattices — as implemented by programs — should satisfy a fundamental property: monotonicity. A function f  is monotone (as in ordinary analysis) if, whenever xy, then f (x) ≤ f (y). Monotonicity is a common-sense requirement: we cannot get more information from less information. If we know less about x than about y, we cannot expect that any function (with a computer, any program) f will, out of nowhere, restore the missing information.  

Demanding monotonicity of all floating-point operations reflects this exigency of monotonicity: indeed, in IEEE 754, any arithmetic operation — addition, multiplication … — that has a NaN as one of its arguments must yield a Nan as its result. Great. Now for soundness we should also have such a property for boolean-valued operations, such as equality. If we had a NaB as the  of booleans, just like NaN is the  of floating-point numbers,  then the result of NaN = NaN would be NaB. But the world is not perfect and the IEEE 754 standard does not govern booleans. Somehow (I think) the designers thought of False as somehow less defined than True. But it is not! False is just as defined as True in the very simple lattice of booleans; according to the partial order, True and False are equal for the relevant partial order:

Boolean lattice

The lattice of booleans

Because any solution that cannot use a NaB will violate monotonicity and hence will be imperfect, we must resort to heuristic criteria. A very strong criterion in favor of choosing True is reflexivity — remaining compatible with a fundamental property of equality. I do not know of any argument for choosing False. 

The Spartan approach

There is, by the way, a technique that accepts booleans as we love them (with two values and no NaB) and achieves mathematical rigor. If operations involving NaNs  truly give us pimples, we can make any such operation trigger an exception. In the absence of values,  this is an acceptable programming technique for representing undefinedness. The downside, of course, is that just about everywhere the program must be ready to handle the exception in some way. 

It is unlikely that in practice many people would be comfortable with such a solution. 

Final observations

Let me point out two objections that I have seen. Van Snyder wrote: 

NaN is not part of the set of floating-point numbers, so it doesn’t behave as if “bottom” were added to the set. 

Interesting point, but, in my opinion not applicable: is indeed not part of the mathematical set of floating point numbers, but in the form of NaN it is part of the corresponding type (float in C, REAL in Eiffel); and the operations of the type are applicable to all values, including NaN if, as noted, we have not taken the extreme step of triggering an exception every time an operation uses a NaN as one of its operands. So we cannot free ourselves from the monotonicity concern by just a sleight of hands. 

Another comment, also by Van Snyder (slightly abridged): 

Think of [the status of NaN] as a variety of dynamic run-time typing. With static typing, if  x is an integer variable and y

        x := y 

does not inevitably lead to 

        x = y

 True; and a compelling argument to require that conversions satisfy equality as a postcondition! Such  reasoning — reflexivity again — was essential in the design of the Eiffel conversion mechanism [3], which makes it possible to define conversions between various data types (not just integers and reals and the other classical examples, but also any other user types as long as the conversion does not conflict with inheritance). The same conversion rules apply to assignment and equality, so that yes, whenever the assignment x := y is permitted, including when it involves a conversion, the property x = y  is afterwards always guaranteed to hold.

It is rather dangerous indeed to depart from the fundamental laws of mathematics. 

References

[1] IEEE floating-point standard, 754-2008; see summary and references in the Wikipedia entry.  

[2] IFIP Working Group 2.5 on numerical software: home page

[3] Eiffel standard (ECMA and ISO), available on the ECMA site.

VN:F [1.9.10_1130]
Rating: 9.6/10 (21 votes cast)
VN:F [1.9.10_1130]
Rating: +17 (from 19 votes)

More expressive loops for Eiffel

New variants of the loop construct have been introduced into Eiffel, allowing a safer, more concise and more abstract style of programming. The challenge was to remain compatible with the basic loop concept, in particular correctness concerns (loop invariant and loop variant), to provide a flexible mechanism that would cover many different cases of iteration, and to keep things simple.

Here are some examples of the new forms. Consider a structure s, such as a list, which can be traversed in sequence. The following loop prints all elements of the list:

      across s as c loop print (c.item) end

(The procedure print is available on all types, and can be adapted to any of them by redefining the out feature; item gives access to individual values of the list.) More about c in just a moment; in the meantime you can just consider consider that using “as c” and manipulating the structure through c rather than directly through s  is a simple idiom to be learned and applied systematically for such across loops.

The above example is an instruction. The all and some variants yield boolean expressions, as in

across s as c all c.item > 0 end

which denotes a boolean value, true if and only if all elements of the list are positive. To find out if at least one is positive, use

across s as c some c.item > 0 end

Such expressions could appear, for example, in a class invariant, but they may be useful in many different contexts.

In some cases, a from clause is useful, as in

        across s as c from sum := 0  loop sum := sum + c.index c.item end
— Computes Σ i * s [i]

The original form of loop in Eiffel is more explicit, and remains available; you can achieve the equivalent of the last example, on a suitable structure, as

A list and a cursor

A list and a cursor

      from
sum := 0 ; s.start
until
s.after
loop
sum := sum + s.index s.item
s.forth

        end

which directly manipulates a cursor through s, using start to move it to the beginning, forth to advance it, and after to test if it is past the last element. The forms with across achieve the same purpose in a more concise manner. More important than concision is abstraction: you do not need to worry about manipulating the cursor through start, forth and after. Under the hood, however, the effect is the same. More precisely, it is the same as in a loop of the form

from
sum := 0 ; c.start
until
c.after
loop
sum := sum + c.index c.item
c.forth

        end

where c is a cursor object associated with the loop. The advantage of using a cursor is clear: rather than keeping the state of the iteration in the object itself, you make it external, part of a cursor object that, so to speak, looks at the list. This means in particular that many traversals can be active on the same structure at the same time; with an internal cursor, they would mess up with each other, unless you manually took the trouble to save and restore cursor positions. With an external cursor, each traversal has its own cursor object, and so does not interfere with other traversals — at least as long as they don’t change the structure (I’ll come back to that point).

With the across variant, you automatically use a cursor; you do not have to declare or create it: it simply comes as a result of the “as c” part of the construct, which introduces c as the cursor.

On what structures can you perform such iterations? There is no limitation; you simply need a type based on a class that inherits, directly or indirectly, from the library class ITERABLE. All relevant classes from the EiffelBase library have been updated to provide this inheritance, so that you can apply the across scheme to lists of all kinds, hash tables, arrays, intervals etc.

One of these structures is the integer interval. The notation  m |..| n, for integers m and n, denotes the corresponding integer interval. (This is not a special language notation, simply an operator, |..|, defined with the general operator mechanism as an alias for the feature interval of INTEGER classes.) To iterate on such an interval, use the same form as in the examples above:

        across m |..| n  as c from sum := 0  loop sum := sum + a [c.item] end
— Computes Σ a [i], for i ranging from m to n, for an array (or other structure) a

The key feature in ITERABLE is new_cursor, which returns a freshly created cursor object associated with the current structure. By default it is an ITERATION_CURSOR, the most general cursor type, but every descendant of ITERABLE can redefine the result type to something more specific to the current structure. Using a cursor — c in the above examples —, rather than manipulating the structure s directly, provides considerable flexibility thanks to the property that ITERATION_CURSOR itself inherits from ITERABLE   and hence has all the iteration mechanisms. For example you may write

across s.new_cursor.reversed as c loop print (c.item) end

to print elements in reverse order. (Using Eiffel’s operator  mechanism, you may write s.new_cursor, with a minus operator, as a synonym for new_cursor.reversed.) The function reversed gives you a new cursor on the same target structure, enabling you to iterate backwards. Or you can use

        across s.new_cursor + 3 as c loop print (c.item) end

(using s.new_cursor.incremented (3) rather than s.new_cursor + 3 if you are not too keen on operator syntax) to iterate over every third item. A number of other possibilities are available in the cursor classes.

A high-level iteration mechanism is only safe if you have the guarantee that the structure remains intact throughout the iteration. Assume you are iterating through a structure

across  as c loop some_routine end

and some_routine changes the structure s: the whole process could be messed up. Thanks to Design by Contract mechanisms, the library protects you against such mistakes. Features such as item and index, accessing properties of the structure during the iteration, are equipped with a precondition clause

require
is_valid

and every operation that changes the structure sets is_valid to false. So as soon as any change happens, you cannot continue the iteration; all you can do is restart a new one; the command start, used internally to start the operation, does not have the above precondition.

Sometimes, of course, you do want to change a structure while traversing it; for example you may want to add an element just to the right of the iteration position. If you know what you are doing that’s fine. (Let me correct this: if you know what you are doing, express it through precise contracts, and you’ll be fine.) But then you should not use the abstract forms of the loop, across; you should control the iteration itself through the basic form from … until with explicit cursor manipulation, protected by appropriate contracts.

The two styles, by the way, are not distinct constructs. Eiffel has always had only one form of loop and this continues the case. The across forms are simply new possibilities added to the classical loop construct, with obvious constraints stating for example that you may not have both a some or all form and an explicit  loop body.  In particular, an across loop can still have an invariant clause , specifying the correctness properties of the loop, as in

        across s as c from sum := 0  invariant sum = sigma (s, index)  loop sum := sum + c.index c.item end

EiffelStudio 6.5 already included the language update; the library update (not yet including the is_valid preconditions for data structure classes) will be made available this week.

These new mechanisms should increase the level of abstraction and the reliability of loops, a fundamental element of  almost all programs.

VN:F [1.9.10_1130]
Rating: 7.5/10 (19 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 13 votes)

Just another day at the office

In the past few weeks I wrote a program to compute the aliases of variables and expressions in an object-oriented program (based on a new theory [1]).

For one of the data structures, I needed a specific notion of equality, so I did the standard thing in Eiffel: redefine the is_equal function inherited from the top class ANY, to implement the desired variant.

The first time I ran the result, I got a postcondition violation. The violated postcondition clauses was not even any that I wrote: it was an original postcondition of is_equal (other: like Current)  in ANY, which my redefinition inherited as per the rules of Design by Contract; it reads

symmetric: Result implies other ~ Current

meaning: equality is symmetric, so if Result is true, i.e. the Current object is equal to other, then other must also be equal to Current. (~ is object equality, which applies the local version is is_equal).  What was I doing wrong? The structure is a list, so the code iterates on both the current list and the other list:

from
    start ; other.start ; Result := True
until (not Result) or after loop
        if other.after then Result := False else
              Result := (item ~ other.item)
              forth ; other.forth
        end
end

Simple enough: at each position check whether the item in the current list is equal to the item in the other list, and if so move forth in both the current list and the other one; stop whenever we find two unequal elements, or we exhaust either list as told by after list. (Since is_equal is a function and not produce any side effect, the actual code saves the cursors before the iteration and restores them afterwards. Thanks to Ian Warrington for asking about this point in a comment to this post. The new across loop variant described in  two later postings uses external cursors and manages them automatically, so this business of maintaining the cursor manually goes away.)

The problem is that with this algorithm it is possible to return True if the first list was exhausted but not the second, so that the first list is a subset of the other rather than identical. The correction is immediate: add

Result and other_list.after

after the loop; alternatively, enclose the loop in a conditional so that it is only executed if count = other.count (this solution is  better since it saves much computation in cases of lists of different sizes, which cannot be equal).

The lesson (other than that I need to be more careful) is that the error was caught immediately, thanks to a postcondition violation — and one that I did not even have to write. Just another day at the office; and let us shed a tear for the poor folks who still program without this kind of capability in their language and development environment.

Reference

[1] Bertrand Meyer: The Theory and Calculus of Aliasing, draft paper, available here.

VN:F [1.9.10_1130]
Rating: 9.6/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)