Archive for the ‘Programming techniques’ Category.

LASER summer school on software for robotics: last call for registration

Much of the progress in robotics is due to software advances, and software issues remain at the heart of the formidable challenges that remain. The 2017 LASER summer school, held in September in Elba, brings together some of the most prestigious international experts in the area.

The LASER school has established itself as one of the principal forums to discussed advanced software issues. The 2017 school takes place from 9 to 17 September in the idyllic setting of the Hotel del Golfo in Procchio, Elba Island, Italy.

Robotics is progressing at an amazing pace, bringing improvements to almost areas of human activity. Today’s robotics systems rely ever more fundamentally on complex software, raising difficult issues. The LASER 2017 summer school covers both the current state of robotics software technology and open problems. The lecturers are top international experts with both theoretical contributions and major practical achievements in developing robotics systems.
The LASER school is intended for professionals from the industry (engineers and managers) as well as university researchers, including PhD students. Participants learn about the most important software technology advances from the pioneers in the field. The school’s focus is applied, although theory is welcome to establish solid foundations. The format of the school favors extensive interaction between participants and speakers.

We have lined up an impressive roster of speakers from the leading edge of both industry and academia:

Rodolphe Gélin, Aldebaran Robotics
Ashish Kapoor, Microsoft Research
Davide Brugali, University of Bergamo, on Managing software variability in robotic control systems
Nenad Medvidovic, University of Southern California, on Software Architectures of Robotics Systems
Bertrand Meyer, Politecnico di Milano & Innopolis University, on Concurrent Object-Oriented Robotics Software
Issa Nesnas, NASA Jet Propulsion Laboratory, on Experiences from robotic software development for research and planetary flight robots
Hiroshi (“Gitchang”) Okuno, Waseda University & Kyoto University, on Open-Sourced Robot Audition Software HARK: Capabilities and Applications

The school takes place at the magnificent Hotel del Golfo in the Gulf of Procchio, Elba. Along with an intensive scientific program, participants will have time to enjoy the countless natural and cultural riches of this wonderful, history-laden jewel of the Mediterranean.

For more information about the school, the speakers and registration see the LASER site.

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

The perils of feature interaction

One of the most delicate aspects of design is feature interaction. As users, we suffer daily from systems offering features that individually make sense but clash with each other. In my agile book [1] I explained in detail, building on the work of Pamela Zave, why this very problem makes one of the key ideas of agile methods,  the reliance on “user stories” for requirements, worthless and damaging.

A small recent incident reminded me of the perils of feature interaction. I used my Lenovo W540 laptop without power for a short while, then reached a sedentary location and plugged it in. Hence my surprise when, some hours later, it started beeping to alert me that it was running out of battery. The natural reactions — check the outlet and the power cord — had no effect. I found the solution, but just in time: otherwise, including if I had not heard the warning sound, I would have been unable to use the laptop any further. That’s right: I would not have been able to restart the computer at all, even with access to a power outlet, and even though it was perfectly functional and so was its (depleted) battery. The reason is that the problem arose from a software setting, which (catch-22 situation) I could not correct without starting the computer [2].

The only solution would have been to find another, non-depleted battery. That is not a trivial matter if you have traveled with your laptop outside of a metropolis: the W540 has a special battery which ordinary computer shops do not carry [3].

The analysis of what made such a situation possible must start with the list of relevant hardware and software product features.

Hardware:

  • HA. This Lenovo W series includes high-end laptops with high power requirements, which the typical 65-watt airplane power jack does not satisfy.
  • HB. With models prior to the W540, if you tried to connect a running laptop to the power supply in an airplane, it would not charge, and the power indicator would start flickering.  But you could still charge it if you switched it off.
  • HC. The W540 effectively requires 135 watts and will not take power from a 65-watt power source under any circumstances.

Software:

  • SA. The operating system (this discussion assumes Windows) directly reflects HC by physically disabling charging if the laptop is in the “Airplane” power mode.
  • SB. If you disable wireless, the operating system automatically goes into the “Airplane” power mode.
  • SC. In the “Airplane” power mode, the laptop, whether or not connected through a charger to a power outlet of any wattage, will not charge. The charging function is just disabled.
  • SD. One can edit power modes to change parameters, such as time to automatic shutoff, but the no-charging property in Airplane mode is not editable and not even mentioned in the corresponding UI dialog. It seems to be a behind-the-scenes property magically attached to the power-mode name “Airplane”.
  • SE. There is a function key for disabling wireless: F8. As a consequence of SB it also has the effect of switching to “Airplane” mode.
  • SF. Next to F8 on the keyboard is F7.
  • SG. F7 serves to display the screen content on another monitor (Windows calls it a “projector”). F7 offers a cyclic set of choices: laptop only, laptop plus monitor etc.
  • SH. In the old days (like five years ago), such function keys setting important operating system parameters on laptops used to be activated only if you held them together with a special key labeled “Fn”. For some reason (maybe the requirement was considered too complicated for ordinary computer users) the default mode on Lenovo laptops does not use the “Fn” key anymore: you just press the desired key, such as F7 or F8.
  • SI. You can revert to the old mode, requiring pressing “Fn”, by going into the BIOS and performing some not-absolutely-trivial steps, making this possibility the preserve of techies. (Helpfully, this earlier style is called “Legacy mode”, as a way to remind you that your are an old-timer, probably barely graduated from MS-DOS and still using obsolete conventions. In reality, the legacy mode is the right one to use, whether for techies or novices: it is all too easy to hit a function key by mistake and get totally unexpected results. The novice, not the techie, is the one who will be completely confused and panicked as a result. The first thing I do with a new laptop is to go to the BIOS and set legacy mode.)

By now you have guessed what happened in my case, especially once you know that I had connected the laptop to a large monitor and had some trouble getting that display to work. In the process I hit Fn-F7 (feature SG) several times.  I must have mistakenly (SF) pressed F8 instead of F7 at some point. Normally, Legacy mode (SI) should have made me immune to the effects of hitting a function key by mistake, but I did use the neighboring key F7 for another purpose. Hitting F8 disabled wireless (SE) and switched on Airplane power mode (SB). At that point the laptop, while plugged in correctly, stopped charging (SC, SD).

How did I find out? Since I was looking for a hardware problem I could have missed the real cause entirely and ended up with a seemingly dead laptop. Fortunately I opened the Power Options dialog to see what it said about the battery. I noticed that among the two listed power plans the active one was not “Power Saver”, to which I am used, but “Airplane”. I did not immediately pay  attention to that setting; since I had not used the laptop for a while I just thought that maybe the last time around I had switched on “Airplane”, even though that made little sense since I was not even aware of the existence of that option. After trying everything else, though, I came back to that intriguing setting, changed to the more usual “Power Saver”, and the computer started to charge again. I was lucky to have a few percent of battery still left at that point.

Afterwards I found a relevant discussion thread on a Lenovo user forum.

As is often the case in such feature-interaction mishaps, most of the features make sense individually [4]. What causes trouble is some unforeseen combination of features.

There is no sure way to avoid such trouble, but there is a sure way to cause it: design a system feature by feature, as with user stories in agile development. The system must do this and it must do that. Oh, by the way, it must also do that. And that. User stories have one advantage: everyone understands them. But that is also their limitation. Good requirements and design require professionals who can see the whole beyond the parts.

A pernicious side of this situation is that many people believe that use cases and user stories are part of object-oriented analysis, whereas the OO approach to requirements and design is the reverse: rise above individual examples to uncover the fundamental abstractions.

As to my laptop, it is doing well, thanks. And I will be careful with function keys.

Reference and notes

[1] Bertrand Meyer: Agile! The Good, the Hype and the Ugly, Springer, 2014,  Amazon page: here, book page: here. A description of the book appeared here on this blog at the time of publication.

[2] Caveat: I have not actually witnessed this state in which a plugged-in laptop will not restart. The reason is simply that I do not have an alternate battery at the moment so I cannot perform the experiment with the almost certain result of losing the use of my laptop. I will confirm the behavior as soon as I have access to a spare battery.

[3] It has been my systematic experience over the past decade and a half that Lenovo seems to make a point, every couple of years, to introduce new models with incompatible batteries and docking stations. (They are also ever more incredibly bulky, with the one for the W540 almost as heavy as the laptop itself. On the other hand the laptops are good, otherwise I would not be bothering with them.)

[4] One exception here is feature SB: switching wireless off does not necessaril y mean you want to select a specific power mode! It is a manifestation of the common syndrome  of software tools that think they are smarter than you, and are not. Another exception is SE: to let a simple key press change fundamental system behavior is to court disaster. But I had protected myself by using legacy mode and was hit anyway.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

AutoProof workshop: Verification As a Matter of Course

The AutoProof technology pursues the goal of “Verification As a Matter Of Course”, integrated into the EVE development environment. (The AutoProof  project page here; see particularly the online interactive tutorial.) A one-day workshop devoted to the existing AutoProof and current development will take place on October 1 near Toulouse in France. It is an informal event (no proceedings planned at this point, although based on the submissions we might decide to produce a volume), on a small scale, designed to bring together people interested in making the idea of practical verification a reality.

The keynote will be given by Rustan Leino from Microsoft Research, the principal author of the Boogie framework on which the current implementation of AutoProof relies.

For submissions (or to attend without submitting) see the workshop page here. You are also welcome to contact me for more information.

VN:F [1.9.10_1130]
Rating: 8.2/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Robotics and concurrency

Many robotics applications are by nature concurrent; in his ongoing PhD work, Andrey Rusakov [1] is building a comprehensive concurrent robot programming framework, Roboscoop [2], based on the SCOOP model for simple concurrent object-oriented programming [3] and the Ros operating system. As part of this work it is important to know how much robotics applications use concurrency, stay away from concurrency — perhaps because programmers are afraid of the risks — and could benefit from more concurrency. Rusakov has prepared a questionnaire [4] to help find out. If you have experience in robot programming, please help him by answering the questionnaire, which takes only a few minutes.

References

[1] Rusakov’s home page here.

[2] Roboscoop project page, here,

[3] Simple Concurrent Object-Oriented Programming, see here.

[4] The questionnaire is here.

VN:F [1.9.10_1130]
Rating: 6.3/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: -1 (from 1 vote)

Design by Contract: ACM Webinar this Thursday

A third ACM webinar this year (after two on agile methods): I will be providing a general introduction to Design by Contract. The date is this coming Thursday, September 17, and the time is noon New York (18 Paris/Zurich, 17 London, 9 Los Angeles, see here for hours elsewhere). Please tune in! The event is free but requires registration here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

New paper: Theory of Programs

Programming, wrote Dijkstra many years ago, is a branch of applied mathematics. That is only half of the picture: the other half is engineering, and this dual nature of programming is part of its attraction.

Descriptions of the mathematical side are generally, in my view, too complicated. This article [1] presents a mathematical theory of programs and programming based on concepts taught in high school: elementary set theory. The concepts covered include:

  • Programming.
  • Specification.
  • Refinement.
  • Non-determinism.
  • Feasibility.
  • Correctness.
  • Programming languages.
  • Kinds of programs: imperative, functional, object-oriented.
  • Concurrency (small-step and large-step)
  • Control structures (compound, if-then-else and Dijkstra-style conditional, loop).
  • State, store and environment.
  • Invariants.
  • Notational conventions for building specifications and programs incrementally.
  • Loop invariants and variants.

One of the principal ideas is that a program is simply the description of a mathematical relation. The program text is a rendering of that relation. As a consequence, one may construct programming languages simply as notations to express certain kinds of mathematics. This approach is the reverse of the usual one, where the program text and its programming languages are the starting point and the center of attention: theoreticians develop techniques to relate them to mathematical concepts. It is more effective to start from the mathematics (“unparsing” rather than parsing).

All the results (74 properties expressed formally, a number of others in the text) are derived as theorems from rules of elementary set theory; there are no new axioms whatsoever.

The paper also has a short version [2], omitting proofs and many details.

References

[1] Theory of Programs, available here.
[2] Theory of Programs, short version of [1] (meant for quick understanding of the ideas, not for publication), available here.

 

VN:F [1.9.10_1130]
Rating: 6.4/10 (18 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 9 votes)

Framing the frame problem (new paper)

Among the open problems of verification, particularly the verification of object-oriented programs, one of the most vexing is framing: how to specify and verify what programs element do not change. Continuing previous work, this article presents a “double frame inference” method, automatic on both sides the specification and verification sides. There is no need to write frame specifications: they will be inferred from routine postconditions. For verification, the method computes the set of actually changed properties through a “change calculus”, itself based on the previously developed alias calculus.

Some verification techniques, such as Hoare-style proofs, require significant annotation effort and potentially yield full functional verification; others, such as model checking and abstract interpretation, have more limited goals but seek full automation. Framing, in my opinion, should be automatic, freeing the programmer-verifier to devote the annotation effort to truly interesting properties.

Reference

[1] Bertrand Meyer: Framing the Frame Problem, in Dependable Software Systems, Proceedings of August 2014 Marktoberdorf summer school, eds. Alexander Pretschner, Manfred Broy and Maximilian Irlbeck, NATO Science for Peace and Security, Series D: Information and Communication Security, Springer, 2015 (to appear), pages 174-185; preprint available here.

VN:F [1.9.10_1130]
Rating: 8.8/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Lampsort

 

In support of his view of software methodology, Leslie Lamport likes to use the example of non-recursive Quicksort. Independently of the methodological arguments, his version of the algorithm should be better known. In fact, if I were teaching “data structures and algorithms” I would consider introducing it first.

As far as I know he has not written down his version in an article, but he has presented it in lectures; see [1]. His trick is to ask the audience to give a non-recursive version of Quicksort, and of course everyone starts trying to remove the recursion, for example by making the stack explicit or looking for invertible functions in calls. But his point is that recursion is not at all fundamental in Quicksort. The recursive version is a specific implementation of a more general idea.

Lamport’s version — let us call it Lampsort —is easy to express in Eiffel. We may assume the following context:

a: ARRAY [G -> COMPARABLE]        — The array to be sorted.
pivot: INTEGER                                      —  Set by partition.
picked: INTEGER_INTERVAL            — Used by the sorting algorithm, see below.
partition (i, j: INTEGER)
……..require      — i..j is a sub-interval of the array’s legal indexes:
……..……..i < j
……..……..i >= a.lower
……..……..j <= a.upper
……..do
……..……..… Usual implementation of partition
……..ensure     — The expected effect of partition:
……..……..pivot >= i
……..……..pivot < j
……..……..a [i..j] has been reshuffled so that elements in i..pivot are less than
……..……..or equal to those in pivot+1 .. j.
……..end

We do not write the implementation of partition since the point of the present discussion is the overall algorithm. In the usual understanding, that algorithm consists of doing nothing if the array has no more than one element, otherwise performing a partition and then recursively calling itself on the two resulting intervals. The implementation can take advantage of parallelism by forking the recursive calls out to different processors. That presentation, says Lamport, describes only a possible implementation. The true Quicksort is more general. The algorithm works on a set not_sorted of integer intervals i..j such that the corresponding array slices a [i..j] are the only ones possibly not sorted; the goal of the algorithm is to make not_sorted empty, since then we know the entire array is sorted. In Eiffel we declare this set as:

not_sorted: SET [INTEGER_INTERVAL]

The algorithm initializes not_sorted to contain a single element, the entire interval; at each iteration, it removes an interval from the set, partitions it if that makes sense (i.e. the interval has more than one element), and inserts the resulting two intervals into the set. It ends when not_sorted is empty. Here it is:

……..from                                 — Initialize interval set to contain a single interval, the array’s entire index range:
……..…..create not_sorted.make_one (a.lower |..| a.upper)….         ..……..
……..invariant
……..…..— See below
……..until
……..…..not_sorted.is_empty                                                            — Stop when there are no more intervals in set
……..loop
……..…..picked := not_sorted.item                                                     — Pick an interval from (non-empty) interval set.
……..……if picked.count > 1 then                                                      — (The precondition of partition holds, see below.)
……..……..…..partition (picked.lower, picked.upper)                    — Split it, moving small items before and large ones after pivot.
……..……..…..not_sorted.extend (picked.lower |..| pivot)            — Insert new intervals into the set of intervals: first
……..……....not_sorted.extend (pivot + 1 |..| picked.upper)     — and second.
……..……end
……..…...not_sorted.remove (picked)                                               — Remove interval that was just partitioned.
…….end

Eiffel note: the function yielding an integer interval is declared in the library class INTEGER using the operator |..| (rather than just  ..).

The query item from SET, with the precondition not is_empty,  returns an element of the set. It does not matter which element. In accordance with the Command-Query Separation principle, calling item does not modify the set; to remove the element you have to use the command remove. The command extend adds an element to the set.

The abstract idea behind Lampsort, explaining why it works at all, is the following loop invariant (see [2] for a more general discussion of how invariants provide the basis for understanding loop algorithms). We call “slice” of an array a non-empty contiguous sub-array; for adjacent slices we may talk of concatenation; also, for slices s and t s <= t means that every element of s is less than or equal to every element of t. The invariant is:

a is the concatenation of the members of a set slices of disjoint slices, such that:
– The elements of a are a permutation of its original elements.
– The index range of any member  of slices having more than one element is in not_sorted.
– For any adjacent slices s and t (with s before t), s <= t.

The first condition (conservation of the elements modulo permutation) is a property of partition, the only operation that can modify the array. The rest of the invariant is true after initialization (from clause) with slices made of a single slice, the full array. The loop body maintains it since it either removes a one-element interval from not_sorted (slices loses the corresponding slice) or performs partition with the effect of partitioning one slice into two adjacent ones satisfying s <= t, whose intervals replace the original one in not_sorted. On exit, not_sorted is empty, so slices is a set of one-element slices, each less than or equal to the next, ensuring that the array is sorted.

The invariant also ensures that the call to partition satisfies that routine’s precondition.

The Lampsort algorithm is a simple loop; it does not use recursion, but relies on an interesting data structure, a set of intervals. It is not significantly longer or more difficult to understand than the traditional recursive version

sort (i, j: INTEGER)
……..require
……..……..i <= j
……..……..i >= a.lower
……..……..j <= a.upper
……..do
……..……if j > i then                    — Note that precondition of partition holds.
……..……..…..partition (i, j)         — Split into two slices s and t such that s <= t.
……..……..…..sort (i, pivot)          — Recursively sort first slice.
……..……..…..sort (pivot+1, j)      — Recursively sort second slice.
……..……end……..…..
……..end

Lampsort, in its author’s view, captures the true idea of Quicksort; the recursive version, and its parallelized variants, are only examples of possible implementations.

I wrote at the start that the focus of this article is Lampsort as an algorithm, not issues of methodology. Let me, however, give an idea of the underlying methodological debate. Lamport uses this example to emphasize the difference between algorithms and programs, and to criticize the undue attention being devoted to programming languages. He presents Lampsort in a notation which he considers to be at a higher level than programming languages, and it is for him an algorithm rather than a program. Programs will be specific implementations guided in particular by efficiency considerations. One can derive them from higher-level versions (algorithms) through refinement. A refinement process may in particular remove or restrict non-determinism, present in the above version of Lampsort through the query item (whose only official property is that it returns an element of the set).

The worldview underlying the Eiffel method is almost the reverse: treating the whole process of software development as a continuum; unifying the concepts behind activities such as requirements, specification, design, implementation, verification, maintenance and evolution; and working to resolve the remaining differences, rather than magnifying them. Anyone who has worked in both specification and programming knows how similar the issues are. Formal specification languages look remarkably like programming languages; to be usable for significant applications they must meet the same challenges: defining a coherent type system, supporting abstraction, providing good syntax (clear to human readers and parsable by tools), specifying the semantics, offering modular structures, allowing evolution while ensuring compatibility. The same kinds of ideas, such as an object-oriented structure, help on both sides. Eiffel as a language is the notation that attempts to support this seamless, continuous process, providing tools to express both abstract specifications and detailed implementations. One of the principal arguments for this approach is that it supports change and reuse. If everything could be fixed from the start, maybe it could be acceptable to switch notations between specification and implementation. But in practice specifications change and programs change, and a seamless process relying on a single notation makes it possible to go back and forth between levels of abstraction without having to perform repeated translations between levels. (This problem of change is, in my experience, the biggest obstacle to refinement-based approaches. I have never seen a convincing description of how one can accommodate specification changes in such a framework without repeating the whole process. Inheritance, by the way, addresses this matter much better.)

The example of Lampsort in Eiffel suggests that a good language, equipped with the right abstraction mechanisms, can be effective at describing not only final implementations but also abstract algorithms. It does not hurt, of course, that these abstract descriptions can also be executable, at the possible price of non-optimal performance. The transformation to an optimal version can happen entirely within the same method and language.

Quite apart from these discussions of software engineering methodology, Lamport’s elegant version of Quicksort deserves to be known widely.

References

[1] Lamport video here, segment starting at 0:32:34.
[2] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, September 2014, preliminary text here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +8 (from 8 votes)

A gold medal

The French National Research Center (CNRS) has just awarded [1] its annual gold medal to Gérard Berry, a great recognition for an outstanding computer scientist. I first discovered Berry’s work through a brilliant 1976 article, Bottom-Up Computation of Recursive Programs [2], which explained recursion using methods of denotational semantics, and should really figure in collections of classic CS articles. One can hardly understand dynamic programming algorithms, for example, without realizing that they are bottom-up versions of recursive algorithms, through transformations described in the article. I relied extensively on its techniques for my own textbooks, from the advanced concepts of Introduction to the Theory of Programming Languages all the way down to the introductory presentation of recursion in Touch of Class.

Berry then went on to play a founding role in synchronous languages, explaining (in a paper whose reference I don’t have right now) that he went through a revelation of sorts after realizing that the automatic-control industry needed languages completely different from those computer scientists typically love. He created the Estérel language and shepherded it all the way to industrialization, serving as Chief Scientist of Estérel Technologies in the 2000 decade, before coming back to research.

He is also well known in France as a popularizer of informatics (computer science) through a number of successful books aiming at a large audience. He was appointed the first professor of informatics at the Collège de France, and in his inaugural lecture series presented a sweeping description of how information technology advances, based on powerful scientific concepts, permeate today’s world and raise ever new challenges. Hardly could the CNRS have chosen to distinguish a better representative of our discipline.

References

[1] CNRS announcement: here.

[2] Gérard Berry, Bottom-Up Computation of Recursive Programs, in RAIRO (Revue Française d’Automatique, Informatique, Recherche Opérationnelle, Informatique Théorique), tome 10, no R1 (1976), pp. 47-82, available here.

VN:F [1.9.10_1130]
Rating: 5.5/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Computing: the Art, the Magic, the Science

 

My colleagues and I have just finished recording our new MOOC (online course), an official ETH offering on the EdX platform. The preview is available [1] and the course will run starting in September.

As readers of this blog know, I  have enthusiastically, under the impulsion of Marco Piccioni at ETH, embraced MOOC technology to support and spread our courses. The particular target has been the introduction to programming that I have taught for over a decade at ETH based on the Touch of Class textbook [2]. In February this blog announced [3] the release of our first MOOC, embodying the essentials of our ETH course and making it available not only to ETH students but to the whole world. The course does not just include video lectures: it also supports active student participation through online exercises and programs that can be compiled and tested on the cloud, with no software installation. These advanced features result from our research on support for distributed software development (by Christian Estler and Martin Nordio, with Carlo Furia and others).

This first course was a skunkworks project, which we did entirely on our own without any endorsement from ETH or any of the main MOOC players. We and our students have very much benefited from the consequent flexibility, and the use of homegrown technology relying on the MOODLE framework. We will keep this course for our own students and for any outside participant who prefers a small-scale, “boutique” version. But the EdX brand and EdX’s marketing power will enable us to reach a much broader audience. We want to provide the best introductory computing course on the market and the world needs to know about it. In addition, the full support of media services at ETH  helped us reach a higher standard on the technical side. (For our first course, the home-brewed one, we did not have a studio, so that every time an ambulance drove by — our offices are close to the main Zurich hospital — we had to restart the current take.)

The course’s content is not exactly the same: we have broadened the scope from just programming to computing, although it retains a strong programming component. We introduced additional elements such as an interview with Professor Peter Widmayer of ETH on the basics of computer science theory. For both new material and the topics retained from the first version we have adapted to the accepted MOOC practice of short segments, although we did not always exactly meet the eight-minute upper limit that was suggested to us.

We hope that you, and many newcomers, will like the course and benefit from it.

References

[3] EdX course: Computing: Art, Magic, Science, preview available here.

[2] Bertrand Meyer: Touch of Class: Learning how to Program Well, with Objects and Contracts, Springer Verlag, revised printing, 2013, book page here.

[3] Learning to Program, Online: article from this blog, 3 February 2014, available here.

 

 

VN:F [1.9.10_1130]
Rating: 8.3/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Reading Notes: Single-Entry, Single-Exit

 

It is remarkable that almost half a century after Dijkstra’s goto article, and however copiously and reverently it may be cited, today’s programs (other than in Eiffel) are still an orgy of gotos. There are not called gotos, being described as constructs that break out of a loop or exit a routine in multiple places, but they are gotos all the same. Multiple routine exits are particularly bad since they are in effect interprocedural gotos.

Ian Joyner has just released a simple and cogent summary of why routines should always have one entry and one exit.

References

[1] Ian Joyner: Single-entry, single-exit (SESE) heuristic, available here.

VN:F [1.9.10_1130]
Rating: 9.3/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 5 votes)

New article: passive processors

 

The SCOOP concurrency model has a clear division of objects into “regions”, improving the clarity and reliability of concurrent programs by establishing a close correspondence between the object structure and the process structure. Each region has an associated “processor”, which executes operations on the region’s objects. A literal application of this rule implies, however, a severe performance penalty. As part of the work for his PhD thesis (defended two weeks ago), Benjamin Morandi found out that a mechanism for specifying certain processors as “passive” yields a considerable performance improvement. The paper, to be published at COORDINATION, describes the technique and its applications.

Reference

Benjamin Morandi, Sebastian Nanz and Bertrand Meyer: Safe and Efficient Data Sharing for Message-Passing Concurrency, to appear in proceedings of COORDINATION 2014, 16th International Conference on Coordination Models and Languages, Berlin, 3-6 June 2014, draft available here.
.

VN:F [1.9.10_1130]
Rating: 8.2/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 4 votes)

Learning to program, online

Introduction to Programming MOOCThe ETH introductory programming course, which I have taught since 2003 and used as the basis for the Springer Touch of Class textbook, is now available as a MOOC: an online course, open to anyone interested [1]. The project was started and led by Marco Piccioni.

The MOOC was released in September; although it was “open” (the other “O”) from the start, we have not publicized it widely until now, since we used it first for the benefit of students taking the course at ETH, and took advantage of this experience to polish it. If you follow the acronym buzz you may say it was first a “SPOC” (Small Personal Online Course”). The experience with our students has been extremely encouraging: they took it as a supplement to the lectures and widely praised its value.We hope that many others will find it useful as well.

MOOCs are hot but they have attracted as much criticism as hype. We have seen the objections: low completion rates, lack of direct human contact, threats to traditional higher education. Two things are clear, though: MOOCs are more than a passing fad; and they have their own pedagogical advantages.

MOOCs are here to stay; one ignores them at one’s own peril. For courses on a popular topic, I believe that in a few years almost everyone will be teaching from a MOOC. Not in the sense of telling students “just follow this course on the Web and come back for the exam“, but as a basis for individual institutions’ courses. For example the students might be told to take the lessons online, then come to in-person lectures (“Flip The Classroom“) or discussion sessions. For any given topic, such as introduction to programming, only a handful of MOOCs will emerge in this role. We would like the ETH course to be one of them.

The question is not just to replace courses and textbooks with an electronic version. MOOCs enrich the learning experience. Introduction to Programming is a particularly fertile application area for taking advantage of technology: the presentation of programming methods and techniques becomes even more effective if students can immediately try out the ideas, compile the result, run it, and see the results on predefined tests. Our course offers many such interactive exercises, thanks to the E4Mooc (Eiffel for MOOCs) framework developed by Christian Estler [2]. This feature has proved to be a key attraction of the course for ETH students. Here for example is an exercise asking you to write a function that determines whether a string is a palindrome (reads identically in both directions):

The program area is pre-filled with a class skeleton where all that remains for you to enter is the algorithm for the relevant function. Then you can click “Compile” and, if there are no compilation errors, “Run” to test your candidate code against a set of predefined test values. One of the benefits for users taking the course is that there is no software to install: everything runs in the cloud, accessible from the browser. Here we see the MOOC not just as a technique for presenting standard material but as an innovative learning tool, opening up pedagogical techniques that were not previously possible.

Besides E4Mooc, our course relies on the Moodle learning platform. Our experience with Moodle has been pleasant; we noticed, for example, that students really liked the Moodle feature enabling them to gain virtual “badges” for good answers, to the point of repeating exercises until they got the badges. For instructors preparing the course, building a MOOC is a huge amount of work (that was not a surprise, people had told us); but it is worth the effort.

We noted that attendance at the lectures increased as compared to previous years. The matter is a natural concern: other than the cold November mornings in Zurich (one of the lectures is at 8 AM) there are many reasons not to show up in class:  the textbook covers much of the material; all the slides are online; so are the slides for exercise sessions (tutorials) and texts of exercises and some earlier exams; lectures were video-recorded in previous years, and students can access the old recordings. Our feeling is that the MOOC makes the course more exciting and gives students want to come to class and hear more.

The MOOC course is not tied to a particular period; you can take it whenever you like. (The current practice of offering MOOCs at fixed times is disappointing: what is the point of putting a course online if participants are forced to fit to a fixed schedule?)

Marco Piccioni and I are now off to our second MOOC, which will be a generalization of the first, covering the basics not just of programming but of computer science and IT overall, and will be available on one of the major MOOC platforms. We will continue to develop the existing MOOC, which directly supports our in-person course, and which we hope will be of use to many other people.

Take the MOOC, or tell a beginner near you to take it, and tell us what you think.

References

[1] Bertrand Meyer, Marco Piccioni and other members of the ETH Chair of Software Engineering: ETH Introduction to Programming MOOC, available here.

[2] H-Christian Estler, E4Mooc demo, available on YouTube: here.

VN:F [1.9.10_1130]
Rating: 9.4/10 (14 votes cast)
VN:F [1.9.10_1130]
Rating: +7 (from 9 votes)

PhD positions in concurrency/distribution/verification at ETH

As part of our “Concurrency Made Easy” ERC Advanced Investigator Grant project (2012-2017), we are offering PhD positions at the Chair of Software Engineering of ETH Zurich. The goal of the project is to build a sophisticated programming and verification architecture to make concurrent and distributed programming simple and reliable, based on the ideas of Eiffel and particularly the SCOOP concurrency model. Concurrency in its various forms (particularly multithreading) as well as distributed computing are required for most of today’s serious programs, but programming concurrent applications remains a challenge. The CME project is determined to break this complexity barrier.  Inevitably, achieving simplicity for users (in this case, application programmers) requires, under the hood, a sophisticated infrastructure, both conceptual (theoretical models) and practical (the implementation). We are building that infrastructure.

ETH offers an outstanding research and education environment and competitive salaries for “assistants” (PhD students), who are generally expected in addition to their research to participate in teaching, in particular introductory programming, and other activities of the Chair.  The candidates we seek have: a master’s degree in computer science or related field from a recognized institution (as required by ETH); a strong software engineering background, both practical and theoretical, and more generally a strong computer science and mathematical culture; a good knowledge of verification techniques (e.g. Hoare-style, model-checking, abstract interpretation); some background in concurrency or distribution; and a passion for high-quality software development. Prior publications, and experience with Eiffel, are pluses. In line with ETH policy, particular attention will be given to female candidates.

Before applying, you should become familiar with our work; see in particular the research pages at se.ethz.ch including the full description of the CME project at cme.ethz.ch.

Candidates should send (in PDF or text ) to se-open-positions@lists.inf.ethz.ch a CV and a short cover letter describing their view of the CME project and ideas about their possible contribution.

VN:F [1.9.10_1130]
Rating: 5.7/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 4 votes)

Saint Petersburg Software Engineering Seminar: 14 January 2014 (6 PM)

There will be two talks in the Software Engineering Seminar at ITMO, 18:00 local time, Tuesday, January 14, 2014. Please arrive 10 minutes early for registration.

Place: ITMO, Sytninskaya Ulitsa, Saint Petersburg.

Andrey Terekhov (SPBGU): Programming crystals

(I do not know whether this talk will be in Russian or English. An abstract follows but the talk is meant as the start of a discussion rather than a formal lecture.)

В течение последних 20-30 лет основными языками программирования кристаллов были VHDL и Verilog. Эти языки изначально проектировались как средства создания проектной документации, потом они стали использоваться в качестве инструмента моделирования и только сравнительно недавно для этих языков появились средства генерации кода уровня RTL (Register Transfer Language). Тексты на  VHDL и Verilog очень громоздки, трудно читаемые, плохо стандартизованы (одна и та же программа может синтезироваться на одном инструменте и не поддаваться синтезу на другом. Лет 10 назад появился язык SystemC – это С++ с огромным набором библиотек. С одной стороны, любая программа на SystemC может транслироваться стандартными трансляторами С++ , есть удобные средства потактного моделирования и приличные средства генгерации RTL, с другой стороны, требование совместимости с С++ не прошло даром – если в базовом языке нет средств описания параллелизма и конвейеризации, их приходится добавлять весьма искусственными приемами через приставные библиотеки. Буквально в прошлом году фирма Xilinx выпустила продукт Vivado, в рекламе которого утверждается, что он способен автоматически транслировать обычные программы на С/C++ в RTL промышленного качества.

Мы выполнили несколько экспериментов по использованию этого продукта, оказалось, что обещанной автоматизации там нет, пользователь должен писать на С, постоянно думая о том, как его код будет выглядеть в финальном RTL,  расставлять огромное количество прагм, причем не всегда очевидных.

Основной тезис доклада – такая важная область, как проектирование кристаллов, нуждается в специализированных языковых и инструментальных средствах, обеспечивающих  создание компактных и  легко читаемых программ, которые могут быть использованы как для симуляции, так и для генерации эффективного RTL. В докладе будут приведены примеры программ на языке HaSCoL (Hardware and Software Codesign Language), разработанном на кафедре системного программирования СПбГУ, и даны некоторые сравнительные характеристики.

Sergey Velder (ITMO): Alias graphs

(My summary – BM.) In the ITMO SEL work on automatic alias analysis, a new model has been developed: alias graphs, an abstraction of the object structure. This short talk will compare it to previously used approaches.

VN:F [1.9.10_1130]
Rating: 5.5/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 2 votes)

Niklaus Wirth birthday symposium, 20 February, Zurich

In honor of Niklaus Wirth’s 80-th birthday we are organizing a symposium at ETH on February 20, 2014. This is a full-day event with invited talks by:

  • Vint Cerf
  • Hans Eberlé
  • Michael Franz
  • me
  • Carroll Morgan
  • Martin Odersky
  • Clemens Szyperski
  • Niklaus Wirth himself

From the symposium’s web page:

Niklaus Wirth was a Professor of Computer Science at ETH Zürich, Switzerland, from 1968 to 1999. His principal areas of contribution were programming languages and methodology, software engineering, and design of personal workstations. He designed the programming languages Algol W, Pascal, Modula-2, and Oberon, was involved in the methodologies of structured programming and stepwise refinement, and designed and built the workstations Lilith and Ceres. He published several text books for courses on programming, algorithms and data structures, and logical design of digital circuits. He has received various prizes and honorary doctorates, including the Turing Award, the IEEE Computer Pioneer, and the Award for outstanding contributions to Computer Science Education.

Participation is free (including breaks, lunch and the concluding “Apéro”) but space is strictly limited and we expect to run out of seats quickly. So if you are interested (but only if you are certain to attend) please register right away.

Symposium page and access to registration form: here.

VN:F [1.9.10_1130]
Rating: 8.7/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 6 votes)

The laws of branching (part 1)

 

The first law of branching is: don’t. There is no other law.

The only sane way to develop software in a group, whether collocated or distributed, is to have a single branch (“trunk”) to which everyone commits changes, with constant running of the regression test suite to make sure that any breaking change is detected and corrected right away.

To allow branching, that is to say the emergence of separate lines of development with the expectation that they will be merged back later on, is to guarantee disaster. It is easy to diverge, but hard to converge; not only hard, but unpredictable. It can take days, weeks or more to reconcile changes and resolve conflicts, when the reason for the changes is no longer fresh in the developers’ memories, and the developers themselves may even no longer be there. Conflicts should be detected right away, and corrected immediately.

The EiffelStudio development never uses branches. A related development, EVE (Eiffel Verification Environment), maintained at ETH, includes all research tools, and is reconciled every Friday with the EiffelStudio trunk, so it is never allowed to diverge into a separate branch. Most other successful teams I know apply the first law of branching too. Solve conflicts before they have had the time to become conflicts.

VN:F [1.9.10_1130]
Rating: 5.3/10 (27 votes cast)
VN:F [1.9.10_1130]
Rating: -9 (from 19 votes)

Smaller, better textbook

A new version of my Touch of Class [1] programming textbook is available. It is not quite a new edition but more than just a new printing. All the typos that had been reported as of a few months ago have been corrected.

The format is also significantly smaller. This change is more than a trifle. When а  reader told me for the first time “really nice book, pity it is so heavy!”, I commiserated and did not pay much attention. After twenty people said that, and many more after them, including professors looking for textbooks for their introductory programming classes, I realized it was a big deal. The reason the book was big and heavy was not so much the size of the contents (876 is not small, but not outrageous for a textbook introducing all the fundamental concepts of programming). Rather, it is a technical matter: the text is printed in color, and Springer really wanted to do a good job, choosing thick enough paper that the colors would not seep though. In addition I chose a big font to make the text readable, resulting in a large format. In fact I overdid it; the font is bigger than necessary, even for readers who do not all have the good near-reading sight of a typical 19-year-old student.

We kept the color and the good paper,  but reduced the font size and hence the length and width. The result is still very readable, and much more portable. I am happy to make my contribution to reducing energy consumption (at ETH alone, think of the burden on Switzerland’s global energy bid of 200+ students carrying the book — as I hope they do — every morning on the buses, trains and trams crisscrossing the city!).

Springer also provides electronic access.

Touch of Class is the textbook developed on the basis of the Introduction to Programming course [2], which I have taught at ETH Zurich for the last ten years. It provides a broad overview of programming, starting at an elementary level (zeros and ones!) and encompassing topics not usually covered in introductory courses, even a short introduction to lambda calculus. You can get an idea of the style of coverage of such topics by looking up the sample chapter on recursion at touch.ethz.ch. Examples of other topics covered include a general introduction to software engineering and software tools. The presentation uses full-fledged object-oriented concepts (including inheritance, polymorphism, genericity) right from the start, and Design by Contract throughout. Based on the “inverted curriculum” ideas on which I published a number of articles, it presents students with a library of reusable components, the Traffic library for graphical modeling of traffic in a city, and builds on this infrastructure both to teach students abstraction (reusing code through interfaces including contracts) and to provide them models of high-quality code for imitation and inspiration.

For more details see the article on this blog that introduced the book when it was first published [3].

References

[1] Bertrand Meyer, Touch of Class: An Introduction to Programming Well Using Objects and Contracts, Springer Verlag, 2nd printing, 2013. The Amazon page is here. See the book’s own page (with slides and other teaching materials, sample chapter etc.) here. (Also available in Russian, see here.)

[2] Einführung in die Programmierung (Introduction to Programming) course, English course page here.

[3] Touch of Class published, article on this blog, 11 August 2009, see [1] here.

VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 7 votes)

The invariants of key algorithms (new paper)

 

I have mentioned this paper before but as a draft. It has now been accepted by ACM’s Computing Surveys and is scheduled to appear in September 2014; the current text, revised from the previous version, is available [1].

Here is the abstract:

Software verification has emerged as a key concern for ensuring the continued progress of information technology. Full verification generally requires, as a crucial step, equipping each loop with a “loop invariant”. Beyond their role in verification, loop invariants help program understanding by providing fundamental insights into the nature of algorithms. In practice, finding sound and useful invariants remains a challenge. Fortunately, many invariants seem intuitively to exhibit a common flavor. Understanding these fundamental invariant patterns could therefore provide help for understanding and verifying a large variety of programs.

We performed a systematic identification, validation, and classification of loop invariants over a range of fundamental algorithms from diverse areas of computer science. This article analyzes the patterns, as uncovered in this study,governing how invariants are derived from postconditions;it proposes a taxonomy of invariants according to these patterns, and presents its application to the algorithms reviewed. The discussion also shows the need for high-level specifications based on “domain theory”. It describes how the invariants and the corresponding algorithms have been mechanically verified using an automated program prover; the proof source files are available. The contributions also include suggestions for invariant inference and for model-based specification.

Reference

[1] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, to appear in September 2014, preliminary text available here.

VN:F [1.9.10_1130]
Rating: 7.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Reading notes: strong specifications are well worth the effort

 

This report continues the series of ICSE 2013 article previews (see the posts of these last few days, other than the DOSE announcement), but is different from its predecessors since it talks about a paper from our group at ETH, so you should not expect any dangerously delusional,  disingenuously dubious or downright deceptive declaration or display of dispassionate, disinterested, disengaged describer’s detachment.

The paper [1] (mentioned on this blog some time ago) is entitled How good are software specifications? and will be presented on Wednesday by Nadia Polikarpova. The basic result: stronger specifications, which capture a more complete part of program functionality, cause only a modest increase in specification effort, but the benefits are huge; in particular, automatic testing finds twice as many faults (“bugs” as recently reviewed papers call them).

Strong specifications are specifications that go beyond simple contracts. A straightforward example is a specification of a push operation for stacks; in EiffelBase, the basic Eiffel data structure library, the contract’s postcondition will read

item =                                          /A/
count = old count + 1

where x is the element being pushed, item the top of the stack and count the number of elements. It is of course sound, since it states that the element just pushed is now the new top of the stack, and that there is one more element; but it is also  incomplete since it says nothing about the other elements remaining as they were; an implementation could satisfy the contract and mess up with these elements. Using “complete” or “strong” preconditions, we associate with the underlying domain a theory [2], or “model”, represented by a specification-only feature in the class, model, denoting a sequence of elements; then it suffices (with the convention that the top is the first element of the model sequence, and that “+” denotes concatenation of sequences) to use the postcondition

model = <x> + old model         /B/

which says all there is to say and implies the original postconditions /A/.

Clearly, the strong contracts, in the  /B/ style, are more expressive [3, 4], but they also require more specification effort. Are they worth the trouble?

The paper explores this question empirically, and the answer, at least according to the criteria used in the study, is yes.  The work takes advantage of AutoTest [5], an automatic testing framework which relies on the contracts already present in the software to serve as test oracles, and generates test cases automatically. AutoTest was applied to both to the classic EiffelBase, with classic partial contracts in the /A/ style, and to the more recent EiffelBase+ library, with strong contracts in the /B/ style. AutoTest is for Eiffel programs; to check for any language-specificity in the results the work also included testing a smaller set of classes from a C# library, DSA, for which a student developed a version (DSA+) equipped with strong model-based contracts. In that case the testing tool was Microsoft Research’s Pex [7]. The results are similar for both languages: citing from the paper, “the fault rates are comparable in the C# experiments, respectively 6 . 10-3 and 3 . 10-3 . The fault complexity is also qualitatively similar.

The verdict on the effect of strong specifications as captured by automated testing is clear: the same automatic testing tools applied to the versions with strong contracts yield twice as many real faults. The term “real fault” comes from excluding spurious cases, such as specification faults (wrong specification, right implementation), which are a phenomenon worth studying but should not count as a benefit of the strong specification approach. The paper contains a detailed analysis of the various kinds of faults and the corresponding empirically determined measures. This particular analysis is for the Eiffel code, since in the C#/Pex case “it was not possible to get an evaluation of the faults by the original developers“.

In our experience the strong specifications are not that much harder to write. The paper contains a precise measure: about five person-weeks to create EiffelBase+, yielding an “overall benefit/effort ratio of about four defects detected per person-day“. Such a benefit more than justifies the effort. More study of that effort is needed, however, because the “person” in the person-weeks was not just an ordinary programmer. True, Eiffel experience has shown that most programmers quickly get the notion of contract and start applying it; as the saying goes in the community, “if you can write an if-then-else, you can write a contract”. But we do not yet have significant evidence of whether that observation extends to model-based contracts.

Model-based contracts (I prefer to call them “theory-based” because “model” means so many other things, but I do not think I will win that particular battle) are, in my opinion, a required component of the march towards program verification. They are the right compromise between simple contracts, which have proved to be attractive to many practicing programmers but suffer from incompleteness, and full formal specification à la Z, which say everything but require too much machinery. They are not an all-or-nothing specification technique but a progressive one: programmers can start with simple contracts, then extend and refine them as desired to yield exactly the right amount of precision and completeness appropriate in any particular context. The article shows that the benefits are well worth the incremental effort.

According to the ICSE program the talk will be presented in the formal specification session, Wednesday, May 22, 13:30-15:30, Grand Ballroom C.

References

[1] Nadia Polikarpova, Carlo A. Furia, Yu Pei, Yi Wei and Bertrand Meyer: What Good Are Strong Specifications?, to appear in ICSE 2013 (Proceedings of 35th International Conference on Software Engineering), San Francisco, May 2013, draft available here.

[2] Bertrand Meyer: Domain Theory: the forgotten step in program verification, article on this blog, see here.

[3] Bernd Schoeller, Tobias Widmer and Bertrand Meyer: Making Specifications Complete Through Models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, available here.

[4] Nadia Polikarpova, Carlo Furia and Bertrand Meyer: Specifying Reusable Components, in Verified Software: Theories, Tools, Experiments (VSTTE ‘ 10), Edinburgh, UK, 16-19 August 2010, Lecture Notes in Computer Science, Springer Verlag, 2010, available here.

[5] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, also available here.

[6] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, in IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, also available here.

[7] Nikolai Tillman and Peli de Halleux, Pex: White-Box Generation for .NET, in Tests And Proofs (TAP 2008), pp. 134-153.

 

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Presentations at ICSE and VSTTE

 

The following presentations from our ETH group in the ICSE week (International Conference on Software Engineering, San Francisco) address important issues of software specification and verification, describing new techniques that we have recently developed as part of our work building EVE, the Eiffel Verification Environment. One is at ICSE proper and the other at VSTTE (Verified Software: Tools, Theories, Experiments). If you are around please attend them.

Julian Tschannen will present Program Checking With Less Hassle, written with Carlo A. Furia, Martin Nordio and me, at VSTTE on May 17 in the 15:30-16:30 session (see here in the VSTTE program. The draft is available here. I will write a blog article about this work in the coming days.

Nadia Polikarpova will present What Good Are Strong Specifications?, written with , Carlo A. Furia, Yu Pei, Yi Wei and me at ICSE on May 22 in the 13:30-15:30 session (see here in the ICSE program). The draft is available here. I wrote about this paper in an earlier post: see here. It describes the systematic application of theory-based modeling to the full specification and verification of advanced software.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Bringing C code to the modern world

The C2Eif translator developed by Marco Trudel takes C code and translates it into Eiffel; it produces not just a literal translation but a re-engineering version exhibiting object-oriented properties. Trudel defended his PhD thesis last Friday at ETH (the examiners were Hausi Muller from Victoria University, Manuel Oriol from ABB, Richard Paige from the University of York,  and me as the advisor). The thesis is not yet available online but earlier papers describing C2Eif are, all reachable from the project’s home page [1].

At issue is what we do with legacy code. “J’ai plus de souvenirs que si j’avais mille ans”, wrote Charles Baudelaire in Les Fleurs du Mal (“Spleen de Paris”). The software industry is not a thousand years old, but has accumulated even more “souvenirs” than

A heavy chest of drawers cluttered with balance-sheets,
Poems, love letters, lawsuits, romances
And heavy locks of hair wrapped in invoices
.

We are suffocating under layers of legacy code heaped up by previous generations of programmers using languages that no longer meet our scientific and engineering standards. We cannot get rid of this heritage; how do we bring it to the modern world? We need automatic tools to wrap it in contemporary code, or, better, translate it into contemporary code. The thesis and the system offer a way out through translation to a modern object-oriented language. It took courage to choose such a topic, since there have been many attempts in the past, leading to conventional wisdom consisting of two strongly established opinions:

  • Plain translation: it has been tried, and it works. Not interesting for a thesis.
  • Object-oriented reengineering: it has been tried, and it does not work. Not realistic for a thesis.

Both are wrong. For translation, many of the proposed solutions “almost work”: they are good enough to translate simple programs, or even some large programs but on the condition that the code avoids murky areas of C programming such as signals, exceptions (setjmp/longjmp) and library mechanisms. In practice, however, most useful C programs need these facilities, so any tool that ignores them is bound to be of conceptual value only. The basis for Trudel’s work has been to tackle C to OO translation “beyond the easy stuff” (as stated in the title of one of the published papers). This effort has been largely successful, as demonstrated by the translation of close to a million lines of actual C code, including some well-known and representative tools such as the Vim editor.

As to OO reengineering, C2Eif makes a serious effort to derive code that exhibits a true object-oriented design and hence resembles, in its structure at least, what a programmer in the target language might produce. The key is to identify the right data abstractions, yielding classes, and specialization properties, yielding inheritance. In this area too, many people have tried to come up with solutions, with little success. Trudel has had the good sense of avoiding grandiose goals and sticking to a number of heuristics that work, such as looking at the signatures of a set of functions to see if they all involve a common argument type. Clearly there is more to be done in this direction but the result is already significant.

Since Eiffel has a sophisticated C interface it is also possible to wrap existing code; some tools are available for that purpose, such as Andreas Leitner’s EWG (Eiffel Wrapper Generator). Wrapping and translating each have their advantages and limitations; wrapping may be more appropriate for C libraries that someone else is still actively updating  (so that you do not have to redo a translation with every new release), and translation for legacy code that you want to take over and bring up to par with the rest of your software. C2Eif is engineered to support both. More generally, this is a practitioner’s tool, devoting considerable attention to the many details that make all the difference between a nice idea and a tool that really works. The emphasis is on full automation, although more parametrization has been added in recent months.

C2Eif will make a big mark on the Eiffel developer community. Try it yourself — and don’t be shy about telling its author about the future directions in which you think the tool should evolve.

Reference

[1] C2Eif project page, here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (13 votes cast)
VN:F [1.9.10_1130]
Rating: +8 (from 8 votes)

LASER summer school: Software for the Cloud and Big Data

The 2013 LASER summer school, organized by our chair at ETH, will take place September 8-14, once more in the idyllic setting of the Hotel del Golfo in Procchio, on the island of Elba in Italy. This is already the 10th conference; the roster of speakers so far reads like a who’s who of software engineering.

The theme this year is Software for the Cloud and Big Data and the speakers are Roger Barga from Microsoft, Karin Breitman from EMC,  Sebastian Burckhardt  from Microsoft,  Adrian Cockcroft from Netflix,  Carlo Ghezzi from Politecnico di Milano,  Anthony Joseph from Berkeley,  Pere Mato Vila from CERN and I.

LASER always has a strong practical bent, but this year it is particularly pronounced as you can see from the list of speakers and their affiliations. The topic is particularly timely: exploring the software aspects of game-changing developments currently redefining the IT scene.

The LASER formula is by now well-tuned: lectures over seven days (Sunday to Saturday), about five hours in the morning and three in the early evening, by world-class speakers; free time in the afternoon to enjoy the magnificent surroundings; 5-star accommodation and food in the best hotel of Elba, made affordable as we come towards the end of the season (and are valued long-term customers). The group picture below is from last year’s school.

Participants are from both industry and academia and have ample opportunities for interaction with the speakers, who typically attend each others’ lectures and engage in in-depth discussions. There is also time for some participant presentations; a free afternoon to discover Elba and brush up on your Napoleonic knowledge; and a boat trip on the final day.

Information about the 2013 school can be found here.

LASER 2012, Procchio, Hotel del Golvo

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Master, please explain: “recursively”

 

With pleasure. To define a concept recursively is to define part of it directly and the rest, if any, recursively.

VN:F [1.9.10_1130]
Rating: 8.9/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

How good are strong specifications? (New paper, ICSE 2013)

 

A core aspect of our verification work is the use of “strong” contracts, which express sophisticated specification properties without requiring a separate specification language: even for advanced properties, there is no need for a separate specification language, with special notations such as those of first-order logic; instead, one can continue to rely, in the tradition of Design by Contract, on the built-in notations of the programming language, Eiffel.

This is the idea of domain theory, as discussed in earlier posts on this blog, in particular [1]. An early description of the approach, part of Bernd Schoeller’s PhD thesis work, was [2]; the next step was [3], presented at VSTTE in 2010.

A new paper to be presented at ICSE in May [3], part of an effort led by Nadia Polikarpova for her own thesis in progress, shows new advances in using strong specifications, demonstrating their expressive power and submitting them to empirical evaluation. The results show in particular that strong specifications justify the extra effort; in particular they enable automatic tests to find significantly more bugs.

A byproduct of this work is to show again the complementarity between various forms of verification, including not only proofs but (particularly in the contribution of two of the co-authors, Yi Wei and Yu Pei, as well as Carlo Furia) tests.

References

[1] Bertrand Meyer: Domain Theory: the forgotten step in program verification, article on this blog, see here.

[2] Bernd Schoeller, Tobias Widmer and Bertrand Meyer: Making Specifications Complete Through Models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, available here.

[3] Nadia Polikarpova, Carlo Furia and Bertrand Meyer: Specifying Reusable Components, in Verified Software: Theories, Tools, Experiments (VSTTE ‘ 10), Edinburgh, UK, 16-19 August 2010, Lecture Notes in Computer Science, Springer Verlag, 2010, available here.

[4] Nadia Polikarpova, Carlo A. Furia, Yu Pei, Yi Wei and Bertrand Meyer: What Good Are Strong Specifications?, to appear in ICSE 2013 (Proceedings of 35th International Conference on Software Engineering), San Francisco, May 2013, draft available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

ESEC/FSE 2013: 18-26 August, Saint Petersburg, Russia

The European Software Engineering Conference takes place every two years in connection with the ACM Foundations of Software Engineering symposium (which in even years is in the US). The next ESEC/FSE  will be held for the first time in Russia, where it will be the first major international software engineering conference ever. It comes at a time when the Russian software industry is ever more present through products and services offered worldwide. See the conference site here. The main conference will be held 21-23 August 2013, with associated events before and after so that the full dates are August 18 to 26. (I am the general chair.)

Other than ICSE, ESEC/FSE is second to none in the quality of the program. We already have four outstanding keynote speakers:  Georges Gonthier from Microsoft Research, Paola Inverardi from L’Aquila in Italy, David Notkin from U. of Washington (in whose honor a symposium will be held as an associated event of ESEC/FSE, chaired by Michael Ernst), and Moshe Vardi of Rice and of course Communications of the ACM.

Saint Petersburg is one of the most beautiful cities in the world, strewn with gilded palaces, canals, world-class museums (not just the Hermitage), and everywhere mementos of the great poets, novelists, musicians and scientists who built up its fame.

Hosted by ITMO National Research University, the conference will be held in the magnificent building of the Razumovsky Palace on the banks of the Moika river; see here.

The Call for Papers has a deadline of March 1st, so there is still plenty of time to polish your best paper and send it to ESEC/FSE. There is also still time to propose worskhops and other associated events. ESEC/FSE will be a memorable moment for the community and we hope to see many of the readers there.

VN:F [1.9.10_1130]
Rating: 9.7/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Hitting on America

 

The study of agile methods is good for your skeptical bones.

“Build the simplest thing that works, then refactor if needed.”

Maybe. Maybe. But what about getting it right the first time around?

Erich Kästner wrote an apposite ditty on this topic [1]:

They tell you it’s OK if first you fail;
OK perhaps — but not so practical.
Not all who for India set sail
Hit on America.

Note

[1] My translation. The original reads:

Irrtümer haben ihren Wert;
Jedoch noch hie und da.
Nicht jeder, der nach Indien fährt,
Endeckt Amerika.

VN:F [1.9.10_1130]
Rating: 6.1/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 7 votes)

Loop invariants: the musical

 

Actually it is not a musical but an extensive survey. I have long been fascinated by the notion of loop invariant, which describes the essence of a loop. Considering a loop without its invariant is like conducting an orchestra without a score.

In this submitted survey paper written with Sergey Velder and Carlo Furia [1], we study loop invariants in depth and describe many algorithms from diverse areas of computer science through their invariants. For simplicity and clarity, the specification technique uses the Domain Theory technique described in an earlier article on this blog [2] (see also [3]). The invariants were verified mechanically using Boogie, a sign of how much more realistic verification technology has become in recent years.

The survey was a major effort (we worked on it for a year and a half); it is not perfect but we hope it will prove useful in the understanding, teaching and verification of important algorithms.

Here is the article’s abstract:

At the heart of every loop, and hence of all significant algorithms, lies a loop invariant: a property ensured by the initialization and maintained by every iteration so that, when combined with the exit condition, it yields the loop’s final effect. Identifying the invariant of every loop is not only a required step for software verification, but also a key requirement for understanding the loop and the program to which it belongs. The systematic study of loop invariants of important algorithms can, as a consequence, yield insights into the nature of software.

We performed this study over a wide range of fundamental algorithms from diverse areas of computer science. We analyze the patterns according to which invariants are derived from postconditions, propose a classification of invariants according to these patterns, and present its application to the algorithms reviewed. The discussion also shows the need for high-level specification and invariants based on “domain theory”. The included invariants and the corresponding algorithms have been mechanically verified using an automatic program prover. Along with the classification and applications, the conclusions include suggestions for automatic invariant inference and general techniques for model-based specification.

 

References

[1] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: analysis, classification, and examples, submitted for publication, December 2012, draft available here.

[2] Domain Theory: the Forgotten Step in Program Verification, article from this blog, 11 April 2012, available here.

[3] Domain Theory: Precedents, article from this blog, 11 April 2012, available here

VN:F [1.9.10_1130]
Rating: 9.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Precedent

Alexander Kogtenkov pointed out to me that precursor work to my papers on the Alias Calculus [1] [2] had been published by John Whaley and Martin Rinard [3]. There are some significant differences; in particular my rules are simpler, and their work is not explicitly presented as a calculus. But many of the basic ideas are the same. The reason I did not cite that paper is simply that I was not aware of it; I am happy to correct the omission.

References

[1] Bertrand Meyer: Towards a Theory and Calculus of Aliasing, in Journal of Object Technology, vol. 9, no. 2, March-April 2010, pages 37-74, available here (superseded by [2])
[2] Bertrand Meyer: Steps Towards a Theory and Calculus of Aliasing, in International Journal of Software and Informatics, 2011, available here (revised and improved version of [1].)
[3] John Whaley and Martin Rinard: Compositional Pointer and Escape Analysis for Java Programs, in POPL 1999, available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

The manhood test

 

I came across an obscure and surprisingly interesting article by Cliff Jones [1], about the history of rely-guarantee but with the following extract:

It was perhaps not fully appreciated at the time of [Hoare’s 1969 axiomatic semantics paper] that the roles of pre and post conditions differ in that a pre condition gives permission to a developer to ignore certain possibilities; the onus is on a user to prove that a component will not be initiated in a state that does not satisfy its pre condition. In contrast a post condition is an obligation on the code that is created according to the specification. This Deontic view carries over [to rely-guarantee reasoning].

I use words more proletarian than “deontic”, but this view is exactly what stands behind the concepts of Design by Contract and has been clearly emphasized in all Eiffel literature ever since the first edition of OOSC. It remains, however, misunderstood outside of the Eiffel community; many people confuse Design by Contract with its opposite, defensive programming. The criterion is simple: if you have a precondition to a routine, are you willing entirely to forsake the corresponding checks (conditionals, exceptions…) in the routine body? If not, you may be using the word “contract” as a marketing device, but that’s all. The courage to remove the checks is the true test of adulthood.

The application of Microsoft’s “Code Contracts” mechanism to the .NET libraries fails that test: a precondition may say “buffer not full” or “insertions allowed”, but the code still checks the condition and triggers an exception. The excuse I have heard is that one cannot trust those unwashed developers. But the methodological discipline is lost. Now let me repeat this using clearer terminology: it’s not deontic.

Reference

[1] Cliff Jones: The role of auxiliary variables in the formal development of concurrent programs, in Reflections on the work of C. A. R. Hoare, eds. Jones, Roscoe and Wood, Springer Lecture Notes in Computer Science,  2009, technical report version available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)