Archive for the ‘Education’ Category.

LASER summer school on software for robotics: last call for registration

Much of the progress in robotics is due to software advances, and software issues remain at the heart of the formidable challenges that remain. The 2017 LASER summer school, held in September in Elba, brings together some of the most prestigious international experts in the area.

The LASER school has established itself as one of the principal forums to discussed advanced software issues. The 2017 school takes place from 9 to 17 September in the idyllic setting of the Hotel del Golfo in Procchio, Elba Island, Italy.

Robotics is progressing at an amazing pace, bringing improvements to almost areas of human activity. Today’s robotics systems rely ever more fundamentally on complex software, raising difficult issues. The LASER 2017 summer school covers both the current state of robotics software technology and open problems. The lecturers are top international experts with both theoretical contributions and major practical achievements in developing robotics systems.
The LASER school is intended for professionals from the industry (engineers and managers) as well as university researchers, including PhD students. Participants learn about the most important software technology advances from the pioneers in the field. The school’s focus is applied, although theory is welcome to establish solid foundations. The format of the school favors extensive interaction between participants and speakers.

We have lined up an impressive roster of speakers from the leading edge of both industry and academia:

Rodolphe Gélin, Aldebaran Robotics
Ashish Kapoor, Microsoft Research
Davide Brugali, University of Bergamo, on Managing software variability in robotic control systems
Nenad Medvidovic, University of Southern California, on Software Architectures of Robotics Systems
Bertrand Meyer, Politecnico di Milano & Innopolis University, on Concurrent Object-Oriented Robotics Software
Issa Nesnas, NASA Jet Propulsion Laboratory, on Experiences from robotic software development for research and planetary flight robots
Hiroshi (“Gitchang”) Okuno, Waseda University & Kyoto University, on Open-Sourced Robot Audition Software HARK: Capabilities and Applications

The school takes place at the magnificent Hotel del Golfo in the Gulf of Procchio, Elba. Along with an intensive scientific program, participants will have time to enjoy the countless natural and cultural riches of this wonderful, history-laden jewel of the Mediterranean.

For more information about the school, the speakers and registration see the LASER site.

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Agile MOOC starts this week

In spite of all the interest in both agile methods and MOOCs (Massive Open Online Courses) there are few courses on agile methods; I know only of some specialized MOOCs focused on a particular language or method.

I produced for EdX, with the help of Marco Piccioni, a new MOOC entitled Agile Software Development. It starts airing today and is supported by exercises and quizzes. The course uses some of the material from my Agile book.

Registration is free and open to anyone at this address.

VN:F [1.9.10_1130]
Rating: 8.0/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: -1 (from 1 vote)

Software for Robotics: 2016 LASER summer school, 10-18 September, Elba

The 2016 session of the LASER summer school, now in its 13th edition, has just been announced. The theme is new for the school, and timely: software for robotics. Below is the announcement.

School site: here

The 2016 LASER summer school will be devoted to Software for Robotics. It takes place from 10 to 18 September in the magnificent setting of the Hotel del Golfo in Procchio, Elba Island, Italy.

Robotics is progressing at an amazing pace, bringing improvements to almost areas of human activity. Today’s robotics systems rely ever more fundamentally on complex software, raising difficult issues. The LASER 2016 summer school both covers the current state of robotics software technology and open problems. The lecturers are top international experts with both theoretical contributions and major practical achievements in developing robotics systems.
The LASER school is intended for professionals from the industry (engineers and managers) as well as university researchers, including PhD students. Participants learn about the most important software technology advances from the pioneers in the field. The school’s focus is applied, although theory is welcome to establish solid foundations. The format of the school favors extensive interaction between participants and speakers.
The speakers include:

  • Joydeep Biswas, University of Massachussetts, on Development, debugging, and maintenance of deployed robots
  • Davide Brugali, University of Bergamo, on Managing software variability in robotic control systems
  • Nenad Medvidovic, University of Southern California, on Software Architectures of Robotics Systems
  • Bertrand Meyer, Politecnico di Milano and Innopolis University, with Jiwon Shin, on Concurrent Object-Oriented Robotics Software: Concepts, Framework and Applications
  • Issa Nesnas, NASA Jet Propulsion Laboratory, on Experiences from robotic software development for research and planetary flight robots
  • Richard Vaughan, Simon Fraser University

Organized by Politecnico di Milano, the school takes place at the magnificent Hotel del Golfo (http://www.hoteldelgolfo.it/) in Golfo di Procchio, Elba. Along with an intensive scientific program, participants will have time to enjoy the natural and cultural riches of this history-laden jewel of the Mediterranean.

For more information about the school, the speakers and registration see here.

.

— Bertrand Meyer

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

How to learn languages

Most people in technology, trade, research or education work in an international environment and need to use a foreign language which they learned at some earlier stage [1]. It is striking to see how awfully most of us perform. International conferences are a particular pain; many speakers are impossible to understand. You just want to go home and read the paper — or, often, not.

Teachers — English teachers in the case of the most commonly used international language — often get the blame, as in “They teach us Shakespeare’s English instead of what we need for today’s life“, but such complaints are  unfounded: look at any contemporary language textbook and you will see that it is all about some Svetlanas or Ulriches or Natsukos meeting or tweeting their friends Cathy and Bill.

It is true, though, that everyone teaches languages the wrong way.

There is only one way to teach languages right: start with the phonetics. Languages were spoken [2] millennia before they ever got written down. The basis of all natural languages is vocal. If you do not pronounce a language right you do not speak that language. It is unconscionable for example that most of us non-native speakers, when using English, still have an accent. We should have got rid of its last traces by age 12.

I cannot understand why people who are otherwise at the vanguard of intellectual achievement make a mess of their verbal expression, seemingly not even realizing there might be a problem. Some mistakes seem to be handed out from generation to generation. Most French speakers of English, for example, pronounce the “ow” of “allow” as in “low”, not “cow” (it took a long time before a compassionate colleague finally rid me of that particular mistake, and I don’t know how many more I may still be making); Italians seem to have a particular fondness for pronouncing as a “v” the “w” of “write”; an so on.

The only place I ever saw that taught languages right was a Soviet school for interpreters. Graduating students of French, having had no exposure to the language before their studies, spoke it like someone coming out of a Métro station. (Actually they spoke more like a grandmother coming out of the Métro, since they had little access to contemporary materials, but that would have been easy to fix.) The trick: they spent their entire first year doing phonetics, getting the “r”and the “u” and so on right, shedding the intonation of their native tongue. That year was entirely devoted to audio practice in a phonetics lab. At the end of it they did not know the meaning of what they were saying,but they said it perfectly. Then came a year of grammar, then a year of conversation. Then came the Métro result. (This is not an apology of the Soviet Union. Someone there just happened to get that particular thing right.)

We should teach everyone this way. There is no reason to tolerate phonetic deviations. If you do not get the sounds exactly as they should be, everything else will be flawed. Take, for example, the “r”. If, like me, you cannot roll your “r”s, then when you try to speak Russian or Italian, even if you think you can get the other sounds right you don’t because  your tongue or palate or teeth are in the wrong place. Another example is the “th” sound in English (two distinct sounds in fact) which I never got right. I can fake it but then something else comes out wrong and I still sound foreign. My high-school teachers — to whom I owe gratitude for so much else — should have tortured me until my “th”s were perfect. True, teaching time is a fixed-pie problem, but I am sure something else could have been sacrificed. Since, for example, I can answer in a blink that seven times nine is sixty-three, I must at some stage have taken the time to memorize it. In retrospect I would gladly sacrifice that element of knowledge, which I can reconstruct when needed, for the ability to roll my “r”s.

Age is indeed critical. While we humans can learn anything at any time, it is a well-known fact (although the reasons behind it remain mysterious) that until puberty we are malleable and can learn languages perfectly.  Witness bilingual and trilingual children; they do not have any accent. But around the time we develop new abilities and desires our brain shuts itself off to that particular skill; from then on we can only learn languages at great pain, with only remote hopes of reaching the proficiency of natives. The time to learn the phonetics of a foreign language, and learn it perfectly, is around the age of nine or ten at the latest. Then, at the age of reason, we should learn the structures — the grammar. Declensions in German, the use of tenses in English, the perfective and imperfective aspects of Russian. Conversation — Svetlana greeting Cathy — can come later if there is time left. Once you have the basics wired into your head, the rest is trivial.

Focusing children on phonetics as the crucial part of learning a language will also help them shine. Like physical appearance, verbal clarity is an enormous advantage. I must not be the only one in conferences who pays far more attention to the content of an article if the speaker has impeccable pronunciation, innate or learned. Syntax and choice of words come next. Of course substance matters; we have all heard top scientists with accents thicker than a Humvee tire and grammar thinner than a summer dress.  Everyone else needs fluency.

Conceivably, someone might object that a year of phonetic drilling is not the most amusing pastime for a 10-year-old. Without even noting that it’s not worse than having to learn to play the violin — where did we ever get the idea that learning should be fun?

As to me, like those who before they die want to get into space, visit the capitals of all countries on earth or reach the top of Mount Everest, I have my dream; it has lesser impact on the environment and depends on me, not on the help of others: just once, I’d like to roll an “r” like a Polish plumber.

Notes

[1] Many native English speakers provide the exception to this observation, since they often do not learn any foreign language beyond “Buon giorno” and “melanzane alla parmigiana”, and hence will probably not see the point of this note.

[2] And motioned. Sign language as practiced by deaf people (informally before it was codified starting from the 17th century on) is also a potential teaching start.

VN:F [1.9.10_1130]
Rating: 9.8/10 (14 votes cast)
VN:F [1.9.10_1130]
Rating: +8 (from 8 votes)

Understanding and assessing Agile: free ACM webinar next Wednesday

ACM is offering this coming Wednesday a one-hour webinar entitled Agile Methods: The Good, the Hype and the Ugly. It will air on February 18 at 1 PM New York time (10 AM West Coast, 18 London, 19 Paris, see here for more cities). The event is free and the registration link is here.

The presentation is based on my recent book with an almost identical title [1]. It will be a general discussion of agile methods, analyzing both their impressive contributions to software engineering and their excesses, some of them truly damaging. It is often hard to separate the beneficial from the indifferent and the plain harmful, because most of the existing presentations are of the hagiographical kind, gushing in admiration of the sacred word. A bit of critical distance does not hurt.

As you can see from the Amazon page, the first readers (apart from a few dissenters, not a surprise for such a charged topic) have relished this unprejudiced, no-nonsense approach to the presentation of agile methods.

Another characteristic of the standard agile literature is that it exaggerates the contrast with classic software engineering. This slightly adolescent attitude is not helpful; in reality, many of the best agile ideas are the direct continuation of the best classic ideas, even when they correct or adapt them, a normal phenomenon in technology evolution. In the book I tried to re-place agile ideas in this long-term context, and the same spirit will also guide the webinar. Ideological debates are of little interest to software practitioners; what they need to know is what works and what does not.

References

[1] Bertrand Meyer, Agile! The Good, the Hype and the Ugly, Springer, 2014, see Amazon page here, publisher’s page here and my own book page here.

VN:F [1.9.10_1130]
Rating: 8.8/10 (13 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Lampsort

 

In support of his view of software methodology, Leslie Lamport likes to use the example of non-recursive Quicksort. Independently of the methodological arguments, his version of the algorithm should be better known. In fact, if I were teaching “data structures and algorithms” I would consider introducing it first.

As far as I know he has not written down his version in an article, but he has presented it in lectures; see [1]. His trick is to ask the audience to give a non-recursive version of Quicksort, and of course everyone starts trying to remove the recursion, for example by making the stack explicit or looking for invertible functions in calls. But his point is that recursion is not at all fundamental in Quicksort. The recursive version is a specific implementation of a more general idea.

Lamport’s version — let us call it Lampsort —is easy to express in Eiffel. We may assume the following context:

a: ARRAY [G -> COMPARABLE]        — The array to be sorted.
pivot: INTEGER                                      —  Set by partition.
picked: INTEGER_INTERVAL            — Used by the sorting algorithm, see below.
partition (i, j: INTEGER)
……..require      — i..j is a sub-interval of the array’s legal indexes:
……..……..i < j
……..……..i >= a.lower
……..……..j <= a.upper
……..do
……..……..… Usual implementation of partition
……..ensure     — The expected effect of partition:
……..……..pivot >= i
……..……..pivot < j
……..……..a [i..j] has been reshuffled so that elements in i..pivot are less than
……..……..or equal to those in pivot+1 .. j.
……..end

We do not write the implementation of partition since the point of the present discussion is the overall algorithm. In the usual understanding, that algorithm consists of doing nothing if the array has no more than one element, otherwise performing a partition and then recursively calling itself on the two resulting intervals. The implementation can take advantage of parallelism by forking the recursive calls out to different processors. That presentation, says Lamport, describes only a possible implementation. The true Quicksort is more general. The algorithm works on a set not_sorted of integer intervals i..j such that the corresponding array slices a [i..j] are the only ones possibly not sorted; the goal of the algorithm is to make not_sorted empty, since then we know the entire array is sorted. In Eiffel we declare this set as:

not_sorted: SET [INTEGER_INTERVAL]

The algorithm initializes not_sorted to contain a single element, the entire interval; at each iteration, it removes an interval from the set, partitions it if that makes sense (i.e. the interval has more than one element), and inserts the resulting two intervals into the set. It ends when not_sorted is empty. Here it is:

……..from                                 — Initialize interval set to contain a single interval, the array’s entire index range:
……..…..create not_sorted.make_one (a.lower |..| a.upper)….         ..……..
……..invariant
……..…..— See below
……..until
……..…..not_sorted.is_empty                                                            — Stop when there are no more intervals in set
……..loop
……..…..picked := not_sorted.item                                                     — Pick an interval from (non-empty) interval set.
……..……if picked.count > 1 then                                                      — (The precondition of partition holds, see below.)
……..……..…..partition (picked.lower, picked.upper)                    — Split it, moving small items before and large ones after pivot.
……..……..…..not_sorted.extend (picked.lower |..| pivot)            — Insert new intervals into the set of intervals: first
……..……....not_sorted.extend (pivot + 1 |..| picked.upper)     — and second.
……..……end
……..…...not_sorted.remove (picked)                                               — Remove interval that was just partitioned.
…….end

Eiffel note: the function yielding an integer interval is declared in the library class INTEGER using the operator |..| (rather than just  ..).

The query item from SET, with the precondition not is_empty,  returns an element of the set. It does not matter which element. In accordance with the Command-Query Separation principle, calling item does not modify the set; to remove the element you have to use the command remove. The command extend adds an element to the set.

The abstract idea behind Lampsort, explaining why it works at all, is the following loop invariant (see [2] for a more general discussion of how invariants provide the basis for understanding loop algorithms). We call “slice” of an array a non-empty contiguous sub-array; for adjacent slices we may talk of concatenation; also, for slices s and t s <= t means that every element of s is less than or equal to every element of t. The invariant is:

a is the concatenation of the members of a set slices of disjoint slices, such that:
– The elements of a are a permutation of its original elements.
– The index range of any member  of slices having more than one element is in not_sorted.
– For any adjacent slices s and t (with s before t), s <= t.

The first condition (conservation of the elements modulo permutation) is a property of partition, the only operation that can modify the array. The rest of the invariant is true after initialization (from clause) with slices made of a single slice, the full array. The loop body maintains it since it either removes a one-element interval from not_sorted (slices loses the corresponding slice) or performs partition with the effect of partitioning one slice into two adjacent ones satisfying s <= t, whose intervals replace the original one in not_sorted. On exit, not_sorted is empty, so slices is a set of one-element slices, each less than or equal to the next, ensuring that the array is sorted.

The invariant also ensures that the call to partition satisfies that routine’s precondition.

The Lampsort algorithm is a simple loop; it does not use recursion, but relies on an interesting data structure, a set of intervals. It is not significantly longer or more difficult to understand than the traditional recursive version

sort (i, j: INTEGER)
……..require
……..……..i <= j
……..……..i >= a.lower
……..……..j <= a.upper
……..do
……..……if j > i then                    — Note that precondition of partition holds.
……..……..…..partition (i, j)         — Split into two slices s and t such that s <= t.
……..……..…..sort (i, pivot)          — Recursively sort first slice.
……..……..…..sort (pivot+1, j)      — Recursively sort second slice.
……..……end……..…..
……..end

Lampsort, in its author’s view, captures the true idea of Quicksort; the recursive version, and its parallelized variants, are only examples of possible implementations.

I wrote at the start that the focus of this article is Lampsort as an algorithm, not issues of methodology. Let me, however, give an idea of the underlying methodological debate. Lamport uses this example to emphasize the difference between algorithms and programs, and to criticize the undue attention being devoted to programming languages. He presents Lampsort in a notation which he considers to be at a higher level than programming languages, and it is for him an algorithm rather than a program. Programs will be specific implementations guided in particular by efficiency considerations. One can derive them from higher-level versions (algorithms) through refinement. A refinement process may in particular remove or restrict non-determinism, present in the above version of Lampsort through the query item (whose only official property is that it returns an element of the set).

The worldview underlying the Eiffel method is almost the reverse: treating the whole process of software development as a continuum; unifying the concepts behind activities such as requirements, specification, design, implementation, verification, maintenance and evolution; and working to resolve the remaining differences, rather than magnifying them. Anyone who has worked in both specification and programming knows how similar the issues are. Formal specification languages look remarkably like programming languages; to be usable for significant applications they must meet the same challenges: defining a coherent type system, supporting abstraction, providing good syntax (clear to human readers and parsable by tools), specifying the semantics, offering modular structures, allowing evolution while ensuring compatibility. The same kinds of ideas, such as an object-oriented structure, help on both sides. Eiffel as a language is the notation that attempts to support this seamless, continuous process, providing tools to express both abstract specifications and detailed implementations. One of the principal arguments for this approach is that it supports change and reuse. If everything could be fixed from the start, maybe it could be acceptable to switch notations between specification and implementation. But in practice specifications change and programs change, and a seamless process relying on a single notation makes it possible to go back and forth between levels of abstraction without having to perform repeated translations between levels. (This problem of change is, in my experience, the biggest obstacle to refinement-based approaches. I have never seen a convincing description of how one can accommodate specification changes in such a framework without repeating the whole process. Inheritance, by the way, addresses this matter much better.)

The example of Lampsort in Eiffel suggests that a good language, equipped with the right abstraction mechanisms, can be effective at describing not only final implementations but also abstract algorithms. It does not hurt, of course, that these abstract descriptions can also be executable, at the possible price of non-optimal performance. The transformation to an optimal version can happen entirely within the same method and language.

Quite apart from these discussions of software engineering methodology, Lamport’s elegant version of Quicksort deserves to be known widely.

References

[1] Lamport video here, segment starting at 0:32:34.
[2] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, September 2014, preliminary text here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +8 (from 8 votes)

New MOOC opens Tuesday

Our online course Computing: Art, Magic, Science, available from EdX, opens this Tuesday (tomorrow, 30 September) at 9 AM Zurich time (and at this time in your area).

An earlier article on this blog described the course, which integrates ten years of experience teaching introductory programming at ETH, and takes advantage of remote-compilation and remote-execution technology from our distributed development research.

You can find the course here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

A gold medal

The French National Research Center (CNRS) has just awarded [1] its annual gold medal to Gérard Berry, a great recognition for an outstanding computer scientist. I first discovered Berry’s work through a brilliant 1976 article, Bottom-Up Computation of Recursive Programs [2], which explained recursion using methods of denotational semantics, and should really figure in collections of classic CS articles. One can hardly understand dynamic programming algorithms, for example, without realizing that they are bottom-up versions of recursive algorithms, through transformations described in the article. I relied extensively on its techniques for my own textbooks, from the advanced concepts of Introduction to the Theory of Programming Languages all the way down to the introductory presentation of recursion in Touch of Class.

Berry then went on to play a founding role in synchronous languages, explaining (in a paper whose reference I don’t have right now) that he went through a revelation of sorts after realizing that the automatic-control industry needed languages completely different from those computer scientists typically love. He created the Estérel language and shepherded it all the way to industrialization, serving as Chief Scientist of Estérel Technologies in the 2000 decade, before coming back to research.

He is also well known in France as a popularizer of informatics (computer science) through a number of successful books aiming at a large audience. He was appointed the first professor of informatics at the Collège de France, and in his inaugural lecture series presented a sweeping description of how information technology advances, based on powerful scientific concepts, permeate today’s world and raise ever new challenges. Hardly could the CNRS have chosen to distinguish a better representative of our discipline.

References

[1] CNRS announcement: here.

[2] Gérard Berry, Bottom-Up Computation of Recursive Programs, in RAIRO (Revue Française d’Automatique, Informatique, Recherche Opérationnelle, Informatique Théorique), tome 10, no R1 (1976), pp. 47-82, available here.

VN:F [1.9.10_1130]
Rating: 5.5/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Record enrollment

First week of semester at ETH. The number of incoming informatics (computer science) students has reached an all-time high: 345; see the (bad-quality) picture from day one of “Introduction to Programming”. (And no, the gender distribution has not changed.) So much for fears that MOOCs and such will displace universities; in fact we have not one but two MOOCs to support the course. Computer science is back in fashion!

Introduction to Programming, ETH Zurich, September 2014, first day of class

 

 

 

 

 

VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Computing: the Art, the Magic, the Science

 

My colleagues and I have just finished recording our new MOOC (online course), an official ETH offering on the EdX platform. The preview is available [1] and the course will run starting in September.

As readers of this blog know, I  have enthusiastically, under the impulsion of Marco Piccioni at ETH, embraced MOOC technology to support and spread our courses. The particular target has been the introduction to programming that I have taught for over a decade at ETH based on the Touch of Class textbook [2]. In February this blog announced [3] the release of our first MOOC, embodying the essentials of our ETH course and making it available not only to ETH students but to the whole world. The course does not just include video lectures: it also supports active student participation through online exercises and programs that can be compiled and tested on the cloud, with no software installation. These advanced features result from our research on support for distributed software development (by Christian Estler and Martin Nordio, with Carlo Furia and others).

This first course was a skunkworks project, which we did entirely on our own without any endorsement from ETH or any of the main MOOC players. We and our students have very much benefited from the consequent flexibility, and the use of homegrown technology relying on the MOODLE framework. We will keep this course for our own students and for any outside participant who prefers a small-scale, “boutique” version. But the EdX brand and EdX’s marketing power will enable us to reach a much broader audience. We want to provide the best introductory computing course on the market and the world needs to know about it. In addition, the full support of media services at ETH  helped us reach a higher standard on the technical side. (For our first course, the home-brewed one, we did not have a studio, so that every time an ambulance drove by — our offices are close to the main Zurich hospital — we had to restart the current take.)

The course’s content is not exactly the same: we have broadened the scope from just programming to computing, although it retains a strong programming component. We introduced additional elements such as an interview with Professor Peter Widmayer of ETH on the basics of computer science theory. For both new material and the topics retained from the first version we have adapted to the accepted MOOC practice of short segments, although we did not always exactly meet the eight-minute upper limit that was suggested to us.

We hope that you, and many newcomers, will like the course and benefit from it.

References

[3] EdX course: Computing: Art, Magic, Science, preview available here.

[2] Bertrand Meyer: Touch of Class: Learning how to Program Well, with Objects and Contracts, Springer Verlag, revised printing, 2013, book page here.

[3] Learning to Program, Online: article from this blog, 3 February 2014, available here.

 

 

VN:F [1.9.10_1130]
Rating: 8.3/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Reading Notes: Single-Entry, Single-Exit

 

It is remarkable that almost half a century after Dijkstra’s goto article, and however copiously and reverently it may be cited, today’s programs (other than in Eiffel) are still an orgy of gotos. There are not called gotos, being described as constructs that break out of a loop or exit a routine in multiple places, but they are gotos all the same. Multiple routine exits are particularly bad since they are in effect interprocedural gotos.

Ian Joyner has just released a simple and cogent summary of why routines should always have one entry and one exit.

References

[1] Ian Joyner: Single-entry, single-exit (SESE) heuristic, available here.

VN:F [1.9.10_1130]
Rating: 9.3/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 5 votes)

Accurately Analyzing Agility

  
Book announcement:

Agile! The Good, the Hype and the Ugly
Bertrand Meyer
Springer, 2014 (just appeared)
Book page: here.
Amazon page: here.
Publisher’s page: here

A few years ago I became fascinated with agile methods: with the unique insights they include; with the obvious exaggerations and plainly wrong advice they also promote; and perhaps most of all with the constant intermingling of these two extremes.

I decided to play the game seriously: I read a good part of the agile literature, including all the important books; I sang the song, became a proud certified Scrum Master; I applied many agile techniques in my own work.

The book mentioned above is the result of that study and experience. It is both a tutorial and a critique.

The tutorial component was, I felt, badly needed. Most of the agile presentations I have seen are partisan texts, exhorting you to genuflect and adopt some agile method as the secret to a better life. Such preaching has a role but professionals know there is no magic in software development.  Agile! describes the key agile ideas objectively, concretely, and as clearly as I could present them. It does not introduce them in a vacuum, like the many agile books that pretend software engineering did not exist before (except for a repulsive idea, the dreaded “waterfall”). Instead, it relates them to many other concepts and results of software engineering, to which they bring their own additions and improvements.

Unfortunately, not all the additions are improvements. Up to now, the field has largely been left (with the exception of Boehm’s and Turner’s 2005 “Guide for the Perplexed“) to propaganda pieces and adoring endorsements. I felt that software developers would benefit more from a reasoned critical analysis. All the more so that agile methods are a remarkable mix of the best and the worst; the book carefully weeds out — in the terminology of the title — the ugly from the hype and the truly good.

Software developers and managers need to know about the “ugly”: awful agile advice that is guaranteed to harm your project. The “hype” covers ideas that have been widely advertised as shining agile contributions but have little relevance to the core goals of software development. The reason it was so critical to identify agile ideas belonging to these two categories is that they detract from the “good”, some of it remarkably good. I would not have devoted a good part of the last five years to studying agile methods if I did not feel they included major contributions to software engineering. I also found that some of these contributions do not get, in the agile literature itself, the explanations and exposure they deserve; I made sure they got their due in the book. An example is the “closed-window rule”, a simple but truly brilliant idea, of immediate benefit to any project.

Software methodology is a difficult topic, on which we still have a lot to learn. I expect some healthy discussions, but I hope readers will appreciate the opportunity to discuss agile ideas in depth for the greater benefit of quality software development.

I also made a point of writing a book that (unlike my last two) is short: 190 pages, including preface, index and everything else.

The table of contents follows; more details and sample chapters can be found on the book page listed above.

Preface
1 OVERVIEW
     1.1 VALUES
     1.2 PRINCIPLES
          Organizational principles
          Technical principles
     1.3 ROLES
     1.4 PRACTICES
          Organizational practices
          Technical practices
     1.5 ARTIFACTS
          Virtual artifacts
          Material artifacts
     1.6 A FIRST ASSESSMENT
          Not new and not good
          New and not good
          Not new but good
          New and good!

2 DECONSTRUCTING AGILE TEXTS
     2.1 THE PLIGHT OF THE TRAVELING SEMINARIST
          Proof by anecdote
          When writing beats speaking
          Discovering the gems
          Agile texts: reader beware!
     2.2 THE TOP SEVEN RHETORICAL TRAPS
          Proof by anecdote
          Slander by association
          Intimidation
          Catastrophism
          All-or-nothing
          Cover-your-behind
          Unverifiable claims
          Postscript: you have been ill-served by the software industry!

&3 THE ENEMY: BIG UPFRONT ANYTHING
     3.1 PREDICTIVE IS NOT WATERFALL
     3.2 REQUIREMENTS ENGINEERING
          Requirements engineering techniques
          Agile criticism of upfront requirements
          The waste criticism
          The change criticism
          The domain and the machine
     3.3 ARCHITECTURE AND DESIGN
          Is design separate from implementation?
          Agile methods and design
     3.4 LIFECYCLE MODELS
     3.5 RATIONAL UNIFIED PROCESS
     3.6 MATURITY MODELS
          CMMI in plain English
          The Personal Software Process
          CMMI/PSP and agile methods
          An agile maturity scale

4 AGILE PRINCIPLES
     4.1 WHAT IS A PRINCIPLE?
     4.2 THE OFFICIAL PRINCIPLES
     4.3 A USABLE LIST
     4.4 ORGANIZATIONAL PRINCIPLES
          Put the customer at the center
          Let the team self-organize
          Maintain a sustainable pace
          Develop minimal software
          Accept change
     4.5 TECHNICAL PRINCIPLES
          Develop iteratively
          Treat tests as a key resource
          Do not start any new development until all tests pass
          Test first
          Express requirements through scenarios

5 AGILE ROLES
     5.1 MANAGER
     5.2 PRODUCT OWNER
     5.3 TEAM
          Self-organizing
          Cross-functional
     5.4 MEMBERS AND OBSERVERS
     5.5 CUSTOMER
     5.6 COACH, SCRUM MASTER
     5.7 SEPARATING ROLES

6 AGILE PRACTICES: MANAGERIAL
     6.1 SPRINT
          Sprint basics
          The closed-window rule
          Sprint: an assessment
     6.2 DAILY MEETING
     6.3 PLANNING GAME
     6.4 PLANNING POKER
     6.5 ONSITE CUSTOMER
     6.6 OPEN SPACE
     6.7 PROCESS MINIATURE
     6.8 ITERATION PLANNING
     6.9 REVIEW MEETING
     6.10 RETROSPECTIVE
     6.11 SCRUM OF SCRUMS
     6.12 COLLECTIVE CODE OWNERSHIP
          The code ownership debate
          Collective ownership and cross-functionality

7 AGILE PRACTICES: TECHNICAL
     7.1 DAILY BUILD AND CONTINUOUS INTEGRATION
     7.2 PAIR PROGRAMMING
          Pair programming concepts
          Pair programming versus mentoring
          Mob programming
          Pair programming: an assessment
     7.3 CODING STANDARDS
     7.4 REFACTORING
          The refactoring concept
          Benefits and limits of refactoring
          Incidental and essential changes
          Combining a priori and a posteriori approaches
     7.5 TEST-FIRST AND TEST-DRIVEN DEVELOPMENT
          The TDD method of software development
          An assessment of TFD and TDD

8 AGILE ARTIFACTS
     8.1 CODE
     8.2 TESTS
     8.3 USER STORIES
     8.4 STORY POINTS
     8.5 VELOCITY
     8.6 DEFINITION OF DONE
     8.7 WORKING SPACE
     8.8 PRODUCT BACKLOG, ITERATION BACKLOG
     8.9 STORY CARD, TASK CARD
     8.10 TASK AND STORY BOARDS
     8.11 BURNDOWN AND BURNUP CHARTS
     8.12 IMPEDIMENT
     8.13 WASTE, TECHNICAL DEBT, DEPENDENCY, DEPENDENCY CHARTS

9 AGILE METHODS
     9.1 METHODS AND METHODOLOGY
          Terminology
          The fox and the hedgehog
     9.2 LEAN SOFTWARE AND KANBAN
          Lean Software’s Big Idea
          Lean Software’s principles
          Lean Software: an assessment
          Kanban
     9.3 EXTREME PROGRAMMING
          XP’s Big Idea
          XP: the unadulterated source
          Key XP techniques
          Extreme Programming: an assessment
     9.4 SCRUM
          Scrum’s Big Idea
          Key Scrum practices
          Scrum: an assessment
     9.5 CRYSTAL
          Crystal’s Big Idea
          Crystal principles
          Crystal: an assessment

10 DEALING WITH AGILE TEAMS
     10.1 GRAVITY STILL HOLDS
     10.2 THE EITHER-WHAT-OR-WHEN FALLACY

11 THE UGLY, THE HYPE AND THE GOOD: AN ASSESSMENT OF THE AGILE APPROACH
     11.1 THE BAD AND THE UGLY
          Deprecation of upfront tasks
          User stories as a basis for requirements
          Feature-based development and ignorance of dependencies
          Rejection of dependency tracking tools
          Rejection of traditional manager tasks
          Rejection of upfront generalization
          Embedded customer
          Coach as a separate role
          Test-driven development
          Deprecation of documents
     11.2 THE HYPED
     11.3 THE GOOD
     11.4 THE BRILLIANT
Bibliography
Index

 

VN:F [1.9.10_1130]
Rating: 8.4/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Crossing the Is and doting on the Ts

 

Last week at the the CSEE&T conference in Klagenfurt (the conference page is here, I gave a keynote), a panel discussed how universities should prepare students for software engineering. Barry Boehm, one of the panelists, stated the following principle, which afterwards he said he had learned from Simon Ramo, co-founder of TRW. In hiring people, he stated, it is better to avoid candidates with an I-shaped profile: narrowly specialized in one topic that they have explored to exhaustion. Better look for a T: someone who has mastered an area in depth and then branched out to learn about many others.

I started playing with the variants. One should avoid the hyphens, or em-dashes, ““: people with a smattering of everything but no detailed knowledge of anything. Boehm said that this is the reason he always argued against establishing such undergraduate majors as systems engineering. A variant of the hyphen is the overline ““: graduates who supposedly are so smart that they can learn anything, but whose actual knowledge is limited to abstract notions.

Along with the T we should consider the “bottom” symbol of denotational semantics: . It corresponds to people who have a broad educational base, for example in mathematics, and have deepened it by focusing on a particular topic. The T and can be combined into an H turned on its side, H on the side: acquiring a solid foundation, specializing, then using that experience to become familiar with new areas.

Extending the permutation group, I am not sure what a “+” profile would be, but in a discussion last night Rustan Leino and Peter Müller suggested the “O”, ability to to circle around topics, and the umlaut, knowing a thing or two; in fact, exactly two.

 

VN:F [1.9.10_1130]
Rating: 8.2/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 4 votes)

Learning to program, online

Introduction to Programming MOOCThe ETH introductory programming course, which I have taught since 2003 and used as the basis for the Springer Touch of Class textbook, is now available as a MOOC: an online course, open to anyone interested [1]. The project was started and led by Marco Piccioni.

The MOOC was released in September; although it was “open” (the other “O”) from the start, we have not publicized it widely until now, since we used it first for the benefit of students taking the course at ETH, and took advantage of this experience to polish it. If you follow the acronym buzz you may say it was first a “SPOC” (Small Personal Online Course”). The experience with our students has been extremely encouraging: they took it as a supplement to the lectures and widely praised its value.We hope that many others will find it useful as well.

MOOCs are hot but they have attracted as much criticism as hype. We have seen the objections: low completion rates, lack of direct human contact, threats to traditional higher education. Two things are clear, though: MOOCs are more than a passing fad; and they have their own pedagogical advantages.

MOOCs are here to stay; one ignores them at one’s own peril. For courses on a popular topic, I believe that in a few years almost everyone will be teaching from a MOOC. Not in the sense of telling students “just follow this course on the Web and come back for the exam“, but as a basis for individual institutions’ courses. For example the students might be told to take the lessons online, then come to in-person lectures (“Flip The Classroom“) or discussion sessions. For any given topic, such as introduction to programming, only a handful of MOOCs will emerge in this role. We would like the ETH course to be one of them.

The question is not just to replace courses and textbooks with an electronic version. MOOCs enrich the learning experience. Introduction to Programming is a particularly fertile application area for taking advantage of technology: the presentation of programming methods and techniques becomes even more effective if students can immediately try out the ideas, compile the result, run it, and see the results on predefined tests. Our course offers many such interactive exercises, thanks to the E4Mooc (Eiffel for MOOCs) framework developed by Christian Estler [2]. This feature has proved to be a key attraction of the course for ETH students. Here for example is an exercise asking you to write a function that determines whether a string is a palindrome (reads identically in both directions):

The program area is pre-filled with a class skeleton where all that remains for you to enter is the algorithm for the relevant function. Then you can click “Compile” and, if there are no compilation errors, “Run” to test your candidate code against a set of predefined test values. One of the benefits for users taking the course is that there is no software to install: everything runs in the cloud, accessible from the browser. Here we see the MOOC not just as a technique for presenting standard material but as an innovative learning tool, opening up pedagogical techniques that were not previously possible.

Besides E4Mooc, our course relies on the Moodle learning platform. Our experience with Moodle has been pleasant; we noticed, for example, that students really liked the Moodle feature enabling them to gain virtual “badges” for good answers, to the point of repeating exercises until they got the badges. For instructors preparing the course, building a MOOC is a huge amount of work (that was not a surprise, people had told us); but it is worth the effort.

We noted that attendance at the lectures increased as compared to previous years. The matter is a natural concern: other than the cold November mornings in Zurich (one of the lectures is at 8 AM) there are many reasons not to show up in class:  the textbook covers much of the material; all the slides are online; so are the slides for exercise sessions (tutorials) and texts of exercises and some earlier exams; lectures were video-recorded in previous years, and students can access the old recordings. Our feeling is that the MOOC makes the course more exciting and gives students want to come to class and hear more.

The MOOC course is not tied to a particular period; you can take it whenever you like. (The current practice of offering MOOCs at fixed times is disappointing: what is the point of putting a course online if participants are forced to fit to a fixed schedule?)

Marco Piccioni and I are now off to our second MOOC, which will be a generalization of the first, covering the basics not just of programming but of computer science and IT overall, and will be available on one of the major MOOC platforms. We will continue to develop the existing MOOC, which directly supports our in-person course, and which we hope will be of use to many other people.

Take the MOOC, or tell a beginner near you to take it, and tell us what you think.

References

[1] Bertrand Meyer, Marco Piccioni and other members of the ETH Chair of Software Engineering: ETH Introduction to Programming MOOC, available here.

[2] H-Christian Estler, E4Mooc demo, available on YouTube: here.

VN:F [1.9.10_1130]
Rating: 9.4/10 (14 votes cast)
VN:F [1.9.10_1130]
Rating: +7 (from 9 votes)

LASER 2014 (Elba, September)

2014 marks the 10-th anniversary (11th edition) of the LASER summer school. The school will be held September 7-14, 2014, and the detailed information is here.

LASER (the name means Laboratory for Applied Software Engineering Research) is dedicated to practical software engineering. The roster of speakers since we started is a who’s who of innovators in the field. Some of the flavor of the school can gathered from the three proceedings volumes published in Springer LNCS (more on the way) or simply by browsing the pages of the schools from previous years.

Usually we have a theme, but to mark this anniversary we decided to go for speakers first; we do have a title, “Leading-Edge Software Engineering”, but broad enough to encompass a wide variety of a broad range of topics presented by star speakers: Harald Gall, Daniel Jackson, Michael Jackson, Erik Meijer (appearing at LASER for the third time!), Gail Murphy and Moshe Vardi. With such a cast you can expect to learn something important regardless of your own primary specialty.

LASER is unique in its setting: a 5-star hotel in the island paradise of Elba, with outstanding food and countless opportunities for exploring the marvelous land, the beaches, the sea, the geology (since antiquity Elba has been famous for its stones and minerals) and the history, from the Romans to Napoleon, who in the 9 months of his reign changed the island forever. The school is serious stuff (8:30 to 13 and 17 to 20 every day), but with enough time to enjoy the surroundings.

Registration is open now.

VN:F [1.9.10_1130]
Rating: 7.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Niklaus Wirth birthday symposium, 20 February, Zurich

In honor of Niklaus Wirth’s 80-th birthday we are organizing a symposium at ETH on February 20, 2014. This is a full-day event with invited talks by:

  • Vint Cerf
  • Hans Eberlé
  • Michael Franz
  • me
  • Carroll Morgan
  • Martin Odersky
  • Clemens Szyperski
  • Niklaus Wirth himself

From the symposium’s web page:

Niklaus Wirth was a Professor of Computer Science at ETH Zürich, Switzerland, from 1968 to 1999. His principal areas of contribution were programming languages and methodology, software engineering, and design of personal workstations. He designed the programming languages Algol W, Pascal, Modula-2, and Oberon, was involved in the methodologies of structured programming and stepwise refinement, and designed and built the workstations Lilith and Ceres. He published several text books for courses on programming, algorithms and data structures, and logical design of digital circuits. He has received various prizes and honorary doctorates, including the Turing Award, the IEEE Computer Pioneer, and the Award for outstanding contributions to Computer Science Education.

Participation is free (including breaks, lunch and the concluding “Apéro”) but space is strictly limited and we expect to run out of seats quickly. So if you are interested (but only if you are certain to attend) please register right away.

Symposium page and access to registration form: here.

VN:F [1.9.10_1130]
Rating: 8.7/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 6 votes)

Informatics education in Europe: Just the facts

 

In 2005 a number of us started Informatics Europe [1], the association of university departments and industrial research labs in computer science in Europe. The association has now grown to 80 members across the entire continent; it organizes the annual European Computer Science Summit and has published a number of influential reports. The last one just came out: Informatics Education in Europe: Institutions, Degrees, Students, Positions, Salaries — Key Data 2008-2012 [2]. The principal author is Cristina Pereira, who collected and organized the relevant data over more than a year; I helped with the preparation of the final text.

At the beginning of Informatics Europe we considered with particular attention  the model of the Computing Research Association [3], which played a crucial role in giving computer science (informatics) its due place in the US academic landscape. Several past and current officers of the CRA,  such as Willy Zwaenepoel, Ed Lazowska, Bob Constable, Andy Bernat, Jeannette Wing, Moshe Vardi and J Strother Moore gave keynotes at our early conferences and we of course asked them for the secrets of their organization’s success. One answer that struck us was the central role played by data collection. Just gathering the  facts, such as degrees and salaries, established for the first time a solid basis for serious discussions. We took this advice to heart and the report is the first result.

Gathering the information is particularly difficult for Europe given the national variations and the absence of centralized statistical data. Even the list of names under which institutions teach informatics in Europe fills a large table in the report. Cristina’s decision was, from the start, to favor quality over quantity: to focus on impeccable data for countries for which we could get it, rather than trying to cover the whole continent with data of variable credibility.

The result is the first systematic repository of basic information on informatics education in Europe: institutions, degrees offered and numbers awarded, student numbers, position titles and definitions, and (a section which will not please everyone) salaries for PhD students, postdocs and professors of various ranks.

The report is a first step; it only makes sense if we can regularly continue to update it and particularly extend it to other countries. But even in its current form (and with the obvious observation that my opinion is not neutral)  I see it as a major step forward for the discipline in Europe. We need an impeccable factual basis to convince the public at large and political decision-makers to give informatics the place it deserves in today’s educational systems.

References

[1] Informatics Europe site, see here.

[2] Cristina Pereira and Bertrand Meyer: Informatics Education in Europe: Institutions, Degrees, Students, Positions, Salaries — Key Data 2008-2012, Informatics Europe report, 30 September 2013, available here.

[3] Computing Research Association (US), see here.

.

VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 2 votes)

Hungarian rotation

The 2013 Informatics Europe “Best Practices in Education” award was devoted, this year, to initiatives for teaching informatics in schools [1].  It was given out last week at the European Computer Science Summit in Amsterdam [2]. Two teams shared it, one from Poland and the other from Romania. Both teams showed excellent projects, but the second was beyond anything I expected.

The project comes from the Hungarian-speaking Sapientia University in Transylvania and is devoted to teaching algorithms visually and “at the same time enhancing intercultural communication” in the region. It illustrates the classical sorting algorithms through folk dances. Quicksort is Hungarian, selection sort is gypsy, merge sort is “Transylvanian-saxon”. I think my favorite is Shell sort [3]. For more, see their YouTube channel [4].

Now  if only they could act the loop invariants [5].

References

[1] 2013 Best Practices in Education award, see here.

[2] 2013 European Computer Science Summit, here.

[3] “Shell sort with Hungarian (Székely) folk dance”, see here.

[4] YouTube Algorythmics channel, here.

[5] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, Septembre 2014, to appear, available here.

VN:F [1.9.10_1130]
Rating: 9.1/10 (10 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 7 votes)

Smaller, better textbook

A new version of my Touch of Class [1] programming textbook is available. It is not quite a new edition but more than just a new printing. All the typos that had been reported as of a few months ago have been corrected.

The format is also significantly smaller. This change is more than a trifle. When а  reader told me for the first time “really nice book, pity it is so heavy!”, I commiserated and did not pay much attention. After twenty people said that, and many more after them, including professors looking for textbooks for their introductory programming classes, I realized it was a big deal. The reason the book was big and heavy was not so much the size of the contents (876 is not small, but not outrageous for a textbook introducing all the fundamental concepts of programming). Rather, it is a technical matter: the text is printed in color, and Springer really wanted to do a good job, choosing thick enough paper that the colors would not seep though. In addition I chose a big font to make the text readable, resulting in a large format. In fact I overdid it; the font is bigger than necessary, even for readers who do not all have the good near-reading sight of a typical 19-year-old student.

We kept the color and the good paper,  but reduced the font size and hence the length and width. The result is still very readable, and much more portable. I am happy to make my contribution to reducing energy consumption (at ETH alone, think of the burden on Switzerland’s global energy bid of 200+ students carrying the book — as I hope they do — every morning on the buses, trains and trams crisscrossing the city!).

Springer also provides electronic access.

Touch of Class is the textbook developed on the basis of the Introduction to Programming course [2], which I have taught at ETH Zurich for the last ten years. It provides a broad overview of programming, starting at an elementary level (zeros and ones!) and encompassing topics not usually covered in introductory courses, even a short introduction to lambda calculus. You can get an idea of the style of coverage of such topics by looking up the sample chapter on recursion at touch.ethz.ch. Examples of other topics covered include a general introduction to software engineering and software tools. The presentation uses full-fledged object-oriented concepts (including inheritance, polymorphism, genericity) right from the start, and Design by Contract throughout. Based on the “inverted curriculum” ideas on which I published a number of articles, it presents students with a library of reusable components, the Traffic library for graphical modeling of traffic in a city, and builds on this infrastructure both to teach students abstraction (reusing code through interfaces including contracts) and to provide them models of high-quality code for imitation and inspiration.

For more details see the article on this blog that introduced the book when it was first published [3].

References

[1] Bertrand Meyer, Touch of Class: An Introduction to Programming Well Using Objects and Contracts, Springer Verlag, 2nd printing, 2013. The Amazon page is here. See the book’s own page (with slides and other teaching materials, sample chapter etc.) here. (Also available in Russian, see here.)

[2] Einführung in die Programmierung (Introduction to Programming) course, English course page here.

[3] Touch of Class published, article on this blog, 11 August 2009, see [1] here.

VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 7 votes)

The invariants of key algorithms (new paper)

 

I have mentioned this paper before but as a draft. It has now been accepted by ACM’s Computing Surveys and is scheduled to appear in September 2014; the current text, revised from the previous version, is available [1].

Here is the abstract:

Software verification has emerged as a key concern for ensuring the continued progress of information technology. Full verification generally requires, as a crucial step, equipping each loop with a “loop invariant”. Beyond their role in verification, loop invariants help program understanding by providing fundamental insights into the nature of algorithms. In practice, finding sound and useful invariants remains a challenge. Fortunately, many invariants seem intuitively to exhibit a common flavor. Understanding these fundamental invariant patterns could therefore provide help for understanding and verifying a large variety of programs.

We performed a systematic identification, validation, and classification of loop invariants over a range of fundamental algorithms from diverse areas of computer science. This article analyzes the patterns, as uncovered in this study,governing how invariants are derived from postconditions;it proposes a taxonomy of invariants according to these patterns, and presents its application to the algorithms reviewed. The discussion also shows the need for high-level specifications based on “domain theory”. It describes how the invariants and the corresponding algorithms have been mechanically verified using an automated program prover; the proof source files are available. The contributions also include suggestions for invariant inference and for model-based specification.

Reference

[1] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, to appear in September 2014, preliminary text available here.

VN:F [1.9.10_1130]
Rating: 7.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Informatics: catch them early

 

Recycled[I occasionally post on the Communications of the ACM blog. It seems that there is little overlap between readers of that blog and the present one, so — much as I know, from software engineering, about the drawbacks of duplication — I will continue to repost articles here when relevant.]

Some call it computer science, others informatics, but they face the same question: when do we start teaching the subject? In many countries where high schools began to introduce it in the seventies, they actually retreated since then; sure, students are shown how to use word processors and spreadsheets, but that’s not the point.

Should we teach computer science in secondary (and primary) school? In a debate at SIGCSE a few years ago, Bruce Weide said in strong words that we should not: better give students the strong grounding in mathematics and especially logic that they will need to become good at programming and CS in general. I found the argument convincing: I teach first-semester introductory programming to 200 entering CS students every year, and since many have programming experience, of highly diverse nature but usually without much of a conceptual basis, I find myself unteaching a lot. In a simple world, high-school teachers would teach students to reason, and we would teach them to program. The world, however, is not simple. The arguments for introducing informatics earlier are piling up:

  • What about students who do not enter CS programs?
  • Many students will do some programming anyway. We might just as well teach them to do it properly, rather than let them develop bad practices and try to correct them at the university level.
  • Informatics is not just a technique but an original scientific discipline, with its own insights and paradigms (see [2] and, if I may include a self-citation, [3]). Its intellectual value is significant for all educated citizens, not just computer scientists.
  • Countries that want to be ahead of the race rather than just consumers of IT products need their population to understand the basic concepts, just as they want everyone, and not just future mathematicians, to master the basics of arithmetics, algebra and geometry.

These and other observations led Informatics Europe and ACM Europe, two years ago, to undertake the writing of a joint report, which has now appeared [1]. The report is concise and makes strong points, emphasizing in particular the need to distinguish education in informatics from a mere training in digital literacy (the mastery of basic IT tools, the Web etc.). The distinction is often lost on the general public and decision-makers (and we will surely have to emphasize it again and again).

The report proposes general principles for both kinds of programs, emphasizing in particular:

  • For digital literacy, the need to teach not just how-tos but also safe, ethical and effective use of IT resources and tools.
  • For informatics, the role of this discipline as a cross-specialty subject, like mathematics.

The last point is particularly important since we should make it clear that we are not just pushing (out of self-interest, as members of any discipline could) for schools to give our specialty a share, but that informatics is a key educational, scientific and economic resource for the citizens of any modern country.

The report is written from a European perspective, but the analysis and conclusions will, I think, be useful in any country.

It does not include any detailed curriculum recommendation, first because of the wide variety of educational contexts, but also because that next task is really work for another committee, which ACM Europe and Informatics Europe are in the process of setting up. The report also does not offer a magic solution to the key issue of bootstrapping the process — by finding teachers to make the courses possible, and courses to justify training the teachers — but points to successful experiences in various countries that show a way to break the deadlock.

The introduction of informatics as a full-fledged discipline in the K-12 curriculum is clearly where the winds of history are blowing. Just as the report was being finalized, the UK announced that it was making CS one of the choices of required scientific topics would become a topic in the secondary school exam on a par with traditional sciences. The French Academy of Sciences recently published its own report on the topic, and many other countries have similar recommendations in progress. The ACM/IE report is a major milestone which should provide a common basis for all these ongoing efforts.

References

[1] Informatics education: Europe Cannot Afford to Miss the Boat,  Report of the joint Informatics Europe & ACM Europe Working Group on Informatics Education,  April 2013,  available here.

[2] Jeannette Wing: Computational Thinking, in Communications of the ACM, vol. 49, no. 3, March 2006, pages 33-35, available here.

[3] Bertrand Meyer: Software Engineering in the Academy, in Computer (IEEE), vol. 34, no. 5, May 2001, pages 28-35, available here.

VN:F [1.9.10_1130]
Rating: 8.2/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 2 votes)

New course partners sought: a DOSE of software engineering education

 

Since 2007 we have conducted, as part of a course at ETH, the DOSE project, Distributed and Outsourced Software Engineering, developed by cooperating student teams from a dozen universities around the world. We are finalizing the plans for the next edition, October to December 2013, and will be happy to welcome a few more universities.

The project consists of building a significant software system collaboratively, using techniques of distributed software development. Each university contributes a number of “teams”, typically of two or three students each; then “groups”, each made up of three teams from different universities, produce a version of the project.

The project’s theme has varied from year to year, often involving games. We make sure that the development naturally divides into three subsystems or “clusters”, so that each group can quickly distribute the work among its teams. An example of division into clusters, for a game project, is: game logic; database and player management; user interface. The page that describes the setup in more detail [1] has links enabling you to see the results of some of the best systems developed by students in recent years.

The project is a challenge. Students are in different time zones, have various backgrounds (although there are minimum common requirements [1]), different mother tongues (English is the working language of the project). Distributed development is always hard, and is harder in the time-constrained context of a university course. (In industry, while we do not like that a project’s schedule slips, we can often survive if it does; in a university, when the semester ends, we have to give students a grade and they go away!) It is typical, after the initial elation of meeting new student colleagues from exotic places has subsided and the reality of interaction sets in, that some groups will after a month, just before the first or second deadline, start to panic — then take matters into their own hands and produce an impressive result. Students invariably tell us that they learn a lot through the course; it is a great opportunity to practice the principles of modern software engineering and to get prepared for the realities of today’s developments in industry, which are in general distributed.

For instructors interested in software engineering research, the project is also a great way to study issues of distributed development in  a controlled setting; the already long list of publications arising from studies performed in earlier iterations [3-9] suggests the wealth of available possibilities.

Although the 2013 project already has about as many participating universities as in previous years, we are always happy to consider new partners. In particular it would be great to include some from North America. Please read the requirements on participating universities given in [1]; managing such a complex process is a challenge in itself (as one can easily guess) and all teaching teams must share goals and commitment.

References

[1] General description of DOSE, available here.

[2] Bertrand Meyer: Offshore Development: The Unspoken Revolution in Software Engineering, in Computer (IEEE), January 2006, pages 124, 122-123, available here.

[3] Bertrand Meyer and Marco Piccioni: The Allure and Risks of a Deployable Software Engineering Project: Experiences with Both Local and Distributed Development, in Proceedings of IEEE Conference on Software Engineering & Training (CSEE&T), Charleston (South Carolina), 14-17 April 2008, available here.

[4] Martin Nordio, Roman Mitin, Bertrand Meyer, Carlo Ghezzi, Elisabetta Di Nitto and Giordano Tamburelli: The Role of Contracts in Distributed Development, in Proceedings of Software Engineering Advances For Offshore and Outsourced Development, Lecture Notes in Business Information Processing 35, Springer-Verlag, 2009, available here.

[5] Martin Nordio, Roman Mitin and Bertrand Meyer: Advanced Hands-on Training for Distributed and Outsourced Software Engineering, in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering – Volume 1, ACM. 2010 available here.

[6] Martin Nordio, Carlo Ghezzi, Bertrand Meyer, Elisabetta Di Nitto, Giordano Tamburrelli, Julian Tschannen, Nazareno Aguirre and Vidya Kulkarni: Teaching Software Engineering using Globally Distributed Projects: the DOSE course, in Collaborative Teaching of Globally Distributed Software Development – Community Building Workshop (CTGDSD — an ICSE workshop), ACM, 2011, available here.

[7] Martin Nordio, H.-Christian Estler, Bertrand Meyer, Julian Tschannen, Carlo Ghezzi, and Elisabetta Di Nitto: How do Distribution and Time Zones affect Software Development? A Case Study on Communication, in Proceedings of the 6th International Conference on Global Software Engineering (ICGSE), IEEE, pages 176–184, 2011, available here.

[8] H.-Christian Estler, Martin Nordio, Carlo A. Furia, and Bertrand Meyer: Distributed Collaborative Debugging, to appear in Proceedings of 7th International Conference on Global Software Engineering (ICGSE), 2013.

[9] H.-Christian Estler, Martin Nordio, Carlo A. Furia, and Bertrand Meyer: Unifying Configuration Management with Awareness Systems and Merge Conflict Detection, to appear in Proceedings of the 22nd Australasian Software Engineering Conference (ASWEC), 2013.

 

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

The education minister who wants fewer students

Picture yourself an incoming education minister in one of the EU countries — Germany, France, UK — who declares that he would like fewer students to graduate and go to university. Imagine the clamor. Even in the US  — where the secretary of education does not in fact have much sway over high schools, managed locally, or universities, controlled by the states or by private organizations — outrage would erupt. Assume for good measure that he criticizes immigrants for pushing their children to educate themselves. Pretty unthinkable.

Johann Schneider-Ammann will be education minister of Switzerland starting in 2013. (The seeming innocuousness of this factual statement belies the uniqueness of the situation: rather than ministries in the usual sense, Switzerland has federal departments, and their management rotates among the seven Federal Counselors — as does, yearly, the presidency of the Confederation. But that topic is for another day.) In a recent interview [1], Schneider-Ammann states that it would be a grave danger to allow any further growth of the percentage of students graduating with the high-school degree, the “Maturity” or  in common parlance Matura (equivalent to the German Abitur and the French Baccalauréat). What is this scary threshold? The graduation rate (France: 84.5%) has in Switzerland grown in the past years from 12% to a whopping 20%. This is where the minister wants to raise a red flag.

Not stopping there, he bitterly complains that immigrant families “want their children to get a Matura at any price”. These immigrant’s conceit has no bound! Can you fathom the insolence: they want to educate their kids!

Were such declarations to come from Mr. Schneider-Ammann’s French counterpart, the streets of Paris would fill up with pitchfork-brandishing youngsters. In the US, no one would even understand the part about immigrants: walk the halls of Berkeley or Stanford and it’s Asians everywhere, since childhood pushed to excellence by their “Chinese mothers” [2] or equivalent.

What is going on? Has Switzerland put in charge of its education the equivalent of (in the US) the would-be Republican candidate Rick Santorum, who infamously proclaimed that “President Obama wants everybody in America to go to college. What a snob!”.

Well, to a point, yes. But Schneider-Ammann, an ETH graduate in electrical engineering, is not an obscurantist and not driven by religious extremism. What he is talking about is the uniqueness of the Swiss educational system, which includes a separation of students at the age of 12 between those who will pursue the Matura, leading to open admission to almost any university program [3], and those channeled to technical tracks with reduced teaching hours and extensive on-the-job training. That system explains the 20% figure: it is not that the other 80% are left to rot; most of them receive a job-oriented qualification and a technical degree. Anyone who has tried to use the services of a plumber in the United States and in Switzerland understands the effect of this system on the quality of professional work (and its price).

Schneider-Ammann (along with, in my experience, most education professionals in Switzerland) has no qualms about defending that system. He says:

Every society is a kind of pyramid with, at the top, the most intellectual people and those with the most predisposition to education, and a wide base made of people with essentially manual skills. We have to include these in our education system as well. This is the only way to remain competitive and innovative and keep everyone, to all the extent possible, in the employment process.

In many circles such an unabated view would be howled down as elitist and paternalistic. The Swiss, however, have little interest in the kind of abstract arguments that are popular among French and German intellectuals. They are pragmatic and look at the results. Schneider-Ammann is not shy in pointing the fingers at other countries:

The more high-shool graduates a country has, the higher its unemployment rate. The relationship is obvious when one looks at the statistics. Highfalutin education plays its part in deindustrialisation. We can see it in Great Britain or France.

The views on immigrants are in the same spirit. Think not of mathematically brilliant Asian students forcefully entering computer science at MIT, but of children of families — for example, as Schneider-Ammann  helpfully explains lest anyone fear ambiguity, “from Germany or France”— which “come to Switzerland and from the experience of their country of origin know hardly anything else than the academic road to education”. Ah, these German mothers who know “hardly anything else” than universities! These French fathers who do not wake up at night worrying whether their daughters will make it to tram driver!

These arguments will, one guesses, make for interesting conversations when he does become minister and gets to meet his foreign colleagues, but they are hard to ignore. What do the statistics actually say?

From OECD documents, e.g. [4], I do not completely understand the British picture (not much of a comment since there are few things I understand about Britain). In  France, where reaching a 80% rate of success at the Baccalauréat was a decades-old political goal and a cause for national celebration when reached a few years ago, the unemployment  is currently 9.5% and shows no sign of abating (that is an optimistic way of putting it). Significantly, high unemployment  is not a fluke resulting from the current  economic crisis but a persistent problem going back at least to the eighties and clearly resulting from structural causes. In Germany, for all its economic strength, the rate is hardly better, having oscillated between 9% and 11% between 2002 and 2007 and remaining around 7% in 2012.

In Switzerland: 3% today, and never above 4% since 2001. (In early 2001 it was around 1.6%!) As to the educational level of the population, the OECD notes [3] that  Switzerland is a top-performing OECD country in reading literacy, maths and sciences with the average student scoring 517.

Correlation is not causation; politicians simplify complex matters, and one can think of a few counter-examples to Mr. Schneider-Ammann’s reasoning (I would like to get a better idea of the Finnish picture, and Korea also seems an interesting case). Still, that reasoning has to be taken seriously. Anyone familiar with the French situation, for example, can only wonder what good it is to give everyone the Bac and access to overcrowded university tracks of sociology, ethnography and psychology. How many ethnographers does a country need? Since the world is selective, selection occurs anyway, if after the  Bac, and most notably in controlling access to the noblest part of the system — the top of Schneider-Ammann’s pyramid: the Grandes Écoles, which are unabashedly elitist. Families in the know understand that the competitive examinations to Polytechnique and the like, not the Bac, are the exams that count. This part of the system, the royal track, works very well; I had the immense privilege of benefiting from it and can testify to its efficiency. It is at least as exclusive as the Swiss Matura+University track. The problem is the rest of the system; those students who do not make it to the top are somehow herded to the Bac and the first years of ordinary universities without the appropriate support and infrastructure.

Thereby lies the difference: the Swiss have no patience for grand speeches about high education, the implicit promise that everyone can become Jean-Paul Sartre or Simone de Beauvoir, and the harsh accompanying reality of a system that hides cruel disparities behind the appearance of universal access. Instead, they bluntly sort out at a tender age [5] the few intellectuals from the many practically-oriented students. The big difference with some other countries is that the latter category is neither duped nor dumped: neither duped into believing they can have an high-flying university education, nor dumped to mend for themselves. The technical and apprenticeship programs are are seriously organized, well-funded, and intended to lead to stable, respected professions.

So far the system has worked incredibly well; the durably low unemployment rate, in sharp contrast with neighboring countries, is only one sign of the country’s success. I do not know how much of the correlation is causation, and how much the Swiss experience is transposable to other countries.

As an intellectual, and someone who gained so much from education in peerless institutions, I do not feel in a good position to decree that others should just learn a trade.  But I find the argument fascinating. The conventional wisdom today is that countries must educate, educate, educate. Usually this is understood as pushing ever more students towards academic tracks. There are a few dissenting voices; Paul Krugman, for example, has regularly warned that automation today threatens low-end intellectual jobs (he comes back to that theme in today’s New York Times [5]). I do not know the answer; but the questions are worth asking, without fear of breaking taboos.

References and notes

[1] «Ich hätte lieber etwas weniger, dafür bessere Maturanden» (I’d rather have somewhat fewer and hence better high-school graduates), interview of Johann Schneider-Ammann (in German),  by René Donzé and Sarah Nowotny, Neue Zürcher Zeitung, 28 October 2012, available here.

[2] Amy Chua, Battle Hymn of the Tiger Mother, Penguin Press, 2011, see summary here.

[3] Law and medicine have a numerus clausus. Students graduating with a Matura can otherwise enter the university and program of their choice.

[4] OECD Better Life index, here. Note that the OECD reports give Switzerland a high-school graduation rate of 90%, at the very top of countries surveyed, meaning that the rate does not distinguish between the various kinds of high-school certificates. High-school graduation rates as discussed in the present article refer to the standard academic tracks, which for Switzerland means the Matura not including professional tracks.

[5] Migration paths exist, for hard-working late bloomers who want to transition from the lower-tier system to the universities.

[6] Paul Krugman: Robots and Robber Barons, New York Times, 9 December 2012, available here.

VN:F [1.9.10_1130]
Rating: 9.8/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +6 (from 8 votes)

Domain Theory: the forgotten step in program verification

 

Program verification is making considerable progress but is hampered by a lack of abstraction in specifications. A crucial step is, almost always, absent from the process; this omission is the principal obstacle to making verification a standard component of everyday software development.

Steps in software verification

In the first few minutes of any introduction to program verification, you will be told that the task requires two artifacts: a program, and a specification. The program describes what executions will do; the specification, what they are supposed to do. To verify software is to ascertain that the program matches the specification: that it does is what it should.

The consequence usually drawn is that verification consists of three steps: write a specification, write a program, prove that the program satisfies the specification. The practical process is of course messier, if only because the first two steps may occur in the reverse order and, more generally, all three steps are often intertwined: the specification and the program influence each other, in particular through the introduction of “verification conditions” into the program; and initial proof attempts will often lead to changes in both the specification and the program. But by and large these are the three accepted steps.

Such a description misses a fourth step, a prerequisite to specification that is essential to a scalable verification process: Domain Theory. Any program addresses a specific domain of discourse, be it the domain of network access and communication for a mobile phone system, the domain of air travel for a flight control system, of companies and shares for a stock exchange system and so on. Even simple programs with a limited scope, such as the computation of the maximum of an array, use a specific domain beyond elementary mathematics. In this example, it is the domain of arrays, with their specific properties: an array has a range, a minimum and maximum indexes in that range, an associated sequence of values; we may define a slice a [i..j], ask for the value associated with a given index, replace an element at a given index and so on. The Domain Theory provides a formal model for any such domain, with the appropriate mathematical operations and their properties. In the example the operations are the ones just mentioned, and the properties will include the axiom that if we replace an element at a certain index i with a value v then access the element at an index j, the value we get is v if i = j, and otherwise the earlier value at j.

The role of a Domain Theory

The task of devising a Domain Theory is to describe such a domain of reference, in the spirit of abstract data types: by listing the applicable operations and their properties. If we do not treat this task as a separate step, we end up with the kind of specification that works for toy examples but quickly becomes unmanageable for real-life applications. Most of the verification literature, unfortunately, relies on such specifications. They lack abstraction since they keep using the lowest-level mathematical objects and constructs, such as numbers and quantified expressions. They are to specification what assembly language is to modern programming.

Dines Bjørner has for a long time advocated a closely related idea, domain engineering; see for example his book in progress [1]. Unfortunately, he does not take advantage of modularization through abstract data types; the book is an example of always-back-to-the-basics specification, resorting time and again to fully explicit specifications based on a small number of mathematical primitives, and as a consequence making formal specification look difficult.

Maximum computed from both ends

As a simple example of modeling through an abstract theory consider an algorithm for computing the maximum of an array. We could use the standard technique that goes through the array one-way, but for variety let us take the algorithm that works from both ends, moving two integer cursors towards each other until they meet.  (This example was used in a verification competition at a recent conference, I forgot which one.) The code looks like this:

Two-way maximum

The specification, expressed by the postcondition (ensure) should state that Result is the maximum of the array; the loop invariant will be closely related to it. How do we express these properties? The obvious way is not the right way. It states the postcondition as something like

k: Z | (ka.lowerka.upper) a [k] ≤ Result

k: Z | ka.lowerka.upper a [k] = Result

In words, Result is at least as large as every element of the array, and is equal to at least one of the elements of the array. The invariant can also be expressed in this style (try it).

The preceding specification expresses the desired property, but it is of an outrageously lower level than called for. The notion of maximum is a general one for arrays over an ordered type. It can be computed through many different algorithms in addition to the one shown above, and exists independently of these algorithms. The detailed, assembly-language-like definition of its properties should not have to be repeated in every case. It should be part of the Domain Theory for the underlying notion, arrays.

A specification at the right level of abstraction

In a Domain Theory for arrays of elements from an ordered set, one of the principal operations is maximum, satisfying the above properties. The definition of maximum through these properties belongs at the Domain Theory level. The Domain Theory should include that definition, independent of any particular computational technique such as two_way_max. Then the routine’s postcondition, relying on this notion from the Domain Theory, becomes simply

Result = a.maximum

The application of this approach to the loop invariant is particularly interesting. If you tried to write it at the lowest level, as suggested above, you should have produced something like this:

a.lowerija.upper

k: Z | kikj ∧ (∀ l: Z | l a.lowerl a.upper a [l] ≤ a [k])

The first clause is appropriate but the rest is horrible! With its nested quantified expressions it gives an impression of great complexity for a property that is in fact straightforward, simple enough in fact to be explained to a 10-year-old: the maximum of the entire array can be found between indexes i and j. In other words, it is also the maximum of the array slice going from i to j. The Domain Theory will define the notion of slice and enable us to express the invariant as just

a.lowerij a.upper — This bounding clause remains

a.maximum = (a [i..j ]).maximum

(where we will write the slice a [i..j ] as a.slice (i, j ) if we do not have mechanisms for defining special syntax). To verify the routine becomes trivial: on loop exit the invariant still holds and i = j, so the maximum of the entire array is given by the maximum of the single-element slice a [i..i ], which is the value of its single element a [i ]. This last property — the maximum of a single-element array is its single value — is independent of the verification of any particular program and should be proved as a little theorem of the Domain Theory for arrays.

The comparison between the two versions is striking: without Domain Theory, we are back to the most tedious mathematical manipulations again and again; simple, clear properties look complicated and obscure. This just for a small example on basic data structures; now think what it will be for a complex application domain. Without a first step of formal modeling to develop a Domain Theory, no realistic specification and verification process is realistic.

Although the idea is illustrated here through examples of individual routines, the construction of a Domain Theory should usually occur, in an object-oriented development process, at the level of a class: the embodiment of an abstract data type, which is at the appropriate level of granularity. The theory applies to objects of a given type, and hence will be used for the verification of all operations of that type. This observation justifies the effort of devising a Domain Theory, since it will benefit a whole set of software elements.

Components of a Domain Theory

The Domain Theory should include the three ingredients illustrated in the example:

  • Operations, modeled as mathematical functions (no side effects of course, we are in the world of specification).
  • Axioms characterizing the defining properties of these operations.
  • Theorems, characterizing other important properties.

This approach is of course nothing else than abstract data types (the same thing, although few people realize it, as object-oriented analysis). Even though ADTs are a widely popularized notion, supported for example by tools such as CafeOBJ [2] and Maude [3], it is generally not taken to its full conclusions; in particular there is too often a tendency to define every new ADT from scratch, rather than building up libraries of reusable high-level mathematical components in the O-O spirit of reuse.

Results, not just definitions

In devising a Domain Theory with the three kinds of ingredient listed above, we should not forget the last one, the theorems! The most depressing characteristic of much of the work on formal specification is that it is long on definitions and short on results, while good mathematics is supposed to be the reverse. I think people who have seriously looked at formal methods and do not adopt them are turned off not so much by the need to use mathematics but by the impression they get little value for it.

That is why Eiffel contracts do get adopted: even if it’s just for testing and debugging, people see immediate returns. It suffices for a programmer to have caught one bug as the violation of a simple postcondition to be convinced for life and lose any initial math-phobia.

Quantifiers are evil

As we go beyond simple contract properties — this argument must be positive, this reference will not be void — the math needs to be at the same level of abstraction to which, as modern programmers, we are accustomed. For example, one should always be wary of program specifications relying directly on quantified expressions, as in the low-level variants of the postcondition and loop invariant of the two_way_max routine.

This is not just a matter of taste, as in the choice in logic [4] between lambda expressions (more low-level but also more immediately understandable) and combinators (more abstract but, for many, more abstruse). We are talking here about the fundamental software engineering problem of scalability; more generally, of the understandability, extendibility and reusability of programs, and the same criteria for their specification and verification. Quantifiers are of course needed to express fundamental properties of a structure but in general should not directly appear in program assertions: as the example illustrated, their level of abstraction is lower than the level of discourse of a modern object-oriented program. If the rule — Quantifiers Considered Harmful — is not absolute, it must be pretty close.

Quantified expressions, “All elements of this structure possess this property” and “Some element of this structure possesses this property” — belong in the description of the structure and not in the program. They should appear in the Domain Theory, not in the verification. If you want to express that a hash table search found an element of key K, you should not write

(Result = Void ∧ (∀ i: Z | i a.loweri a.upper a.item (i).key ≠ K))

(ResultVoid ∧ (∀ i: Z | i a.loweri a.upper a.item (i).key = K ∧ Result = a.item (i))

but

Result /= Void     (Result a.elements_of_key (K))

The quantified expressions will appear in the Domain Theory for the corresponding structure, in the definition of such domain properties as elements_of_key. Then the program’s specification — the contracts to be verified — can rely on concepts that make sense to the programmer; the verification will take advantage of theorems that have been proved independently since they belong to the Domain Theory and do not depend on individual programs.

Even the simplest examples…

Practical software verification requires Domain Theory even in the simplest cases, including those often used as purely academic examples. Perhaps the most common (and convenient) way to explain the notion of loop invariant is Euclid’s algorithm to compute the greatest common divisor (gcd) of two numbers (with a structure remarkably similar to that of two_way_max):
Euclid

I have expressed the postcondition using a concept from an assumed Domain Theory for the underlying problem: gcd, the mathematical function that yields the greatest common divisor of two integers. Many specifications I have seen go back to the basics, with something like this (using \\ for integer remainder):

a \\ Result = 0 b \\ Result = 0   ∀ i: N | (a \\ i = 0) ∧ (b \\ i = 0)  i Result

This is indeed the definition of what it means for Result to be the gcd of a and b (it divides a, it divides b, and is greater than any other integer that also has these two properties). But it makes no sense to include such a detailed mathematical property in the specification of a program element. It belongs in the domain theory, where it will serve as the definition of a function gcd, which we can then use directly in the specification of the program.

Note how the invariant makes the necessity of the Domain Theory approach even more clear: try to express it in the basic mathematical form, not using the function gcd, It can be done, but the result is typical of the high complexity to usefulness ratio of traditional formal specifications mentioned above. Instead, the invariant that I have included in the program text above says exactly what there is to say, clearly and concisely: at each iteration, the gcd of our two temporary values, i and j, is the result that we are seeking, the gcd of the original values a and b. On exit from the loop, when i and j are equal, their common value is that result.

It is also thanks to the Domain Theory modeling that the verification of the program — consisting of proving that the stated property is indeed invariant — will be so simple: as part of the theory, we should have the two little theorems

i > j > 0 gcd (i, j) = gcd (ij, j)
gcd
(i, i) = i

which immediately show the implementation to be correct.

Inside of any big, fat, messy, quantifier-ridden specification there is a simple, elegant and clear Domain-Theory-based specification desperately trying to get out. Find it and use it.

From Domain Theory to domain library

One of the reasons most people working on program verification have not used the division into levels of discourse described here, with a clear role for developing a Domain Theory, is that they lack the appropriate notational support. Mathematical notation is of course available, but we are talking about programs a general verification framework cannot resort to a new special notation for every new application domain.

This is one of the places where Eiffel provides a consistent solution, through its seamless approach to integrating programs and specifications in a single notation. Thanks to mechanisms such as deferred classes (classes that describe concepts through detailed specifications without committing to an implementation), Eiffel is as much for specification as for design and implementation; a Domain Theory can be expressed though a set of deferred Eiffel classes, which we may call a domain library. The classes in a domain library should not just be deferred, meaning devoid of implementation; they should in addition describe stateless operations only — queries, not commands — since they are modeling purely mathematical concepts.

An earlier article in this blog [5] outlined the context of our verification work: the EVE project (Eiffel Verification Environment), a practical approach to integrating software verification in the day-to-day practice of modern software development, with the slogan ““Verification As a Matter Of Course”. In this project we have applied the idea of Domain Theory by building a domain library covering fundamental concepts of set theory, including functions and relations. This is the Mathematical Model Library (MML) [6, 7], which we use to verify the new data structure library EiffelBase 2 using specifications at the appropriate level of abstraction.

MML is in fact useful for the specification of a wide variety of programs, since almost every application area can benefit from the general concepts of set, subset, relation and such. But to cover a specific application domain, say flight traffic control, MML will generally not suffice; you will need to devise a Domain Theory that mathematically models the target domain, and may express it in the form of a domain library written in the same general spirit as MML: all deferred, stateless, focused on high-level abstractions.

It is one of the attractions of Eiffel that you can express such a theory and library in the same notation as the programs that will use it — more precisely in a subset of that notation, since the specification classes do not need the imperative constructs of the language such as instructions and attributes. Then both the development process and the verification use a seamlessly integrated set of notations and techniques, and all use the same tools from a modern IDE, in our case EiffelStudio, for browsing, editing, working with graphical repreentation, metrics etc.

DSL libraries for specifications

A mechanism to express Domain Theories is to a general specification mechanism essentially like a Domain Specific Language (DSL) is to a general programming language: a specialization for a particular domain. Domain libraries make the approach practical by:

  • Embedding the specification language in the programming language.
  • Fundamentally relying on reuse, in the best spirit of object technology.

This approach is in line with the one I presented for handling DSLs in an earlier article of this blog [8] (thanks, by the way, for the many comments received, some of them posted here and some on Facebook and LinkedIn where the post triggered long discussions). It is usually a bad idea to invent a new language for a new application domain. A better solution is to rely on libraries, by taking advantage of the power of object-oriented mechanisms to model (in domain libraries) and implement (for DSLs) the defining features of such a domain, and to make the result widely reusable. The resulting libraries are purely descriptive in the case of a domain library expressing a Domain Theory, and directly usable by programs in the case of a library embodying a DSL, but the goal is the same.

A sound and necessary engineering practice

Many ideas superficially look similar to Domain Theory: domain engineering as mentioned above, “domain analysis” as widely discussed in the requirements literature, model-driven development, abstract data type specification… They all start from some of the same observations, but  Domain Theory as described in this article is something different: a systematic approach to modeling an arbitrary application domain mathematically, which:

  • Describes the concepts through applicable operations, axioms and (most importantly) theorems.
  • Expresses these elements in an applicative (side-effect free, i.e. equivalent to pure mathematics) subset of the programming language, for direct embedding in program specifications.
  • Relies on the class mechanism to structure the results.
  • Collects the specifications into specification libraries and promotes the reuse of specifications in the same way we promote software reuse.
  • Uses the combination of these techniques to ensure that program specifications are at a high level of abstraction, compatible with the programmers’ view of their software.
  • Promotes a clear and effective verification process.

The core idea is in line with standard engineering practices in disciplines other than software: to build a bridge, a car or a chip you need first to develop a sound model of the future system and its environment, using any useful models developed previously rather than always going back to elementary textbook mathematics.

It seems in fact easier to justify doing Domain Analysis than to justify not doing it. The power of expression and abstraction of our programs has grown by leaps and bounds; it’s time for our specifications to catch up.

References

[1] Dines Bjørner: From Domains to Requirements —The Triptych Approach to Software Engineering, “to be submitted to Springer”, available here.

[2] Kokichi Futatsugi and others: CafeObj page, here.

[3] José Meseguer and others: Maude publication page, here.

[4] J. Roger Hindley, J. P. Seldin: Introduction to Combinators and l-calculus, Cambridge University Press, 1986.

[5] Verification As a Matter Of Course, earlier article on this blog (March 2010), available here.

[6] Bernd Schoeller, Tobias Widmer and Bertrand Meyer. Making specifications complete through models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, pages 48-70, available here.

[7] Nadia Polikarpova, Carlo A. Furia and Bertrand Meyer: Specifying Reusable Components, in VSTTE’10: Verified Software: Theories, Tools and Experiments, Edinburgh, August 2010, Lecture Notes in Computer Science, Springer-Verlag, available here.

[8] Never Design a Language, earlier article on this blog (January 2012), available here.

VN:F [1.9.10_1130]
Rating: 9.3/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +7 (from 7 votes)

TOOLS 2012, “The Triumph of Objects”, Prague in May: Call for Workshops

Workshop proposals are invited for TOOLS 2012, The Triumph of Objectstools.ethz.ch, to be held in Prague May 28 to June 1. TOOLS is a federated set of conferences:

  • TOOLS EUROPE 2012: 50th International Conference on Objects, Models, Components, Patterns.
  • ICMT 2012: 5th International Conference on Model Transformation.
  • Software Composition 2012: 10th International Conference.
  • TAP 2012: 6th International Conference on Tests And Proofs.
  • MSEPT 2012: International Conference on Multicore Software Engineering, Performance, and Tools.

Workshops, which are normally one- or two-day long, provide organizers and participants with an opportunity to exchange opinions, advance ideas, and discuss preliminary results on current topics. The focus can be on in-depth research topics related to the themes of the TOOLS conferences, on best practices, on applications and industrial issues, or on some combination of these.

SUBMISSION GUIDELINES

Submission proposal implies the organizers’ commitment to organize and lead the workshop personally if it is accepted. The proposal should include:

  •  Workshop title.
  • Names and short bio of organizers .
  • Proposed duration.
  •  Summary of the topics, goals and contents (guideline: 500 words).
  •  Brief description of the audience and community to which the workshop is targeted.
  • Plans for publication if any.
  • Tentative Call for Papers.

Acceptance criteria are:

  • Organizers’ track record and ability to lead a successful workshop.
  •  Potential to advance the state of the art.
  • Relevance of topics and contents to the topics of the TOOLS federated conferences.
  •  Timeliness and interest to a sufficiently large community.

Please send the proposals to me (Bertrand.Meyer AT inf.ethz.ch), with a Subject header including the words “TOOLS WORKSHOP“. Feel free to contact me if you have any question.

DATES

  •  Workshop proposal submission deadline: 17 February 2012.
  • Notification of acceptance or rejection: as promptly as possible and no later than February 24.
  • Workshops: 28 May to 1 June 2012.

 

VN:F [1.9.10_1130]
Rating: 7.3/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Concurrency seminars

Ever more people are realizing that concurrency is at the center of IT challenges for the next decades. Concurrent programming remains as hard as ever; we have put together a one-day seminar that helps understand the concepts and build successful concurrent applications. The sessions for the first few months of 2012 are:

  • Palo Alto (February 15)
  • Zurich, (March 2)
  • London (March 22)
  • Paris (May 10)
  • Stockholm (June 15)
  • Seattle (July 20)

and the seminar program is available here.

 

 

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

John McCarthy

John McCarthyJohn McCarthy, who died last week at the age of 84, was one of the true giants of computer science. Most remarkable about his contributions are their diversity, their depth, and how they span both theory and practice.

To talk about him it is necessary first to dispel an unjustly negative connotation. McCarthy was one of the founders of the discipline of artificial intelligence, its most forceful advocate and the inventor of its very name. In the “AI Winter” episode of the late 1970s and 1980s, that name suffered some disrepute as a result of a scathing report by James Lighthill blaming AI researchers for over-promising. In fact the promoters of AI may not have delivered exactly what they announced (who can accurately predict science?); but what they delivered is astounding. Many breakthroughs in computer science, both in theory (advances in lambda calculus and the theory of computation) and in the practice of programming (garbage collection, functional programming languages), can directly be traced to work in AI. Part of the problem is a phenomenon that I heard John McCarthy himself describe:  “As soon as it works, no one calls it AI any more.” Automatic garbage collection was once advanced artificial intelligence; now it is just an algorithm that makes sure your smartphone does not freeze up. In a different field, we have become used to computers routinely beating chess champions, a feat that critics of AI once deemed unthinkable.

The worst over-promises came not from researchers in the field such as McCarthy, who understood the difficulties, but from people like Herbert Simon, more of a philosopher, who in 1965 wrote that “machines will be capable, within twenty years, of doing any work a man can do.” McCarthy’s own best-known over-promise was to take up David Levy on his 1968 bet that no computer would be able to beat him within ten years. But McCarthy was only mistaken in under-estimating the time span: Deep Blue eventually proved him right.

The word that comes most naturally to mind when thinking about McCarthy is “brilliant.” He belonged to that category of scientists who produce the fundamental insights before anyone else, even if they do not always have the patience to finalize the details. The breathtaking paper that introduced Lisp [1] is labeled “Part 1”; there was never a “Part 2.” (Of course we have a celebrated example in computer science, this one from a famously meticulous author, of a seven-volume treaty which never materialized in full.) It was imprudent to announce a second part, but the first was enough to create a whole new school of programming. The Lisp 1.5 manual [2], published in 1962, was another masterpiece; as early as page 13 it introduces — an unbelievable feat, especially considering that the program takes hardly more than half a page — an interpreter for the language being defined, written in that very language! The more recent reader can only experience here the kind of visceral, poignant and inextinguishable jealously that overwhelms us the first time we realize that we will never be able to attend the première of Don Giovanni at the Estates Theater in Prague on 29 October, 1787 (exactly 224 years ago yesterday — did you remember to celebrate?). What may have been the reaction of someone in “Data Processing,” such as it was in 1962, suddenly coming across such a language manual?

These years, 1959-1963, will remain as McCarthy’s Anni Mirabiles. 1961 and 1962 saw the publication of two visionary papers [3, 4] which started the road to modern program verification (and where with the benefit of hindsight it seems that he came remarkably close to denotational semantics). In [4] he wrote

Instead of debugging a program, one should prove that it meets its specifications, and this proof should be checked by a computer program. For this to be possible, formal systems are required in which it is easy to write proofs. There is a good prospect of doing this, because we can require the computer to do much more work in checking each step than a human is willing to do. Therefore, the steps can be bigger than with present formal systems.

Words both precise and prophetic. The conclusion of [3] reads:

It is reasonable to hope that the relationship between computation and mathematical logic will be as fruitful in the next century as that between analysis and physics in the last. The development of this relationship demands a concern for both applications and for mathematical elegance.

“A concern for both applications and mathematical elegance” is an apt characterization of McCarthy’s own work. When he was not busy designing Lisp, inventing the notion of meta-circular interpreter and developing the mathematical basis of programming, he was building the Lisp garbage collector and proposing the concept of time-sharing. He also played, again in the same period, a significant role in another milestone development, Algol 60 — yet another sign of his intellectual openness and versatility, since Algol is (in spite of the presence of recursion, which McCarthy championed) an imperative language at the antipodes of Lisp.

McCarthy was in the 1960s and 70s the head of the Artificial Intelligence Laboratory at Stanford. For some reason the Stanford AI Lab has not become as legendary as Xerox PARC, but it was also the home to early versions of  revolutionary technologies that have now become commonplace. Email, which hardly anyone outside of the community had heard about, was already the normal way of communicating, whether with a coworker next door or with a researcher at MIT; the Internet was taken for granted; everyone was using graphical displays and full-screen user interfaces; outside, robots were playing volley-ball (not very successfully, it must be said); the vending machines took no coins, but you entered your login name and received a bill at the end of the month, a setup which never failed to astonish visitors; papers were printed with sophisticated fonts on a laser printer (I remember a whole group reading the successive pages of Marvin Minsky’s  frames paper [5] directly on the lab’s XGP, Xerox Graphics Printer, as  they were coming out, one by one, straight from MIT). Arthur Samuel was perfecting his checkers program. Those who were not programming in Lisp were hooked to SAIL, “Stanford Artificial Intelligence Language,” an amazing design which among other insights convinced me once and for all that one cannot seriously deal with data structures without the benefit of an automatic serialization mechanism. The building itself, improbably set up amid the pastures of the Santa Cruz foothills, was razed in the eighties and the lab moved to the main campus, but the spirit of these early years lives on.

McCarthy ran the laboratory in an open and almost debonair way; he was a legend and somewhat intimidating, but never arrogant and in fact remarkably approachable. I took the Lisp course from him; in my second or third week at Stanford, I raised my hand and with the unflappable assurance of the fully ignorant slowly asked a long question: “In all the recursive function definitions that you have shown so far, termination was obvious because there is some ‘n’ that decreases for every recursive call, and we treat the case ‘n = 0’ or ‘n = 1’ in a special, non-recursive way. But things won’t always be so simple. Is there some kind of grammatical criterion on Lisp programs that we could use to ascertain whether a recursive definition will always lead to a terminating computation?” There was a collective gasp from the older graduate students in the audience, amazed that a greenhorn would have the audacity to interrupt the course with such an incompetent query. But instead of dismissing me, McCarthy proceeded, with a smile, to explain the basics of undecidability. He had the same attitude in the many seminars that he taught, often on topics straddling computer science and philosophy, in a Socratic style where every opinion was welcome and no one was above criticism.

He also had a facetious side. At the end of a talk by McCarthy at SRI, Tony Hoare, who was visiting for a few days, asked a question; McCarthy immediately rejoined that he had expected that question, summoned to the stage a guitar-carrying researcher from the AI Lab, and proceeded with the answer in the form of a prepared song.

The progress of science and technology is a collective effort; it takes many people to turn new insights into everyday reality. The insights themselves come from a few individuals, a handful in every generation. McCarthy was one of these undisputed pioneers.

 

References

[1] John McCarthy: Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I, in Communications of the ACM, vol. 3, no. 4, 1960, pages 184-195.

[2] John McCarthy, Paul W. Abrahams, Daniel J. Edwards, Timothy P. Hart, Michael I. Levin, LISP 1.5 Programmer’s Manual, MIT, 1962. Available at Amazon  External Linkand also as a PDF External Link.

[3] John McCarthy: A Basis for a Mathematical Theory of Computation, first version in Proc. Western Joint Computer Conference, 1961, revised version in Computer Programming and Formal Systems, eds. P. Braffort and D. Hirschberg, North Holland, 1963. Available in various places on the Web, e.g. here External Link.

[4] John McCarthy: Towards a Mathematical Science of Computation, in IFIP Congress 1962, pages 21-28, available in various places on the Web, e.g. here External Link.

[5] Marvin Minsky:  A Framework for Representing Knowledge, MIT-AI Laboratory Memo 306, June 1974, available here External Link.

 

(This article was first published on my ACM blog.  I am resuming regular Monday publication.)

VN:F [1.9.10_1130]
Rating: 10.0/10 (35 votes cast)
VN:F [1.9.10_1130]
Rating: +30 (from 30 votes)

Agile methods: the good, the bad and the ugly

It was a bit imprudent last Monday to announce the continuation of the SCOOP discussion for this week; with the TOOLS conference happening now, with many satellite events such as the Eiffel Design Feast of the past week-end and today’s “New Eiffel Technology Community” workshop, there is not enough time for a full article. Next week might also be problematic. The SCOOP series will resume, but in the meantime I will report on other matters.

As something that can be conveniently typed in while sitting in the back of the TOOLS room during fascinating presentations, here is a bit of publicity for the next round of one-day seminars for industry — “Compact Course” is the official terminology — that I will be teaching at ETH in Zurich next November (one in October), some of them with colleagues. It’s the most extensive session that we have ever done; you can see the full programs and registration information here.

  • Software Engineering for Outsourced and Distributed Development, 27 October 2011
    Taught with Peter Kolb and Martin Nordio
  • Requirements Engineering, 17 November
  • Software Testing and Verification: state of the art, 18 November
    With Carlo Furia and Sebastian Nanz
  • Agile Methods: the Good, the Bad and the Ugly, 23 November
  • Concepts and Constructs of Concurrent Computation, 24 November
    With Sebastian Nanz
  • Design by Contract, 25 November

The agile methods course is new; its summary reads almost like a little blog article, so here it is.

Agile methods: the Good, the Bad and the Ugly

Agile methods are wonderful. They’ll give you software in no time at all, turn your customers and users into friends, catch bugs before they catch you, change the world, and boost your love life. Do you believe these claims (even excluding the last two)? It’s really difficult to form an informed opinion, since most of the presentations of eXtreme Programming and other agile practices are intended to promote them (and the consultants to whom they provide a living), not to deliver an objective assessment.

If you are looking for a guru-style initiation to the agile religion, this is not the course for you. What it does is to describe in detail the corpus of techniques covered by the “agile” umbrella (so that you can apply them effectively to your developments), and assess their contribution to software engineering. It is neither “for” nor “against” agile methods but fundamentally descriptive, pedagogical, objective and practical. The truth is that agile methods include some demonstrably good ideas along with some whose benefits are at best dubious. In addition (and this should not be a surprise) they cannot make the fundamental laws of software engineering go away.

Agile methods have now been around for more than a decade, during which many research teams, applying proven methods of experimental science, have performed credible empirical studies of how well the methods really work and how they compare to more traditional software engineering practices. This important body of research results, although not widely known, is critical to managers and developers in industry for deciding whether and how to use agile development. The course surveys these results, emphasizing the ones most directly relevant to practitioners.

A short discussion session will enable participants with experience in agile methods to share their results.

Taking this course will give you a strong understanding of agile development, and a clear view of when, where and how to apply them.

Schedule

Morning session: A presentation of agile methods

  • eXtreme Programming, pair programming, Scrum, Test-Driven Development, continuous integration, refactoring, stakeholder involvement, feature-driven development etc.
  • The agile lifecycle.
  • Variants: lean programming etc.

Afternoon session (I): Assessment of agile methods

  • The empirical software engineering literature: review of available studies. Assessment of their value. Principles of empirical software engineering.
  • Agile methods under the scrutiny of empirical research: what helps, what harms, and what has no effect? How do agile methods fare against traditional techniques?
  • Examples: pair programming versus code reviews; tests versus specifications; iterative development versus “Big Upfront Everything”.

Afternoon session (II): Discussion and conclusion

This final part of the course will present, after a discussion session involving participants with experience in agile methods, a summary of the contribution of agile methods to software engineering.

It will conclude with advice for organizations involved in software development and interested in applying agile methods in their own environment.

Target groups

CIOs; software project leaders; software developers; software testers and QA engineers.

VN:F [1.9.10_1130]
Rating: 8.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 5 votes)

Concurrent programming is easy

EiffelStudio 6.8, released last month, contains the first official implementation of the SCOOP programming model for concurrent programming. This is an important milestone; let me try to explain why.

Concurrency challenging us

Concurrency is the principal stumbling block in the progress of programming. Do not take just my word for it:

  • Intel: “Multi-core processing is taking the industry on a fast-moving and exciting ride into profoundly new territory. The defining paradigm in computing performance has shifted inexorably from raw clock speed to parallel operations and energy efficiency” [1].
  • Rick Rashid (head of Microsoft Research):  “Multicore processors represent one of the largest technology transitions in the computing industry today, with deep implications for how we develop software.” [2].
  • Bill Gates: “Multicore: This is the one which will have the biggest impact on us. We have never had a problem to solve like this. A breakthrough is needed in how applications are done on multicore devices.” [3]
  • David Patterson: “Industry has basically thrown a Hail Mary. The whole industry is betting on parallel computing. They’ve thrown it, but the big problem is catching it.” [4]
  • Gordon Bell: “I’m skeptical until I see something that gives me some hope…  the machines are here and we haven’t got it right.” [4].

What has happened? Concurrency  used to be a highly specialized domain of interest to a small minority of programmers building operating systems and networking systems and database engines. Just about everyone else could live comfortably pretending that the world was sequential. And then suddenly we all need to be aware of concurrency. The principal reason is the end of Moore’s law as we know it [5].

The end of Moore's law as we know it

This chart show that we can no longer rely on the automatic and regular improvement to our programs’ performance, roughly by a factor of two every two years, thanks to faster chips. The free lunch is over; continued performance increases require taking advantage of concurrency, in particular through multithreading.

Performance is not the only reason for getting into concurrency. Another one is user convenience: ever since the first browser showed that one could write an email and load a Web page in the same window, users have been clamoring for multithreaded applications. Yet another source of concurrency requirements is the need to produce Internet and Web applications.

How do programmers write these applications? The almost universal answer relies on threading mechanisms, typically offered through some combination of language and library mechanisms: Java Threads, .NET threading, POSIX threads, EiffelThreads. The underlying techniques are semaphores and mutexes: nineteen-sixties vintage concepts, rife with risks of data races (access conflicts to a variable or resource, leading to crashes or incorrect computations) and deadlocks (where the system hangs). These risks are worse than the classical bugs of sequential programs because they are very difficult to detect through testing.

Ways to tame the beast

Because the need is so critical, the race is on — a “frantic” race in the words of a memorable New York Times article by John Markoff [4] — to devise a modern programming framework that will bring concurrent programming under control. SCOOP is a contender in this battle. In this post and the next I will try to explain why we think it is exactly what the world needs to tame concurrency.

The usual view, from which SCOOP departs, is that concurrent programming is intrinsically hard and requires a fundamental change in the way programmers think. Indeed some of the other approaches that have attracted attention imply radical departures from accepted programming paradigm:

  • Concurrency calculi such as CSP [6, 7], CCS [8] and the π-Calculus [9] define  high-level mathematical frameworks addressing concurrency, but they are very far from the practical concerns of programmers. An even more serious problem is that they focus on only some aspects of programming, but being concurrent is only one property of a program, among many others (needing a database, relying on graphical user interface, using certain data structures, perform certain computations…). We need mechanisms that integrate concurrency with all the other mechanisms that a program uses.
  • Functional programming languages have also offered interesting idioms for concurrency, taking advantage of the non-imperative nature of functional programming. Advocacy papers have argued for Haskell [10 and Erlang [11] in this role. But should the world renounce other advances of modern software engineering, in particular object-oriented programming, for the sake of these mechanisms? Few people are prepared to take that step, and (as I have discussed in a detailed article [12]) the advantages of functional programming are counter-balanced by the superiority of the object-oriented model in its support for the modular construction of realistic systems.

What if we did not have to throw away everything and relearn programming from the ground up for concurrency? What if we could retain the benefits of five decades of software progress, as crystallized in modern object-oriented programming? This is the conjecture behind SCOOP: that we can benefit from all the techniques we have learned to make our software reliable, extendible and reusable, and add concurrency to the picture in an incremental way.

From sequential to concurrent

A detailed presentation of SCOOP will be for next Monday, but let me give you a hint and I hope whet your appetite by describing how to move a typical example from sequential to concurrent. Here is a routine for transferring money between two accounts:

transfer (amount: INTEGER ; source, target: ACCOUNT)
               -- Transfer amount dollars from source to target.
        require
               enough: source·balance >= amount
        do
         source·withdraw (amount)
         target·deposit (amount)
        ensure
               removed: source·balance = old source·balance – amount
               added: target·balance = old target·balance + amount
        end

The caller must satisfy the precondition, requiring the source account to have enough money to withdraw the requested amount; the postcondition states that the source account will then be debited, and the target account credited, by that amount.

Now assume that we naïvely apply this routine in a concurrent context, with concurrent calls

        if acc1·balance >= 100 then transfer (acc1, acc2, 100) end

and

        if acc1·balance >= 100 then transfer (acc1, acc3, 100) end

If the original balance on acc1 is 100, it would be perfectly possible in the absence of a proper concurrency mechanism that both calls, as they reach the test acc1·balance >= 100, find the property to be true and proceed to do the transfer — but incorrectly since they cannot both happen without bringing the balance of acc1 below zero, a situation that the precondition of transfer and the tests were precisely designed to rule out. This is the classic data race. To avoid it in the traditional approaches, you need complicated and error-prone applications of semaphores or conditional critical regions (the latter with their “wait-and-signal” mechanism, just as clumsy and low-level as the operations on semaphores).

In SCOOP, such data races, and data races of any other kind, cannot occur. If the various objects involved are to run in separate threads of control, the declaration of the routine will be of the form

transfer (amount: INTEGER ; source, target: separate ACCOUNT)
               -- The rest of the routine exactly as before.

where separate is the only specific language keyword of SCOOP. This addition of the separate marker does the trick. will result in the following behavior:

  • Every call to transfer is guaranteed exclusive access to both separate arguments (the two accounts).
  • This simultaneous reservation of multiple objects (a particularly tricky task when programmers must take care of it through their own programs, as they must in traditional approaches) is automatically guaranteed by the SCOOP scheduler. The calls wait as needed.
  • As a consequence, the conditional instructions (if then) are no longer needed. Just call transfer and rely on SCOOP to do the synchronization and guarantee correctness.
  • As part of this correctness guarantee, the calls may have to wait until the preconditions hold, in other words until there is enough money on the account.

This is the desired behavior in the transition from sequential to concurrent. It is achieved here not by peppering the code with low-level concurrent operations, not by moving to a completely different programming scheme, but by simply declaring which objects are “separate” (potentially running elsewhere.

The idea of SCOOP is indeed that we reuse all that we have come to enjoy in modern object-oriented programming, and simply declare what needs to be parallel, expecting things to work (“principle of least surprise”).

This is not how most of the world sees concurrency. It’s supposed to be hard. Indeed it is; very hard, in fact. But the view of the people who built SCOOP is that as much of the difficulty should be for the implementers. Hence the title of this article: for programmers, concurrency should be easy. And we think SCOOP demonstrates that it can be.

SCOOP in practice

A few words of caution: we are not saying that SCOOP as provided in EiffelStudio 6.8 is the last word. (Otherwise it would be called 7.0.) In fact, precisely because implementation is very hard, a number of details are still not properly handled; for example, as discussed in recent exchanges on the EiffelStudio user group [13], just printing out the contents of a separate string is non-trivial. We are working to provide all the machinery that will make everything work well, the ambitious goals and the practical details. But the basics of the mechanism are there, with a solid implementation designed to scale properly for large applications and in distributed settings.

In next week’s article I will describe in a bit more detail what makes up the SCOOP mechanisms. To get a preview, you are welcome to look at the documentation [14, 15]; I hope it will convince you that despite what everyone else says concurrent programming can be easy.

References

[1] Official Intel statement, see e.g. here.

[2] Rich Rashid, Microsoft Faculty Summit, 2008.

[3] This statement was cited at the Microsoft Faculty Summit in 2008 and is part of the official transcript; hence it can be assumed to be authentic, although I do not know the original source.

[4] Patterson and Bell citations from John Markoff, Faster Chips Are Leaving Programmers in Their Dust, New York Times, 17 December 2007, available here.

[5] The chart is from the course material of Tryggve Fossum at the LASER summer school in 2008.

[6] C.A.R. Hoare: em>Communicating Sequential Processes, Prentice Hall, 1985, also available online.

[7] Bill Roscoe: The Theory and Practice of Concurrency, revised edition, Prentice Hall, 2005, also available online.

[8] Robin Milner: Communication and Concurrency, Prentice Hall, 1989.

[9] Robin Milner: Communicating and Mobile Systems: The π-calculus, Cambridge University Press, 1999.

[10] Simon Peyton-Jones: Beautiful Concurrency, in Beautiful Code, ed. Greg Wilson, O’Reilly, 2007, also available online.

[11] Joe Armstrong: Erlang, in Communications of the ACM, vol. 53, no. 9, September 2010, pages 68-75.

[12] Bertrand Meyer: Software Architecture: Functional vs. Object-Oriented Design, in Beautiful Architecture, eds. Diomidis Spinellis and Georgios Gousios, O’Reilly, 2009, pages 315-348, available online.

[13] EiffelStudio user group; see here for a link to current discussions and to join the group.

[14] SCOOP project documentation at ETH, available here.

VN:F [1.9.10_1130]
Rating: 8.9/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +6 (from 6 votes)

Assessing concurrency models

By describing a  poorly conceived hypothetical experiment, last week’s article described the “Professor Smith syndrome” consisting of four risks that threaten the validity of empirical software engineering experiments relying on students in a course:

  • Professor Smith Risk 1: possible bias if the evaluator has a stake in the ideas or tools under assessment.
  • Professor Smith Risk 2: creating different levels of motivation in the different groups (Hawthorne effect).
  • Professor Smith Risk 3: extrapolating from students to professionals.
  • Professor Smith Risk 4: violation of educational ethics if the experiment may cause some students to learn better than others.

If you have developed a great new method or tool and would like to assess it, the best way to address Risk 1 is to find someone else to do the assessment. What if  this solution is not practical? Recently we wanted to get some empirical evidence on the merits of the SCOOP (Simple Concurrent Object-Oriented Programming) approach to concurrency [1, 2], on which I have worked for a long time and which is now part of EiffelStudio since the release of 6.8 a couple of weeks ago. We wanted to see if, despite the Professor Smith risks, we could do a credible study ourselves.

The ETH Software Architecture course[3], into which we introduced some introductory material on concurrency last year (as part of a general effort to push more concurrency into software courses at ETH), looked like a good place to try an evaluation; it is a second-year course, where students, or so we thought, would have little prior experience in concurrent software design.

The study’s authors — Sebastian Nanz, Faraz Torshizi and Michela Pedroni — paid special attention to the methodological issues. To judge for yourself whether we addressed them properly, you can read the current version of our paper to be presented at ESEM 2011 [4]. Do note that it is a draft and that we will improve the paper for final publication.

Here is some of what we did. I will not address the Professor Smith Risk 3, the use of students, which (as Lionel Briand has pointed out in a comment on the previous article) published work has studied; in a later article I will give  references to some of that work. But we were determined to tackle the other risks explicitly, so as to obtain credible results.

The basic experiment was a session in which the students were exposed to two different design methods for concurrent software: multithreaded programming in Java, which I’ll call “Java Threads”, and SCOOP. We wanted to explore whether it is easier to program in SCOOP than in Java. This is too general a hypothesis, so it was refined into three concrete hypotheses: is it easier to understand a SCOOP program? Is it easier to find errors in SCOOP programs? Do programmers using SCOOP make fewer errors?

A first step towards reducing the effect — Professor Smith Risk 1 — of any emotional attachment of the experimenters  to one of the approaches, SCOOP in our case, was to generalize the study. Although what directly interested us was to compare SCOOP against Java Threads, we designed the study as a general scheme to compare concurrency approaches; SCOOP and Java Threads are just an illustration, but anyone else interested in assessing concurrency techniques — say Erlang versus C# concurrency — can apply the same methodology. This decision had two benefits: it freed the study from dependency on the particular techniques, hence, we hope, reducing bias; and as side attraction of the kind that is hard for researchers to resist, it increased the publishability of the results.

Circumstances unexpectedly afforded us another protection against any for-SCOOP bias: unbeknownst to us at the time of the study’s design, a first-year course had newly added (in 2009, whereas our study was performed in 2010) an introduction to concurrent programming — using Java Threads! While we had thought that concurrency in any form would be new to most students, in fact almost all of them had now seen Java Threads before. (The new material in the first-year course was taken by ETH students only, but many transfer students had also already had an exposure to Java Threads.) On the other hand, students had not had any prior introduction to SCOOP. So any advantage that one of the approaches may have had because of students’ prior experience would work against our hypotheses. This unexpected development would not help if the study’s results heavily favored Java Threads, but if they favored SCOOP it would reinforce their credibility.

A particular pedagogical decision was made regarding the teaching of our concurrency material: it started with a self-study rather than a traditional lecture. One of the reasons for this decision was purely pedagogical: we felt (and the course evaluations confirmed) that at that stage of the semester the students would enjoy a break in the rhythm of the course. But another reason was to avoid any bias that might have arisen from any difference in the lecturers’ levels of enthusiasm and effectiveness in teaching the two approaches. In the first course session devoted to concurrency, students were handed study materials presenting Java Threads and SCOOP and containing a test to be taken; the study’s results are entirely based on their answers to these tests. The second session was a traditional lecture presenting both approaches again and comparing them. The purpose of this lecture was to make sure the students got the full picture with the benefit of a teacher’s verbal explanations.

The study material was written carefully and with a tone as descriptive and neutral as possible. To make comparisons meaningful, it does not follow a structure specific to Java Threads or  SCOOP  (as we might have used had we taught only one of these approaches); instead it relies in both cases on the same overall plan  (figure 2 of the paper), based on a neutral analysis of concurrency concepts and issues: threads, mutual exclusion, deadlock etc. Each section then presents, for one such general concurrency question, the solution proposed by Java Threads or SCOOP.

This self-study material, as well as everything else about the study, is freely available on the Web; see the paper for the links.

In the self-study, all students studied both the Java Threads and SCOOP materials. They were randomly assigned to two groups, for which the only difference was the order of studying the approaches. We feel that this decision addresses the ethical issue (Professor Smith Risk 4): any pedagogical effect of reading about A before B rather than the reverse, in the course of a few hours, has to be minimal if you end up reading about the two of them, and on the next day follow a lecture that also covers both.

Having all students study both approaches — a crossover approach in the terminology of [5] — should also address the Hawthorne effect (Professor Smith Risk 2): students have no particular incentive to feel that one of the approaches is more hip than the other. While they are not told that SCOOP is partly the work of the instructors, some of them may know or guess this information; the consequences, positive or negative, are limited, since they are asked in both cases to do as well as they can in answering the assessment questions.

The design of that evaluation is another crucial element in trying to avoid bias. We tried, to the extent possible, to base the assessment on objective criteria. For the first hypothesis (program understanding) the technique was to ask the students to predict the output of some simple concurrent programs. To address the risk of a binary correct/incorrect assessment, and get a more fine-grained view, we devised the programs so that they would produce output strings and measured the Levenshtein (edit) distance to the correct result. For the second hypothesis (ease of program debugging), we gave students programs exhibiting typical errors in both approaches and asked them to tell us both the line number of any error they found and an explanation. Assessing the explanation required human analysis; the idea of also assigning partial credit for pointing out a line number without providing a good explanation is that it may be meaningful that a student found that something is amiss even without being quite able to define what it is. The procedure for the third hypothesis (producing programs with fewer errors) was more complex and required two passes over the result; it requires some human analysis, as you will see in the article, but we hope that the two-pass process removes any bias.

This description of the study is only partial and you should read the article [4] for the full details of the procedure.

So what did we find in the end? Does SCOOP really makes concurrency easier to learn, concurrent programs easier to debug, and concurrent programmers less error-prone? Here too  I will refer you to the article. Let me simply mention that the results held some surprises.

In obtaining these results we tried very hard to address the Professor Smith syndrome and its four risks. Since all of our materials, procedures and data are publicly accessible, described in some detail in the paper, you can determine for yourself how well we met this objective, and whether it is possible to perform credible assessments even of one’s own work.

References

Further reading: for general guidelines on how to conduct empirical research see [5]; for ethical guidelines, applied to psychological research but generalizable, see [6].

[1] SCOOP Eiffel documentation, available here.

[2] SCOOP project documentation at ETH, available here.

[3] Software Architecture course at ETH, course page (2011).

[4] Sebastian Nanz, Faraz Torshizi, Michela Pedroni and Bertrand Meyer: Design of an Empirical Study for Comparing the Usability of Concurrent Programming Languages, to appear in ESEM 2011 (ACM/IEEE International Symposium on Empirical Software Engineering and Measurement), 22-23 September 2011. Draft available here.

[5] Barbara A. Kitchenham, Shari L. Pfleeger, Lesley M. Pickard, Peter W. Jones, David C. Hoaglin, Khaled El-Emam and Jarrett Rosenberg: Preliminary Guidelines for Empirical Research in Software Engineering, national Research Council Canada (NRC-CNRC), Report ERB-1082, 2001, available here.

[6] Robert Rosenthal: Science and ethics in conducting, analyzing, and reporting psychological research, in  Psychological Science, 5, 1994, p127-134. I found a copy cached by a search engine here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +7 (from 7 votes)