Archive for May 2011

Stendhal on abstraction

This week we step away from our usual sources of quotations — the Hoares and Dijkstras and Knuths — in favor an author who might seem like an unlikely inspiration for a technology blog: Stendhal. A scientist may like anyone else be fascinated by Balzac, Flaubert, Tolstoy or Dostoevsky, but they live in an entirely different realm; Stendhal is the mathematician’s novelist. Not particularly through the themes of his works (as could be the case with  Borges or Eco), but because of their clear structure and elegant style,  impeccable in its conciseness and razor-like in its precision. Undoubtedly his writing was shaped by his initial education; he prepared for the entrance exam of the then very young École Polytechnique, although at the last moment he yielded instead to the call of the clarion.

The scientific way of thinking was not just an influence on his writing; he understood the principles of scientific reasoning and knew how to explain them. Witness the following text, which explains just about as well as anything I know the importance of abstraction. In software engineering (see for example [1]), abstraction is the key talent, a talent of a paradoxical nature: the basic ideas take a few minutes to explain, and a lifetime to master. In this effort, going back to the childhood memories of Henri Beyle (Stendhal’s real name) is not a bad start.

Stendhal’s Life of Henri Brulard is an autobiography, with only the thinnest of disguises into a novel (compare the hero’s name with the author’s). In telling the story of his morose childhood in Grenoble, the narrator grumbles about the incompetence of his first mathematics teacher, a Mr. Dupuy, who taught mathematics “as a set of recipes to make vinegar” (comme une suite de recettes pour faire du vinaigre) and tells how his father found a slightly better one, Mr. Chabert. Here is the rest of the story, already cited in [2]. The translation is mine; you can read the original below, as well as a German version. Instead of stacks and circles  — or a university’s commencement day, see last week’s posting — the examples invoke eggs and cheese, but wouldn’t you agree that this paragraph is as good a definition of abstraction, directly applicable to software abstractions, and specifically to abstract data types and object abstractions (yes, it does discuss “objects”!), as any other?

So I went to see Mr. Chabert. Mr. Chabert was indeed less ignorant than Mr. Dupuy. Through him I discovered Euler and his problems on the number of eggs that a peasant woman brings to the market where a scoundrel steals a fifth of them, then she leaves behind the entire half of the remainder and so forth. This opened my mind, I glimpsed what it means to use the tool called algebra. I’ll be damned if anyone had ever explained it to me; endlessly Mr. Dupuy spun pompous sentences on the topic, but never did he say this one simple thing: it is a division of labor, and like every division of labor it creates wonders by allowing the mind to concentrate all its forces on just one side of objects, on just one of their qualities. What difference it would have made if Mr. Dupuy had told us: This cheese is soft or is it hard; it is white, it is blue; it is old, it is young; it is mine, it is yours; it is light or it is heavy. Of so many qualities, let us only consider the weight. Whatever that weight is, let us call it A. And now, no longer thinking of cheese, let us apply to A everything we know about quantities. Such a simple thing; and yet no one was explaining it to us in that far-away province [3]. Since that time, however, the influence of the École Polytechnique and Lagrange’s ideas may have trickled down to the provinces.

References

[1] Jeff Kramer: Is abstraction the key to computing?, in Communications of The ACM, vol. 50, 2007, pages 36-42.
[2] Bertrand Meyer and Claude Baudoin: Méthodes de Programmation, Eyrolles, 1978, third edition, 1982.
[3] No doubt readers from Grenoble, site of great universities and specifically one of the shrines of French computer science, will appreciate how Stendhal calls it  “that backwater” (cette province reculée).

Original French text

J’allai donc chez M. Chabert. M. Chabert était dans le fait moins ignare que M. Dupuy. Je trouvai chez lui Euler et ses problèmes sur le nombre d’œufs qu’une paysanne apportait au marché lorsqu’un méchant lui en vole un cinquième, puis elle laisse toute la moitié du reste, etc., etc. Cela m’ouvrit l’esprit, j’entrevis ce que c’était que se servir de l’instrument nommé algèbre. Du diable si personne me l’avait jamais dit ; sans cesse M. Dupuy faisait des phrases emphatiques sur ce sujet, mais jamais ce mot simple : c’est une division du travail qui produit des prodiges comme toutes les divisions du travail et permet à l’esprit de réunir toutes ses forces sur un seul côté des objets, sur une seule de leurs qualités. Quelle différence pour nous si M. Dupuy nous eût dit : Ce fromage est mou ou il est dur ; il est blanc, il est bleu ; il est vieux, il est jeune ; il est à moi, il est à toi ; il est léger ou il est lourd. De tant de qualités ne considérons absolument que le poids. Quel que soit ce poids, appelons-le A. Maintenant, sans plus penser absolument au fromage, appliquons à A tout ce que nous savons des quantités. Cette chose si simple, personne ne nous la disait dans cette province reculée ; depuis cette époque, l’École polytechnique et les idées de Lagrange auront reflué vers la province.

German translation (by Benjamin Morandi)

Deshalb ging ich zu Herrn Chabert. In der Tat war Herr Chabert weniger ignorant als Herr Dupuy. Bei ihm fand ich Euler und seine Probleme über die Zahl von Eiern, die eine Bäuerin zum Markt brachte, als ein Schurke ihr ein Fünftel stahl, sie dann die Hälfte des Restes hinterliest u.s.w. Es hat mir die Augen geöffnet. Ich sah was es bedeutet, das Algebra genannte Werkzeug zu benutzen. Unaufhörlich machte Herr Dupuy emphatische Sätze über dieses Thema, aber niemals dieses einfache Wort: Es ist eine Arbeitsteilung, die wie alle Arbeitsteilungen Wunder herstellt und dem Geist ermöglicht seine Kraft ganz auf eine einzige Seite von Objekten zu konzentrieren, auf eine Einzige ihrer Qualitäten. Welch Unterschied für uns, wenn uns Herr Dupuy gesagt hätte: Dieser Käse ist weich oder er ist hart; er ist weiss, er ist blau; er ist alt, er ist jung; er gehört dir, er gehört mir; er ist leicht oder er ist schwer. Bei so vielen Qualitäten betrachten wir unbedingt nur das Gewicht. Was dieses Gewicht auch sei, nennen wir es A. Jetzt, ohne unbedingt weiterhin an Käse denken zu wollen, wenden wir auf A alles an, was wir über Mengen wissen. Diese einfach Sache sagte uns niemand in dieser zurückgezogenen Provinz; von dieser Epoche an werden die École Polytechnique und die Ideen von Lagrange in die Provinz zurückgeflossen sein.

VN:F [1.9.10_1130]
Rating: 9.5/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

In praise of Knuth and Liskov

In November of 2005, as part of the festivities of its 150-th anniversary, the ETH Zurich bestowed honorary doctorates on Don Knuth and Barbara Liskov. I gave the speech (the “laudatio”). It was published in Informatik Spektrum, the journal of Gesellschaft für Informatik, the German Computer Society, vo. 29, no. 1, February 2006, pages 74-76; I came across it recently and thought others might be interested in this homage to two great computer scientists.  The beginning was in German; I translated it into English. I also replaced a couple of German expressions by their translations: “ETH commencement” for ETH-Tag (the official name of the annual ceremony) and “main building” for Hauptgebäude.

I took this picture of Wirth, Liskov and Knuth (part of my gallery of computer scientists)  later that same day.

 

Laudatio

 In an institution, Ladies and Gentlement, which so proudly celebrates its hundred-and-fiftieth anniversary, a relatively young disciplines sometimes has cause for envy. We computer scientists are still the babies, or at least the newest kids on the block. Outside of this building, for example, you will see streets bearing such names as Clausius, yet there is neither a Von Neumann Lane nor a a Wirth Square. Youth, however,  also has its advantages; perhaps the most striking is that we still can, in our own lifetime, meet in person some of the very founders of our discipline. No living physicist has seen Newton; no chemist has heard Lavoisier. For us, it works. Today, Ladies and Gentlemen, we have the honor of introducing two of the undisputed pioneers of informatics.

Barbara Liskov

The first of our honorees today is Professor Barbara Liskov. To understand her contributions it is essential to realize the unfair competition in which the so-called Moore’s law pits computer software against computing hardware. To match the astounding progress of computing speed and memory over the past five decades, all that we have on the software side is our own intelligence which, it is safe to say, doesn’t double every eighteen months at constant price. The key to scaling up is abstraction; all advances in programming methodology have relied on new abstraction techniques. Perhaps the most significant is data abstraction, which enables us to organize complex systems on the basis of the types of objects they manipulate, defined in completely abstract terms. This is the notion of abstract data type, a staple component today of every software curriculum, including in the very first programming course here ETH. it was introduced barely thirty years ago in a seemingly modest article in SIGPLAN Notices — the kind of publication that hardly registers a ripple in science indexes — by Barbara Liskov and Stephen Zilles. Few papers have had a more profound impact on the theory and practice of software development than this contribution, “Programming with Abstract Data Types”.

The idea of abstract data types, or ADTs, is one of those Egg of Christopher Columbus moments; a seemingly simple intuition that changes the course of things. An ADT is a class of objects described in terms not of their internal properties, but of the operations applicable to them, and the abstract properties of these operations. Not by what they are, but by what they have. A rather capitalistic view of the world, but well suited to the description of complex systems where each part knows as little as possible about the others to protect itself about their future changes.

An abstraction such as ETH-Commencement could be described in a very concrete way: it happens in a certain place, consists of one event after another, gathers so many people. This is what we computer scientists call an implementation-oriented view, and relying on it means that we can’t change any detail without endangering the consistency of other processes, such as the daily planning of room allocation in the Main Building, which use it. In an ADT view, the abstraction “ETH Commencement” is characterized not by what it is but by what it has: a start, an end, an audience, and operations such as “Schedule the ETH Commencement”, “ Reschedule it”, “Start it”, “End it”. They provide to the rest of the world a clean, precisely specified interface which enables every ADT to use every other based on the minimum properties it requires, thus isolating them from irrelevant internal changes, and providing an irreplaceable weapon in the incessant task of software engineering: battling complexity.

Barbara Liskov didn’t stay with the theoretical concepts but implemented the ideas in the CLU language, one of the most influential of the set of programming languages that in the nineteen-seventies changed our perspective of how to develop good software.

She went on to seminal work on operating systems and distributed computing, introducing several widely applied concepts such as guardians, and always backing her theoretical innovations by building practical systems, from the CLU language and compiler to the Argus and Mercury distributed operating systems. Distributed systems, such as those which banks, airlines and other global enterprises run on multiple machines across multiple networks, raise particularly challenging issues. To quote from the introduction of her article on Argus:

A centralized system is either running or crashed, but a distributed system may be partly running and partly crashed. Distributed programs must cope with failures of the underlying hardware. Both the nodes and the network may fail. The goal of Argus is to provide mechanisms that make it easier for programmers to cope with these problems.

Barbara Liskov’s work introduced seminal concepts to deal with these extremely difficult problems.

Now Ford professor of engineering at MIT, she received not long ago the prestigious John von Neumann award of the IEEE; she has been one of the most influential people in software engineering. We are grateful for how Professor Barbara Liskov has helped shape the field are honored to have her at ETH today.

 Donald Knuth

In computer science and beyond, the name of Donald Knuth carries a unique aura. A professor at Stanford since 1968, now emeritus, he is the only person on record whose job title is the title of his own book: Professor of the Art of Computer Programming. This is for his eponymous multi-volume treatise, which established the discipline of algorithm analysis, and has had more effect than any other computer science publication. The Art of Computer Programming is a marvel of breadth, depth, completeness, mathematical rigor and clarity, not to forget humor. In that legendary book you will find exposed in detail the algorithms and data structures that lie at the basis of all software applications today. A Monte Carlo simulation, as a physicists may use, requires a number sequence that is both very long and very random-looking, even though the computer is a deterministic machine; if the simulation is any good, it almost certainly relies on the devious techniques which The Art of Computer Programming presents for making a perfectly deterministic sequence appear to have no order or other recognizable property. If you are running complex programs on your laptop, and they keep creating millions of software objects without clogging up gigabytes of memory, chances are the author of the garbage collector program is using techniques he learned from Knuth, with such delightful names as “the Buddy System”. If your search engine can at the blink of an eye find a needle of useful information in a haystack of tens of billions of Web pages, it’s most likely because they’ve been indexed using finely tuned data structures, such as hash tables, for which Knuth has been the reference for three decades through volume three, Searching and Sorting.

Knuth is famous for his precision and attention to detail, going so far as to offer a financial reward for every error found in his books, although one suspects this doesn’t cost him too much since people are so proud that instead of cashing the check they have it framed for display. The other immediately striking characteristic of Knuth is how profoundly he is driven by esthetics. This applies to performing arts, as anyone who was in the Fraumünster this morning and found out who the organist was can testify, but even more to his scientific work. The very title “the Art of computer programming” betrays this. Algorithms and data structures for Knuth are never dull codes for computers, but objects of intense esthetic pleasure and friendly discussion. This concern with beauty led to a major turn in his career, which delayed the continuation of the book series by many years but resulted in a development that has affected anyone who publishes scientific text. As he received the page proofs of the second edition of one of the volumes in the late seventies he was so repelled by its physical appearance, resulting from newly introduced computer typesetting technology, that he decided to build a revolutionary font design and text processing system, all by himself, from the ground up. This resulted in a number of publications such as a long and fascinating paper in the Bulletin of the American Mathematical Society entitled “The Letter S”, but even more importantly in widely successful and practical software programs which he wrote himself, TeX and Metafont, which have today become standards for scientific publishing. Here too he has shown the way in quality and rigor, being one of the very few people in the world who promise their software to be free of bugs, and backs that promise by giving a small financial reward for any counter-example.

His numerous other contributions are far too diverse to allow even a partial mention here; they have ranged across wide areas of computer science and mathematics.

To tell the truth, we are a little embarrassed that by bringing Professor Knuth here we are delaying by a bit more the long awaited release of volume 4. But we overcome this embarrassment in time to express our pride for having Donald Erwin Knuth at ETH for this anniversary celebration.

VN:F [1.9.10_1130]
Rating: 10.0/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +9 (from 9 votes)

Publish no loop without its invariant

 

There may be no more blatant example of  the disconnect between the software engineering community and the practice of programming than the lack of widespread recognition for the fundamental role of loop invariants. 

Let’s recall the basics, as they are taught in the fourth week or so of the ETH introductory programming course [1], from the very moment the course introduces loops. A loop is a mechanism to compute a result by successive approximations. To describe the current approximation, there is a loop invariant. The invariant must be:

  1. Weak enough that we can easily ensure it on a subset, possibly trivial, of our data set. (“Easily” means than this task is substantially easier than the full problem we are trying to solve.)
  2. Versatile enough that if it holds on some subset of the data we can easily (in the same sense) make it hold on a larger subset — even if only slightly larger.
  3. Strong enough that, when it covers the entire data, it yields the result we seek.

As a simple example, assume we seek the maximum of an array a of numbers, indexed from 1. The invariant states that Result is the maximum of the array slice from 1 to i. Indeed:

  1. We can trivially obtain the invariant by setting Result to be a [1]. (It is then the maximum of the slice a [1..1].)
  2. If the invariant holds, we can extend it to a slightly larger slice — larger by just one element — by increasing i by 1 and updating Result to be the greater of the previous Result and the element a [i] (for the new  i).
  3. When the slice covers the entire array — that is, i = n — the invariant tells us that Result is the maximum of the slice a [1..n], giving us the result we seek.

You cannot understand the corresponding program text

    from
        i := 1; Result := a [1]
    until i = n loop
        i := i + 1
        if Result < a [i] then Result := a [i] end
    end

without understanding the loop invariant. That is true even of people who have never heard the term: they will somehow form a mental image of the intermediate situation that justifies the algorithm. With the formal notion, the reasoning becomes precise and checkable. The difference is the same as between a builder who has no notion of theory, and one who has learned the laws of mechanics and construction engineering.

As another example, take Levenshtein distance (also known as edit distance). It is the shortest sequence of operations (insert, delete or replace a character) that will transform a string into another. The algorithm (a form of dynamic programming) fills in a matrix top to bottom and left to right, each entry being one plus the maximum of the three neighboring ones to the top and left, except if the corresponding characters in the strings are the same, in which case it keeps the top-left neighbor’s value. The basic operation in the loop body reads

      if source [i] = target [j] then
           dist [i, j] := dist [i -1, j -1]
      else
           dist [i, j] := min (dist [i, j-1], dist [i-1, j-1], dist [i-1, j]) + 1
      end

You can run this and see it work, filling the array cell after cell, then delivering the result at (dist [M, N] (the bottom-right entry, M and i being the lengths of the source and target strings. Or just watch the animation on page 60 of [2]. It works, but why it works remains a total mystery until someone tells you the invariant:

Every value of dist filled so far is the minimum distance from the initial substrings of the source, containing characters at position 1 to p, to the initial substring of the target, positions 1 to q.

This is the rationale for the above code: we want to compute the next value, at position [i, j]; if the corresponding characters in the source and target are the same, no operation is needed to extend the result we had in the top-left neighbor (position [i-1, j-1]); if not, the best we can do is the minimum we can get by extending the results obtained for our three neighbors: through the insertion of source [i] if the minimum comes from the neighbor to the left, [i-1, j]; through the deletion of target [j] if it comes from the neighbor above; or through a replacement if from the top-left neighbor.

With this explanation, a mysterious, almost hermetic algorithm instantly becomes crystal-clear. 

Yet another example is in-place linked list reversal. The body of the loop is a pointer ballet:

temp := previous
previous
:= next
next
:= next.right
previous.put_right
(temp)

with proper initialization (set next to the value of first and previous to Void) and finalization (set first to the value of previous). This is not the only possible implementation, but all variants of the algorithm use a very similar scheme.

The code looks again pretty abstruse, and hard to get right if you do not remember it exactly. As in the other examples, the only way to understand it is to see the invariant, describing the intermediate assumption after a typical loop iteration. If the original situation was this:

List reversal: initial state

List reversal: initial state

then after a few iterations the algorithm yields this intermediate situation: 

List reversal: intermediate state

List reversal: intermediate state

 The figure illustrates the invariant:

Starting from previous and repeatedly following right links yields the elements of some initial part of the list, but in the reverse of their original order; starting from next and following right links yields the remaining elements, in their original order. 

Then it is clear what the loop body with its pointer ballet is about: it moves by one position to the right the boundary between the two parts, making sure that the invariant holds again in the new state, with one more element in the first (yellow) part and one fewer in the second (pink) part. At the end the second part will be empty and the first part will encompass all elements, so that (after resetting first to the value of previous) we get the desired result.

This example is particularly interesting because list reversal is a standard interview questions for programmers seeking a job; as a result, dozens of  pages around the Web helpfully present algorithms for the benefit of job candidates. I ran a search  on “List reversal algorithm” [3], which yields many such pages. It is astounding to see that from the first fifteen hits or so, which include pages from programming courses at both Stanford and MIT, not a single one mentions invariants, or (even without using the word) gives the above explanation. The situation is all the more bizarre that many of these pages — read them for yourself! — go into intricate details about variants of the pointer manipulations. There are essentially no correctness arguments.

If you go a bit further down the search results, you will find some papers that do reference invariants, but here is the catch: rather than programming or algorithms papers, they are papers about software verification, such as one by Richard Bornat which uses a low-level (C) version of the example to illustrate separation logic [4]. These are good papers but they are completely distinct from those directed at ordinary programmers, who simply wish to learn a basic algorithm, understand it in depth, and remember it on the day of the interview and beyond.

This chasm is wrong. Software verification techniques are not just good for the small phalanx of experts interested in formal proofs. The basic ideas have potential applications to the daily business of programming, as the practice of Eiffel has shown (this is the concept of  “Verification As a Matter Of Course” briefly discussed in an earlier post [5]). Absurdly, the majority of programmers do not know them.

It’s not that they cannot do their job: somehow they eke out good enough results, most of the time. After all, the European cathedrals of the middle ages were built without the benefit of sophisticated mathematical models, and they still stand. But today we would not hire a construction engineer who had not studied the appropriate mathematical techniques. Why should we make things different for software engineering, and deprive practitioners from the benefits of solid, well-accepted theory?  

As a modest first step, there is no excuse, ever, for publishing a loop without the basic evidence of its adequacy: the loop invariant.

References

[1] Bertrand Meyer: Touch of Class: Learning to Program Well, Using Objects and Contracts, Springer, 2009. See course page (English version) here.

[2] Course slides on control structures,  here in PowerPoint (or here in PDF, without the animation); see example starting on page 51, particularly the animation on page 54. More recent version in German here (and in PDF here), animation on page 60.

[3] For balance I ran the search using Qrobe, which combines results from Ask, Bing and Google.

[4] Richard Bornat, Proving Pointer Programs in Hoare Logic, in  MPC ’00, 5th International Conference on Mathematics of Program Construction, 2000, available here.

[5] Bertrand Meyer, Verification as a Matter of Course, a post on this blog.

VN:F [1.9.10_1130]
Rating: 10.0/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)