On beauty and software (online talk on Wednesday, 17 CET / 11 EDT / 8 PDT)

This Wednesday (still “tomorrow” as I am writing this), 10 March 2021, I am giving a talk on “The Beauty of Software” on the occasion of the graduation ceremony of the first students of the Schaffhausen Institute of Technology. The event starts at 17 Schaffhausen/Zurich/Paris etc. time (11 AM New York, 8 AM San Francisco) and my own talk, starting half an hour later, will take about one hour.

The talk is (surprise!) given online. Registration is free but required: you can find the registration form on the announcement page here.

The abstract appears below. It is rather ambitious-sounding and I cannot promise the talk will live up to the promise, but I feel it necessary at least to attempt some initial steps towards a better understanding of beauty in software, which might help understand beauty in general.

The Beauty of Software

Software runs the world and delivers riches. Every passionate software engineer or computer scientist is also attuned to another of its features: the study and practice of software construction reveal gems of utter beauty.

While the concept of beauty is most naturally associated with art, scientists and engineers of all fields often invoke it. Beauty is a strong guiding principle in searching for solutions to scientific and technical problems, and arbitrating between rival candidate solutions. The reaction is often instinctive: “What an elegant theory!” “This technique is too ugly to be a viable approach”.

What do such appeals to beauty really mean? Do they pertain to the same concept of beauty as found in nature and art? Is beauty only “in the eye of the beholder”, is it conditioned by cultural prejudices, or does it submit to an objective definition?

In this talk, an initial step towards a more extensive study of what beauty means for software, I will present a few artifacts from software engineering and computer science which I find strikingly beautiful and – at the risk of breaking the charm – analyze what might make them so. This analysis will lead to a tentative definition of a notion both alluring and elusive: beauty.

VN:F [1.9.10_1130]
Rating: 5.8/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 8 votes)

Tomorrow (Thursday) noon EDT: ACM talk on requirements

In the software engineering family requirements engineering is in my experience the poor cousin, lagging behind the progress of other parts (such as design). I have been devoting attention to the topic in recent months and am completing a book on the topic.

Tomorrow (Thursday), I will be covering some of the material in a one-hour Tech Talk for ACM, with the title

The Four PEGS of Requirements Engineering

The time is Thursday, 4 March 2021, at noon EDT (New York) and 18 CET (Paris, Zurich etc.). Attendance is free but requires registration, on the event page  here.

Abstract:

Bad software requirements can jeopardize projects. There is a considerable literature on requirements, but practice is far behind: what passes for requirements in industry usually consists of a few use cases or user stories, which are useful but not sufficient as a solution. Can we fix requirements engineering (known in other circles as business analysis) so that it is no longer the weak link in software engineering?

I will present ongoing work intended to help industry produce more useful requirements. It includes precise definitions of requirements concepts and a standard plan for requirements specifications, intended to replace the venerable but woefully obsolete IEEE standard from 1998. The plan contains four books covering the four “PEGS” of requirements engineering (which I will explain). The approach builds on existing knowledge to define a practical basis for requirements engineering and provide projects with precise and helpful guidelines.

This is I think the fourth time I am giving talks in this venue (previous talks were about Design by Contract, Agile Methods and Concurrency).

VN:F [1.9.10_1130]
Rating: 6.4/10 (10 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 9 votes)

Some contributions

Science progresses through people taking advantage of others’ insights and inventions. One of the conditions that makes the game possible is that you acknowledge what you take. For the originator, it is rewarding to see one’s ideas reused, but frustrating when that happens without acknowledgment, especially when you are yourself punctilious about citing your own sources of inspiration.

I have started to record some concepts that are widely known and applied today and which I believe I originated in whole or in part, whether or not their origin is cited by those who took them. The list below is not complete and I may update it in the future. It is not a list of ideas I contributed, only of those fulfilling two criteria:

  • Others have built upon them.  (If there is an idea that I think is great but no one paid attention to it, the list does not include it.)
  • They have gained wide visibility.

There is a narcissistic aspect to this exercise and if people want to dismiss it as just showing I am full of myself so be it. I am just a little tired of being given papers to referee that state that genericity was invented by Java, that no one ever thought of refactoring before agile methods, and so on. It is finally time to state some facts.

Facts indeed: I back every assertion by precise references. So if I am wrong — i.e. someone preceded me — the claims of precedence can be refuted; if so I will update or remove them. All articles by me cited in this note are available (as downloadable PDFs) on my publication page. (The page is up to date until 2018; I am in the process of adding newer publications.)

Post-publication note: I have started to receive some comments and added them in a Notes section at the end; references to those notes are in the format [A].

Final disclaimer (about the narcissistic aspect): the exercise of collecting such of that information was new for me, as I do not usually spend time reflecting on the past. I am much more interested in the future and definitely hope that my next contributions will eclipse any of the ones listed below.

Programming concepts: substitution principle

Far from me any wish to under-represent the seminal contributions of Barbara Liskov, particularly her invention of the concept of abstract data type on which so much relies. As far as I can tell, however, what has come to be known as the “Liskov Substitution Principle” is essentially contained in the discussion of polymorphism in section 10.1 of in the first edition (Prentice Hall, 1988) of my book Object-Oriented Software Construction (hereafter OOSC1); for example, “the type compatibility rule implies that the dynamic type is always a descendant of the static type” (10.1.7) and “if B inherits from A, the set of objects that can be associated at run time with an entity [generalization of variable] includes instances of B and its descendants”.

Perhaps most tellingly, a key aspect of the substitution principle, as listed for example in the Wikipedia entry, is the rule on assertions: in a proper descendant, keep the invariant, keep or weaken the precondition, keep or strengthen the postcondition. This rule was introduced in OOSC1, over several pages in section 11.1. There is also an extensive discussion in the article Eiffel: Applying the Principles of Object-Oriented Design published in the Journal of Systems and Software, May 1986.

The original 1988 Liskov article cited (for example) in the Wikipedia entry on the substitution principle says nothing about this and does not in fact include any of the terms “assertion”, “precondition”, “postcondition” or “invariant”. To me this absence means that the article misses a key property of substitution: that the abstract semantics remain the same. (Also cited is a 1994 Liskov article in TOPLAS, but that was many years after OOSC1 and other articles explaining substitution and the assertion rules.)

Liskov’s original paper states that “if for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for oz, then S is a subtype of T.” As stated, this property is impossible to satisfy: if the behavior is identical, then the implementations are the same, and the two types are identical (or differ only by name). Of course the concrete behaviors are different: applying the operation rotate to two different figures o1 and o2, whose types are subtypes of FIGURE and in some cases of each other, will trigger different algorithms — different behaviors. Only with assertions (contracts) does the substitution idea make sense: the abstract behavior, as characterized by preconditions, postconditions and the class invariants, is the same (modulo respective weakening and strengthening to preserve the flexibility of the different version). Realizing this was a major step in understanding inheritance and typing.

I do not know of any earlier (or contemporary) exposition of this principle and it would be normal to get the appropriate recognition.

Software design: design patterns

Two of the important patterns in the “Gang of Four” Design Patterns book (GoF) by Gamma et al. (1995) are the Command Pattern and the Bridge Pattern. I introduced them (under different names) in the following publications:

  • The command pattern appears in OOSC1 under the name “Undo-Redo” in section 12.2. The solution is essentially the same as in GoF. I do not know of any earlier exposition of the technique. See also notes [B] and [C].
  • The bridge pattern appears under the name “handle technique” in my book Reusable Software: The Base Component Libraries (Prentice Hall, 1994). It had been described several years earlier in manuals for Eiffel libraries. I do not know of an earlier reference. (The second edition of Object-Oriented Software Construction — Prentice Hall, 1997, “OOSC2” –, which also describes it, states that a similar technique is described in an article by Josef Gil and Ricardo Szmit at the TOOLS USA conference in the summer of 1994, i.e. after the publication of Reusable Software.)

Note that it is pointless to claim precedence over GoF since that book explicitly states that it is collecting known “best practices”, not introducing new ones. The relevant questions are: who, pre-GoF, introduced each of these techniques first; and which publications does the GoF cites as “prior art”  for each pattern. In the cases at hand, Command and Bridge, it does not cite OOSC1.

To be concrete: unless someone can point to an earlier reference, then anytime anyone anywhere using an interactive system enters a few “CTRL-Z” to undo commands, possibly followed by some “CTRL-Y” to redo them (or uses other UI conventions to achieve these goals), the software most likely relying on a technique that I first described in the place mentioned above.

Software design: Open-Closed Principle

Another contribution of OOSC1 (1988), section 2.3, reinforced in OOSC2 (1997) is the Open-Closed principle, which explained one of the key aspects of inheritance: the ability to keep a module both closed (immediately usable as is) and open to extension (through inheritance, preserving the basic semantics. I am mentioning this idea only in passing since in this case my contribution is usually recognized, for example in the Wikipedia entry.

Software design: OO for reuse

Reusability: the Case for Object-Oriented Design (1987) is, I believe, the first publication that clearly explained why object-oriented concepts were (and still are today — in Grady Booch’s words, “there is no other game in town”) the best answer to realize the goal of software construction from software components. In particular, the article:

  • Explains the relationship between abstract data types and OO programming, showing the former as the theoretical basis for the latter. (The CLU language at MIT originated from Liskov’s pioneering work on abstract data types, but was not OO in the full sense of the term, missing in particular a concept of inheritance.)
  • Shows that reusability implies bottom-up development. (Top-down refinement was the mantra at the time, and promoting bottom-up was quite a shock for many people.)
  • Explains the role of inheritance for reuse, as a complement to Parnas’s interface-based modular construction with information hiding.

Software design: Design by Contract

The contribution of Design by Contract is one that is widely acknowledged so I don’t have any point to establish here — I will just recall the essentials. The notion of assertion goes back to the work of Floyd, Hoare and Dijkstra in the sixties and seventies, and correctness-by-construction to Dijktra, Gries and Wirth, but Design by Contract is a comprehensive framework providing:

  • The use of assertions in an object-oriented context. (The notion of class invariant was mentioned in a paper by Tony Hoare published back in 1972.)
  • The connection of inheritance with assertions (as sketched above). That part as far as I know was entirely new.
  • A design methodology for quality software: the core of DbC.
  • Language constructs carefully seamed into the fabric of the language. (There were precedents there, but in the form of research languages such as Alphard, a paper design only, not implemented, and Euclid.)
  • A documentation methodology.
  • Support for testing.
  • Support for a consistent theory of exception handling (see next).

Design by Contract is sometimes taken to mean simply the addition of a few assertions here and there. What the term actually denotes is a comprehensive methodology with all the above components, tightly integrated into the programming language. Note in particular that preconditions and postconditions are not sufficient; in an OO context class invariants are essential.

Software design: exceptions

Prior to the Design by Contract work, exceptions were defined very vaguely, as something special you do outside of “normal” cases, but without defining “normal”. Design by Contract brings a proper perspective by defining these concepts precisely. This was explained in a 1987 article, Disciplined Exceptions ([86] in the list), rejected by ECOOP but circulated as a technical report; they appear again in detail in OOSC1 (sections 7.10.3 to 7.10.5).

Other important foundational work on exceptions, to which I know no real precursor (as usual I would be happy to correct any omission), addressed what happens to the outcome of an exception in a concurrent or distributed context. This work was done at ETH, in particular in the PhD theses  of B. Morandi and A. Kolesnichenko, co-supervised with S. Nanz. See the co-authored papers [345] and [363].

On the verification aspect of exceptions, see below.

Software design: refactoring

I have never seen a discussion of refactoring that refers to the detailed discussion of generalization in both of the books Reusable Software (1994, chapter 3) and Object Success (Prentice Hall, 1995, from page 122 to the end of chapter 6). These discussions describe in detail how, once a program has been shown to work, it should be subject to a posteriori design improvements. It presents several of the refactoring techniques (as they were called when the idea gained traction several years later), such as moving common elements up in the class hierarchy, and adding an abstract class as parent to concrete classes ex post facto.

These ideas are an integral part of the design methodology presented in these books (and again in OOSC2 a few later). It is beyond me why people would present refactoring (or its history, as in the Wikipedia entry on the topic) without referring to these publications, which were widely circulated and are available for anyone to inspect.

Software design: built-in documentation and Single-Product principle

Another original contribution was the idea of including documentation in the code itself and relying on tools to extract the documentation-only information (leaving implementation elements aside). The idea, described in detail in OOSC1 in 1988 (sections 9.4 and 9.5) and already mentioned in the earlier Eiffel papers, is that code should be self-complete, containing elements of various levels of abstraction; some of them describe implementation, but the higher-level elements describe specification, and are distinguished syntactically in such a way that tools can extract them to produce documentation at any desired level of abstraction.

The ideas were later applied through such mechanisms as JavaDoc (with no credit as far as I know). They were present in Eiffel from the start and the underlying principles, in particular the “Single Product principle” (sometimes “Self-Documentation principle”, and also generalized by J. Ostroff and R. Paige as “Single-Model principle”). Eiffel is the best realization of these principles thanks to:

  • Contracts (as mentioned above): the “contract view” of a class (called “short form” in earlier descriptions) removes the implementations but shows the relevant preconditions, postconditions and class invariants, given a precise and abstract specification of the class.
  • Eiffel syntax has a special place for “header comments”, which describe high-level properties and remain in the contract view.
  • Eiffel library class documentation has always been based on specifications automatically extracted from the actual text of the classes, guaranteeing adequacy of the documentation. Several formats are supported (including, from 1995 on, HTML, so that documentation can be automatically deployed on the Web).
  • Starting with the EiffelCase tool in the early 90s, and today with the Diagram Tool of EiffelStudio, class structures (inheritance and client relationships) are displayed graphically, again in an automatically extracted form, using either the BON or UML conventions.

One of the core benefits of the Single-Product principle is to guard against what some of my publications called the “Dorian Gray” syndrome: divergence of an implementation from its description, a critical problem in software because of the ease of modifying stuff. Having the documentation as an integral part of the code helps ensure that when information at some level of abstraction (specification, design, implementation) changes, the other levels will be updated as well.

Crucial in the approach is the “roundtripping” requirement: specifiers or implementers can make changes in any of the views, and have them reflected automatically in the other views. For example, you can graphically draw an arrow between two bubbles representing classes B and A in the Diagram Tool, and the code of B will be updated with “inherit A”; or you can add this Inheritance clause textually in the code of class B, and the diagram will be automatically updated with an arrow.

It is important to note how contrarian and subversive these ideas were at the time of their introduction (and still to some extent today). The wisdom was that you do requirements then design then implementation, and that code is a lowly product entirely separate from specification and documentation. Model-Driven Development perpetuates this idea (you are not supposed to modify the code, and if you do there is generally no easy way to propagate the change to the model.) Rehabilitating the code (a precursor idea to agile methods, see below) was a complete change of perspective.

I am aware of no precedent for this Single Product approach. The closest earlier ideas I can think of are in Knuth’s introduction of Literate Programming in the early eighties (with a book in 1984). As in the Single-product approach, documentation is interspersed with code. But the literate programming approach is (as presented) top-down, with English-like explanations progressively being extended with implementation elements. The Single Product approach emphasizes the primacy of code and, in terms of the design process, is very much yoyo, alternating top-down (from the specification to the implementation) and bottom-up (from the implementation to the abstraction) steps. In addition, a large part of the documentation, and often the most important one, is not informal English but formal assertions. I knew about Literate Programming, of course, and learned from it, but Single-Product is something else.

Software design: from patterns to components

Karine Arnout’s thesis at ETH Zurich, resulting in two co-authored articles ([255] and [257], showed that contrary to conventional wisdom a good proportion of the classical design patterns, including some of the most sophisticated, can be transformed into reusable components (indeed part of an Eiffel library). The agent mechanism (see below) was instrumental in achieving that result.

Programming, design and specification concepts: abstract data types

Liskov’s and Zilles’s ground-breaking 1974 abstract data types paper presented the concepts without a mathematical specification, using programming language constructs instead. A 1976 paper (number [3] in my publication list, La Description des Structures de Données, i.e. the description of data structures) was as far as I know one of the first to present a mathematical formalism, as  used today in presentations of ADTs. John Guttag was taking a similar approach in his PhD thesis at about the same time, and went further in providing a sound mathematical foundation, introducing in particular (in a 1978 paper with Jim Horning) the notion of sufficient completeness, to which I devoted a full article in this blog  (Are My Requirements Complete?) about a year ago. My own article was published in a not very well known journal and in French, so I don’t think it had much direct influence. (My later books reused some of the material.)

The three-level description approach of that article (later presented in English for an ACM workshop in the US in 1981, Pingree Park, reference [28]) is not well known but still applicable, and would be useful to avoid frequent confusions between ADT specifications and more explicit descriptions.

When I wrote my 1976 paper, I was not aware of Guttag’s ongoing work (only of the Liskov and Zilles paper), so the use of a mathematical framework with functions and predicates on them was devised independently. (I remember being quite happy when I saw what the axioms should be for a queue.) Guttag and I both gave talks at a workshop organized by the French programming language interest group in 1977 and it was fun to see that our presentations were almost identical. I think my paper still reads well today (well, if you read French). Whether or not it exerted direct influence, I am proud that it independently introduced the modern way of thinking of abstract data types as characterized by mathematical functions and their formal (predicate calculus) properties.

Language mechanisms: genericity with inheritance

Every once in a while I get to referee a paper that starts “Generics, as introduced in Java…” Well, let’s get some perspective here. Eiffel from its introduction in 1985 combined genericity and inheritance. Initially, C++ users and designers claimed that genericity was not needed in an OO context and the language did not have it; then they introduced template. Initially, the designers of Java claimed (around 1995) that genericity was not needed, and the language did not have it; a few years later Java got generics. Initially, the designers of C# (around 1999) claimed that genericity was not needed, and the language did not have it; a few years later C# and .NET got generics.

Genericity existed before Eiffel of course; what was new was the combination with inheritance. I had been influenced by work on generic modules by a French researcher, Didier Bert, which I believe influenced the design of Ada as well; Ada was the language that brought genericity to a much broader audience than the somewhat confidential languages that had such a mechanism before. But Ada was not object-oriented (it only had modules, not classes). I was passionate about object-oriented programming (at a time when it was generally considered, by the few people who had heard of it as an esoteric, academic pursuit). I started — in the context of an advanced course I was teaching at UC Santa Barbara — an investigation of how the two mechanisms relate to each other. The results were a paper at the first OOPSLA in 1986, Genericity versus Inheritance, and the design of the Eiffel type system, with a class mechanism, inheritance (single and multiple), and genericity, carefully crafted to complement each other.

With the exception of a Trellis-Owl, a  design from Digital Equipment Corporation also presented at the same OOPSLA (which never gained significant usage), there were no other OO languages with both mechanisms for several years after the Genericity versus Inheritance paper and the implementation of genericity with inheritance in Eiffel available from 1986 on. Eiffel also introduced, as far as I know, the concept of constrained genericity, the second basic mechanism for combining genericity with inheritance, described in Eiffel: The Language (Prentice Hall, 1992, section 10.8) and discussed again in OOSC2 (section 16.4 and throughout). Similar mechanisms are present in many languages today.

It was not always so. I distinctly remember people bringing their friends to our booth at some conference in the early nineties, for the sole purpose of having a good laugh with them at our poster advertising genericity with inheritance. (“What is this thing they have and no one else does? Generi-sissy-tee? Hahaha.”). A few years later, proponents of Java were pontificating that no serious language needs generics.

It is undoubtedly part of of the cycle of invention (there is a Schopenhauer citation on this, actually the only thing from Schopenhauer’s philosophy that I ever understood [D]) that people at some point will laugh at you; if it did brighten their day, why would the inventor deny them one of the little pleasures of life? But in terms of who laughs last, along the way C++ got templates, Java got generics, C# finally did too, and nowadays all typed OO languages have something of the sort.

Language mechanisms: multiple inheritance

Some readers will probably have been told that multiple inheritance is a bad thing, and hence will not count it as a contribution, but if done properly it provides a major abstraction mechanism, useful in many circumstances. Eiffel showed how to do multiple inheritance right by clearly distinguishing between features (operations) and their names, defining a class as a finite mapping between names and features, and using renaming to resolve any name clashes.

Multiple inheritance was made possible by an implementation innovation: discovering a technique (widely imitated since, including in single-inheritance contexts) to implement dynamic binding in constant time. It was universally believed at the time that multiple inheritance had a strong impact on performance, because dynamic binding implied a run-time traversal of the class inheritance structure, already bad enough for single inheritance where the structure is a tree, but prohibitive with multiple inheritance for which it is a directed acyclic graph. From its very first implementation in 1986 Eiffel used what is today known as a virtual table technique which guarantees constant-time execution of routine (method) calls with dynamic binding.

Language mechanisms: safe GC through strong static typing

Simula 67 implementations did not have automatic garbage collection, and neither had implementations of C++. The official excuse in the C++ case was methodological: C programmers are used to exerting manual control of memory usage. But the real reason was a technical impossibility resulting from the design of the language: compatibility with C precludes the provision of a good GC.

More precisely, of a sound and complete GC. A GC is sound if it will only reclaim unreachable objects; it is complete if it will reclaim all unreachable objects. With a C-based language supporting casts (e.g. between integers and pointers) and pointer arithmetic, it is impossible to achieve soundness if we aim at a reasonable level of completeness: a pointer can masquerade as an integer, only to be cast back into a pointer later on, but in the meantime the garbage collector, not recognizing it as a pointer, may have wrongly reclaimed the corresponding object. Catastrophe.

It is only possible in such a language to have a conservative GC, meaning that it renounces completeness. A conservative GC will treat as a pointer any integer whose value could possibly be a pointer (because it lies between the bounds of the program’s data addresses in memory). Then, out of precaution, the GC will refrain from reclaiming the objects at these addresses even if they appear unreachable.

This approach makes the GC sound but it is only a heuristics, and it inevitably loses completeness: every once in a while it will fail to reclaim some dead (unreachable) objects around. The result is a program with memory leaks — usually unacceptable in practice, particularly for long-running or continuously running programs where the leaks inexorably accumulate until the program starts thrashing then runs out of memory.

Smalltalk, like Lisp, made garbage collection possible, but was not a typed language and missed on the performance benefits of treating simple values like integers as a non-OO language would. Although in this case I do not at the moment have a specific bibliographic reference, I believe that it is in the context of Eiffel that the close connection between strong static typing (avoiding mechanisms such as casts and pointer arithmetic) and the possibility of sound and complete garbage collection was first clearly explained. Explained in particular around 1990 in a meeting with some of the future designers of Java, which uses a similar approach, also taken over later on by C#.

By the way, no one will laugh at you today for considering garbage collection as a kind of basic human right for programmers, but for a long time the very idea was quite sulfurous, and advocating it subjected you to a lot of scorn. Here is an extract of the review I got when I submitted the first Eiffel paper to IEEE Transactions on Software Engineering:

Systems that do automatic garbage collection and prevent the designer from doing his own memory management are not good systems for industrial-strength software engineering.

Famous last words. Another gem from another reviewer of the same paper:

I think time will show that inheritance (section 1.5.3) is a terrible idea.

Wow! I wish the anonymous reviewers would tell us what they think today. Needless to say, the paper was summarily rejected. (It later appeared in the Journal of Systems and Software — as [82] in the publication list — thanks to the enlightened views of Robert Glass, the founding editor.)

Language mechanisms: void safety

Void safety is a property of a language design that guarantees the absence of the plague of null pointer dereferencing.

The original idea came (as far as I know) from work at Microsoft Research that led to the design of a research language called C-omega; the techniques were not transferred to a full-fledged programming language. Benefiting from the existence of this proof of concept, the Eiffel design was reworked to guarantee void safety, starting from my 2005 ECOOP keynote paper (Attached Types) and reaching full type safety a few years later. This property of the language was mechanically proved in a 2016 ETH thesis by A. Kogtenkov.

Today all significant Eiffel development produces void-safe code. As far as I know this was a first among production programming languages and Eiffel remains the only production language to provide a guarantee of full void-safety.

This mechanism, carefully crafted (hint: the difficult part is initialization), is among those of which I am proudest, because in the rest of the programming world null pointer dereferencing is a major plague, threatening at any moment to crash the execution of any program that uses pointers of references. For Eiffel users it is gone.

Language mechanisms: agents/delegates/lambdas

For a long time, OO programming languages did not have a mechanism for defining objects wrapping individual operations. Eiffel’s agent facility was the first such mechanism or among the very first together the roughly contemporaneous but initially much more limited delegates of C#. The 1999 paper From calls to agents (with P. Dubois, M. Howard, M. Schweitzer and E. Stapf, [196] in the list) was as far as I know the first description of such a construct in the scientific literature.

Language mechanisms: concurrency

The 1993 Communications of the ACM paper on Systematic Concurrent Object-Oriented Programming [136] was certainly not the first concurrency proposal for OO programming (there had been pioneering work reported in particular in the 1987 book edited by Tokoro and Yonezawa), but it innovated in offering a completely data-race-free model, still a rarity today (think for example of the multi-threading mechanisms of dominant OO languages).

SCOOP, as it came to be called, was implemented a few years later and is today a standard part of Eiffel.

Language mechanisms: selective exports

Information hiding, as introduced by Parnas in his two seminal 1972 articles, distinguishes between public and secret features of a module. The first OO programming language, Simula 67, had only these two possibilities for classes and so did Ada for modules.

In building libraries of reusable components I realized early on that we need a more fine-grained mechanism. For example if class LINKED_LIST uses an auxiliary class LINKABLE to represent individual cells of a linked list (each with a value field and a “right” field containing a reference to another LINKABLE), the features of LINKABLE (such as the operation to reattach the “right” field) should not be secret, since LINKED_LIST needs them; but they should also not be generally public, since we do not want arbitrary client objects to mess around with the internal structure of the list. They should be exported selectively to LINKED_LIST only. The Eiffel syntax is simple: declare these operations in a clause of the class labeled “feature {LINKED_LIST}”.

This mechanism, known as selective exports, was introduced around 1989 (it is specified in full in Eiffel: The Language, from 1992, but was in the Eiffel manuals earlier). I think it predated the C++ “friends” mechanism which serves a similar purpose (maybe someone with knowledge of the history of C++ has the exact date). Selective exports are more general than the friends facility and similar ones in other OO languages: specifying a class as a friend means it has access to all your internals. This solution is too coarse-grained. Eiffel’s selective exports make it possible to define the specific export rights of individual operations (including attributes/fields) individually.

Language mechanisms and implementation: serialization and schema evolution

I did not invent serialization. As a student at Stanford in 1974 I had the privilege, at the AI lab, of using SAIL (Stanford Artificial Intelligence Language). SAIL was not object-oriented but included many innovative ideas; it was far ahead of its time, especially in terms of the integration of the language with (what was not yet called) its IDE. One feature of SAIL with which one could fall in love at first sight was the possibility of selecting an object and having its full dependent data structure (the entire subgraph of the object graph reached by following references from the object, recursively) stored into a file, for retrieval at the next section. After that, I never wanted again to live without such a facility, but no other language and environment had it.

Serialization was almost the first thing we implemented for Eiffel: the ability to write object.store (file) to have the entire structure from object stored into file, and the corresponding retrieval operation. OOSC1 (section 15.5) presents these mechanisms. Simula and (I think) C++ did not have anything of the sort; I am not sure about Smalltalk. Later on, of course, serialization mechanisms became a frequent component of OO environments.

Eiffel remained innovative by tackling the difficult problems: what happens when you try to retrieve an object structure and some classes have changed? Only with a coherent theoretical framework as provided in Eiffel by Design by Contract can one devise a meaningful solution. The problem and our solutions are described in detail in OOSC2 (the whole of chapter 31, particularly the section entitled “Schema evolution”). Further advances were made by Marco Piccioni in his PhD thesis at ETH and published in joint papers with him and M. Oriol, particularly [352].

Language mechanisms and implementation: safe GC through strong static typing

Simula 67 (if I remember right) did not have automatic garbage collection, and neither had C++ implementations. The official justification in the case of C++ was methodological: C programmers are used to exerting manual control of memory usage. But the real obstacle was technical: compatibility with C makes it impossible to have a good GC. More precisely, to have a sound and complete GC. A GC is sound if it will only reclaim unreachable objects; it is complete if it will reclaim all unreachable objects. With a C-based language supporting casts (e.g. between integers and pointers) and pointer arithmetic, it is impossible to achieve soundness if we aim at a reasonable level of completeness: a pointer can masquerade as an integer, only to be cast back into a pointer later on, but in the meantime the garbage collector, not recognizing it as a pointer, may have wrongly reclaimed the corresponding object. Catastrophe. It is only possible in such a language to have a conservative GC, which will treat as a pointer any integer whose value could possibly be a pointer (because its value lies between the bounds of the program’s data addresses in memory). Then, out of precaution, it will not reclaim the objects at the corresponding address. This approach makes the GC sound but it is only a heuristics, and it may be over-conservative at times, wrongly leaving dead (i.e. unreachable) objects around. The result is, inevitably, a program with memory leaks — usually unacceptable in practice.

Smalltalk, like Lisp, made garbage collection possible, but was not a typed language and missed on the performance benefits of treating simple values like integers as a non-OO language would. Although in this case I do not at the moment have a specific bibliographic reference, I believe that it is in the context of Eiffel that the close connection between strong static typing (avoiding mechanisms such as casts and pointer arithmetic) and the possibility of sound and complete garbage collection was first clearly explained. Explained in particular to some of the future designers of Java, which uses a similar approach, also taken over later on by C#.

By the way, no one will laugh at you today for considering garbage collection as a kind of basic human right for programmers, but for a long time it was quite sulfurous. Here is an extract of the review I got when I submitted the first Eiffel paper to IEEE <em>Transactions on Software Engineering:

Software engineering: primacy of code

Agile methods are widely and properly lauded for emphasizing the central role of code, against designs and other non-executable artifacts. By reading the agile literature you might be forgiven for believing that no one brought up that point before.

Object Success (1995) makes the argument very clearly. For example, chapter 3, page 43:

Code is to our industry what bread is to a baker and books to a writer. But with the waterfall code only appears late in the process; for a manager this is an unacceptable risk factor. Anyone with practical experience in software development knows how many things can go wrong once you get down to code: a brilliant design idea whose implementation turns out to require tens of megabytes of space or minutes of response time; beautiful bubbles and arrows that cannot be implemented; an operating system update, crucial to the project which comes five weeks late; an obscure bug that takes ages to be fixed. Unless you start coding early in the process, you will not be able to control your project.

Such discourse was subversive at the time; the wisdom in software engineering was that you need to specify and design a system to death before you even start coding (otherwise you are just a messy “hacker” in the sense this word had at the time). No one else in respectable software engineering circles was, as far as I know, pushing for putting code at the center, the way the above extract does.

Several years later, agile authors started making similar arguments, but I don’t know why they never referenced this earlier exposition, which still today I find not too bad. (Maybe they decided it was more effective to have a foil, the scorned Waterfall, and to claim that everyone else before was downplaying the importance of code, but that was not in fact everyone.)

Just to be clear, Agile brought many important ideas that my publications did not anticipate; but this particular one I did.

Software engineering: the roles of managers

Extreme Programming and Scrum have brought new light on the role of managers in software development. Their contributions have been important and influential, but here too they were for a significant part prefigured by a long discussion, altogether two chapters, in Object Success (1995).

To realize this, it is enough to read the titles of some of the sections in those chapters, describing roles for managers (some universal, some for a technical manager): “risk manager”, “interface with the rest of the world” (very scrummy!), “protector of the team’s sanity”, “method enforcer” (think Scrum Master), “mentor and critic”. Again, as far as I know, these were original thoughts at the time; the software engineering literature for the most part did not talk about these issues.

Software engineering: outsourcing

As far as I know the 2006 paper Offshore Development: The Unspoken Revolution in Software Engineering was the first to draw attention, in the software engineering community, to the peculiar software engineering challenges of distributed and outsourced development.

Software engineering: automatic testing

The AutoTest project (with many publications, involving I. Ciupa, A. Leitner, Y. Wei, M. Oriol, Y. Pei, M. Nordio and others) was not the first to generate tests automatically by creating numerous instances of objects and calling applicable operations (it was preceded by Korat at MIT), but it was the first one to apply this concept with Design by Contract mechanisms (without which it is of little practical value, since one must still produce test oracles manually) and the first to be integrated in a production environment (EiffelStudio).

Software engineering: make-less system building

One of the very first decisions in the design of Eiffel was to get rid of Make files.

Feldman’s Make had of course been a great innovation. Before Make, programmers had to produce executable systems manually by executing sequences of commands to compile and link the various source components. Make enabled them to instead  to define dependencies between components in a declarative way, resulting in a partial order, and then performed a topological sort to produce the sequence of comments. But preparing the list of dependencies remains a tedious task, particularly error-prone for large systems.

I decided right away in the design of Eiffel that we would never force programmers to write such dependencies: they would be automatically extracted from the code, through an exhaustive analysis of the dependencies between modules. This idea was present from the very the first Eiffel report in 1985 (reference [55] in the publication list): Eiffel programmers never need to write a Make file or equivalent (other than for non-Eiffel code, e.g. C or C++, that they want to integrate); they just click a Compile button and the compiler figures out the steps.

Behind this approach was a detailed theoretical analysis of possible relations between modules in software development (in many programming languages), published as the “Software Knowledge Base” at ICSE in 1985. That analysis was also quite instructive and I would like to return to this work and expand it.

Educational techniques: objects first

Towards an Object-Oriented Curriculum ( TOOLS conference, August 1993, see also the shorter JOOP paper in May of the same year) makes a carefully argued case for what was later called the Objects First approach to teaching programming. I would be interested to know if there are earlier publications advocating starting programming education with an OO language.

The article also advocated for the “inverted curriculum”, a term borrowed from work by Bernie Cohen about teaching electrical engineering. It was the first transposition of this concept to software education. In the article’s approach, students are given program components to use, then little by little discover how they are made. This technique met with some skepticism and resistance since the standard approach was to start from the very basics (write trivial programs), then move up. Today, of course, many introductory programming courses similarly provide students from day one with a full-fledged set of components enabling them to produce significant programs.

More recent articles on similar topics, taking advantage of actual teaching experience, are The Outside-In Method of Teaching Programming (2003) and The Inverted Curriculum in Practice (at ICSE 2006, with Michela Pedroni). The culmination of that experience is the textbook Touch of Class from 2009.

Educational techniques: Distributed Software Projects

I believe our team at ETH Zurich (including among others M. Nordio, J. Tschannen, P. Kolb and C. Estler and in collaboration with C. Ghezzi, E. Di Nitto and G. Tamburrelli at Politecnico di Milano, N. Aguirre at Rio Cuarto and many others in various universities) was the first to devise,  practice and document on a large scale (see publications and other details here) the idea of an educational software project conducted in common by student groups from different universities. It yielded a wealth of information on distributed software development and educational issues.

Educational techniques: Web-based programming exercises

There are today a number of cloud-based environments supporting the teaching of programming by enabling students to compile and test their programs on the Web, benefiting from a prepared environment (so that they don’t have to download any tools or prepare control files) and providing feedback. One of the first — I am not sure about absolute precedence — and still a leading one, used by many universities and applicable to many programming languages, is Codeboard.

The main developer, in my chair at ETH Zurich, was Christian Estler, supported in particular by M. Nordio and M. Piccioni, so I am only claiming a supporting role here.

Educational techniques: key CS/SE concepts

The 2001 paper Software Engineering in the Academy did a good job, I think, of defining the essential concepts to teach in a proper curriculum (part of what Jeannette Wing’s 2006 paper called Computational Thinking).

Program verification: agents (delegates etc.)

Reasoning about Function Objects (ICSE 2010, with M. Nordio, P. Müller and J. Tschannen) introduced verification techniques for objects representing functions (such as agents, delegates etc., see above) in an OO language. Not sure whether there were any such techniques before.

Specification languages: Z

The Z specification language has been widely used for formal development, particularly in the UK. It is the design of J-R Abrial. I may point out that I was a coauthor of the first publication on Z in English (1980),  describing a version that preceded the adaptation to a more graphical-style notation done later at Oxford. The first ever published description of Z, pertaining to an even earlier version, was in French, in my book Méthodes de Programmation (with C. Baudoin), Eyrolles, 1978, running over 15 pages (526-541), with the precise description of a refinement process.

Program verification: exceptions

Largely coming out of the PhD thesis of Martin Nordio, A Sound and Complete Program Logic for Eiffel (TOOLS 2009) introduces rules for dealing with exceptions in a Hoare-style verification framework.

Program verification: full library, and AutoProof

Nadia Polikarpova’s thesis at ETH, aided by the work of Carlo Furia and Julian Tschannen (they were the major contributors and my participation was less important), was as far as I know the first to produce a full functional verification of an actual production-quality reusable library. The library is EiffelBase 2, covering fundamental data structures.

AutoProof — available today, as a still experimental tool, through its Web interface, see here — relied on the AutoProof prover, built by the same team, and itself based on Microsoft Research’s Boogie and Z3 engines.

More

There are more concepts worthy of being included here, but for today I will stop here.

Notes

[A] One point of divergence between usual presentations of the substitution principle and the view in OOSC and my other publications is the covariance versus contravariance of routine argument types. It reflects a difference of views as to what the proper policy (both mathematically sound and practically usable) should be.

[B]  The GoF book does not cite OOSC for the command or bridge patterns. For the command pattern it cites (thanks to Adam Kosmaczewski for digging up the GoF text!) a 1985 SIGGRAPH paper by Henry Lieberman (There’s More to Menu Systems than Meets the Screen). Lieberman’s paper describes the notion of command object and mentions undoing in passing, but does not include the key elements of the command pattern (as explained in full in OOSC1), i.e. an abstract (deferred) command class with deferred procedures called (say) do_it and undo_it, then specific classes for each kind of command, each providing a specific implementation of those procedures, then a history list of commands supporting multiple-level undo and redo as explained in OOSC1. (Reading Lieberman’s paper with a 2021 perspective shows that it came tantalizingly close to the command pattern, but doesn’t get to it. The paper does talk about inheritance between command classes, but only to “define new commands as extensions to old commands”, not in the sense of a general template that can be implemented in many specific ways. And it does mention a list of objects kept around to enable recovery from accidental deletions, and states that the application can control its length, as is the case with a history list; but the objects in the list are not command objects, they are graphical and other objects that have been deleted.)

[C] Additional note on the command pattern: I vaguely remember seeing something similar to the OOSC1 technique in an article from a supplementary volume of the OOPSLA proceedings in the late eighties or early nineties, i.e. at the same time or slightly later, possibly from authors from Xerox PARC, but I have lost the reference.

[D] Correction: I just checked the source and learned that the actual Schopenhauer quote (as opposed to the one that is usually quoted) is different; it does not include the part about laughing. So much for my attempts at understanding philosophy.

 

VN:F [1.9.10_1130]
Rating: 8.7/10 (26 votes cast)
VN:F [1.9.10_1130]
Rating: +7 (from 13 votes)

Le français dans le monde

Extrait d’un article dans Le Monde d’hier, signé Juliette Bénézit (sur un problème sérieux : l’inadéquation des services en ligne pour les cartes de séjour pour étrangers) : il s’agit d’une

problématique particulièrement prégnante en Ile-de-France

Diable ! Quels termes galants pour dire que le problème est particulièrement aigu dans cette belle région !

En général — heuristique personnelle mais qui marche assez bien — je cesse généralement d’écouter dès que j’entends “problématique” pour problème et “thématique” pour thème. Quant à “prégnant”, voici la définition du Larousse :

Littéraire. Qui s’impose à l’esprit, qui produit une forte impression.
Qui s’impose fortement, en parlant d’une structure perceptive et dans le contexte de la Gestalttheorie.

Re-diable ! Ou plutôt, Verdammt ! Gestalttheorie !

Je recherchais la référence exacte de l’article pour mettre ci-dessus le lien correct ; tenez-vous bien : Google donne cent pages Web contenant le texte exact “problématique particulièrement prégnante”. Dans Le Monde et ailleurs. (J’ai aussi cherché, pensant à une plaisanterie un peu trop ressassée en anglais, “à-demi prégnante”, mais là rien.)

Je pourrais continuer la liste des atrocités prétentieuses, mais excusez-moi, j’ai une problématique très très prégnante à régler.

 

VN:F [1.9.10_1130]
Rating: 4.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 5 votes)

Time to resurrect PSP?

Let us assume for the sake of the argument that software quality matters. There are many ingredients to software quality, of which one must be the care that every programmer devotes to the job. The Personal Software Process, developed by Watts Humphrey in the 1990s [1], prescribes a discipline that software developers should apply to produce good software and improve their professional ability over their careers. It has enjoyed moderate success but was never a mass movement and rarely gets mentioned nowadays; few software developers, in my experience, even know the name. Those who do often think of it as passé, a touching memory from the era of Monica Lewinsky and the Roseanne show.

Once cleaned of a few obsolete elements, PSP deserves to be known and applied.

PSP came out of Watts Humphrey’s earlier work on the Capability Maturity Model (see my earlier article on this blog, What is wrong with CMMI), a collection of recommended practices and assessment criteria for software processes, originally developed in the mid-eighties for the U.S. military contractor community but soon thereafter embraced by software outsourcing companies (initially, Indian ones) and later by other industries. Responding to complaints that CMM/CMMI, focused on processes in large companies, ignored the needs of smaller ones, and lacked individual guidance for developers, Humphrey developed TSP, the Team Software Process, and PSP.

The most visible part of PSP is a six-step process pictured in the middle of this diagram:
cmmi

The most visible and also the most corny. Who today wants to promise always to follow such a strict sequence of steps? Always to write the code for a module in full before compiling it? (Notice there is no backward arrow, the process is sequential.) Always to test at the end only? Come on. This is the third decade of the 21st century.

Today we compile as we code, using the development environment (IDE) as a brilliant tool to check everything we do or plan to do. For my part, whenever I am writing code and have not compiled my current draft for more than a few minutes I start feeling like an addict in need of a fix; my fix is the Compile button of EiffelStudio. At some eventual stage the compiler becomes a tool to generate excutable code, but long before that it has been my friend, coach, mentor, and doppelgänger, helping me get things (types, null references, inheritance…) right and gently chiding me when I wander off the rails.

As to tests, even if you do not buy into the full dogma of Test-Driven Development (I don’t), they get written and exercised right from the start, as you are writing the code, not afterwards. Compile all the time, test all the time.

It’s not just that a process such as the above ignores the contributions of agile methods, which are largely posterior to PSP. As analyzed in [2], agile is a curious mix of good ideas and a few horrendous ones. But among its durable contributions is the realization that development must be incremental, not a strict succession of separate activities.

This old-style flavor or PSP is probably the reason why it has fallen out of favor. But (like the agile rejection of upfront lifecycle activities) such a reaction is a case of criticism gone too far, ignoring the truly beneficial contributions. Ignore PSP’s outmoded sequence of activities and you will find that PSP’s core message is as relevant today as it ever was. That message is: we should learn from the practices of traditional engineers and apply a strict professional discipline. For example:

  • Keep a log of all activities. (See “Logs” in the above figure.) Engineers are taught to record everything they do; many programmers don’t bother. This practice, however, is essential to self-improvement.
  • Keep measurements of everything you do. (There are lots of things to measure, from hours spent on every kind of task to bugs found, time to fix them etc.)
  • Estimate and plan your work.
  • Clearly define commitments, and meet them.
  • Resist pressure to make unreasonable commitments (something that agilists approach also emphasize).
  • Understand your current performance.
  • Understand your programming style and how it affects various measures. (As an example, code size, as a function of the number of routines, depends on whether you are more concise or more verbose in style).
  • Continually improve your expertise as a professional.

PSP does not limit itself to such exhortations but gives concrete tools to apply the principles, with a view to: measuring, tracking and analyzing your work; learning from your performance variations; and incorporating the lessons learned into your professional practices. On the topic of measurement, for example, PSP includes precise guidelines on what to measure and how to measure it, and how to rely on proxies for quantities that are hard to assess directly. On this last point, PSP includes PROBE (PROxy-Based Estimating, you cannot have a method coming out of the world of US government organizations without cringeworthy acronyms), a general framework for estimating size and resource parameters from directly measurable proxies.

This is what PSP is about: a discipline of personal productivity and growth, emphasizing personal discipline, tracking and constant improvement. It is not hard to learn; a technical report by Humphrey available online [3] provides a sufficient basis to understand the concepts and start a process of self-improvement.

Watts Humphrey himself, as all who had the privilege to meet him can testify, was a model of seriousness and professionalism, the quintessential engineer. (I also remember him as the author of what may be the best pun I ever heard — ask me sometime.) PSP directly reflects these qualities and — ignoring its visible but in the end unimportant remnants from outdated technical choices — should be part of every software engineering curriculum and every software engineer’s collection of fundamental practices.

References

[1] Watts Humphrey, Introduction to the Personal Software Process, Addison-Wesley, 1996.

[2] Bertrand Meyer: Agile! The Good, the Hype and the Ugly, Springer, 2014, see here.

[3] Watts Humphrey, The Personal Software Process, Software Engineering Institute Technical Report CMU/SEI-2000-TR-022, available (in PDF, free) here.

 

Recycled A version of this article was first published in the Communications of the ACM blog.

.

VN:F [1.9.10_1130]
Rating: 7.8/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Between you and me

I have been conducting interesting conversations with a two-something child who has not quite mastered the speaker-dependent [1] personal pronouns. He says things like “Where is your mom?”, when he actually means to ask about his mother, not mine; from hearing people tell him things like “your mom is coming”, he is clearly taking “your mom” as some kind of paraphrase for “mom”. He approaches people and says “I want to help me”, obviously from having heard, from others, “would you come and help me?”.

I am not concerned about young Tim (let us call him that), who sooner or later will get the point. But let us assume that someone, call him Tom, is determined to explain to Tim the concepts of “you” and “me”. Not so easy! In fact, come to think of it, fairly tricky, treacherous even. We all learned the concept at some point; I wish I remembered how it happened for me [2].

The difficulty is that someone, whether Tom or a third party, has to provide the explanation, a circumstance that messes up the explanation itself since it contains speaker-dependent terms. (Did I hear the word “Heisenberg”? If so I will ignore it.) Tom might try something like the following explanation — let us call it /A/  — but it will not work:

If Tom says me it means Tom, but if Tim says me it means Tim.

If Tim says you it means Tom, but if Tom says you it means Tim.

The reason it does not work is lack of quoting, but before I get to this let me mention that a mathematical model is not hard to come by. From a two-element set Names = {Tim, Tom} to itself, there are four total functions. Of these, two are constant functions. The other two are non-constant; one, me, is the identity function on Names, and you is the other. They are the functions me = {<Tim, Tim>, <Tom, Tom>} and you = {<Tim, Tom>, <Tom, Tim>}. I is just a synonym for me (the accusative in the latter being one of the few remnants of declension in English.)

Let us call this explanation /B/.  It can be illustrated as follows:

 

subjectivity

We can also write these functions (using for conciseness [c: A \ d: B … \ Z] for the conditional expression if c then A elseif d then Belse Z end) as the following version /C/:

me = λ n | [n = Tim: Tim \ Tom]

you = λ n | [n = Tim: Tom \ Tim]

In practice, however, this modeling of the concepts “you” and “me” as functions in Names → Names is not sufficient to define them properly, for example to turn the informal explanation /A/ into one that makes sense. The problem is that we are dealing with a larger set of words than just Names: the set   Words = {Tim, Tom, Me, You} with four elements. The meaning of a sentence that may include any of them is not absolute, but depends on who is uttering the sentence. When I say “me” and you say “me” the meaning is different, although when I say “you” and you say “me” the meaning is the same.

To interpret a sentence, then, we need a second-level function: a function meaning not in Words → Words but in Names → Words → Names, defined as follows:

meaning = λ n  |  λ w  |     [w ∈ Names: w    \    w = Me: me (n)    \    you (n)]

Let us call this definition /D/. Examples of its application include:

  • meaning (Tim) (Tom) = Tom. (First case: no speaker-dependency).
  • meaning (Tim) (Me) = Tim. (Second case, expresses the part of /A/ that said “if Tim says me it means Tim”.)
  • meaning (Tim) (You) = Tom. (Third case, expresses the part of /A/ that said “if Tim says you it means Tom”.)

/D/ is the most accurate definition of speaker-dependent pronouns and I may try it with the real Tim at the next opportunity (as soon as I have taught him lambda calculus).

/D/ also helps us understand what does not quite work in /A/, the first and perhaps most intuitive attempt. With explanations such as  “if Tim says you it means Tom” we are actually trying to express the need for making meanings speaker-dependent, that is to say, explain that the evaluation of a sentence must take into account who utters it. But this attempt fails because we need a general rule which will apply to all sentences, and the explanation itself is a sentence. To evaluate the sentence “if Tim says you it means Tom” we need to evaluate “you” which figures in it; that evaluation would yield either Tim or Tom depending on who is speaking. But because the sentence is itself part a definition of “you”, we do not want in this case to evaluate “you”.

We could write you in straight quotes as in programming languages:

if Tim says “you” it means Tom

but that is not the form of quoting we need. In a programming language, “you” is a character string, made of the characters ‘y‘, ‘o‘ and ‘u‘ in that order. Here, the makeup of this string in terms of its characters is completely irrelevant. The form of quoting that we need must achieve something else: preventing, in the evaluation of a formula, the evaluation of one of its parts.

John McCarthy’s Lisp language, coming out as early as 1959, showed how to handle this issue correctly. (This contribution is one of many from Lisp, including garbage collection and the notion of functional programming — not bad for a single language. See my 2011 obituary of John McCarthy on this blog [3].) One of the reasons Lisp needed a quoting mechanism is that it was designed to be self-describing and specifically self-evaluating, a spectacularly new concept at the time. McCarthy’s original book on Lisp (LISP 1.5 Programmer’s Manual, MIT Press, 1962) contains a dazzling interpreter of Lisp, written in Lisp and taking up no more than half a page. The interpreter is based on eval, a function (in the sense of a Lisp function, defined by a Lisp expression) which evaluates any Lisp program. You cannot define eval unless you have some way to prevent evaluation of part of an expression.

The convention is simple. For any expression x we may define another expression x — which we can read aloud as “QUOTED x”. Then:

  • Evaluating x yields the value of x.
  • Evaluating x yields x.

The original Lisp notation was (QUOTE x), which clearly conveys the idea: QUOTE is a function — in Lisp, function application f (x) is written (f x) — whose value, when applied to an argument, is that argument. (Not the value of the argument: the argument itself.)

With such a quote notation, explanations in the style of /A/ can be made meaningful; for example (version /E/):

If Tom says ‘me it means Tom, but if Tim says ‘me it means Tim.

This should be said with the quote spoken out loud: QUOTED me”.

We’ll see if Tim sees what /E/ means. At least, you see what you mean. I mean, I see what I  mean. Sorry, I meant that I see what you mean. Oh well. We see what we mean.

 

Notes and references

[1] I really want to call them “subjective personal pronouns”, in the ordinary sense of “subjective” opposed to “objective” as in “a subjective assessment”, but in grammatical terminology “subjective pronoun” has a well-accepted but entirely different meaning, that of a pronoun used as the subject of a sentence, like “she” rather than “her”. Hence “speaker-dependent pronoun”. Linguists also use the term “indexical pronouns”, as in the reference cited in note [2].

[2] Linguists investigating language learning have studied the matter. A recent article (2015) is 2-Year-Olds’ Comprehension of Personal Pronouns by M. Moyer, K. Harrigan, V. Hacquard and J. Lidz (in Supplement to the proceedings of the 39th Boston University Conference on Language Development, eds E. Grillo, K. Jepson and M. LaMendola), available here. The article includes a literature review. It acknowledges the you-and-me problem (first- and second-person pronouns) although it suggests that third-person pronouns (“he”, “she”) may be even more of a challenge to infants. Two quotes (with bibliographic references removed):

  • The evidence from previous studies examining two year olds’ competence is mixed, with some studies finding children succeeding with pronouns as early as 22-24 months, others finding success much later, and some with children succeeding at a range of ages.
  • Most [studies] suggest a production/comprehension asymmetry… Production studies suggest that children begin producing pronouns around 15-18 months, starting with first person, then second person, and finally third person pronouns. These early productions typically include some errors where the child reverses first and second person, but on the whole consistent errors are few . On the other hand, comprehension data focusing on speech addressed to the child suggest that children first understand second person pronouns, then first person, and finally third person.

[3] My article on John McCarthy can be found here.

VN:F [1.9.10_1130]
Rating: 7.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

The extremely bizarre idea of an inauguration

Could someone please explain what there is to celebrate when a candidate is elected to a political function? The grotesque ideas of a victory rally or an inauguration ceremony defy common sense.

An election success is an opportunity to start doing something good. It is not an achievement; it is the promise of possible achievements to come. Someone should hand the successful candidate the key to his office and wish him good luck. His supporters should go home and start thinking of how best to help him. He should get to work. End of that part of the story. There is by definition nothing to celebrate.

The time to hold a celebration, if any, is when a politician completes her term. At that stage there exists a clear basis for one of two outcomes: either a shameful, stealthy, miserable exit in the frosty predawn fog of a deserted airfield (as happened on Wednesday at Andrews); or a lavish, joyful party to extol and relish the brilliant successes of the last few years. Then, but only then, does a celebration makes sense.

Out with inaugurations, in with outaugurations!

VN:F [1.9.10_1130]
Rating: 8.5/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 4 votes)

The right forms of expression

If you want to know whether your_string has at least one upper-case character, you will write this in Eiffel:

if  ∃ c: your_string ¦ c.is_upper then

Such predicate-calculus boolean expressions, using a quantifier (“for all”) or (“there exists”) are becoming common in Eiffel code. They are particularly useful in Design by Contract assertions, making it possible to characterize deep semantic properties of the code and its data structures. For example a class invariant clause in a class I wrote recently states

from_lists_exist: ∀ tf: triples_from ¦ tf Void                        — [1]

meaning that all the elements, if any, of the list triples_from  are non-void (non-null). The notation is the exact one from mathematics. (Mathematical notation sometimes uses a dot in place of the bar, but the bar is clearer, particularly in an OO context where the dot has another use.)

Programming languages should support time-honored notations from mathematics. Reaching this goal has been a driving force in the evolution of Eiffel, but not as a concession to “featurism” (the gratuitous piling up of language feature upon feature). The language must remain simple and consistent; any new feature must find its logical place in the overall edifice.

The design of programming languages is a constant search for the right balance between rigor, simplicity, consistency, formal understanding, preservation of existing code, innovation and expressiveness. The design of Eiffel has understood the last of these criteria as implying support for established notations from mathematics, not through feature accumulation but by re-interpreting these notations in terms of the language’s fundamental concepts. A typical example is the re-interpretation of the standard mathematical notation a + b as as simply an operator-based form for the object-oriented call a.plus (b), obtained by declaring “+” as an operator alias for the function plus in the relevant classes. There are many more such cases in today’s Eiffel. Quantifier expressions using and  are the latest example.

 They are not a one-of-a-kind trick but just as a different syntax form for loops. Expressed in a more verbose form, the only one previously available, [1] would be:

across triples_from is tf all tf /= Void end                         — [2]

It is interesting to walk back the history further. [2] is itself a simplification of

across triples_from as tf all tf.item /= Void end               — [3]

where the “.item” has a good reason for being there, but that reason is irrelevant to a beginner. The earlier use of as in [3] is also the reason for the seemingly bizarre use of is in [2], which is only explainable by the backward compatibility criterion (code exists that uses as , which has a slightly different semantics from is), and will go away. But a few years ago the across loop variant did not exist and you would have had to write the above boolean expressions as

all_non_void (triples_from)

after defining a function

all_non_void (l: LIST [T]): BOOLEAN                                    — [4]
                         — Are all the elements of `l’, if any, non-void?
          local
pos: INTEGER
do
from
pos := l.index
l.start
Result := True
until not Result or l.after loop
l.forth
end
go_ith (pos)
end

The road traveled from [4] to [1] is staggering. As we introduced new notations in the history of Eiffel the reaction of the user community has sometimes been between cautious and negative. With the exception of a couple of quickly discarded ideas (such as the infamous and short-lived “!!” for creation), they were generally adopted widely because they simplify people’s life without adding undue complexity to the language. The key has been to avoid featurism and choose instead to provide two kinds of innovation:

  • Major conceptual additions, which elevate the level of abstraction of the language. A typical introduction was the introduction of agents, which provide the full power of functional programming in an object-oriented context; another was the SCOOP concurrency mechanism. There have been only a few such extensions, all essential.
  • Syntactical variants for existing concepts, allowing more concise forms obtained from traditional mathematical notation. The use of quantifier expressions as in [1] is the latest example.

Complaints of featurism still occasionally happen when people first encounter the new facilities, but they fade away quickly as people start using them. After writing a few expressions such as [1], no one wants to go back to any of the other forms.

These quantifier expressions using and , as well as the “” not-equal sign for what used to be (and still commonly is) written “/=”, rely on Unicode. Eiffel started out when ASCII was the law of the land. (Or 8-bit extended ASCII, which does not help much since the extensions are rendered differently in different locales, i.e. the same 8-bit character code may mean something different on French and Swedish texts.) In recent years, Eiffel has made a quiet transition to full Unicode support. (Such support extends to manifest strings and operators, not to identifiers. The decision, which could be revisited, has been to keep the ASCII-only  policy for identifiers to favor compatible use by programmers regardless of their mother tongues.) The use of Unicode considerably extends the expressive power of the language, in particular for scientific software which can — thanks to Eiffel’s mechanism for defining free operators — rely on advanced mathematical notations.

Unicode is great, but I hear the question: how in the world can we enter the corresponding symbols, since our keyboards are still ASCII plus some extensions?

It would be tedious to have to select from a list of special symbols (as you do when inserting a mathematical symbol in Microsoft Word or, for that matter, as I did when inserting the phrase “ and ” in the preceding paragraph using WordPress).

The answer lies in the interplay between the language and the development environment. EiffelStudio, like other modern IDEs, includes an automatic completion mechanism which lets you enter the beginning of a construct and will take care of filling in the rest. Already useful for complex structures (if you type “if” the tools will create the entire “if then else end” conditional structure for you to fill in), automatic completion will take care of inserting the appropriate Unicode symbols for you. Type for example “across”,  then CTRL-Space to trigger completion, and the choices will include the “∀” and “” forms. You can see below how this works:

across_all

Programming languages can be at the same time simple, easy to learn, consistent, and expressive. Start using quantifiers now!

Acknowledgments to the Ecma Technical Committee on Eiffel and the Eiffel Software team, particularly Alexander Kogtenkov (see his blog post here) and (for the completion mechanism and its animated illustration above) Jocelyn Fiat.

VN:F [1.9.10_1130]
Rating: 10.0/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Questionnaire on deployed, formally verified systems

A group of us is preparing a survey on systems that have been both formally verified and deployed for actual use. To make sure we do not forget any important development, we have devised a questionnaire. If you have experience with such a system, please help by filling the questionnaire. It only includes a few questions and takes a few minutes to fill. You can find it here.

Please also bring it to the attention of others who might have relevant information.

Thanks in advance!

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

New video lecture: distances, invariants and recursion

I have started a new series of video lectures, which I call “Meyer’s Object-Oriented Classes” (MOOC). The goal is to share insights I have gained over the years on various aspects of programming and software engineering. Many presentations are focused on one area, such as coding, design, analysis, theoretical computer science (even there you find a division between “Theory A”, i.e. complexity, Turing machines and the like, and “Theory B”, i.e. semantics, type theory etc.), software project management, concurrency… I have an interest in all and try to explain connections.

 

The first lecture describes the edit distance (Levenshtein) algorithm, explains its correctness by introducing the loop invariant, expands on that notion, then shows a recursive version, explores the connection with the original version (it’s the invariant), and probes further into another view of recursive computations, leading to the concept of dynamic programming.

The videos are on YouTube and can be accessed from bertrandmeyer.com/levenshtein. (The general page for all lectures is at bertrandmeyer.com/mooc.)

The lecture is recorded in four segments of about 15 minutes each. In the future I will limit myself to 8-10 minutes. In fact I may record this lecture again; for example it would be better if I had a live audience rather than talking to my screen, and in general the recording is somewhat low-tech, but circumstances command. Also, I will correct a few hiccups (at some point in the recording I notice a typo on a slide and fix it on the fly), but the content will remain the same.

Feedback is of course welcome. I hope to record about a lecture a week from now on.

VN:F [1.9.10_1130]
Rating: 10.0/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Fan mail

Received this today from a heretofore unknown correspondent (I don’t often check Facebook Messenger but just happened to). Name removed (I am not sure he would want me to identify him), text translated from another language into English.

Hello, thanks for your book “Object-Oriented Software Construction” [read in a translation]. I read it after a horrible failure of a project on which I was a consultant. Another consultant was my technical leader. He was truly insufferable but I appreciated him for one reason: his code! I had never seen such “beautiful” program code; he was using principles of genericity, dynamic binding and others, which were totally unknown to me after the lousy programming education I had received. He had insulted me, telling me that I was no developer at all; I was deeply offended since I could feel that he was right. In spite of his unbearable personality I wanted to learn at his side, but he was far too selfish, seeing me just as a competitor, even if a pathetic one. He had a book on the side of his desk… and it’s that book that enabled me to understand where he had learned all those OO design methods. That book, obviously, was yours, and I acquired a copy for myself. I sincerely think that it should be used as textbook in educational institutions. And I really wanted to thank you for writing it. I hope to become a real developer thanks to you. So, thank you.

Note 1: Thanks to you.

Note 2: There is also the intro programming text, Touch of Class (Amazon page).

Note 3 (to my fan club): You are welcome to take advantage of the ideas and there is actually no compelling requirement to be, in addition, “insufferable”.

VN:F [1.9.10_1130]
Rating: 9.4/10 (15 votes cast)
VN:F [1.9.10_1130]
Rating: +6 (from 6 votes)

Why we love computers and software

A 1949 book I just bought* has the following printer’s label, hand-glued on the title page:

printer_note

Because of a change in pagination, you will need to add the digit 9 [actually, the number, not the digit] to references to page numbers between 1 and 114, and the digit 8 to those between 116 and the end of the volume.

Examples: The reference that says “see page 92” should be read as “see page 101”
The reference that says “ see page 150” should be read as “see page 158”

*Précis d’histoire de la langue et du vocabulaire français (handbook of the history of French language and vocabulary) by Albert Dauzat, Larousse, Paris, 1949.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

L’appel du 19 juin

Vous souvenez-vous de ce discours ?

Françaises, Français, mes chers compatriotes,

Je voulais vous parler hier mais on m’a dit que l’expression  “appel du 18 juin” était déjà prise et j’ai décidé d’attendre jusqu’aujourd’hui, 19 juin 2020. Un jour ne devrait pas faire une grande différence.

Mon message à toutes et à tous est simple :

Ne partez pas en vacances cette année.

Je sais, c’est dur. Pour les Français, les vacances sont sacrées depuis 1936. Toute l’année vous parlez essentiellement des dernières et des prochaines vacances. Mais cette fois-ci ce n’est vraiment pas le moment. Même si vous ne partez pas à l’étranger, avec la meilleure volonté du monde vous allez quand même vous entasser sur les plages et dans les hôtels. Vous essayerez le masque mais il suffit de se promener dans la rue pour voir combien sont au-dessous du nez ou au-dessus de la bouche ou les deux, ne servant strictement à rien.

Si nous partons comme tous les ans, imaginez la situation qui s’ensuivra inéluctablement. Projetez-vous quelques mois en avant ; le 28 octobre, pour choisir une date au hasard. Êtes-vous prêts pour 35 000 cas et 240 décès par jour, en croissance sans fin prévisible ? Pour un retour de l’engorgement des hôpitaux ? Pour — j’hésite à prononcer le mot honni ! — un nouveau reconfinement, celui que nous avions promis d’éviter mais qui serait devenu inévitable ? Et tout ce qui en découle — faillites, licenciements, séparations ? Sans même mentionner des fêtes de fin d’année sinistres sans la moindre lumière au bout du tunnel.

Non, j’en suis sûr, tout cela est impensable et n’est pas ce que vous voulez.

Alors, sacrifiez vos vacances cette année pour ne pas avoir à sacrifier bien plus les mois et les années qui suivront. Restez chez vous. Économisez votre argent, ne serait-ce que pour vous offrir d’excellentes vacances l’année prochaine. Lisez des livres, regardez des films, faites votre gymnastique, mais évitez déplacements et rencontres. Arrêtons-nous pour mieux rebondir ensuite. Si vous travaillez dans le tourisme, la passe sera difficile, et l’État vous aidera, mais céder à la facilité ne ferait que rendre vos perspectives pires encore.

En ces derniers jours de printemps, où tout semble vous sourire, vous n’en avez peut-être pas entièrement conscience encore, mais bientôt la bise sera venue : ne risquez pas, pour le bref plaisir d’un bel été, un automne et un hiver pires que ce que notre époque a jamais connu.

Pas de départ en vacances à l’été 2020.

Pour ma part, je ne m’en souviens pas.

D’autant plus qu’il n’a jamais été prononcé.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Things to do to an algorithm

What can you do to or with an algorithm? In other words, what is a good verb to substitute for the hyphen in   “— the algorithm”?

You can learn an algorithm. Discovering classical algorithms is a large part of the Bildungsroman of a computer scientist. Sorting algorithms, graph algorithms, parsing algorithms, numerical algorithms, matrix algorithms, graphical algorithms…

You can teach an algorithm. Whether a professor or just a team leader, you explain to others why the obvious solution is not always the right one. As when  I saw that someone had implemented the traversal part of a garbage collection scheme (the “mark” of mark-and-sweep) using a recursive algorithm. Recursion is a great tool, but not here: it needs a stack of unpredictable size, and garbage collection, which you trigger when you run out of space, is not precisely the moment to start wildly allocating memory. In comes the Deutsch-Schorr-Waite algorithm, which improbably (as if tightrope-walking) subverts the structure itself to find its way forth and back.

To teach it, you can dance an algorithm. Sounds strange, but Informatics Europe gave its 2013 education award to the “AlgoRhythmics” group from at Sapientia University in Romania, which  demonstrates algorithms using central-European dances; see their rendering of Merge Sort:

(Their page has more examples. I see that recently they expanded to other kinds of dance and will let you discover binary search as flamenco and backtracking as classical ballet.) More generally you can simulate or animate an algorithm.

You can admire an algorithm. Many indeed are a source of wonder. The inner beauty of topological sort, Levenshtein or AVL can leave no one indifferent.

You can improve an algorithm. At least you can try.

You can invent an algorithm. Small or large, ambitious or mundane, but not imagined yet by anyone. Devising a new algorithm is a sort of rite of passage in our profession. If it does prove elegant, useful and elegant, you’ll get a real kick (trust me). Then you can publish the algorithm.

You can prove an algorithm, that is to say, mathematically establish its correctness. It is indeed increasingly unreasonable to publish an algorithm without correctness arguments. Maybe I have an excuse here to advertize for an an article that examines important algorithms across a wide variety of fields and showcases their main claim to correctness: their loop invariants.

You can implement an algorithm. That is much of what we do in software engineering, even if as an OO guy I would immediately add “as part of the associated data structure.

Of late, algorithms have come to be associated with yet another verb; one that I would definitely not have envisioned when first learning about algorithms in Knuth (the book) and from Knuth (the man who most certainly does not use foul language).

You can fuck an algorithm.

Thousands of British students marched recently to that slogan:

They were demonstrating against a formula (the Guardian gives the details) that decided on university admissions. The starting point for these events was a ministerial decision to select students not from their grades at exams (“A-level”), which could not take place because of Covid, but instead from their assessed performance in their schools. So far so good but the authorities decided to calibrate these results with parameters deduced from each school’s past performance. Your grade is no longer your grade: if Jill and Joan both got a B, but Jill’s school has been better at getting students into (say) Oxford in the past, then Jill’s B is worth more than Joan’s B.

The outcry was easy to predict, or should have been for a more savvy government. Students want to be judged by their own performance, not by the results of some other students they do not even know. Arguments that the sole concern was a legimitate one (an effort to compensate for possible grade inflation in some schools) ceased to be credible when it came out that on average the algorithm boosted grades from private schools by 4.7. No theoretical justification was going to be of much comfort anyway to the many students who had been admitted to the universities of their dreams on the basis of their raw grades, and after the adjustment found themselves rejected.

In the end, “Fuck the Algorithm!” worked. The government withdrew the whole scheme; it tried to lay the blame for the fiasco on the regulatory authority (Ofqual), fooling no one.

These U.K. events of August 2020 will mark a turning point in the relationship between computer science and society. Not for the revelation that our technical choices have human consequences; that is old news, even if we often pretend to ignore it. Not for the use of Information Technology as an excuse; it is as old (“Sorry, the computer does not allow that!”) as IT itself. What “Fuck the Algorithm!” highlights is the massive danger of the current rush to apply machine learning to everything.

As long as we are talking marketing campaigns (“customers who bought the product you just ordered also bought …”) or image recognition, the admiring mood remains appropriate. But now, ever more often, machine learning (usually presented as “Artificial Intelligence” to sound more impressive) gets applied to decisions affecting human lives. In the US, for example, machine-learning algorithms increasingly help judges make decisions, or make the decisions themselves. Following this slippery path is crazy and unethical. It is also dangerous, as the U.K. students’ reaction indicates.

Machine learning does what the name indicates: it reproduces and generalizes the dominant behaviors of the past. The algorithms have no notion of right and wrong; they just learn. When they affect societal issues, the potential for societal disaster is everywhere.

Amid all the enthusiasm generated by the elegant techniques invented by machine-learning pioneers over the last two decades, one barely encounters any serious reservation. Codes of ethics (from ACM and others) have little to contribute.

We should be careful, though. Either we get our act together and define exacting controls on the use of machine learning for matters affecting people’s fates, or we will see a massive rejection of algorithmic technology, the right parts along with the wrong ones.

The British students of the year 2020’s weird summer will not be the last ones to tell us to fuck the algorithm.

This article was first published in the Communications of the ACM blog.Recycled

VN:F [1.9.10_1130]
Rating: 10.0/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

A novel concept for success in science

No one seems until now [1] to have identified a key element of any scientific article being submitted or revised for publication. It is guaranteed to increase, if not the quality of your articles, at least their chances of publication, which after all is what counts.

We are told to include a “related work” section, but just as important is the unrelated work section. For example:

Unrelated work

The following publications have no attested  relevance to the topic of this paper but, as pointed out by an anonymous reviewer [2], they are breathtakingly brilliant: [Meyer 1997], [Meyer 2005], [Meyer et al. 2009], [Al et Meyer 2011]. In addition, having taken a look at the composition of the Editorial Board,  we would like to point out the pioneering results introduced by [Meyer 2017] and [Meyer 2019].

This insight is shared with the sole selfless purpose of helping the community, particularly young and aspiring researchers.

Notes

[1] I did find a 2018 Twitter thread started by Arvind Narayanan, with the insightful (if dejected) observation that “`related work’ sections exclusively cite unrelated work”.

[2] Example only, for the sake of an example, since for my part I actually refuse to be an anonymous reviewer; I always sign my reviews, so if I want to tell the authors “I think you should cite my such-and-such paper here” I can do so without any qualms.

VN:F [1.9.10_1130]
Rating: 8.6/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

New master program at SIT: Webinar tomorrow

The Schaffhausen Institute of Technology (SIT) is holding a Webinar tomorrow with a set of three talks by: Serguei Beloussov, founder of Acronis and president of SIT; Michael Widenius, CTO of MariaDB and creator of MySQL Server; and Mauro Pezzè, my colleague at SIT, who will present the new master program that we have just announced, combining CS/SE topics with management and marketing courses to train future technology leaders.

The talks are in the form of a Webinar, starting at 9 AM this Tuesday (9 June). You can find all the details on the corresponding SIT page at here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

PhD and postdoc positions in verification in Switzerland

My group, the Chair of Software Engineering, at the newly created Schaffhausen Institute of Technology has open positions for both PhD students and postdocs. We are looking for candidates with a passion for reliable software and a mix of theoretical knowledge and practical experience in software engineering. Candidates should have degrees in computer science or related fields: a doctorate for postdoc positions, a master’s degree for PhD positions. Postdoc candidates should have a substantial publication record. Experience in one or more of the following fields is a plus:

  • Software verification (axiomatic, model-checking, abstract interpretation etc.).
  • Advanced techniques of software testing.
  • Formal methods, semantics of programming languages, type theory.
  • Design by Contract, Eiffel, techniques of correctness-by-construction.
  • Cybersecurity.

 Compensation at both levels is attractive. The PhD program is conducted in cooperation with partner universities. 

 Interested candidates should send a CV and relevant documents or links to bm@sit.org. They are also welcome to contact me for details.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Which one is better? Please answer this poll

I recently read the autobiography [1] of the great mathematician André Weil and came across the following comment (slightly abridged):

 

Any mathematician worthy of the name has known such states of lucid exaltation, when ideas magically fall into place. Poincaré, in a famous passage, described how he experienced such a moment when he discovered fuchsian functions. Of these states, Gauss reportedly said procreare jucundum (to procreate is a pleasure), while adding: sed parturire molestum (but giving birth is a pain). Unlike sexual pleasure, this one can last hours or even days. Whoever has experienced it wants to renew it, but is impotent to provoke it, except at best through obstinate work, of which it then appears as the reward.

 

This penetrating observation (if I may use the expression) brings a new perspective to the 18th-century French song famously revived by Joan Baez:

 

“The joy of love lasts but an instant”, it says,  “the pain of love lasts a lifetime” (plaisir d’amour ne dure qu’un moment, chagrin d’amour dure toute la vie). Clearly, the second part of the verse contains an error: “chagrin d’amour” (the pain of love) must have been a transcription mistake for “comprendre la démonstration par Paul Cohen de l’indépendance de l’hypothèse du continu” (understanding Paul Cohen’s proof of the independence of the continuum hypothesis, which unlike most pains of love does take a lifetime, or two). It is not hard how the confusion could have arisen, as both sound French.

Still, Weil’s observation, if true, is worrying. As everyone can witness almost daily, mathematical ability in humankind is progressing by leaps and bounds. Does this trend portend bad news (echoes of Tolstoi’s Kreutzer Sonata) for our collective reproductive future?

A grave question indeed. To help address it, I have started a scientific poll, which you will find here. The question it asks is of the utmost importance: do you prefer sex or math? To preserve the integrity of the study, please note that by answering the poll you are declaring that you have engaged in both sex and math, although not necessarily at the same time.

As soon as I get the research grant for which I have applied, I will submit the results to the Proceedings of the National Academy of Sciences.

Reference

[1] André Weil, Souvenirs d’Apprentissage, Birkhäuser, Boston, 1991. English translation: The Apprenticeship of a Mathematician, Springer, 1992.

 

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 5 votes)

The fool wants nothing

Another completely unexpected gem from the Viaje de Turquia (see the previous article on this blog): a 16-th century statement of the Dunning-Kruger effect!

An effect, of course, which has never been more visible than today (just watch the news).

Against Pedro, who narrates his travels and travails, the dialog sets two other characters, friends from his youth. They serve both as foils for Pedro, enabling his cleverness to shine — they are themselves not the brightest candles on the cake —, and as the embodiment of conventional wisdom. He occasionally gets really impatient with them, although always friendly, and at some point cites this ditty that he remembers from his youth in Spain:

 

Blind people want to see,

The deaf man wants to hear,

The fat man wants to slim,

The lame man wants to run.

For the fool there is no remedy:

Since he fancies that he knows,

He does not care to learn more.

Wow!

VN:F [1.9.10_1130]
Rating: 9.8/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

A retort that we could use

At this gloomy moment it is good to find a gem in an unexpected place.

I am reading (in translation) the Viaje de Turquia, or Turkish Voyage — literaly, Voyage of Turkey — a 16th-century epic dialog, whose authorship is disputed. It is a precious source of information on the period and rings throughout like a true story. The hero, Pedro, tells of his time as a prisoner of war of the Turks and the ignominies he had to suffer for years. He is a doctor, if a self-taught one, and has cured many members of the Pasha’s entourage, but at some point the Pasha, out of spite, sends him back to the harshest form of manual labor. One of his former patients, rich and high-ranked, spots him, the intellectual struggling to move heavy materials in the dirt and under the whip, and mocks him:

Hey, all the philosophy of Aristotle and Plato, all the medical science of Galen, all the eloquence of Cicero and Demosthenes, how have they helped you?

To which Pedro, having put his sack on his shoulder and wiped the tears caused by this pique, answers, looking him straight in the eye:

They have helped me live through days like this one.

Pretty good, I thought. Not just the sense of repartee, but the sentiment itself (echoing by the comments of many a mistreated intellectual in later ages including ours).

Not only that, but it worked, at least for a while. So astounded was the persecutor by the retort, that he took Pedro’s sack to carry it himself, and convinced the Pasha to relieve Pedro from hard work and give him money.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

How to protect from the coronavirus

In the current state of the pandemic and for many more months until a vaccine is found, there is exactly one way to fight the coronavirus, protecting yourself and protecting others.

It is not a mask.

It is two masks. You wear a mask, I wear a mask.

Many people still believe that they can only get the virus if an infected person coughs or sneezes on them. This is a tragic myth. Droplets are carried by breath in conversation, by food particles from someone eating near you, or simply by air flowing your way.

Anyone today who goes out without wearing a mask is irresponsible (or suicidal, but that is not an excuse, since he harms others too). Your mask is not enough, though. I must wear one too.

Then we are safe from each other. Remember: we do not have definitive figures, but at least one carrier in five is asymptomatic.

Everything else (and I am not even considering quack solutions and unproven treatments) is pointless. Disinfectant (or better soap) helps, but as a complement. Gloves help medical professionals, who know how to use them properly, but for the general public they can do more harm than good: look at people in shops, once they have gloves they touch everything, moving the virus everywhere. Testing will be critical, of course, but here is another sobering statistics: while there are no false positives (if you test positive, you are infected), around 20% of negative tests are wrong (people have the virus, and it is not detected).

I know: in many places, including some the most technologically advanced nations on earth, there are no masks to be found. This may be the greatest scandal of the modern era. But in the meantime makeshift masks are an acceptable palliative. There are guides all over the web as to how to make them, and if nothing else is available a tightly bound scarf or equivalent, cleaned thoroughly and regularly, will do.

Wear a mask and tell others to do the same.

VN:F [1.9.10_1130]
Rating: 9.0/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 6 votes)

Free tutoring for children

kidtutorsWe’re a group of cousins aged 8-14 who” got “the idea to help others, since we know we are not alone”. They are providing mostly free tutoring to other kids. Details here.

VN:F [1.9.10_1130]
Rating: 5.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Getting a program right, in nine episodes

About this article: it originated as a series of posts on the Communications of the ACM blog. I normally repost such articles here. (Even though copy-paste is usually not good, there are three reasons for this duplication: the readership seems to be largely disjoint; I can use better formatting, since their blog software is more restrictive than WordPress; and it is good to have a single repository for all my articles, including both those who originated on CACM and those who did not.) The series took the form of nine articles, where each of the first few ended with a quiz, to which the next one, published a couple of days later, provided an answer. Since all these answers are now available it would make no sense to use the same scheme, so I am instead publishing the whole thing as a single article  with nine sections, slightly adapted from the original.

I was too lazy so far to collect all the references into a single list, so numbers such as [1] refer to the list at the end of the corresponding section.


A colleague recently asked me to present a short overview of  axiomatic semantics as a guest lecture in one of his courses. I have been teaching courses on software verification for a long time (see e.g. here), so I have plenty of material; but instead of just reusing it, I decided to spend a bit of time on explaining why it is good to have a systematic approach to software verification. Here is the resulting tutorial.


 

1. Introduction and attempt #1

Say “software verification” to software professionals, or computer science students outside of a few elite departments, and most of them will think  “testing”. In a job interview, for example, show a loop-based algorithm to a programmer and ask “how would you verify it?”: most will start talking about devising clever test cases.

Far from me to berate testing [1]; in fact, I have always thought that the inevitable Dijkstra quote about testing — that it can only show the presence of errors, not their absence [2] — which everyone seems to take as an indictment and dismissal of testing (and which its author probably intended that way) is actually a fantastic advertisement for testing: a way to find bugs? Yes! Great! Where do I get it?  But that is not the same as verifying the software, which means attempting to ascertain that it has no bugs.

Until listeners realize that verification cannot just mean testing, the best course material on axiomatic semantics or other proof techniques will not attract any interest. In fact, there is somewhere a video of a talk by the great testing and public-speaking guru James Whittaker where he starts by telling his audience not to worry, this won’t be a standard boring lecture, he will not start talking about loop invariants [3]! (Loop invariants are coming in this article, in fact they are one of its central concepts, but in later sections only, so don’t bring the sleeping bags yet.) I decided to start my lecture by giving an example of what happens when you do not use proper verification. More than one example, in fact, as you will see.

A warning about this article: there is nothing new here. I am using an example from my 1990 book Introduction to the Theory of Programming Languages (exercise 9.12). Going even further back, a 1983 “Programming Pearls” Communications of the ACM article by Jon Bentley [4] addresses the same example with the same basic ideas. Yet almost forty years later these ideas are still not widely known among practitioners. So consider these articles as yet another tutorial on fundamental software engineering stuff.

The tutorial is a quiz. We start with a program text:

from

i := 1 ; j := n              — Result initialized to 0.

until i = j loop

m := (i + j) // 2         — Integer division

if t [m] ≤ x then i := m  else  j := m end

end

if x = t [i] then Result := i end

All variables are of integer type. t is an up-sorted array of integers, indexed from 1 to n . We do not let any notation get between friends. A loop from p until e loop q end executes p then, repeatedly: stops if e (the exit condition) is true, otherwise executes q. (Like {p ; while not e do {q}} in some other notations.) “:=” is assignment, “=” equality testing.  “//” is integer division, e.g. 6 //3 = 7 //3 = 2. Result is the name of a special variable whose final value will be returned by this computation (as part of a function, but we only look at the body). Result is automatically initialized to zero like all integer variables, so if execution does not assign anything to Result the function will return zero.

First question: what is this program trying to do?

OK, this is not the real quiz. I assume you know the answer: it is an attempt at “binary search”, which finds an element in the array, or determines its absence, in a sequence of about log2 (n) steps, rather than n if we were use sequential search.  (Remember we assume the array is sorted.) Result should give us a position where x appears in the array, if it does, and otherwise be zero.

Now for the real quiz: does this program meet this goal?

The answer should be either yes or no. (If no, I am not asking for a correct version, at least not yet, and in any case you can find some in the literature.) The situation is very non-symmetric, we might say Popperian:

  • To justify a no answer it suffices of a single example, a particular array t and a particular value x, for which the program fails to set Result as it should.
  • To justify a yes answer we need to provide a credible argument that for every t and  x the program sets Result as it should.

Notes to section 1

[1] The TAP conference series (Tests And Proofs), which Yuri Gurevich and I started, explores the complementarity between the two approaches.

[2] Dijkstra first published his observation in 1969. He did not need consider the case of infinite input sets: even for a trivial finite program that multiplies two 32-bit integers, the number of cases to be examined, 264, is beyond human reach. More so today with 64-bit integers. Looking at this from a 2020 perspective, we may note that exhaustive testing of a finite set of cases, which Dijkstra dismissed as impossible in practice, is in fact exactly what the respected model checking verification technique does; not on the original program, but on a simplified — abstracted — version precisely designed to keep the number of cases tractable. Dijkstra’s argument remains valid, of course, for  the original program if non-trivial. And model-checking does not get us out of the woods: while we are safe if its “testing” finds no bug, if it does find one we have to ensure that the bug is a property of the original program rather than an artifact of the abstraction process.

[3] It is somewhere on YouTube, although I cannot find it right now.

[4] Jon Bentley: Programming Pearls: Writing Correct Programs, in Communications of the ACM, vol. 26, no. 12, pp. 1040-1045, December 1983, available for example here.


2. Attempt #2

Was program #1 correct? If so it should yield the correct answer. (An answer is correct if either Result is the index in t of an element equal to x, or Result = 0 and x does not appear in t.)

This program is not correct. To prove that it is not correct it suffices of a single example (test case) for which the program does not  “yield the correct answer”. Assume x = 1 and the array t has two elements both equal to zero (n = 2, remember that arrays are indexed from 1):

t = [0   0]

The successive values of the variables and expressions are:

                                            m       i          j            i + j + 1

After initialization:                   1         2                3

i ≠ j, so enter loop:           1       1        2                 6         — First branch of “if” since t [1] ≤ x
— so i gets assigned the value of m

But then neither of the values of i and j has changed, so the loop will repeat its body identically (taking the first branch) forever. It is not even that the program yields an incorrect answer: it does not yield an answer at all!

Note (in reference to the famous Dijkstra quote mentioned in the first article), that while it is common to pit tests against proofs, a test can actually be a proof: a test that fails is a proof that the program is incorrect. As valid as the most complex mathematical proof. It may not be the kind of proof we like most (our customers tend to prefer a guarantee that the program is correct), but it is a proof all right.

We are now ready for the second attempt:

—  Program attempt #2.

from

i := 1 ; j := n

until i = j or Result > 0  loop

m := (i + j) // 2         — Integer division

if t [m] ≤ x then

i := m  + 1

elseif t [m] = x then

Result := m

else                         — In this case t [m] > x

j := m – 1

end

end

Unlike the previous one this version always changes i or j, so we may hope it does not loop forever. It has a nice symmetry between i and j.

Same question as before: does this program meet its goal?


3. Attempt #3

The question about program #2, as about program #1: was: it right?

Again no.  A trivial example disproves it: n = 1, the array t contains a single element t [1] = 0, x = 0. Then the initialization sets both i and j to 1, i = j holds on entry to the loop which stops immediately, but Result is zero whereas it should be 1 (the place where x appears).

Here now is attempt #3, let us see it if fares better:

—  Program attempt #3.

from

i := 1 ; j := n

until i = j loop

m := (i + j + 1) // 2

if t [m] ≤ x then

i := m  + 1

else

j := m

end

end

if 1  ≤ i  and i ≤ n then Result := i end
       — If not, Result remains 0.

What about this one?


3. Attempt #4 (also includes 3′)

The first two program attempts were wrong. What about the third?

I know, you have every right to be upset at me, but the answer is no once more.

Consider a two-element array t = [0 0] (so n = 2, remember that our arrays are indexed from 1 by convention) and a search value x = 1. The successive values of the variables and expressions are:

                                                  m          i          j            i + j + 1

After initialization:                            1        2           4

i ≠ j, so enter loop:               2           3        2          6                  — First branch of “if” since t [2] < x

i ≠ j,  enter loop again:        3           ⚠                                       — Out-of-bounds memory access!
— (trying to access non-existent t [3])

Oops!

Note that we could hope to get rid of the array overflow by initializing i to 0 rather than 1. This variant (version #3′) is left as a bonus question to the patient reader. (Hint: it is also not correct. Find a counter-example.)

OK, this has to end at some point. What about the following version (#4): is it right?

—  Program attempt #4.

from

i := 0 ; j := n + 1

until i = j loop

m := (i + j) // 2

if t [m] ≤ x then

i := m  + 1

else

j := m

end

end

if 1 ≤ i  and i ≤ n then Result := i end


5. Attempt #5

Yes, I know, this is dragging on. But that’s part of the idea: witnessing how hard it is to get a program right if you just judging by the seat of your pants. Maybe we can get it right this time?

Are we there yet? Is program attempt #4 finally correct?

Sorry to disappoint, but no. Consider a two-element array t = [0 0], so n = 2, and a search value x = 1 (yes, same counter-example as last time, although here we could also use x = 0). The successive values of the variables and expressions are:

                                                 m          i          j            i + j

After initialization:                           0        3           3

i ≠ j, so enter loop:               1           2       3          5            — First branch of “if

i ≠ j, enter loop again:         2         3        3         6            — First branch again

i = j, exit loop

The condition of the final “if” is true, so Result gets the value 3. This is quite wrong, since there is no element at position 3, and in any case x does not appear in t.

But we are so close! Something like this should work, should it not?

So patience, patience, let us tweak it just one trifle more, OK?

—  Program attempt #5.

from

i := 1 ; j := n + 1

until i ≥ j or Result > 0 loop

m := (i + j) // 2

if t [m] < x then

i := m + 1

elseif  t [m] > x then

j := m

else

Result := m

end

end

Does it work now?


6. Attempt #6

The question about program #5  was the same as before: is it right, is it wrong?

Well, I know you are growing more upset at me with each section, but the answer is still that this program is wrong. But the way it is wrong is somewhat specific; and it applies, in fact, to all previous variants as well.

This particular wrongness (fancy word for “bug”) has a history. As I pointed out in the first article, there is a long tradition of using binary search to illustrate software correctness issues. A number of versions were published and proved correct, including one in the justly admired Programming Pearls series by Jon Bentley. Then in 2006 Joshua Bloch, then at Google, published a now legendary blog article [2] which showed that all these versions suffered from a major flaw: to obtain m, the approximate mid-point between i and j, they compute

(i + j) // 2

which, working on computer integers rather than mathematical integers, might overflow! This in a situation in which both i and j, and hence m as well, are well within the range of the computer’s representable integers, 2-n to 2n (give or take 1) where n is typically 31 or, these days, 63, so that there is no conceptual justification for the overflow.

In the specification that I have used for this article, i starts at 1, so the problem will only arise for an array that occupies half of the memory or more, which is a rather extreme case (but still should be handled properly). In the general case, it is often useful to use arrays with arbitrary bounds (as in Eiffel), so we can have even a small array, with high indices, for which the computation will produce an overflow and bad results.

The Bloch gotcha is a stark reminder that in considering the correctness of programs we must include all relevant aspects and consider programs as they are executed on a real computer, not as we wish they were executed in an ideal model world.

(Note that Jon Bentley alluded to this requirement in his original article: while he did not explicitly mention integer overflow, he felt it necessary to complement his proof by the comment that that  “As laborious as our proof of binary search was, it is still unfinished by some standards. How would you prove that the program is free of runtime errors (such as division by zero, word overflow, or array indices out of bounds)?” Prescient words!)

It is easy to correct the potential arithmetic overflow bug: instead of (i + j) // 2, Bloch suggested we compute the average as

i + (j – i) // 2

which is the same from a mathematician’s viewpoint, and indeed will compute the same value if both variants compute one, but will not overflow if both i and j are within range.

So we are ready for version 6, which is the same as version 5 save for that single change:

—  Program attempt #6.

from

i := 1 ; j := n + 1

until i ≥ j or Result > 0 loop

m := i + (j – i) // 2

if t [m] < x then

i := m + 1

elseif  t [m] > x then

j := m

else

Result := m

end

end

Now is probably the right time to recall the words by which Donald Knuth introduces binary search in the original 1973 tome on Sorting and Searching of his seminal book series The Art of Computer Programming:knuth

Although the basic idea of binary search is comparatively straightforward, the details can be somewhat tricky, and many good programmers have done it wrong the first few times they tried.

Do you need more convincing? Be careful what you answer, I have more variants up my sleeve and can come up with many more almost-right-but-actually-wrong program attempts if you nudge me. But OK, even the best things have an end. This is not the last section yet, but that was the last program attempt. To the naturally following next question in this running quiz,  “is version 6 right or wrong”, I can provide the answer: it is, to the best of my knowledge, a correct program. Yes! [3].

But the quiz continues. Since answers to the previous questions were all  that the programs were not correct, it sufficed in each case to find one case for which the program did not behave as expected. Our next question is of a different nature: can you find an argument why version #6 is correct?

References for section 6

[1] (In particular) Jon Bentley: Programming Pearls — Writing Correct Programs, in Communications of the ACM, vol. 26, no. 12, December 1983, pages 1040-1045, available here.

[2] Joshua Bloch: Extra, Extra — Read All About It: Nearly All Binary Searches and Mergesorts are Broken, blog post, on the Google AI Blog, 2 June 2006, available here.

[3] A caveat: the program is correct barring any typos or copy-paste errors — I am starting from rigorously verified programs (see the next posts), but the blogging system’s UI and text processing facilities are not the best possible for entering precise technical text such as code. However carefully I check, I cannot rule out a clerical mistake, which of course would be corrected as soon as it is identified.


7. Using a program prover

Preceding sections presented candidate binary search algorithms and asked whether they are correct. “Correct” means something quite precise: that for an array t and a value x, the final value of the variable Result is a valid index of t (that is to say, is between 1 and n, the size of t) if and only if x appears at that index in t.

The last section boldly stated that program attempt #6 was correct. The question was: why?

In the case of the preceding versions, which were incorrect, you could prove that property, and I do mean prove, simply by exhibiting a single counter-example: a single t and x for which the program does not correctly set Result. Now that I asserting the program to be correct, one example, or a million examples, do not suffice. In fact they are almost irrelevant. Test as much as you like and get correct results every time, you cannot get rid of the gnawing fear that if you had just tested one more time after the millionth test you would have produced a failure. Since the set of possible tests is infinite there is no solution in sight [1].

We need a proof.

I am going to explain that proof in the next section, but before that I would like to give you an opportunity to look at the proof by yourself. I wrote in one of the earlier articles that most of what I have to say was already present in Jon Bentley’s 1983 Programming Pearls contribution [2], but a dramatic change did occur in the four decades since: the appearance of automated proof system that can handle significant, realistic programs. One such system, AutoProof, was developed at the Chair of Software engineering at ETH Zurich [3] (key project members were Carlo Furia, Martin Nordio, Nadia Polikarpova and Julian Tschannen, with initial contributions by Bernd Schoeller) on the basis of the Boogie proof technology from Microsoft Research).

AutoProof is available for online use, and it turns out that one of the basic tutorial examples is binary search. You can go to the corresponding page and run the proof.

I am going to let you try this out (and, if you are curious, other online AutoProof examples as well) without too many explanations; those will come in the next section. Let me simply name the basic proof technique: loop invariant. A loop invariant is a property INV associated with a loop, such that:

  • A. After the loop’s initialization, INV will hold.
  • B. One execution of the loop’s body, if started with INV satisfied (and the loop’s exit condition not satisfied, otherwise we wouldn’t be executing the body!), satisfies INV again when it terminates.

This idea is of course the same as that of a proof by induction in mathematics: the initialization corresponds to the base step (proving that P (0) holds) and the body property to the induction step (proving that from P (n) follows P (n + 1). With a traditional induction proof we deduce that the property (P (n)) holds for all integers. For the loop, we deduce that when the loop finishes its execution:

  • The invariant still holds, since executing the loop means executing the initialization once then the loop body zero or more times.
  • And of course the exit condition also holds, since otherwise we would still be looping.

That is how we prove the correctness of a loop: the conjunction of the invariant and the exit condition must yield the property that we seek (in the example, the property, stated above of Result relative to t and x).

We also need to prove that the loop does terminate. This part involves another concept, the loop’s variant, which I will explain in the next section.

For the moment I will not say anything more and let you look at the AutoProof example page (again, you will find it here), run the verification, and read the invariant and other formal elements in the code.

To “run the verification” just click the Verify button on the page. Let me emphasize (and emphasize again and again and again) that clicking Verify will not run the code. There is no execution engine in AutoProof, and the verification does not use any test cases. It processes the text of the program as it appears on the page and below. It applies mathematical techniques to perform the proof; the core property to be proved is that the proposed loop invariant is indeed invariant (i.e. satisfies properties A and B above).

The program being proved on the AutoProof example page is version #6 from the last section, with different variable names. So far for brevity I have used short names such as i, j and m but the program on the AutoProof site applies good naming practices with variables called low, up, middle and the like. So here is that version again with the new variable names:

—  Program attempt #7  (identical to #6 with different variable names) .

from

low := 0 ; up := n

until low ≥ up or Result > 0 loop

middle := low + ((up – low) // 2)

if a [middle] < value then      — The array is now called a rather than t

low := middle + 1

elseif  a [middle] > value then

up := middle

else

Result := middle

end

end

This is exactly the algorithm text on the AutoProof page, the one that you are invited to let AutoProof verify for you. I wrote “algorithm text” rather than “program text” because the actual program text (in Eiffel) includes variant and invariant clauses which do not affect the program’s execution but make the proof possible.

Whether or not these concepts (invariant, variant, program proof) are completely new to you, do try the prover and take a look at the proof-supporting clauses. In the next article I will remove any remaining mystery.

Note and references for section 7

[1] Technically the set of possible [array, value] pairs is finite, but of a size defying human abilities. As I pointed out in the first section, the “model checking” and “abstract interpretation” verification techniques actually attempt to perform an exhaustive test anyway, after drastically reducing the size of the search space. That will be for some other article.

[2]  Jon Bentley: Programming Pearls: Writing Correct Programs, in Communications of the ACM, vol. 26, no. 12, pp. 1040-1045, December 1983, available for example here.

[3] The AutoProof page contains documentations and numerous article references.


8. Understanding the proof

The previous section invited you to run the verification on the AutoProof tutorial page dedicated to the example. AutoProof is an automated proof system for programs. This is just a matter of clicking  “Verify”, but more importantly, you should read the annotations added to the program text, particularly the loop invariant, which make the verification possible. (To avoid any confusion let me emphasize once more that clicking “Verify” does not run the program, and that no test cases are used; the effect is to run the verifier, which attempts to prove the correctness of the program by working solely on the program text.)

Here is the program text again, reverting for brevity to the shorter identifiers (the version on the AutoProof page has more expressive ones):

from

i := 1 ; j := n + 1

until i ≥ j or Result > 0 loop

m := i + (j – i) // 2

if t [m] < x then

i := m + 1

elseif  t [m] > x then

j := m

else

Result := m

end

end

Let us now see what makes the proof possible. The key property is the loop invariant, which reads

A:   1  ≤ i  ≤ j  ≤ n + 1
B:   0  ≤ Result  ≤ n
C:   ∀ k: 1 .. i –1  |  t [k] < x
D:   ∀ k: j .. n  |  t [k] > x
E:    (Result > 0)   ⇒   (t [Result] = x)

The notation is slightly different on the Web page to adapt to the Eiffel language as it existed at the time it was produced; in today’s Eiffel you can write the invariant almost as shown above. Long live Unicode, allowing us to use symbols such as (obtained not by typing them but by using smart completion, e.g. you start typing “forall” and you can select the symbol that pops up), for  “implies” and many others

Remember that the invariant has to be established by the loop’s initialization and preserved by every iteration. The role of each of its clauses is as follows:

  • A: keep the indices in range.
  • B: keep the variable Result, whose final value will be returned by the function, in range.
  • C and D: eliminate index intervals in which we have determined that the sought value, x, does not appear. Before i, array values are smaller; starting at j, they are greater. So these two intervals, 1..i and j..n, cannot contain the sought value. The overall idea of the algorithm (and most other search algorithms) is to extend one of these two intervals, so as to narrow down the remaining part of 1..n where x may appear.
  • E: express that as soon as we find a positive (non-zero) Result, its value is an index in the array (see B) where x does appear.

Why is this invariant useful? The answer is that on exit it gives us what we want from the algorithm. The exit condition, recalled above, is

i ≥ j or Result > 0

Combined with the invariant, it tells us that on exit one of the following will hold:

  • Result > 0, but then because of E we know that x appears at position Result.
  • i < j, but then A,  C and D  imply that x does not appear anywhere in t. In that case it cannot be true that Result > 0, but then because of B Result must be zero.

What AutoProof proves, mechanically, is that under the function’s precondition (that the array is sorted):

  • The initialization ensures the invariant.
  • The loop body, assuming that the invariant is satisfied but the exit condition is not, ensures the loop invariant again after it executes.
  • The combination of the invariant and the exit condition ensures, as just explained, the postcondition of the function (the property that Result will either be positive and the index of an element equal to x, or zero with the guarantee that x appears nowhere in t).

Such a proof guarantees the correctness of the program if it terminates. We (and AutoProof) must prove separately that it does terminate. The technique is simple: find a “loop variant”, an integer quantity v  which remains non-negative throughout the loop (in other words, the loop invariant includes or implies v ≥ 0) and decreases on each iteration, so that the loop cannot continue executing forever. An obvious variant here is j – i + 1 (where the + 1 is needed because j – i may go down to -1 on the last iteration if x does not appear in the array). It reflects the informal idea of the algorithm: repeatedly decrease an interval i .. j – 1 (initially, 1 .. n) guaranteed to be such that x appears in t if and only if it appears at an index in that interval. At the end, either we already found x or the interval is empty, implying that x does not appear at all.

A great reference on variants and the techniques for proving program termination is a Communications of the ACM article of 2011: [3].

The variant gives an upper bound on the number of iterations that remain at any time. In sequential search, j – i + 1 would be our best bet; but for binary search it is easy to show that  log(j – i + 1) is also a variant, extending the proof of correctness with a proof of performance (the key goal of binary search being to ensure a logarithmic rather than linear execution time).

This example is, I hope, enough to highlight the crucial role of loop invariants and loop variants in reasoning about loops. How did we get the invariant? It looks like I pulled it out of a hat. But in fact if we go the other way round (as advocated in classic books [1] [2]) and develop the invariant and the loop together the process unfolds itself naturally and there is nothing mysterious about the invariant.

Here I cannot resist quoting (thirty years on!) from my own book Introduction to the Theory of Programming Languages [4]. It has a chapter on axiomatic semantics (also known as Hoare logic, the basis for the ideas used in this discussion), which I just made available: see here [5]. Its exercise 9.12 is the starting point for this series of articles. Here is how the book explains how to design the program and the invariant [6]:

In the general case [of search, binary or not] we aim for a loop body of the form

m := ‘‘Some value in 1.. n such that i ≤ m < j’’;

if t [m] ≤ x then

i := m + 1

else

j := m

end

It is essential to get all the details right (and easy to get some wrong):

  • The instruction must always decrease the variant j – i, by increasing i or decreasing j. If the the definition of m specified just m ≤ j rather than m < j, the second branch would not meet this goal.
  •  This does not transpose directly to i: requiring i < m < j would lead to an impossibility when j – i is equal to 1. So we accept i ≤ m but then we must take m + 1, not m, as the new value of i in the first branch.
  •  The conditional’s guards are tests on t [m], so m must always be in the interval 1 . . n. This follows from the clause 0 ≤ i ≤ j ≤ n + 1 which is part of the invariant.
  •  If this clause is satisfied, then m ≤ n and m > 0, so the conditional instruction indeed leaves this clause invariant.
  • You are invited to check that both branches of the conditional also preserve the rest of the invariant.
  • Any policy for choosing m is acceptable if it conforms to the above scheme. Two simple choices are i  and j – 1; they lead to variants of the sequential search algorithm [which the book discussed just before binary search].

For binary search, m will be roughly equal to the average of i and j.

“Roughly” because we need an integer, hence the // (integer division).

In the last section, I will reflect further on the lessons we can draw from this example, and the practical significance of the key concept of invariant.

References and notes for section 8

[1] E.W. Dijkstra: A Discipline of Programming, Prentice Hall, 1976.

[2] David Gries: The Science of Programming, Springer, 1989.

[3] Byron Cook, Andreas  Podelski and Andrey Rybalchenko: Proving program termination, in Communications of the ACM, vol. 54, no. 11, May 2011, pages 88-98, available here.

[4] Bertrand Meyer, Introduction to the Theory of Programming Languages, Prentice Hall, 1990. The book is out of print but can be found used, e.g. on Amazon. See the next entry for an electronic version of two chapters.

[5] Bertrand Meyer Axiomatic semantics, chapter 9 from [3], available here. Note that the PDF was reconstructed from an old text-processing system (troff); the figures could not be recreated and are missing. (One of these days I might have the patience of scanning them from a book copy and adding them. Unless someone wants to help.) I also put online, with the same caveat, chapter 2 on notations and mathematical basis: see here.

[6] Page 383 of [4] and [5]. The text is verbatim except a slight adaptation of the programming notation and a replacement of the variables: i in the book corresponds to i – 1 here, and j to j – 1. As a matter of fact I prefer the original conventions from the book (purely as a matter of taste, since the two are rigorously equivalent), but I changed here to the conventions of the program as it appears in the AutoProof page, with the obvious advantage that you can verify it mechanically. The text extract is otherwise exactly as in the 1990 book.

9. Lessons learned

What was this journey about?

We started with a succession of attempts that might have “felt right” but were in fact all wrong, each in its own way: giving the wrong answer in some cases, crashing (by trying to access an array outside of its index interval) in some cases, looping forever in some cases. Always “in some cases”,  evidencing the limits of testing, which can never guarantee that it exercises all the problem cases. A correct program is one that works in all cases. The final version was correct; you were able to prove its correctness with an online tool and then to understand (I hope) what lies behind that proof.

To show how to prove such correctness properties, I have referred throughout the series to publications from the 1990s (my own Introduction to The Theory of Programming Languages), the 1980s (Jon Bentley’s Programming Pearls columns, Gries’s Science of Programming), and even the 1970s (Dijkstra’s Discipline of Programming). I noted that the essence of my argument appeared in a different form in one of Bentley’s Communications articles. What is the same and what has changed?

The core concepts have been known for a long time and remain applicable: assertion, invariant, variant and a few others, although they are much better understood today thanks to decades of theoretical work to solidify the foundation. Termination also has a more satisfactory theory.

On the practical side, however, the progress has been momentous. Considerable engineering has gone into making sure that the techniques scaled up. At the time of Bentley’s article, binary search was typical of the kind of programs that could be proved correct, and the proof had to proceed manually. Today, we can tackle much bigger programs, and use tools to perform the verification.

Choosing binary search again as an example today has the obvious advantage that everyone can understand all the details, but should not be construed as representative of the state of the art. Today’s proof systems are far more sophisticated. Entire operating systems, for example, have been mechanically (that is to say, through a software tool) proved correct. In the AutoProof case, a major achievement was the proof of correctness [1] of an entire data structure (collections) library, EiffelBase 2. In that case, the challenge was not so much size (about 8,000 source lines of code), but the complexity of both:

  • The scope of the verification, involving the full range of mechanisms of a modern object-oriented programming language, with classes,  inheritance (single and multiple), polymorphism, dynamic binding, generics, exception handling etc.
  • The code itself, using sophisticated data structures and algorithms, involving in particular advanced pointer manipulations.

In both cases, progress has required advances on both the science and engineering sides. For example, the early work on program verification assumed a bare-bones programming language, with assignments, conditionals, loops, routines, and not much more. But real programs use many other constructs, growing ever richer as programming languages develop. To cover exception handling in AutoProof required both theoretical modeling of this construct (which appeared in [2]) and implementation work.

More generally, scaling up verification capabilities from the small examples of 30 years ago to the sophisticated software that can be verified today required the considerable effort of an entire community. AutoProof, for example, sits at the top of a tool stack relying on the Boogie environment from Microsoft Research, itself relying on the Z3 theorem prover. Many person-decades of work make the result possible.

tool_stack

Beyond the tools, the concepts are esssential. One of them, loop invariants, has been illustrated in the final version of our program. I noted in the first article the example of a well-known expert and speaker on testing who found no better way to announce that a video would not be boring than  “relax, we are not going to talk about loop invariants.” Funny perhaps, but unfair. Loop invariants are one of the most beautiful concepts of computer science. Not so surprisingly, because loop invariants are the application to programming of the concept of mathematical induction. According to the great mathematician Henri Poincaré, all of mathematics rests on induction; maybe he exaggerated, maybe not, but who would think of teaching mathematics without explaining induction? Teaching programming without explaining loop invariants is no better.

Below is an illustration (if you will accept my psychedelic diagram) of what a loop is about, as a problem-solving technique. Sometimes we can get the solution directly. Sometimes we identify several steps to the solution; then we use a sequence (A ; B; C). Sometimes we can find two (or more) different ways of solving the problem in different cases; then we use a conditional (if c then A else B end). And sometimes we can only get a solution by getting closer repeatedly, not necessarily knowing in advance how many times we will have to advance towards it; then, we use a loop.

loop_strategy

We identify an often large (i.e. very general) area where we know the solution will lie; we call that area the loop invariant. The solution or solutions (there may be more than one) will have to satisfy a certain condition; we call it the exit condition. From wherever we are, we shoot into the invariant region, using an appropriate operation; we call it the initialization. Then we execute as many times as needed (maybe zero if our first shot was lucky) an operation that gets us closer to that goal; we call it the loop body. To guarantee termination, we must have some kind of upper bound of the distance to the goal, decreasing each time discretely; we call it the loop variant.

This explanation is only an illustration, but I hope it makes the ideas intuitive. The key to a loop is its invariant. As the figure suggests, the invariant is always a generalization of the goal. For example, in binary search (and many other search algorithms, such as sequential search), our goal is to find a position where either x appears or, if it does not, we can be sure that it appears nowhere. The invariant says that we have an interval with the same properties (either x appears at a position belonging to that interval or, if it does not, it appears nowhere). It obviously includes the goal as a special case: if the interval has length 1, it defines a single position.

An invariant should be:

  1. Strong enough that we can devise an exit condition which in the end, combined with the invariant, gives us the goal we seek (a solution).
  2. Weak enough that we can devise an initialization that ensures it (by shooting into the yellow area) easily.
  3. Tuned so that we can devise a loop body that, from a state satifying the invariant, gets us to a new one that is closer to the goal.

In the example:

  1. The exit condition is simply that the interval’s length is 1. (Technically, that we have computed Result as the single interval element.) Then from the invariant and the exit condition, we get the goal we want.
  2. Initialization is easy, since we can just take the initial interval to be the whole index range of the array, which trivially satisfies the invariant.
  3. The loop body simply decreases the length of the interval (which can serve as loop variant to ensure termination). How we decrease the length depends on the search strategy; in sequential search, each iteration decreases the length by 1, correct although not fast, and binary search decreases it by about half.

The general scheme always applies. Every loop algorithm is characterized by an invariant. The invariant may be called the DNA of the algorithm.

To demonstrate the relevance of this principle, my colleagues Furia, Velder, and I published a survey paper [6] in ACM Computing Surveys describing the invariants of important algorithms in many areas of computer science, from search algorithms to sorting (all major algorithms), arithmetic (long integer addition, squaring), optimization and dynamic programming  (Knapsack, Levenshtein/Edit distance), computational geometry (rotating calipers), Web (Page Rank)… I find it pleasurable and rewarding to go deeper into the basis of loop algorithms and understand their invariants; like a geologist who does not stop at admiring the mountain, but gets to understand how it came to be.

Such techniques are inevitable if we want to get our programs right, the topic of this article. Even putting aside the Bloch average-computation overflow issue, I started with 5 program attempts, all kind of friendly-looking but wrong in different ways. I could have continued fiddling with the details, following my gut feeling to fix the flaws and running more and more tests. Such an approach can be reasonable in some cases (if you have an algorithm covering a well-known and small set of cases), but will not work for non-trivial algorithms.

Newcomers to the concept of loop invariant sometimes panic: “this is all fine, you gave me the invariants in your examples, how do I find my own invariants for my own loops?” I do not have a magic  recipe (nor does anyone else), but there is no reason to be scared. Once you have understood the concept and examined enough examples (just a few of those in [6] should be enough), writing the invariant at the same time as you are devising a loop will come as a second nature to you.

As the fumbling attempts in the first few sections should show, there is not much of an alternative. Try this approach. If you are reaching these final lines after reading what preceded them, allow me to thank you for your patience, and to hope that this rather long chain of reflections on verification will have brought you some new insights into the fascinating challenge of writing correct programs.

References

[1] Nadia Polikarpova, Julian Tschannen, and Carlo A. Furia: A Fully Verified Container Library, in Proceedings of 20th International Symposium on Formal Methods (FM 15), 2015. (Best paper award.)

[2] Martin Nordio, Cristiano Calcagno, Peter Müller and Bertrand Meyer: A Sound and Complete Program Logic for Eiffel, in Proceedings of TOOLS 2009 (Technology of Object-Oriented Languages and Systems), Zurich, June-July 2009, eds. M. Oriol and B. Meyer, Springer LNBIP 33, June 2009.

[3] Boogie page at MSR, see here for publications and other information.

[4] Z3 was also originally from MSR and has been open-sourced, one can get access to publications and other information from  its Wikipedia page.

[5] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, vol. 46, no. 3, February 2014. Available here.

[6] Dynamic programming is a form of recursion removal, turning a recursive algorithm into an iterative one by using techniques known as “memoization” and  “bottom-up computation” (Berry). In this transformation, the invariant plays a key role. I will try to write this up some day as it is a truly elegant and illuminating explanation.

VN:F [1.9.10_1130]
Rating: 10.0/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

A little Schubertiade to brighten the gloom

For some time I have been nudging (yes, I am that kind of person) two young boys in my family, two cousins who are both learning the piano, to try Schubert’s delightful (and not very military) Marche Militaire together. With not much success until recently… but now they are stuck in their (neighboring) homes, and here is what comes out:

Click here

OK, not quite Horowitz and Rubinstein yet… (When the Carnegie Hall recital comes — will there ever be concerts at Carnegie Hall again? — I promise to post the announcement here with an offer for discounted tickets.) But I hope that in these trying times for all of us it brightens your day as it does not cease to brighten mine.

(Note added 21 March: there is already a complete and better rehearsed version — I have updated the above link to point to it.)

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Notations you didn’t even know you could use

Consider the following expression:

∃ c: s   ¦   moisture (c) = soft

This is obviously mathematics. To express such a property in a programming language, you have to write a function containing a loop that iterates through the elements of s. Right?

Wrong. The above construct is valid Eiffel. It’s a consequence of recent syntax extensions that retain all the simplicity and consistency of the language but take full advantage of Unicode. Of course you do not have Unicode characters such as on you keyboard, but EiffelStudio’s completion mechanism inserts them for you.

To see how this works, just read Alexander Kogtenkov’s recent blog post on the topic.

Note added 24 December 2020: you will find a longer exposition in a later article on this blog.

VN:F [1.9.10_1130]
Rating: 10.0/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Serious newspapers: now is the moment to unlock Coronavirus material, or incur eternal shame

In my last article, time to live up to the boasting, I pointed out how bewildering it is to see that top newspapers around the world, the supposed “papers of reference”, continue both to:

  • Extoll their grandiose proclamations of supposed devotion to public service.
  • Charge for access to the epidemic that is scaring the world.

In a meeting I recently attended, someone was saying that “the media has hyped the crisis”. About the mainstream media, this reproach is incorrect and unfair: articles have generally been measured and informative, explaining the situation and calling on experts.

But such solid content sits behind paywalls! Free sources, particularly on social networks, are where you find the hype, the crazy theories and the lies.

Rightly or wrongly, many people around the world are panicking. They need a serious source of information and they are not all able to pay for it, especially if it comes from many sources to which one must independently subscribe.

Newspaper owners, this is your moment of truth, or of eternal shame. Free Covid-19 content now and without restrictions until this crisis ends.

We are fed up with your self-professions of sanctity and want you to fulfill your elementary social duty. You should have started to do this weeks ago already.

It’s not even bad for business — it will attract new, grateful, supportive subscribers who will stay with you for a long time.

The simple, obvious, honest thing to do.

I, for one, pledge never in the future to give one cent, peso or kopeck in the future to any publication that continues its current selfish and abhorrent policy of charging for life-and-death information that the world craves and needs.

VN:F [1.9.10_1130]
Rating: 9.7/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Time to live up to the boasting

The decent media is not modest these days. “Democracy Dies in Darkness” says the excellent Washington Post, intimating, if I understand it right, that the only way for the US to avoid dictatorship is that I pay subscription fees. Maybe I would if they just stopped devoting every single one of their articles to King Ubu. La Repubblica tells us that it will “always fight for the defense of freeedom of information, for its readers and for all those who have in their hearts the principles of democracy and of civil coexistence.” Beautiful (and behind a paywall).

The epidemic expert Jonathan Quick, interviewed by the Guardian, had this remarkable observation, talking about Covid-19: news tends to be behind paywalls, while fake news is free. The Guardian is in a way the right place to make this comment, since it remains, admirably, free-access with voluntary subscription (and all the same does not seem to be doing too poorly). But everywhere else there has been no change of policy. Whether you are looking at the New York Times, the Washington Post, Le Monde, Le Figaro, Libération, the Neue Zürcher Zeitung, Tages Anzeiger (“Dieser Abo+ Artikel ist exklusiv für Abonnenten”), La Repubblica, La Stampa, the kind of reputable press organs to which we would naturally turn, all have their more in-depth analyses reserved for subscribers. (The Russian Vedomosti seems to be an exception.)

Granted, every company (except maybe the Washington Post, since I have a feeling I am ordering enough from Amazon already) is entitled to earn money. But not all companies claim that their business model is about saving the world. My dear self-praising press, if you are really as generously public-minded as you are, here is a good way to demonstrate it. People around the world are genuinely worried about the Coronavirus epidemic and eager for serious information, if only to counter rumors and conspiracy theories. They eagerly seek credible, validated information that has gone through professional vetting, but many of them cannot afford to subscribe to all the relevant sources.

A few days before and after major elections, outlets such as the NYT and Wapo generally make their political articles free-access. The current health scare is an even more serious occasion.

This is the time for all serious news media around the world to show that their grand declarations of philanthropy are not just words.

We, the readers, should vociferously demand that as a public service these press organs immediately make all Covid-19 news, reports and analyses free-access.

VN:F [1.9.10_1130]
Rating: 10.0/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

In the scary land of irrational discourse

A chemistry researcher published a paper in Science with two junior collaborators and, a few months later, found flaws and retracted the article.

She commented “I am totally bummed to announce that we have retracted last year’s paper on enzymatic synthesis of beta-lactams” and “it is painful to admit, but important to do so” and “the work has not been reproducible” and I apologize to all” and  “I was a bit busy when this was submitted, and did not do my job well”.

Not very unusual news; this kind of thing happens all the time as part of the normal process of research and publication. (This just in! Scientists are human! They make mistakes once in a while! Full story at 11!)

Perhaps this one is slightly more worthy of notice because the lead author is a Nobel prize winner. Time for some rejoicing (Schadenfreude as it is called in good English)  for anyone who is not a Nobel prize winner: haha, you think you are so smart but you mess up too. It never hurts to have an occasional reminder that we should not deify anyone. But hardly prime-time news.

Well, it is  prime-time news for Fox News, which devotes a whole article to the matter. OK, I know, Fox News. And yes, it does pain me to include a hyperlink to a foxnews.com page in this otherwise perfectly decent, civilized, family-safe blog. But in fact that particular article is not by itself outrageous. Suspicious, yes: why such a sudden focus on a minor scientific episode in a news source not particularly famous (I hope you admire my gift for euphemism) for its extensive coverage of the frontlines of scientific research? But whatever its ultimate agenda the article itself is  factual, not judgmental.

What is striking is the avalanche of reader comments on that page. If you go and take a look at them, be prepared; put on your parka. Reading these comments will be, for many of us, a peek into a completely different world. A world that we vaguely know exists, but do not actually visit.

It is not a nice world to venture into: full of bile, frustration, resentment, jealousy, conspiracy theories, slander, attacks on anyone trying to take a rational approach to issues, with hardly a pleasant or optimistic note ever. It is not a world one wants to visit often, but reading such a page is an eye-opener for anyone who accepts the premises of rational thinking and might believe that they are universally accepted.

“Striking”, I wrote. Scary is a more apposite word. With the kind of nonsense-spouting and science-bashing that appears in countless messages in the comments section of the page, one can fear the worst regarding questions that face our society, for which rational, science-based advice is critical. (Yes, coronavirus, I am looking at you!)

Very few of the comments on the page says the obvious: it is not good to make errors, but errors will occur, and the scientist should be commended for checking further and coming out with the admission that her study had flaws. As far as we know the initiative came from her, spontaneously. It is one of the signs of the healthiness of science that we always question results. We question those of other people (there are plenty of sites, such as pubpeer and forbetterscience, entirely devoted to tracking and debunking flawed research). We also question our own: partly to avoid the humiliation of having someone else report one of our mistakes before we do; but also because of the good scientist’s natural search for intellectual honesty.

Most of the article commenters do not mention this key lesson of the incident; the Nobel prize winner’s integrity. For them, the article retraction demonstrates that… the entire edifice of science is flawed! For example:

She’s a liberal… I thought her being wrong was understood.

Now we need to find an honest Climate Change researcher to admit that their computer models are faulty and much of their “data” is fake.

Integrity! Now if the “scientists” who have fabricated Global Warming/ Climate Change, whatever, “research” would come forward with admissions about their flawed, fallacious “research” we would be golden.

Now if we could get the climate change “scientists” to do the same maybe some credibility could be restored to the field.

and so on ad nauseam. (Not a figure of style — reading these comments is truly nauseating.) In reality the retraction demonstrates, or rather illustrates (one example is not a demonstration), the reverse of these assertions: that the scientific process includes its own correction mechanisms. To use a computer scientist’s terminology, it is not fault-free (no scientist ever claimed anything like that) but fault-tolerant.

Of course the reason the Fox News crowd is suddenly so interested in science is not (one imagines) science per se but the science of climate change. Comment after comment uses the article, as illustrated by the above examples, to dismiss the scientific consensus on the reports of the United Nations’ Intergovernmental Panel on Climate Change. In other words: the retraction of one three-author paper on beta-lactams proves that the the work of hundreds of scientists producing thousands of articles on climatology over several decades is flawed? The logic of such a deduction is… shaky.

The modern world is based, through technology, on science. To post on the Web their absurd rejections of scientifically established facts, the Fox News readers couldn’t do without relying on mobile phones, mobile networks, software systems, computers and other extraordinary achievements of human intelligence, the result of centuries of patient cumulative application of the same scientific principles and techniques that these posts ridicule. They are stuck in a pre-scientific mindset, dominated by the kind of magical thinking that the founders of modern thought had to overcome between the 16th and 18th century, as brilliantly analyzed by Gaston Bachelard’s Formation of the Scientific Mind.

Somehow they skipped what the rest of us learn in grade school (that two plus two equals four, cause precedes effect and so on). They are many, they vote, they  think they are right and the rest of the world is wrong, hold these beliefs very strongly (Dunning-Kruger effect), and put the world at risk.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Getting your priorities right

In the restrooms of French freeway service stations managed by Total, the soap dispensers partake of pressing advice:

Not too much soap please

The message reads:

ONLY ONCE
Press for
clean hands
1x

Total wants to save on costs. Soap is money.

Fine. But on the matter of hand-washing one might (perhaps) think, in the current circumstances, of more urgent advice?

VN:F [1.9.10_1130]
Rating: 9.7/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Call for suggestions: beauty

On April 29 in the early evening at the Schaffhausen Institute of Technology I will give a talk on “The Beauty of Software”, exploring examples of what makes some concepts, algorithms, data structures etc. produce a sense of esthetics. (Full abstract below.) I gave a first version at TOOLS last year but am revising and expanding the talk extensively.

I obviously have my own examples but am interested in more. If you have some that you feel should be considered for inclusion, perhaps because you experienced a “Wow!” effect when you encountered them, please tell me. I am only asking for names or general pointers, not an in-depth analysis (that’s my job). To avoid having my thunder stolen I would prefer that you alert me by email. I will give credit for examples not previously considered.

Thanks!

Abstract of the talk as published:

Scientists often cite the search for beauty as one of their primary guiding forces. Programming and software engineering offer an inexhaustible source of astoundingly beautiful ideas, from strikingly elegant algorithms and data structures to powerful principles of methodology and language design.

Defining beauty is elusive, but true beauty imposes itself in such a way as to remove any doubt. Drawing comparisons from art, literature and other endeavours. He will show a sample of ideas from all walks of software, directly understandable to a wide audience of non-software-experts, offering practical applications in technology that we use daily, and awe-inspiring in their simplicity and elegance.

VN:F [1.9.10_1130]
Rating: 9.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)