Archive for the ‘Programming techniques’ Category.

A remarkable group photo

On 13-15 September 1999 a symposium took place in St Catherine College in Oxford,  in honor of Tony Hoare’s “retirement” from Oxford (the word is in quotes because he has had several further productive careers since). The organizers were Jim Woodcock, Bill Roscoe and Jim Davies. The proceedings are available as Millenial Perspectives in Computer Science, MacMillan Education UK, edited by Davies, Roscoe and Woodcock. The Symposium was a milestone event.

As part of a recent conversation on something else, YuQian Zhou(who was also there) sent me a group photo from the event, which I did not know even existed. I am including it below; it is actually a photo of a paper photo but the resolution is good. It is a fascinating gallery of outstanding people in programming and verification. (How many Turing award winners can you spot? I see 7.)

Many thanks to YuQian Zhou, Jim Woodcock and Bill Roscoe for insights into the picture in discussions of the past two weeks.

photo

VN:F [1.9.10_1130]
Rating: 10.0/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Niklaus Wirth and the Importance of Being Simple

[This is a verbatim copy of a post in the Communications of the ACM blog, 9 January 2024.]

I am still in shock from the unexpected death of Niklaus Wirth eight days ago. If you allow a personal note (not the last one in this article): January 11, two days from now, was inscribed in my mind as the date of the next time he was coming to my home for dinner. Now it is the date set for his funeral.

standing

Niklaus Wirth at the ACM Turing centenary celebration
San Francisco, 16 June 2012
(all photographs in this article are by B. Meyer)

A more composed person would wait before jotting down thoughts about Wirth’s contributions but I feel I should do it right now, even at the risk of being biased by fresh emotions.

Maybe I should first say why I have found myself, involuntarily, writing obituaries of computer scientists: Kristen Nygaard and Ole-Johan Dahl, Andrey Ershov, Jean Ichbiah, Watts Humphrey, John McCarthy, and most recently Barry Boehm (the last three in this very blog). You can find the list with comments and links to the eulogy texts on the corresponding section of my publication page. The reason is simple: I have had the privilege of frequenting giants of the discipline, tempered by the sadness of seeing some of them go away. (Fortunately many others are still around and kicking!) Such a circumstance is almost unbelievable: imagine someone who, as a student and young professional, discovered the works of Galileo, Descartes, Newton, Ampère, Faraday, Einstein, Planck and so on, devouring their writings and admiring their insights — and later on in his career got to meet all his heroes and conduct long conversations with them, for example in week-long workshops, or driving from a village deep in Bavaria (Marktoberdorf) to Munich airport. Not possible for a physicist, of course, but exactly the computer science equivalent of what happened to me. It was possible for someone of my generation to get to know some of the giants in the field, the founding fathers and mothers. In my case they included some of the heroes of programming languages and programming methodology (Wirth, Hoare, Dijkstra, Liskov, Parnas, McCarthy, Dahl, Nygaard, Knuth, Floyd, Gries, …) whom I idolized as a student without every dreaming that I would one day meet them. It is natural then to should share some of my appreciation for them.

My obituaries are neither formal, nor complete, nor objective; they are colored by my own experience and views. Perhaps you object to an author inserting himself into an obituary; if so, I sympathize, but then you should probably skip this article and its companions and go instead to Wikipedia and official biographies. (In the same vein, spurred at some point by Paul Halmos’s photographic record of mathematicians, I started my own picture gallery. I haven’t updated it recently, and the formatting shows the limits of my JavaScript skills, but it does provide some fresh, spontaneous and authentic snapshots of famous people and a few less famous but no less interesting ones. You can find it here. The pictures of Wirth accompanying this article are taken from it.)

liskov

Niklaus Wirth, Barbara Liskov, Donald Knuth
(ETH Zurich, 2005, on the occasion of conferring honorary doctorates to Liskov and Knuth)

A peculiarity of my knowledge of Wirth is that unlike his actual collaborators, who are better qualified to talk about his years of full activity, I never met him during that time. I was keenly aware of his work, avidly getting hold of anything he published, but from a distance. I only got to know him personally after his retirement from ETH Zurich (not surprisingly, since I joined ETH because of that retirement). In the more than twenty years that followed I learned immeasurably from conversations with him. He helped me in many ways to settle into the world of ETH, without ever imposing or interfering.

I also had the privilege of organizing in 2014, together with his longtime colleague Walter Gander, a symposium in honor of his 80th birthday, which featured a roster of prestigious speakers including some of the most famous of his former students (Martin Oderski, Clemens Szyperski, Michael Franz…) as well as Vint Cerf. Like all participants in this memorable event (see here for the program, slides, videos, pictures…) I learned more about his intellectual rigor and dedication, his passion for doing things right, and his fascinating personality.

Some of his distinctive qualities are embodied in a book published on the occasion of an earlier event, School of Niklaus Wirth: The Art of Simplicity (put together by his close collaborator Jürg Gutknecht together with Laszlo Boszormenyi and Gustav Pomberger; see the Amazon page). The book, with its stunning white cover, is itself a model of beautiful design achieved through simplicity. It contains numerous reports and testimonials from his former students and colleagues about the various epochs of Wirth’s work.

bauer

Niklaus Wirth (right)
with F.L. Bauer, one of the founders of German computer science
Zurich,22 June 2005

Various epochs and many different topics. Like a Renaissance man, or one of those 18-th century “philosophers” who knew no discipline boundaries, Wirth straddled many subjects. It was in particular still possible (and perhaps necessary) in his generation to pay attention to both hardware and software. Wirth is most remembered for his software work but he was also a hardware builder. The influence of his PhD supervisor, computer design pioneer and UC Berkeley professor Harry Huskey, certainly played a role.

Stirred by the discovery of a new world through two sabbaticals at Xerox PARC (Palo Alto Research Center, the mother lode of invention for many of today’s computer techniques) but unable to bring the innovative Xerox machines to Europe, Wirth developed his own modern workstations, Ceres and Lilith. (Apart from the Xerox stays, Wirth spent significant time in the US and Canada: University of Laval for his master degree, UC Berkeley for his PhD, then Stanford, but only as an assistant professor, which turned out to be Switzerland’s and ETH’s gain, as he returned in 1968,)

 

lilith

Lilith workstation and its mouse
(Public display in the CAB computer science building at ETH Zurich)

One of the Xerox contributions was the generalized use of the mouse (the invention of Doug Englebart at the nearby SRI, then the Stanford Research Institute). Wirth immediately seized on the idea and helped found the Logitech company, which soon became, and remains today, a world leader in mouse technology.
Wirth returned to hardware-software codesign late in his career, in his last years at ETH and beyond, to work on self-driving model helicopters (one might say to big drones) with a Strong-ARM-based hardware core. He was fascinated by the goal of maintaining stability, a challenge involving physics, mechanical engineering, electronic engineering in addition to software engineering.
These developments showed that Wirth was as talented as an electronics engineer and designer as he was in software. He retained his interest in hardware throughout his career; one of his maxims was indeed that the field remains driven by hardware advances, which make software progress possible. For all my pride as a software guy, I must admit that he was largely right: object-oriented programming, for example, became realistic once we had faster machines and more memory.

Software is of course what brought him the most fame. I struggle not to forget any key element of his list of major contributions. (I will come back to this article when emotions abate, and will add a proper bibliography of the corresponding Wirth publications.) He showed that it was possible to bring order to the world of machine-level programming through his introduction of the PL/360 structured assembly language for the IBM 360 architecture. He explained top-down design (“stepwise refinement“), as no one had done before, in a beautiful article that forever made the eight-queens problem famous. While David Gries had in his milestone book Compiler Construction for Digital Computers established compiler design as a systematic discipline, Wirth showed that compilers could be built simply and elegantly through recursive descent. That approach had a strong influence on language design, as will be discussed below in relation to Pascal.

The emphasis simplicity and elegance carried over to his book on compiler construction. Another book with the stunning title Algorithms + Data Structures = Programs presented a clear and readable compendium of programming and algorithmic wisdom, collecting the essentials of what was known at the time.

And then, of course, the programming languages. Wirth’s name will forever remained tied to Pascal, a worldwide success thanks in particular to its early implementations (UCSD Pascal, as well as Borland Pascal by his former student Philippe Kahn) on microcomputers, a market that was exploding at just that time. Pascal’s dazzling spread was also helped by another of Wirth’s trademark concise and clear texts, the Pascal User Manual and Report, written with Kathleen Jensen. Another key component of Pascal’s success was the implementation technique, using a specially designed intermediate language, P-Code, the ancestor of today’s virtual machines. Back then the diversity of hardware architectures was a major obstacle to the spread of any programming language; Wirth’s ETH compiler produced P-Code, enabling anyone to port Pascal to a new computer type by writing a translator from P-Code to the appropriate machine code, a relatively simple task.

Here I have a confession to make: other than the clear and simple keyword-based syntax, I never liked Pascal much. I even have a snide comment in my PhD thesis about Pascal being as small, tidy and exciting as a Swiss chalet. In some respects, cheekiness aside, I was wrong, in the sense that the limitations and exclusions of the language design were precisely what made compact implementations possible and widely successful. But the deeper reason for my lack of enthusiasm was that I had fallen in love with earlier designs from Wirth himself, who for several years, pre-Pascal, had been regularly churning out new language proposals, some academic, some (like PL/360) practical. One of the academic designs I liked was Euler, but I was particularly keen about Algol W, an extension and simplification of Algol 60 (designed by Wirth with the collaboration of Tony Hoare, and implemented in PL/360). I got to know it as a student at Stanford, which used it to teach programming. Algol W was a model of clarity and elegance. It is through Algol W that I started to understand what programming really is about; it had the right combination of freedom and limits. To me, Pascal, with all its strictures, was a step backward. As an Algol W devotee, I felt let down.
Algol W played, or more precisely almost played, a historical role. Once the world realized that Algol 60, a breakthrough in language design, was too ethereal to achieve practical success, experts started to work on a replacement. Wirth proposed Algol W, which the relevant committee at IFIP (International Federation for Information Processing) rejected in favor of a competing proposal by a group headed by the Dutch computer scientist (and somewhat unrequited Ph.D. supervisor of Edsger Dijkstra) Aad van Wijngaarden.

Wirth recognized Algol 68 for what it was, a catastrophe. (An example of how misguided the design was: Algol 68 promoted the concept of orthogonality, roughly stating that any two language mechanisms could be combined. Very elegant in principle, and perhaps appealing to some mathematicians, but suicidal: to make everything work with everything, you have to complicate the compiler to unbelievable extremes, whereas many of these combinations are of no use whatsoever to any programmer!) Wirth was vocal in his criticism and the community split for good. Algol W was a casualty of the conflict, as Wirth seems to have decided in reaction to the enormity of Algol 68 that simplicity and small size were the cardinal virtues of a language design, leading to Pascal, and then to its modular successors Modula and Oberon.

Continuing with my own perspective, I admired these designs, but when I saw Simula 67 and object-oriented programming I felt that I had come across a whole new level of expressive power, with the notion of class unifying types and modules, and stopped caring much for purely modular languages, including Ada as it was then. A particularly ill-considered feature of all these languages always irked me: the requirement that every module should be declared in two parts, interface and implementation. An example, in my view, of a good intention poorly realized and leading to nasty consequences. One of these consequences is that the information in the interface part inevitably gets repeated in the implementation part. Repetition, as David Parnas has taught us, is (particularly in the form of copy-paste) the programmer’s scary enemy. Any change needs to be checked and repeated in both the original and the duplicate. Any bug needs to be fixed in both. The better solution, instead of the interface-implementation separation, is to write everything in one place (the class of object-oriented programming) and then rely on tools to extract, from the text, the interface view but also many other interesting views abstracted from the text.

In addition, modular languages offer one implementation for each interface. How limiting! With object-oriented programming, you use inheritance to provide a general version of an abstraction and then as many variants as you like, adding them as you see fit (Open-Closed Principle) and not repeating the common information. These ideas took me towards a direction of language design completely different from Wirth’s.

One of his principles in language design was that it should be easy to write a compiler — an approach that paid off magnificently for Pascal. I mentioned above the beauty of recursive-descent parsing (an approach which means roughly that you parse a text by seeing how it starts, deducing the structure that you expect to follow, then applying the same technique recursively to the successive components of the expected structure). Recursive descent will only work well if the language is LL (1) or very close to it. (LL (1) means, again roughly, that the first element of a textual component unambiguously determines the syntactic type of that component. For example the instruction part of a language is LL (1) if an instruction is a conditional whenever it starts with the keyword if, a loop whenever it starts with the keyword while, and an assignment variable := expression whenever it starts with a variable name. Only with a near-LL (1) structure is recursive descent recursive-decent.) Pascal was designed that way.

A less felicitous application of this principle was Wirth’s insistence on one-pass compilation, which resulted in Pascal requiring any use of indirect recursion to include an early announcement of the element — procedure or data type — being used recursively. That is the kind of thing I disliked in Pascal: transferring (in my opinion) some of the responsibilities of the compiler designer onto the programmer. Some of those constraints remained long after advances in hardware and software made the insistence on one-pass compilation seem obsolete.

What most characterized Wirth’s approach to design — of languages, of machines, of software, of articles, of books, of curricula — was his love of simplicity and dislike of gratuitous featurism. He most famously expressed this view in his Plea for Lean Software article. Even if hardware progress drives software progress, he could not accept what he viewed as the lazy approach of using hardware power as an excuse for sloppy design. I suspect that was the reasoning behind the one-compilation-pass stance: sure, our computers now enable us to use several passes, but if we can do the compilation in one pass we should since it is simpler and leaner.
As in the case of Pascal, this relentless focus could be limiting at times; it also led him to distrust artificial intelligence, partly because of the grandiose promises its proponents were making at the time. For many years indeed, AI never made it into ETH computer science. I am talking here of the classical, logic-based form of AI; I had not yet had the opportunity to ask Niklaus what he thought of the modern, statistics-based form. Perhaps the engineer in him would have mollified his attitude, attracted by the practicality and well-defined scope of today’s AI methods. I will never know.

As to languages, I was looking forward to more discussions; while I wholeheartedly support his quest for simplicity, size to me is less important than simplicity of the structure and reliance on a small number of fundamental concepts (such as data abstraction for object-oriented programming), taken to their full power, permeating every facet of the language, and bringing consistency to a powerful construction.

Disagreements on specifics of language design are normal. Design — of anything — is largely characterized by decisions of where to be dogmatic and where to be permissive. You cannot be dogmatic all over, or will end with a stranglehold. You cannot be permissive all around, or will end with a mess. I am not dogmatic about things like the number of compiler passes: why care about having one, two, five or ten passes if they are fast anyway? I care about other things, such as the small number of basic concepts. There should be, for example, only one conceptual kind of loop, accommodating variants. I also don’t mind adding various forms of syntax for the same thing (such as, in object-oriented programming, x.a := v as an abbreviation for the conceptually sound x.set_a (v)). Wirth probably would have balked at such diversity.

In the end Pascal largely lost to its design opposite, C, the epitome of permissiveness, where you can (for example) add anything to almost anything. Recent languages went even further, discarding notions such as static types as dispensable and obsolete burdens. (In truth C is more a competitor to P-Code, since provides a good target for compilers: its abstraction level is close to that of the computer and operating system, humans can still with some effort decipher C code, and a C implementation is available by default on most platforms. A kind of universal assembly language. Somehow, somewhere, the strange idea creeped into people’s minds that it could also be used as a notation for human programmers.)

In any case I do not think Niklaus followed closely the evolution of the programming language field in recent years, away from principles of simplicity and consistency; sometimes, it seems, away from any principles at all. The game today is mostly “see this cute little feature in my language, I bet you cannot do as well in yours!” “Oh yes I can, see how cool my next construct is!“, with little attention being paid to the programming language as a coherent engineering construction, and even less to its ability to produce correct, robust, reusable and extendible software.

I know Wirth was horrified by the repulsive syntax choices of today’s dominant languages; he could never accept that a = b should mean something different from b = a, or that a = a + 1 should even be considered meaningful. The folly of straying away from conventions of mathematics carefully refined over several centuries (for example by distorting “=” to mean assignment and resorting to a special symbol for equality, rather than the obviously better reverse) depressed him. I remain convinced that the community will eventually come back to its senses and start treating language design seriously again.

One of the interesting features of meeting Niklaus Wirth the man, after decades of studying from the works of Professor Wirth the scientist, was to discover an unexpected personality. Niklaus was an affable and friendly companion, and most strikingly an extremely down-to-earth person. On the occasion of the 2014 symposium we were privileged to meet some of his children, all successful in various walks of life: well-known musician in the Zurich scene, specialty shop owner… I do not quite know how to characterize in words his way of speaking (excellent) English, but it is definitely impossible to forget its special character, with its slight but unmistakable Swiss-German accent (also perceptible in German). To get an idea, just watch one of the many lecture videos available on the Web. See for example the videos from the 2014 symposium mentioned above, or this full-length interview recorded in 2018 as part of an ACM series on Turing Award winners.

On the “down-to-earth” part: computer scientists, especially of the first few generations, tend to split into the mathematician types and the engineer types. He was definitely the engineer kind, as illustrated by his hardware work. One of his maxims for a successful career was that there are a few things that you don’t want to do because they are boring or feel useless, but if you don’t take care of them right away they will come back and take even more of your time, so you should devote 10% of that time to discharge them promptly. (I wish I could limit that part to 10%.)

He had a witty, subtle — sometimes caustic — humor. Here is a Niklaus Wirth story. On the seventh day of creation God looked at the result. (Side note: Wirth was an atheist, which adds spice to the choice of setting for the story.) He (God) was pretty happy about it. He started looking at the list of professions and felt good: all — policeman, minister, nurse, street sweeper, interior designer, opera singer, personal trainer, supermarket cashier, tax collector… — had some advantages and some disadvantages. But then He got to the University Professor row. The Advantages entry was impressive: long holidays, decent salary, you basically get to do what you want, and so on; but the Disadvantages entry was empty! Such a scandalous discrepancy could not be tolerated. For a moment, a cloud obscured His face. He thought and thought and finally His smile came back. At that point, He had created colleagues.

When the computing world finally realizes that design needs simplicity, it will do well to go back to Niklaus Wirth’s articles, books and languages. I can think of only a handful of people who have shaped the global hardware and software industry in a comparable way. Niklaus Wirth is, sadly, sadly gone — and I still have trouble accepting that he will not show up for dinner, on Thursday or ever again — but his legacy is everywhere.

VN:F [1.9.10_1130]
Rating: 9.8/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +6 (from 6 votes)

Statement Considered Harmful

I harbor no illusion about the effectiveness of airing this particular pet peeve; complaining about it has about the same chance of success as protesting against split infinitives or music in restaurants. Still, it is worth mentioning that the widespread use of the word “statement” to denote a programming language element, such as an assignment, that directs a computer to perform some change, is misleading. “Instruction” is the better term.

A “statement” is “something stated, such as a single declaration or remark, or a report of fact or opinions” (Merriam-Webster).

Why does it matter? The use of “statement” to mean “instruction” obscures a fundamental distinction of software engineering: the duality between specification and implementation. Programming produces a solution to a problem; success requires expressing both the problem, in the form of a specification, and the devised solution, in the form of an implementation. It is important at every stage to know exactly where we stand: on the problem side (the “what”) or the solution side (the “how”). In his famous Goto Statement Considered Harmful of 1968, Dijkstra beautifully characterized this distinction as the central issue of programming:

Our intellectual powers are rather geared to master static relations and our powers to visualize processes evolving in time are relatively poorly developed. For that reason we should do (as wise programmers aware of our limitations) our utmost to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.

Software verification, whether conducted through dynamic means (testing) or static techniques (static analysis, proofs of correctness), relies on having separately expressed both a specification of the intent and a proposed implementation intended to realize that intent. They have to remain distinct; otherwise we cannot even define what it means that the program should be correct (correct with respect to what?), and even less what it means to validate the program (validate it against what?).

In many approaches to verification, the properties against which we validate programs are called assertions. An assertion expresses a property that should hold at some point of program execution. For example, after the assignment instruction a := b + 1, the assertion ab will hold. This notion of assertion is used both in testing frameworks, such as JUnit for Java or PyUnit for Python, and in program proving frameworks; see, for example, the interactive Web-based version of the AutoProof program-proving framework for Eiffel at autoproof.sit.org, and of course the entire literature on axiomatic (Floyd-Hoare-Dijkstra-style) verification.

The difference between the instruction and the assertion is critical: a := b + 1 tells the computer to do something (change the value of a), as emphasized here by the “:=” notation for assignment; ab does not direct the computer or the computation to do anything, but simply states a property that should hold at a certain stage of the computation if everything went fine so far.

In the second case, the word “states” is indeed appropriate: an assertion states a certain property. The expression of that property, ab, is a “statement” in the ordinary English sense of the term. The command to the computer, a := b + 1, is an instruction whose effect is to ensure the satisfaction of the statement ab. So if we use the word “statement” at all, we should use it to mean an assertion, not an instruction.

If we start calling instructions “statements” (a usage that Merriam-Webster grudgingly accepts in its last entry for the term, although it takes care to define it as “an instruction in a computer program,” emphasis added), we lose this key distinction.

There is no reason for this usage, however, since the word “instruction” is available, and entirely appropriate.

So, please stop saying “an assignment statement” or “a print statement“; say “an assignment instruction” and so on.

Maybe you won’t, but at least you have been warned.

Recycled This article was first published in the “Communications of the ACM” blog.

VN:F [1.9.10_1130]
Rating: 9.9/10 (10 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

New book: the Requirements Handbook

cover

I am happy to announce the publication of the Handbook of Requirements and Business Analysis (Springer, 2022).

It is the result of many years of thinking about requirements and how to do them right, taking advantage of modern principles of software engineering. While programming, languages, design techniques, process models and other software engineering disciplines have progressed considerably, requirements engineering remains the sick cousin. With this book I am trying to help close the gap.

pegsThe Handbook introduces a comprehensive view of requirements including four elements or PEGS: Project, Environment, Goals and System. One of its principal contributions is the definition of a standard plan for requirements documents, consisting of the four corresponding books and replacing the obsolete IEEE 1998 structure.

The text covers both classical requirements techniques and novel topics such as object-oriented requirements and the use of formal methods.

The successive chapters address: fundamental concepts and definitions; requirements principles; the Standard Plan for requirements; how to write good requirements; how to gather requirements; scenario techniques (use cases, user stories); object-oriented requirements; how to take advantage of formal methods; abstract data types; and the place of requirements in the software lifecycle.

The Handbook is suitable both as a practical guide for industry and as a textbook, with over 50 exercises and supplementary material available from the book’s site.

You can find here a book page with the preface and sample chapters.

To purchase the book, see the book page at Springer and the book page at Amazon US.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Introduction to the Theory of Programming Languages: full book now freely available

itpl_coverShort version: the full text of my Introduction to the Theory of Programming Languages book (second printing, 1991) is now available. This page has more details including the table of chapters, and a link to the PDF (3.3MB, 448 + xvi pages).

The book is a survey of methods for language description, particularly semantics (operational, translational, denotational, axiomatic, complementary) and also serves as an introduction to formal methods. Obviously it would be written differently today but it may still have its use.

A few days ago I released the Axiomatic Semantics chapter of the book, and the chapter introducing mathematical notations. It looked at the time that I could not easily  release the rest in a clean form, because it is impossible or very hard to use the original text-processing tools (troff and such). I could do it for these two chapters because I had converted them years ago for my software verification classes at ETH.

By perusing old files, however,  I realized that around the same time (early 2000s) I actually been able to produce PDF versions of the other chapters as well, even integrating corrections to errata  reported after publication. (How I managed to do it then I have no idea, but the result looks identical, save the corrections, to the printed version.)

The figures were missing from that reconstructed version (I think they had been produced with Brian Kernighan’s PIC graphical description language , which is even more forgotten today than troff), but I scanned them from a printed copy and reinserted them into the PDFs.

Some elements were missing from my earlier resurrection: front matter, preface, bibliography, index. I was able to reconstruct them from the original troff source using plain MS Word. The downside is that they are not hyperlinked; the index has the page numbers (which may be off by 1 or 2 in some cases because of reformatting) but not hyperlinks to the corresponding occurrences as we would expect for a new book. Also, I was not able to reconstruct the table of contents; there is only a chapter-level table of contents which, however, is hyperlinked (in other words, chapter titles link to the actual chapters). In the meantime I obtained the permission of the original publisher (Prentice Hall, now Pearson Education Inc.).

Here again is the page with the book’s description and the link to the PDF:

bertrandmeyer.com/ITPL

 

 

VN:F [1.9.10_1130]
Rating: 9.6/10 (11 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

OOSC-2 available online (officially)

My book Object-Oriented Software Construction, 2nd edition (see the Wikipedia page) has become hard to get. There are various copies floating around the Web but they often use bad typography (wrong colors) and are unauthorized.

In response to numerous requests and in anticipation of the third edition I have been able to make it available electronically (with the explicit permission of the original publisher).

You can find the link on another page on this site. (In sharing or linking please use that page, not the URL of the actual PDF which might change.)

I hope having the text freely available proves useful.

 

VN:F [1.9.10_1130]
Rating: 8.5/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

PhD and postdoc positions in verification in Switzerland

The Chair of Software Engineering, my group at the Schaffhausen Institute of Technology in Switzerland (SIT), has open positions for both PhD students and postdocs. We are looking for candidates with a passion for reliable software and a mix of theoretical knowledge and practical experience in software engineering. Candidates should have degrees in computer science or related fields: a doctorate for postdoc positions, a master’s degree for PhD positions. Postdoc candidates should have a substantial publication record. Experience is expected in one or more of the following fields:

  • Software verification (axiomatic, model-checking, abstract interpretation etc.).
  • Advanced techniques of software testing.
  • Formal methods, semantics of programming languages.
  • Concurrent programming.
  • Design by Contract, Eiffel, techniques of correctness-by-construction.

Some of the work involves the AutoProof framework, under development at SIT (earlier at ETH), although other topics are also available, particularly in static analysis.

Compensation is attractive. Candidates must have the credentials to work in Switzerland (typically, citizenship or residence in Switzerland or the EU). Although we work in part remotely like everyone else these days, the positions are residential.

Interested candidates should send a CV and relevant documents or links (and any questions) to bm@sit.org.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Tomorrow (Thursday) noon EDT: ACM talk on requirements

In the software engineering family requirements engineering is in my experience the poor cousin, lagging behind the progress of other parts (such as design). I have been devoting attention to the topic in recent months and am completing a book on the topic.

Tomorrow (Thursday), I will be covering some of the material in a one-hour Tech Talk for ACM, with the title

The Four PEGS of Requirements Engineering

The time is Thursday, 4 March 2021, at noon EDT (New York) and 18 CET (Paris, Zurich etc.). Attendance is free but requires registration, on the event page  here.

Abstract:

Bad software requirements can jeopardize projects. There is a considerable literature on requirements, but practice is far behind: what passes for requirements in industry usually consists of a few use cases or user stories, which are useful but not sufficient as a solution. Can we fix requirements engineering (known in other circles as business analysis) so that it is no longer the weak link in software engineering?

I will present ongoing work intended to help industry produce more useful requirements. It includes precise definitions of requirements concepts and a standard plan for requirements specifications, intended to replace the venerable but woefully obsolete IEEE standard from 1998. The plan contains four books covering the four “PEGS” of requirements engineering (which I will explain). The approach builds on existing knowledge to define a practical basis for requirements engineering and provide projects with precise and helpful guidelines.

This is I think the fourth time I am giving talks in this venue (previous talks were about Design by Contract, Agile Methods and Concurrency).

VN:F [1.9.10_1130]
Rating: 6.4/10 (10 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 9 votes)

Some contributions

Science progresses through people taking advantage of others’ insights and inventions. One of the conditions that makes the game possible is that you acknowledge what you take. For the originator, it is rewarding to see one’s ideas reused, but frustrating when that happens without acknowledgment, especially when you are yourself punctilious about citing your own sources of inspiration.

I have started to record some concepts that are widely known and applied today and which I believe I originated in whole or in part, whether or not their origin is cited by those who took them. The list below is not complete and I may update it in the future. It is not a list of ideas I contributed, only of those fulfilling two criteria:

  • Others have built upon them.  (If there is an idea that I think is great but no one paid attention to it, the list does not include it.)
  • They have gained wide visibility.

There is a narcissistic aspect to this exercise and if people want to dismiss it as just showing I am full of myself so be it. I am just a little tired of being given papers to referee that state that genericity was invented by Java, that no one ever thought of refactoring before agile methods, and so on. It is finally time to state some facts.

Facts indeed: I back every assertion by precise references. So if I am wrong — i.e. someone preceded me — the claims of precedence can be refuted; if so I will update or remove them. All articles by me cited in this note are available (as downloadable PDFs) on my publication page. (The page is up to date until 2018; I am in the process of adding newer publications.)

Post-publication note: I have started to receive some comments and added them in a Notes section at the end; references to those notes are in the format [A].

Final disclaimer (about the narcissistic aspect): the exercise of collecting such of that information was new for me, as I do not usually spend time reflecting on the past. I am much more interested in the future and definitely hope that my next contributions will eclipse any of the ones listed below.

Programming concepts: substitution principle

Far from me any wish to under-represent the seminal contributions of Barbara Liskov, particularly her invention of the concept of abstract data type on which so much relies. As far as I can tell, however, what has come to be known as the “Liskov Substitution Principle” is essentially contained in the discussion of polymorphism in section 10.1 of in the first edition (Prentice Hall, 1988) of my book Object-Oriented Software Construction (hereafter OOSC1); for example, “the type compatibility rule implies that the dynamic type is always a descendant of the static type” (10.1.7) and “if B inherits from A, the set of objects that can be associated at run time with an entity [generalization of variable] includes instances of B and its descendants”.

Perhaps most tellingly, a key aspect of the substitution principle, as listed for example in the Wikipedia entry, is the rule on assertions: in a proper descendant, keep the invariant, keep or weaken the precondition, keep or strengthen the postcondition. This rule was introduced in OOSC1, over several pages in section 11.1. There is also an extensive discussion in the article Eiffel: Applying the Principles of Object-Oriented Design published in the Journal of Systems and Software, May 1986.

The original 1988 Liskov article cited (for example) in the Wikipedia entry on the substitution principle says nothing about this and does not in fact include any of the terms “assertion”, “precondition”, “postcondition” or “invariant”. To me this absence means that the article misses a key property of substitution: that the abstract semantics remain the same. (Also cited is a 1994 Liskov article in TOPLAS, but that was many years after OOSC1 and other articles explaining substitution and the assertion rules.)

Liskov’s original paper states that “if for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for oz, then S is a subtype of T.” As stated, this property is impossible to satisfy: if the behavior is identical, then the implementations are the same, and the two types are identical (or differ only by name). Of course the concrete behaviors are different: applying the operation rotate to two different figures o1 and o2, whose types are subtypes of FIGURE and in some cases of each other, will trigger different algorithms — different behaviors. Only with assertions (contracts) does the substitution idea make sense: the abstract behavior, as characterized by preconditions, postconditions and the class invariants, is the same (modulo respective weakening and strengthening to preserve the flexibility of the different version). Realizing this was a major step in understanding inheritance and typing.

I do not know of any earlier (or contemporary) exposition of this principle and it would be normal to get the appropriate recognition.

Software design: design patterns

Two of the important patterns in the “Gang of Four” Design Patterns book (GoF) by Gamma et al. (1995) are the Command Pattern and the Bridge Pattern. I introduced them (under different names) in the following publications:

  • The command pattern appears in OOSC1 under the name “Undo-Redo” in section 12.2. The solution is essentially the same as in GoF. I do not know of any earlier exposition of the technique. See also notes [B] and [C].
  • The bridge pattern appears under the name “handle technique” in my book Reusable Software: The Base Component Libraries (Prentice Hall, 1994). It had been described several years earlier in manuals for Eiffel libraries. I do not know of an earlier reference. (The second edition of Object-Oriented Software Construction — Prentice Hall, 1997, “OOSC2” –, which also describes it, states that a similar technique is described in an article by Josef Gil and Ricardo Szmit at the TOOLS USA conference in the summer of 1994, i.e. after the publication of Reusable Software.)

Note that it is pointless to claim precedence over GoF since that book explicitly states that it is collecting known “best practices”, not introducing new ones. The relevant questions are: who, pre-GoF, introduced each of these techniques first; and which publications does the GoF cites as “prior art”  for each pattern. In the cases at hand, Command and Bridge, it does not cite OOSC1.

To be concrete: unless someone can point to an earlier reference, then anytime anyone anywhere using an interactive system enters a few “CTRL-Z” to undo commands, possibly followed by some “CTRL-Y” to redo them (or uses other UI conventions to achieve these goals), the software most likely relying on a technique that I first described in the place mentioned above.

Software design: Open-Closed Principle

Another contribution of OOSC1 (1988), section 2.3, reinforced in OOSC2 (1997) is the Open-Closed principle, which explained one of the key aspects of inheritance: the ability to keep a module both closed (immediately usable as is) and open to extension (through inheritance, preserving the basic semantics. I am mentioning this idea only in passing since in this case my contribution is usually recognized, for example in the Wikipedia entry.

Software design: OO for reuse

Reusability: the Case for Object-Oriented Design (1987) is, I believe, the first publication that clearly explained why object-oriented concepts were (and still are today — in Grady Booch’s words, “there is no other game in town”) the best answer to realize the goal of software construction from software components. In particular, the article:

  • Explains the relationship between abstract data types and OO programming, showing the former as the theoretical basis for the latter. (The CLU language at MIT originated from Liskov’s pioneering work on abstract data types, but was not OO in the full sense of the term, missing in particular a concept of inheritance.)
  • Shows that reusability implies bottom-up development. (Top-down refinement was the mantra at the time, and promoting bottom-up was quite a shock for many people.)
  • Explains the role of inheritance for reuse, as a complement to Parnas’s interface-based modular construction with information hiding.

Software design: Design by Contract

The contribution of Design by Contract is one that is widely acknowledged so I don’t have any point to establish here — I will just recall the essentials. The notion of assertion goes back to the work of Floyd, Hoare and Dijkstra in the sixties and seventies, and correctness-by-construction to Dijktra, Gries and Wirth, but Design by Contract is a comprehensive framework providing:

  • The use of assertions in an object-oriented context. (The notion of class invariant was mentioned in a paper by Tony Hoare published back in 1972.)
  • The connection of inheritance with assertions (as sketched above). That part as far as I know was entirely new.
  • A design methodology for quality software: the core of DbC.
  • Language constructs carefully seamed into the fabric of the language. (There were precedents there, but in the form of research languages such as Alphard, a paper design only, not implemented, and Euclid.)
  • A documentation methodology.
  • Support for testing.
  • Support for a consistent theory of exception handling (see next).

Design by Contract is sometimes taken to mean simply the addition of a few assertions here and there. What the term actually denotes is a comprehensive methodology with all the above components, tightly integrated into the programming language. Note in particular that preconditions and postconditions are not sufficient; in an OO context class invariants are essential.

Software design: exceptions

Prior to the Design by Contract work, exceptions were defined very vaguely, as something special you do outside of “normal” cases, but without defining “normal”. Design by Contract brings a proper perspective by defining these concepts precisely. This was explained in a 1987 article, Disciplined Exceptions ([86] in the list), rejected by ECOOP but circulated as a technical report; they appear again in detail in OOSC1 (sections 7.10.3 to 7.10.5).

Other important foundational work on exceptions, to which I know no real precursor (as usual I would be happy to correct any omission), addressed what happens to the outcome of an exception in a concurrent or distributed context. This work was done at ETH, in particular in the PhD theses  of B. Morandi and A. Kolesnichenko, co-supervised with S. Nanz. See the co-authored papers [345] and [363].

On the verification aspect of exceptions, see below.

Software design: refactoring

I have never seen a discussion of refactoring that refers to the detailed discussion of generalization in both of the books Reusable Software (1994, chapter 3) and Object Success (Prentice Hall, 1995, from page 122 to the end of chapter 6). These discussions describe in detail how, once a program has been shown to work, it should be subject to a posteriori design improvements. It presents several of the refactoring techniques (as they were called when the idea gained traction several years later), such as moving common elements up in the class hierarchy, and adding an abstract class as parent to concrete classes ex post facto.

These ideas are an integral part of the design methodology presented in these books (and again in OOSC2 a few later). It is beyond me why people would present refactoring (or its history, as in the Wikipedia entry on the topic) without referring to these publications, which were widely circulated and are available for anyone to inspect.

Software design: built-in documentation and Single-Product principle

Another original contribution was the idea of including documentation in the code itself and relying on tools to extract the documentation-only information (leaving implementation elements aside). The idea, described in detail in OOSC1 in 1988 (sections 9.4 and 9.5) and already mentioned in the earlier Eiffel papers, is that code should be self-complete, containing elements of various levels of abstraction; some of them describe implementation, but the higher-level elements describe specification, and are distinguished syntactically in such a way that tools can extract them to produce documentation at any desired level of abstraction.

The ideas were later applied through such mechanisms as JavaDoc (with no credit as far as I know). They were present in Eiffel from the start and the underlying principles, in particular the “Single Product principle” (sometimes “Self-Documentation principle”, and also generalized by J. Ostroff and R. Paige as “Single-Model principle”). Eiffel is the best realization of these principles thanks to:

  • Contracts (as mentioned above): the “contract view” of a class (called “short form” in earlier descriptions) removes the implementations but shows the relevant preconditions, postconditions and class invariants, given a precise and abstract specification of the class.
  • Eiffel syntax has a special place for “header comments”, which describe high-level properties and remain in the contract view.
  • Eiffel library class documentation has always been based on specifications automatically extracted from the actual text of the classes, guaranteeing adequacy of the documentation. Several formats are supported (including, from 1995 on, HTML, so that documentation can be automatically deployed on the Web).
  • Starting with the EiffelCase tool in the early 90s, and today with the Diagram Tool of EiffelStudio, class structures (inheritance and client relationships) are displayed graphically, again in an automatically extracted form, using either the BON or UML conventions.

One of the core benefits of the Single-Product principle is to guard against what some of my publications called the “Dorian Gray” syndrome: divergence of an implementation from its description, a critical problem in software because of the ease of modifying stuff. Having the documentation as an integral part of the code helps ensure that when information at some level of abstraction (specification, design, implementation) changes, the other levels will be updated as well.

Crucial in the approach is the “roundtripping” requirement: specifiers or implementers can make changes in any of the views, and have them reflected automatically in the other views. For example, you can graphically draw an arrow between two bubbles representing classes B and A in the Diagram Tool, and the code of B will be updated with “inherit A”; or you can add this Inheritance clause textually in the code of class B, and the diagram will be automatically updated with an arrow.

It is important to note how contrarian and subversive these ideas were at the time of their introduction (and still to some extent today). The wisdom was that you do requirements then design then implementation, and that code is a lowly product entirely separate from specification and documentation. Model-Driven Development perpetuates this idea (you are not supposed to modify the code, and if you do there is generally no easy way to propagate the change to the model.) Rehabilitating the code (a precursor idea to agile methods, see below) was a complete change of perspective.

I am aware of no precedent for this Single Product approach. The closest earlier ideas I can think of are in Knuth’s introduction of Literate Programming in the early eighties (with a book in 1984). As in the Single-product approach, documentation is interspersed with code. But the literate programming approach is (as presented) top-down, with English-like explanations progressively being extended with implementation elements. The Single Product approach emphasizes the primacy of code and, in terms of the design process, is very much yoyo, alternating top-down (from the specification to the implementation) and bottom-up (from the implementation to the abstraction) steps. In addition, a large part of the documentation, and often the most important one, is not informal English but formal assertions. I knew about Literate Programming, of course, and learned from it, but Single-Product is something else.

Software design: from patterns to components

Karine Arnout’s thesis at ETH Zurich, resulting in two co-authored articles ([255] and [257], showed that contrary to conventional wisdom a good proportion of the classical design patterns, including some of the most sophisticated, can be transformed into reusable components (indeed part of an Eiffel library). The agent mechanism (see below) was instrumental in achieving that result.

Programming, design and specification concepts: abstract data types

Liskov’s and Zilles’s ground-breaking 1974 abstract data types paper presented the concepts without a mathematical specification, using programming language constructs instead. A 1976 paper (number [3] in my publication list, La Description des Structures de Données, i.e. the description of data structures) was as far as I know one of the first to present a mathematical formalism, as  used today in presentations of ADTs. John Guttag was taking a similar approach in his PhD thesis at about the same time, and went further in providing a sound mathematical foundation, introducing in particular (in a 1978 paper with Jim Horning) the notion of sufficient completeness, to which I devoted a full article in this blog  (Are My Requirements Complete?) about a year ago. My own article was published in a not very well known journal and in French, so I don’t think it had much direct influence. (My later books reused some of the material.)

The three-level description approach of that article (later presented in English for an ACM workshop in the US in 1981, Pingree Park, reference [28]) is not well known but still applicable, and would be useful to avoid frequent confusions between ADT specifications and more explicit descriptions.

When I wrote my 1976 paper, I was not aware of Guttag’s ongoing work (only of the Liskov and Zilles paper), so the use of a mathematical framework with functions and predicates on them was devised independently. (I remember being quite happy when I saw what the axioms should be for a queue.) Guttag and I both gave talks at a workshop organized by the French programming language interest group in 1977 and it was fun to see that our presentations were almost identical. I think my paper still reads well today (well, if you read French). Whether or not it exerted direct influence, I am proud that it independently introduced the modern way of thinking of abstract data types as characterized by mathematical functions and their formal (predicate calculus) properties.

Language mechanisms: genericity with inheritance

Every once in a while I get to referee a paper that starts “Generics, as introduced in Java…” Well, let’s get some perspective here. Eiffel from its introduction in 1985 combined genericity and inheritance. Initially, C++ users and designers claimed that genericity was not needed in an OO context and the language did not have it; then they introduced template. Initially, the designers of Java claimed (around 1995) that genericity was not needed, and the language did not have it; a few years later Java got generics. Initially, the designers of C# (around 1999) claimed that genericity was not needed, and the language did not have it; a few years later C# and .NET got generics.

Genericity existed before Eiffel of course; what was new was the combination with inheritance. I had been influenced by work on generic modules by a French researcher, Didier Bert, which I believe influenced the design of Ada as well; Ada was the language that brought genericity to a much broader audience than the somewhat confidential languages that had such a mechanism before. But Ada was not object-oriented (it only had modules, not classes). I was passionate about object-oriented programming (at a time when it was generally considered, by the few people who had heard of it as an esoteric, academic pursuit). I started — in the context of an advanced course I was teaching at UC Santa Barbara — an investigation of how the two mechanisms relate to each other. The results were a paper at the first OOPSLA in 1986, Genericity versus Inheritance, and the design of the Eiffel type system, with a class mechanism, inheritance (single and multiple), and genericity, carefully crafted to complement each other.

With the exception of a Trellis-Owl, a  design from Digital Equipment Corporation also presented at the same OOPSLA (which never gained significant usage), there were no other OO languages with both mechanisms for several years after the Genericity versus Inheritance paper and the implementation of genericity with inheritance in Eiffel available from 1986 on. Eiffel also introduced, as far as I know, the concept of constrained genericity, the second basic mechanism for combining genericity with inheritance, described in Eiffel: The Language (Prentice Hall, 1992, section 10.8) and discussed again in OOSC2 (section 16.4 and throughout). Similar mechanisms are present in many languages today.

It was not always so. I distinctly remember people bringing their friends to our booth at some conference in the early nineties, for the sole purpose of having a good laugh with them at our poster advertising genericity with inheritance. (“What is this thing they have and no one else does? Generi-sissy-tee? Hahaha.”). A few years later, proponents of Java were pontificating that no serious language needs generics.

It is undoubtedly part of of the cycle of invention (there is a Schopenhauer citation on this, actually the only thing from Schopenhauer’s philosophy that I ever understood [D]) that people at some point will laugh at you; if it did brighten their day, why would the inventor deny them one of the little pleasures of life? But in terms of who laughs last, along the way C++ got templates, Java got generics, C# finally did too, and nowadays all typed OO languages have something of the sort.

Language mechanisms: multiple inheritance

Some readers will probably have been told that multiple inheritance is a bad thing, and hence will not count it as a contribution, but if done properly it provides a major abstraction mechanism, useful in many circumstances. Eiffel showed how to do multiple inheritance right by clearly distinguishing between features (operations) and their names, defining a class as a finite mapping between names and features, and using renaming to resolve any name clashes.

Multiple inheritance was made possible by an implementation innovation: discovering a technique (widely imitated since, including in single-inheritance contexts) to implement dynamic binding in constant time. It was universally believed at the time that multiple inheritance had a strong impact on performance, because dynamic binding implied a run-time traversal of the class inheritance structure, already bad enough for single inheritance where the structure is a tree, but prohibitive with multiple inheritance for which it is a directed acyclic graph. From its very first implementation in 1986 Eiffel used what is today known as a virtual table technique which guarantees constant-time execution of routine (method) calls with dynamic binding.

Language mechanisms: safe GC through strong static typing

Simula 67 implementations did not have automatic garbage collection, and neither had implementations of C++. The official excuse in the C++ case was methodological: C programmers are used to exerting manual control of memory usage. But the real reason was a technical impossibility resulting from the design of the language: compatibility with C precludes the provision of a good GC.

More precisely, of a sound and complete GC. A GC is sound if it will only reclaim unreachable objects; it is complete if it will reclaim all unreachable objects. With a C-based language supporting casts (e.g. between integers and pointers) and pointer arithmetic, it is impossible to achieve soundness if we aim at a reasonable level of completeness: a pointer can masquerade as an integer, only to be cast back into a pointer later on, but in the meantime the garbage collector, not recognizing it as a pointer, may have wrongly reclaimed the corresponding object. Catastrophe.

It is only possible in such a language to have a conservative GC, meaning that it renounces completeness. A conservative GC will treat as a pointer any integer whose value could possibly be a pointer (because it lies between the bounds of the program’s data addresses in memory). Then, out of precaution, the GC will refrain from reclaiming the objects at these addresses even if they appear unreachable.

This approach makes the GC sound but it is only a heuristics, and it inevitably loses completeness: every once in a while it will fail to reclaim some dead (unreachable) objects around. The result is a program with memory leaks — usually unacceptable in practice, particularly for long-running or continuously running programs where the leaks inexorably accumulate until the program starts thrashing then runs out of memory.

Smalltalk, like Lisp, made garbage collection possible, but was not a typed language and missed on the performance benefits of treating simple values like integers as a non-OO language would. Although in this case I do not at the moment have a specific bibliographic reference, I believe that it is in the context of Eiffel that the close connection between strong static typing (avoiding mechanisms such as casts and pointer arithmetic) and the possibility of sound and complete garbage collection was first clearly explained. Explained in particular around 1990 in a meeting with some of the future designers of Java, which uses a similar approach, also taken over later on by C#.

By the way, no one will laugh at you today for considering garbage collection as a kind of basic human right for programmers, but for a long time the very idea was quite sulfurous, and advocating it subjected you to a lot of scorn. Here is an extract of the review I got when I submitted the first Eiffel paper to IEEE Transactions on Software Engineering:

Systems that do automatic garbage collection and prevent the designer from doing his own memory management are not good systems for industrial-strength software engineering.

Famous last words. Another gem from another reviewer of the same paper:

I think time will show that inheritance (section 1.5.3) is a terrible idea.

Wow! I wish the anonymous reviewers would tell us what they think today. Needless to say, the paper was summarily rejected. (It later appeared in the Journal of Systems and Software — as [82] in the publication list — thanks to the enlightened views of Robert Glass, the founding editor.)

Language mechanisms: void safety

Void safety is a property of a language design that guarantees the absence of the plague of null pointer dereferencing.

The original idea came (as far as I know) from work at Microsoft Research that led to the design of a research language called C-omega; the techniques were not transferred to a full-fledged programming language. Benefiting from the existence of this proof of concept, the Eiffel design was reworked to guarantee void safety, starting from my 2005 ECOOP keynote paper (Attached Types) and reaching full type safety a few years later. This property of the language was mechanically proved in a 2016 ETH thesis by A. Kogtenkov.

Today all significant Eiffel development produces void-safe code. As far as I know this was a first among production programming languages and Eiffel remains the only production language to provide a guarantee of full void-safety.

This mechanism, carefully crafted (hint: the difficult part is initialization), is among those of which I am proudest, because in the rest of the programming world null pointer dereferencing is a major plague, threatening at any moment to crash the execution of any program that uses pointers of references. For Eiffel users it is gone.

Language mechanisms: agents/delegates/lambdas

For a long time, OO programming languages did not have a mechanism for defining objects wrapping individual operations. Eiffel’s agent facility was the first such mechanism or among the very first together the roughly contemporaneous but initially much more limited delegates of C#. The 1999 paper From calls to agents (with P. Dubois, M. Howard, M. Schweitzer and E. Stapf, [196] in the list) was as far as I know the first description of such a construct in the scientific literature.

Language mechanisms: concurrency

The 1993 Communications of the ACM paper on Systematic Concurrent Object-Oriented Programming [136] was certainly not the first concurrency proposal for OO programming (there had been pioneering work reported in particular in the 1987 book edited by Tokoro and Yonezawa), but it innovated in offering a completely data-race-free model, still a rarity today (think for example of the multi-threading mechanisms of dominant OO languages).

SCOOP, as it came to be called, was implemented a few years later and is today a standard part of Eiffel.

Language mechanisms: selective exports

Information hiding, as introduced by Parnas in his two seminal 1972 articles, distinguishes between public and secret features of a module. The first OO programming language, Simula 67, had only these two possibilities for classes and so did Ada for modules.

In building libraries of reusable components I realized early on that we need a more fine-grained mechanism. For example if class LINKED_LIST uses an auxiliary class LINKABLE to represent individual cells of a linked list (each with a value field and a “right” field containing a reference to another LINKABLE), the features of LINKABLE (such as the operation to reattach the “right” field) should not be secret, since LINKED_LIST needs them; but they should also not be generally public, since we do not want arbitrary client objects to mess around with the internal structure of the list. They should be exported selectively to LINKED_LIST only. The Eiffel syntax is simple: declare these operations in a clause of the class labeled “feature {LINKED_LIST}”.

This mechanism, known as selective exports, was introduced around 1989 (it is specified in full in Eiffel: The Language, from 1992, but was in the Eiffel manuals earlier). I think it predated the C++ “friends” mechanism which serves a similar purpose (maybe someone with knowledge of the history of C++ has the exact date). Selective exports are more general than the friends facility and similar ones in other OO languages: specifying a class as a friend means it has access to all your internals. This solution is too coarse-grained. Eiffel’s selective exports make it possible to define the specific export rights of individual operations (including attributes/fields) individually.

Language mechanisms and implementation: serialization and schema evolution

I did not invent serialization. As a student at Stanford in 1974 I had the privilege, at the AI lab, of using SAIL (Stanford Artificial Intelligence Language). SAIL was not object-oriented but included many innovative ideas; it was far ahead of its time, especially in terms of the integration of the language with (what was not yet called) its IDE. One feature of SAIL with which one could fall in love at first sight was the possibility of selecting an object and having its full dependent data structure (the entire subgraph of the object graph reached by following references from the object, recursively) stored into a file, for retrieval at the next section. After that, I never wanted again to live without such a facility, but no other language and environment had it.

Serialization was almost the first thing we implemented for Eiffel: the ability to write object.store (file) to have the entire structure from object stored into file, and the corresponding retrieval operation. OOSC1 (section 15.5) presents these mechanisms. Simula and (I think) C++ did not have anything of the sort; I am not sure about Smalltalk. Later on, of course, serialization mechanisms became a frequent component of OO environments.

Eiffel remained innovative by tackling the difficult problems: what happens when you try to retrieve an object structure and some classes have changed? Only with a coherent theoretical framework as provided in Eiffel by Design by Contract can one devise a meaningful solution. The problem and our solutions are described in detail in OOSC2 (the whole of chapter 31, particularly the section entitled “Schema evolution”). Further advances were made by Marco Piccioni in his PhD thesis at ETH and published in joint papers with him and M. Oriol, particularly [352].

Language mechanisms and implementation: safe GC through strong static typing

Simula 67 (if I remember right) did not have automatic garbage collection, and neither had C++ implementations. The official justification in the case of C++ was methodological: C programmers are used to exerting manual control of memory usage. But the real obstacle was technical: compatibility with C makes it impossible to have a good GC. More precisely, to have a sound and complete GC. A GC is sound if it will only reclaim unreachable objects; it is complete if it will reclaim all unreachable objects. With a C-based language supporting casts (e.g. between integers and pointers) and pointer arithmetic, it is impossible to achieve soundness if we aim at a reasonable level of completeness: a pointer can masquerade as an integer, only to be cast back into a pointer later on, but in the meantime the garbage collector, not recognizing it as a pointer, may have wrongly reclaimed the corresponding object. Catastrophe. It is only possible in such a language to have a conservative GC, which will treat as a pointer any integer whose value could possibly be a pointer (because its value lies between the bounds of the program’s data addresses in memory). Then, out of precaution, it will not reclaim the objects at the corresponding address. This approach makes the GC sound but it is only a heuristics, and it may be over-conservative at times, wrongly leaving dead (i.e. unreachable) objects around. The result is, inevitably, a program with memory leaks — usually unacceptable in practice.

Smalltalk, like Lisp, made garbage collection possible, but was not a typed language and missed on the performance benefits of treating simple values like integers as a non-OO language would. Although in this case I do not at the moment have a specific bibliographic reference, I believe that it is in the context of Eiffel that the close connection between strong static typing (avoiding mechanisms such as casts and pointer arithmetic) and the possibility of sound and complete garbage collection was first clearly explained. Explained in particular to some of the future designers of Java, which uses a similar approach, also taken over later on by C#.

By the way, no one will laugh at you today for considering garbage collection as a kind of basic human right for programmers, but for a long time it was quite sulfurous. Here is an extract of the review I got when I submitted the first Eiffel paper to IEEE <em>Transactions on Software Engineering:

Software engineering: primacy of code

Agile methods are widely and properly lauded for emphasizing the central role of code, against designs and other non-executable artifacts. By reading the agile literature you might be forgiven for believing that no one brought up that point before.

Object Success (1995) makes the argument very clearly. For example, chapter 3, page 43:

Code is to our industry what bread is to a baker and books to a writer. But with the waterfall code only appears late in the process; for a manager this is an unacceptable risk factor. Anyone with practical experience in software development knows how many things can go wrong once you get down to code: a brilliant design idea whose implementation turns out to require tens of megabytes of space or minutes of response time; beautiful bubbles and arrows that cannot be implemented; an operating system update, crucial to the project which comes five weeks late; an obscure bug that takes ages to be fixed. Unless you start coding early in the process, you will not be able to control your project.

Such discourse was subversive at the time; the wisdom in software engineering was that you need to specify and design a system to death before you even start coding (otherwise you are just a messy “hacker” in the sense this word had at the time). No one else in respectable software engineering circles was, as far as I know, pushing for putting code at the center, the way the above extract does.

Several years later, agile authors started making similar arguments, but I don’t know why they never referenced this earlier exposition, which still today I find not too bad. (Maybe they decided it was more effective to have a foil, the scorned Waterfall, and to claim that everyone else before was downplaying the importance of code, but that was not in fact everyone.)

Just to be clear, Agile brought many important ideas that my publications did not anticipate; but this particular one I did.

Software engineering: the roles of managers

Extreme Programming and Scrum have brought new light on the role of managers in software development. Their contributions have been important and influential, but here too they were for a significant part prefigured by a long discussion, altogether two chapters, in Object Success (1995).

To realize this, it is enough to read the titles of some of the sections in those chapters, describing roles for managers (some universal, some for a technical manager): “risk manager”, “interface with the rest of the world” (very scrummy!), “protector of the team’s sanity”, “method enforcer” (think Scrum Master), “mentor and critic”. Again, as far as I know, these were original thoughts at the time; the software engineering literature for the most part did not talk about these issues.

Software engineering: outsourcing

As far as I know the 2006 paper Offshore Development: The Unspoken Revolution in Software Engineering was the first to draw attention, in the software engineering community, to the peculiar software engineering challenges of distributed and outsourced development.

Software engineering: automatic testing

The AutoTest project (with many publications, involving I. Ciupa, A. Leitner, Y. Wei, M. Oriol, Y. Pei, M. Nordio and others) was not the first to generate tests automatically by creating numerous instances of objects and calling applicable operations (it was preceded by Korat at MIT), but it was the first one to apply this concept with Design by Contract mechanisms (without which it is of little practical value, since one must still produce test oracles manually) and the first to be integrated in a production environment (EiffelStudio).

Software engineering: make-less system building

One of the very first decisions in the design of Eiffel was to get rid of Make files.

Feldman’s Make had of course been a great innovation. Before Make, programmers had to produce executable systems manually by executing sequences of commands to compile and link the various source components. Make enabled them to instead  to define dependencies between components in a declarative way, resulting in a partial order, and then performed a topological sort to produce the sequence of comments. But preparing the list of dependencies remains a tedious task, particularly error-prone for large systems.

I decided right away in the design of Eiffel that we would never force programmers to write such dependencies: they would be automatically extracted from the code, through an exhaustive analysis of the dependencies between modules. This idea was present from the very the first Eiffel report in 1985 (reference [55] in the publication list): Eiffel programmers never need to write a Make file or equivalent (other than for non-Eiffel code, e.g. C or C++, that they want to integrate); they just click a Compile button and the compiler figures out the steps.

Behind this approach was a detailed theoretical analysis of possible relations between modules in software development (in many programming languages), published as the “Software Knowledge Base” at ICSE in 1985. That analysis was also quite instructive and I would like to return to this work and expand it.

Educational techniques: objects first

Towards an Object-Oriented Curriculum ( TOOLS conference, August 1993, see also the shorter JOOP paper in May of the same year) makes a carefully argued case for what was later called the Objects First approach to teaching programming. I would be interested to know if there are earlier publications advocating starting programming education with an OO language.

The article also advocated for the “inverted curriculum”, a term borrowed from work by Bernie Cohen about teaching electrical engineering. It was the first transposition of this concept to software education. In the article’s approach, students are given program components to use, then little by little discover how they are made. This technique met with some skepticism and resistance since the standard approach was to start from the very basics (write trivial programs), then move up. Today, of course, many introductory programming courses similarly provide students from day one with a full-fledged set of components enabling them to produce significant programs.

More recent articles on similar topics, taking advantage of actual teaching experience, are The Outside-In Method of Teaching Programming (2003) and The Inverted Curriculum in Practice (at ICSE 2006, with Michela Pedroni). The culmination of that experience is the textbook Touch of Class from 2009.

Educational techniques: Distributed Software Projects

I believe our team at ETH Zurich (including among others M. Nordio, J. Tschannen, P. Kolb and C. Estler and in collaboration with C. Ghezzi, E. Di Nitto and G. Tamburrelli at Politecnico di Milano, N. Aguirre at Rio Cuarto and many others in various universities) was the first to devise,  practice and document on a large scale (see publications and other details here) the idea of an educational software project conducted in common by student groups from different universities. It yielded a wealth of information on distributed software development and educational issues.

Educational techniques: Web-based programming exercises

There are today a number of cloud-based environments supporting the teaching of programming by enabling students to compile and test their programs on the Web, benefiting from a prepared environment (so that they don’t have to download any tools or prepare control files) and providing feedback. One of the first — I am not sure about absolute precedence — and still a leading one, used by many universities and applicable to many programming languages, is Codeboard.

The main developer, in my chair at ETH Zurich, was Christian Estler, supported in particular by M. Nordio and M. Piccioni, so I am only claiming a supporting role here.

Educational techniques: key CS/SE concepts

The 2001 paper Software Engineering in the Academy did a good job, I think, of defining the essential concepts to teach in a proper curriculum (part of what Jeannette Wing’s 2006 paper called Computational Thinking).

Program verification: agents (delegates etc.)

Reasoning about Function Objects (ICSE 2010, with M. Nordio, P. Müller and J. Tschannen) introduced verification techniques for objects representing functions (such as agents, delegates etc., see above) in an OO language. Not sure whether there were any such techniques before.

Specification languages: Z

The Z specification language has been widely used for formal development, particularly in the UK. It is the design of J-R Abrial. I may point out that I was a coauthor of the first publication on Z in English (1980),  describing a version that preceded the adaptation to a more graphical-style notation done later at Oxford. The first ever published description of Z, pertaining to an even earlier version, was in French, in my book Méthodes de Programmation (with C. Baudoin), Eyrolles, 1978, running over 15 pages (526-541), with the precise description of a refinement process.

Program verification: exceptions

Largely coming out of the PhD thesis of Martin Nordio, A Sound and Complete Program Logic for Eiffel (TOOLS 2009) introduces rules for dealing with exceptions in a Hoare-style verification framework.

Program verification: full library, and AutoProof

Nadia Polikarpova’s thesis at ETH, aided by the work of Carlo Furia and Julian Tschannen (they were the major contributors and my participation was less important), was as far as I know the first to produce a full functional verification of an actual production-quality reusable library. The library is EiffelBase 2, covering fundamental data structures.

AutoProof — available today, as a still experimental tool, through its Web interface, see here — relied on the AutoProof prover, built by the same team, and itself based on Microsoft Research’s Boogie and Z3 engines.

More

There are more concepts worthy of being included here, but for today I will stop here.

Notes

[A] One point of divergence between usual presentations of the substitution principle and the view in OOSC and my other publications is the covariance versus contravariance of routine argument types. It reflects a difference of views as to what the proper policy (both mathematically sound and practically usable) should be.

[B]  The GoF book does not cite OOSC for the command or bridge patterns. For the command pattern it cites (thanks to Adam Kosmaczewski for digging up the GoF text!) a 1985 SIGGRAPH paper by Henry Lieberman (There’s More to Menu Systems than Meets the Screen). Lieberman’s paper describes the notion of command object and mentions undoing in passing, but does not include the key elements of the command pattern (as explained in full in OOSC1), i.e. an abstract (deferred) command class with deferred procedures called (say) do_it and undo_it, then specific classes for each kind of command, each providing a specific implementation of those procedures, then a history list of commands supporting multiple-level undo and redo as explained in OOSC1. (Reading Lieberman’s paper with a 2021 perspective shows that it came tantalizingly close to the command pattern, but doesn’t get to it. The paper does talk about inheritance between command classes, but only to “define new commands as extensions to old commands”, not in the sense of a general template that can be implemented in many specific ways. And it does mention a list of objects kept around to enable recovery from accidental deletions, and states that the application can control its length, as is the case with a history list; but the objects in the list are not command objects, they are graphical and other objects that have been deleted.)

[C] Additional note on the command pattern: I vaguely remember seeing something similar to the OOSC1 technique in an article from a supplementary volume of the OOPSLA proceedings in the late eighties or early nineties, i.e. at the same time or slightly later, possibly from authors from Xerox PARC, but I have lost the reference.

[D] Correction: I just checked the source and learned that the actual Schopenhauer quote (as opposed to the one that is usually quoted) is different; it does not include the part about laughing. So much for my attempts at understanding philosophy.

 

VN:F [1.9.10_1130]
Rating: 8.7/10 (27 votes cast)
VN:F [1.9.10_1130]
Rating: +8 (from 14 votes)

Time to resurrect PSP?

Let us assume for the sake of the argument that software quality matters. There are many ingredients to software quality, of which one must be the care that every programmer devotes to the job. The Personal Software Process, developed by Watts Humphrey in the 1990s [1], prescribes a discipline that software developers should apply to produce good software and improve their professional ability over their careers. It has enjoyed moderate success but was never a mass movement and rarely gets mentioned nowadays; few software developers, in my experience, even know the name. Those who do often think of it as passé, a touching memory from the era of Monica Lewinsky and the Roseanne show.

Once cleaned of a few obsolete elements, PSP deserves to be known and applied.

PSP came out of Watts Humphrey’s earlier work on the Capability Maturity Model (see my earlier article on this blog, What is wrong with CMMI), a collection of recommended practices and assessment criteria for software processes, originally developed in the mid-eighties for the U.S. military contractor community but soon thereafter embraced by software outsourcing companies (initially, Indian ones) and later by other industries. Responding to complaints that CMM/CMMI, focused on processes in large companies, ignored the needs of smaller ones, and lacked individual guidance for developers, Humphrey developed TSP, the Team Software Process, and PSP.

The most visible part of PSP is a six-step process pictured in the middle of this diagram:
cmmi

The most visible and also the most corny. Who today wants to promise always to follow such a strict sequence of steps? Always to write the code for a module in full before compiling it? (Notice there is no backward arrow, the process is sequential.) Always to test at the end only? Come on. This is the third decade of the 21st century.

Today we compile as we code, using the development environment (IDE) as a brilliant tool to check everything we do or plan to do. For my part, whenever I am writing code and have not compiled my current draft for more than a few minutes I start feeling like an addict in need of a fix; my fix is the Compile button of EiffelStudio. At some eventual stage the compiler becomes a tool to generate excutable code, but long before that it has been my friend, coach, mentor, and doppelgänger, helping me get things (types, null references, inheritance…) right and gently chiding me when I wander off the rails.

As to tests, even if you do not buy into the full dogma of Test-Driven Development (I don’t), they get written and exercised right from the start, as you are writing the code, not afterwards. Compile all the time, test all the time.

It’s not just that a process such as the above ignores the contributions of agile methods, which are largely posterior to PSP. As analyzed in [2], agile is a curious mix of good ideas and a few horrendous ones. But among its durable contributions is the realization that development must be incremental, not a strict succession of separate activities.

This old-style flavor or PSP is probably the reason why it has fallen out of favor. But (like the agile rejection of upfront lifecycle activities) such a reaction is a case of criticism gone too far, ignoring the truly beneficial contributions. Ignore PSP’s outmoded sequence of activities and you will find that PSP’s core message is as relevant today as it ever was. That message is: we should learn from the practices of traditional engineers and apply a strict professional discipline. For example:

  • Keep a log of all activities. (See “Logs” in the above figure.) Engineers are taught to record everything they do; many programmers don’t bother. This practice, however, is essential to self-improvement.
  • Keep measurements of everything you do. (There are lots of things to measure, from hours spent on every kind of task to bugs found, time to fix them etc.)
  • Estimate and plan your work.
  • Clearly define commitments, and meet them.
  • Resist pressure to make unreasonable commitments (something that agilists approach also emphasize).
  • Understand your current performance.
  • Understand your programming style and how it affects various measures. (As an example, code size, as a function of the number of routines, depends on whether you are more concise or more verbose in style).
  • Continually improve your expertise as a professional.

PSP does not limit itself to such exhortations but gives concrete tools to apply the principles, with a view to: measuring, tracking and analyzing your work; learning from your performance variations; and incorporating the lessons learned into your professional practices. On the topic of measurement, for example, PSP includes precise guidelines on what to measure and how to measure it, and how to rely on proxies for quantities that are hard to assess directly. On this last point, PSP includes PROBE (PROxy-Based Estimating, you cannot have a method coming out of the world of US government organizations without cringeworthy acronyms), a general framework for estimating size and resource parameters from directly measurable proxies.

This is what PSP is about: a discipline of personal productivity and growth, emphasizing personal discipline, tracking and constant improvement. It is not hard to learn; a technical report by Humphrey available online [3] provides a sufficient basis to understand the concepts and start a process of self-improvement.

Watts Humphrey himself, as all who had the privilege to meet him can testify, was a model of seriousness and professionalism, the quintessential engineer. (I also remember him as the author of what may be the best pun I ever heard — ask me sometime.) PSP directly reflects these qualities and — ignoring its visible but in the end unimportant remnants from outdated technical choices — should be part of every software engineering curriculum and every software engineer’s collection of fundamental practices.

References

[1] Watts Humphrey, Introduction to the Personal Software Process, Addison-Wesley, 1996.

[2] Bertrand Meyer: Agile! The Good, the Hype and the Ugly, Springer, 2014, see here.

[3] Watts Humphrey, The Personal Software Process, Software Engineering Institute Technical Report CMU/SEI-2000-TR-022, available (in PDF, free) here.

 

Recycled A version of this article was first published in the Communications of the ACM blog.

.

VN:F [1.9.10_1130]
Rating: 8.0/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

The right forms of expression

If you want to know whether your_string has at least one upper-case character, you will write this in Eiffel:

if  ∃ c: your_string ¦ c.is_upper then

Such predicate-calculus boolean expressions, using a quantifier (“for all”) or (“there exists”) are becoming common in Eiffel code. They are particularly useful in Design by Contract assertions, making it possible to characterize deep semantic properties of the code and its data structures. For example a class invariant clause in a class I wrote recently states

from_lists_exist: ∀ tf: triples_from ¦ tf Void                        — [1]

meaning that all the elements, if any, of the list triples_from  are non-void (non-null). The notation is the exact one from mathematics. (Mathematical notation sometimes uses a dot in place of the bar, but the bar is clearer, particularly in an OO context where the dot has another use.)

Programming languages should support time-honored notations from mathematics. Reaching this goal has been a driving force in the evolution of Eiffel, but not as a concession to “featurism” (the gratuitous piling up of language feature upon feature). The language must remain simple and consistent; any new feature must find its logical place in the overall edifice.

The design of programming languages is a constant search for the right balance between rigor, simplicity, consistency, formal understanding, preservation of existing code, innovation and expressiveness. The design of Eiffel has understood the last of these criteria as implying support for established notations from mathematics, not through feature accumulation but by re-interpreting these notations in terms of the language’s fundamental concepts. A typical example is the re-interpretation of the standard mathematical notation a + b as as simply an operator-based form for the object-oriented call a.plus (b), obtained by declaring “+” as an operator alias for the function plus in the relevant classes. There are many more such cases in today’s Eiffel. Quantifier expressions using and  are the latest example.

 They are not a one-of-a-kind trick but just as a different syntax form for loops. Expressed in a more verbose form, the only one previously available, [1] would be:

across triples_from is tf all tf /= Void end                         — [2]

It is interesting to walk back the history further. [2] is itself a simplification of

across triples_from as tf all tf.item /= Void end               — [3]

where the “.item” has a good reason for being there, but that reason is irrelevant to a beginner. The earlier use of as in [3] is also the reason for the seemingly bizarre use of is in [2], which is only explainable by the backward compatibility criterion (code exists that uses as , which has a slightly different semantics from is), and will go away. But a few years ago the across loop variant did not exist and you would have had to write the above boolean expressions as

all_non_void (triples_from)

after defining a function

all_non_void (l: LIST [T]): BOOLEAN                                    — [4]
                         — Are all the elements of `l’, if any, non-void?
          local
pos: INTEGER
do
from
pos := l.index
l.start
Result := True
until not Result or l.after loop
l.forth
end
go_ith (pos)
end

The road traveled from [4] to [1] is staggering. As we introduced new notations in the history of Eiffel the reaction of the user community has sometimes been between cautious and negative. With the exception of a couple of quickly discarded ideas (such as the infamous and short-lived “!!” for creation), they were generally adopted widely because they simplify people’s life without adding undue complexity to the language. The key has been to avoid featurism and choose instead to provide two kinds of innovation:

  • Major conceptual additions, which elevate the level of abstraction of the language. A typical introduction was the introduction of agents, which provide the full power of functional programming in an object-oriented context; another was the SCOOP concurrency mechanism. There have been only a few such extensions, all essential.
  • Syntactical variants for existing concepts, allowing more concise forms obtained from traditional mathematical notation. The use of quantifier expressions as in [1] is the latest example.

Complaints of featurism still occasionally happen when people first encounter the new facilities, but they fade away quickly as people start using them. After writing a few expressions such as [1], no one wants to go back to any of the other forms.

These quantifier expressions using and , as well as the “” not-equal sign for what used to be (and still commonly is) written “/=”, rely on Unicode. Eiffel started out when ASCII was the law of the land. (Or 8-bit extended ASCII, which does not help much since the extensions are rendered differently in different locales, i.e. the same 8-bit character code may mean something different on French and Swedish texts.) In recent years, Eiffel has made a quiet transition to full Unicode support. (Such support extends to manifest strings and operators, not to identifiers. The decision, which could be revisited, has been to keep the ASCII-only  policy for identifiers to favor compatible use by programmers regardless of their mother tongues.) The use of Unicode considerably extends the expressive power of the language, in particular for scientific software which can — thanks to Eiffel’s mechanism for defining free operators — rely on advanced mathematical notations.

Unicode is great, but I hear the question: how in the world can we enter the corresponding symbols, since our keyboards are still ASCII plus some extensions?

It would be tedious to have to select from a list of special symbols (as you do when inserting a mathematical symbol in Microsoft Word or, for that matter, as I did when inserting the phrase “ and ” in the preceding paragraph using WordPress).

The answer lies in the interplay between the language and the development environment. EiffelStudio, like other modern IDEs, includes an automatic completion mechanism which lets you enter the beginning of a construct and will take care of filling in the rest. Already useful for complex structures (if you type “if” the tools will create the entire “if then else end” conditional structure for you to fill in), automatic completion will take care of inserting the appropriate Unicode symbols for you. Type for example “across”,  then CTRL-Space to trigger completion, and the choices will include the “∀” and “” forms. You can see below how this works:

across_all

Programming languages can be at the same time simple, easy to learn, consistent, and expressive. Start using quantifiers now!

Acknowledgments to the Ecma Technical Committee on Eiffel and the Eiffel Software team, particularly Alexander Kogtenkov (see his blog post here) and (for the completion mechanism and its animated illustration above) Jocelyn Fiat.

VN:F [1.9.10_1130]
Rating: 10.0/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

New video lecture: distances, invariants and recursion

I have started a new series of video lectures, which I call “Meyer’s Object-Oriented Classes” (MOOC). The goal is to share insights I have gained over the years on various aspects of programming and software engineering. Many presentations are focused on one area, such as coding, design, analysis, theoretical computer science (even there you find a division between “Theory A”, i.e. complexity, Turing machines and the like, and “Theory B”, i.e. semantics, type theory etc.), software project management, concurrency… I have an interest in all and try to explain connections.

 

The first lecture describes the edit distance (Levenshtein) algorithm, explains its correctness by introducing the loop invariant, expands on that notion, then shows a recursive version, explores the connection with the original version (it’s the invariant), and probes further into another view of recursive computations, leading to the concept of dynamic programming.

The videos are on YouTube and can be accessed from bertrandmeyer.com/levenshtein. (The general page for all lectures is at bertrandmeyer.com/mooc.)

The lecture is recorded in four segments of about 15 minutes each. In the future I will limit myself to 8-10 minutes. In fact I may record this lecture again; for example it would be better if I had a live audience rather than talking to my screen, and in general the recording is somewhat low-tech, but circumstances command. Also, I will correct a few hiccups (at some point in the recording I notice a typo on a slide and fix it on the fly), but the content will remain the same.

Feedback is of course welcome. I hope to record about a lecture a week from now on.

VN:F [1.9.10_1130]
Rating: 10.0/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Fan mail

Received this today from a heretofore unknown correspondent (I don’t often check Facebook Messenger but just happened to). Name removed (I am not sure he would want me to identify him), text translated from another language into English.

Hello, thanks for your book “Object-Oriented Software Construction” [read in a translation]. I read it after a horrible failure of a project on which I was a consultant. Another consultant was my technical leader. He was truly insufferable but I appreciated him for one reason: his code! I had never seen such “beautiful” program code; he was using principles of genericity, dynamic binding and others, which were totally unknown to me after the lousy programming education I had received. He had insulted me, telling me that I was no developer at all; I was deeply offended since I could feel that he was right. In spite of his unbearable personality I wanted to learn at his side, but he was far too selfish, seeing me just as a competitor, even if a pathetic one. He had a book on the side of his desk… and it’s that book that enabled me to understand where he had learned all those OO design methods. That book, obviously, was yours, and I acquired a copy for myself. I sincerely think that it should be used as textbook in educational institutions. And I really wanted to thank you for writing it. I hope to become a real developer thanks to you. So, thank you.

Note 1: Thanks to you.

Note 2: There is also the intro programming text, Touch of Class (Amazon page).

Note 3 (to my fan club): You are welcome to take advantage of the ideas and there is actually no compelling requirement to be, in addition, “insufferable”.

VN:F [1.9.10_1130]
Rating: 9.4/10 (15 votes cast)
VN:F [1.9.10_1130]
Rating: +6 (from 6 votes)

Getting a program right, in nine episodes

About this article: it originated as a series of posts on the Communications of the ACM blog. I normally repost such articles here. (Even though copy-paste is usually not good, there are three reasons for this duplication: the readership seems to be largely disjoint; I can use better formatting, since their blog software is more restrictive than WordPress; and it is good to have a single repository for all my articles, including both those who originated on CACM and those who did not.) The series took the form of nine articles, where each of the first few ended with a quiz, to which the next one, published a couple of days later, provided an answer. Since all these answers are now available it would make no sense to use the same scheme, so I am instead publishing the whole thing as a single article  with nine sections, slightly adapted from the original.

I was too lazy so far to collect all the references into a single list, so numbers such as [1] refer to the list at the end of the corresponding section.


A colleague recently asked me to present a short overview of  axiomatic semantics as a guest lecture in one of his courses. I have been teaching courses on software verification for a long time (see e.g. here), so I have plenty of material; but instead of just reusing it, I decided to spend a bit of time on explaining why it is good to have a systematic approach to software verification. Here is the resulting tutorial.


 

1. Introduction and attempt #1

Say “software verification” to software professionals, or computer science students outside of a few elite departments, and most of them will think  “testing”. In a job interview, for example, show a loop-based algorithm to a programmer and ask “how would you verify it?”: most will start talking about devising clever test cases.

Far from me to berate testing [1]; in fact, I have always thought that the inevitable Dijkstra quote about testing — that it can only show the presence of errors, not their absence [2] — which everyone seems to take as an indictment and dismissal of testing (and which its author probably intended that way) is actually a fantastic advertisement for testing: a way to find bugs? Yes! Great! Where do I get it?  But that is not the same as verifying the software, which means attempting to ascertain that it has no bugs.

Until listeners realize that verification cannot just mean testing, the best course material on axiomatic semantics or other proof techniques will not attract any interest. In fact, there is somewhere a video of a talk by the great testing and public-speaking guru James Whittaker where he starts by telling his audience not to worry, this won’t be a standard boring lecture, he will not start talking about loop invariants [3]! (Loop invariants are coming in this article, in fact they are one of its central concepts, but in later sections only, so don’t bring the sleeping bags yet.) I decided to start my lecture by giving an example of what happens when you do not use proper verification. More than one example, in fact, as you will see.

A warning about this article: there is nothing new here. I am using an example from my 1990 book Introduction to the Theory of Programming Languages (exercise 9.12). Going even further back, a 1983 “Programming Pearls” Communications of the ACM article by Jon Bentley [4] addresses the same example with the same basic ideas. Yet almost forty years later these ideas are still not widely known among practitioners. So consider these articles as yet another tutorial on fundamental software engineering stuff.

The tutorial is a quiz. We start with a program text:

from

i := 1 ; j := n              — Result initialized to 0.

until i = j loop

m := (i + j) // 2         — Integer division

if t [m] ≤ x then i := m  else  j := m end

end

if x = t [i] then Result := i end

All variables are of integer type. t is an up-sorted array of integers, indexed from 1 to n . We do not let any notation get between friends. A loop from p until e loop q end executes p then, repeatedly: stops if e (the exit condition) is true, otherwise executes q. (Like {p ; while not e do {q}} in some other notations.) “:=” is assignment, “=” equality testing.  “//” is integer division, e.g. 6 //3 = 7 //3 = 2. Result is the name of a special variable whose final value will be returned by this computation (as part of a function, but we only look at the body). Result is automatically initialized to zero like all integer variables, so if execution does not assign anything to Result the function will return zero.

First question: what is this program trying to do?

OK, this is not the real quiz. I assume you know the answer: it is an attempt at “binary search”, which finds an element in the array, or determines its absence, in a sequence of about log2 (n) steps, rather than n if we were use sequential search.  (Remember we assume the array is sorted.) Result should give us a position where x appears in the array, if it does, and otherwise be zero.

Now for the real quiz: does this program meet this goal?

The answer should be either yes or no. (If no, I am not asking for a correct version, at least not yet, and in any case you can find some in the literature.) The situation is very non-symmetric, we might say Popperian:

  • To justify a no answer it suffices of a single example, a particular array t and a particular value x, for which the program fails to set Result as it should.
  • To justify a yes answer we need to provide a credible argument that for every t and  x the program sets Result as it should.

Notes to section 1

[1] The TAP conference series (Tests And Proofs), which Yuri Gurevich and I started, explores the complementarity between the two approaches.

[2] Dijkstra first published his observation in 1969. He did not need consider the case of infinite input sets: even for a trivial finite program that multiplies two 32-bit integers, the number of cases to be examined, 264, is beyond human reach. More so today with 64-bit integers. Looking at this from a 2020 perspective, we may note that exhaustive testing of a finite set of cases, which Dijkstra dismissed as impossible in practice, is in fact exactly what the respected model checking verification technique does; not on the original program, but on a simplified — abstracted — version precisely designed to keep the number of cases tractable. Dijkstra’s argument remains valid, of course, for  the original program if non-trivial. And model-checking does not get us out of the woods: while we are safe if its “testing” finds no bug, if it does find one we have to ensure that the bug is a property of the original program rather than an artifact of the abstraction process.

[3] It is somewhere on YouTube, although I cannot find it right now.

[4] Jon Bentley: Programming Pearls: Writing Correct Programs, in Communications of the ACM, vol. 26, no. 12, pp. 1040-1045, December 1983, available for example here.


2. Attempt #2

Was program #1 correct? If so it should yield the correct answer. (An answer is correct if either Result is the index in t of an element equal to x, or Result = 0 and x does not appear in t.)

This program is not correct. To prove that it is not correct it suffices of a single example (test case) for which the program does not  “yield the correct answer”. Assume x = 1 and the array t has two elements both equal to zero (n = 2, remember that arrays are indexed from 1):

t = [0   0]

The successive values of the variables and expressions are:

                                            m       i          j            i + j + 1

After initialization:                   1         2                3

i ≠ j, so enter loop:           1       1        2                 6         — First branch of “if” since t [1] ≤ x
— so i gets assigned the value of m

But then neither of the values of i and j has changed, so the loop will repeat its body identically (taking the first branch) forever. It is not even that the program yields an incorrect answer: it does not yield an answer at all!

Note (in reference to the famous Dijkstra quote mentioned in the first article), that while it is common to pit tests against proofs, a test can actually be a proof: a test that fails is a proof that the program is incorrect. As valid as the most complex mathematical proof. It may not be the kind of proof we like most (our customers tend to prefer a guarantee that the program is correct), but it is a proof all right.

We are now ready for the second attempt:

—  Program attempt #2.

from

i := 1 ; j := n

until i = j or Result > 0  loop

m := (i + j) // 2         — Integer division

if t [m] ≤ x then

i := m  + 1

elseif t [m] = x then

Result := m

else                         — In this case t [m] > x

j := m – 1

end

end

Unlike the previous one this version always changes i or j, so we may hope it does not loop forever. It has a nice symmetry between i and j.

Same question as before: does this program meet its goal?


3. Attempt #3

The question about program #2, as about program #1: was: it right?

Again no.  A trivial example disproves it: n = 1, the array t contains a single element t [1] = 0, x = 0. Then the initialization sets both i and j to 1, i = j holds on entry to the loop which stops immediately, but Result is zero whereas it should be 1 (the place where x appears).

Here now is attempt #3, let us see it if fares better:

—  Program attempt #3.

from

i := 1 ; j := n

until i = j loop

m := (i + j + 1) // 2

if t [m] ≤ x then

i := m  + 1

else

j := m

end

end

if 1  ≤ i  and i ≤ n then Result := i end
       — If not, Result remains 0.

What about this one?


3. Attempt #4 (also includes 3′)

The first two program attempts were wrong. What about the third?

I know, you have every right to be upset at me, but the answer is no once more.

Consider a two-element array t = [0 0] (so n = 2, remember that our arrays are indexed from 1 by convention) and a search value x = 1. The successive values of the variables and expressions are:

                                                  m          i          j            i + j + 1

After initialization:                            1        2           4

i ≠ j, so enter loop:               2           3        2          6                  — First branch of “if” since t [2] < x

i ≠ j,  enter loop again:        3           ⚠                                       — Out-of-bounds memory access!
— (trying to access non-existent t [3])

Oops!

Note that we could hope to get rid of the array overflow by initializing i to 0 rather than 1. This variant (version #3′) is left as a bonus question to the patient reader. (Hint: it is also not correct. Find a counter-example.)

OK, this has to end at some point. What about the following version (#4): is it right?

—  Program attempt #4.

from

i := 0 ; j := n + 1

until i = j loop

m := (i + j) // 2

if t [m] ≤ x then

i := m  + 1

else

j := m

end

end

if 1 ≤ i  and i ≤ n then Result := i end


5. Attempt #5

Yes, I know, this is dragging on. But that’s part of the idea: witnessing how hard it is to get a program right if you just judging by the seat of your pants. Maybe we can get it right this time?

Are we there yet? Is program attempt #4 finally correct?

Sorry to disappoint, but no. Consider a two-element array t = [0 0], so n = 2, and a search value x = 1 (yes, same counter-example as last time, although here we could also use x = 0). The successive values of the variables and expressions are:

                                                 m          i          j            i + j

After initialization:                           0        3           3

i ≠ j, so enter loop:               1           2       3          5            — First branch of “if

i ≠ j, enter loop again:         2         3        3         6            — First branch again

i = j, exit loop

The condition of the final “if” is true, so Result gets the value 3. This is quite wrong, since there is no element at position 3, and in any case x does not appear in t.

But we are so close! Something like this should work, should it not?

So patience, patience, let us tweak it just one trifle more, OK?

—  Program attempt #5.

from

i := 1 ; j := n + 1

until i ≥ j or Result > 0 loop

m := (i + j) // 2

if t [m] < x then

i := m + 1

elseif  t [m] > x then

j := m

else

Result := m

end

end

Does it work now?


6. Attempt #6

The question about program #5  was the same as before: is it right, is it wrong?

Well, I know you are growing more upset at me with each section, but the answer is still that this program is wrong. But the way it is wrong is somewhat specific; and it applies, in fact, to all previous variants as well.

This particular wrongness (fancy word for “bug”) has a history. As I pointed out in the first article, there is a long tradition of using binary search to illustrate software correctness issues. A number of versions were published and proved correct, including one in the justly admired Programming Pearls series by Jon Bentley. Then in 2006 Joshua Bloch, then at Google, published a now legendary blog article [2] which showed that all these versions suffered from a major flaw: to obtain m, the approximate mid-point between i and j, they compute

(i + j) // 2

which, working on computer integers rather than mathematical integers, might overflow! This in a situation in which both i and j, and hence m as well, are well within the range of the computer’s representable integers, 2-n to 2n (give or take 1) where n is typically 31 or, these days, 63, so that there is no conceptual justification for the overflow.

In the specification that I have used for this article, i starts at 1, so the problem will only arise for an array that occupies half of the memory or more, which is a rather extreme case (but still should be handled properly). In the general case, it is often useful to use arrays with arbitrary bounds (as in Eiffel), so we can have even a small array, with high indices, for which the computation will produce an overflow and bad results.

The Bloch gotcha is a stark reminder that in considering the correctness of programs we must include all relevant aspects and consider programs as they are executed on a real computer, not as we wish they were executed in an ideal model world.

(Note that Jon Bentley alluded to this requirement in his original article: while he did not explicitly mention integer overflow, he felt it necessary to complement his proof by the comment that that  “As laborious as our proof of binary search was, it is still unfinished by some standards. How would you prove that the program is free of runtime errors (such as division by zero, word overflow, or array indices out of bounds)?” Prescient words!)

It is easy to correct the potential arithmetic overflow bug: instead of (i + j) // 2, Bloch suggested we compute the average as

i + (j – i) // 2

which is the same from a mathematician’s viewpoint, and indeed will compute the same value if both variants compute one, but will not overflow if both i and j are within range.

So we are ready for version 6, which is the same as version 5 save for that single change:

—  Program attempt #6.

from

i := 1 ; j := n + 1

until i ≥ j or Result > 0 loop

m := i + (j – i) // 2

if t [m] < x then

i := m + 1

elseif  t [m] > x then

j := m

else

Result := m

end

end

Now is probably the right time to recall the words by which Donald Knuth introduces binary search in the original 1973 tome on Sorting and Searching of his seminal book series The Art of Computer Programming:knuth

Although the basic idea of binary search is comparatively straightforward, the details can be somewhat tricky, and many good programmers have done it wrong the first few times they tried.

Do you need more convincing? Be careful what you answer, I have more variants up my sleeve and can come up with many more almost-right-but-actually-wrong program attempts if you nudge me. But OK, even the best things have an end. This is not the last section yet, but that was the last program attempt. To the naturally following next question in this running quiz,  “is version 6 right or wrong”, I can provide the answer: it is, to the best of my knowledge, a correct program. Yes! [3].

But the quiz continues. Since answers to the previous questions were all  that the programs were not correct, it sufficed in each case to find one case for which the program did not behave as expected. Our next question is of a different nature: can you find an argument why version #6 is correct?

References for section 6

[1] (In particular) Jon Bentley: Programming Pearls — Writing Correct Programs, in Communications of the ACM, vol. 26, no. 12, December 1983, pages 1040-1045, available here.

[2] Joshua Bloch: Extra, Extra — Read All About It: Nearly All Binary Searches and Mergesorts are Broken, blog post, on the Google AI Blog, 2 June 2006, available here.

[3] A caveat: the program is correct barring any typos or copy-paste errors — I am starting from rigorously verified programs (see the next posts), but the blogging system’s UI and text processing facilities are not the best possible for entering precise technical text such as code. However carefully I check, I cannot rule out a clerical mistake, which of course would be corrected as soon as it is identified.


7. Using a program prover

Preceding sections presented candidate binary search algorithms and asked whether they are correct. “Correct” means something quite precise: that for an array t and a value x, the final value of the variable Result is a valid index of t (that is to say, is between 1 and n, the size of t) if and only if x appears at that index in t.

The last section boldly stated that program attempt #6 was correct. The question was: why?

In the case of the preceding versions, which were incorrect, you could prove that property, and I do mean prove, simply by exhibiting a single counter-example: a single t and x for which the program does not correctly set Result. Now that I asserting the program to be correct, one example, or a million examples, do not suffice. In fact they are almost irrelevant. Test as much as you like and get correct results every time, you cannot get rid of the gnawing fear that if you had just tested one more time after the millionth test you would have produced a failure. Since the set of possible tests is infinite there is no solution in sight [1].

We need a proof.

I am going to explain that proof in the next section, but before that I would like to give you an opportunity to look at the proof by yourself. I wrote in one of the earlier articles that most of what I have to say was already present in Jon Bentley’s 1983 Programming Pearls contribution [2], but a dramatic change did occur in the four decades since: the appearance of automated proof system that can handle significant, realistic programs. One such system, AutoProof, was developed at the Chair of Software engineering at ETH Zurich [3] (key project members were Carlo Furia, Martin Nordio, Nadia Polikarpova and Julian Tschannen, with initial contributions by Bernd Schoeller) on the basis of the Boogie proof technology from Microsoft Research).

AutoProof is available for online use, and it turns out that one of the basic tutorial examples is binary search. You can go to the corresponding page and run the proof.

I am going to let you try this out (and, if you are curious, other online AutoProof examples as well) without too many explanations; those will come in the next section. Let me simply name the basic proof technique: loop invariant. A loop invariant is a property INV associated with a loop, such that:

  • A. After the loop’s initialization, INV will hold.
  • B. One execution of the loop’s body, if started with INV satisfied (and the loop’s exit condition not satisfied, otherwise we wouldn’t be executing the body!), satisfies INV again when it terminates.

This idea is of course the same as that of a proof by induction in mathematics: the initialization corresponds to the base step (proving that P (0) holds) and the body property to the induction step (proving that from P (n) follows P (n + 1). With a traditional induction proof we deduce that the property (P (n)) holds for all integers. For the loop, we deduce that when the loop finishes its execution:

  • The invariant still holds, since executing the loop means executing the initialization once then the loop body zero or more times.
  • And of course the exit condition also holds, since otherwise we would still be looping.

That is how we prove the correctness of a loop: the conjunction of the invariant and the exit condition must yield the property that we seek (in the example, the property, stated above of Result relative to t and x).

We also need to prove that the loop does terminate. This part involves another concept, the loop’s variant, which I will explain in the next section.

For the moment I will not say anything more and let you look at the AutoProof example page (again, you will find it here), run the verification, and read the invariant and other formal elements in the code.

To “run the verification” just click the Verify button on the page. Let me emphasize (and emphasize again and again and again) that clicking Verify will not run the code. There is no execution engine in AutoProof, and the verification does not use any test cases. It processes the text of the program as it appears on the page and below. It applies mathematical techniques to perform the proof; the core property to be proved is that the proposed loop invariant is indeed invariant (i.e. satisfies properties A and B above).

The program being proved on the AutoProof example page is version #6 from the last section, with different variable names. So far for brevity I have used short names such as i, j and m but the program on the AutoProof site applies good naming practices with variables called low, up, middle and the like. So here is that version again with the new variable names:

—  Program attempt #7  (identical to #6 with different variable names) .

from

low := 0 ; up := n

until low ≥ up or Result > 0 loop

middle := low + ((up – low) // 2)

if a [middle] < value then      — The array is now called a rather than t

low := middle + 1

elseif  a [middle] > value then

up := middle

else

Result := middle

end

end

This is exactly the algorithm text on the AutoProof page, the one that you are invited to let AutoProof verify for you. I wrote “algorithm text” rather than “program text” because the actual program text (in Eiffel) includes variant and invariant clauses which do not affect the program’s execution but make the proof possible.

Whether or not these concepts (invariant, variant, program proof) are completely new to you, do try the prover and take a look at the proof-supporting clauses. In the next article I will remove any remaining mystery.

Note and references for section 7

[1] Technically the set of possible [array, value] pairs is finite, but of a size defying human abilities. As I pointed out in the first section, the “model checking” and “abstract interpretation” verification techniques actually attempt to perform an exhaustive test anyway, after drastically reducing the size of the search space. That will be for some other article.

[2]  Jon Bentley: Programming Pearls: Writing Correct Programs, in Communications of the ACM, vol. 26, no. 12, pp. 1040-1045, December 1983, available for example here.

[3] The AutoProof page contains documentations and numerous article references.


8. Understanding the proof

The previous section invited you to run the verification on the AutoProof tutorial page dedicated to the example. AutoProof is an automated proof system for programs. This is just a matter of clicking  “Verify”, but more importantly, you should read the annotations added to the program text, particularly the loop invariant, which make the verification possible. (To avoid any confusion let me emphasize once more that clicking “Verify” does not run the program, and that no test cases are used; the effect is to run the verifier, which attempts to prove the correctness of the program by working solely on the program text.)

Here is the program text again, reverting for brevity to the shorter identifiers (the version on the AutoProof page has more expressive ones):

from

i := 1 ; j := n + 1

until i ≥ j or Result > 0 loop

m := i + (j – i) // 2

if t [m] < x then

i := m + 1

elseif  t [m] > x then

j := m

else

Result := m

end

end

Let us now see what makes the proof possible. The key property is the loop invariant, which reads

A:   1  ≤ i  ≤ j  ≤ n + 1
B:   0  ≤ Result  ≤ n
C:   ∀ k: 1 .. i –1  |  t [k] < x
D:   ∀ k: j .. n  |  t [k] > x
E:    (Result > 0)   ⇒   (t [Result] = x)

The notation is slightly different on the Web page to adapt to the Eiffel language as it existed at the time it was produced; in today’s Eiffel you can write the invariant almost as shown above. Long live Unicode, allowing us to use symbols such as (obtained not by typing them but by using smart completion, e.g. you start typing “forall” and you can select the symbol that pops up), for  “implies” and many others

Remember that the invariant has to be established by the loop’s initialization and preserved by every iteration. The role of each of its clauses is as follows:

  • A: keep the indices in range.
  • B: keep the variable Result, whose final value will be returned by the function, in range.
  • C and D: eliminate index intervals in which we have determined that the sought value, x, does not appear. Before i, array values are smaller; starting at j, they are greater. So these two intervals, 1..i and j..n, cannot contain the sought value. The overall idea of the algorithm (and most other search algorithms) is to extend one of these two intervals, so as to narrow down the remaining part of 1..n where x may appear.
  • E: express that as soon as we find a positive (non-zero) Result, its value is an index in the array (see B) where x does appear.

Why is this invariant useful? The answer is that on exit it gives us what we want from the algorithm. The exit condition, recalled above, is

i ≥ j or Result > 0

Combined with the invariant, it tells us that on exit one of the following will hold:

  • Result > 0, but then because of E we know that x appears at position Result.
  • i < j, but then A,  C and D  imply that x does not appear anywhere in t. In that case it cannot be true that Result > 0, but then because of B Result must be zero.

What AutoProof proves, mechanically, is that under the function’s precondition (that the array is sorted):

  • The initialization ensures the invariant.
  • The loop body, assuming that the invariant is satisfied but the exit condition is not, ensures the loop invariant again after it executes.
  • The combination of the invariant and the exit condition ensures, as just explained, the postcondition of the function (the property that Result will either be positive and the index of an element equal to x, or zero with the guarantee that x appears nowhere in t).

Such a proof guarantees the correctness of the program if it terminates. We (and AutoProof) must prove separately that it does terminate. The technique is simple: find a “loop variant”, an integer quantity v  which remains non-negative throughout the loop (in other words, the loop invariant includes or implies v ≥ 0) and decreases on each iteration, so that the loop cannot continue executing forever. An obvious variant here is j – i + 1 (where the + 1 is needed because j – i may go down to -1 on the last iteration if x does not appear in the array). It reflects the informal idea of the algorithm: repeatedly decrease an interval i .. j – 1 (initially, 1 .. n) guaranteed to be such that x appears in t if and only if it appears at an index in that interval. At the end, either we already found x or the interval is empty, implying that x does not appear at all.

A great reference on variants and the techniques for proving program termination is a Communications of the ACM article of 2011: [3].

The variant gives an upper bound on the number of iterations that remain at any time. In sequential search, j – i + 1 would be our best bet; but for binary search it is easy to show that  log(j – i + 1) is also a variant, extending the proof of correctness with a proof of performance (the key goal of binary search being to ensure a logarithmic rather than linear execution time).

This example is, I hope, enough to highlight the crucial role of loop invariants and loop variants in reasoning about loops. How did we get the invariant? It looks like I pulled it out of a hat. But in fact if we go the other way round (as advocated in classic books [1] [2]) and develop the invariant and the loop together the process unfolds itself naturally and there is nothing mysterious about the invariant.

Here I cannot resist quoting (thirty years on!) from my own book Introduction to the Theory of Programming Languages [4]. It has a chapter on axiomatic semantics (also known as Hoare logic, the basis for the ideas used in this discussion), which I just made available: see here [5]. Its exercise 9.12 is the starting point for this series of articles. Here is how the book explains how to design the program and the invariant [6]:

In the general case [of search, binary or not] we aim for a loop body of the form

m := ‘‘Some value in 1.. n such that i ≤ m < j’’;

if t [m] ≤ x then

i := m + 1

else

j := m

end

It is essential to get all the details right (and easy to get some wrong):

  • The instruction must always decrease the variant j – i, by increasing i or decreasing j. If the the definition of m specified just m ≤ j rather than m < j, the second branch would not meet this goal.
  •  This does not transpose directly to i: requiring i < m < j would lead to an impossibility when j – i is equal to 1. So we accept i ≤ m but then we must take m + 1, not m, as the new value of i in the first branch.
  •  The conditional’s guards are tests on t [m], so m must always be in the interval 1 . . n. This follows from the clause 0 ≤ i ≤ j ≤ n + 1 which is part of the invariant.
  •  If this clause is satisfied, then m ≤ n and m > 0, so the conditional instruction indeed leaves this clause invariant.
  • You are invited to check that both branches of the conditional also preserve the rest of the invariant.
  • Any policy for choosing m is acceptable if it conforms to the above scheme. Two simple choices are i  and j – 1; they lead to variants of the sequential search algorithm [which the book discussed just before binary search].

For binary search, m will be roughly equal to the average of i and j.

“Roughly” because we need an integer, hence the // (integer division).

In the last section, I will reflect further on the lessons we can draw from this example, and the practical significance of the key concept of invariant.

References and notes for section 8

[1] E.W. Dijkstra: A Discipline of Programming, Prentice Hall, 1976.

[2] David Gries: The Science of Programming, Springer, 1989.

[3] Byron Cook, Andreas  Podelski and Andrey Rybalchenko: Proving program termination, in Communications of the ACM, vol. 54, no. 11, May 2011, pages 88-98, available here.

[4] Bertrand Meyer, Introduction to the Theory of Programming Languages, Prentice Hall, 1990. The book is out of print but can be found used, e.g. on Amazon. See the next entry for an electronic version of two chapters.

[5] Bertrand Meyer Axiomatic semantics, chapter 9 from [3], available here. Note that the PDF was reconstructed from an old text-processing system (troff); the figures could not be recreated and are missing. (One of these days I might have the patience of scanning them from a book copy and adding them. Unless someone wants to help.) I also put online, with the same caveat, chapter 2 on notations and mathematical basis: see here.

[6] Page 383 of [4] and [5]. The text is verbatim except a slight adaptation of the programming notation and a replacement of the variables: i in the book corresponds to i – 1 here, and j to j – 1. As a matter of fact I prefer the original conventions from the book (purely as a matter of taste, since the two are rigorously equivalent), but I changed here to the conventions of the program as it appears in the AutoProof page, with the obvious advantage that you can verify it mechanically. The text extract is otherwise exactly as in the 1990 book.

9. Lessons learned

What was this journey about?

We started with a succession of attempts that might have “felt right” but were in fact all wrong, each in its own way: giving the wrong answer in some cases, crashing (by trying to access an array outside of its index interval) in some cases, looping forever in some cases. Always “in some cases”,  evidencing the limits of testing, which can never guarantee that it exercises all the problem cases. A correct program is one that works in all cases. The final version was correct; you were able to prove its correctness with an online tool and then to understand (I hope) what lies behind that proof.

To show how to prove such correctness properties, I have referred throughout the series to publications from the 1990s (my own Introduction to The Theory of Programming Languages), the 1980s (Jon Bentley’s Programming Pearls columns, Gries’s Science of Programming), and even the 1970s (Dijkstra’s Discipline of Programming). I noted that the essence of my argument appeared in a different form in one of Bentley’s Communications articles. What is the same and what has changed?

The core concepts have been known for a long time and remain applicable: assertion, invariant, variant and a few others, although they are much better understood today thanks to decades of theoretical work to solidify the foundation. Termination also has a more satisfactory theory.

On the practical side, however, the progress has been momentous. Considerable engineering has gone into making sure that the techniques scaled up. At the time of Bentley’s article, binary search was typical of the kind of programs that could be proved correct, and the proof had to proceed manually. Today, we can tackle much bigger programs, and use tools to perform the verification.

Choosing binary search again as an example today has the obvious advantage that everyone can understand all the details, but should not be construed as representative of the state of the art. Today’s proof systems are far more sophisticated. Entire operating systems, for example, have been mechanically (that is to say, through a software tool) proved correct. In the AutoProof case, a major achievement was the proof of correctness [1] of an entire data structure (collections) library, EiffelBase 2. In that case, the challenge was not so much size (about 8,000 source lines of code), but the complexity of both:

  • The scope of the verification, involving the full range of mechanisms of a modern object-oriented programming language, with classes,  inheritance (single and multiple), polymorphism, dynamic binding, generics, exception handling etc.
  • The code itself, using sophisticated data structures and algorithms, involving in particular advanced pointer manipulations.

In both cases, progress has required advances on both the science and engineering sides. For example, the early work on program verification assumed a bare-bones programming language, with assignments, conditionals, loops, routines, and not much more. But real programs use many other constructs, growing ever richer as programming languages develop. To cover exception handling in AutoProof required both theoretical modeling of this construct (which appeared in [2]) and implementation work.

More generally, scaling up verification capabilities from the small examples of 30 years ago to the sophisticated software that can be verified today required the considerable effort of an entire community. AutoProof, for example, sits at the top of a tool stack relying on the Boogie environment from Microsoft Research, itself relying on the Z3 theorem prover. Many person-decades of work make the result possible.

tool_stack

Beyond the tools, the concepts are esssential. One of them, loop invariants, has been illustrated in the final version of our program. I noted in the first article the example of a well-known expert and speaker on testing who found no better way to announce that a video would not be boring than  “relax, we are not going to talk about loop invariants.” Funny perhaps, but unfair. Loop invariants are one of the most beautiful concepts of computer science. Not so surprisingly, because loop invariants are the application to programming of the concept of mathematical induction. According to the great mathematician Henri Poincaré, all of mathematics rests on induction; maybe he exaggerated, maybe not, but who would think of teaching mathematics without explaining induction? Teaching programming without explaining loop invariants is no better.

Below is an illustration (if you will accept my psychedelic diagram) of what a loop is about, as a problem-solving technique. Sometimes we can get the solution directly. Sometimes we identify several steps to the solution; then we use a sequence (A ; B; C). Sometimes we can find two (or more) different ways of solving the problem in different cases; then we use a conditional (if c then A else B end). And sometimes we can only get a solution by getting closer repeatedly, not necessarily knowing in advance how many times we will have to advance towards it; then, we use a loop.

loop_strategy

We identify an often large (i.e. very general) area where we know the solution will lie; we call that area the loop invariant. The solution or solutions (there may be more than one) will have to satisfy a certain condition; we call it the exit condition. From wherever we are, we shoot into the invariant region, using an appropriate operation; we call it the initialization. Then we execute as many times as needed (maybe zero if our first shot was lucky) an operation that gets us closer to that goal; we call it the loop body. To guarantee termination, we must have some kind of upper bound of the distance to the goal, decreasing each time discretely; we call it the loop variant.

This explanation is only an illustration, but I hope it makes the ideas intuitive. The key to a loop is its invariant. As the figure suggests, the invariant is always a generalization of the goal. For example, in binary search (and many other search algorithms, such as sequential search), our goal is to find a position where either x appears or, if it does not, we can be sure that it appears nowhere. The invariant says that we have an interval with the same properties (either x appears at a position belonging to that interval or, if it does not, it appears nowhere). It obviously includes the goal as a special case: if the interval has length 1, it defines a single position.

An invariant should be:

  1. Strong enough that we can devise an exit condition which in the end, combined with the invariant, gives us the goal we seek (a solution).
  2. Weak enough that we can devise an initialization that ensures it (by shooting into the yellow area) easily.
  3. Tuned so that we can devise a loop body that, from a state satifying the invariant, gets us to a new one that is closer to the goal.

In the example:

  1. The exit condition is simply that the interval’s length is 1. (Technically, that we have computed Result as the single interval element.) Then from the invariant and the exit condition, we get the goal we want.
  2. Initialization is easy, since we can just take the initial interval to be the whole index range of the array, which trivially satisfies the invariant.
  3. The loop body simply decreases the length of the interval (which can serve as loop variant to ensure termination). How we decrease the length depends on the search strategy; in sequential search, each iteration decreases the length by 1, correct although not fast, and binary search decreases it by about half.

The general scheme always applies. Every loop algorithm is characterized by an invariant. The invariant may be called the DNA of the algorithm.

To demonstrate the relevance of this principle, my colleagues Furia, Velder, and I published a survey paper [6] in ACM Computing Surveys describing the invariants of important algorithms in many areas of computer science, from search algorithms to sorting (all major algorithms), arithmetic (long integer addition, squaring), optimization and dynamic programming  (Knapsack, Levenshtein/Edit distance), computational geometry (rotating calipers), Web (Page Rank)… I find it pleasurable and rewarding to go deeper into the basis of loop algorithms and understand their invariants; like a geologist who does not stop at admiring the mountain, but gets to understand how it came to be.

Such techniques are inevitable if we want to get our programs right, the topic of this article. Even putting aside the Bloch average-computation overflow issue, I started with 5 program attempts, all kind of friendly-looking but wrong in different ways. I could have continued fiddling with the details, following my gut feeling to fix the flaws and running more and more tests. Such an approach can be reasonable in some cases (if you have an algorithm covering a well-known and small set of cases), but will not work for non-trivial algorithms.

Newcomers to the concept of loop invariant sometimes panic: “this is all fine, you gave me the invariants in your examples, how do I find my own invariants for my own loops?” I do not have a magic  recipe (nor does anyone else), but there is no reason to be scared. Once you have understood the concept and examined enough examples (just a few of those in [6] should be enough), writing the invariant at the same time as you are devising a loop will come as a second nature to you.

As the fumbling attempts in the first few sections should show, there is not much of an alternative. Try this approach. If you are reaching these final lines after reading what preceded them, allow me to thank you for your patience, and to hope that this rather long chain of reflections on verification will have brought you some new insights into the fascinating challenge of writing correct programs.

References

[1] Nadia Polikarpova, Julian Tschannen, and Carlo A. Furia: A Fully Verified Container Library, in Proceedings of 20th International Symposium on Formal Methods (FM 15), 2015. (Best paper award.)

[2] Martin Nordio, Cristiano Calcagno, Peter Müller and Bertrand Meyer: A Sound and Complete Program Logic for Eiffel, in Proceedings of TOOLS 2009 (Technology of Object-Oriented Languages and Systems), Zurich, June-July 2009, eds. M. Oriol and B. Meyer, Springer LNBIP 33, June 2009.

[3] Boogie page at MSR, see here for publications and other information.

[4] Z3 was also originally from MSR and has been open-sourced, one can get access to publications and other information from  its Wikipedia page.

[5] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, vol. 46, no. 3, February 2014. Available here.

[6] Dynamic programming is a form of recursion removal, turning a recursive algorithm into an iterative one by using techniques known as “memoization” and  “bottom-up computation” (Berry). In this transformation, the invariant plays a key role. I will try to write this up some day as it is a truly elegant and illuminating explanation.

VN:F [1.9.10_1130]
Rating: 10.0/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

LASER 2020 in Elba Island: DevOps, Microservices and more, first week of June

The page for the 2020 LASER summer school (31 May to 7 June) now has the basic elements (some additions still forthcoming) and registration at the early price is open. The topic is DevOps, Microservices and Software Development for the Age of the Web with both conceptual lectures and contributions from industry, by technology leaders from Amazon, Facebook and ServiceNow. The confirmed speakers are:

  • Fabio Casati, ServiceNow and University of Trento, and Kannan Govindarajan from ServiceNow on Taking AI from research to production – at scale.
  • Adrian Cockcroft, Amazon Web Services, on Building and Operating Modern Applications.
  • Elisabetta Di Nitto, Politecnico di Milano.
  • Valérie Issarny, INRIA, on The Web for the age of the IoT.
  • Erik Meijer, Facebook, on Software Development At Scale.
  • Me, on Software from beginning to end: a comprehensive method.

As always, the setup is the incomparable environment of the Hotel del Golfo in Procchio, Elba Island off the coast of Tuscany, ideal at that time of year (normally good weather, warm but not hot, few tourists). The school is intensive but there is time to enjoy the beach, the hotel’s amenities and the wonderful of environment of Elba (wake up your inner Napoleon). The school has a fairly small size and everyone lives under the same (beautiful) roof, so there is plenty of time for interaction with the speakers and other participants.

About these participants: the school is intended for engineers and managers in industry as well as researchers and PhD student. In fact it’s a mix that one doesn’t find that often, allowing for much cross-learning.

Another way to put it is that this is now the 16th edition of the school (it started in 2004 but we skipped one year), so it cannot be doing everything wrong.

 

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Are my requirements complete?

Some important concepts of software engineering, established over the years, are not widely known in the community. One use of this blog is to provide tutorials on such overlooked ideas. An earlier article covered one pertaining to project management: the Shortest Possible Schedule property . Here is another, this time in the area of requirements engineering, also based on a publication that I consider to be a classic (it is over 40 years old) but almost unknown to practitioners.

Practitioners are indeed, as in most of my articles, the intended audience. I emphasize this point right at the start because if you glance at the rest of the text you will see that it contains (horror of horrors) some mathematical formulae, and might think “this is not for me”. It is! The mathematics is very simple and my aim is practical: to shed light on an eternal question that faces anyone writing requirements (whatever the style, traditional or agile): how can I be sure that a requirements specification is complete?

To a certain extent you cannot. But there is better answer, a remarkably simple one which, while partial, helps.

Defining completeness

The better answer is called “sufficient completeness” and comes from the theory of abstract data types. It was introduced in a 1978 article by Guttag and Horning [1]. It is also implicit in a more down-to-earth document, the 1998 IEEE standard on how to write requirements [2].

There is nothing really new in the present article; in fact my book Object-Oriented Software Construction [3] contains an extensive discussion of sufficient completeness (meant to be more broadly accessible than Guttag and Horning’s scholarly article). But few people know the concepts; in particular very few practitioners have heard of sufficient completeness (if they have heard at all of abstract data types). So I hope the present introduction will be useful.

The reason the question of determining completeness of requirements seems hopeless at first is the natural reaction: complete with respect to what? To know that the specification is complete we would need a more general description of all that our stakeholders want and all the environment constraints, but this would only push the problem further: how do we know that such description itself is complete?

That objection is correct in principle: we can never be sure that we did not forget something someone wanted, or some property that the environment imposes. But there also exist more concrete and assessable notions of completeness.

The IEEE standard gives three criteria of completeness. The first states that “all requirements” have been included, and is useless, since it  runs into the logical paradox mentioned above, and is tautological anyway (the requirements are complete if they include all requirements, thank you for the information!). The second is meaningful but of limited interest (a “bureaucratic” notion of completeness): every element in the requirements document is numbered, every cross-reference is defined and so on. The last criterion is the interesting one: “Definition of the responses of the software to all realizable classes of input data in all realizable classes of situations”. Now this is meaningful. To understand this clause we need to step back to sufficient completeness and, even before that, to abstract data types.

Abstract data types will provide our little mathematical excursion (our formal picnic in the words of an earlier article) in our study of requirements and completeness. If you are not familiar with this simple mathematical theory, which every software practitioner should know, I hope you will benefit from the introduction and example. They will enable us to introduce the notion of sufficient completeness formally before we come back to its application to requirements engineering.

Specifying an abstract data type

 Abstract data types are the mathematical basis for object-oriented programming. In fact, OO programming but also OO analysis and OO design are just a realization of this mathematical concept at various levels of abstraction, even if few OO practitioners are aware of it. (Renewed reference to [3] here if you want to know more.)

An ADT (abstract data type) is a set of objects characterized not by their internal properties (what they are) but by the operations applicable to them (what they have), and the properties of these operations. If you are familiar with OO programming you will recognize that this is exactly, at the implementation level, what a class is. But here we are talking about mathematical objects and we do not need to consider implementation.

An example  of a type defined in this way, as an ADT, is a notion of POINT on a line. We do not say how this object is represented (a concept that is irrelevant at the specification level) but how it appears to the rest of the world: we can create a new point at the origin, ask for the coordinate of a point, or move the point by a certain displacement. The example is the simplest meaningful one possible, but it gives the ideas.

adt

An ADT specification has three part: Functions, Preconditions and Axioms. Let us see them (skipping Preconditions for the moment) for the definition of the POINT abstract data type.

The functions are the operations that characterize the type. There are three kinds of function, defined by where the ADT under definition, here POINT, appears:

  • Creators, where the type appears only among the results.
  • Queries, where it appears only among the arguments.
  • Commands, where it appears on both sides.

There is only one creator here:

new: → POINT

new is a function that takes no argument, and yields a point (the origin). We will write the result as just new (rather than using empty parentheses as in new ()).

Creators correspond in OO programming to constructors of a class (creation procedures in Eiffel). Like constructors, creators may have arguments: for example instead of always creating a point at the origin we could decide that new creates a point with a given coordinate, specifying it as INTEGER → POINT and using it as new (i) for some integer i (our points will have integer coordinates). Here for simplicity we choose a creator without arguments. In any case the new type, here POINT, appears only on the side of the results.

Every useful ADT specification needs at least one creator, without which we would never obtain any objects of the type (here any points) to work with.

There is also only one query:

x: POINT → INTEGER

 which gives us the position of a point, written x (p) for a point p. More generally, a query enables us to obtain properties of objects of the new type. These properties must be expressed in terms of types that we have already defined, like INTEGER here. Again there has to be at least one query, otherwise we could never obtain usable information (information expressed in terms of what we already know) about objects of the new type. In OO programming, queries correspond to fields (attributes) of a class and functions without side effects.

And we also have just one command:

move: POINT × INTEGER → POINT

a function that for any point p and integer i and yields a new point, move (p, i).  Again an ADT specification is not interesting unless it has at least one command, representing ways to modify objects. (In mathematics we do not actually modify objects, we get new objects. In imperative programming we will actually update existing objects.) In the classes of object-oriented programming, commands correspond to procedures (methods which may change objects).

You see the idea: define the notion of POINT through the applicable operations.

Listing their names and the types of their arguments types results (as in POINT × INTEGER → POINT) is not quite enough to specify these operations: we must specify their fundamental properties, without of course resorting to a programming implementation. That is the role of the second component of an ADT specification, the axioms.

For example I wrote above that new yields the origin, the point for which x = 0,  but you only had my word for it. My word is good but not good enough. An axiom will give you this property unambiguously:

x (new) = 0                                    — A0

The second axiom, which is also the last, tells us what move actually does. It applies to any point p and any integer m:

x (move (p, m)) = x (p) + m       — A1

In words: the coordinate of the point resulting from moving p by m is the coordinate of p plus m.

That’s it! (Except for the notion of precondition, which will wait a bit.) The example is trivial but this approach can be applied to any number of  data types, with any number of applicable operations and any level of complexity. That is what we do, at the design and implementation level, when writing classes in OO programming.

Is my ADT sufficiently complete?

Sufficient completeness is a property that we can assess on such specifications. An ADT specification for a type T (here POINT) is sufficiently complete if the axioms are powerful enough to yield the value of any well-formed query expression in a form not involving T. This definition contains a few new terms but the concepts are very simple; I will explain what it means through an example.

With an ADT specification we can form all kinds of expressions, representing arbitrarily complex specifications. For example:

x (move (move (move (new, 3), x (move (move (new, -2), 4))), -6))

This expression will yield an integer (since function x has INTEGER as its result type) describing the result of a computation with points. We can visualize this computation graphically; note that it involves creating two points (since there are two occurrences of new) and moving them, using in one case the current coordinate of one of them as displacement for the other. The following figure illustrates the process.

computation

The result, obtained informally by drawing this picture, is the x of P5, that is to say -1. We will derive it mathematically below.

Alternatively, if like most programmers (and many other people) you find it more intuitive to reason operationally than mathematically, you may think of the previous expression as describing the result of the following OO program (with variables of type POINT):

create p                                — In C++/Java syntax: p = new POINT();
create q
p.move (3)
q.move (-2)
q.move (4)
p.move (q.x)
p.move (-6)

Result := p.x

You can run this program in your favorite OO programming language, using a class POINT with new, x and move, and print the value of Result, which will be -1.

Here, however, we will stay at the mathematical level and simplify the expression using the axioms of the ADT, the same way we would compute any other mathematical formula, applying the rules without needing to rely on intuition or operational reasoning. Here is the expression again (let’s call it i, of type INTEGER):

ix (move (move (move (new, 3), x (move (move (new, -2), 4))), -6))

A query expression is one in which the outermost function being applied, here x, is a query function. Remember that a query function is one which the new type, here POINT, appears only on the left. This is the case with x, so the above expression i is indeed a query expression.

For sufficient completeness, query expressions are the ones of interest because their value is expressed in terms of things we already know, like INTEGERs, so they are the only way we can concretely obtain directly usable information the ADT (to de-abstract it, so to speak).

But we can only get such a value by applying the axioms. So the axioms are “sufficiently complete” if they always give us the answer: the value of any such query expression.

 Let us see if the above expression i satisfies this condition of sufficient completeness. To make it more tractable let us write  it in terms of simpler expressions (all of type POINT), as illustrated by the figure below:

p1 = move (new, 3)
p2= move (new, -2)
p3= move (p2, 4)
p4= move (p1, x (p3))
p5= move (p4, -6)
i = x (p5)

expression

(You may note that the intermediate expressions roughly correspond to the steps in the above interpretation of the computation as a program. They also appear in the illustrative figure repeated below.)

computation

Now we start applying the axioms to evaluating the expressions. Remember that we have two axioms: A0 tells us that x (new) = 0 and A1 that x (move (p, m)) = x (p) + m. Applying A1 to the definition the expression i yields

i = x (p4) – 6
= i4 – 6

if we define

i4 = x (p4)      — Of type INTEGER

We just have to compute i4. Applying A1 to the definion of p4 tells us that

i4 = x (p1) + x (p3)

To compute the two terms:

  • Applying A1 again, we see that the first term x (p1) is x (new) + 3, but then A0 tells us that x (new) is zero, so x (p1) is 3.
  • As to x (p3), it is, once more from A1, x (p2) + 4, and x (p2) is (from A1 then A0), just -2, so x (p3) is 2.

In the end, then, i4 is 5, and the value of the entire expression i = i4 – 6 is -1. Good job!

Proving sufficient completeness

The successful computation of i was just a derivation for one example, showing that in that particular case the axioms yield the answer in terms of an INTEGER. How do we go from one example to an entire specification?

The bad news first: like all interesting problems in programming, sufficient completeness of an ADT specification is theoretically undecidable. There is no general automatic procedure that will process an ADT specification and print out ““sufficiently complete” or “not sufficiently complete”.

Now that you have recovered from the shock, you can share the computer scientist’s natural reaction to such an announcement: so what. (In fact we might define the very notion of computer scientist as someone who, even before he brushes his teeth in the morning — if he brushes them at all — has already built the outline of a practical solution to an undecidable problem.) It is enough that we can find a way to determine if a given specification is sufficiently complete. Such a proof is, in fact, the computer scientist’s version of dental hygiene: no ADT is ready for prime time unless it is sufficiently complete.

The proof is usually not too hard and will follow the general style illustrated for our simple example.

We note that the definition of sufficient completeness said: “the axioms are powerful enough to yield the value of any well-formed query expression in a form not involving the type”. I have not defined “well-formed” yet. It simply means that the expressions are properly structured, with the proper syntax (basically the correct matching of parentheses) and proper number and types of arguments. For example the following are not well-formed (if p is an expression of type POINT):

move (p, 55(     — Bad use of parentheses.
move (p)            — Wrong number of arguments.
move (p, p)       — Wrong type: second argument should be an integer.

Such expressions are nonsense, so we only care about well-formed expressions. Note that in addition to new, x and move , an expression can use integer constants as in the example (although we could generalize to arbitrary integer expressions). We consider an integer constant as a query expression.

We have to prove that with the two axioms A0 and A1 we can determine the value of any query expression i. Note that since the only query functions is x, the only possible form for i, other than an integer constant, is x (p) for some expression p of type POINT.

The proof proceeds by induction on the number n of parenthesis pairs in a query expression i.

There are two base steps:

  • n = 0: in that case i can only be an integer constant. (The only expression with no parentheses built out of the ADT’s functions is new, and it is not a query expression.) So the value is known. In all other cases i will be of the form x (p) as noted.
  • n = 1: in that case p  can only be new, in other words i = x (new), since the only function that yields points, other than new, is move, and any use of it would add parentheses. In this case axiom A0 gives us the value of i: zero.

For the induction step, we consider i with n + 1 parenthesis pairs for n > 1. As noted, i is of the form x (p), so p has exactly n parenthesis pairs. p cannot be new (which would give 0 parenthesis pairs and was taken care of in the second base step), so p has to be of the form

p =  move (p’, i’)    — For expressions p’ of type POINT and i’ of type INTEGER.

implying (since i = x (p)) that by axiom A1, the value of i is

x (p’) + i’

So we will be able to determine the value of i if we can determine the value of both x (p’) and i’. Since p has n parenthesis pairs and p =  move (p’, i’), both p’ and i’ have at most n – 1 parenthesis pairs. (This use of n – 1 is legitimate because we have two base steps, enabling us to assume n > 1.) As a consequence, both x (p’) and i’ have at most n parenthesis pairs, enabling us to deduce their values, and hence the value of i, by the induction hypothesis.

Most proofs of sufficient completeness in my experience follow this style: induction on the number of parenthesis pairs (or the maximum nesting level).

Preconditions

I left until now the third component of a general ADT specification: preconditions. The need for preconditions arises because most practical specifications need some of their functions to be partial. A partial function from X to Y is a function that may not yield a value for some elements of X. For example, the inverse function on real numbers, which yields 1 / a for x, is partial  since it is not defined for a = 0 (or, on a computer, for non-zero but very small a).

Assume that in our examples we only want to accept points that lie in the interval [-4, +4]:

limited

 We can simply model this property by turning move into a partial function. It was specified above as

move: POINT × INTEGER → POINT

The ordinary arrow → introduces a total (always defined) function. For a partial function we will use a crossed arrow ⇸, specifying the function as

move: POINT × INTEGER ⇸ POINT

Other functions remain unchanged. Partial functions cause trouble: for f in X ⇸ Y we can no longer cheerfully use f (x) if f is a partial function, even for x of the appropriate type X. We have to make sure that x belongs to the domain of f, meaning the set of values for which f is defined. There is no way around it: if you want your specification to be meaningful and it uses partial functions, you must specify explicitly the domain of each of them. Here is how to do it, in the case of move:

move (p: POINT; d: INTEGER) require |x (p) + d | < 5    — where |…| is absolute value

To adapt the definition (and proofs) of sufficient completeness to the possible presence of partial functions:

  • We only need to consider (for the rule that axioms must yield the value of query expressions) well-formed expressions that satisfy the associated preconditions.
  • The definition must, however, include the property that axioms always enable us to determine whether an expression satisfies the associated preconditions (normally a straightforward part of the proof since preconditions are themselves query expressions).

Updating the preceding proof accordingly is not hard.

Back to requirements

The definition of sufficient completeness is of great help to assess the completeness of a requirements document. We must first regretfully note that for many teams today requirements stop at  “use cases” (scenarios) or  “user stories”. Of course these are not requirements; they only describe individual cases and are to requirements what tests are to programs. They can serve to check requirements, but do not suffice as requirements. I am assuming real requirements, which include descriptions of behavior (along with other elements such as environment properties and project properties). To describe behaviors, you will define operations and their effects. Now we know what the old IEEE standard is telling us by stating that complete requirements should include

definition of the responses of the software to all realizable classes of input data in all realizable classes of situations

Whether or not we have taken the trouble to specify the ADTs, they are there in the background; our system’s operations reflect the commands, and the effects we can observe reflect the queries. To make our specification complete, we should draw as much as possible of the (mental or explicit) matrix of possible effects of all commands on all queries. “As much as possible” because software engineering is engineering and we will seldom be able to reach perfection. But the degree of fullness of the matrix tells us a lot (possible software metric here?) about how close our requirements are to completeness.

I should note that there are other aspects to completeness of requirements. For example the work of Michael Jackson, Pamela Zave and Axel van Lamsweerde (more in some later article, with full references) distinguishes between business goals, environment constraints and system properties, leading to a notion of completeness as how much the system properties meet the goals and obey the constraints [4]. Sufficient completeness operates at the system level and, together with its theoretical basis, is one of those seminal concepts that every practicing software engineer or project manager should master.

References and notes

[1] John V. Guttag, Jim J. Horning: The Algebraic Specification of Abstract Data Types, in Acta Informatica, vol. 10, no. 1, pages 27-52, 1978, available here from the Springer site. This is a classic paper but I note that few people know it today; in Google Scholar I see over 700 citations but less than 100 of them in the past 8 years.

[2]  IEEE: Recommended Practice for Software Requirements Specifications, IEEE Standard 830-1998, 1998. This standard is supposed to be obsolete and replaced by newer ones, more detailed and verbose, but it remains the better reference: plain, modest and widely applied by the industry. It does need an update, but a good one.

[3] Bertrand Meyer, Object-Oriented Software Construction, 2nd edition, Prentice Hall, 1997. The discussion of sufficient completeness was in fact already there in the first edition from 1988.

[4] With thanks to Elisabetta Di Nitto from Politecnico di Milano for bringing up this notion of requirements completeness.

Recycled A version of this article was first published on the Communications of the ACM blog.

 

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Soundness and completeness: with precision

Over breakfast at your hotel you read an article berating banks about the fraudulent credit card transactions they let through. You proceed to check out and bang! Your credit card is rejected because (as you find out later) the bank thought [1] it couldn’t possibly be you in that exotic place. Ah, those banks! They accept too much. Ah, those banks! They reject too much. Finding the right balance is a case of soundness versus precision.

Similar notions are essential to the design of tools for program analysis, looking for such suspicious cases as  dead code (program parts that will never be executed). An analysis can be sound, or not; it can be complete, or not.

These widely used concepts are sometimes misunderstood.  The first answer I get when innocently asking people whether the concepts are clear is yes, of course, everyone knows! Then, as I bring up such examples as credit card rejection or dead code detection, assurance quickly yields to confusion. One sign that things are not going well is when people start throwing in terms like “true positive” and “false negative”. By then any prospect of reaching a clear conclusion has vanished. I hope that after reading this article you will never again (in a program analysis context) be tempted to use them.

Now the basic idea is simple. An analysis is sound if it reports all errors, and complete if it only reports errors. If not complete, it is all the more precise that it reports fewer non-errors.

You can stop here and not be too far off [2]. But a more nuanced and precise discussion helps.

1. A relative notion

As an example of common confusion, one often encounters attempts to help through something like Figure 1, which cannot be right since it implies that all sound methods are complete. (We’ll have better pictures below.)

Figure 1: Naïve (and wrong) illustration

Perhaps this example can be dismissed as just a bad use of illustrations [3] but consider the example of looking for dead code. If the analysis wrongly determines that some reachable code is unreachable, is it unsound or incomplete?

With this statement of the question, the only answer is: it depends!

It depends on the analyzer’s mandate:

  • If it is a code checker that alerts programmers to cases of bad programming style, it is incomplete: it reports as an error a case that is not. (Reporting that unreachable code is reachable would cause unsoundness, by missing a case that it should have reported.)
  • If it is the dead-code-removal algorithm of an optimizing compiler, which will remove unreachable code, it is unsound: the compiler will remove code that it should not. (Reporting that unreachable code is reachable would cause incompleteness, by depriving the compiler of an optimization.)

As another example, consider an analyzer that finds out whether a program will terminate. (If you are thinking “but that can’t be done!“, see the section “Appendix: about termination” at the very end of this article.) If it says a program does not terminates when in fact it does, is it unsound or incomplete?

Again, that depends on what the analyzer seeks to establish. If it is about the correctness of a plain input-to-output program (a program that produces results and then is done), we get incompleteness: the analyzer wrongly flags a program that is actually OK. But if it is about verifying that continuously running programs, such as the control system for a factory, will not stop (“liveness”), then the analyzer is unsound.

Examples are not limited to program analysis. A fraud-indentification process that occasionally rejects a legitimate credit card purchase is, from the viewpoint of preserving the bank from fraudulent purchases, incomplete. From the viewpoint of the customer who understands a credit card as an instrument enabling payments as long as you have sufficient credit, it is unsound.

These examples suffice to show that there cannot be absolute definitions of soundness and precision: the determination depends on which version of a boolean property we consider desirable. This decision is human and subjective. Dead code is desirable for the optimizing compiler and undesirable (we will say it is a violation) for the style checker. Termination is desirable for input-output programs and a violation for continuously running programs.

Once we have decided which cases are desirable and which are violations, we can define the concepts without any ambiguity: soundness means rejecting all violations, and completeness means accepting all desirables.

While this definition is in line with the unpretentious, informal one in the introduction, it makes two critical aspects explicit:

  • Relativity. Everything depends on an explicit decision of what is desirable and what is a violation. Do you want customers always to be able to use their credit cards for legitimate purchases, or do you want to detect all frauds attempts?
  • Duality. If you reverse the definitions of desirable and violation (they are the negation of each other), you automatically reverse the concepts of soundness and completeness and the associated properties.

We will now explore the consequences of these observations.

2. Theory and practice

For all sufficiently interesting problems, theoretical limits (known as Rice’s theorem) ensure that it is impossible to obtain both soundness and completeness.

But it is not good enough to say “we must be ready to renounce either soundness or completeness”. After all, it is very easy to obtain soundness if we forsake completeness: reject every case. A termination-enforcement analyzer can reject every program as potentially non-terminating. A bank that is concerned with fraud can reject every transaction (this seems to be my bank’s approach when I am traveling) as potentially fraudulent. Dually, it is easy to ensure completeness if we just sacrifice soundness: accept every case.

These extreme theoretical solutions are useless in practice; here we need to temper the theory with considerations of an engineering nature.

The practical situation is not as symmetric as the concept of duality theoretically suggests. If we have to sacrifice one of the two goals, it is generally better to accept some incompleteness: getting false alarms (spurious reports about cases that turn out to be harmless) is less damaging than missing errors. Soundness, in other words, is essential.

Even on the soundness side, though, practice tempers principle. We have to take into account the engineering reality of how tools get produced. Take a program analyzer. In principle it should cover the entire programming language. In practice, it will be built step by step: initially, it may not handle advanced features such as exceptions, or dynamic mechanisms such as reflection (a particularly hard nut to crack). So we may have to trade soundness for what has been called  “soundiness[4], meaning soundness outside of cases that the technology cannot handle yet.

If practical considerations lead us to more tolerance on the soundness side, on the completeness side they drag us (duality strikes again) in the opposite direction. Authors of analysis tools have much less flexibility than the theory would suggest. Actually, close to none. In principle, as noted, false alarms do not cause catastrophes, as missed violations do; but in practice they can be almost as bad.  Anyone who has ever worked on or with a static analyzer, going back to the venerable Lint analyzer for C, knows the golden rule: false alarms kill an analyzer. When people discover the tool and run it for the first time, they are thrilled to discover how it spots some harmful pattern in their program. What counts is what happens in subsequent runs. If the useful gems among the analyzer’s diagnostics are lost in a flood of irrelevant warnings, forget about the tool. People just do not have the patience to sift through the results. In practice any analysis tool has to be darn close to completeness if it has to stand any chance of adoption.

Completeness, the absence of false alarms, is an all-or-nothing property. Since in the general case we cannot achieve it if we also want soundness, the engineering approach suggests using a numerical rather than boolean criterion: precision. We may define the precision pr as 1 – im where im is the imprecision:  the proportion of false alarms.

The theory of classification defines precision differently: as pr = tp / (tp + fp), where tp is the number of false positives and fp the number of true positives. (Then im would be fp / (tp + fp).) We will come back to this definition, which requires some tuning for program analyzers.

From classification theory also comes the notion of recall: tp / (tp + fn) where fn is the number of false negatives. In the kind of application that we are looking at, recall corresponds to soundness, taken not as a boolean property (“is my program sound?“) but  a quantitative one (“how sound is my program?“). The degree of unsoundness un would then be fn / (tp + fn).

3. Rigorous definitions

With the benefit of the preceding definitions, we can illustrate the concepts, correctly this time. Figure 2 shows two different divisions of the set of U of call cases (universe):

  • Some cases are desirable (D) and others are violations (V).
  • We would like to know which are which, but we have no way of finding out the exact answer, so instead we run an analysis which passes some cases (P) and rejects some others (R).

Figure 2: All cases, classified

The first classification, left versus right columns in Figure 2, is how things are (the reality). The second classification, top versus bottom rows, is how we try to assess them. Then we get four possible categories:

  • In two categories, marked in green, assessment hits reality on the nail:  accepted desirables (A), rightly passed, and caught violations (C), rightly rejected.
  • In the other two, marked in red, the assessment is off the mark: missed violations (M), wrongly passed; and false alarms (F), wrongly accepted.

The following properties hold, where U (Universe) is the set of all cases and  ⊕ is disjoint union [5]:

— Properties applicable to all cases:
U = D ⊕ V
U = P ⊕ R
D = A ⊕ F
V = C ⊕ M
P = A ⊕ M
R = C ⊕ F
U = A ⊕M ⊕ F ⊕ C

We also see how to define the precision pr: as the proportion of actual violations to reported violations, that is, the size of C relative to R. With the convention that u is the size of U and so on, then  pr = c / r, that is to say:

  • pr = c / (c + f)      — Precision
  • im = f / (c + f)      — Imprecision

We can similarly define soundness in its quantitative variant (recall):

  • so = a / (a + m)      — Soundness (quantitative)
  • un = m / (a + m)   — Unsoundness

These properties reflect the full duality of soundness and completeness. If we reverse our (subjective) criterion of what makes a case desirable or a violation, everything else gets swapped too, as follows:

Figure 3: Duality

We will say that properties paired this way “dual” each other [6].

It is just as important (perhaps as a symptom that things are not as obvious as sometimes assumed) to note which properties do not dual. The most important examples are the concepts of  “true” and “false” as used in “true positive” etc. These expressions are all the more confusing that the concepts of True and False do dual each other in the standard duality of Boolean algebra (where True duals False,  Or duals And, and an expression duals its negation). In “true positive” or “false negative”,  “true” and “false” do not mean True and False: they mean cases in which (see figure 2 again) the assessment respectively matches or does not match the reality. Under duality we reverse the criteria in both the reality and the assessment; but matching remains matching! The green areas remain green and the red areas remain red.

The dual of positive is negative, but the dual of true is true and the dual of false is false (in the sense in which those terms are used here: matching or not). So the dual of true positive is true negative, not false negative, and so on. Hereby lies the source of the endless confusions.

The terminology of this article removes these confusions. Desirable duals violation, passed duals rejected, the green areas dual each other and the red areas dual each other.

4. Sound and complete analyses

If we define an ideal world as one in which assessment matches reality [7], then figure 2 would simplify to just two possibilities, the green areas:

Figure 4: Perfect analysis (sound and complete)

This scheme has the following properties:

— Properties of a perfect (sound and complete) analysis as in Figure 4:
M = ∅              — No missed violations
F = ∅               — No false alarms
P = D                — Identify  desirables exactly
R = V                –Identify violations exactly

As we have seen, however, the perfect analysis is usually impossible. We can choose to build a sound solution, potentially incomplete:

Figure 5: Sound desirability analysis, not complete

In this case:

— Properties of a sound analysis (not necessarily complete) as in Figure 5:
M = ∅              — No missed violations
P = A                — Accept only desirables
V = C                — Catch all violations
P ⊆ D               — Under-approximate desirables
R ⊇ V               — Over-approximate violations

Note the last two properties. In the perfect solution, the properties P = D and R = V mean that the assessment, yielding P and V, exactly matches the reality, D and V. From now on we settle for assessments that approximate the sets of interest: under-approximations, where the assessment is guaranteed to compute no more than the reality, and over-approximations, where it computes no less. In all cases the assessed sets are either subsets or supersets of their counterparts. (Non-strict, i.e. ⊆ and ⊇ rather than ⊂ and ⊃; “approximation” means possible approximation. We may on occasion be lucky and capture reality exactly.)

We can go dual and reach for completeness at the price of possible unsoundness:

Figure 6: Complete desirability analysis, not sound

The properties are dualled too:

— Properties of a complete analysis (not necessarily sound), as in Figure 6:
F = ∅              — No false alarms
R = C               — Reject only violations
D = A               — Accept all desirables
P ⊇ D               — Over-approximate desirables
R ⊆ V              — Under-approximate violations

5. Desirability analysis versus violation analysis

We saw above why the terms “true positives”, “false negatives” etc., which do not cause any qualms in classification theory, are deceptive when applied to the kind of pass/fail analysis (desirables versus violations) of interest here. The definition of precision provides further evidence of the damage. Figure 7 takes us back to the general case of Figure 2 (for analysis that is guaranteed neither sound nor complete)  but adds these terms to the respective categories.

Figure 7: Desirability analysis (same as fig. 2 with added labeling)

The analyzer checks for a certain desirable property, so if it wrongly reports a violation (F) that is a false negative, and if it misses a violation (M) it is a false positive. In the  definition from classification theory (section 2, with abbreviations standing for True/False Positives/Negatives): TP = A, FP = M, FN =  F, TN = C, and similarly for the set sizes: tp = a, fp = m, fn = f, tn = c.

The definition of precision from classification theory was pr = tp / (tp + fp), which here gives a / (a + m). This cannot be right! Precision has to do with how close the analysis is to completeness, that is to day, catching all violations.

Is classification theory wrong? Of course not. It is simply that, just as Alice stepped on the wrong side of the mirror, we stepped on the wrong side of duality. Figures 2 and 7 describe desirability analysis: checking that a tool does something good. We assess non-fraud from the bank’s viewpoint, not the stranded customer’s; termination of input-to-output programs, not continuously running ones; code reachability for a static checker, not an optimizing compiler. Then, as seen in section 3, a / (a + m) describes not precision but  soundness (in its quantitative interpretation, the parameter called “so” above).

To restore the link with classification theory , we simply have to go dual and take the viewpoint of violation analysis. If we are looking for possible violations, the picture looks like this:

Figure 8: Violation analysis (same as fig. 7 with different positive/negative labeling)

Then everything falls into place:  tp = c, fp = f, fn =  m, tn = a, and the classical definition of  precision as pr = tp / (tp + fp) yields c / (c + f) as we are entitled to expect.

In truth there should have been no confusion since we always have the same picture, going back to Figure 2, which accurately covers all cases and supports both interpretations: desirability analysis and violation analysis. The confusion, as noted, comes from using the duality-resistant “true”/”false” opposition.

To avoid such needless confusion, we should use the four categories of the present discussion:  accepted desirables, false alarms, caught violations and missed violations [8]. Figure 2 and its variants clearly show the duality, given explicitly in Figure 3, and sustains  interpretations both for desirability analysis and for violation analysis. Soundness and completeness are simply special cases of the general framework, obtained by ruling out one of the cases of incorrect analysis in each of Figures 4 and 5. The set-theoretical properties listed after Figure 2 express the key concepts and remain applicable in all variants. Precision c / (c + f) and quantitative soundness a / (a + m) have unambiguous definitions matching intuition.

The discussion is, I hope, sound. I have tried to make it complete. Well, at least it is precise.

Notes and references

[1] Actually it’s not your bank that “thinks” so but its wonderful new “Artificial Intelligence” program.

[2] For a discussion of these concepts as used in testing see Mauro Pezzè and Michal Young, Software Testing and Analysis: Process, Principles and Techniques, Wiley, 2008.

[3] Edward E. Tufte: The Visual Display of Quantitative Information, 2nd edition, Graphics Press, 2001.

[4] Michael Hicks,What is soundness (in static analysis)?, blog article available here, October 2017.

[5] The disjoint union property X = Y ⊕ Z means that Y ∩ Z = ∅ (Y and Z are disjoint) and X = Y ∪ Z (together, they yield X).

[6] I thought this article would mark the introduction into the English language of “dual” as a verb, but no, it already exists in the sense of turning a road from one-lane to two-lane (dual).

[7] As immortalized in a toast from the cult movie The Prisoner of the Caucasus: “My great-grandfather says: I have the desire to buy a house, but I do not have the possibility. I have the possibility to buy a goat, but I do not have the desire. So let us drink to the matching of our desires with our possibilities.” See 6:52 in the version with English subtitles.

[8] To be fully consistent we should replace the term “false alarm” by rejected desirable. I is have retained it because it is so well established and, with the rest of the terminology as presented, does not cause confusion.

[9] Byron Cook, Andreas Podelski, Andrey Rybalchenko: Proving Program Termination, in Communications of the ACM, May 2011, Vol. 54 No. 5, Pages 88-98.

Background and acknowledgments

This reflection arose from ongoing work on static analysis of OO structures, when I needed to write formal proofs of soundness and completeness and found that the definitions of these concepts are more subtle than commonly assumed. I almost renounced writing the present article when I saw Michael Hicks’s contribution [4]; it is illuminating, but I felt there was still something to add. For example, Hicks’s set-based illustration is correct but still in my opinion too complex; I believe that the simple 2 x 2 pictures used above convey the ideas  more clearly. On substance, his presentation and others that I have seen do not explicitly mention duality, which in my view is the key concept at work here.

I am grateful to Carlo Ghezzi for enlightening discussions, and benefited from comments by Alexandr Naumchev and others from the Software Engineering Laboratory at Innopolis University.

Appendix: about termination

With apologies to readers who have known all of the following from kindergarten: a statement such as (section 1): “consider an analyzer that finds out whether a program will terminate” can elicit no particular reaction (the enviable bliss of ignorance) or the shocked rejoinder that such an analyzer is impossible because termination (the “halting” problem) is undecidable. This reaction is just as incorrect as the first. The undecidability result for the halting problem says that it is impossible to write a general termination analyzer that will always provide the right answer, in the sense of both soundness and completeness, for any program in a realistic programming language. But that does not preclude writing termination analyzers that answer the question correctly, in finite time, for given programs. After all it is not hard to write an analyzer that will tell us that the program from do_nothing until True loop do_nothing end will terminate and that the program from do_nothing until False loop do_nothing end will not terminate. In the practice of software verification today, analyzers can give such sound answers for very large classes of programs, particularly with some help from programmers who can obligingly provide variants (loop variants, recursion variants). For a look into the state of the art on termination, see the beautiful survey by Cook, Podelski and Rybalchenko [9].

Also appears in the Communications of the ACM blog

VN:F [1.9.10_1130]
Rating: 10.0/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Why not program right?

recycled-logo (Originally published on CACM blog.)

Most of the world programs in a very strange way. Strange to me. I usually hear the reverse question: people ask us, the Eiffel community, to explain why we program our way. I hardly understand the question, because the only mystery is how anyone can even program in any other way.

The natural reference is the beginning of One Flew Over the Cuckoo’s Nest: when entering an insane asylum and wondering who is an inmate and who a doctor, you may feel at a loss for objective criteria. Maybe the rest of the world is right and we are the nut cases. Common sense suggests it.

But sometimes one can go beyond common sense and examine the evidence. So lend me an ear while I explain my latest class invariant. Here it is, in Figure 1. (Wait, do not just run away yet.)

multigraph_invariant

Figure 1: From the invariant of class MULTIGRAPH

This is a program in progress and by the time you read this note the invariant and enclosing class will have changed. But the ideas will remain.

Context: multigraphs

The class is called MULTIGRAPH and describes a generalized notion of graph, illustrated in Figure 2. The differences are that: there can be more than one edge between two nodes, as long as they have different tags (like the spouse and boss edges between 1 and 2); and there can be more than one edge coming out of a given node and with a given tag (such as the two boss edges out of 1, reflecting that 1’s boss might be 2 in some cases and 3 in others). Some of the nodes, just 1 here, are “roots”.

The class implements the notion of multigraph and provides a wide range of operations on multigraphs.

multigraph_example

Figure 2: A multigraph

Data structures

Now we turn to the programming and software engineering aspects. I am playing with various ways of accessing multigraphs. For the basic representation of a multigraph, I have chosen a table of triples:

                triples_table: HASH_TABLE [TRIPLE, TUPLE [source: INTEGER; tag: INTEGER; target: INTEGER]]  — Table of triples, each retrievable through its `source’, `tag’ and `target’.

where the class TRIPLE describes [source, tag, target] triples, with a few other properties, so they are not just tuples. It is convenient to use a hash table, where the key is such a 3-tuple. (In an earlier version I used just an ARRAY [TRIPLE], but a hash table proved more flexible.)

Sources and targets are nodes, also called “objects”; we represent both objects and tags by integers for efficiency. It is easy to have structures that map symbolic tag names such as “boss” to integers.

triples_table is the core data structure but it turns out that for the many needed operations it is convenient to have others. This technique is standard: for efficiency, provide different structures to access and manipulate the same underlying information, with some redundancy. So I also have:

 triples_from:  ARRAYED_LIST [LIST [TRIPLE]]
               — Triples starting from a given object. Indexed by object numbers.

  triples_with:  HASH_TABLE [LIST [TRIPLE], INTEGER]
               — Triples labeled by a given tag. Key is tag number.

 triples_to:  ARRAYED_LIST [LIST [TRIPLE]]
               — Triples leading into a given object. Indexed by object numbers.

Figure 3 illustrates triples_from and Figures 4 illustrates triples_with. triples_to is similar.

triples_from

Figure 3: The triples_from array of lists and the triples_table

triples_with

Figure 4: The triples_with array of lists and the triples_table

It is also useful to access multigraphs through yet another structure, which gives us the targets associated with a given object and tag:

successors: ARRAY [HASH_TABLE [LIST [TRIPLE], INTEGER]]
               — successors [obj] [t] includes all o such that there is a t- reference from obj to o.

For example in Figure 1 successors [1] [spouse] is {2, 3}, and in Figures 3 and 4 successors [26] [t] is {22, 55, 57}. Of course we can obtain the “successors” information through the previously defined structures, but since this is a frequently needed operation I decided to include a specific data structure (implying that every operation modifying the multigraph must update it). I can change my mind later on and decide to make “successors” a function rather than a data structure; it is part of the beauty of OO programming, particularly in Eiffel, that such changes are smooth and hardly impact client classes.

There is similar redundancy in representing roots:

                roots:  LINKED_SET [INTEGER]
                              — Objects that are roots.

                is_root:  ARRAY [BOOLEAN]
                              — Which objects are roots? Indexed by object numbers.

If o is a root, then it appears in the “roots” set and is_root [o] has value True.

Getting things right

These are my data structures. Providing such a variety of access modes is a common programming technique. From a software engineering perspective ― specification, implementation, verification… ― it courts disaster. How do we maintain their consistency? It is very easy for a small mistake to slip into an operation modifying the graph, causing one of the data structures to be improperly updated, but in a subtle and rare enough way that it will not manifest itself during testing, coming back later to cause strange behavior that will be very hard to debug.

For example, one of the reasons I have a class TRIPLE and not just 3-tuples is that a triple is not exactly  the same as an edge in the multigraph. I have decided that by default the operation that removes and edge would not remove the corresponding triple from the data structure, but leave it in and mark it as “inoperative” (so class TRIPLE has an extra “is_inoperative” boolean field). There is an explicit GC-like mechanism to clean up deleted edges occasionally. This approach brings efficiency but makes the setup more delicate since we have to be extremely careful about what a triple means and what removal means.

This is where I stop understanding how the rest of the world can work at all. Without some rigorous tools I just do not see how one can get such things right. Well, sure, spend weeks of trying out test cases, printing out the structures, manually check everything (in the testing world this is known as writing lots of “oracles”), try at great pains to find out the reason for wrong results, guess what program change will fix the problem, and start again. Stop when things look OK. When, as Tony Hoare once wrote, there are no obvious errors left.

Setting aside the minuscule share of projects (typically in embedded life-critical systems) that use some kind of formal verification, this process is what everyone practices. One can only marvel that systems, including many successful ones, get produced at all. To take an analogy from another discipline, this does not compare to working like an electrical engineer. It amounts to working like an electrician.

For a short time I programmed like that too (one has to start somewhere, and programming methodology was not taught back then). I no longer could today. Continuing with the Hoare citation, the only acceptable situation is to stop when there are obviously no errors left.

How? Certainly not, in my case, by always being right the first time. I make mistakes like everyone else does. But I have the methodology and tools to avoid some, and, for those that do slip through, to spot and fix them quickly.

Help is available

First, the type system. Lots of inconsistencies, some small and some huge, which in an untyped language would only hit during execution, do not make it past compilation. We are not just talking here about using REAL instead of INTEGER. With a sophisticated type system involving multiple inheritance, genericity, information hiding and void safety, a compiler error message can reflect a tricky logical mistake. You are using a SET as if it were a LIST (some operations are common, but others not). You are calling an operation on a reference that may be void (null) at run time. And so on.

By the way, about void-safety: for a decade now, Eiffel has been void-safe, meaning a compile-time guarantee of no run-time null pointer dereferencing. It is beyond my understanding how the rest of the world can still live with programs that run under myriad swords of Damocles: x.op (…) calls that might any minute, without any warning or precedent, hit a null x and crash.

Then there is the guarantee of logical consistency, which is where my class invariant (Figure 1) comes in. Maybe it scared you, but in reality it is all simple concepts, intended to make sure that you know what you are doing, and rely on tools to check that you are right. When you are writing your program, you are positing all kinds, logical assumptions, large and (mostly) small, all the time. Here, for the structure triples_from [o] to make sense, it must be a list such that:

  • It contains all the triples t in the triples_table such that t.source = o.
  •  It contains only those triples!

You know this when you write the program; otherwise you would not be having a “triples_from” structure. Such gems of knowledge should remain an integral part of the program. Individually they may not be rocket science, but accumulated over the lifetime of a class design, a subsystem design or a system design they collect all the intelligence that makes the software possible.  Yet in the standard process they are gone the next minute! (At best, some programmers may write a comment, but that does not happen very often, and a comment has no guarantee of precision and no effect on testing or correctness.)

Anyone who takes software development seriously must record such fundamental properties. Here we need the following invariant clause:

across triples_from as tf all

across tf.item as tp all tp.item.source = tf.cursor_index end

end

(It comes in the class, as shown in Figure 1, with the label “from_list_consistent”. Such labels are important for documentation and debugging purposes. We omit them here for brevity.)

What does that mean? If we could use Unicode (more precisely, if we could type it easily with our keyboards) we would write things like “∀ x: E | P (x) for all x in E, property P holds of x. We need programming-language syntax and write this as across E as x all P (x.item) end. The only subtlety is the .item part, which gives us generality beyond the  notation: x in the across is not an individual element of E but a cursor that moves over E. The actual element at cursor position is x.item, one of the properties of that cursor. The advantage is that the cursor has more properties, for example x.cursor_index, which gives its position in E. You do not get that with the plain of mathematics.

If instead of  you want  (there exists), use some instead of all. That is pretty much all you need to know to understand all the invariant clauses of class MULTIGRAPH as given in Figure 1.

So what the above invariant clause says is: take every position tf in triples_from; its position is tf.cursor_index and its value is tf.item. triples_from is declared as ARRAYED_LIST [LIST [TRIPLE]], so tf.cursor_index is an integer representing an object o, and tf.item is a list of triples. That list should  consist of the triples having tf.cursor_index as their source. This is the very property that we are expressing in this invariant clause, where the innermost across says: for every triple tp.item in the list, the source of that triple is the cursor index (of the outside across). Simple and straightforward, I think (although such English explanations are so much more verbose than formal versions, such as the Eiffel one here, and once you get the hang of it you will not need them any more).

How can one ever include a structure such as triples_from without expressing such a property? To put the question slightly differently: am I inside the asylum looking out, or outside the asylum looking in? Any clue would be greatly appreciated.

More properties

For the tag ( with_) and target lists, the properties are similar:

across triples_with as tw all across tw.item as tp all tp.item.tag = tw.key end end

across triples_to as tt all across tt.item as tp all tp.item.target = tt.cursor_index end end 

We also have some properties of array bounds:

 is_root.lower = 1 and is_root.upper = object_count

triples_from.lower = 1 and triples_from.upper = object_count

triples_to.lower = 1 and triples_to.upper = object_count

where object_count is the number of objects (nodes), and for an array a (whose bounds in Eiffel are arbitrary, not necessarily 0 or 1, and set on array creation), a.lower and a.upper are the bounds. Here we number the arrays from 1.

There are, as noted, two ways to represent rootness. We must express their consistency (or risk trouble). Two clauses of the invariant do the job:

across roots as t all is_root [t.item] end

across is_root as t all (t.item = roots.has (t.cursor_index)) end

The first one says that if we go through the list roots we only find elements whose is_root value is true; the second, that if we go through the array “is_root” we find values that are true where and only where the corresponding object, given by the cursor index, is in the roots set. Note that the = in that second property is between boolean values (if in doubt, check the type instantly in the EIffelStudio IDE!), so it means “if and only if.

Instead of these clauses, a more concise version, covering them both, is just

roots ~ domain (is_root)

with a function domain that gives the domain of a function represented by a boolean array. The ~ operator denotes object equality, redefined in many classes, and in particular in the SET classes (roots is a LINKED_SET) to cover equality between sets, i.e. the property of having the same elements.

The other clauses are all similarly self-explanatory. Let us just go through the most elaborate one, successors_consistent, involving three levels of across:

across successors as httpl all                   — httpl.item: hash table of list of triples

        across httpl.item as tpl all                — tpl.item: list of triples (tpl.key: key (i.e. tag) in hash table (tag)

                  across tpl.item as tp all            — tp.item: triple

                         tp.item.tag = tpl.key

and tp.item.source = httpl.cursor_index

                   end

          end

end

You can see that I struggled a bit with this one and made provisions for not having to struggle again when I would look at the code again 10 minutes, 10 days or 10 months later. I chose (possibly strange but consistent) names such as httpl for hash-table triple, and wrote comments (I do not usually need any in invariant and other contract clauses) to remind me of the type of everything. That was not strictly needed since once again the IDE gives me the types, but it does not cost much and could help.

What this says: go over successors; which as you remember is an ARRAY, indexed by objects, of HASH_TABLE, where each entry of such a hash table has an element of type [LIST [TRIPLE] and a key of type INTEGER, representing the tag of a number of outgoing edges from the given object. Go over each hash table httpl. Go over the associated list of triples tpl. Then for each triple tp in this list: the tag of the triple must be the key in the hash table entry (remember, the key does denote a tag); and the source of the triple must the object under consideration, which is the current iteration index in the array of the outermost iteration.

I hope I am not scaring you at this point. Although the concepts are simple, this invariant is more sophisticated than most of those we typically write. Many invariant clauses (and preconditions, and postconditions) are very simple properties, such as x > 0 or x ≠ y. The reason this one is more elaborate is not that I am trying to be fussy but that without it I would be the one scared to death. What is elaborate here is the data structure and programming technique. Not rocket science, not anything beyond programmers typically do, but elaborate. The only way to get it right is to buttress it by the appropriate logical properties. As noted, these properties are there anyway, in the back of your head, when you write the program. If you want to be more like an electrical engineer than an electrician, you have to write them down.

There is more to contracts

Invariants are not the only kind of such “contract properties. Here for example, from the same class, is a (slightly abbreviated) part of the postcondition (output property) of the operation that tells us, through a boolean Result, if the multigraph has an edge of given components osource, t (the tag) and otarget :

Result =

(across successors [osource] [t] as tp some

not tp.item.is_inoperative and tp.item.target = otarget

end)

In words, this clause expresses the compatibility of the operation with the successors view: it must answer yes if and only if otarget appears in the successor set of osource for t, and the corresponding triple is not marked inoperative.

The concrete benefits

And so? What do we get out of making these logical properties explicit? Just the intellectual satisfaction of doing things right, and the methodological guidance? No! Once you have done this work, it is all downhill. Turn on the run-time assertion monitoring option (tunable separately for preconditions, postconditions, invariants etc., and on by default in development mode), and watch your tests run. If you are like almost all of us, you will have made a few mistakes, some which will seem silly when or rather if you find them in time (but there is nothing funny about a program that crashes during operation) and some more subtle. Sit back, and just watch your contracts be violated. For example if I change <= to < in the invariant property tw.key <= max_tag, I get the result of Figure 5. I see the call stack that I can traverse, the object run-time structure that I can explore, and all the tools of a modern debugger for an OO language. Finding and correcting the logical flaw will be a breeze.

debugger

Figure 5: An invariant violation brings up the debugger

The difference

It will not be a surprise that I did not get all the data structures and algorithms of the class MULTIGRAPH  right the first time. The Design by Contract approach (the discipline of systematically expressing, whenever you write any software element, the associated logical properties) does lead to fewer mistakes, but everyone occasionally messes up. Everyone also looks at initial results to spot and correct mistakes. So what is the difference?

Without the techniques described here, you execute your software and patiently examine the results. In the example, you might output the content of the data structures, e.g.

List of outgoing references for every object:

        1: 1-1->1|D, 1-1->2|D, 1-1->3|D, 1-2->1|D, 1-2->2|D,  1-25->8|D, 1-7->1|D, 1-7->6|D,

1-10->8|D, 1-3->1|D, 1-3->2|D, 1-6->3|D, 1-6->4|D, 1-6->5|D

        3: 3-6->3, 3-6->4, 3-6->5, 3-9->14, 3-9->15,   3-9->16, 3-1->3, 3-1->2, 3-2->3, 3-2->2,

                  3-25->8, 3-7->3, 3-7->6, 3-10->8, 3-3->3,  3-3->2    

List of outgoing references for every object:

        1: 1-1->1|D, 1-1->2|D, 1-1->3|D, 1-2->1|D, 1-2->2|D, 1-25->8|D, 1-7->1|D, 1-7->6|D,

1-10->8|D, 1-3->1|D,  1-3->2|D, 1-6->3|D, 1-6->4|D, 1-6->5|D

        3: 3-6->3, 3-6->4, 3-6->5, 3-9->14, 3-9->15,  3-9->16, 3-1->3, 3-1->2, 3-2->3, 3-2->2,

                                 3-25->8, 3-7->3, 3-7->6, 3-10->8, 3-3->3,  3-3->2

and so on for all the structures. You check the entries one by one to ascertain that they are as expected. The process nowadays has some automated support, with tools such as JUnit, but it is still essentially manual, tedious and partly haphazard: you write individual test oracles for every relevant case. (For a more automated approach to testing, taking advantage of contracts, see [1].) Like the logical properties appearing in contracts, these oracles are called assertions but the level of abstraction is radically different: an oracle describes the desired result of one test, where a class invariant, or routine precondition, or postcondition expresses the properties desired of all executions.

Compared to the cost of writing up such contract properties (simply a matter of formalizing what you are thinking anyway when you write the code), their effect on testing is spectacular. Particularly when you take advantage of across iterators. In the example, think of all the checks and crosschecks automatically happening across all the data structures, including the nested structures as in the 3-level across clause. Even with a small test suite, you immediately get, almost for free, hundreds or thousands of such consistency checks, each decreasing the likelihood that a logical flaw will survive this ruthless process.

Herein lies the key advantage. Not that you will magically stop making mistakes; but that the result of such mistakes, in the form of contract violations, directly points to logical properties, at the level of your thinking about the program. A wrong entry in an output, whether you detect it visually or through a Junit clause, is a symptom, which may be far from the cause. (Remember Dijkstra’s comment, the real point of his famous Goto paper, about the core difficulty of programming being to bridge the gap between the static program text, which is all that we control, and its effect: the myriad possible dynamic executions.) Since the cause of a bug is always a logical mistake, with a contract violation, which expresses a logical inconsistency, you are much close to that cause.

(About those logical mistakes: since a contract violation reflects a discrepancy between intent, expressed by the contract, and reality, expressed by the code, the mistake may be on either side. And yes, sometimes it is the contract that is wrong while the implementation in fact did what is informally expected. There is partial empirical knowledge [1] of how often this is the case. Even then, however, you have learned something. What good is a piece of code of which you are not able to say correctly what it is trying to do?)

The experience of Eiffel programmers reflects these observations. You catch the mistakes through contract violations; much of the time, you find and correct the problem easily. When you do get to producing actual test output (which everyone still does, of course), often it is correct.

This is what has happened to me so far in the development of the example. I had mistakes, but converging to a correct version was a straightforward process of examining violations of invariant violations and other contract elements, and fixing the underlying logical problem each time.

By the way, I believe I do have a correct version (in the sense of the second part of the Hoare quote), on the basis not of gut feeling or wishful thinking but of solid evidence. As already noted it is hard to imagine, if the code contains any inconsistencies, a test suite surviving all the checks.

Tests and proofs

Solid evidence, not perfect; hard to imagine, not impossible. Tests remain only tests; they cannot exercise all cases. The only way to achieve demonstrable correctness is to rely on mathematical proofs performed mechanically. We have this too, with the AutoProof proof system for Eiffel, developed in recent years [1]. I cannot overstate my enthusiasm for this work (look up the Web-based demo), its results (automated proof of correctness of a full-fledged data structures and algorithms library [2]) and its potential, but it is still a research effort. The dynamic approach (meaning test-based rather than proof-based) presented above is production technology, perfected over several decades and used daily for large-scale mission-critical applications. Indeed (I know you may be wondering) it scales up without difficulty:

  • The approach is progressive. Unlike fully formal methods (and proofs), it does not require you to write down every single property down to the last quantifier. You can start with simple stuff like x > 0. The more you write, the more you get, but it is the opposite of an all-or-nothing approach.
  • On the practical side, if you are wondering about the consequences on performance of a delivered system: there is none. Run-time contract monitoring is a compilation option, tunable for different kinds of contracts (invariants, postconditions etc.) and different parts of a system. People use it, as discussed here, for development, testing and debugging. Most of the time, when you deliver a debugged system, you turn it off.
  • It is easy to teach. As a colleague once mentioned, if you can write an if-then-else you can write a precondition. Our invariants in the above example where a bit more sophisticated, but programmers do write loops (in fact, the Eiffel loop for iterating over a structure also uses across, with loop and instructions instead of all or some and boolean expressions). If you can write a loop over an array, you can write a property of the array’s elements.
  • A big system is an accumulation of small things. In a blog article [5] I recounted how I lost a full day of producing a series of technical diagrams of increasing complexity, using one of the major Web-based collaborative development tools. A bug of the system caused all the diagrams to reproduce the first, trivial one. I managed to get through to the developers. My impression (no more than an educated guess resulting from this interaction) is that the data structures involved were far simpler than the ones used in the above discussion. One can surmise that even simple invariants would have uncovered the bug during testing rather than after deployment.
  • Talking about deployment and tools used directly on the cloud: the action in software engineering today is in DevOps, a rapid develop-deploy loop scheme. This is where my perplexity becomes utter cluelessness. How can anyone even consider venturing into that kind of exciting but unforgiving development model without the fundamental conceptual tools outlined above?

We are back then to the core question. These techniques are simple, demonstrably useful, practical, validated by years of use, explained in professional books (e.g. [6]), introductory programming textbooks (e.g. [7]), EdX MOOCs (e.g. [8]), YouTube videos, online tutorials at eiffel.org, and hundreds of articles cited thousands of times. On the other hand, most people reading this article are not using Eiffel. On reflection, a simple quantitative criterion does exist to identify the inmates: there are far more people outside the asylum than inside. So the evidence is incontrovertible.

What, then, is wrong with me?

References

(Nurse to psychiatrist: these are largely self-references. Add narcissism to list of patient’s symptoms.)

1.    Ilinca Ciupa, Andreas Leitner, Bertrand Meyer, Manuel Oriol, Yu Pei, Yi Wei and others: AutoTest articles and other material on the AutoTest page.

2. Bertrand Meyer, Ilinca Ciupa, Lisa (Ling) Liu, Manuel Oriol, Andreas Leitner and Raluca Borca-Muresan: Systematic evaluation of test failure results, in Workshop on Reliability Analysis of System Failure Data (RAF 2007), Cambridge (UK), 1-2 March 2007 available here.

3.    Nadia Polikarpova, Ilinca Ciupa and Bertrand Meyer: A Comparative Study of Programmer-Written and Automatically Inferred Contracts, in ISSTA 2009: International Symposium on Software Testing and Analysis, Chicago, July 2009, available here.

4.    Carlo Furia, Bertrand Meyer, Nadia Polikarpova, Julian Tschannen and others: AutoProof articles and other material on the AutoProof page. See also interactive web-based online tutorial here.

5.    Bertrand Meyer, The Cloud and Its Risks, blog article, October 2010, available here.

6.    Bertrand Meyer: Object-Oriented Software Construction, 2nd edition, Prentice Hall, 1997.

7.    Bertrand Meyer: Touch of Class: Learning to Program Well Using Objects and Contracts, Springer, 2009, see touch.ethz.ch and Amazon page.

8.    MOOCs (online courses) on EdX : Computer: Art, Magic, Science, Part 1 and Part 2. (Go to archived versions to follow the courses.)

VN:F [1.9.10_1130]
Rating: 9.9/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +8 (from 10 votes)

Festina retro

We “core” computer scientists and software engineers always whine that our research themes forever prevent us, to the delight of our physicist colleagues but unjustly, from reaching the gold standard of academic recognition: publishing in Nature. I think I have broken this barrier now by disproving the old, dusty laws of physics! Brace yourself for my momentous discovery: I have evidence of negative speeds.

My experimental setup (as a newly self-anointed natural scientist I am keen to offer the possibility of replication) is the Firefox browser. I was downloading an add-on, with a slow connection, and at some point got this in the project bar:

Negative download speed

Negative speed! Questioning accepted wisdom! Nobel in sight! What next, cold fusion?

I fear I have to temper my enthusiasm in deference to more mundane explanations. There’s the conspiracy explanation: the speed is truly negative (more correctly, it is a “velocity”, a vector of arbitrary direction, hence in dimension 1 possibly negative); Firefox had just reversed the direction of transfer, surreptitiously dumping my disk drive to some spy agency’s server.

OK, that is rather far-fetched. More likely, it is a plain bug. A transfer speed cannot be negative; this property is not just wishful thinking but should be expressed as an integral part of the software. Maybe someone should tell Firefox programmers about class invariants.

VN:F [1.9.10_1130]
Rating: 9.6/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

The end of software engineering and the last methodologist

(Reposted from the CACM blog [*].)

Software engineering was never a popular subject. It started out as “programming methodology”, evoking the image of bearded middle-aged men telling you with a Dutch, Swiss-German or Oxford accent to repent and mend your ways. Consumed (to paraphrase Mark Twain) by the haunting fear that someone, somewhere, might actually enjoy coding.

That was long ago. With a few exceptions including one mentioned below, to the extent that anyone still studies programming methodology, it’s in the agile world, where the decisive argument is often “I always say…”. (Example from a consultant’s page:  “I always tell teams: `I’d like a [user] story to be small, to fit in one iteration but that isn’t always the way.’“) Dijkstra did appeal to gut feeling but he backed it through strong conceptual arguments.

The field of software engineering, of which programming methodology is today just a small part, has enormously expanded in both depth and width. Conferences such as ICSE and ESEC still attract a good crowd, the journals are buzzing, the researchers are as enthusiastic as ever about their work, but… am I the only one to sense frustration? It is not clear that anyone outside of the community is interested. The world seems to view software engineering as something that everyone in IT knows because we all develop software or manage people who develop software. In the 2017 survey of CS faculty hiring in the U.S., software engineering accounted, in top-100 Ph.D.-granting universities, for 3% of hires! (In schools that stop at the master’s level, the figure is 6%; not insignificant, but not impressive either given that these institutions largely train future software engineers.) From an academic career perspective, the place to go is obviously  “Artificial Intelligence, Data Mining, and Machine Learning”, which in those top-100 universities got 23% of hires.

Nothing against our AI colleagues; I always felt “AI winter” was an over-reaction [1], and they are entitled to their spring. Does it mean software engineering now has to go into a winter of its own? That is crazy. Software engineering is more important than ever. The recent Atlantic  “software apocalypse” article (stronger on problems than solutions) is just the latest alarm-sounding survey. Or, for just one recent example, see the satellite loss in Russia [2] (juicy quote, which you can use the next time you teach a class about the challenges of software testing: this revealed a hidden problem in the algorithm, which was not uncovered in decades of successful launches of the Soyuz-Frigate bundle).

Such cases, by the way, illustrate what I would call the software professor’s dilemma, much more interesting in my opinion than the bizarre ethical brain-teasers (you see what I mean, trolley levers and the like) on which people in philosophy departments spend their days: is it ethical for a professor of software engineering, every morning upon waking up, to go to cnn.com in the hope that a major software-induced disaster has occurred,  finally legitimizing the profession? The answer is simple: no, that is not ethical. Still, if you have witnessed the actual state of ordinary software development, it is scary to think about (although not to wish for) all the catastrophes-in-waiting that you suspect are lying out there just waiting for the right circumstances .

So yes, software engineering is more relevant than ever, and so is programming methodology. (Personal disclosure: I think of myself as the very model of a modern methodologist [3], without a beard or a Dutch accent, but trying to carry, on today’s IT scene, the torch of the seminal work of the 1970s and 80s.)

What counts, though, is not what the world needs; it is what the world believes it needs. The world does not seem to think it needs much software engineering. Even when software causes a catastrophe, we see headlines for a day or two, and then nothing. Radio silence. I have argued to the point of nausea, including at least four times in this blog (five now), for a simple rule that would require a public auditing of any such event; to quote myself: airline transportation did not become safer by accident but by accidents. Such admonitions fall on deaf ears. As another sign of waning interest, many people including me learned much of what they understand of software engineering through the ACM Risks Forum, long a unique source of technical information on software troubles. The Forum still thrives, and still occasionally reports about software engineering issues, but most of the traffic is about privacy and security (with a particular fondness for libertarian rants against any reasonable privacy rule that the EU passes). Important topics indeed, but where do we go for in-depth information about what goes wrong with software?

Yet another case in point is the evolution of programming languages. Language creation is abuzz again with all kinds of fancy new entrants. I can think of one example (TypeScript) in which the driving force is a software engineering goal: making Web programs safer, more scalable and more manageable by bringing some discipline into the JavaScript world. But that is the exception. The arguments for many of the new languages tend to be how clever they are and what expressive new constructs they introduce. Great. We need new ideas. They would be even more convincing if they addressed the old, boring problems of software engineering: correctness, robustness, extendibility, reusability.

None of this makes software engineering less important, or diminishes in the least the passion of those of us who have devoted our careers to the field. But it is time to don our coats and hats: winter is upon us.

Notes

[1] AI was my first love, thanks to Jean-Claude Simon at Polytechnique/Paris VI and John McCarthy at Stanford.

[2] Thanks to Nikolay Shilov for alerting me to this information. The text is in Russian but running it through a Web translation engine (maybe this link will work) will give the essentials.

[3] This time borrowing a phrase from James Noble.

[*] I am reposting these CACM blog articles rather than just putting a link, even though as a software engineer I do not like copy-paste. This is my practice so far, and it might change since it raises obvious criticism, but here are the reasons: (A) The audiences for the two blogs are, as experience shows, largely disjoint. (B) I like this site to contain a record of all my blog articles, regardless of what happens to other sites. (C) I can use my preferred style conventions.

VN:F [1.9.10_1130]
Rating: 9.9/10 (11 votes cast)
VN:F [1.9.10_1130]
Rating: +7 (from 7 votes)

Blockchains, bitcoin and distributed trust: LASER school lineup complete

The full lineup of speakers at the 2018 LASER summer school on Software for Blockchains, Bitcoin and Distributed Trust is now ready, with the announcement of a new speaker, Primavera De Filippi from CNRS and Harvard on social and legal aspects.

The other speakers are Christian Cachin (IBM), Maurice Herlihy (Brown), Christoph Jentzsch (slock.it), me, Emil Gun Sirer (Cornell) and Roger Wattenhofer (ETH).

The school is the 14th in the LASER series and takes place June 2-10, 2018, on the island of Elba in Italy.

Early-fee registration deadline is February 10. The school’s page is here.

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Devops (the concept, and a workshop announcement)

One of the most significant recent developments in software engineering is the concept of Devops*. Dismissing the idea as “just the latest buzzword” would be wrong. It may be a buzzword but it reflects a fundamental change in the way we structure system development; with web applications in particular the traditional distinctions between steps of development, V&V** and deployment fade out. If you are using Microsoft Word, you know or can easily find out the version number; but which version of your search engine are you using?

With the new flexibility indeed come new risks, as when a bug in the latest “devopsed”  version of Google Docs caused me to lose a whole set of complex diagrams irretrievably; an earlier article on this blog (“The Cloud and Its Risks“, October 2010) told the story.

In the new world of continuous integrated development/V&V/deployment, software engineering principles are more necessary than ever, but their application has to undergo a profound adaptation.

With Jean-Michel Bruel (Toulouse), Elisabetta Di Nitto (Milan) and Manuel Mazzara (Innopolis), we are organizing a workshop on the topic, DEVOPS 18, on 5-6 March 2018 near Toulouse. The Call for Papers is available here, with Springer LNCS proceedings. The submission deadline is January 15, but for that date a 2-page extended abstract is sufficient. I hope that the event will help the community get a better grasp of the software engineering techniques and practices applicable to this new world of software development.

Notes

*I know, it’s supposed to be DevOps (I am not a great fan of upper case in the middle of words).
** Validation & Verification.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Split the Root: a little design pattern

Many programs take “execution arguments” which the program users provide at the start of execution. In EiffelStudio you can enter them under Execution -> Execution parameters.

The program can access them through the Kernel Library class ARGUMENTS. Typically, the root class of the system inherits from ARGUMENTS and its creation procedure will include something like

if argument_count /= N then
……..print (“XX expects exactly N arguments: AA, BB, …%N”)
else
……..u := argument (1) ; v := argument (2) ; …
……..“Proceed with normal execution, using u, v, …”
end

where N is the number of expected arguments, XX is the name of the program, and AA, …. are the roles of arguments. u, v, … are local variables. The criterion for acceptance could be “at least N” instead of exactly N. The features argument_count and arguments come from class ARGUMENTS.

In all but trivial cases this scheme (which was OK years ago, in a less sophisticated state of the language) does not work! The reason is that the error branch will fail to initialize attributes. Typically, the “Proceed with…” part in the other branch is of the form

               attr1 := u
                attr2 := v
                …
                create obj1.make (attr1, …)
                create obj2.make (attr2, …)
                “Work with obj1, obj2, …”

If you try to compile code of this kind, you will get a compilation error:

Compiler error message

Eiffel is void-safe: it guarantees that no execution will ever produce null-pointer dereference (void call). To achieve this guarantee, the compiler must make sure that all attributes are “properly set” to an object reference (non-void) at the end of the creation procedure. But the error branch fails to initialize obj1 etc.

You might think of replacing the explicit test by a precondition to the creation procedure:

               require
                                argument_count = N

but that does not work; the language definition explicit prohibits preconditions in a root creation procedure. The Ecma-ISO standard (the official definition of the language, available here) explains the reason for the corresponding validity rule (VSRP, page 32):

A routine can impose preconditions on its callers if these callers are other routines; but it makes no sense to impose a precondition on the external agent (person, hardware device, other program…) that triggers an entire system execution, since there is no way to ascertain that such an agent, beyond the system’s control, will observe the precondition.

The solution is to separate the processing of arguments from the rest of the program’s work. Add a class CORE which represents the real core of the application and separate it from the root class, say APPLICATION. In APPLICATION, all the creation procedure does is to check the arguments and, if they are fine, pass them on to an instance of the core class:

                note
                                description: “Root class, processes execution arguments and starts execution”
                class APPLICATION create make feature
                                core: CORE
                                                — Application’s core object
                                make
……..……..……..……..……..……..— Check arguments and proceed if they make sense.
                                                do
                                                             if argument_count /= N then
                                                                                print (“XX expects exactly N arguments: AA, BB, …%N”)
                                                                else
                                                                                create core.make (argument (1), argument (2) ; …)
                                                                                                — By construction the arguments are defined!
                                                                                core.live
                                                                                                — Perform actual work
                                                                                               — (`live’ can instead be integrated with `make’ in CORE.)

                                                                end
                                                end
                 end
 
We may call this little design pattern “Split the Root”. Nothing earth-shattering; it is simply a matter of separating concerns (cutting off the Model from the View). It assumes a system that includes text-based output, whereas many applications are graphical. It is still worth documenting, for two reasons.

First, in its own modest way, the pattern is useful for simple programs; beginners, in particular, may not immediately understand why the seemingly natural way of processing and checking arguments gets rejected by the compiler.

The second reason is that Split the Root illustrates the rules that preside over a carefully designed language meant for carefully designed software. At first it may be surprising and even irritating to see code rejected because, in a first attempt, the system’s root procedure has a precondition, and in a second attempt because some attributes are not initialized — in the branch where they do not need to be initialized. But there is a reason for these rules, and once you understand them you end up writing more solid software.

 

VN:F [1.9.10_1130]
Rating: 10.0/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

LASER summer school on software for robotics: last call for registration

Much of the progress in robotics is due to software advances, and software issues remain at the heart of the formidable challenges that remain. The 2017 LASER summer school, held in September in Elba, brings together some of the most prestigious international experts in the area.

The LASER school has established itself as one of the principal forums to discussed advanced software issues. The 2017 school takes place from 9 to 17 September in the idyllic setting of the Hotel del Golfo in Procchio, Elba Island, Italy.

Robotics is progressing at an amazing pace, bringing improvements to almost areas of human activity. Today’s robotics systems rely ever more fundamentally on complex software, raising difficult issues. The LASER 2017 summer school covers both the current state of robotics software technology and open problems. The lecturers are top international experts with both theoretical contributions and major practical achievements in developing robotics systems.
The LASER school is intended for professionals from the industry (engineers and managers) as well as university researchers, including PhD students. Participants learn about the most important software technology advances from the pioneers in the field. The school’s focus is applied, although theory is welcome to establish solid foundations. The format of the school favors extensive interaction between participants and speakers.

We have lined up an impressive roster of speakers from the leading edge of both industry and academia:

Rodolphe Gélin, Aldebaran Robotics
Ashish Kapoor, Microsoft Research
Davide Brugali, University of Bergamo, on Managing software variability in robotic control systems
Nenad Medvidovic, University of Southern California, on Software Architectures of Robotics Systems
Bertrand Meyer, Politecnico di Milano & Innopolis University, on Concurrent Object-Oriented Robotics Software
Issa Nesnas, NASA Jet Propulsion Laboratory, on Experiences from robotic software development for research and planetary flight robots
Hiroshi (“Gitchang”) Okuno, Waseda University & Kyoto University, on Open-Sourced Robot Audition Software HARK: Capabilities and Applications

The school takes place at the magnificent Hotel del Golfo in the Gulf of Procchio, Elba. Along with an intensive scientific program, participants will have time to enjoy the countless natural and cultural riches of this wonderful, history-laden jewel of the Mediterranean.

For more information about the school, the speakers and registration see the LASER site.

VN:F [1.9.10_1130]
Rating: 5.5/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

The perils of feature interaction

One of the most delicate aspects of design is feature interaction. As users, we suffer daily from systems offering features that individually make sense but clash with each other. In my agile book [1] I explained in detail, building on the work of Pamela Zave, why this very problem makes one of the key ideas of agile methods,  the reliance on “user stories” for requirements, worthless and damaging.

A small recent incident reminded me of the perils of feature interaction. I used my Lenovo W540 laptop without power for a short while, then reached a sedentary location and plugged it in. Hence my surprise when, some hours later, it started beeping to alert me that it was running out of battery. The natural reactions — check the outlet and the power cord — had no effect. I found the solution, but just in time: otherwise, including if I had not heard the warning sound, I would have been unable to use the laptop any further. That’s right: I would not have been able to restart the computer at all, even with access to a power outlet, and even though it was perfectly functional and so was its (depleted) battery. The reason is that the problem arose from a software setting, which (catch-22 situation) I could not correct without starting the computer [2].

The only solution would have been to find another, non-depleted battery. That is not a trivial matter if you have traveled with your laptop outside of a metropolis: the W540 has a special battery which ordinary computer shops do not carry [3].

The analysis of what made such a situation possible must start with the list of relevant hardware and software product features.

Hardware:

  • HA. This Lenovo W series includes high-end laptops with high power requirements, which the typical 65-watt airplane power jack does not satisfy.
  • HB. With models prior to the W540, if you tried to connect a running laptop to the power supply in an airplane, it would not charge, and the power indicator would start flickering.  But you could still charge it if you switched it off.
  • HC. The W540 effectively requires 135 watts and will not take power from a 65-watt power source under any circumstances.

Software:

  • SA. The operating system (this discussion assumes Windows) directly reflects HC by physically disabling charging if the laptop is in the “Airplane” power mode.
  • SB. If you disable wireless, the operating system automatically goes into the “Airplane” power mode.
  • SC. In the “Airplane” power mode, the laptop, whether or not connected through a charger to a power outlet of any wattage, will not charge. The charging function is just disabled.
  • SD. One can edit power modes to change parameters, such as time to automatic shutoff, but the no-charging property in Airplane mode is not editable and not even mentioned in the corresponding UI dialog. It seems to be a behind-the-scenes property magically attached to the power-mode name “Airplane”.
  • SE. There is a function key for disabling wireless: F8. As a consequence of SB it also has the effect of switching to “Airplane” mode.
  • SF. Next to F8 on the keyboard is F7.
  • SG. F7 serves to display the screen content on another monitor (Windows calls it a “projector”). F7 offers a cyclic set of choices: laptop only, laptop plus monitor etc.
  • SH. In the old days (like five years ago), such function keys setting important operating system parameters on laptops used to be activated only if you held them together with a special key labeled “Fn”. For some reason (maybe the requirement was considered too complicated for ordinary computer users) the default mode on Lenovo laptops does not use the “Fn” key anymore: you just press the desired key, such as F7 or F8.
  • SI. You can revert to the old mode, requiring pressing “Fn”, by going into the BIOS and performing some not-absolutely-trivial steps, making this possibility the preserve of techies. (Helpfully, this earlier style is called “Legacy mode”, as a way to remind you that your are an old-timer, probably barely graduated from MS-DOS and still using obsolete conventions. In reality, the legacy mode is the right one to use, whether for techies or novices: it is all too easy to hit a function key by mistake and get totally unexpected results. The novice, not the techie, is the one who will be completely confused and panicked as a result. The first thing I do with a new laptop is to go to the BIOS and set legacy mode.)

By now you have guessed what happened in my case, especially once you know that I had connected the laptop to a large monitor and had some trouble getting that display to work. In the process I hit Fn-F7 (feature SG) several times.  I must have mistakenly (SF) pressed F8 instead of F7 at some point. Normally, Legacy mode (SI) should have made me immune to the effects of hitting a function key by mistake, but I did use the neighboring key F7 for another purpose. Hitting F8 disabled wireless (SE) and switched on Airplane power mode (SB). At that point the laptop, while plugged in correctly, stopped charging (SC, SD).

How did I find out? Since I was looking for a hardware problem I could have missed the real cause entirely and ended up with a seemingly dead laptop. Fortunately I opened the Power Options dialog to see what it said about the battery. I noticed that among the two listed power plans the active one was not “Power Saver”, to which I am used, but “Airplane”. I did not immediately pay  attention to that setting; since I had not used the laptop for a while I just thought that maybe the last time around I had switched on “Airplane”, even though that made little sense since I was not even aware of the existence of that option. After trying everything else, though, I came back to that intriguing setting, changed to the more usual “Power Saver”, and the computer started to charge again. I was lucky to have a few percent of battery still left at that point.

Afterwards I found a relevant discussion thread on a Lenovo user forum.

As is often the case in such feature-interaction mishaps, most of the features make sense individually [4]. What causes trouble is some unforeseen combination of features.

There is no sure way to avoid such trouble, but there is a sure way to cause it: design a system feature by feature, as with user stories in agile development. The system must do this and it must do that. Oh, by the way, it must also do that. And that. User stories have one advantage: everyone understands them. But that is also their limitation. Good requirements and design require professionals who can see the whole beyond the parts.

A pernicious side of this situation is that many people believe that use cases and user stories are part of object-oriented analysis, whereas the OO approach to requirements and design is the reverse: rise above individual examples to uncover the fundamental abstractions.

As to my laptop, it is doing well, thanks. And I will be careful with function keys.

Reference and notes

[1] Bertrand Meyer: Agile! The Good, the Hype and the Ugly, Springer, 2014,  Amazon page: here, book page: here. A description of the book appeared here on this blog at the time of publication.

[2] Caveat: I have not actually witnessed this state in which a plugged-in laptop will not restart. The reason is simply that I do not have an alternate battery at the moment so I cannot perform the experiment with the almost certain result of losing the use of my laptop. I will confirm the behavior as soon as I have access to a spare battery.

[3] It has been my systematic experience over the past decade and a half that Lenovo seems to make a point, every couple of years, to introduce new models with incompatible batteries and docking stations. (They are also ever more incredibly bulky, with the one for the W540 almost as heavy as the laptop itself. On the other hand the laptops are good, otherwise I would not be bothering with them.)

[4] One exception here is feature SB: switching wireless off does not necessaril y mean you want to select a specific power mode! It is a manifestation of the common syndrome  of software tools that think they are smarter than you, and are not. Another exception is SE: to let a simple key press change fundamental system behavior is to court disaster. But I had protected myself by using legacy mode and was hit anyway.

VN:F [1.9.10_1130]
Rating: 7.3/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

AutoProof workshop: Verification As a Matter of Course

The AutoProof technology pursues the goal of “Verification As a Matter Of Course”, integrated into the EVE development environment. (The AutoProof  project page here; see particularly the online interactive tutorial.) A one-day workshop devoted to the existing AutoProof and current development will take place on October 1 near Toulouse in France. It is an informal event (no proceedings planned at this point, although based on the submissions we might decide to produce a volume), on a small scale, designed to bring together people interested in making the idea of practical verification a reality.

The keynote will be given by Rustan Leino from Microsoft Research, the principal author of the Boogie framework on which the current implementation of AutoProof relies.

For submissions (or to attend without submitting) see the workshop page here. You are also welcome to contact me for more information.

VN:F [1.9.10_1130]
Rating: 5.3/10 (15 votes cast)
VN:F [1.9.10_1130]
Rating: -2 (from 6 votes)

Robotics and concurrency

Many robotics applications are by nature concurrent; in his ongoing PhD work, Andrey Rusakov [1] is building a comprehensive concurrent robot programming framework, Roboscoop [2], based on the SCOOP model for simple concurrent object-oriented programming [3] and the Ros operating system. As part of this work it is important to know how much robotics applications use concurrency, stay away from concurrency — perhaps because programmers are afraid of the risks — and could benefit from more concurrency. Rusakov has prepared a questionnaire [4] to help find out. If you have experience in robot programming, please help him by answering the questionnaire, which takes only a few minutes.

References

[1] Rusakov’s home page here.

[2] Roboscoop project page, here,

[3] Simple Concurrent Object-Oriented Programming, see here.

[4] The questionnaire is here.

VN:F [1.9.10_1130]
Rating: 5.4/10 (18 votes cast)
VN:F [1.9.10_1130]
Rating: -5 (from 5 votes)

Design by Contract: ACM Webinar this Thursday

A third ACM webinar this year (after two on agile methods): I will be providing a general introduction to Design by Contract. The date is this coming Thursday, September 17, and the time is noon New York (18 Paris/Zurich, 17 London, 9 Los Angeles, see here for hours elsewhere). Please tune in! The event is free but requires registration here.

VN:F [1.9.10_1130]
Rating: 5.8/10 (19 votes cast)
VN:F [1.9.10_1130]
Rating: -4 (from 8 votes)

New paper: Theory of Programs

Programming, wrote Dijkstra many years ago, is a branch of applied mathematics. That is only half of the picture: the other half is engineering, and this dual nature of programming is part of its attraction.

Descriptions of the mathematical side are generally, in my view, too complicated. This article [1] presents a mathematical theory of programs and programming based on concepts taught in high school: elementary set theory. The concepts covered include:

  • Programming.
  • Specification.
  • Refinement.
  • Non-determinism.
  • Feasibility.
  • Correctness.
  • Programming languages.
  • Kinds of programs: imperative, functional, object-oriented.
  • Concurrency (small-step and large-step)
  • Control structures (compound, if-then-else and Dijkstra-style conditional, loop).
  • State, store and environment.
  • Invariants.
  • Notational conventions for building specifications and programs incrementally.
  • Loop invariants and variants.

One of the principal ideas is that a program is simply the description of a mathematical relation. The program text is a rendering of that relation. As a consequence, one may construct programming languages simply as notations to express certain kinds of mathematics. This approach is the reverse of the usual one, where the program text and its programming languages are the starting point and the center of attention: theoreticians develop techniques to relate them to mathematical concepts. It is more effective to start from the mathematics (“unparsing” rather than parsing).

All the results (74 properties expressed formally, a number of others in the text) are derived as theorems from rules of elementary set theory; there are no new axioms whatsoever.

The paper also has a short version [2], omitting proofs and many details.

References

[1] Theory of Programs, available here.
[2] Theory of Programs, short version of [1] (meant for quick understanding of the ideas, not for publication), available here.

 

VN:F [1.9.10_1130]
Rating: 5.6/10 (29 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 12 votes)

Framing the frame problem (new paper)

Among the open problems of verification, particularly the verification of object-oriented programs, one of the most vexing is framing: how to specify and verify what programs element do not change. Continuing previous work, this article presents a “double frame inference” method, automatic on both sides the specification and verification sides. There is no need to write frame specifications: they will be inferred from routine postconditions. For verification, the method computes the set of actually changed properties through a “change calculus”, itself based on the previously developed alias calculus.

Some verification techniques, such as Hoare-style proofs, require significant annotation effort and potentially yield full functional verification; others, such as model checking and abstract interpretation, have more limited goals but seek full automation. Framing, in my opinion, should be automatic, freeing the programmer-verifier to devote the annotation effort to truly interesting properties.

Reference

[1] Bertrand Meyer: Framing the Frame Problem, in Dependable Software Systems, Proceedings of August 2014 Marktoberdorf summer school, eds. Alexander Pretschner, Manfred Broy and Maximilian Irlbeck, NATO Science for Peace and Security, Series D: Information and Communication Security, Springer, 2015 (to appear), pages 174-185; preprint available here.

VN:F [1.9.10_1130]
Rating: 6.0/10 (18 votes cast)
VN:F [1.9.10_1130]
Rating: -1 (from 9 votes)