Archive for the ‘Reading notes’ Category.

## The power and terror of imagination

Reading notes. From: Quelques éléments d’histoire des nombres négatifs (Elements of a history of negative numbers) by Anne Boyé, Proyecto Pénélope, 2002, revision available here; On Solving Equations, Negative Numbers, and Other Absurdities: Part II by Ralph Raimi, available  here; Note sur l’histoire des nombres entiers négatifs (Note on the History of Negative Numbers) by Rémi Lajugie, 2016, hereThe History of Negative Numbers by Leo Rogers, here; Historical Objections against the Number Line, by Albrecht Heeffer, here; Making Sense of Negative Numbers by Cecilia Kilhamn, 2011 PhD thesis at the University of Gothenburg, here.  Also the extensive book by Gert Schubring on Number Concepts Underlying the Development of Analysis in 17-19th Century France and Germany, here. Translations are mine (including from Maclaurin and De Morgan, retranslated from Lajugie’s and Boyé’s French citations). This excursion was spurred by a side remark in the article How to Take Advantage of the Blur Between the Finite and the Infinite by the recently deceased mathematician Pierre Cartier, available here.

At dinner recently, with non-scientists, discussion revolved about ages and a very young child, not even able to read yet, volunteered about his forthcoming little brother that “when he comes out his age will be zero”. An adult remarked “indeed, and right now his age is minus five months”, which everyone young and old seemingly found self-evident. How remarkable!

### From a elite concept to grade school topic

It is a characteristic of potent advances in human understanding that for a while they are understandable to a few geniuses only, or, if not geniuses, to a handful of forward-thinking luminaries, and a generation later, sometimes less, they are taught in grade school. When I came across object-oriented programming, those of us who had seen the light, so to speak, were very few. Feeling very much like plotting Carbonari, we would excitedly meet once in a while in exotic locations (for my Simula-fueled band usually in Scandinavia, although for the Smalltalk crowd it must have been California) to share our shared passion and commiserate about the decades it would take for the rest of humankind to see the truth. Then at some point, almost overnight, without any noticeable harbinger, the whole thing exploded and from then on it was object-oriented everything. Nowadays every beginning programmer talks objects — I did not write “understands”, they do not, but that will be for another article.

Zero too was a major invention. Its first recorded use as a number (not just a marker for absent entities) was in India in the first centuries of our era. It is not hard to imagine the mockeries. “Manish here has twenty sheep, Rahul has twelve sheep, and look at that nitwit Shankar, he sold all his sheep and still claims he has some, zero of them he says! Can you believe the absurdity? Ha ha ha.”

That dialog is imaginary, but for another momentous concept, negative numbers, we have written evidence of the resistance. From the best quarters!

### The greatest minds on the attack

The great Italian mathematician Cardan (Gerolamo Cardano), in his Ars Magna from 1545, was among the skeptics. As told in a 1758 French History of Mathematics by Montucla (this quote and the next few ones are from Boyé):

In his article 7 Cardan proposes an equation which in our language would be x2 + 4 x = 21 and observes that the value of x can equally be +3 or -7, and that by changing the sign of the second term it becomes -3 or +7. The name he gives to such values is “fake”.

The words I am translating here as “fake values” are, in Montucla, valeurs feintes, where feint in French means feigned, or pretended (“pretend values”). Although I have not seen the text of Ars Magna, which is in Latin anyway, I like to think that Cardan was thinking of the Italian word finto. (Used for example  in the title of an opera composed by Mozart at the age of 19, La Finta Giardinera, the fake girl gardener — English has no feminine for “gardener”. The false gardenerette in question is a disguised marchioness.) It is fun to think of negative roots as feigned.

Cardan also uses terms like “abundant” versus “failing” quantities (abondantes and défaillantes in French texts) for positive and negative:

Simple advice: do not confuse failing quantities with abundant quantities. One must add the abundant quantities between themselves, also subtract failing quantities between themselves, and subtract failing quantities from abundant quantities but only by taking species into account, that is to say, only operate same with same […]

There is a recognition of negative values, but with a lot of apprehension. Something strange, the author seems to feel, is at play here. Boyé cites the precedent of Chinese accountants who could manipulate positive values through black sticks and negative ones through red sticks and notes that it resembles what Cardan seems to be thinking here. In the fifteenth century, Nicolas Chuquet “used negative numbers as exponents but referred to them as `absurd numbers’”.

For all his precautions, Cardan did consider negative quantities. No lesser mind as Descartes, a century later (La Géométrie, 1637), is more circumspect. In discussing roots of equations he writes:

Often it turns out that some of those roots are false, or less than nothing [“moindres que rien”] as if one supposes that x can also denote the lack of a quantity, for example 5, in which case we have x + 5 = 0, which, if we multiply it by x3 − 9 x2  + 26 x − 24 = 0 yields  x4 − 4 x3 − 19 x2 + 106 x − 120 = 0, an equation for which there are four roots, as follows: three true ones, namely 2, 3, 4, and a false one, namely 5.

Note the last value: “5”. Not a -5, but a 5 dismissed as “false”. The list of exorcising adjectives continues to grow: negative values are no longer “failing”, or “fake”, or “absurd”, now for Descartes they are “false”!  To the modern mind they are neither more nor less true than the “true” ones, but to him they are still hot potatoes, to be handled with great suspicion.

### Carnot cannot take the heat

One more century later we are actually taking a step back with Lazare Carnot. Not the one of the thermodynamic cycle — that would be his son, as both were remarkable mathematicians and statesmen. Lazare in 1803 cannot hide his fear of negative numbers:

If we really were to obtain a negative quantity by itself, we would have to deduct an effective quantity from zero, that is to say, remove something from nothing : an impossible operation. How then can one conceived a negative quantity by itself?.

(Une quantité négative isolée : an isolated negative quantity, meaning a negative quantity considered in isolation). How indeed! What a scary thought!

The authors of all these statements, even when they consider negative values, cannot bring themselves to talk of negative numbers, only of negative quantities. Numbers, of course, are positive: who has ever heard of a shepherd who is guarding a herd of minus 7 lambs? Negative quantities are a slightly crazy concoction to be used only reluctantly as a kind of kludge.

Lajugie mentions another example, mental arithmetic: to compute 19 x 31  in your head, it is clever to multiply (20 -1) by (30 + 1), but then as you expand the product by applying the laws of distributivity you get negative values.

### De Morgan too

We move on by three decades to England and Augustus De Morgan, yes, the one who came up with the two famous laws of logic duality. De Morgan in 1803, as cited by Raimi:

8-3 is easily understood; 3 can be taken from 8 and the remainder is 5; but 3-8 is an impossibility; it requires you to take from 3 more than there is in 3, which is absurd. If such an expression as 3-8 should be the answer to a problem, it would denote either that there was some absurdity inherent in the problem itself, or in the manner of putting it into an equation.

Raimi points out that “De Morgan is not naïve” but wants to caution students about possible errors. Maybe, but we are back to fear and to words such as “absurd”, as used by Chuquet three centuries before. De Morgan (cited by Boyé) doubles down in his reluctance to accept negatives as numbers:

0 − a is just as inconceivable as -a.

Here is an example. A father is 56 years old and his [son] is 29 years old. In how many years will the father’s age be twice his son’s age? Let x be that number of years; x satisfies 56 +x = 2 (29 + x). We find x = -2.

Great, we say, he got it! This simple result is screaming at De Morgan but he has to reject it:

This result is absurd. However if we change x into -x and correspondingly resolve 56−x = 2 (29−x), we find x = -2. The [previous] negative answer shows that we had made an error in the initial phrasing of the equation.

In other words, if you do not like the solution, change the problem! I too can remember a few exam situations in which I would have loved to make an equation more sympathetic by replacing a plus sign with a minus. Too bad no one told me I could.

De Morgan’s comment is remarkable as the “phrasing of  the equation” contained no “error” whatsoever.   The equation correctly reflected the problem as posed. One could find the statement of the problem mischievous (“in how many years” suggests a solution in the future whereas there is only one in the past), but the equation is meaningful and  has a solution — one, however, that horrifies De Morgan. As a result, when discussing the quadratic (second-degree) equation ax2 + bx + c = 0, instead of accepting that a, b and c can be negative, he distinguishes no fewer than 6 cases, such as ax2 – bx + c = 0, ax2 + bx – c = 0 etc. The coefficients are always non-negative, it is the operators that change between + and  -. As a consequence, the determinant actually has two possible values, the one familiar to us, b2 – 4ac, but also b2 + 4ac for some of the cases. According to Raimi, many American textbooks of the 19th century taught that approach, forcing students to remember all six cases. (For a report about a current teaching distortion of the same topic, see a recent article on the present blog, “Mathematics Is Not a Game of Hit and Miss”, here.)

De Morgan (cited here by Boyé) feels the need to turn this reluctance to use negative numbers into a general rule:

When the answer to a problem is negative, by changing the sign of x in the equation that produced the result, we can discover that an error was made in the method that served to form this equation, or show that the question asked by the problem is too limited.

Sure! It is no longer “if the facts do not fit the theory, change the facts” (a sarcastic definition of bad science), but also “if you do not like the solution, change the problem”. All the more unnecessary (to a modern reader, who thanks to the work of countless mathematicians over centuries learned negative numbers in grade school, and does not spend time wondering whether they mean something) that if we keep the original problem the computed solution, x = -2, makes perfect sense: the father was twice his son’s age two years ago. The past is a negative future. But to see things this way, and to accept that there is nothing fishy here, requires a mindset for which an early 19-th century mathematician was obviously not ready.

### And Pascal, and Maclaurin

Not just a mathematician but a great mathematical innovator. What is remarkable in all such statements against negative numbers is that they do not emanate from little minds, unable to grasp abstractions. Quite the contrary! These negative-number-skeptics are outstanding mathematicians. Lajugie gives more examples from the very top. Blaise Pascal in 1670:

Too much truth surprises; I know people who cannot understand that when you deduct 4 from zero, what remains is zero.

(Oh yes?, one is tempted to tell the originator of probability theory, who was fascinated by betting and games of chance: then I put the 4 back and get 4? Quick way to get rich. Give me the address of that casino please.) A friend of Pascal, skeptical about the equality -1 / 1 = 1 / -1: “How could a smaller number be to a larger one as a larger one to a smaller one?”. An English contemporary, John Wallis, one of the creators of infinitesimal calculus — again, not a nitwit! — complains that a / 0 is infinity, but since in a / -1 the denominator is lesser than zero it must follow that a / -1, which is less than zero (since it is negative by the rule of signs), must also be greater than infinity! Nice one actually.

This apparent paradox also bothered the great scientist D’Alembert, the 18-th century co-editor of the Encyclopédie, who resolves it, so to speak, by stating (as cited by Heeffer) that “One can only go from positive to negative through either zero or through infinity”; so unlike Wallis he accepts that 1 / -a is negative, but only because it becomes negative when it passes through infinity. D’Alembert concludes (I am again going after Heeffer) that it is wrong to say that negatives numbers are always smaller than zero. Euler was similarly bothered and similarly looking for explanations through infinity: what does Leibniz’s expansion of 1 / (1 – x)  into 1 + x + x2 + x3 + … become for x = 2? Well, the sum 1 + 2 + 4 + 8 + … diverges, so 1 / -1 is infinity!

We all know the name “Maclaurin” from the eponymous series. Colin Maclaurin  wrote in 1742, decades after Pascal (Boyé):

The use of the negative sign in algebra leads to several consequences that one initially has trouble accepting and has led to ideas that seem not to have any real foundation.

Again the supposed trouble is the absence of an immediately visible connection to everyday reality (a “real foundation”). And again Maclaurin accepts that quantities can be negative, but numbers cannot:

While abstract quantities can be both negative and positive, concrete quantities are not always capable of being the opposite of each other.

(cited by Kilhamn). Apparently Colin’s wife Anne never thought of buying him a Réaumur thermometer (see below) for his birthday.

### Yes, two negatives make a positive

We may note that the authors cited above, and many of their contemporaries, had no issue manipulating negative quantities in some contexts, and to accept the law of signs, brilliantly expressed by the Indian mathematician Brahmagupta  in the early 7th century (not a typo); as cited by Rogers:

A debt minus zero is a debt.
A fortune minus zero is a fortune.
Zero minus zero is a zero.
A debt subtracted from zero is a fortune.
A fortune subtracted from zero is a debt.
The product of zero multiplied by a debt or fortune is zero.
The product of zero multiplied by zero is zero.
The product or quotient of two fortunes is one fortune.
The product or quotient of two debts is one fortune.
The product or quotient of a debt and a fortune is a debt.
The product or quotient of a fortune and a debt is a debt.

That view must have been clear to accountants. Whatever Pascal may have thought, 4 francs removed from nothing do not vanish; they become a debt. What the great mathematicians cited above could not fathom was that there is such a thing as a negative number. You can count up as far as your patience will let you; you can then count down, but you will inevitably stop. Everyone knows that, and even Pascal or Euler have trouble going beyond. (Old mathematical joke: “Do you know about the mathematician who was afraid of negative numbers? He will stop at nothing to avoid them”.)

The conceptual jump that took centuries to achieve was to accept that there are not only negative quantities, but negative numbers: numbers in their own right, not just temporarily  negated positive numbers (that is, the only ones to which we commonly rely in everyday life), prefaced with a minus sign because we want to use them as “debts”, but with the firm intention to move them back to the other side so as to restore their positivity  — their supposed naturalness —  at the end of the computation. We have seen superior minds “stopping at nothing” to avoid that step.

Others were bolder; Schubring has a long presentation of how Fontenelle, an 18-th century French scientist and philosopher who contributed to many fields of knowledge,  made the leap.

### Not everyone may yet get it

While I implied above that today even small children understand the concept, we may note in passing that there may still be people for whom it remains a challenge. Lajugie notes that the Fahrenheit temperature scale frees people from having to think about negative temperatures in ordinary circumstances, but since the 18-th century the (much more reasonable) Réaumur thermometer and Celsius scale goes under as well as above zero, helping people get familiar with negative values as something quite normal and not scary. (Will the US ever switch?)

Maybe the battle is not entirely won.  Thanks to Rogers I learned about the 2018 Lottery Incident in the United Kingdom of Great Britain and Northern Ireland, where players could win by scratching away, on a card, a temperature lower than the displayed figure. Some temperatures were below freezing. The game had to be pulled after less than a week as a result of player confusion. Example complaints included this one from a  23-year-old who was adamant she should have won:

On one of my cards it said I had to find temperatures lower than -8. The numbers I uncovered were -6 and -7 so I thought I had won, and so did the woman in the shop. But when she scanned the card the machine said I hadn’t. I phoned Camelot [the lottery office] and they fobbed me off with some story that -6 is higher – not lower – than -8 but I’m not having it. I think Camelot are giving people the wrong impression – the card doesn’t say to look for a colder or warmer temperature, it says to look for a higher or lower number. Six is a lower number than 8. Imagine how many people have been misled.

Again, quantities versus numbers. As we have seen, she could claim solid precedent for this reasoning. Most people, of course, have figured out that while 8 is greater than 6 (actually, because of that), -6 is greater than -8. But as Lajugie points out the modern, rigorous definition of negative numbers is (in the standard approach) far from the physical intuition (which typically looks like the two-directional line pictured at the beginning of this article, with numbers spreading away from zero towards both the right and the left). The picture helps, but it is only a picture.

### Away from the perceptible world

If we ignore the intuition coming from observing a Réaumur or Celsius thermometer (which does provide a “real world” guide), the early deniers of negative numbers were right that this concept does not directly reflect the experiential understanding of numbers, readily accessible to everyone. The general progress of science, however, has involved moving away from such immediate intuition. Everyday adventures (such as falling on the floor) absolutely do not suggest to us that matter is made of sparse atoms interacting through electrical and magnetic phenomena. This march towards abstraction has guided the evolution of modern science — most strikingly, the evolution of modern mathematics.

Some lament this trend; think of the negative reactions to the so-called “new math”. (Not from me. I was caught by the  breaking of the wave and loved every minute of it.) But there is no going back; in addition, it is well known that some of the initially most abstract mathematical development, initially pursued without any perceived connection with reality, found momentous unexpected applications later on; two famous examples are Minkowski’s space-time formalism, which provided the mathematical framework for specifying relativity, and number-theoretical research about factoring large numbers into primes, which made modern cryptography (and hence e-commerce) possible.

Negative numbers too required abstraction to acquire mathematical activity. That step required setting aside the appeal to intuition and considering the purely concepts solely through its posited properties. We computer scientists would say “applying the abstract data type approach”. The switch took place sometime in the middle of the 19th century, spurred among others by Évariste Galois. The German mathematician Hermann Hankel — who lived only a little longer than Galois — explained clearly how this transition occurred for negative numbers (cited by Boyé among others):

The [concept of] number is no longer today a thing, a substance that is supposed to exist outside of the thinking subject or the objects that lead to it being considered; it is no longer an independent principle, as the Pythagoreans thought. […] The mathematician considers as impossible only that which is logically impossible, in the sense of implying a contradiction. […] But if the numbers under study are logically possible, if the underlying concept is defined clearly and distinctly, the question can no longer be whether a substrate exists in the world of reality.

A very modern view: if you can dream it, and you can make it free of contradiction (well, Hankel lived in the blissful times before Gödel), then you can consider it exists. An engineer might replace the second of these conditions by: if you can build it. And a software engineer, by: if you can compile and run it. In the end it is all the same idea.

### Formally: a general integer is an equivalence class

In modern mathematics, while no one forbids you from clinging for help to some concrete intuition such as the Celsius scale, it is not part of the definition. Negative numbers are formally defined members of the zoo.

For those interested (and not remembering the details), the rigorous definition goes like this. We start from zero-or-positive integers (the set N of “natural” numbers) and consider pairs [a, b] of numbers (as we would do to define rationals, but the sequel quickly diverges). We define an equivalence relation which holds with another pair [a’, b’] if a + b’ = a’ + b. Then we can define the set Z of all integers (positive, zero, negative) as the quotient of N x N by that relation. The intuition if that the characteristic property of an equivalence class, such as [1, 4], [2, 5],  [3, 6]… , is that b – a, the difference between the second and first values, is the same for all pairs: 3 in this example (4 – 1, 5 – 2, 6 – 3 etc.). At least that property holds for b >= a; since we are starting from N, subtraction is defined only in that case. But then if we take that quotient as the definition of Z, we call members of that set “negative”, by pure convention, whenever b < a (if this property holds for one of the pairs in an equivalence class it holds for all of them), and positive if b > a. Zero is obtained for a = b.

We reestablish the connection with our good old natural integers by identifying N with the subset of Z for which b >= a. (This is an informal statement; the correct technical phrasing is that there is a “bijection” — a one-to-one correspondence, in fact an isomorphism — between that subset and N.) So we have plunged, or “embedded”, N into something bigger, to which most of its treasured properties (associativity and commutativity of addition etc.) immediately spread, while some limitations disappear; in particular, unlike in N, we can now subtract any Z integer from any other.

We also get the opposites of numbers as a result: for any m in Z, we can easily prove that there is another one n such that m + n = 0. That n can be written -m. The property is true for both positive and negative numbers, concepts that are also easy to define: we show that “>” is one of those operations that extend from N to Z, and the positive numbers are those m such that m > 0. Then if m is positive -m is negative, and conversely; 0 is the only number for which m = -m.

Remarkably, Z too is in one-to-one correspondence with N. (It is one of the definitions of an infinite set that it can be in one-to-one correspondence with one of its strict subsets, something that is obviously not possible for a finite set. To shine in cocktail parties you can refer to this property as “Dedekind-infinite”.) In other words, we have uncovered yet a new attraction of Hilbert’s Grand Hotel: the hotel has an annex, ready for the case of a guest coming with an unannounced companion. The companion will be hosted in the annex, in a room uniquely paired with the original guest’s room. The annex is a second hotel, but it is not exactly like the first: it does not have an annex of its own in the form of yet another hotel. It does have an annex, but that is the original hotel (the hotel of which it itself is the annex).

If you were not aware of the construction through equivalence classes of pairs and your reaction is “so much ado about so little! I do not need any of this to understand negative numbers and to know that m + -m = 0”, well, maybe, but you are missing part of the story: the observation that even the “natural” numbers are not that natural. Those we can readily apprehend as part of “natural” reality are the ones from 1 to something like 1000,  denoting quantities that we can reasonably count. If you really have extraordinary patience and time make this 1000,000 or even 1 million, that does not change the argument.

Even zero, as noted, took millennia to be recognized as a number. Beyond the numbers that we can readily fathom in relation to our experience at human scale, the set of natural integers is also an intellectual fiction. (Its official construction in the modern mathematical canon is seemingly even more contorted than the extension to Z sketched above: N, in the so-called Zermelo-Fraenkel theory (more pickup lines for cocktail parties!) contains the empty set for 0, and then sets each containing the previous one and a set made of that previous one. It is clearer with symbols: ø, {ø}, {ø, {ø}}, {ø, {ø}, {ø, {ø}}}, ….)

Coming back to negative numbers, Riemann (1861, cited by Schubring) held their construction as a fundamental step in the generalization process that characterizes mathematics, beautifully explaining the process:

The original object of mathematics is the integer number; the field of study increases only gradually. This extension does not happen arbitrarily, however; it is always motivated by the fact that the initially restricted view leads toward a need for such an extension. Thus the task of subtraction requires us to seek such quantities, or to extend our concept of quantity in such a way that its execution is always possible, thus guiding us to the concept of the negative.

### Nature and nurture

The generalization process is also a process of abstraction. The move away from the “natural” and “intuitive” is inevitable to understand negative numbers. All the misunderstandings and fears by great minds, reviewed above, were precisely caused by an exaggerated, desperate attempt to cling to supposedly natural concepts. And we only talked about negative numbers! Similar or worse resistance met the introduction of imaginary and complex numbers (the names themselves reflect the trepidation!), quaternions and other fruitful but artificial creation of mathematics. Millenia before, the Greeks experienced shock when they realized that numbers such as π and the square root of 2 could not be expressed as ratios of integers.

Innovation occurs when someone sets out to disprove a statement of impossibility. (This technique also lies behind one approach to solving puzzles and riddles: you despair that there is no way out; then try to prove that there is no solution. Failing to complete that proof might end up opening for you the path to one.)

Parallels exist between innovators and children. Children do not know yet that some things are impossible; they make up ways. Right now I am sitting next to the Rhine and I would gladly take a short walk on the other bank, but I do not want to go all the way to the bridge and back. If I were 4 years old, I would dream up some magic carpet or other fancy device, inferred from bedtime stories, that would instantly transport me there. We grow up and learn that there are no magic carpets, but true innovators who see an unsolved problem refuse to accept that state of affairs.

In their games, children often use the conditional: “I would be a princess, and you would be a magician!”. Innovators do this too when they refuse to be stopped by conventional-wisdom statements of impossibility. They set out to disprove the statements. The French expression “prouver le mouvement en marchant”, prove movement by walking, refers to the Greek philosophers Diogenes of Sinope and  Zeno of Elea. Zeno, the story goes, used the paradox of Achilles and the tortoise to claim he had proved that movement is impossible. Diogenes proved the reverse by starting to walk.

In mathematics and in computer science, we are even more like children because we can in fact summon our magic carpets — build anything we dream of, provided we can define it properly. Mathematics and computer science are among the best illustrations of Yuval Noah Harari’s thesis that a defining characteristic of the human race is our ability to tell ourselves stories, including very large and complex stories. A mathematical theory is a story that we tell ourselves and to which we can convert other mathematicians (plus, if the theory is really successful, generations of future students). Computer programs are the same with the somewhat lateral extra condition that we must also enable some computing system to execute it, although that system is itself a powerful story that has undergone the same process. You can find variants of these observations in such famous pronouncements as Butler Lampson’s “in computer science, we can solve any problem by introducing an extra level of indirection” and Alan Kay‘s  “the best way to predict the future is to invent it”.

There is a difference, however, with children’s role-playing; and it can have dramatic effects. Children can indulge in make-believe for quite some time, continuing to live their illusions until they grow up and become reasonable. Normally they will not experience bad consequences (well, apart from the child who believes a little too hard, or from a window little too high, that his arms really are wings.) In adult innovation, sooner or later you have to reconcile the products of your imagination with the world. It may be the physical world (your autonomous robot was fantastic in the lab but it requires heavy batteries making it impractical), but things are just as bad with the virtual world of mathematics or software. It is great to define and extend your own freaky artificial worlds, but at some point you have to make sure they are consistent not just with already defined worlds but with themselves. As noted earlier, a mathematical concoction, however audacious, should be free of contradictions; and a software concept, however powerful, should be implementable. (Efficiently implementable.)

By any measure the most breathtaking virtual construction of modern mathematics is Cantor’s set theory, which scared many mathematicians,  the way negative numbers had terrified their predecessors. (Case in point: the editor of a journal to which Cantor had submitted a paper wrote that it was “a hundred years too soon”.  Cantor did not want to wait until 1984. The great mathematician Kronecker described him as “a corrupter of youth”. And so on.) More enlightened colleagues, however, soon recognized the work as ushering in a new era. Hilbert, in particular, was a great supporter, as were many of the top names in several countries. Then intellectual disaster struck.

Cantor himself and others, most famously Russell in a remark included in a letter of Frege, noticed a problem. If sets can contain other sets, and even contain themselves (the set of infinite sets must be infinite), what do we make of the set of all sets that do not contain themselves? Variants of this simple question so shook the mathematical edifice that it took a half-century to put things back in order.

### Dream, check, build

Cantor, for his part, went into depression and illness. He died destitute and desperate. There may not have been a direct cause-to-effect relationship, but certainly the intellectual rejection and crisis did not help.

All the sadder that in the end set theory, after significant cleanup, turned out to be one of the biggest successes of history. We still discuss the paradoxes, but it is unlikely that today they prevent anyone from sleeping soundly at night.

Unlike those genuinely disturbing paradoxes of set theory, the paradoxes that led mathematicians of previous centuries to reject negative numbers were apparent only. They were not paradoxes but tokens of intellectual timidity.

The sole reason for fearing and skirting negative numbers was an inability to accept a construction that contradicted a simplistic view of physical reality. Like object-oriented programming and many other bold advances, all that was required was the audacity to take imagined abstractions seriously.

Dream it; check it; build it.

VN:F [1.9.10_1130]
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

## A problem child?

The latest issue of the New York Review of Books contains a book review by George Stauffer about Alban Berg with this bewildering sentence about Berg’s childhood:

He showed few signs of musical talent as a youth aside from informal piano lessons, reading through the scores of songs and operas, and playing four-hand arrangements of orchestral and chamber works with his sister, Smaragda.

Well, well… “Aside from”? If you had a child who could only read through lots of opera scores and play four-hand arrangements of symphonies, would you immediately get to the logical conclusion that he is devoid of musical talent?

(Sorry about poor Alban, he is the shame of our family, let’s just hope Smaragda won’t turn out to be such an abject musical failure.)

VN:F [1.9.10_1130]
VN:F [1.9.10_1130]

## One way to become a top scientist…

… is to have a top scientist spot your talent and encourage you, however humble your status may be then.

Wikipedia has a terse entry about Dirk Rembrandtsz (with “sz” at the end), presented as a “seventeenth-century Dutch cartographer, mathematician, surveyor, astronomer, teacher and [religious dignitary]” with “more than thirty scientific publications to his name” and various inventions. Seems just like another early scientific career, but digging a bit deeper reveals that the story goes beyond the ordinary.

The reason I looked up Rembrandtsz is that I ran into the following mention in a seminal book about Descartes, by Geneviève Rodis-Lewis (Calmann-Lévy, 1995). I did not know about Rodis-Lewis herself even though I now realize she was an impressive personality with a remarkable if difficult career (there is an entry about her in French Wikipedia). Here is the relevant extract from her book (pages 255-256), part of the story of Descartes’s years in the Netherlands. The translation is mine, as well as comments in brackets.

During the last years of his life in the Netherlands, Descartes had several opportunities to show [his] interest in people of very modest means. Baillet [Descartes’s first biographer, in the 17-th century] did not give the exact date of the first visit of a “peasant from Holland”, a “shoemaker” by trade, who was studying mathematics in books in vulgar language. [That is to say, not in Latin, presumably in Dutch or French.] When he came for the first time to Egmond [Descartes’s residence] and asked to see Descartes, the servants sent him away. Dirk Rembrandtsz “came back two or three months later”, insisting on being brought in. “His external appearance did nothing to help him get a better reception than before.” Descartes was told, however, of the return of this “annoying beggar” who obviously “wanted to talk philosophy with the purpose of getting some alms”. “Descartes sent him some money, which he refused, saying that he hoped that a third journey would be more productive than the first two.” When Descartes heard about this answer, he gave orders to receive him. “Rembrandtsz came back a few months later” and Descartes was able to appreciate “his skill and merit”. He helped him overcome difficulties and shared his method with him. “He added him to the circle of his friends.” Rembrandtsz “became, through studying with Descartes, one of the premier astronomers of this century”.

I find this story moving. The passionate, stubborn autodidact, determined to reach the highest steps in science in spite of miserable circumstances. The rejection by the servants, from instinctive class-based prejudices. The great scientist’s ability to overcome such prejudice and recognize a kindred, noble spirit and his devotion to the pursuit of knowledge. His generosity, his openness, his availability in spite of the many demands on his time. His encouragement to a young, unknown disciple. The numerous encounters which begin as lessons from a master and evolve towards a relationship of peers. And the later success of the aspiring scientist.

VN:F [1.9.10_1130]
VN:F [1.9.10_1130]

## Lampsort

In support of his view of software methodology, Leslie Lamport likes to use the example of non-recursive Quicksort. Independently of the methodological arguments, his version of the algorithm should be better known. In fact, if I were teaching “data structures and algorithms” I would consider introducing it first.

As far as I know he has not written down his version in an article, but he has presented it in lectures; see [1]. His trick is to ask the audience to give a non-recursive version of Quicksort, and of course everyone starts trying to remove the recursion, for example by making the stack explicit or looking for invertible functions in calls. But his point is that recursion is not at all fundamental in Quicksort. The recursive version is a specific implementation of a more general idea.

Lamport’s version — let us call it Lampsort —is easy to express in Eiffel. We may assume the following context:

a: ARRAY [G -> COMPARABLE]        — The array to be sorted.
pivot: INTEGER                                      —  Set by partition.
picked: INTEGER_INTERVAL            — Used by the sorting algorithm, see below.
partition (i, j: INTEGER)
……..require      — i..j is a sub-interval of the array’s legal indexes:
……..……..i < j
……..……..i >= a.lower
……..……..j <= a.upper
……..do
……..……..… Usual implementation of partition
……..ensure     — The expected effect of partition:
……..……..pivot >= i
……..……..pivot < j
……..……..a [i..j] has been reshuffled so that elements in i..pivot are less than
……..……..or equal to those in pivot+1 .. j.
……..end

We do not write the implementation of partition since the point of the present discussion is the overall algorithm. In the usual understanding, that algorithm consists of doing nothing if the array has no more than one element, otherwise performing a partition and then recursively calling itself on the two resulting intervals. The implementation can take advantage of parallelism by forking the recursive calls out to different processors. That presentation, says Lamport, describes only a possible implementation. The true Quicksort is more general. The algorithm works on a set not_sorted of integer intervals i..j such that the corresponding array slices a [i..j] are the only ones possibly not sorted; the goal of the algorithm is to make not_sorted empty, since then we know the entire array is sorted. In Eiffel we declare this set as:

not_sorted: SET [INTEGER_INTERVAL]

The algorithm initializes not_sorted to contain a single element, the entire interval; at each iteration, it removes an interval from the set, partitions it if that makes sense (i.e. the interval has more than one element), and inserts the resulting two intervals into the set. It ends when not_sorted is empty. Here it is:

……..from                                 — Initialize interval set to contain a single interval, the array’s entire index range:
……..…..create not_sorted.make_one (a.lower |..| a.upper)….         ..……..
……..invariant
……..…..— See below
……..until
……..…..not_sorted.is_empty                                                            — Stop when there are no more intervals in set
……..loop
……..…..picked := not_sorted.item                                                     — Pick an interval from (non-empty) interval set.
……..……if picked.count > 1 then                                                      — (The precondition of partition holds, see below.)
……..……..…..partition (picked.lower, picked.upper)                 — Split, moving small items before & large ones after pivot.
……..……..…..not_sorted.extend (picked.lower |..| pivot)            — Insert new intervals into the set of intervals: first
……..……....not_sorted.extend (pivot + 1 |..| picked.upper)     — and second.
……..……end
……..…...not_sorted.remove (picked)                                               — Remove interval that was just partitioned.
…….end

Eiffel note: the function yielding an integer interval is declared in the library class INTEGER using the operator |..| (rather than just  ..).

The query item from SET, with the precondition not is_empty,  returns an element of the set. It does not matter which element. In accordance with the Command-Query Separation principle, calling item does not modify the set; to remove the element you have to use the command remove. The command extend adds an element to the set.

The abstract idea behind Lampsort, explaining why it works at all, is the following loop invariant (see [2] for a more general discussion of how invariants provide the basis for understanding loop algorithms). We call “slice” of an array a non-empty contiguous sub-array; for adjacent slices we may talk of concatenation; also, for slices s and t s <= t means that every element of s is less than or equal to every element of t. The invariant is:

a is the concatenation of the members of a set slices of disjoint slices, such that:
– The elements of a are a permutation of its original elements.
– The index range of any member  of slices having more than one element is in not_sorted.
– For any adjacent slices s and t (with s before t), s <= t.

The first condition (conservation of the elements modulo permutation) is a property of partition, the only operation that can modify the array. The rest of the invariant is true after initialization (from clause) with slices made of a single slice, the full array. The loop body maintains it since it either removes a one-element interval from not_sorted (slices loses the corresponding slice) or performs partition with the effect of partitioning one slice into two adjacent ones satisfying s <= t, whose intervals replace the original one in not_sorted. On exit, not_sorted is empty, so slices is a set of one-element slices, each less than or equal to the next, ensuring that the array is sorted.

The invariant also ensures that the call to partition satisfies that routine’s precondition.

The Lampsort algorithm is a simple loop; it does not use recursion, but relies on an interesting data structure, a set of intervals. It is not significantly longer or more difficult to understand than the traditional recursive version

sort (i, j: INTEGER)
……..require
……..……..i <= j
……..……..i >= a.lower
……..……..j <= a.upper
……..do
……..……if j > i then                    — Note that precondition of partition holds.
……..……..…..partition (i, j)         — Split into two slices s and t such that s <= t.
……..……..…..sort (i, pivot)          — Recursively sort first slice.
……..……..…..sort (pivot+1, j)      — Recursively sort second slice.
……..……end……..…..
……..end

Lampsort, in its author’s view, captures the true idea of Quicksort; the recursive version, and its parallelized variants, are only examples of possible implementations.

I wrote at the start that the focus of this article is Lampsort as an algorithm, not issues of methodology. Let me, however, give an idea of the underlying methodological debate. Lamport uses this example to emphasize the difference between algorithms and programs, and to criticize the undue attention being devoted to programming languages. He presents Lampsort in a notation which he considers to be at a higher level than programming languages, and it is for him an algorithm rather than a program. Programs will be specific implementations guided in particular by efficiency considerations. One can derive them from higher-level versions (algorithms) through refinement. A refinement process may in particular remove or restrict non-determinism, present in the above version of Lampsort through the query item (whose only official property is that it returns an element of the set).

The worldview underlying the Eiffel method is almost the reverse: treating the whole process of software development as a continuum; unifying the concepts behind activities such as requirements, specification, design, implementation, verification, maintenance and evolution; and working to resolve the remaining differences, rather than magnifying them. Anyone who has worked in both specification and programming knows how similar the issues are. Formal specification languages look remarkably like programming languages; to be usable for significant applications they must meet the same challenges: defining a coherent type system, supporting abstraction, providing good syntax (clear to human readers and parsable by tools), specifying the semantics, offering modular structures, allowing evolution while ensuring compatibility. The same kinds of ideas, such as an object-oriented structure, help on both sides. Eiffel as a language is the notation that attempts to support this seamless, continuous process, providing tools to express both abstract specifications and detailed implementations. One of the principal arguments for this approach is that it supports change and reuse. If everything could be fixed from the start, maybe it could be acceptable to switch notations between specification and implementation. But in practice specifications change and programs change, and a seamless process relying on a single notation makes it possible to go back and forth between levels of abstraction without having to perform repeated translations between levels. (This problem of change is, in my experience, the biggest obstacle to refinement-based approaches. I have never seen a convincing description of how one can accommodate specification changes in such a framework without repeating the whole process. Inheritance, by the way, addresses this matter much better.)

The example of Lampsort in Eiffel suggests that a good language, equipped with the right abstraction mechanisms, can be effective at describing not only final implementations but also abstract algorithms. It does not hurt, of course, that these abstract descriptions can also be executable, at the possible price of non-optimal performance. The transformation to an optimal version can happen entirely within the same method and language.

Quite apart from these discussions of software engineering methodology, Lamport’s elegant version of Quicksort deserves to be known widely.

#### References

[1] Lamport video here, segment starting at 0:32:34.
[2] Carlo Furia, Bertrand Meyer and Sergey Velder: Loop invariants: Analysis, Classification and Examples, in ACM Computing Surveys, September 2014, preliminary text here.

VN:F [1.9.10_1130]
VN:F [1.9.10_1130]

It is remarkable that almost half a century after Dijkstra’s goto article, and however copiously and reverently it may be cited, today’s programs (other than in Eiffel) are still an orgy of gotos. There are not called gotos, being described as constructs that break out of a loop or exit a routine in multiple places, but they are gotos all the same. Multiple routine exits are particularly bad since they are in effect interprocedural gotos.

Ian Joyner has just released a simple and cogent summary of why routines should always have one entry and one exit.

#### References

[1] Ian Joyner: Single-entry, single-exit (SESE) heuristic, available here.

VN:F [1.9.10_1130]
VN:F [1.9.10_1130]

## Reading notes: strong specifications are well worth the effort

This report continues the series of ICSE 2013 article previews (see the posts of these last few days, other than the DOSE announcement), but is different from its predecessors since it talks about a paper from our group at ETH, so you should not expect any dangerously delusional,  disingenuously dubious or downright deceptive declaration or display of dispassionate, disinterested, disengaged describer’s detachment.

The paper [1] (mentioned on this blog some time ago) is entitled How good are software specifications? and will be presented on Wednesday by Nadia Polikarpova. The basic result: stronger specifications, which capture a more complete part of program functionality, cause only a modest increase in specification effort, but the benefits are huge; in particular, automatic testing finds twice as many faults (“bugs” as recently reviewed papers call them).

Strong specifications are specifications that go beyond simple contracts. A straightforward example is a specification of a push operation for stacks; in EiffelBase, the basic Eiffel data structure library, the contract’s postcondition will read

item =                                          /A/
count = old count + 1

where x is the element being pushed, item the top of the stack and count the number of elements. It is of course sound, since it states that the element just pushed is now the new top of the stack, and that there is one more element; but it is also  incomplete since it says nothing about the other elements remaining as they were; an implementation could satisfy the contract and mess up with these elements. Using “complete” or “strong” preconditions, we associate with the underlying domain a theory [2], or “model”, represented by a specification-only feature in the class, model, denoting a sequence of elements; then it suffices (with the convention that the top is the first element of the model sequence, and that “+” denotes concatenation of sequences) to use the postcondition

model = <x> + old model         /B/

which says all there is to say and implies the original postconditions /A/.

Clearly, the strong contracts, in the  /B/ style, are more expressive [3, 4], but they also require more specification effort. Are they worth the trouble?

The paper explores this question empirically, and the answer, at least according to the criteria used in the study, is yes.  The work takes advantage of AutoTest [5], an automatic testing framework which relies on the contracts already present in the software to serve as test oracles, and generates test cases automatically. AutoTest was applied to both to the classic EiffelBase, with classic partial contracts in the /A/ style, and to the more recent EiffelBase+ library, with strong contracts in the /B/ style. AutoTest is for Eiffel programs; to check for any language-specificity in the results the work also included testing a smaller set of classes from a C# library, DSA, for which a student developed a version (DSA+) equipped with strong model-based contracts. In that case the testing tool was Microsoft Research’s Pex [7]. The results are similar for both languages: citing from the paper, “the fault rates are comparable in the C# experiments, respectively 6 . 10-3 and 3 . 10-3 . The fault complexity is also qualitatively similar.

The verdict on the effect of strong specifications as captured by automated testing is clear: the same automatic testing tools applied to the versions with strong contracts yield twice as many real faults. The term “real fault” comes from excluding spurious cases, such as specification faults (wrong specification, right implementation), which are a phenomenon worth studying but should not count as a benefit of the strong specification approach. The paper contains a detailed analysis of the various kinds of faults and the corresponding empirically determined measures. This particular analysis is for the Eiffel code, since in the C#/Pex case “it was not possible to get an evaluation of the faults by the original developers“.

In our experience the strong specifications are not that much harder to write. The paper contains a precise measure: about five person-weeks to create EiffelBase+, yielding an “overall benefit/effort ratio of about four defects detected per person-day“. Such a benefit more than justifies the effort. More study of that effort is needed, however, because the “person” in the person-weeks was not just an ordinary programmer. True, Eiffel experience has shown that most programmers quickly get the notion of contract and start applying it; as the saying goes in the community, “if you can write an if-then-else, you can write a contract”. But we do not yet have significant evidence of whether that observation extends to model-based contracts.

Model-based contracts (I prefer to call them “theory-based” because “model” means so many other things, but I do not think I will win that particular battle) are, in my opinion, a required component of the march towards program verification. They are the right compromise between simple contracts, which have proved to be attractive to many practicing programmers but suffer from incompleteness, and full formal specification à la Z, which say everything but require too much machinery. They are not an all-or-nothing specification technique but a progressive one: programmers can start with simple contracts, then extend and refine them as desired to yield exactly the right amount of precision and completeness appropriate in any particular context. The article shows that the benefits are well worth the incremental effort.

According to the ICSE program the talk will be presented in the formal specification session, Wednesday, May 22, 13:30-15:30, Grand Ballroom C.

#### References

[1] Nadia Polikarpova, Carlo A. Furia, Yu Pei, Yi Wei and Bertrand Meyer: What Good Are Strong Specifications?, to appear in ICSE 2013 (Proceedings of 35th International Conference on Software Engineering), San Francisco, May 2013, draft available here.

[2] Bertrand Meyer: Domain Theory: the forgotten step in program verification, article on this blog, see here.

[3] Bernd Schoeller, Tobias Widmer and Bertrand Meyer: Making Specifications Complete Through Models, in Architecting Systems with Trustworthy Components, eds. Ralf Reussner, Judith Stafford and Clemens Szyperski, Lecture Notes in Computer Science, Springer-Verlag, 2006, available here.

[4] Nadia Polikarpova, Carlo Furia and Bertrand Meyer: Specifying Reusable Components, in Verified Software: Theories, Tools, Experiments (VSTTE ‘ 10), Edinburgh, UK, 16-19 August 2010, Lecture Notes in Computer Science, Springer Verlag, 2010, available here.

[5] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, also available here.

[6] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Yi Wei and Emmanuel Stapf: Programs that Test Themselves, in IEEE Computer, vol. 42, no. 9, pages 46-55, September 2009, also available here.

[7] Nikolai Tillman and Peli de Halleux, Pex: White-Box Generation for .NET, in Tests And Proofs (TAP 2008), pp. 134-153.

VN:F [1.9.10_1130]
VN:F [1.9.10_1130]

## Reading notes: the design of bug fixes

To inaugurate the “Reading Notes” series [1] I will take articles from the forthcoming International Conference on Software Engineering. Since I am not going to ICSE this year I am instead spending a little time browsing through the papers, obligingly available on the conference site. I’ll try whenever possible to describe a paper before it is presented at the conference, to alert readers to interesting sessions. I hope in July and August to be able to do the same for some of the papers to be presented at ESEC/FSE [2].

Please note the general disclaimer [1].

The Design of Bug Fixes [3] caught my attention partly for selfish reasons, since we are working, through the AutoFix project [3], on automatic bug fixing, but also out of sheer interest and because I have seen previous work by some of the authors. There have been article about bug patterns before, but not so much is known with credible empirical evidence about bug fixes (corrections of faults). When a programmer encounters a fault, what strategies does he use to correct it? Does he always produce the best fix he can, and if so, why not? What is the influence of the project phase on such decisions (e.g. will you fix a bug the same way early in the process and close to shipping)? These are some of the questions addressed by the paper.

The most interesting concrete result is a list of properties of bug fixes, classified along two criteria: nature of a fix (the paper calls it “design space”), and reasoning behind the choice of a fix. Here are a few examples of the “nature” classification:

• Data propagation: the bug arises in a component, fix it in another, for example a library class.
• More or less accuracy: are we fixing the symptom or the cause?
• Behavioral alternatives: rather than directly correcting the reported problem, change the user-experienced behavior (evoking the famous quip that “it’s not a bug, it’s a feature”). The authors were surprised to see that developers (belying their geek image) seem to devote a lot of effort trying to understand how users actually use the products, but also found that even so developers do not necessarily gain a solid, objective understanding of these usage patterns. It would be interesting to know if the picture is different for traditional locally-installed products and for cloud-based offerings, since in the latter case it is possible to gather more complete, accurate and timely usage data.

On the “reasoning” side, the issue is why and how programmers decide to adopt a particular approach. For example, bug fixes tend to be more audacious (implying redesign if appropriate) at the beginning of a project, and more conservative as delivery nears and everyone is scared of breaking something. Another object of the study is how deeply developers understand the cause rather than just the symptom; the paper reports that 18% “did not have time to figure out why the bug occurred“. Surprising or not, I don’t know, but scary! Yet another dimension is consistency: there is a tension between providing what might ideally be the best fix and remaining consistent with the design decisions that underlie a software system throughout its architecture.

I was more impressed by the individual categories of the classification than by that classification as a whole; some of the categories appear redundant (“interface breakage“, “data propagation” and “internal vs external“, for example, seem to be pretty much the same; ditto for “cause understanding” and “accuracy“). On the other hand the paper does not explicitly claim that the categories are orthogonal. If they turn this conference presentation into a journal article I am pretty sure they will rework the classification and make it more robust. It does not matter that it is a bit shaky at the moment since the main insights are in the individual kinds of fix and fix-reasoning uncovered by the study.

The authors are from Microsoft Research (one of them was visiting faculty) and interviewed numerous programmers from various Microsoft product groups to find out how they fix bugs.

The paper is nicely written and reads easily. It includes some audacious syntax, as in “this dimension” [internal vs external] “describes how much internal code is changed versus external code is changed as part of a fix“. It has a discreet amount of humor, some of which may escape non-US readers; for example the authors explain that when approaching programmers out of the blue for the survey they tried to reassure them through the words “we are from Microsoft Research, and we are here to help“, a wry reference to the celebrated comment by Ronald Reagan (or his speechwriter) that the most dangerous words in the English language are “I am from the government, and I am here to help“. To my taste the authors include too many details about the data collection process; I would have preferred the space to be used for a more detailed discussion of the findings on bug fixes. On the other hand we all know that papers to selective conferences are written for referees, not readers, and this amount of methodological detail was probably the minimum needed to get past the reviewers (by avoiding the typical criticism, for empirical software engineering research, that the sample is too small, the questions biased etc.). Thankfully, however, there is no pedantic discussion of statistical significance; the authors openly present the results as dependent on the particular population surveyed and on the interview technique. Still, these results seem generalizable in their basic form to a large subset of the industry. I hope their publication will spawn more detailed studies.

According to the ICSE program the paper will be presented on May 23 in the Debugging session, 13:30 to 15:30.

#### Notes and references

[2] European Software Engineering Conference 2013, Saint Petersburg, Russia, 18-24 August, see here.

[3] Emerson Murphy-Hill, Thomas Zimmerman, Christian Bird and Nachiappan Nagapan: The Design of Bug Fixes, in ICSE 2013, available here.

[4] AutoFix project at ETH Zurich, see project page here.

[5] Ronald Reagan speech extract on YouTube.

VN:F [1.9.10_1130]
VN:F [1.9.10_1130]