Archive for the ‘Software verification’ Category.

The theory and calculus of aliasing

In a previous post I briefly mentioned some work that I am doing on aliasing. There is a draft paper [1], describing the theory, calculus, graphical notation (alias diagrams) and implementation. Here I will try to give an idea of what it’s about, with the hope that you will be intrigued enough to read the article. Even before you read it you might try out the implementation[2], a simple interactive interface with all the examples of the article .

What the article does not describe in detail — that will be for a companion paper— is how the calculus will be used as part of a general framework for developing object-oriented software proved correct from the start, the focus of our overall  “Trusted Components” project’ [3]. Let me simply state that the computation of aliases is the key missing step in the effort to make correctness proofs a standard part of software development. This is a strong claim which requires some elaboration, but not here.

The alias calculus asks a simple question: given two expressions representing references (pointers),  can their values, at a given point in the program, ever reference the same object during execution?

As an example  application, consider two linked lists x and y, which can be manipulated with operations such as extend, which creates a new cell and adds it at the end of the list:

lists

 The calculus makes it possible to prove that if  x and y are not aliased to each other, then none of the pointers in any of the cells in either of the lists can point to  (be aliased to) any cell of  the other. If  x is initially aliased to y, the property no longer holds. You can run the proof (examples 18 and 19) in the downloadable implementation.

The calculus gives a set of rules, each applying to a particular construct of the language, and listed below.

The rule for a construct p is of the form

          a |= p      =   a’
 

where a and a’ are alias relations; this states that executing p  in a state where the alias relation is a will yield the alias relation a’ in the resulting state. An alias relation is a symmetric, irreflexive relation; it indicates which expressions and variables can be aliased to each other in a given state.

The constructs p considered in the discussion are those of a simplified programming language; a modern object-oriented language such as Eiffel can easily be translated into that language. Some precision will be lost in the process, meaning that the alias calculus (itself precise) can find aliases that would not exist in the original program; if this prevents proofs of desired properties, the cut instruction discussed below serves to correct the problem.

The first rule is for the forget instruction (Eiffel: x := Void):

          a |= forget x       =   a \- {x}

where the \- operator removes from a relation all the elements belonging to a given set A. In the case of object-oriented programming, with multidot expressions x.y.z, the application of this rule must remove all elements whose first component, here x, belongs to A.

The rule for creation is the same as for forget:

         a |= create x          =   a \- {x}

The two instructions have different semantics, but the same effect on aliasing.

The calculus has a rule for the cut instruction, which removes the connection between two expressions:

        a |= cut x, y       =   a — <x, y>

where is set difference and <x, y> includes the pairs [x, y] and [y, x] (this is a special case of a general notation defined in the article, using the overline symbol). The cut   instruction corresponds, in Eiffel, to cut   x /=end:  a hint given to the alias calculus (and proved through some other means, such as standard axiomatic semantics) that some references will not be aliased.

The rule for assignment is

      a |= (x := y)      =   given  b = a \- {x}   then   <b È {x} x (b / y)}> end

where b /y (“quotient”), similar to an equivalence class in an equivalence relation, is the set of elements aliased to y in b, plus y itself (remember that the relation is irreflexive). This rule works well for object-oriented programming thanks to the definition of the \- operator: in x := x.y, we must not alias x to x.y, although we must alias it to any z that was aliased to x.y.

The paper introduces a graphical notation, alias diagrams, which makes it possible to reason effectively about such situations. Here for example is a diagram illustrating the last comment:

Alias diagram for a multidot assignment

Alias diagram for a multidot assignment

(The grayed elements are for explanation and not part of the final alias relation.)

For the compound instruction, the rule is:

           a |= (p ;  q)      =   (a |= p) |= q)

For the conditional instruction, we get:

           a |= (then p else  q end)      =   (a |= p) È  (a |= q)

Note the form of the instruction: the alias calculus ignores information from the then clause present in the source language. The union operator is the reason why  alias relations,  irreflexive and symmetric, are not  necessarily transitive.

The loop instruction, which also ignores the test (exit or continuation condition), is governed by the following rule:

           a |= (loop p end)       =   tN

where span style=”color: #0000ff;”>N is the first value such that tN = tN+1 (in other words, tN is the fixpoint) in the following sequence:

            t0          =    a
           tn+1       =   (tn È (tn |= p))     

The existence of a fixpoint and the correctness of this formula to compute it are the result of a theorem in the paper, the “loop aliasing theorem”; the proof is surprisingly elaborate (maybe I missed a simpler one).

For procedures, the rule is

         a |= call p        =   a |= p.body

where p.body is the body of the procedure. In the presence of recursion, this gives rise to a set of equations, whose solution is the fixpoint; again a theorem is needed to demonstrate that the fixpoint exists. The implementation directly applies fixpoint computation (see examples 11 to 13 in the paper and implementation).

The calculus does not directly consider routine arguments but treats them as attributes of the corresponding class; so a call is considered to start with assignments of the form f : = a for every pair of formal and actual arguments f and a. Like the omission of conditions in loops and conditionals, this is a source of possible imprecision in translating from an actual programming language into the calculus, since values passed to recursive activations of the same routine will be conflated.

Particularly interesting is the last rule, which generalizes the previous one to qualified calls of the form x. f (…)  as they exist in object-oriented programming. The rule involves the new notion of inverse variable, written x’ where x is a variable. Laws of the calculus (with Current denoting the current object, one of the fundamental notions of object-oriented programming) are

        Current.x            = x   
        x.Current            = x
        x.x’                      = Current
        x’.x                      = Current

In other words, Current plays the role of zero element for the dot operator, and variable inversion is the inverse operation. In a call x.f, where x denotes the supplier object (the target of the call), the inverse variable provides a back reference to the client object (the caller), indispensable to interpret references in the original context. This property is reflected by the qualified client rule, which uses  the auxiliary operator n (where x n a, for a relation a and a variable x, is the set of pairs [x.u, y.v] such that the pair [u, v] is in a). The rule is:

         a |= call x.r       =   x n ((x’ n a ) |= call r)

You need to read the article for the full explanation, but let me offer the following quote from the corresponding section (maybe you will note a familiar turn of phrase):

Thus we are permitted to prove that the unqualified call creates certain aliasings, on the assumption that it starts in its own alias environment but has access to the caller’s environment through the inverted variable, and then to assert categorically that the qualified call has the same aliasings transposed back to the original environment. This change of environment to prove the unqualified property, followed by a change back to the original environment to prove the qualified property, explains well the aura of magic which attends a programmer’s first introduction to object-oriented programming.

I hope you will enjoy the calculus and try the examples in the implementation. It is fun to apply, and there seems to be no end to the potential applications.

Reference

[1] Bertrand Meyer: The Theory and Calculus of Aliasing, draft paper, first published 12 January 2009 (revised 21 January 2010), available here and also at arxiv.org.
[2] Implementation (interactive version to try all the examples of the paper): downloadable Windows executable here.
[3] Bertrand Meyer: The Grand Challenge of Trusted Components, in 2003 International Conference on Software Engineering, available here.

VN:F [1.9.10_1130]
Rating: 7.9/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +7 (from 7 votes)

Dwelling on the point

Once again, and we are not learning!

La Repubblica of last Thursday [1] and other Italian newspapers have reported on a “computer” error that temporarily brought thousands of accounts at the national postal service bank into the red. It is a software error, due to a misplacement of the decimal points in some transactions.

As usual the technical details are hazy; La Repubblica writes that:

Because of a software change that did not succeed, the computer system did not always read the decimal point during transactions”.

As a result, it could for example happen that a 15.00-euro withdrawal was understood as 1500 euros.
I have no idea what “reading the decimal point ” means. (There is no mention of OCR, and the affected transactions seem purely electronic.) Only some of the 12 million checking or “Postamat” accounts were affected; the article cites a number of customers who could not withdraw money from ATMs because the system wrongly treated their accounts as over-drawn. It says that this was the only damage and that the postal service will send a letter of apology. The account leaves many questions unanswered, for example whether the error could actually have favored some customers, by allowing them to withdraw money they did not have, and if so what will happen.

The most important unanswered question is the usual one: what was the software error? As usual, we will probably never know. The news items will soon be forgotten, the postal service will somehow fix its code, life will go on. Nothing will be learned; the next time around similar causes will produce similar effects.

I criticized this lackadaisical attitude in an earlier column [2] and have to hammer its conclusion again: any organization using public money should be required, when it encounters a significant software malfunction, to let experts investigate the incident in depth and report the results publicly. As long as we keep forgetting our errors we will keep repeating them. Where would airline safety be in the absence of thorough post-accident reports? That a software error did not kill anyone is not a reason to ignore it. Whether it is the Italian post messing up, a US agency’s space vehicle crashing on the moon or any other software fault causing systems to fail, it is not enough to fix the symptoms: we must have a professional report and draw the lessons for the future.

Reference

[1] Luisa Grion: Poste in tilt per una virgola — conti gonfiati, stop ai prelievi. In La Repubblica, 26 November 2009, page 18 of the print version. (At the time of writing it does not appear at repubblica.it,  but see  the TV segment also titled “Poste in tilt per una virgola” on Primocanale Web TV here, and other press articles e.g. in Il Tempo here.)

[2] On this blog: The one sure way to advance software engineering (post of 21 August 2009).

VN:F [1.9.10_1130]
Rating: 9.5/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 5 votes)

The one sure way to advance software engineering

Airplanes today are incomparably safer than 20, 30, 50 years ago: 0.05 deaths per billion kilometers. That’s not by accident.

Rather, it’s by accidents. What has turned air travel from a game of chance into one of the safest modes of traveling is the relentless study of crashes and other mishaps. In the US the National Transportation Safety Board has investigated more than 110,000 accidents since it began its operations in 1967. Any accident must, by law, be investigated thoroughly; airplanes themselves carry the famous “black boxes” whose only purpose is to provide evidence in the case of a catastrophe. It is through this systematic and obligatory process of dissecting unsafe flights that the industry has made almost all flights safe.

Now consider software. No week passes without the announcement of some debacle due to “computers” — meaning, in most cases, bad software. The indispensable Risks forum [1] and many pages around the Web collect software errors; several books have been devoted to the topic.

A few accidents have been investigated thoroughly; two examples are Nancy Leveson’s milestone study of the Therac-25 patient-killing medical device [2], and Gilles Kahn’s analysis of the Ariane 5 crash (which Jean-Marc Jézéquel and I used as a basis for our 1997 article [3]). Both studies improved our understanding of software engineering. But these are exceptions. Most of what we have elsewhere is made of hearsay and partial information, and plain urban legends (like the endlessly repeated story about the Venus probe that supposedly failed because a period was typed instead of a comma — most likely a canard).

Software disasters continue; they attract attention when they arise, and inevitably some kind of announcement is made that the problem is being corrected, or that a committee will study the causes; almost as inevitably, that is the last we hear of it. In the latest issue of Risks alone, you can find several examples (such as [4]). In the past months, breakdowns at Skype, Google and Twitter made headlines; we all learned about the failures, but have you seen precise analyses of what actually happened?

As another typical example, we heard a few months ago from the French press that an “IT error” (une erreur informatique) led to overestimating the pensions of about a million people; since (strangely!)  no one was suggesting that they would be asked to pay the money back, the cost to taxpayers will be over 300 million euros. I looked in vain for any follow-up story: what happened? What was the actual error? Were the tools at fault? The quality assurance procedures? The programmers’ qualifications? Or was it a matter of bad deployment? Of erroneous data, and if so, what was the process for validating inputs? And so on. Most likely we will never know.

But we should know. Especially with public money, any such incident should have a post-mortem, with experts called in (surely at a fraction of the cost of the failure) to analyze what happened and produce a public report.

At least this was a public project, for which some disclosure was inevitable. The software engineering community buzzes with unconfirmed reports of huge software-induced errors, that go unreported because they happen in private companies eager to avoid bad publicity. It’s as if we had allowed aircraft manufacturers, decade after decade, to keep mum about accidents. Where then would air travel safety be today?

Progress in software engineering will come from many sources. Research is critical, including on topics which today appear exotic. But if anyone is looking for one practical, low-tech idea that has an iron-clad guarantee of improving software engineering, here it is: pass a law that requires extensive professional analysis of any large software failure.

The details are not so hard to refine. The initiative would probably have to start at the national level; any industrialized country could be the pioneer. (Or what about Europe as whole?) The law would have to define what constitutes a “large” failure; for example it could be any failure that may be software-related and has resulted in loss either of human life or of property beyond a certain threshold, say $50 million. In the latter case, to avoid accusations of government meddling in private matters, the law could initially be limited to cases involving public money; when it has shown its value, it could then be extended to private failures as well. Even with some limitations, such a law would have a tremendous effect. Only with a thorough investigation of software projects gone wrong can we help the majority of projects to go right.

We can no longer afford to let the IT industry get away with covering up its failures. Lobbying for a Software Incident Full Disclosure Law is the single most important step we can take today to make the world’s software better.

Note (2011)

Later articles have come back to the theme discussed here, and there will probably be more in the future as it remains ever current. They can be found by selecting the tag “Advance.

References

[1] Peter G. Neumann, moderator: The Risks Digest Forum on Risks to the Public in Computers and Related Systems, available online (going back to 1985!).

[2] Nancy Leveson: Medical Devices: The Therac-25, extract from her book Safeware: System Safety and Computers, Addison-Wesley, 1995, available online.

[3] Jean-Marc Jézéquel and Bertrand Meyer: Design by Contract: The Lessons of Ariane, in Computer (IEEE), vol. 30, no. 1, January 1997, pages 129-130, also available here.

[4] Monty Solomon: Computer Error Caused Rent Troubles for Public Housing Tenants, in Risks 25.76, 15 August 2009, see here.

[5] Une erreur informatique à 300 millions d’euros, in Le Point, 12 May 2009, available here.

 

VN:F [1.9.10_1130]
Rating: 9.6/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +6 (from 6 votes)

Specifying user interfaces

Many blogs including this one rely on the WordPress software. In previous states of the present page you may have noticed a small WordPress bug, which I find interesting.

“Tags” are a nifty WordPress feature. When you post a message, you can specify one or more informative “tags”. The tags of all messages appear in the right sidebar, each with a smaller or bigger font size depending on the number of messages that specified it. You can click such a tag in the sidebar and get, on the left, a page containing all the associated messages.

Now assume that many posts use a particular tag; in our example it is “Design by Contract”, not unexpected for this blog. Assume further that the tag name is long. It is indeed in this case: 18 characters. As a side note, no problem would arise if I used normal spaces in the name, which would then appear on two or three lines; precisely to avoid this  I use HTML “non-breaking spaces”. This is probably not in the WordPress spirit, but any other long tag without spaces would create the same problem. That problem is a garbled display:

dbc_overflows

The long tag overflows the bluish browser area assigned to tags, producing an ugly effect. This behavior is hard to defend: either the tag should have been rejected as too long when the poster specified it or it should fit in its zone, whether by truncation or by applying a smaller font.

I quickly found a workaround, not nice but good enough: make sure that some short tag  (such as “Hoare”) appears much more often than the trouble-making tag. Since font size indicates the relative frequency of tags, the long one will be scaled down to a smaller font which fits.

Minor as it is, this WordPress glitch raises some general questions. First, is it really a bug? Assume, by a wild stretch of imagination, that a jury had to resolve this question; it could easily find an expert to answer positively, by stating that the behavior does not satisfy reasonable user expectations, and another who notes that it is not buggy behavior since it does not appear to violate any expressly stated property of the specification. (At least I did not find in the WordPress documentation any mention of either the display size of tags or a limit on tag length; if I missed it please indicate the reference.)

Is it a serious matter? Not in this particular example; uncomely Web display does not kill.   But the distinction between “small matter of esthetics” and software fault can be tenuous. We may note in particular that the possibility for large data to overflow its assigned area is a fundamental source of security risks; and even pure user interface issues can become life-threatening in the case of critical applications such as air-traffic control.

Our second putative expert is right, however: no behavior is buggy unless it contradicts a specification. Where will the spec be in such an example? There are three possibilities, each with its limitations.

The first solution is to expect that in a carefully developed system every such property will have to be specified. This is conceivable, but hard, and the question arises of how to make sure nothing has been forgotten. Past  some threshold of criticality and effort, the only specifications that count are formal; there is not that much literature on specifying user interfaces formally, since much of the work on formal specifications has understandably concentrated on issues thought to be more critical.

Because of the tediousness of specifying such general properties again and again for each case, it might be better  — this is the second solution — to specify them globally, for an entire system, or for the user interfaces of an entire class of systems. Like any serious effort at specification, if it is worth doing, it is worth doing formally.

In either of these approaches the question remains of how we know we have specified everything of interest. This question, specification completeness, is not as hopeless as most people think; I plan to write an entry about it sometime (hint:  bing for “guttag horning”). Still, it is hard to be sure you did not miss anything relevant. Remember this the next time formal methods advocates — who should know better — tell you that with their techniques there “no longer is a need to test”, or when you read about the latest OS kernel that is “guaranteed correct and secure”. However important formal methods and proofs are, they can only guarantee satisfaction of the properties that the specifier has considered and stated. To paraphrase someone [1], I would venture that

Proofs can only show the absence of envisaged bugs, never rule out the presence of unimagined ones.

This is one of the reasons why tests will always, regardless of the progress of proofs, remain an indispensable part of the software development landscape [2]. Whatever you have specified and proved, you will still want to run the system (or, for certain classes of embedded software, some simulation of it) and see the results for yourself.

What then if we do not explicitly specify the desired property (as we did in the two approaches considered so far) but testing or actual operation still reveals some behavior that is clearly unsatisfactory? On what basis do we complain to the software’s producer? A solution here, the third in our list, might be to rely on generally accepted standards of professional development. This is common in other kinds of engineering: if you commission someone to build you a house, the contract may not explicitly state that the roof should not fall on your head while you are asleep; when this happens, you will still sue and accuse the builder of malpractice. Such remedies can work for software too, but the rules are murkier because we have not accepted, or even just codified, a set of general professional practices that would cover such details as “no display of information should overflow its assigned area”.

Until then I will remember to use one short tag a lot.

References

[1] Edsger W. Dijksra, Notes on Structured Programming, in Dahl, Dijkstra, Hoare, Structured Programming, Academic Press, 1972.

[2]  See Tests And Proofs (TAP) conference series, since 2007. The next conference, program-chaired by Angelo Gargantini and Gordon Fraser, will take place in the week of the TOOLS Federated Conferences in Málaga, Spain, in the week of June 28, 2010.

VN:F [1.9.10_1130]
Rating: 5.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

What is the purpose of testing?

Last year I published in IEEE Computer a short paper entitled “Seven Principles of Software Testing” [1]. Although technical, it was an opinion piece and the points were provocative enough to cause a reader, Gerald Everett, to express strong disagreement. Robert Glass, editor of the “Point/Counterpoint” rubric of the sister publication, IEEE Software, invited both of us to a debate in the form of a critique by Mr. Everett, my answer to the critique, his rejoinder to the answer, and my rejoinder to his rejoinder. The result appeared recently [2].

Other than a matter of terminology (Mr. Everett wants “testing” to cover static as well as dynamic techniques of quality assurance), the main point of disagreement was my very first principle: to test a program is to make it fail. Indeed this flies in the face of some established wisdom, which holds that testing serves to increase one’s confidence in the software; see for example the Wikipedia entry [3]. My article explains that this is a delusion and that it is more productive to limit the purpose of the testing process to what it does well, finding faults,  rather than leting it claim goals of quality assurance that are beyond its scope. Finding faults is no minor feat already. In this view — the practical view, for example as seen by a software project manager  — Dijkstra’s famous dismissal of testing (it can prove the presence of bugs, never their absence) is the greatest compliment to testing, and the most powerful advertisement one can think of for taking testing seriously.

What do you think? What is testing good for?

I should add that in terms of research this debate is a bit of a sideline.  The real goal of our work is to build completely automatic testing tools. An article on this topic will appear in the next issue of Computer (September); I will post a link to it when the issue is out.

References

[1] Bertrand Meyer: Seven Principles of Software Testing, in IEEE Computer, vol. 41, no. 8, pages 99-101, Aug. 2008; available on the IEEE site, and also in draft form here. (An earlier version, without the beautiful picture of bees and flies in the bottle drawn by Computer‘s artist, appeared as an EiffelWorld column.)

[2] Gerald D. Everett and Bertrand Meyer: Point/Counterpoint, in IEEE Software, Vol. 26, No. 4, pages 62-64, July/August 2009.  Available on the IEEE site and also here.

[3] Wikipedia entry on Software Testing.

[4] Bertrand Meyer, Ilinca Ciupa, Andreas Leitner, Arno Fiva, Emmanuel Stapf and Yi Wei: Programs that test themselves, to appear in IEEE Computer, September 2009.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)