June LASER school, Elba, on Devops, Microservices…

The 2020 LASER summer school has been announced. It will take place June 1 to 6* , as always in Elba Island, this year with the theme DevOps, Microservices and Software Development for the Age of the Web. The first five speakers are listed on the conference page, with more to come, from both academia and industry.

This is the 16th edition of the school (already) and, as always, rests on the LASER recipe of “Sea, Sun and Software”: densely packed lectures by top experts with the opportunity to enjoy the extraordinary surroundings of the Island of Elba and the Hotel del Golfo’s unique food, beach and facilities, with lots of time devoted to interactions between speakers and attendees.

This year’s theme is devoted to advances in the newest Web technologies and the corresponding software engineering issues and development models.

*Arrival on May 31st, departure on June 7th.

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Those were the days

Earlier this year I was in Sofia for a conference, at the main university (Saint Kliment) which in the entrance hall had an exhibition about its history. There was this student poster and song from I think around 1900:

 

vivat

I like the banner (what do you think?). It even has the correct Latin noun and verb plurals.

Anyone know where to find a university today with that kind of students, that kind of slogan, that kind of attitude and that kind of grammar? Please send me the links.

VN:F [1.9.10_1130]
Rating: 10.0/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Entrez sans frapper

Dourdan, 4 October 2014

Dourdan

 

(Canon EOS 70D, f/14, 1/500, ISO/4000)

VN:F [1.9.10_1130]
Rating: 10.0/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Formality in requirements: new publication

The best way to make software requirements precise is to use one of the available “formal” approaches. Many have been proposed; I am not aware of a general survey published so far. Over the past two years, we have been working on a comprehensive survey of the use of formality in requirements, of which we are now releasing a draft. “We” is a joint informal research group from Innopolis University and the University of Toulouse, whose members have been cooperating on requirements issues, resulting in publications listed  under “References” below and in several scientific events.

The survey is still being revised, in particular because it is longer than the page limit of its intended venue (ACM Computing Surveys), and some sections are in need of improvement. We think, however, that the current draft can already provide a solid reference in this fundamental area of software engineering.

The paper covers a broad selection of methods, altogether 22 of them, all the way from completely informal to strictly formal. They are grouped into five categories: natural language, semi-formal, automata- or graph-based, other mathematical frameworks, programming-language based. Examples include SysML, Relax, Statecharts, VDM, Eiffel (as a requirements notation), Event-B, Alloy. For every method, the text proposes a version of a running example (the Landing Gear System, already used in some of our previous publications) expressed in the corresponding notation. It evaluates the methods using a set of carefully defined criteria.

The paper is: Jean-Michel Bruel, Sophie Ébersold, Florian Galinier, Alexandr Naumchev, Manuel Mazzara and Bertrand Meyer: Formality in Software Requirements, draft, November 2019.

The text is available here. Comments on the draft are welcome.

References

Bertrand Meyer, Jean-Michel Bruel, Sophie Ebersold, Florian Galinier and Alexandr Naumchev: Towards an Anatomy of Software Requirements, in TOOLS 2019, pages 10-40, see here (or arXiv version here). I will write a separate blog article about this publication.

Alexandr Naumchev and Bertrand Meyer: Seamless requirements. Computer Languages, Systems & Structures 49, 2017, pages 119-132, available here.

Florian Galinier, Jean-Michel Bruel, Sophie Ebersold and Bertrand Meyer: Seamless Integration of Multirequirements, in Complex Systems, 25th International Requirements Engineering Conference Workshop, IEEE, pages 21-25, 2017, available here.

Alexandr Naumchev, Manuel Mazzara, Bertrand Meyer, Jean-Michel Bruel, Florian Galinier and Sophie Ebersold: A contract-based method to specify stimulus-response requirements, Proceedings of the Institute for System Programming, vol. 29, issue 4, 2017, pp. 39-54. DOI: 10.15514, available here.

Alexandr Naumchev and Bertrand Meyer: Complete Contracts through Specification Drivers., in 10th International Symposium on Theoretical Aspects of Software Engineering (TASE), pages 160-167, 2016, available here.

Alexandr Naumchev, Bertrand Meyer and Víctor Rivera: Unifying Requirements and Code: An Example, in PSI 2015 (Ershov conference, Perspective of System Informatics), pages 233-244, available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Publications on CS/SE/informatics education

Recently I had a need to collect my education-related publications, so I went through my publication list and extracted items devoted to issues of learning computer science (informatics) and software engineering. There turned out to be far more than I expected; I did not think of myself as primarily an education researcher but it seems I am that too. (Not so many research computer scientists take the trouble to publish in SIGCSE, ITiCSE and other top CS education venues.)

Without presuming that the list will be of interest I am reproducing it below for the record. All comes from my publication list here, which contains more information, in particular a descriptive paragraph or two for every single publication.

I have also included PhD theses in education. (Whole list of PhD theses supervised here.)

The topics include among others, in approximate chronological order (although the list below is in the reverse order):

    • Early experience teaching modern programming concepts in both industry and universities.
    • In the nineties, I was full time at Eiffel Software, the development of a general framework for teaching programming. This was written from the safe position of someone in industry advising academic colleagues on what to do (usually the advice goes the other way). I did have, however, the opportunity to practice my preaching in short stints at University of Technology, Sydney and  particularly Monash University. The concept of the Inverted Curriculum (also known as “ Outside-In”) date back to that period, with objects first (actually classes) and contracts first too.
    • When I joined ETH, a general paper on the fundamental goals and concepts of software engineering education, “Software Engineering in the Academy”, published in IEEE Computer.
    • At ETH, putting the Inverted Curriculum in practice, with 14 consecutive sessions of the introductory programming courses for all computer science students, resulting in the Touch of Class textbook and a number of papers coming out of our observations. An estimated 6000 students took the course. A variant of it has also been given several times at Innopolis University.
    • A theory of how to structure knowledge for educational purposes, leading to the notion of “Truc” (Teachable, Reusable Unit of Cognition).
    • The development by Michela Pedroni of the Trucstudio environment, similar in its form to an IDE but devoted, instead of the development of programs, to the visual development of courses, textbooks, curricula etc.
    • Empirical work by Marie-Hélène Ng Cheong Vee (Nienaltowski) and Michela Pedroni on what beginners understand easily, and not, for example according to the phrasing of compiler error messages.
    • Other empirical work, by Michela Pedroni and Manuel Oriol, on the prior knowledge of entering computer science students.
    • The DOSE course (Distributed and Outsourced Software Engineering) ran for several years a student project done by joint student teams from several cooperating universities, including Politecnico di Milano which played a key role along with us. It enabled many empirical studies on the effect on software development of having geographically distributed teams. People who played a major role in this effort are, at ETH, Martin Nordio, Julian Tschannen and Christian Estler and, at Politecnico, Elisabetta di Nitto, Giordano Tamburrelli and Carlo Ghezzi.
    • Several MOOCs, among the first at ETH, on introductory computing and agile methods. They do not appear below because they are not available at the moment on the EdX site (I do not know why and will try to get them reinstated). The key force there was Marco Piccioni. MOOCs are interesting for many reasons; they are a substitute neither for face-to-face teaching nor for textbooks, but an interesting complement offering novel educational possibilities. Thanks to codeboard, see below, our programming MOOCs provide the opportunity to compile and run program directly from the course exercise pages, compare the run’s result to correct answers for prepared tests, and get immediate feedback .
    • A comparative study of teaching effectiveness of two concurrency models, Eiffel SCOOP and JavaThreads (Sebastian Nanz, Michela Pedroni).
    • The development of the EiffelMedia multimedia library at ETH, which served as a basis for dozens of student projects over many years. Credit for both the idea and its realization, including student supervision, goes to Till Bay and Michela Pedroni.
    • The development (Christian Estler with Martin Nordio) of the Codeboard system and site, an advanced system for cloud support to teach programming, enabling students to compile, correct and run programs on the web, with support for various languages. Codeboard is used in the programming MOOCs.
    • A hint system (Paolo Antonucci, Michela Pedroni) to help students get progressive help, as in video games, when they stumble trying to write a program, e.g. with Codeboard.

Supervised PhD theses on education

The following three theses are devoted to educational topics (although many of the  other theses have educational aspects too):

Christian Estler, 2014, Understanding and Improving Collaboration in Distributed Software Development, available here.

Michela Pedroni, 2009, Concepts and Tools for Teaching Programming, available here.

Markus Brändle, 2006: GraphBench: Exploring the Limits of Complexity with Educational Software, available here. (The main supervisor in this case was Jürg Nievergelt.)

MOOCs (Massive Online Open Courses)

Internal MOOCs, and three courses on EdX (links will be added when available):

  • Computing: Art, Magic, Science? Part 1 (CAMS 1), 2013.
  • Computing: Art, Magic, Science? Part 1 (CAMS 2), 2014.
  • Agile Software Development, 2015.

Publications about education

1. Paolo Antonucci, Christian Estler, Durica Nikolic, Marco Piccioni and Bertrand Meyer: An Incremental Hint System For Automated Programming Assignments, in ITiCSE ’15, Proceedings of 2015 ACM Conference on Innovation and Technology in Computer Science Education, 6-8 July 2015, Vilnius, ACM Press, pages 320-325. (The result of a master’s thesis, a system for helping students solve online exercises, through successive hints.) Available here.

2. Jiwon Shin, Andrey Rusakov and Bertrand Meyer: Concurrent Software Engineering and Robotics Education, in 37th International Conference on Software Engineering (ICSE 2015), Florence, May 2015, IEEE Press, pages 370-379. (Describes our innovative Robotics Programming Laboratory course, where students from 3 departments, CS, Mechanical Engineering and Electrical Engineering learned how to program robots.) Available here.

3. Cristina Pereira, Hannes Werthner, Enrico Nardelli and Bertrand Meyer: Informatics Education in Europe: Institutions, Degrees, Students, Positions, Salaries — Key Data 2008-2013, Informatics Europe report, October 2014. (Not a scientific publication but a report. I also collaborated in several other editions of this yearly report series, which I started, from 2011 on. A unique source of information about the state of CS education in Europe.) Available here.

4. (One of the authors of) Informatics education: Europe cannot afford to miss the boat, edited by Walter Gander, joint Informatics Europe and ACM Europe report, April 2013. An influential report which was instrumental in the introduction of computer science in high schools and primary schools in Europe, particularly Switzerland. Emphasized the distinction between “digital literacy” and computer science. Available here.

5. Sebastian Nanz, Faraz Torshizi, Michela Pedroni and Bertrand Meyer: Design of an Empirical Study for Comparing the Usability of Concurrent Programming Languages, in Information and Software Technology Journal Elsevier, volume 55, 2013. (Journal version of conference paper listed next.) Available here.

6. Bertrand Meyer: Knowledgeable beginners, in Communications of the ACM, vol. 55, no. 3, March 2012, pages 10-11. (About a survey of prior knowledge of entering ETH CS students, over many years. Material from tech report below.) Available here.

7. Sebastian Nanz, Faraz Torshizi, Michela Pedroni and Bertrand Meyer: Design of an Empirical Study for Comparing the Usability of Concurrent Programming Languages, in ESEM 2011 (ACM/IEEE International Symposium on Empirical Software Engineering and Measurement), 22-23 September 2011 (best paper award). Reports on a carefully designed empirical study to assess the teachability of various approaches to concurrent programming. Available here.

8. Martin Nordio, H.-Christian Estler, Julian Tschannen, Carlo Ghezzi, Elisabetta Di Nitto and Bertrand Meyer: How do Distribution and Time Zones affect Software Development? A Case Study on Communication, in Proceedings of the 6th International Conference on Global Software Engineering (ICGSE), IEEE Computer Press, 2011, pages 176-184. (A study of the results of our DOSE distributed course, which involved students from different universities in different countries collaborating on a common software development project.) Available here.

9. Martin Nordio, Carlo Ghezzi, Elisabetta Di Nitto, Giordano Tamburrelli, Julian Tschannen, Nazareno Aguirre, Vidya Kulkarni and Bertrand Meyer: Teaching Software Engineering using Globally Distributed Projects: the DOSE course, in Collaborative Teaching of Globally Distributed Software Development – Community Building Workshop (CTGDSD), Hawaii (at ICSE), May 2011. (Part of the experience of our Distributed Outsourced Software Engineering course, taught over many years with colleagues from Politecnico di Milano and elsewhere, see paper in previous entry.) Available here.

10. Bertrand Meyer: From Programming to Software Engineering (slides only), material for education keynote at International Conference on Software Engineering (ICSE 2010), Cape Town, South Africa, May 2010. Available here.

11. Michela Pedroni and Bertrand Meyer: Object-Oriented Modeling of Object-Oriented Concepts, in ISSEP 2010, Fourth International Conference on Informatics in Secondary Schools, Zurich, January 2010, eds. J. Hromkovic, R. Královic, J. Vahrenhold, Lecture Notes in Computer Science 5941, Springer, 2010. Available here.

12. Michela Pedroni, Manuel Oriol and Bertrand Meyer: What Do Beginning CS Majors Know?, ETH Technical Report, 2009. (Unpublished report about the background of 1st-year ETH CS students surveyed over many years. See shorter 2012 CACM version above.) Available here.

13. Bertrand Meyer: Touch of Class: Learning to Program Well Using Object Technology and Design by Contract, Springer, 2009 (also translated into Russian). (Introductory programming textbook, used for many years at ETH Zurich and Innopolis University for the first programming course. The herecontains a long discussion of pedagogical issues of teaching programming and CS.) Book page and text of several chapters here.

14. Michela Pedroni, Manuel Oriol, Lukas Angerer and Bertrand Meyer: Automatic Extraction of Notions from Course Material, in Proceedings of SIGCSE 2008 (39th Technical Symposium on Computer Science Education), Portland (Oregon), 12-15 March 2008, ACM SIGCSE Bulletin, vol. 40, no. 1, ACM Press, 2008, pages 251-255. (As the title indicates, tools for automatic analysis of course material to extract the key pedagogical notions or “Trucs”.) Available here.

15. Marie-Hélène Nienaltowski, Michela Pedroni and Bertrand Meyer: Compiler Error Messages: What Can Help Novices?, in Proceedings of SIGCSE 2008 (39th Technical Symposium on Computer Science Education), Portland (Oregon), Texas, 12-15 March 2008, ACM SIGCSE Bulletin, vol. 40, no. 1, ACM Press, 2008, pages 168-172. (Discusses the results of experiments with different styles of compiler error messages, which can be baffling to beginners, to determine what works best.) Available here.

16. Bertrand Meyer and Marco Piccioni: The Allure and Risks of a Deployable Software Engineering Project: Experiences with Both Local and Distributed Development, in Proceedings of IEEE Conference on Software Engineering & Training (CSEE&T), Charleston (South Carolina), 14-17 April 2008, ed. H. Saiedian, pages 3-16. (Paper associated with a keynote at an SE education conference. See other papers on the DOSE distributed project experience below.) Available here.

17. Till Bay, Michela Pedroni and Bertrand Meyer: By students, for students: a production-quality multimedia library and its application to game-based teaching, in JOT (Journal of Object Technology), vol. 7, no. 1, pages 147-159, January 2008. Available here (PDF) and here (HTML).

18. Marie-Hélène Ng Cheong Vee (Marie-Hélène Nienaltowski), Keith L. Mannock and Bertrand Meyer: Empirical study of novice error paths, Proceedings of workshop on educational data mining at the 8th international conference on intelligent tutoring systems (ITS 2006), 2006, pages 13-20. (An empirical study of the kind of programming mistakes learners make.) Available here.

19. Bertrand Meyer: Testable, Reusable Units of Cognition, in Computer (IEEE), vol. 39, no. 4, April 2006, pages 20-24. (Introduced a general approach for structuring knowledge for teaching purposes: “Trucs”. Served as the basis for some other work listed, in particular papers with Michela Pedroni on the topics of her PhD thesis. Available here.

21. Michela Pedroni and Bertrand Meyer: The Inverted Curriculum in Practice, in Proceedings of SIGCSE 2006, Houston (Texas), 1-5 March 2006, ACM Press, 2006, pages 481-485. (Develops the idea of inverted curriculum which served as the basis for our teaching of programming at ETH, Innopolis etc. and led to the “Touch of Class” textbook.) Available here.

22. Bertrand Meyer: The Outside-In Method of Teaching Introductory Programming, in Perspective of System Informatics, Proceedings of fifth Andrei Ershov Memorial Conference, Akademgorodok, Novosibirsk, 9-12 July 2003, eds. Manfred Broy and Alexandr Zamulin, Lecture Notes in Computer Science 2890, Springer, 2003, pages 66-78. (An early version of the ideas presented in the previous entry.) Available here.

23. Bertrand Meyer: Software Engineering in the Academy, in Computer (IEEE), vol. 34, no. 5, May 2001, pages 28-35. Translations: Russian in Otkrytye Systemy (Open Systems Publications), #07-08-2001, October 2001. (A general discussion of the fundamental concepts to be taught in software engineering. Served as a blueprint for my teaching at ETH.) Available here.

24. Bertrand Meyer: Object-Oriented Software Construction, second edition, Prentice Hall, 1296 pages, January 1997. Translations: Spanish, French Russian, Serbian, Japanese. (Not a publication on education per se but cited here since it is a textbook that has been widely used for teaching and has many comments on pedagogy.)
23. Bertrand Meyer: The Choice for Introductory Software Education, Guest editorial in Journal of Object-Oriented Programming, vol. 7, no. 3, June 1994, page 8. (A discussion of the use of Eiffel for teaching software engineering topics.)

25. Bertrand Meyer, Towards an Object-Oriented Curriculum, in Journal of Object-Oriented Programming, vo. 6, number 2, May 1993, pages 76-81. (Journal version of paper cited next.) Available here.

26. Bertrand Meyer: Towards an Object-Oriented Curriculum, in TOOLS 11, Technology of Object-Oriented Languages and Systems, Santa Barbara, August 1993, eds. Raimund Ege, Madhu Singh and B. Meyer, Prentice Hall 1993, pages 585-594. (Early advocacy for using OO techniques in teaching programming – while I was not in academia. Much of my subsequent educational work relied on those ideas.) Available here.

27. Bertrand Meyer: Object-Oriented Software Construction, Prentice Hall, 592 pages, 1988. (First edition, translated into German, Italian, French, Dutch, Romanian, Chinese. As noted for second edition above, not about education per se, but widely used textbook with pedagogical implications.)

28. Initiation à la programmation en milieu industriel (Teaching Modern Programming Methodology in an Industrial Environment), in RAIRO, série bleue (informatique), vol. 11, no. 1, pages 21-34 1977. (Early paper on teaching advanced programming techniques in industry.) Available here.

29. Claude Kaiser, Bertrand Meyer and Etienne Pichat, L’Enseignement de la Programmation à l’IIE (Teaching Programming at the IIE engineering school), in Zéro-Un Informatique, 1977. (A paper on my first teaching experience barely out of school myself.) Available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

A theorem of software engineering

Some of the folk wisdom going around in software engineering, often cluessly repeated for decades, is just wrong.  It can be particularly damaging when it affects key aspects of software development and is contradicted by solid scientific evidence. The present discussion covers a question that meets both of these conditions: whether it makes sense to add staff to a project to shorten its delivery time.

My aim is to popularize a result that is well known in the software engineering literature, going back to the early work of Barry Boehm [1], and explained with great clarity by Steve McConnell in his 2006 book on software cost estimation [2] under the name “Shortest Possible Schedule”. While an empirical rather than a logical result, I believe it deserves to be called a theorem (McConnell stays shy of using the term) because it is as close as we have in the area of software engineering management to a universal property, confirmed by numerous experimental studies.

This article contributes no new concept since McConnell’s chapter 20 says all there is to say about the topic;  my aim is simply to make the Shortest Possible Schedule Theorem better known, in particular to practitioners.

The myth about shortening project times begins with an observation that is clearly correct, at least in an extreme form. Everyone understands that if our project has been evaluated, through accepted cost estimation techniques, to require three developers over a year we cannot magically hire 36 people to complete it in one month. Productivity does not always scale up.

But neither does common sense. Too often the conclusion from the preceding trival observation takes the form of an old  saw, “Brooks’ Law”: adding people to a late project delays it further. The explanation is that the newcomers cost more through communication overhead than they bring through actual contributions. While a few other sayings of Brooks’ Mythical Man-Month have stood the test of time, this one has always struck me as describing, rather than any actual law, a definition of bad management. Of course if you keep haplessly throwing people at deadlines you are just going to add communication problems and make things worse. But if you are a competent manager expanding the team size is one of the tools at your disposal to improve the state of a project, and it would be foolish to deprive yourself of it. A definitive refutation of the supposed law, also by McConnell, was published 20 years ago [3].

For all the criticism it deserves, Brooks’s pronouncement was at least limited in its scope: it addressed addition of staff to a project that is already late. It is even wronger to apply it to the more general issue of cost-estimating and staffing software projects, at any stage of their progress.  Forty-year-old platitudes have even less weight here. As McConnell’s book shows, cost estimation is no longer a black art. It is not an exact science either, but techniques exist for producing solid estimates.

The Shortest Possible Schedule theorem is one of the most interesting results. Much more interesting than Brooks’s purported law, because it is backed by empirical studies (rather than asking us to believe one person’s pithy pronouncement), and instead of just a general negative view it provides a positive result complemented by a limitation of that result; and both are expressed quantitatively.

Figure 1 gives the general idea of the SPS theorem. General idea only; Figure 2 will provide a more precise view.

Image4

Figure 1: General view of the Shortest Possible Schedule theorem.

The  “nominal project” is the result of a cost and schedule estimation yielding the optimum point. The figure and the theorem provide project managers with both a reason to rejoice and a reason to despair:

  • Rejoice: by putting in more money, i.e. more people (in software engineering, project costs are essentially people costs [4]), you can bring the code to fruition faster.
  • Despair: whatever you do, there is a firm limit to the time you can gain: 25%. It seems to be a kind of universal constant of software engineering.

The “despair” part typically gets the most attention at first, since it sets an absolute value on how much money can buy (so to speak) in software: try as hard as you like, you will never get below 75% of the nominal (optimal) value. The “impossible zone” in Figure 1 expresses the fundamental limitation. This negative result is the reasoned and precise modern replacement for the older folk “law”.

The positive part, however, is just as important. A 75%-empty glass is also 25%-full. It may be disappointing for a project manager to realize that no amount of extra manpower will make it possible to guarantee to higher management more than a 25% reduction in time. But it is just as important to know that such a reduction, not at all insignificant, is in fact reachable given the right funding, the right people, the right tools and the right management skills. The last point is critical: money by itself does not suffice, you need management; Brooks’ law, as noted, is mostly an observation of the effects of bad management.

Figure 1 only carries the essential idea, and is not meant to provide precise numerical values. Figure 2, the original figure from McConnell’s book, is. It plots effort against time rather than the reverse but, more importantly, it shows several curves, each corresponding to a published empirical study or cost model surveyed by the book.

Image5

Figure 2: Original illustration of the Shortest Possible Schedule
(figure 2-20 of [3], reproduced with the author’s permission)

On the left of the nominal point, the curves show how, according to each study, increased cost leads to decreased time. They differ on the details: how much the project needs to spend, and which maximal reduction it can achieve. But they all agree on the basic Shortest Possible Schedule result: spending can decrease time, and the maximal reduction will not exceed 25%.

The figure also provides an answer, although a disappointing one, to another question that arises naturally. So far this discussion has assumed that time was the critical resource and that we were prepared to spend more to get a product out sooner. But sometimes it is the other way around: the critical resource is cost, or, concretely, the number of developers. Assume that nominal analysis tells us that the project will take four developers for a year and, correspondingly, cost 600K (choose your currency).  We only have a budget of 400K. Can we spend less by hiring fewer developers, accepting that it will take longer?

On that side, right of the nominal point in Figure 2, McConnell’s survey of surveys shows no consensus. Some studies and models do lead to decreased costs, others suggest that with the increase in time the cost will actually increase too. (Here is my interpretation, based on my experience rather than on any systematic study: you can indeed achieve the original goal with a somewhat smaller team over a longer period; but the effect on the final cost can vary. If the new time is t’= t + T and the new team size s’= s – S, t and s being the nominal values, the cost difference is proportional to  Ts – t’S. It can be positive as well as negative depending on the values of the original t and s and the precise effect of reduced team size on project duration.)

The firm result, however, is the left part of the figure. The Shortest Possible Schedule theorem confirms what good project managers know: you can, within limits, shorten delivery times by bringing all hands on deck. The precise version deserves to be widely known.

References and note

[1] Barry W. Boehm: Software Engineering Economics, Prentice Hall, 1981.

[2] Steve McConnell: Software Estimation ― Demystifying the Black Art, Microsoft Press, 2006.

[3] Steve McConnell: Brooks’ Law Repealed, in IEEE Software, vol. 16, no. 6, pp. 6–8, November-December 1999, available here.

[4] This is the accepted view, even though one might wish that the industry paid more attention to investment in tools in addition to people.

Recycled A version of this article was first published on the Comm. ACM blog under the title The Shortest Possible Schedule Theorem: Yes, You Can Throw Money at Software Deadlines

VN:F [1.9.10_1130]
Rating: 9.9/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Software Engineering Education: FISEE coming up

Over the past few days I have come across several people who told me they want to attend the Frontiers In Software Engineering Education (FISEE) workshop in Villebrumier, 11-13 November, but have not registered yet. If that’s your case please register right now because:

  • The number of spots is limited (it’s a residential event, everyone is hosted onsite, and there is a set number of rooms).
  • We need a preliminary program. The format of the event is flexible, Springer LNCS proceedings come after the meeting, we make room for impromptu presentations and discussions, but still we need a basic framework and we need to finalize it now.

So please go ahead and fill in the registration form.

From the previous posting about FISEE:

The next event at the LASER center in Villebrumier (Toulouse area, Southwest France) is FISEE, Frontiers in Software Engineering Education, see the web site. This small-scale workshop, 11 to 13 November is devoted to what Software Engineering needs, what should be changed, and how new and traditional institutions can adapt to the fast pace of technology.

Workshops at the Villebrumier center favor a friendly, informal and productive interaction between participants, who are all hosted on site. There are no formal submissions, but post-event proceedings will be published as part of the LASER sub-series of Springer Lecture Notes in Computer Science.

Like other events there, FISEE is by invitation; if you are active in the field of software engineering education as an educator, as a potential employer of software engineering graduates, or as a researcher, you can request an invitation by writing to me or one of the other organizers. Attendance is limited to 15-20 participants.

Among already scheduled talks: a keynote by Alexander Tormasov, rector of Innopolis University, and a talk by me on “the 15 concepts of software engineering”.

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Software engineering education: Villebrumier LASER center, November

The next event at the LASER center in Villebrumier (Toulouse area, Southwest France) is FISEE, Frontiers in Software Engineering Education, see the web site. This small-scale workshop, 11 to 13 November is devoted to what Software Engineering needs, what should be changed, and how new and traditional institutions can adapt to the fast pace of technology.

Workshops at the Villebrumier center favor a friendly, informal and productive interaction between participants, who are all hosted on site. There are no formal submissions, but post-event proceedings will be published as part of the LASER sub-series of Springer Lecture Notes in Computer Science.

Like other events there, FISEE is by invitation; if you are active in the field of software engineering education as an educator, as a potential employer of software engineering graduates, or as a researcher, you can request an invitation by writing to me or one of the other organizers. Attendance is limited to 15-20 participants.

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Sunrise was foggy today

Once you have learned the benefits of formally expressing requirements, you keep noticing potential ambiguities and other deficiencies [1] in everyday language. Most such cases are only worth a passing smile, but here’s one that perhaps can serve to illustrate a point with business analysts in your next requirements engineering workshop or with students in your next software engineering lecture.

As a customer of the Swiss telecommunications company Sunrise I receive an occasional “news” email. (As a customer of the Swiss telecommunications company Sunrise I would actually prefer that they spend my money improving  bandwidth,  but let us not digress.) Rather than raw marketing messages these are tips for everyday life, with the presumed intent of ingratiating the populace. For example, today’s message helpfully advises me on how to move house. The admirable advice starts (my translation):

10.7% of all Swiss people relocate every year. Is that your case too for next Autumn?

Actually no, it’s not my case (neither a case of being one of the “Swiss people” nor a case of intending to relocate this Fall). And, ah, the beauty of ridiculously precise statistics! Not 10.8% or 10.6%, mind you, no, 10.7% exactly! But consider the first sentence and think of something similar appearing in a requirements document or user story. Something similar does appear in such documents, all the time, leading to confusions for the programmers interpreting them and to bugs in the resulting systems. Those restless Swiss! Did you know that they include an itchy group, exactly 922,046 people (I will not be out-significant-digited!), who relocate every year?

Do not be silly, I hear you saying. What Sunrise is sharing of its wisdom is that every year a tenth of the Swiss population moves, but not the same tenth every year. Well, OK, maybe I am being silly. But if you think of a programmer reading such a statement about some unfamiliar domain (not one about which we can rely on common sense), the risk of confusion and consequent bugs is serious.

As [1] illustrated in detail, staying within the boundaries of natural language to resolve such possible ambiguities only results in convoluted requirements that make matters worse. The only practical way out is, for delicate system properties, to use precise language, also known technically as “mathematics”.

Here for example a precise formulation of the two possible interpretations removes any doubt. Let Swiss denote the set of Swiss people and  E the number of elements (cardinal) of a finite set E, which we can apply to the example because the set of Swiss people is indeed finite. Let us define slice as the Sunrise-official number of Swiss people relocating yearly, i.e. slice = Swiss ∗ 0.107 (the actual value appeared above). Then one interpretation of the fascinating Sunrise-official fact is:

{s: Swiss | (∀y: Year | s.is_moving (y))} = slice

In words: the cardinal of the set of Swiss people who move every year (i.e., such that for every year y they move during y) is equal to the size of the asserted population subset.

The other possible interpretation, the one we suspect would be officially preferred by the Sunrise powers (any formal-methods fan from Sunrise marketing reading this, please confirm or deny!), is:

∀y: Year | {s: Swiss | s.is_moving (y)} = slice

In words: for any year y, the cardinal of the set of Swiss people who move during y is equal to the size of the asserted subset.

This example is typical of where and why we need mathematics in software requirements. No absolutist stance here, no decree  that everything become formal (mathematical). Natural language is not going into retirement any time soon. But whenever one spots a possible ambiguity or imprecision, the immediate reaction should always be to express the concepts mathematically.

To anyone who has had a successful exposure to formal methods this reaction is automatic. But I keep getting astounded not only by  the total lack of awareness of these simple ideas among the overwhelming majority of software professionals, but also by their absence from the standard curriculum of even top universities. Most students graduate in computer science without ever having heard such a discussion. Where a formal methods course does exist, it is generally as a specialized topic reserved for a small minority, disconnected (as Leslie Lamport has observed [2]) from the standard teaching of programming and software engineering.

In fact all software engineers should possess the ability to go formal when and where needed. That skill is not hard to learn and should be practiced as part of the standard curriculum. Otherwise we keep training the equivalent of electricians rather than electrical engineers, programmers keep making damaging mistakes from misunderstanding ambiguous or inconsistent requirements, and we all keep suffering from buggy programs.

 

References

[1] Self-citation appropriate here: Bertrand Meyer: On Formalism in Specifications, IEEE Software, vol. 3, no. 1, January 1985, pages 6-25, available here.

[2] Leslie Lamport: The Future of Computing: Logic or Biology, text of a talk given at Christian Albrechts University, Kiel on 11 July 2003, available here.

VN:F [1.9.10_1130]
Rating: 9.8/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Schedule and last deadline for LASER AI + ML + SE, Elba, June

The lecture schedule has now been posted for the 2019 LASER summer school on artificial intelligence, machine learning and software engineering. The speakers are Shai Ben-David (Waterloo), Lionel Briand (Luxembourg), Pascal Fua (EPFL), Erik Meijer (Facebook), Tim Menzies (NC State) and I.

The last deadline for registration is May 20.

The school takes place June 1-9 in the magnificent Hotel del Golfo in Elba Island, Italy.

All details at www.laser-foundation.org/school/2019.

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Soundness and completeness: with precision

Over breakfast at your hotel you read an article berating banks about the fraudulent credit card transactions they let through. You proceed to check out and bang! Your credit card is rejected because (as you find out later) the bank thought [1] it couldn’t possibly be you in that exotic place. Ah, those banks! They accept too much. Ah, those banks! They reject too much. Finding the right balance is a case of soundness versus precision.

Similar notions are essential to the design of tools for program analysis, looking for such suspicious cases as  dead code (program parts that will never be executed). An analysis can be sound, or not; it can be complete, or not.

These widely used concepts are sometimes misunderstood.  The first answer I get when innocently asking people whether the concepts are clear is yes, of course, everyone knows! Then, as I bring up such examples as credit card rejection or dead code detection, assurance quickly yields to confusion. One sign that things are not going well is when people start throwing in terms like “true positive” and “false negative”. By then any prospect of reaching a clear conclusion has vanished. I hope that after reading this article you will never again (in a program analysis context) be tempted to use them.

Now the basic idea is simple. An analysis is sound if it reports all errors, and complete if it only reports errors. If not complete, it is all the more precise that it reports fewer non-errors.

You can stop here and not be too far off [2]. But a more nuanced and precise discussion helps.

1. A relative notion

As an example of common confusion, one often encounters attempts to help through something like Figure 1, which cannot be right since it implies that all sound methods are complete. (We’ll have better pictures below.)

Figure 1: Naïve (and wrong) illustration

Perhaps this example can be dismissed as just a bad use of illustrations [3] but consider the example of looking for dead code. If the analysis wrongly determines that some reachable code is unreachable, is it unsound or incomplete?

With this statement of the question, the only answer is: it depends!

It depends on the analyzer’s mandate:

  • If it is a code checker that alerts programmers to cases of bad programming style, it is incomplete: it reports as an error a case that is not. (Reporting that unreachable code is reachable would cause unsoundness, by missing a case that it should have reported.)
  • If it is the dead-code-removal algorithm of an optimizing compiler, which will remove unreachable code, it is unsound: the compiler will remove code that it should not. (Reporting that unreachable code is reachable would cause incompleteness, by depriving the compiler of an optimization.)

As another example, consider an analyzer that finds out whether a program will terminate. (If you are thinking “but that can’t be done!“, see the section “Appendix: about termination” at the very end of this article.) If it says a program does not terminates when in fact it does, is it unsound or incomplete?

Again, that depends on what the analyzer seeks to establish. If it is about the correctness of a plain input-to-output program (a program that produces results and then is done), we get incompleteness: the analyzer wrongly flags a program that is actually OK. But if it is about verifying that continuously running programs, such as the control system for a factory, will not stop (“liveness”), then the analyzer is unsound.

Examples are not limited to program analysis. A fraud-indentification process that occasionally rejects a legitimate credit card purchase is, from the viewpoint of preserving the bank from fraudulent purchases, incomplete. From the viewpoint of the customer who understands a credit card as an instrument enabling payments as long as you have sufficient credit, it is unsound.

These examples suffice to show that there cannot be absolute definitions of soundness and precision: the determination depends on which version of a boolean property we consider desirable. This decision is human and subjective. Dead code is desirable for the optimizing compiler and undesirable (we will say it is a violation) for the style checker. Termination is desirable for input-output programs and a violation for continuously running programs.

Once we have decided which cases are desirable and which are violations, we can define the concepts without any ambiguity: soundness means rejecting all violations, and completeness means accepting all desirables.

While this definition is in line with the unpretentious, informal one in the introduction, it makes two critical aspects explicit:

  • Relativity. Everything depends on an explicit decision of what is desirable and what is a violation. Do you want customers always to be able to use their credit cards for legitimate purchases, or do you want to detect all frauds attempts?
  • Duality. If you reverse the definitions of desirable and violation (they are the negation of each other), you automatically reverse the concepts of soundness and completeness and the associated properties.

We will now explore the consequences of these observations.

2. Theory and practice

For all sufficiently interesting problems, theoretical limits (known as Rice’s theorem) ensure that it is impossible to obtain both soundness and completeness.

But it is not good enough to say “we must be ready to renounce either soundness or completeness”. After all, it is very easy to obtain soundness if we forsake completeness: reject every case. A termination-enforcement analyzer can reject every program as potentially non-terminating. A bank that is concerned with fraud can reject every transaction (this seems to be my bank’s approach when I am traveling) as potentially fraudulent. Dually, it is easy to ensure completeness if we just sacrifice soundness: accept every case.

These extreme theoretical solutions are useless in practice; here we need to temper the theory with considerations of an engineering nature.

The practical situation is not as symmetric as the concept of duality theoretically suggests. If we have to sacrifice one of the two goals, it is generally better to accept some incompleteness: getting false alarms (spurious reports about cases that turn out to be harmless) is less damaging than missing errors. Soundness, in other words, is essential.

Even on the soundness side, though, practice tempers principle. We have to take into account the engineering reality of how tools get produced. Take a program analyzer. In principle it should cover the entire programming language. In practice, it will be built step by step: initially, it may not handle advanced features such as exceptions, or dynamic mechanisms such as reflection (a particularly hard nut to crack). So we may have to trade soundness for what has been called  “soundiness[4], meaning soundness outside of cases that the technology cannot handle yet.

If practical considerations lead us to more tolerance on the soundness side, on the completeness side they drag us (duality strikes again) in the opposite direction. Authors of analysis tools have much less flexibility than the theory would suggest. Actually, close to none. In principle, as noted, false alarms do not cause catastrophes, as missed violations do; but in practice they can be almost as bad.  Anyone who has ever worked on or with a static analyzer, going back to the venerable Lint analyzer for C, knows the golden rule: false alarms kill an analyzer. When people discover the tool and run it for the first time, they are thrilled to discover how it spots some harmful pattern in their program. What counts is what happens in subsequent runs. If the useful gems among the analyzer’s diagnostics are lost in a flood of irrelevant warnings, forget about the tool. People just do not have the patience to sift through the results. In practice any analysis tool has to be darn close to completeness if it has to stand any chance of adoption.

Completeness, the absence of false alarms, is an all-or-nothing property. Since in the general case we cannot achieve it if we also want soundness, the engineering approach suggests using a numerical rather than boolean criterion: precision. We may define the precision pr as 1 – im where im is the imprecision:  the proportion of false alarms.

The theory of classification defines precision differently: as pr = tp / (tp + fp), where tp is the number of false positives and fp the number of true positives. (Then im would be fp / (tp + fp).) We will come back to this definition, which requires some tuning for program analyzers.

From classification theory also comes the notion of recall: tp / (tp + fn) where fn is the number of false negatives. In the kind of application that we are looking at, recall corresponds to soundness, taken not as a boolean property (“is my program sound?“) but  a quantitative one (“how sound is my program?“). The degree of unsoundness un would then be fn / (tp + fn).

3. Rigorous definitions

With the benefit of the preceding definitions, we can illustrate the concepts, correctly this time. Figure 2 shows two different divisions of the set of U of call cases (universe):

  • Some cases are desirable (D) and others are violations (V).
  • We would like to know which are which, but we have no way of finding out the exact answer, so instead we run an analysis which passes some cases (P) and rejects some others (R).

Figure 2: All cases, classified

The first classification, left versus right columns in Figure 2, is how things are (the reality). The second classification, top versus bottom rows, is how we try to assess them. Then we get four possible categories:

  • In two categories, marked in green, assessment hits reality on the nail:  accepted desirables (A), rightly passed, and caught violations (C), rightly rejected.
  • In the other two, marked in red, the assessment is off the mark: missed violations (M), wrongly passed; and false alarms (F), wrongly accepted.

The following properties hold, where U (Universe) is the set of all cases and  ⊕ is disjoint union [5]:

— Properties applicable to all cases:
U = D ⊕ V
U = P ⊕ R
D = A ⊕ F
V = C ⊕ M
P = A ⊕ M
R = C ⊕ F
U = A ⊕M ⊕ F ⊕ C

We also see how to define the precision pr: as the proportion of actual violations to reported violations, that is, the size of C relative to R. With the convention that u is the size of U and so on, then  pr = c / r, that is to say:

  • pr = c / (c + f)      — Precision
  • im = f / (c + f)      — Imprecision

We can similarly define soundness in its quantitative variant (recall):

  • so = a / (a + m)      — Soundness (quantitative)
  • un = m / (a + m)   — Unsoundness

These properties reflect the full duality of soundness and completeness. If we reverse our (subjective) criterion of what makes a case desirable or a violation, everything else gets swapped too, as follows:

Figure 3: Duality

We will say that properties paired this way “dual” each other [6].

It is just as important (perhaps as a symptom that things are not as obvious as sometimes assumed) to note which properties do not dual. The most important examples are the concepts of  “true” and “false” as used in “true positive” etc. These expressions are all the more confusing that the concepts of True and False do dual each other in the standard duality of Boolean algebra (where True duals False,  Or duals And, and an expression duals its negation). In “true positive” or “false negative”,  “true” and “false” do not mean True and False: they mean cases in which (see figure 2 again) the assessment respectively matches or does not match the reality. Under duality we reverse the criteria in both the reality and the assessment; but matching remains matching! The green areas remain green and the red areas remain red.

The dual of positive is negative, but the dual of true is true and the dual of false is false (in the sense in which those terms are used here: matching or not). So the dual of true positive is true negative, not false negative, and so on. Hereby lies the source of the endless confusions.

The terminology of this article removes these confusions. Desirable duals violation, passed duals rejected, the green areas dual each other and the red areas dual each other.

4. Sound and complete analyses

If we define an ideal world as one in which assessment matches reality [7], then figure 2 would simplify to just two possibilities, the green areas:

Figure 4: Perfect analysis (sound and complete)

This scheme has the following properties:

— Properties of a perfect (sound and complete) analysis as in Figure 4:
M = ∅              — No missed violations
F = ∅               — No false alarms
P = D                — Identify  desirables exactly
R = V                –Identify violations exactly

As we have seen, however, the perfect analysis is usually impossible. We can choose to build a sound solution, potentially incomplete:

Figure 5: Sound desirability analysis, not complete

In this case:

— Properties of a sound analysis (not necessarily complete) as in Figure 5:
M = ∅              — No missed violations
P = A                — Accept only desirables
V = C                — Catch all violations
P ⊆ D               — Under-approximate desirables
R ⊇ V               — Over-approximate violations

Note the last two properties. In the perfect solution, the properties P = D and R = V mean that the assessment, yielding P and V, exactly matches the reality, D and V. From now on we settle for assessments that approximate the sets of interest: under-approximations, where the assessment is guaranteed to compute no more than the reality, and over-approximations, where it computes no less. In all cases the assessed sets are either subsets or supersets of their counterparts. (Non-strict, i.e. ⊆ and ⊇ rather than ⊂ and ⊃; “approximation” means possible approximation. We may on occasion be lucky and capture reality exactly.)

We can go dual and reach for completeness at the price of possible unsoundness:

Figure 6: Complete desirability analysis, not sound

The properties are dualled too:

— Properties of a complete analysis (not necessarily sound), as in Figure 6:
F = ∅              — No false alarms
R = C               — Reject only violations
D = A               — Accept all desirables
P ⊇ D               — Over-approximate desirables
R ⊆ V              — Under-approximate violations

5. Desirability analysis versus violation analysis

We saw above why the terms “true positives”, “false negatives” etc., which do not cause any qualms in classification theory, are deceptive when applied to the kind of pass/fail analysis (desirables versus violations) of interest here. The definition of precision provides further evidence of the damage. Figure 7 takes us back to the general case of Figure 2 (for analysis that is guaranteed neither sound nor complete)  but adds these terms to the respective categories.

Figure 7: Desirability analysis (same as fig. 2 with added labeling)

The analyzer checks for a certain desirable property, so if it wrongly reports a violation (F) that is a false negative, and if it misses a violation (M) it is a false positive. In the  definition from classification theory (section 2, with abbreviations standing for True/False Positives/Negatives): TP = A, FP = M, FN =  F, TN = C, and similarly for the set sizes: tp = a, fp = m, fn = f, tn = c.

The definition of precision from classification theory was pr = tp / (tp + fp), which here gives a / (a + m). This cannot be right! Precision has to do with how close the analysis is to completeness, that is to day, catching all violations.

Is classification theory wrong? Of course not. It is simply that, just as Alice stepped on the wrong side of the mirror, we stepped on the wrong side of duality. Figures 2 and 7 describe desirability analysis: checking that a tool does something good. We assess non-fraud from the bank’s viewpoint, not the stranded customer’s; termination of input-to-output programs, not continuously running ones; code reachability for a static checker, not an optimizing compiler. Then, as seen in section 3, a / (a + m) describes not precision but  soundness (in its quantitative interpretation, the parameter called “so” above).

To restore the link with classification theory , we simply have to go dual and take the viewpoint of violation analysis. If we are looking for possible violations, the picture looks like this:

Figure 8: Violation analysis (same as fig. 7 with different positive/negative labeling)

Then everything falls into place:  tp = c, fp = f, fn =  m, tn = a, and the classical definition of  precision as pr = tp / (tp + fp) yields c / (c + f) as we are entitled to expect.

In truth there should have been no confusion since we always have the same picture, going back to Figure 2, which accurately covers all cases and supports both interpretations: desirability analysis and violation analysis. The confusion, as noted, comes from using the duality-resistant “true”/”false” opposition.

To avoid such needless confusion, we should use the four categories of the present discussion:  accepted desirables, false alarms, caught violations and missed violations [8]. Figure 2 and its variants clearly show the duality, given explicitly in Figure 3, and sustains  interpretations both for desirability analysis and for violation analysis. Soundness and completeness are simply special cases of the general framework, obtained by ruling out one of the cases of incorrect analysis in each of Figures 4 and 5. The set-theoretical properties listed after Figure 2 express the key concepts and remain applicable in all variants. Precision c / (c + f) and quantitative soundness a / (a + m) have unambiguous definitions matching intuition.

The discussion is, I hope, sound. I have tried to make it complete. Well, at least it is precise.

Notes and references

[1] Actually it’s not your bank that “thinks” so but its wonderful new “Artificial Intelligence” program.

[2] For a discussion of these concepts as used in testing see Mauro Pezzè and Michal Young, Software Testing and Analysis: Process, Principles and Techniques, Wiley, 2008.

[3] Edward E. Tufte: The Visual Display of Quantitative Information, 2nd edition, Graphics Press, 2001.

[4] Michael Hicks,What is soundness (in static analysis)?, blog article available here, October 2017.

[5] The disjoint union property X = Y ⊕ Z means that Y ∩ Z = ∅ (Y and Z are disjoint) and X = Y ∪ Z (together, they yield X).

[6] I thought this article would mark the introduction into the English language of “dual” as a verb, but no, it already exists in the sense of turning a road from one-lane to two-lane (dual).

[7] As immortalized in a toast from the cult movie The Prisoner of the Caucasus: “My great-grandfather says: I have the desire to buy a house, but I do not have the possibility. I have the possibility to buy a goat, but I do not have the desire. So let us drink to the matching of our desires with our possibilities.” See 6:52 in the version with English subtitles.

[8] To be fully consistent we should replace the term “false alarm” by rejected desirable. I is have retained it because it is so well established and, with the rest of the terminology as presented, does not cause confusion.

[9] Byron Cook, Andreas Podelski, Andrey Rybalchenko: Proving Program Termination, in Communications of the ACM, May 2011, Vol. 54 No. 5, Pages 88-98.

Background and acknowledgments

This reflection arose from ongoing work on static analysis of OO structures, when I needed to write formal proofs of soundness and completeness and found that the definitions of these concepts are more subtle than commonly assumed. I almost renounced writing the present article when I saw Michael Hicks’s contribution [4]; it is illuminating, but I felt there was still something to add. For example, Hicks’s set-based illustration is correct but still in my opinion too complex; I believe that the simple 2 x 2 pictures used above convey the ideas  more clearly. On substance, his presentation and others that I have seen do not explicitly mention duality, which in my view is the key concept at work here.

I am grateful to Carlo Ghezzi for enlightening discussions, and benefited from comments by Alexandr Naumchev and others from the Software Engineering Laboratory at Innopolis University.

Appendix: about termination

With apologies to readers who have known all of the following from kindergarten: a statement such as (section 1): “consider an analyzer that finds out whether a program will terminate” can elicit no particular reaction (the enviable bliss of ignorance) or the shocked rejoinder that such an analyzer is impossible because termination (the “halting” problem) is undecidable. This reaction is just as incorrect as the first. The undecidability result for the halting problem says that it is impossible to write a general termination analyzer that will always provide the right answer, in the sense of both soundness and completeness, for any program in a realistic programming language. But that does not preclude writing termination analyzers that answer the question correctly, in finite time, for given programs. After all it is not hard to write an analyzer that will tell us that the program from do_nothing until True loop do_nothing end will terminate and that the program from do_nothing until False loop do_nothing end will not terminate. In the practice of software verification today, analyzers can give such sound answers for very large classes of programs, particularly with some help from programmers who can obligingly provide variants (loop variants, recursion variants). For a look into the state of the art on termination, see the beautiful survey by Cook, Podelski and Rybalchenko [9].

Also appears in the Communications of the ACM blog

VN:F [1.9.10_1130]
Rating: 10.0/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Gail Murphy to speak at Devops 19

The DEVOPS 2019 workshop (6-8 May 2019) follows a first 2018 workshop whose proceedings [1] have just been published in the special LASER-Villebrumier subseries of Springer Lecture notes in Computer Science. It is devoted to software engineering aspects of continuous development and new paradigms of software production and deployment, including but not limited to DevOps.

The keynote will be delivered by Gail Murphy, vice-president Research & Innovation at University of British Columbia and one of leaders in the field of empirical software engineering.

The workshop is held at the LASER conference center in Villebrumier near Toulouse. It is by invitation; if you would like to receive an invitation please contact one of the organizers (Jean-Michel Bruel, Manuel Mazzara and me) with a short description of your interest in the field.

Reference

Jean-Michel Bruel, Manuel Mazzara and Bertrand Meyer (eds.), Software Engineering Aspects of Continuous Development and New Paradigms of Software Production and Deployment, First International Workshop, DEVOPS 2018, Chateau de Villebrumier, France, March 5-6, 2018, Revised Selected Papers, see here..

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

AI+ML+SE — Reminder about LASER school, coming up in June

A reminder about this year’s LASER school, taking place in Elba, Italy, June 1 to 9. The theme is

               AI + ML + SE

and the speakers:

  • Shai Ben-David, University of Waterloo
  • Lionel C. Briand, University of Luxembourg
  • Pascal Fua, EPFL
  • Eric Meijer, Facebook
  • Tim Menzies, NC State University
  • Me

Details at https://www.laser-foundation.org/school/.  From that page:

The 15th edition of the prestigious LASER summer school, in the first week of June 2019, will be devoted to the complementarity and confluence of three major areas of innovation in IT: Artificial Intelligence, Machine Learning and of course Software Engineering.

The school takes place in the outstanding environment of the Hotel del Golfo in Procchio, Elba, off the coast of Tuscany.

 

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Sense and sensibility of systematically soliciting speaker slides

There is a fateful ritual to keynote invitations. The first message reads (I am paraphrasing): “Respected peerless luminary of this millennium and the next, Will your excellency ever forgive me for the audacity of asking if you would deign to leave for a short interlude the blessed abodes that habitually beget your immortal insights, and (how could I summon the gall of even forming the thought!) consider the remote possibility of honoring our humble gathering with your august presence, condescending upon that historic occasion to partake of some minute fragment of your infinite wisdom with our undeserving attendees?”. The subsequent email, a few months later, is more like: “Hi Bertrand, looks like we don’t have your slides yet, the conference is coming up and the deadline was last week, did I miss anything?”. Next: “Hey you! Do you ever read your email? Don’t try the old line about your spam filter. Where are your slides? Our sponsors are threatening to withdraw funding. People are cancelling their registrations. Alexandria Ocasio-Cortez is tweeting about the conference. Send the damn stuff!” Last: “Scum, listen. We have athletic friends and they know where you live. The time for sending your PDF is NOW.

Actually I don’t have my slides. I will have them all right for the talk. Maybe five minutes before, if you insist nicely. The talk will be bad or it will be less bad, but the slides are meant for the talk, they are not the talk. Even if I were not an inveterate procrastinator, rising at five on the day of the presentation to write them, I would still like to attend the talks before mine and refer to them. And why in the world would I let you circulate my slides in advance and steal my thunder?

This whole business of slides has become bizarre, a confusion of means and ends. Cicero did not use slides. Lincoln did not have PowerPoint. Their words still struck.

We have become hooked to slides. We pretend that they help the talk but that is a blatant lie except for the maybe 0.1% of speakers who use slides wisely. Slides are not for the benefit of the audience (are you joking? What slide user cares about the audience? Hahaha) but for the sole, exclusive and utterly selfish benefit of the speaker.

Good old notes (“cheat sheets”) would be more effective. Or writing down your speech and reading it from the lectern as historians still do (so we are told) at their conferences.

If we use slides at all, we should reserve them for illustration: to display a photograph, a graph, a table. The way politicians and police chiefs do when they bring a big chart to support their point a press conference.

I am like everyone else and still use slides as crutches. It’s so tempting. But don’t ask me to provide them in advance.

What about afterwards? It depends. Writing down the talk in the form of a paper is better. If you do not have the time, the text of the slides can serve as a simplified record. But only if the speaker wants to spread them that way. There is no rule  that slides should be published. If you see your slides as an ephemeral artifact to support an evanescent event (a speech at a conference) and wish them to remain only in the memory of those inspired enough to have attended it, that is your privilege.

Soliciting speaker slides: somewhat sane, slightly strange, or simply silly?

VN:F [1.9.10_1130]
Rating: 8.7/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 5 votes)

La folie française

Nulle part, dans la cohue des exégèses du mouvement des « gilets jaunes », ne trouve-t-on l’explication pourtant évidente : c’est pour partie une affaire de droit commun et pour le reste un coup de main proto-fasciste. Rien d’autre.

L’aspect le plus clair est celui de la délinquance. Dans quel autre pays civilisé des énergumènes se mettent-ils, pour clamer leurs frustrations, à opprimer leurs concitoyens en paralysant la société par la violence ? Dans un seul. La France. Et c’est en France seulement que l’on ne trouve rien à redire. En France et dans tous les pays du monde, si vous bloquez l’entrée d’un rond-point avec votre voiture, les gendarmes arrivent et vous emmènent au poste. En France seulement, si vous faites la même chose avec trente de vos acolytes, tout le monde compatit, le préfet vient repectueusement palabrer avec vous, et Le Monde convoque un professeur de sociologie pour expliquer combien vous avez raison de souffrir du mépris des élites. Absurde et inouï. Si le gouvernement Macron a fait une erreur, c’est celle-là : au premier péage bloqué, au premier radar neutralisé, il fallait dans les dix minutes coffrer les délinquants et les déférer à la justice – quitte à elle, dans la meilleure tradition d’un pays démocratique, de les juger sans passion en écoutant leurs doléances. Mais se plier à la morgue de ces gens qui utilisent la force pour empêcher les autres de vivre leur vie et d’assurer leur subsistance ? La suite était à prévoir : l’illégalité étant officiellement sanctionnée, tout ce qu’un pays compte d’extrémistes de gauche et de droite, et de simples malfrats ravis de casser et de piller, s’engouffre dans la brèche. Mais c’est hypocrite de regretter les malheureux débordements. Avant même l’entrée des casseurs professionnels, la violence était dès la première heure la définition même du mouvement. Il ne s’agissait pas de plaintes, de pétitions, de manifestations ; il s’agissait de saboter le fonctionnement le plus élémentaire d’une société civilisée. D’empêcher les citoyens de circuler et de travailler. Dans tout autre pays les voyous se retrouvaient immédiatement en prison. En France, on les invite à la télévision.

L’illégalité de droit commun n’est que le début. L’idéologie et surtout la pratique de ces gens rappellent de plus en plus le fascisme. Le fascisme est, pour une large part, le triomphe de la force brute sur la légalité: la prise de pouvoir d’une minorité par la violence, et l’imposition par la violence de ses valeurs au reste de la société. Les 250 000 bloqueurs du premier samedi représentaient moins d’un pour cent de la population adulte. De quel droit s’arrogent-ils l’autorité de décider qui passe et qui ne passe pas ? De tabasser un jeune homme et sa compagne, partis pour le cinéma, parce qu’ils refusent de klaxonner leur approbation ? C’est pour ne pas avoir arrêté dans l’œuf ce genre d’action brutale et illégale que l’Allemagne, l’Italie, l’Espagne, le Portugal et d’autres se sont retrouvés dans les années trente sous le joug de dictatures sanguinaires. Le semblant bonhomme et sincère de certains bloqueurs de ronds-points ne peut faire illusion. Il ne s’agit ni plus ni moins que de l’attaque de la force brute. Celle qui ne s’embarrasse pas d’arguments et qui se contente de vous asséner : vous ferez ce que vous dis, car je suis fort, vous êtes faible, et vous êtes en mon pouvoir.

Et leurs revendications ? Tous les conservatismes, tous les refus de raisonner, tout le fiel des envieux s’y retrouvent. Le mouvement, on ne le dira jamais assez, est d’abord celui des chauffards. Qui fréquente la France des provinces sait quelle haine a suscitée l’une des réformes précédentes, la limitation à 80 km/h. La raison était pourtant simple : les ingénieurs ont calculé qu’on pouvait sauver 300 vies par an de cette façon. Les chauffards — qui fréquente la France des routes départementales les connaît bien — n’accordent aucune attention à cet objectif de salut public : non, prétendent-ils, ce n’est qu’un prétexte pour nous ponctionner un peu plus. D’ailleurs l’ire des chauffards, des gilets jaunes, se concentre sur tout ce qui améliore la sécurité routière, comme les radars. La hausse des taxes sur les carburants n’est que le prétexte suivant pour se mettre en colère, prétexte d’autant plus absurde que cette hausse survient à un moment où les prix de base chutent. Quant à la transition énergétique, personne n’y prête attention non plus. Là aussi pourtant, les scientifiques s’époumonent à nous avertir : il est minuit moins une pour faire quelque chose, sinon le monde court à la catastrophe ; accidents climatiques constants, îles englouties, migrations cette fois-ci par dizaines de millions. Vous pourrez bien bloquer les ronds-points alors. Mais non, ce sont encore ces technocrates de Paris qui veulent nous prendre notre argent.

Le problème politique est profondément et exclusivement français. Les Français sont uniques, y compris parmi leurs voisins d’Europe occidentale. L’exception française a ses attraits : le goût, la tradition, l’élégance (pas chez les gilets jaunes), l’amour pour une langue d’une beauté sans égale. Mais elle se manifeste aussi par des défauts indéracinables. Dans tous les pays du monde, le citoyen moyen comprend que pour que quelqu’un reçoive de l’argent quelqu’un doit en produire. L’état c’est moi, et toi, et elle, et lui. Pas en France (et l’on peut avoir fait Polytechnique sans que jamais on vous ait expliqué ce qui ailleurs relève de l’école communale). En France « L’État » c’est quelqu’un d’autre. Il nous prend notre argent, toujours trop, et il est tenu de nous en donner, jamais assez. Il est de bon ton de se moquer des Américains qui croient que le monde a été créé tel quel en six jours, mais les Américains, jusqu’au moins instruit, comprennent les rudiments de l’économie. Les Français non. D’où les revendications conjointes de moins d’impôts et de plus d’aides. On ne peut sous-estimer ici l’influence de la gauche à la française. Cent ans de gauchisme primaire ont profondément corrompu le conscient et l’inconscient collectifs. Les patrons sont des exploiteurs, les salariés des exploités, révoltez-vous !

La deuxième catastrophe va avec la première : l’incompréhension des règles de la démocratie et l’imputation au gouvernement en place (dans le cas présent, en place depuis à peine un an et demi) de tout ce qui va mal. Gavroche le chantait déjà : Je suis tombé par terre / C’est la faute à Voltaire / Le nez dans le ruisseau / C’est la faute à Rousseau. Il ajouterait aujourd’hui :

J’en ai pris plein le front
C’est la faute à Macron

La démocratie, comprise à la française, ce sont tous les privilèges et aucun devoir. C’est le droit inaliénable de la minorité à se venger de son sort sur les innocents. Titre du Monde : « En occupant le rond-point de Gaillon, dans l’Eure, des manifestants forgent leur conscience politique et s’exercent à la démocratie ». Remarquable. On imagine les variantes : « En tirant au bazooka sur mes voisins, je m’exerce au pacifisme ». « En volant des voitures, je m’exerce au civisme ». « En trichant à l’examen, je m’exerce à l’honnêteté ». Invraisembable inversion des valeurs : l’arbitraire et le règne de la force brute érigés en morale.

La troisième catastrophe française est le recours immédiat et constant au sabotage et à la violence. Qui vient régulièrement en France de l’étranger est habitué au phénomène, que l’on pourrait appeler, si c’était drôle, le syndrome des Galeries Lafayette : il se passe toujours quelque chose. Parfois tragiquement venu de l’extérieur, comme dans le cas du terrorisme. Mais le plus souvent interne : grève du rail, grève d’Air France, grève des contrôleurs aériens, manifestation violente, incendie de voitures (dans quel autre pays le nouvel an signifie-t-il qu’on brûle chaque année des centaines de voitures ?), blocage de l’approvisionnement en essence, grève des « intermittents du spectacle » (parce qu’on ne les paye pas assez quand ils ne travaillent pas). Résultat : dans tous les pays voisins, on peut tranquillement planifier un voyage ; en France c’est impossible, on ne sait jamais ce qui va se produire. La violence en particulier est indigne d’un pays démocratique. De ce point de vue les gilets jaunes et leurs coups de main fascisants ne font que suivre une tradition ininterrompue, et largement impunie par crainte des conséquences (toujours le règne de la force) : séquestration de patrons, occupation illégale des universités avec dégradations en millions d’euros et représailles physiques contre ceux qui osent essayer de passer leurs examens, tabassage des responsables des relations humaines d’Air France par des syndicats de type quasi-mafieux. Au-delà de la violence, le dérèglement continuel est la source principale du retard français. La France est aujourd’hui le seul pays d’Europe où les vendeurs par correspondance ont cessé de garantir des dates de livraison, pour cause de troubles. Comment accepter une situation aussi humiliante ? Si les Suisses, les Allemands et d’autres réussissent tellement mieux, ce n’est pas qu’ils soient particulièrement plus intelligents. (D’intelligence et de créativité, la France n’en manque pas, du reste elle en exporte de plus en plus, comme elle exporta ses Huguenots après 1685.) C’est tout simplement qu’ils travaillent dans un environnement stable.

Le résultat récent le plus clair et le plus tragique est l’échec de ce qui aurait pu être une chance majeure pour la France : la récupération de l’industrie financière britannique pulvérisée par l’imbécile Brexit. Paris avait tous les atouts : la magie de la ville (vous iriez vivre à Francfort, vous, si vous aviez le choix ?), un gouvernement jeune et dynamique. Mais les banquiers ne sont pas fous. La banque a besoin de calme et de stabilité. Pas d’occupations, de grèves, de blocages, de déprédations et d’émeutes. Partie perdue, irrémédiablement.

Les destructions ne sont pas des débordements du mouvement : elles sont le mouvement. Dès le début, dès le premier automobiliste empêché de se rendre à son travail, il ne s’agissait pas de protester : il s’agissait de casser l’activité économique. Déjà les commerçants, pour qui novembre et décembre sont les mois clés, annoncent la pire saison depuis des années (et demandent bien sûr des dédommagements à l’État, c’est-à-dire une ponction supplémentaire). Une conspiration au seul bénéfice d’Amazon ne s’y serait pas prise autrement. Qui ne peut voir qu’il ne s’agit en aucun cas d’une protestation politique respectueuse de la démocratie, mais purement et simplement d’une tentative de destruction du pays ?

On s’arrêtera à la quatrième catastrophe française : la démission des clercs. C’est toujours très bien vu en France de s’enthousiasmer pour des idéologies rutilantes et généralement meurtrières. C’est très, très mal vu de soutenir le pouvoir, même quand il représente la raison, le droit et l’avenir. Mais où sont donc les fameuses élites (celles contre qui, selon les poncifs, le peuple est censé se révolter) ? Elles sont occupées à trouver des excuses aux vandales. Le Monde, auto-proclamé « journal de référence » (traduction : le New York Times sans les prix Nobel et sans les correcteurs d’orthographe), a passé tout l’été sur un scandale qu’il avait monté de toutes pièces, et sacrifie quotidiennement la vérité à une espèce de bonne conscience gauchisante sans aucun souci de l’avenir du pays. Le Figaro, au lieu de rallier la bourgeoisie au seul garant possible de l’ordre, se perd en élucubrations identitaires. Libération se croit toujours en Mai 68 et ne suit plus très bien ce qui se passe. Le Canard Enchaîné, vestige de la presse à chantage des années trente, dont on ne saurait sous-estimer dans le paysage français la puissance ricanante, méprisante, délétère et invincible, propage un peu plus chaque mercredi (entre ses contrepèteries obscènes) l’image du « tous pourris ». Pour soutenir Macron, personne.

Les élites devraient pourtant se rallier en masse ; non que Macron et Philippe soient infaillibles (ils ont fait des erreurs et ils en referont) mais tout simplement parce que dans la situation politique française actuelle ils sont le seul espoir crédible d’éviter le désastre. Le désastre, c’est la tiers-mondisation accélérée, l’écroulement de l’économie et le glissement vers le totalitarisme. D’un côté, un démagogue avide de pouvoir, suppôt de toutes les dictatures, admirateur de Chavez et de Maduro (qui en quelques années ont fait d’un des pays les plus stables de l’Amérique Latine, producteur de pétrole de surcroît, un abîme de pauvreté où les enfants meurent faute de médicaments et l’inflation mensuelle est à 94%, et qui serait pour nous le modèle ?) ; de l’autre, une extrémiste incompétente, issue d’un clan familial corrompu qui n’a jamais complètement renoncé à ses sources idéologiques des années trente. Macron est jeune, intelligent, compétent, calme et veut réformer la France là où elle en a le plus besoin, pour le bénéfice même de ceux qui n’ont rien trouvé de mieux pour progresser que de faire du chantage au reste du pays. Il a été démocratiquement élu, par une majorité sans ambages. La simple éthique démocratique appelle à le laisser faire son travail. Le simple souci du salut public appelle à le soutenir.

Tous ceux qui croient en la démocratie ; qui ont confiance dans l’énorme potentiel de la France ; qui savent qu’il faut en finir avec les lourdeurs et incongruités qui la paralysent ; qui perçoivent le risque énorme de totalitarisme ; et qui refusent que la violence d’une minorité l’emporte sur l’état de droit ; tous ceux-là doivent mettre au vestiaire le cynisme et l’éternel moquerie française pour s’engager publiquement et sans réserve, sans complaisance mais sans états d’âme, derrière l’unique force qui peut éviter la descente aux enfers.

VN:F [1.9.10_1130]
Rating: 4.8/10 (23 votes cast)
VN:F [1.9.10_1130]
Rating: -10 (from 18 votes)

The Formal Picnic approach to requirements

picnicRequirements engineering (studying and documenting what a software system should do, independently of how it will do it) took some time to be recognized as a key part of software engineering, since the early focus was, understandably, on programming. It is today a recognized sub-discipline and has benefited in the last decades from many seminal concepts. An early paper of mine, On Formalism in Specifications [1], came at the beginning of this evolution; it made the case for using formal (mathematics-based) approaches. One of the reasons it attracted attention is its analysis of the “seven sins of the specifier”: a list of pitfalls into which authors of specifications and requirements commonly fall.

One of the techniques presented in the paper has not made it into the standard requirements-enginering bag of tricks. I think it deserves to be known, hence the present note. There really will not be anything here that is not in the original article; in fact I will be so lazy as to reuse its example. (Current requirements research with colleagues should lead to the publication of new examples.)

Maybe the reason the idea did not register is that I did not give it a name. So here goes: formal picnic.

The usual software engineering curriculum includes, regrettably, no room for  field trips. We are jealous of students and teachers of geology or zoology and their occasional excursions: once in a while you put on your boots, harness your backpack, and head out to quarries or grasslands to watch pebbles or critters in flagrante, after a long walk with the other boys and girls and before all having lunch together in the wild. Yes, scientific life in these disciplines really is a picnic. What I propose for the requirements process is a similar excursion; not into muddy fields, but into the dry pastures of mathematics.

The mathematical picnic process starts with a natural-language requirements document. It continues, for some part of the requirements, with a translation into a mathematical version. It terminates with a return trip into natural language.

The formal approach to requirements, based on mathematical notations (as was discussed in my paper), is still controversial; a common objection is that requirements must be understandable by ordinary project stakeholders, many of whom do not have advanced mathematical skills. I am not entering this debate here, but there can be little doubt that delicate system properties can be a useful step, if only for the requirements engineers themselves. Mathematical notation forces precision.

What, then, if we want to end up with natural language for clarity, but also to take advantage of the precision of mathematics? The formal picnic answer is that we can use mathematics as a tool to improve the requirements. The three steps are:

  • Start: a natural-language requirements document. Typically too vague and deficient in other ways (the seven sins) to serve as an adequate basis for the rest of the software process, as a good requirements document should.
  • Picnic: an excursion into mathematics. One of the main purposes of a requirements process is to raise and answer key questions about the system’s properties. Using mathematics helps raise the right questions and obtain precise answers. You do not need to apply the mathematical picnic to the entire system: even if the overall specification remains informal, some particularly delicate aspects may benefit from a more rigorous analysis.
  • Return trip: thinking of the non-formalist stakeholders back home, we translate the mathematical descriptions into a new natural-language version.

This final version is still in (say) English, but typically not the kind of English that most people naturally write. It may in fact “sound funny”. That is because it is really just mathematical formulae translated back into English. It retains the precision and objectivity of mathematics, but is expressed in terms that anyone can understand.

Let me illustrate the mathematical picnic idea with the example from my article. For reasons that do not need to be repeated here (they are in the original), it discussed a very elementary problem of text processing: splitting a text across lines. The original statement of the problem, from a paper by Peter Naur, read:

Given a text consisting of words separated by BLANKS or by NL (new line) characters, convert it to a line-by-line form in accordance with the following rules: (1) line breaks must be made only where the given text has BLANK or NL; (2) each line is filled as far as possible as long as  (3) no line will contain more than MAXPOS characters.

My article then cited an alternative specification proposed in a paper by testing experts John Goodenough and Susan Gerhart. G&G criticized Naur’s work (part of the still relevant debate between proponents of tests and proponents of proofs such as Naur). They pointed out deficiencies in his simple problem statement above; for example, it says nothing about the case of a text containing a word of more than MAXPOS characters. G&G stated that the issue was largely one of specification (requirements) and went on to propose a new problem description, four times as long as Naur’s. In my own article, I had a field day taking aim at their own endeavor. (Sometime later I met Susan Gerhart, who was incredibly gracious about my critique of her work, and became an esteemed colleague.) I am not going to cite the G&G replacement specification here; you can find it in my article.

Since that article’s topic was formal approaches, it provided a mathematical statement of Naur’s problem. It noted that  the benefit of mathematical formalization is not just to gain precision but also to identify important questions about the problem, with a view to rooting out dangerous potential bugs. Mathematics means not just formalization but proofs. If you formalize the Naur problem, you soon realize that — as originally posed — it does not always have a solution (because of over-MAXPOS words). The process forces you to specify the conditions under which solutions do exist. This is one of the software engineering benefits of a mathematical formalization effort: if such conditions are not identified at the requirements level, they will take their revenge in the program, in the form of erroneous results and crashes.

You can find the mathematical specification (only one of several possibilities) in the article.  The discussion also noted that one could start again from that spec and go back to English. That was, without the name, the mathematical picnic. The result’s length is in-between the other two versions: twice Naur’s, but half G&G’s. Here it is:

Given are a non-negative integer MAXPOS and a character set including two “break characters” blank and newline. The program shall accept as input a finite sequence of characters and produce as output a sequence of characters satisfying the following conditions:
• It only differs from the input by having a single break character wherever the input has one or more break characters;
• Any MAXPOS + 1 consecutive characters include a newline;
• The number of newline characters is minimal.
If (and only if) an input sequence contains a group of MAXPOS + 1 consecutive nonbreak characters, there exists no such output. In this case, the program shall produce the output associated with the initial part of the sequence, up to and including the MAXPOS·th character of the first such group, and report the error.

This post-picnic version is the result of a quasi-mechanical retranscription from the mathematical specification in the paper.

It uses the kind of English that one gets after a mathematical excursion. I wrote above that this style might sound funny; not to me in fact, because I am used to mathematical picnics, but probably to others (does it sound funny to you?).

The picnic technique provides a good combination of the precision of mathematics and the readability of English. English requirements as ordinarily written are subject to the seven sins described in my article, from ambiguity and contradiction to overspecification and noise. A formalization effort can correct these issues, but yields a mathematical text. Whether we like it or not, many people react negatively to such texts. We might wish they learn, but that is often not an option, and if they are important stakeholders we need their endorsement or correction of the requirements. With a mathematical picnic we translate the formal text back into something they will understand, while avoiding the worst problems of natural-language specifications.

Practicing the Formal Picnic method also has a long-term benefit for a software team. Having seen first-hand that better natural-language specifications (noise-free and more precise) are possible, team members little by little learn to apply the same style to the English texts they write, even without a mathematical detour.

If the goal is high-quality requirements, is there any alternative? What I have seen in many requirements documents is a fearful attempt to avoid ambiguity and imprecision by leaving no stone unturned: adding information and redundancy over and again. This was very much what I criticized in the G&G statement of requirements, which attempted to correct the deficiencies of the Naur text by throwing ever-more details that caused ever more risks of entanglement. It is fascinating to see how every explanation added in the hope of filling a possible gap creates more sources of potential confusion and a need for even more explanations. In industrial projects, this is the process that leads to thousands-of-pages documents, so formidable that they end up (as in the famous Ariane-5 case) on a shelf where no one will consult them when they would provide critical answers.

Mathematical specifications yield the precision and uncover the contradictions, but they also avoid noise and remain terse. Translating them back into English yields a reasonable tradeoff. Try a formal picnic one of these days.

Acknowledgments

For numerous recent discussions of these and many other related topics, I am grateful to my colleagues from the Innopolis-Toulouse requirements research group: Jean-Michel Bruel, Sophie Ebersold, Florian Galinier, Manuel Mazzara and Alexander Naumchev. I remain grateful to Axel van Lamsweerde (beyond his own seminal contributions to requirements engineering) for telling me, six years after I published a version of [1] in French, that I should take the time to produce a version in English too.

Reference

Bertrand Meyer: On Formalism in Specifications, in IEEE Software, vol. 3, no. 1, January 1985, pages 6-25. PDF available via IEEE Xplore with account, and also from here. Adapted translation of an original article in French (AFCET Software Engineering newsletter, no. 1, pages 81-122, 1979).

(This article was originally published on the Comm. ACMM blog.)

VN:F [1.9.10_1130]
Rating: 10.0/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Ten traits of exceptional innovators

Imagine having had coffee, over the years, with each of Euclid, Galileo, Descartes, Marie Curie, Newton, Einstein, Lise Leitner, Planck and de Broglie. For a computer scientist, if we set aside the founding generation (the Turings and von Neumanns), the equivalent is possible. I have had the privilege of meeting and in some cases closely interacting with pioneer scientists, technologists and entrepreneurs, including Nobel, Fields and Turing winners, Silicon-Valley-type founders and such. It is only fair that I should share some of the traits I have observed in them.

Clarification and disclaimer:

  • This discussion is abstract and as a result probably boring because I am not citing anyone by name (apart from a few famous figures, most of whom are dead and none of whom I have met). It would be more concrete and lively if I buttressed my generalities by actual examples, of which I have many. The absence of any name-dropping is a matter of courtesy and respect for people who have interacted with me unguardedly as a colleague, not a journalist preparing a tell-all book. I could of course cite the names for positive anecdotes only, but that would bias the story (see point 4). So, sorry, no names (and I won’t relent even if you ask me privately — mumm like a fish).
  • I am looking at truly exceptional people. They are drawn from a more general pool of brilliant, successful scientists and technologists, of which they form only a small subset. Many of their traits also apply to this more general community and to highly successful people in any profession. What interests me is the extra step from brilliant to exceptional. It would not be that difficult to identify fifty outstanding mathematics researchers in, say, 1900, and analyze their psychological traits. The question is: why are some of them Hilbert and Poincaré, and others not?
  • Of course I do not even begin to answer that question. I only offer a few personal remarks.
  • More generally, cargo cult does not work. Emulating every one of the traits listed below will not get you a Nobel prize. You will not turn into a great composer by eating lots of Tournedos Rossini. (Well, you might start looking like the aging Rossini.) This note presents some evidence; it does not present any conclusion, let alone advice. Any consequence is for you to draw, or not.
  • The traits obviously do not universally characterize the population observed. Not all of the people exhibit all of the traits. On the other hand, my impression is that most exhibit most.

1 Idiosyncratic

“Idiosyncratic” is a high-sounding synonym for “diverse,” used here to deflect the ridicule of starting a list of what is common to those people by stating that they are different from each other. The point is important, though, and reassuring. Those people come in all stripes, from the stuffy professor to the sandals-shorts-and-Hawaiian-shirt surfer.  Their ethnic backgrounds vary. And (glad you asked) some are men and some are women.

Consideration of many personality and lifestyle features yields no pattern at all. Some of the people observed are courteous, a delight to deal with, but there are a few jerks too. Some are voluble, some reserved. Some boastful, some modest. Some remain for their full life married to the same person, some have been divorced many times, some are single. Some become CEOs and university presidents, others prefer the quieter life of a pure researcher. Some covet honors, others are mostly driven by the pursuit of knowledge. Some wanted to become very rich and did, others care little about money.  It is amazing to see how many traits appear irrelevant, perhaps reinforcing the value of those that do make a difference.

2 Lucky

In trying to apply a cargo-cult-like recipe, this one would be the hardest to emulate. We all know that Fleming came across penicillin thanks to a petri dish left uncleaned on the window sill; we also know that luck favors only the well-prepared: someone other than Fleming would have grumbled at the dirtiness of the place and thrown the dish into the sink. But I am not just talking about that kind of luck. You have to be at the right place at the right time.

Read the biographies, and you will see that almost always the person happened to study with a professor who just then was struggling with a new problem, or did an internship in a group that had just invented a novel technique, or heard about recent results before everyone else did.

Part of what comes under “luck” is luck in obtaining the right education. Sure, there are a few autodidacts, but most of the top achievers studied in excellent institutions.

Success comes from a combination of nature and nurture. The perfect environment, such as a thriving laboratory or world-class research university, is not enough; but neither is individual brilliance. In most cases it is their combination that produces the catalysis.

3 Smart

Laugh again if you wish, but I do not just mean the obvious observation that those people were clever in what they did. In my experience they are extremely intelligent in other ways too. They often possess deep knowledge beyond their specialties and have interesting conversations.

You approach them because of the fame they gained in one domain, and learn from them about topics far beyond it.

4 Human

At first, the title of this section is another cause for ridicule: what did you expect, extraterrestrials? But “human” here means human in their foibles too. You might expect, if not an extraterrestrial, someone of the oracle-of-Delphi or wizard-on-a-mountain type, who after a half-hour of silence makes a single statement perfect in its concision and exactitude.

Well, no. They are smart, but they say foolish things too. And wrong things. Not only do they say them, they even publish them. (Newton wasted his brilliance on alchemy. Voltaire — who was not a scientist but helped promote science, translating Newton and supporting the work of Madame du Châtelet — wasted his powerful wit to mock the nascent study of paleontology: so-called fossils are just shells left over by picnicking tourists! More recently, a very famous computer scientist wrote a very silly book — of which I once wrote, fearlessly, a very short and very disparaging review.)

So what? It is the conclusion of the discussion that counts, not the meanderous path to it, or the occasional hapless excursion into a field where your wisdom fails you. Once you have succeeded, no one will care how many wrong comments you made in the process.

It is fair to note that the people under consideration probably say fewer stupid things than most. (The Erich Kästner ditty from an earlier article applies.) But no human, reassuringly perhaps, is right 100% of the time.

What does set them apart from many people, and takes us back to the previous trait (smart), is that even those who are otherwise vain have no qualms recognizing  mistakes in their previous thinking. They accept the evidence and move on.

5 Diligent

Of two people, one an excellent, top-ranked academic, the other a world-famous pioneer, who is the more likely to answer an email? In my experience, the latter.

Beyond the folk vision of the disheveled, disorganized, absent-minded professor lies the reality of a lifetime of rigor and discipline.

This should not be a surprise. There is inspiration, and there is perspiration.  Think of it as the dual of the  broken-windows theory, or of the judicial view that a defendant who lies in small things probably lies in big things: the other way around, if you do huge tasks well, you probably do small tasks well too.

6 Focused

Along with diligence comes focus, carried over from big matters to small matters. It is the lesser minds that pretend to multiplex. Great scientists, in my experience, do not hack away at their laptops during talks, and they turn off their cellphones. They choose carefully what they do (they are deluged with requests and learn early to say no), but what they accept to do they do. Seriously, attentively, with focus.

A fascinating spectacle is a world-famous guru sitting in the first row at a conference presentation by a beginning Ph.D. student, and taking detailed notes. Or visiting an industrial lab and quizzing a junior engineer about the details of the latest technology.

For someone who in spite of the cargo cult risk is looking for one behavior to clone, this would be it. Study after study has shown that we only delude ourselves in thinking we can multiplex. Top performers understand this. In the seminar room, they are not the ones doing email. If they are there at all, then watch and listen.

7 Eloquent

Top science and technology achievers are communicators. In writing, in speaking, often in both.

This quality is independent from their personal behavior, which can cover the full range from shy to boisterous.  It is the quality of being articulate. They know how to convey their results — and often do not mind crossing the line to self-advertising. It is not automatically the case that true value will out: even the most impressive advances need to be pushed to the world.

The alternative is to become Gregor Mendel: he single-handedly discovered the laws of genetics, and was so busy observing the beans in his garden that no one heard about his work until some twenty years after his death. Most of us prefer to get the recognition earlier. (Mendel was a monk, so maybe he believed in an afterlife; yet again maybe he, like everyone else, might have enjoyed attracting interest in this world first.)

In computer science it is not surprising that many of the names that stand out are of people who have written seminal books that are a pleasure to read. Many of them are outstanding teachers and speakers as well.

8 Open

Being an excellent communicator does not mean that you insist on talking. The great innovators are excellent listeners too.

Some people keep talking about themselves. They exist in all human groups, but this particular trait is common among scientists, particularly junior scientists, who corner you and cannot stop telling you about their ideas and accomplishments. That phenomenon is understandable, and in part justified by an urge to avoid the Mendel syndrome. But in a conversation involving some less and some more recognized professionals it is often the most accomplished members of the group who talk least. They are eager to learn. They never forget that the greatest insighs can start with a casual observation from an improbable source. They know when to talk, and when to shut up and listen.

Openness also means intellectual curiosity, willingness to have your intellectual certainties challenged, focus on the merit of a comment rather than the commenter’s social or academic status, and readiness to learn from disciplines other than your own.

9 Selfish

People having achieved exceptional results were generally obsessed with the chase and the prey. They are as driven as an icebreaker ship in the Sea of Barents. They have to get through; the end justifies the means; anything in the way is collateral damage.

So it is not surprising, in the case of academics, to hear colleagues from their institutions mumble that X never wanted to do his share, leaving it to others to sit in committees, teach C++ to biology majors and take their turn as department chair. There are notable exceptions, such as the computer architecture pioneer who became provost then president at Stanford before receiving the Turing Award. But  you do not achieve breakthroughs by doing what everything else is doing. When the rest of the crowd is being sociable and chatty at the conference party long into the night, they go back to their hotel to be alert for tomorrow’s session. A famous if extreme case is Andrew Wiles, whom colleagues in the department considered a has-been, while he was doing the minimum necessary to avoid trouble while working secretly and obsessively to prove Fermat’s last theorem.

This trait is interesting in light of the soothing discourse in vogue today. Nothing wrong with work-life balance, escaping the rat race, perhaps even changing your research topic every decade (apparently the rule in some research organizations). Sometimes a hands-off, zen-like attitude will succeed where too much obstination would get stuck. But let us not fool ourselves: the great innovators never let go of the target.

10. Generous

Yes, selfishness can go with generosity. You obsess over your goals, but it does not mean you forget other people.

Indeed, while there are a few solo artists in the group under observation, a striking feature of the majority is that in addition to their own achievements they led to the creation of entire communities, which often look up to them as gurus. (When I took the comprehensive exam at Stanford, the first question was what the middle initial “E.” of a famous professor stood for. It was a joke question, counting for maybe one point out of a hundred, helpfully meant to defuse students’ tension in preparation for the hard questions that followed. But what I remember is that every fellow student whom I asked afterwards knew the answer. Me too. Such was the personality cult.) The guru effect can lead to funny consequences, as with the famous computer scientist whose disciples you could spot right away in conferences by their sandals and beards (I do not remember how the women coped), carefully patterned after the master’s.

The leader is often good at giving every member of that community flattering personal attention. In a retirement symposium for a famous professor, almost every person I talked too was proud of having developed a long-running, highly personal and of course unique relationship with the honoree. One prestigious computer scientist who died in the 80’s encouraged and supported countless young people in his country; 30 years later, you keep running into academics, engineers and managers who tell you that they owe their career to him.

Some of this community-building can be self-serving and part of a personal strategy for success. There has to be more to it, however. It is not just that community-building will occur naturally as people discover the new ideas: since these ideas are often controversial at first, those who understood their value early band together to defend them and support their inventor. But there is something else as well in my observation: the creators’ sheer, disinterested generosity.

These people are passionate in their quest for discovery and creation and genuinely want to help others. Driven and self-promoting they may be, but the very qualities that led to their achievements — insight, intellectual courage, ability to think beyond accepted ideas — are at the antipodes of pettiness and narrow-mindedness. A world leader cannot expect any significant personal gain from spotting and encouraging a promising undergraduate, telling a first-time conference presenter that her idea is great and worth pushing further, patiently explaining elementary issues to a beginning student, or responding to a unknown correspondent’s emails. And still, as I have observed many times, they do all of this and more, because they are in the business of advancing knowledge.

These are some of the traits I have observed. Maybe there are more but, sorry, I have to go now. The pan is sizzling and I don’t like my tournedos too well-done.

recycled-logo (Originally published on CACM blog.)

VN:F [1.9.10_1130]
Rating: 9.3/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

I didn’t make it up…

An article published here a few years ago, reproducing a note I wrote much earlier (1992), pointed out that conventional wisdom about the history of software engineering, cited in every textbook, is inaccurate: the term “software engineering” was in use before the famous 1968 Garmisch-Partenkichen conference. See that article for details.

Recently a colleague wanted to cite my observation but could not find my source, a 1966 Communications of the ACM article using the term. Indeed that text is not currently part of the digitalized ACM archive (Digital Library). But I knew it was not a figment of my imagination or of a bad memory.

The reference given in my note is indeed correct; with the help of the ETH library, I was able to get a scan of the original printed article. It is available here.

The text is not a regular CACM article but a president’s letter, part of the magazine’s front matter, which the digital record does not always include. In this case historical interest suggests it should; I have asked the ACM to add it. In the meantime, you can read the scanned version for  a nostalgic peek into what the profession found interesting half a century ago.

Note (12 November 2018): The ACM Digital Library responded (in a matter of hours!) and added the letter to the digital archive.

VN:F [1.9.10_1130]
Rating: 8.4/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Just call me “Your Highness”

To buy a Zurich Opera ticket online you have to register on their site, including choosing a form of address from a list which beats any I have seen:

anrede

As a side detail, their system does not work: to enable registration and booking it is supposed to send you an email with your password, and it never did in spite of many attempts over several days. The only way to buy tickets turned out to be going to the theater and queuing at the box office. (There I was assured with extreme politeness that I am now registered on the site, but no, never got a password and cannot log in.) I agree with you, though: what a trivial quibble! You may never be able to book a performance, but by clicking the right menu entry you can call yourself Frau Stadträtin.

VN:F [1.9.10_1130]
Rating: 10.0/10 (5 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Empirical answers: keynote and deadline

The EAQSE workshop is devoted to Empirical Answers to Questions of Software Engineering. The deadline is looming. Just announced: Tom Zimmerman from Microsoft Research will deliver the keynote. Tom is uniquely qualified; he has performed unique studies of what people, particularly software engineering practitioners, expect from empirical research. See hereand there. Outstanding papers.

See also the list of candidate questions on the workshop site.

The background for the workshop is the observation, stated in two earlier articles in this blog (starting here), that empirical studies of software engineering have made tremendous progress over the past decades, and that it is important to move the focus to what is important for practicing software developers, not just what can be studied (the lamppost temptation).

As to the submissions: if anyone needs a little more time, write to the organizers, we should be able to accommodate a few more days (but only a few). The proceedings will be published (in LNCS) after the event,  so the focus for the meeting itself is on presentation of important ideas.

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

More French Poetry on Amazon

Involuntary poetry that is. This one is even more puzzling, in its own charming way, than the previously cited example.

From https://www.amazon.fr/dp/B072V71WVN/ref=psdc_3155122031_t4_B073PYSZNG:

Pour être une dame ou un monsieur : Bouteille de vin automatique ouverte sans effort avec ce tire-bouchon électrique. Gardez votre élégant ou votre gentleman pendant que vous ouvrez la bouteille de vin. Pas de problème. Tu es une dame mais tu es aussi un homme, Vous en aurez besoin et cela vous permettra de garder votre élégant ou monsieur.

Finalement j’ai préféré garder mon élégant.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Why not program right?

recycled-logo (Originally published on CACM blog.)

Most of the world programs in a very strange way. Strange to me. I usually hear the reverse question: people ask us, the Eiffel community, to explain why we program our way. I hardly understand the question, because the only mystery is how anyone can even program in any other way.

The natural reference is the beginning of One Flew Over the Cuckoo’s Nest: when entering an insane asylum and wondering who is an inmate and who a doctor, you may feel at a loss for objective criteria. Maybe the rest of the world is right and we are the nut cases. Common sense suggests it.

But sometimes one can go beyond common sense and examine the evidence. So lend me an ear while I explain my latest class invariant. Here it is, in Figure 1. (Wait, do not just run away yet.)

multigraph_invariant

Figure 1: From the invariant of class MULTIGRAPH

This is a program in progress and by the time you read this note the invariant and enclosing class will have changed. But the ideas will remain.

Context: multigraphs

The class is called MULTIGRAPH and describes a generalized notion of graph, illustrated in Figure 2. The differences are that: there can be more than one edge between two nodes, as long as they have different tags (like the spouse and boss edges between 1 and 2); and there can be more than one edge coming out of a given node and with a given tag (such as the two boss edges out of 1, reflecting that 1’s boss might be 2 in some cases and 3 in others). Some of the nodes, just 1 here, are “roots”.

The class implements the notion of multigraph and provides a wide range of operations on multigraphs.

multigraph_example

Figure 2: A multigraph

Data structures

Now we turn to the programming and software engineering aspects. I am playing with various ways of accessing multigraphs. For the basic representation of a multigraph, I have chosen a table of triples:

                triples_table: HASH_TABLE [TRIPLE, TUPLE [source: INTEGER; tag: INTEGER; target: INTEGER]]  — Table of triples, each retrievable through its `source’, `tag’ and `target’.

where the class TRIPLE describes [source, tag, target] triples, with a few other properties, so they are not just tuples. It is convenient to use a hash table, where the key is such a 3-tuple. (In an earlier version I used just an ARRAY [TRIPLE], but a hash table proved more flexible.)

Sources and targets are nodes, also called “objects”; we represent both objects and tags by integers for efficiency. It is easy to have structures that map symbolic tag names such as “boss” to integers.

triples_table is the core data structure but it turns out that for the many needed operations it is convenient to have others. This technique is standard: for efficiency, provide different structures to access and manipulate the same underlying information, with some redundancy. So I also have:

 triples_from:  ARRAYED_LIST [LIST [TRIPLE]]
               — Triples starting from a given object. Indexed by object numbers.

  triples_with:  HASH_TABLE [LIST [TRIPLE], INTEGER]
               — Triples labeled by a given tag. Key is tag number.

 triples_to:  ARRAYED_LIST [LIST [TRIPLE]]
               — Triples leading into a given object. Indexed by object numbers.

Figure 3 illustrates triples_from and Figures 4 illustrates triples_with. triples_to is similar.

triples_from

Figure 3: The triples_from array of lists and the triples_table

triples_with

Figure 4: The triples_with array of lists and the triples_table

It is also useful to access multigraphs through yet another structure, which gives us the targets associated with a given object and tag:

successors: ARRAY [HASH_TABLE [LIST [TRIPLE], INTEGER]]
               — successors [obj] [t] includes all o such that there is a t- reference from obj to o.

For example in Figure 1 successors [1] [spouse] is {2, 3}, and in Figures 3 and 4 successors [26] [t] is {22, 55, 57}. Of course we can obtain the “successors” information through the previously defined structures, but since this is a frequently needed operation I decided to include a specific data structure (implying that every operation modifying the multigraph must update it). I can change my mind later on and decide to make “successors” a function rather than a data structure; it is part of the beauty of OO programming, particularly in Eiffel, that such changes are smooth and hardly impact client classes.

There is similar redundancy in representing roots:

                roots:  LINKED_SET [INTEGER]
                              — Objects that are roots.

                is_root:  ARRAY [BOOLEAN]
                              — Which objects are roots? Indexed by object numbers.

If o is a root, then it appears in the “roots” set and is_root [o] has value True.

Getting things right

These are my data structures. Providing such a variety of access modes is a common programming technique. From a software engineering perspective ― specification, implementation, verification… ― it courts disaster. How do we maintain their consistency? It is very easy for a small mistake to slip into an operation modifying the graph, causing one of the data structures to be improperly updated, but in a subtle and rare enough way that it will not manifest itself during testing, coming back later to cause strange behavior that will be very hard to debug.

For example, one of the reasons I have a class TRIPLE and not just 3-tuples is that a triple is not exactly  the same as an edge in the multigraph. I have decided that by default the operation that removes and edge would not remove the corresponding triple from the data structure, but leave it in and mark it as “inoperative” (so class TRIPLE has an extra “is_inoperative” boolean field). There is an explicit GC-like mechanism to clean up deleted edges occasionally. This approach brings efficiency but makes the setup more delicate since we have to be extremely careful about what a triple means and what removal means.

This is where I stop understanding how the rest of the world can work at all. Without some rigorous tools I just do not see how one can get such things right. Well, sure, spend weeks of trying out test cases, printing out the structures, manually check everything (in the testing world this is known as writing lots of “oracles”), try at great pains to find out the reason for wrong results, guess what program change will fix the problem, and start again. Stop when things look OK. When, as Tony Hoare once wrote, there are no obvious errors left.

Setting aside the minuscule share of projects (typically in embedded life-critical systems) that use some kind of formal verification, this process is what everyone practices. One can only marvel that systems, including many successful ones, get produced at all. To take an analogy from another discipline, this does not compare to working like an electrical engineer. It amounts to working like an electrician.

For a short time I programmed like that too (one has to start somewhere, and programming methodology was not taught back then). I no longer could today. Continuing with the Hoare citation, the only acceptable situation is to stop when there are obviously no errors left.

How? Certainly not, in my case, by always being right the first time. I make mistakes like everyone else does. But I have the methodology and tools to avoid some, and, for those that do slip through, to spot and fix them quickly.

Help is available

First, the type system. Lots of inconsistencies, some small and some huge, which in an untyped language would only hit during execution, do not make it past compilation. We are not just talking here about using REAL instead of INTEGER. With a sophisticated type system involving multiple inheritance, genericity, information hiding and void safety, a compiler error message can reflect a tricky logical mistake. You are using a SET as if it were a LIST (some operations are common, but others not). You are calling an operation on a reference that may be void (null) at run time. And so on.

By the way, about void-safety: for a decade now, Eiffel has been void-safe, meaning a compile-time guarantee of no run-time null pointer dereferencing. It is beyond my understanding how the rest of the world can still live with programs that run under myriad swords of Damocles: x.op (…) calls that might any minute, without any warning or precedent, hit a null x and crash.

Then there is the guarantee of logical consistency, which is where my class invariant (Figure 1) comes in. Maybe it scared you, but in reality it is all simple concepts, intended to make sure that you know what you are doing, and rely on tools to check that you are right. When you are writing your program, you are positing all kinds, logical assumptions, large and (mostly) small, all the time. Here, for the structure triples_from [o] to make sense, it must be a list such that:

  • It contains all the triples t in the triples_table such that t.source = o.
  •  It contains only those triples!

You know this when you write the program; otherwise you would not be having a “triples_from” structure. Such gems of knowledge should remain an integral part of the program. Individually they may not be rocket science, but accumulated over the lifetime of a class design, a subsystem design or a system design they collect all the intelligence that makes the software possible.  Yet in the standard process they are gone the next minute! (At best, some programmers may write a comment, but that does not happen very often, and a comment has no guarantee of precision and no effect on testing or correctness.)

Anyone who takes software development seriously must record such fundamental properties. Here we need the following invariant clause:

across triples_from as tf all

across tf.item as tp all tp.item.source = tf.cursor_index end

end

(It comes in the class, as shown in Figure 1, with the label “from_list_consistent”. Such labels are important for documentation and debugging purposes. We omit them here for brevity.)

What does that mean? If we could use Unicode (more precisely, if we could type it easily with our keyboards) we would write things like “∀ x: E | P (x) for all x in E, property P holds of x. We need programming-language syntax and write this as across E as x all P (x.item) end. The only subtlety is the .item part, which gives us generality beyond the  notation: x in the across is not an individual element of E but a cursor that moves over E. The actual element at cursor position is x.item, one of the properties of that cursor. The advantage is that the cursor has more properties, for example x.cursor_index, which gives its position in E. You do not get that with the plain of mathematics.

If instead of  you want  (there exists), use some instead of all. That is pretty much all you need to know to understand all the invariant clauses of class MULTIGRAPH as given in Figure 1.

So what the above invariant clause says is: take every position tf in triples_from; its position is tf.cursor_index and its value is tf.item. triples_from is declared as ARRAYED_LIST [LIST [TRIPLE]], so tf.cursor_index is an integer representing an object o, and tf.item is a list of triples. That list should  consist of the triples having tf.cursor_index as their source. This is the very property that we are expressing in this invariant clause, where the innermost across says: for every triple tp.item in the list, the source of that triple is the cursor index (of the outside across). Simple and straightforward, I think (although such English explanations are so much more verbose than formal versions, such as the Eiffel one here, and once you get the hang of it you will not need them any more).

How can one ever include a structure such as triples_from without expressing such a property? To put the question slightly differently: am I inside the asylum looking out, or outside the asylum looking in? Any clue would be greatly appreciated.

More properties

For the tag ( with_) and target lists, the properties are similar:

across triples_with as tw all across tw.item as tp all tp.item.tag = tw.key end end

across triples_to as tt all across tt.item as tp all tp.item.target = tt.cursor_index end end 

We also have some properties of array bounds:

 is_root.lower = 1 and is_root.upper = object_count

triples_from.lower = 1 and triples_from.upper = object_count

triples_to.lower = 1 and triples_to.upper = object_count

where object_count is the number of objects (nodes), and for an array a (whose bounds in Eiffel are arbitrary, not necessarily 0 or 1, and set on array creation), a.lower and a.upper are the bounds. Here we number the arrays from 1.

There are, as noted, two ways to represent rootness. We must express their consistency (or risk trouble). Two clauses of the invariant do the job:

across roots as t all is_root [t.item] end

across is_root as t all (t.item = roots.has (t.cursor_index)) end

The first one says that if we go through the list roots we only find elements whose is_root value is true; the second, that if we go through the array “is_root” we find values that are true where and only where the corresponding object, given by the cursor index, is in the roots set. Note that the = in that second property is between boolean values (if in doubt, check the type instantly in the EIffelStudio IDE!), so it means “if and only if.

Instead of these clauses, a more concise version, covering them both, is just

roots ~ domain (is_root)

with a function domain that gives the domain of a function represented by a boolean array. The ~ operator denotes object equality, redefined in many classes, and in particular in the SET classes (roots is a LINKED_SET) to cover equality between sets, i.e. the property of having the same elements.

The other clauses are all similarly self-explanatory. Let us just go through the most elaborate one, successors_consistent, involving three levels of across:

across successors as httpl all                   — httpl.item: hash table of list of triples

        across httpl.item as tpl all                — tpl.item: list of triples (tpl.key: key (i.e. tag) in hash table (tag)

                  across tpl.item as tp all            — tp.item: triple

                         tp.item.tag = tpl.key

and tp.item.source = httpl.cursor_index

                   end

          end

end

You can see that I struggled a bit with this one and made provisions for not having to struggle again when I would look at the code again 10 minutes, 10 days or 10 months later. I chose (possibly strange but consistent) names such as httpl for hash-table triple, and wrote comments (I do not usually need any in invariant and other contract clauses) to remind me of the type of everything. That was not strictly needed since once again the IDE gives me the types, but it does not cost much and could help.

What this says: go over successors; which as you remember is an ARRAY, indexed by objects, of HASH_TABLE, where each entry of such a hash table has an element of type [LIST [TRIPLE] and a key of type INTEGER, representing the tag of a number of outgoing edges from the given object. Go over each hash table httpl. Go over the associated list of triples tpl. Then for each triple tp in this list: the tag of the triple must be the key in the hash table entry (remember, the key does denote a tag); and the source of the triple must the object under consideration, which is the current iteration index in the array of the outermost iteration.

I hope I am not scaring you at this point. Although the concepts are simple, this invariant is more sophisticated than most of those we typically write. Many invariant clauses (and preconditions, and postconditions) are very simple properties, such as x > 0 or x ≠ y. The reason this one is more elaborate is not that I am trying to be fussy but that without it I would be the one scared to death. What is elaborate here is the data structure and programming technique. Not rocket science, not anything beyond programmers typically do, but elaborate. The only way to get it right is to buttress it by the appropriate logical properties. As noted, these properties are there anyway, in the back of your head, when you write the program. If you want to be more like an electrical engineer than an electrician, you have to write them down.

There is more to contracts

Invariants are not the only kind of such “contract properties. Here for example, from the same class, is a (slightly abbreviated) part of the postcondition (output property) of the operation that tells us, through a boolean Result, if the multigraph has an edge of given components osource, t (the tag) and otarget :

Result =

(across successors [osource] [t] as tp some

not tp.item.is_inoperative and tp.item.target = otarget

end)

In words, this clause expresses the compatibility of the operation with the successors view: it must answer yes if and only if otarget appears in the successor set of osource for t, and the corresponding triple is not marked inoperative.

The concrete benefits

And so? What do we get out of making these logical properties explicit? Just the intellectual satisfaction of doing things right, and the methodological guidance? No! Once you have done this work, it is all downhill. Turn on the run-time assertion monitoring option (tunable separately for preconditions, postconditions, invariants etc., and on by default in development mode), and watch your tests run. If you are like almost all of us, you will have made a few mistakes, some which will seem silly when or rather if you find them in time (but there is nothing funny about a program that crashes during operation) and some more subtle. Sit back, and just watch your contracts be violated. For example if I change <= to < in the invariant property tw.key <= max_tag, I get the result of Figure 5. I see the call stack that I can traverse, the object run-time structure that I can explore, and all the tools of a modern debugger for an OO language. Finding and correcting the logical flaw will be a breeze.

debugger

Figure 5: An invariant violation brings up the debugger

The difference

It will not be a surprise that I did not get all the data structures and algorithms of the class MULTIGRAPH  right the first time. The Design by Contract approach (the discipline of systematically expressing, whenever you write any software element, the associated logical properties) does lead to fewer mistakes, but everyone occasionally messes up. Everyone also looks at initial results to spot and correct mistakes. So what is the difference?

Without the techniques described here, you execute your software and patiently examine the results. In the example, you might output the content of the data structures, e.g.

List of outgoing references for every object:

        1: 1-1->1|D, 1-1->2|D, 1-1->3|D, 1-2->1|D, 1-2->2|D,  1-25->8|D, 1-7->1|D, 1-7->6|D,

1-10->8|D, 1-3->1|D, 1-3->2|D, 1-6->3|D, 1-6->4|D, 1-6->5|D

        3: 3-6->3, 3-6->4, 3-6->5, 3-9->14, 3-9->15,   3-9->16, 3-1->3, 3-1->2, 3-2->3, 3-2->2,

                  3-25->8, 3-7->3, 3-7->6, 3-10->8, 3-3->3,  3-3->2    

List of outgoing references for every object:

        1: 1-1->1|D, 1-1->2|D, 1-1->3|D, 1-2->1|D, 1-2->2|D, 1-25->8|D, 1-7->1|D, 1-7->6|D,

1-10->8|D, 1-3->1|D,  1-3->2|D, 1-6->3|D, 1-6->4|D, 1-6->5|D

        3: 3-6->3, 3-6->4, 3-6->5, 3-9->14, 3-9->15,  3-9->16, 3-1->3, 3-1->2, 3-2->3, 3-2->2,

                                 3-25->8, 3-7->3, 3-7->6, 3-10->8, 3-3->3,  3-3->2

and so on for all the structures. You check the entries one by one to ascertain that they are as expected. The process nowadays has some automated support, with tools such as JUnit, but it is still essentially manual, tedious and partly haphazard: you write individual test oracles for every relevant case. (For a more automated approach to testing, taking advantage of contracts, see [1].) Like the logical properties appearing in contracts, these oracles are called assertions but the level of abstraction is radically different: an oracle describes the desired result of one test, where a class invariant, or routine precondition, or postcondition expresses the properties desired of all executions.

Compared to the cost of writing up such contract properties (simply a matter of formalizing what you are thinking anyway when you write the code), their effect on testing is spectacular. Particularly when you take advantage of across iterators. In the example, think of all the checks and crosschecks automatically happening across all the data structures, including the nested structures as in the 3-level across clause. Even with a small test suite, you immediately get, almost for free, hundreds or thousands of such consistency checks, each decreasing the likelihood that a logical flaw will survive this ruthless process.

Herein lies the key advantage. Not that you will magically stop making mistakes; but that the result of such mistakes, in the form of contract violations, directly points to logical properties, at the level of your thinking about the program. A wrong entry in an output, whether you detect it visually or through a Junit clause, is a symptom, which may be far from the cause. (Remember Dijkstra’s comment, the real point of his famous Goto paper, about the core difficulty of programming being to bridge the gap between the static program text, which is all that we control, and its effect: the myriad possible dynamic executions.) Since the cause of a bug is always a logical mistake, with a contract violation, which expresses a logical inconsistency, you are much close to that cause.

(About those logical mistakes: since a contract violation reflects a discrepancy between intent, expressed by the contract, and reality, expressed by the code, the mistake may be on either side. And yes, sometimes it is the contract that is wrong while the implementation in fact did what is informally expected. There is partial empirical knowledge [1] of how often this is the case. Even then, however, you have learned something. What good is a piece of code of which you are not able to say correctly what it is trying to do?)

The experience of Eiffel programmers reflects these observations. You catch the mistakes through contract violations; much of the time, you find and correct the problem easily. When you do get to producing actual test output (which everyone still does, of course), often it is correct.

This is what has happened to me so far in the development of the example. I had mistakes, but converging to a correct version was a straightforward process of examining violations of invariant violations and other contract elements, and fixing the underlying logical problem each time.

By the way, I believe I do have a correct version (in the sense of the second part of the Hoare quote), on the basis not of gut feeling or wishful thinking but of solid evidence. As already noted it is hard to imagine, if the code contains any inconsistencies, a test suite surviving all the checks.

Tests and proofs

Solid evidence, not perfect; hard to imagine, not impossible. Tests remain only tests; they cannot exercise all cases. The only way to achieve demonstrable correctness is to rely on mathematical proofs performed mechanically. We have this too, with the AutoProof proof system for Eiffel, developed in recent years [1]. I cannot overstate my enthusiasm for this work (look up the Web-based demo), its results (automated proof of correctness of a full-fledged data structures and algorithms library [2]) and its potential, but it is still a research effort. The dynamic approach (meaning test-based rather than proof-based) presented above is production technology, perfected over several decades and used daily for large-scale mission-critical applications. Indeed (I know you may be wondering) it scales up without difficulty:

  • The approach is progressive. Unlike fully formal methods (and proofs), it does not require you to write down every single property down to the last quantifier. You can start with simple stuff like x > 0. The more you write, the more you get, but it is the opposite of an all-or-nothing approach.
  • On the practical side, if you are wondering about the consequences on performance of a delivered system: there is none. Run-time contract monitoring is a compilation option, tunable for different kinds of contracts (invariants, postconditions etc.) and different parts of a system. People use it, as discussed here, for development, testing and debugging. Most of the time, when you deliver a debugged system, you turn it off.
  • It is easy to teach. As a colleague once mentioned, if you can write an if-then-else you can write a precondition. Our invariants in the above example where a bit more sophisticated, but programmers do write loops (in fact, the Eiffel loop for iterating over a structure also uses across, with loop and instructions instead of all or some and boolean expressions). If you can write a loop over an array, you can write a property of the array’s elements.
  • A big system is an accumulation of small things. In a blog article [5] I recounted how I lost a full day of producing a series of technical diagrams of increasing complexity, using one of the major Web-based collaborative development tools. A bug of the system caused all the diagrams to reproduce the first, trivial one. I managed to get through to the developers. My impression (no more than an educated guess resulting from this interaction) is that the data structures involved were far simpler than the ones used in the above discussion. One can surmise that even simple invariants would have uncovered the bug during testing rather than after deployment.
  • Talking about deployment and tools used directly on the cloud: the action in software engineering today is in DevOps, a rapid develop-deploy loop scheme. This is where my perplexity becomes utter cluelessness. How can anyone even consider venturing into that kind of exciting but unforgiving development model without the fundamental conceptual tools outlined above?

We are back then to the core question. These techniques are simple, demonstrably useful, practical, validated by years of use, explained in professional books (e.g. [6]), introductory programming textbooks (e.g. [7]), EdX MOOCs (e.g. [8]), YouTube videos, online tutorials at eiffel.org, and hundreds of articles cited thousands of times. On the other hand, most people reading this article are not using Eiffel. On reflection, a simple quantitative criterion does exist to identify the inmates: there are far more people outside the asylum than inside. So the evidence is incontrovertible.

What, then, is wrong with me?

References

(Nurse to psychiatrist: these are largely self-references. Add narcissism to list of patient’s symptoms.)

1.    Ilinca Ciupa, Andreas Leitner, Bertrand Meyer, Manuel Oriol, Yu Pei, Yi Wei and others: AutoTest articles and other material on the AutoTest page.

2. Bertrand Meyer, Ilinca Ciupa, Lisa (Ling) Liu, Manuel Oriol, Andreas Leitner and Raluca Borca-Muresan: Systematic evaluation of test failure results, in Workshop on Reliability Analysis of System Failure Data (RAF 2007), Cambridge (UK), 1-2 March 2007 available here.

3.    Nadia Polikarpova, Ilinca Ciupa and Bertrand Meyer: A Comparative Study of Programmer-Written and Automatically Inferred Contracts, in ISSTA 2009: International Symposium on Software Testing and Analysis, Chicago, July 2009, available here.

4.    Carlo Furia, Bertrand Meyer, Nadia Polikarpova, Julian Tschannen and others: AutoProof articles and other material on the AutoProof page. See also interactive web-based online tutorial here.

5.    Bertrand Meyer, The Cloud and Its Risks, blog article, October 2010, available here.

6.    Bertrand Meyer: Object-Oriented Software Construction, 2nd edition, Prentice Hall, 1997.

7.    Bertrand Meyer: Touch of Class: Learning to Program Well Using Objects and Contracts, Springer, 2009, see touch.ethz.ch and Amazon page.

8.    MOOCs (online courses) on EdX : Computer: Art, Magic, Science, Part 1 and Part 2. (Go to archived versions to follow the courses.)

VN:F [1.9.10_1130]
Rating: 9.9/10 (12 votes cast)
VN:F [1.9.10_1130]
Rating: +8 (from 10 votes)

New speaker and final announcement for LASER 2018 on blockchains (Elba, June)

There is one more speaker at the LASER 2018 summer school two weeks from now: Dominic Woerner from Bosch. Mr. Woerner is a solution architect for Internet of Things and distributed ledger applications with Bosch Digital Solutions, and a researcher within the Bosch Economy of Things research project. He did his PhD at ETH Zurich working at the Bosch IoT Lab and was a visiting researcher at the MIT Media Lab’s Digital Currency Initiative. His talk will be on “From the cloud-based Internet of Things to the Economy of Things”.

The 2018 LASER, the school’s 14th  edition, will take place in Elba, Italy, June 2 to 10, 2018. The theme is

                Software Technology for Blockchains, Bitcoins and Distributed Trust Systems

and the speakers (4 to 6 lectures each):

  • Christian Cachin, IBM, on Secure distributed programming for blockchains
  • Primavera De Filippi, CNRS and Harvard University, on Social and legal effects of distributed trust systems
  • Maurice Herlihy, Brown, on Blockchains, Smart Contracts, & the future of distributed computing
  • Christoph Jentzsch, Slock.it
  • me, on Software engineering for distributed systems
  • Emin Gun Sirer, Cornell University
  • Roger Wattenhofer, ETH Zurich

Details at https://www.laser-foundation.org/school/.  From that page:

The combination of advances in distributed systems and cryptography is bringing about a revolution not only in finance, with Bitcoin and other cryptocurrencies, but in many other fields whose list keeps growing. The LASER 2018 summer school brings the best experts in the field, from both industry and academia, to provide an in-depth exploration of the fascinating software issues and solutions behind these breakthrough technology advances.

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Creative writing

(This article is about the kind of French that you find on e-commerce sites these days.)

Le site amazon.fr a de plus en plus de produits destinés à des marchés internationaux et bénéficiant d’une présentation en français produite par des algorithmes approximatifs. Je ne suis pas absolument sûr du « bénéfice » en question, ni d’ailleurs de la relation exacte de ces textes à la langue de Molière. Exemple :

Service de longue vie: La lumière de la chaîne est en parallèle connecté qui empêche un échec ampoule influence à d’autres bulbs.Even avec les ampoules cassées ou enlevées, restant ampoules continuera à éclairer. Si une ampoule sortir, vous remplacez tout simplement.

Les deux ou trois premières fois cela peut avoir un certain charme mais on s’en lasse. Bien que n’ayant pas de statistiques précises, j’ai l’impression que si l’on exclut les livres ce style est aujourd’hui celui de la majorité des articles présentés sur Amazon France. On peut en rire. Mais au vu de la dégradation générale du langage visible par exemple sur les forums — il ne s’agit plus des « fautes de français » qu’on nous apprenait à éviter et qui paraissent maintenant bien innocentes, mais d’une dissolution complète, ou l’on ne distingue plus par exemple le participe passé, parlé, de l’infinitif, parler, ni de la seconde personne du présent au pluriel, parlez etc. etc. — c’est plutôt inquiétant.

Apparemment le marché français n’est pas assez grand pour justifier autre chose que ce mépris. Cela n’a pas l’air de gêner Amazon non plus. Mais even avec la tolérance on finira par remplacez tout simplement.

VN:F [1.9.10_1130]
Rating: 10.0/10 (2 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

New paper: making sense of agile methods

Bertrand Meyer: Making Sense of Agile Methods, in IEEE Software, vol. 35, no. 2, March 2018, pages 91-94. IEEE article page here (may require membership or purchase). Draft available here.

An assessment of agile methods, based on my book Agile! The Good, the Hype and the Ugly. It discusses, beyond the hype, the benefits and dangers of agile principles and practices, focusing on concrete examples of what helps and what hurts.

VN:F [1.9.10_1130]
Rating: 9.3/10 (6 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 6 votes)

Mainstream enough for me

Every couple of weeks or so, I receive a message such as the one below; whenever I give a talk on any computer science topic anywhere in the world, strangers come to me to express similar sentiments. While I enjoy compliments as much as anyone else, I am not the right recipient for such comments. In fact there are 7,599,999,999  more qualified recipients. For me, Eiffel is “mainstream” enough.

What strikes me is why so many commenters, after the compliment, stop at the lament. Eiffel is not some magical dream, it is a concrete technology available for download at eiffel.org. Praising Eiffel will not change the world. Using EiffelStudio might.

When one answers the compliments with “Thanks! Then use it for your work“, the variety of excuses is amusing, or sad depending on the perspective, from “my boss would not allow it” (variant: “my subordinates would not accept it”) to “does it work with [library that does not work with anything else]?”.

Well, you might have some library wrapping to do (EiffelStudio easily interfaces with C, C++ and others). Also, you should not stop at the first hurdle: it might be due to a bug (surprise! The technology is not perfect!), but it might also just be that Eiffel and EiffelStudio are different and you have to shed some long-held assumptions and practices. What matters is that the technology does work; companies large and small use Eiffel all the time for long-running projects, some into the millions of lines and tens of thousands of classes, and refuse to switch to anything else.

What follows is a literal translation of the original message into English (it was written in another language). Since the author, whom I do not know, did not state the email was a public comment, I removed identifying details.

 

Subject:Eiffel is fantastic! But why is it not mainstream?

Dear Professor Meyer:

Greetings from [the capital of a country on another continent].

I graduated from [top European university] in 1996 and completed a master’s in physics from [institute on another continent] in 2006.

I have worked for twenty years in the industry, from application engineer to company head. In my industry career I have been able to be both CEO and CTO at the same time, thanks to the good education I received originally.

Information systems were always a pillar of my business strategy. Unfortunately, I was disappointed every single time I commissioned the development of a new system. This led me to study further and to investigate why the problem is not solved. That’s how I found your book Object-Oriented Software Construction and became enthusiastic about Design by Contract, Eiffel and EiffelStudio. To me your method is the only method for developing “correct” software. The Eiffel programming language is, in my view, the only true object-oriented language.

However it befuddles me — I cannot understand —  why the “big” players in this industry (Apple, Google, Microsoft etc.) do not use Design by Contract. .NET has a Visual Studio extension with the name “Code Contracts” but it is no longer supported in the latest Visual Studio 2017. Big players, why don’t you promote Design by Contract?

Personally, after 20 years in industry, I found out that my true calling is in research. It would be a great pleasure to be able to work in research. My dream job is Data Scientist and I had thought to apply to Google for a job. Studying the job description, I noted that “Python” is one of the desired languages. Python is dynamically typed and does not support good encapsulation. No trace of Design by Contract…

What’s wrong with the software industry?

With best regards,

VN:F [1.9.10_1130]
Rating: 10.0/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 3 votes)

Blue hair and tenure track

Interview (in Russian) of Nadia Polikarpova (who proved the correctness of the EiffelBase 2 library in her PhD at ETH and is now an assistant professor at UCSD) on the site of her original university, ITMO. She explains the US tenure-track system to Europeans — good luck! She also says that programming language people are nicer than systems people; really?

VN:F [1.9.10_1130]
Rating: 9.7/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Un po’ tondo

Every Mozart study states that his last Symphony, “Jupiter” (Köchel 551), is one of humankind’s greatest musical achievements. Every description of the symphony indicates that the first movement borrows a theme from a concert aria. Every one that I have read expresses surprise at this self-borrowing and states that the reason for it is a complete mystery. I think I know that reason.

The theme as it appears in the symphony begins like this:

jupiter

Click to listen [1]:

That theme is taken from the development of the short concert aria Un Bacia di Mano (A Handkiss) (K 451):

bacio

Hear it [2]:

The words sung on this theme are only part of the text, but they are the important part, emphasized and several times repeated:

Voi siete un po’ tondo
Mio caro Pompeo
L’usanze del mondo
Andate a studiar

meaning (my translation):

You are a bit of a simpleton,
My dear Pompeo.
Time for you to get out
And learn the ways of the world.

(Note to my Italian friends: yes, the Italian text  says “tondo”, not “tonto”. It may sound strange to you but apparently that’s how they talked in the settecento. The text, by the way, is attributed, although with no certainty, to Lorenzo Da Ponte.)

(Note to my American friends: yes, the current director of the CIA happens to be called “Pompeo”. From what I read in the news he could benefit from the advice. But let us not digress.)

What is this aria? It belongs to the “interpolated aria” genre, in which a composer would reuse the words of an air from an existing opera and set them to new music. Avoids having to ask a librettist for a new text, pleases singers by giving them bravura pieces, and undoubtedly for Mozart offers an excellent way to show the world how much more he could do, with the same words, than your average court composer. The text to Un bacio di mano originally came from an opera by Pasquale Anfossi.  The context of the aria is standard 18/19-th comedy fare: mock advice to an old man wanting to take a young spouse (as in Donizetti’s Don Pasquale).

All the Mozart biographies and analyses sound puzzled. Why in the world would Mozart, in one of his most momentous and majestic works, his last symphony, also the longest, insert a hint to an aria with such a lowbrow, almost silly subject? Here (from countless examples) is the kind of explanation you read:

Why risk interpolating yet another tune into the concatenation of ideas that he’s already given his listeners, and asked his orchestra to dramatize; and a melody, what’s more, that comes from a different expressive world, the low comedy of opera buffa as opposed to high-minded symphonic discussion? Mozart puts the whole structure of this movement on the line, seemingly for the sake of a compositional joke. It’s a piece of postmodernism avant la lettre, and the kind of thing that Beethoven, for all his iconoclasm, hardly risked in the same way in his symphonies.

Nonsense. Mozart liked jokes, but to think of him as some kind of dodecaphonist putting in random inserts is absurd. He would not include a gratuitous joke in a major work. “Postmodernism avant la lettre”, what is that supposed to mean? Some of the other commenters at least have the honesty to admit that they do not have a clue.

The clue is not so hard to find if you look at the words. In the aria already, the four lines cited break out seemingly from nowhere and through their repetition soar on their own, far above the triviality of the rest of the text. Mozart wanted to showcase this theme of urging a naïve man to get out and learn how the world works. And now, just a few weeks later — the aria is from June 1788, the symphony from July or August  — Mozart is broke, he just lost a child, his wife is sick, he has to beg his friend Puchberg for money, his stardom as a boy wonder is long gone, audiences (he thinks) have moved on, no one truly recognizes his genius. Other, more docile composers have decent, stable positions with a prince here or a duke there, and he who wanted to play the proud independent artist can hardly feed his family. He could have been organist at Versailles, and turned down the position [4] as beneath him; which it was, but at least it was a position. Here he is, the greatest genius of musical history, composing a symphony like no one else could even conceive of, and he sits alone in his derelict study with his wife coughing next door. He may not want to admit it, but deep down he feels that he has not long to live. Not one for self-pity, he looks sarcastically at his hungry self: you poor naïve soul, you never wanted to be a mere Anfossi or Salieri, and so you did not condescend to bow and smile humbly and flatter like everyone else did. You were so far above the rest of them that sooner or later the world was going to give you the recognition you deserve. Now this dirty attic. You are a bit of a simpleton, my dear Wolfie. Isn’t it time you got out, and learned the ways of the world?

That is the logical and human explanation. Do not ask for historical proof; it is a conjecture. But listen to the music, think of Wolfgang Amadeus in his mansard, read the words, and it will dawn on you too that this was what he meant when quoting his own looney tune.

Notes and references

[1] From Mackerras (Scottish Chamber Orchestra) performance here (first movement only). More performances (complete symphony) here.

[2] From the Bryn Terfel performance here. See more performances here. One is by José van Dam, of whom I am generally a great fan, but here I find the tempo too slow; same for the Thomas Hampson version. The Fischer-Dieskau recording is not what one would expect. Note that the aria is originally for a bass but most of these performances are by barytones (which is fine too). The Jardin des Voix (William Christie) video is fun.

[3] Symphony guide: Mozart’s 41st , see here.

[4] See e.g. here from Mozart, by Robert Gutman. Think of the effect on the later history of French music if he had been of a different mind!

VN:F [1.9.10_1130]
Rating: 9.3/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Towards empirical answers to important software engineering questions

(Adapted from a two-part article on the Communications of the ACM blog.)

1 The rise of empirical software engineering

One of the success stories of software engineering research in recent decades has been the rise of empirical studies. Visionaries such as Vic Basili, Marvin Zelkowitz and Walter Tichy advocated empirical techniques early [1, 2, 3]; what enabled the field to take off was the availability of software repositories for such long-running projects as Apache, Linux and Eclipse [4], which researchers started mining using modern data analysis techniques.

These studies have yielded many insights including surprises. More experienced developers can produce more buggy code (Schröter, Zimmermann, Premraj, Zeller). To predict whether a module has bugs, intrinsic properties such as complexity seem to matter less than how many changes it went through (Moser, Pedrycz, Succi). Automatic analysis of user reports seems much better at identifying bugs than spotting feature requests (Panichella, Di Sorbo, Guzman, Visaggio, Canfora, Gall). More extensively tested modules tend to have more bugs (Mockus, Nagappan, Dinh-Trong). Eiffel programmers do use contracts (Estler, Furia, Nordio, Piccioni and me). Geographical distance between team members negatively affects the amount of communication in projects (Nordio, Estler, Tschannen, Ghezzi, Di Nitto and me). And so on.

The basic observation behind empirical software engineering is simple: if software products and processes are worthy of discussion, they must be worthy of quantitative discussion just like any natural artifact or human process. Usually at that point the advocacy cites Lord Kelvin:”If you cannot measure it, you cannot improve it” [5].

Not that advocacy is much needed today, at least for publishing research in software engineering and in computer science education. The need for empirical backing of conceptual proposals has achieved consensus.  The so-called a “Marco Polo paper” [6] (I traveled far and saw wonderful things, thank you very much for your attention) no longer suffices for program committees; today they want numbers (and also, thankfully, a “threats to validity” section which protects you against suspicions that the numbers are bogus by stating why they might be). Some think this practice of demanding empirical backing for anything you propose has gone too far; see Jeff Ullman’s complaint [7], pertaining to database research rather than software engineering, but reflecting some of the same discussions. Here we can counter Kelvin with another quote (for more effect attributed to Einstein, albeit falsely): not everything that can be counted counts, and not everything that counts can be counted.

2 Limits of empirical research

There can indeed be too much of a good thing. Still, no one would seriously deny the fundamental role that empirical research has gained in modern software engineering. Which does not prevent us from considering the limits of what it has achieved; not in a spirit of criticism for its own sake, but to help researchers define an effective agenda for the next steps. There are in my opinion two principal limitations of current empirical results in software engineering.

The first has to do with the distinction introduced above between the two kinds of possible targets for empirical assessment: products (artifacts) versus processes.

Both aspects are important, but one is much easier to investigate than the other. For software products, the material of study is available in the form of repositories mentioned above, with their wealth of information about lines of code, control and data structures, commits, editing changes, bug reports and bug fixes. Processes are harder to grasp. You gain some information on processes from the repositories (for example, patterns and delays of bug fixing), but processes deserve studies of their own. For example, agile teams practice iterations (sprints) of widely different durations, from a few days to a few weeks; what is the ideal length? A good empirical answer would help many practitioners. But this example illustrates how difficult empirical studies of processes can be: you would need to try many variations with teams of professional programmers (not students) in different projects, different application areas, different companies; for the results to be believable the projects should be real ones with business results at stake, there should be enough samples in each category to ensure statistical significance, and the companies should agree to publication of some form, possibly anonymized, of the outcomes. The difficulties are formidable.

This issue of how to obtain project-oriented metrics is related to the second principal limitation of some of the initial empirical software engineering work: the risk of indulging in lamppost research. The term refers to the well-known joke about the drunkard who, in the dark of the night, searches for his lost keys next to the lamp post, not because he has lost them there but because it is the only place where one can see anything. To a certain extent all research is lamppost research: by definition, if you succeed in studying something, it will be because it can be studied. But the risk is to choose to work on a problem only, or principally, because it is easy to set up an empirical study — regardless of its actual importance. To cite an example that I have used elsewhere, one may suspect that the reason there are so many studies of pair programming is not that it’s of momentous relevance but that it is not hard to set up an experiment.

3 Beyond the lamppost

As long as empirical software engineering was a young, fledgling discipline, it made good sense to start with problems that naturally lended themselves to empirical investigation. But now that the field has matured, it may be time to reverse the perspective and start from the consumer’s perspective: for practitioners of software engineering, what problems, not yet satisfactorily answered by software engineering theory, could benefit, in the search for answers, from empirical studies?

Indeed, this is what we are entitled to expect from empirical studies: guidance. The slogan of empirical software engineering is that software is worthy of study just like geological strata, photons, and lilies-of-the-valley; OK, sure, but we are talking about human artifacts rather than wonders of the natural world, and the idea should be to help us produce better software and produce software better.

4 A horror story

Whenever we call for guidance from empirical studies, we should immediately include a caveat: every empirical study has its limitations (politely called “threats to validity”) and one must be careful about any generalization. The following horror story serves as caution [9]. The fashion today in programming language design is to use the semicolon not as separator in the Algol tradition (instruction1 ; instruction2) but as a terminator in the C tradition (instruction1; instruction2;). The original justification, particularly in the case of Ada [10], is an empirical paper by Gannon and Horning [11], which purported to show that the terminator convention led to fewer errors. (The authors themselves not only give their experimental results but, departing from the experimenter’s reserve, explicitly jump to the conclusion that terminators are better.) This view defies reason: witness, among others, the ever-recommenced tragedy of if c then a; else; b where the semicolon after else is an error (a natural one, since one gets into the habit of adding semicolons just in case) but the code compiles, with the result that b will be executed in all cases rather than (as intended) just when c is false [12].

How in the world could an empirical study come up with such a bizarre conclusion? Go back to the original Gannon-Horning paper and the explanation becomes clear: the experiments used subjects who were familiar with the PL/I programming language, where semicolons are used generously and an extra semicolon is harmless, as it is in all practical languages (two successive semicolons being simply interpreted as the insertion of an empty instruction, causing no harm); but the experimental separator-based language and compiler used to the experiment treated an extra semicolon as an error! As if this were not enough, checking the details of the article reveals that the terminator language is terminator-based for both declarations and instructions, whereas the example delimiter language is only delimiter-based for instructions, but terminator-based for declarations. Talk about a biased experiment! The experiment was bogus and so are the results.

One should not be too harsh about a paper from 1975, when the very idea of systematic experimental studies of programming was novel, and some of its other results are worthy of consideration. But the sad terminator story, even though it only affected a syntax property, should serve as a reminder that we should not accept a view blindly just because someone invokes some empirical study to justify it. We should assess the study itself, its methods and its credibility.

5 Addressing the issues that matter

With this warning in mind, we should still expect empirical software engineering to help us practitioners. It should help address important software engineering problems.

Ideally, I should now list the open issues of software engineering, but I am in no position even to start such a list. All I can do is to give a few examples. They may not be important to you, but they give an idea:

  • What are the respective values of upfront design and refactoring? How best can we combine these approaches?
  • Specification and testing are complementary techniques. Specifications are in principles superior to testing in general, but testing remains necessary. What combination of specification and testing works best?
  • What is the best commit/release technique, and in particular should we use RTC (Review Then Commit, as with Apache originally then Google) or CTR (Commit To Review, as Apache later) [13]?
  • What measure of code properties best correlates with effort? Many fancy metrics have appeared in the literature over the years, but there is still a nagging feeling among many of us that for all its obvious limitations the vulgar SLOC metrics (Source Lines Of Code) still remains the least bad.
  • When can a manager decide to stop testing? We did some work on the topic [14], but it is only a start.
  • Is test coverage a good measure of test quality [15] (spoiler: it is not, but again we need more studies)?

And so on. These examples may not be the cases that you consider most important; indeed what we need is input from many software engineers to help steer empirical software engineering towards the topics that truly matter to the community.

To provide a venue for that discussion, a workshop will take place 10-12 September 2018 (provisional dates) in the Toulouse area, involving many of the tenors in empirical software engineering, with the same title as these two articles: Empirical Answers to Important Software Engineering Questions. The key idea is to start not from the solutions side (the lamppost) but from the actual challenges facing software engineers. It will not just be a traditional publication-oriented meeting but will also include ample time for discussions and joint work.

If you would like to contribute your example “important questions”, please use any appropriate support (responses to this blog, email to me, Facebook, LinkedIn, anything as long as one can find it). Suggestions will be taken into consideration for the workshop. Empirical software engineering has already established itself as a core area of research; it is time feed that research with problems that actually matter to software developers, managers and users

Acknowledgments

These reflections originated in a keynote that I gave at ESEM in Bolzano in 2010 (I am grateful to Barbara Russo and Giancarlo Succi for the invitation). I never wrote up the talk but I dug up the slides [8] since they might contain a few relevant observations. I used some of these ideas in a short panel statement at ESEC/FSE 2013 in Saint Petersburg, and I am grateful to Moshe Vardi for suggesting I should write them up for Communications of the ACM, which I never did.

References and notes

[1] Victor R. Basili: The role of experimentation in software engineering: past, present and future,  in 18th ICSE (International Conference on Software Engineering), 1996, see here.

[2] Marvin V. Zelkowitz and Dolores Wallace: Experimental validation in software engineering, International Conference on Empirical Assessment and Evaluation in Software Engineering, March 1997, see here.

[3] Walter F. Tichy: Should computer scientists experiment more?, in IEEE Computer, vol. 31, no. 5, pages 32-40, May 1998, see here.

[4] And EiffelStudio, whose repository goes back to the early 90s and has provided a fertile ground for numerous empirical studies, some of which appear in my publication list.

[5] This compact sentence is how the Kelvin statement is usually abridged, but his thinking was more subtle.

[6] Raymond Lister: After the Gold Rush: Toward Sustainable Scholarship in Computing, Proceedings of 10th conference on Australasian Computing Education Conference, pages 3-17, see here.

[7] Jeffrey D. Ullman: Experiments as research validation: have we gone too far?, in Communications of the ACM, vol. 58, no. 9, pages 37-39, 2015, see here.

[8] Bertrand Meyer, slides of a talk at ESEM (Empirical Software Engineering and Measurement), Bozen/Bolzano, 2010, available here. (Provided as background material only, they are  not a paper but just slide support for a 45-minute talk, and from several years ago.)

[9] This matter is analyzed in more detail in section 26.5 of my book Object-Oriented Software Construction, 2nd edition, Prentice Hall. No offense to the memory of Jim Horning, a great computer scientist and a great colleague. Even great computer scientists can be wrong once in a while.

[10] I know this from the source: Jean Ichbiah, the original designer of Ada, told me explicitly that this was the reason for his choice of  the terminator convention for semicolons, a significant decision since it was expected that the language syntax would be based on Pascal, a delimiter language.

[11] Gannon & Horning, Language Design for Programming Reliability, IEEE Transactions on Software Engineering, vol. SE-1, no. 2, June 1975, pages 179-191, see here.

[12] This quirk of C and similar languages is not unlike the source of the Apple SSL/TLS bug discussed earlier in this blog under the title Code matters.

[13] Peter C. Rigby, Daniel M. German, Margaret-Anne Storey: Open Source Software Peer Review Practices: a Case study of the Apache Server, in ICSE (International Conference on Software Engineering) 2008, pages 541-550, see here.

[14] Carlo A. Furia, Bertrand Meyer, Manuel Oriol, Andrey Tikhomirov and  Yi Wei:The Search for the Laws of Automatic Random Testing, in Proceedings of the 28th ACM Symposium on Applied Computing (SAC 2013), Coimbra (Portugal), ACM Press, 2013, see here.

[15] Yi Wei, Bertrand Meyer and Manuel Oriol: Is Coverage a Good Measure of Testing Effectiveness?, in Empirical Software Engineering and Verification (LASER 2008-2010), eds. Bertrand Meyer and Martin Nordio, Lecture Notes in Computer Science 7007, Springer, February 2012, see here.

VN:F [1.9.10_1130]
Rating: 9.8/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Festina retro

We “core” computer scientists and software engineers always whine that our research themes forever prevent us, to the delight of our physicist colleagues but unjustly, from reaching the gold standard of academic recognition: publishing in Nature. I think I have broken this barrier now by disproving the old, dusty laws of physics! Brace yourself for my momentous discovery: I have evidence of negative speeds.

My experimental setup (as a newly self-anointed natural scientist I am keen to offer the possibility of replication) is the Firefox browser. I was downloading an add-on, with a slow connection, and at some point got this in the project bar:

Negative download speed

Negative speed! Questioning accepted wisdom! Nobel in sight! What next, cold fusion?

I fear I have to temper my enthusiasm in deference to more mundane explanations. There’s the conspiracy explanation: the speed is truly negative (more correctly, it is a “velocity”, a vector of arbitrary direction, hence in dimension 1 possibly negative); Firefox had just reversed the direction of transfer, surreptitiously dumping my disk drive to some spy agency’s server.

OK, that is rather far-fetched. More likely, it is a plain bug. A transfer speed cannot be negative; this property is not just wishful thinking but should be expressed as an integral part of the software. Maybe someone should tell Firefox programmers about class invariants.

VN:F [1.9.10_1130]
Rating: 9.6/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)