Archive for the ‘Requirements’ Category.

A standard plan for modern requirements

Requirements documents for software projects in industry, agile or not, typically follow a plan defined in a 1998 IEEE standard (IEEE 830-1998 [1]),  “reaffirmed” in 2009. IEEE 830 has the merit of simplicity, as it fits in 37 pages of which just a few (competently) describe basic requirements concepts and less than 10 are devoted to explaining the standard recommended plan, which itself consists of 3 sections with subsections. Simplicity is good but the elementary nature of the IEEE-830 plan is just not up to the challenges of modern information technology. It is time to retire this venerable precursor and move to a structure that works for the kind of ambitious, multi-faceted IT systems we build today.

For the past few years I have worked on defining a systematic approach to requirements, culminating in a book to be published in the Fall, Handbook of Requirements and Business Analysis. One of the results of this effort is a standard plan, based on the “PEGS” view of requirements where the four parts cover Project, Environment, Goals and System. The details are in the book (for some of the basic concepts see a paper at TOOLS 2019, [2]). Here I will introduce some of the key principles, since they are already  be used — as various people have done since I first started presenting the ideas in courses and seminars (particularly an ACM Webinar, organized by Will Tracz last March, whose recording is available on YouTube, and another hosted by Grady Booch for IBM).

pegs

The starting point, which gives its name to the approach, is that requirements should cover the four aspects mentioned, the four “PEGS”, defined as follows:

  • A Goal is a result desired by an organization.
  • A System is a set of related artifacts, devised to help meet certain goals.
  • A Project is the set of human processes involved in the planning, construction, revision and  operation of a system.
  • An Environment is the set of entities (such as people, organizations, devices and other material objects, regulations and other systems) external to the project and system but with the potential to affect the goals, project or system or to be affected by them.

The recommended standard plan consequently consists of four parts or books.

This proposed standard does not prescribe any particular approach to project management, software development, project lifecycle or requirements expression, and is applicable in particular to both traditional (“waterfall”) and agile projects. It treats requirements as a project activity, not necessarily a lifecycle step. One of the principles developed in the book is that requirements should be treated as a dynamic asset of the project, written in a provisional form (more or less detailed depending on the project methodology) at the beginning of the project, and then regularly extended and updated.

Similarly, the requirements can be written using any appropriate notation and method, from the most informal to the most mathematical.  In a recently published ACM Computing Surveys paper [3], my colleagues and I reviewed the various levels of formalism available  in today’s requirements approaches. The standard plan is agnostic with respect to this matter.

The books may reference each other but not arbitrarily. The permitted relations are as follows:

references

Note in particular that the description of the Goals should leave out technical details and hence may not refer to Project and System elements, although it may need to explain the properties of the Environment that influence or constrain the business goals. The Environment exists independently of the IT effort, and hence the Environment book should not reference any of the others, with the possible exception (dotted arrow in the figure) of effects of the System if it is to change the environment. (For example, a payroll IT system may affect the payroll process; a heating system may affect the ambient temperature.)

The multi-book structure of the recommended PEGS standard plan already goes beyond the traditional view of a single, linear “requirements document”. The books themselves are not necessarily written as linear texts; with today’s technology, requirements parts can and generally should be stored in a requirements repository which includes all requirements-relevant elements.  A linear form remains necessary; it can be either written as such or produced by tools from elements in the repository.

The standard plan defines the structure of the four PEGS books down to one more level, chapters. For any further levels (sections), each organization can define its own rules. Books can also include front and back matter, including for example  tables of contents, legal disclaimers, revision history etc., not covered by the standard structure. Here is that structure:

books

It is meant to be self-explanatory, but here are a few comments on each of the books:

  • One of the products of the requirements effort should be to help plan and manage  the rest of the Project. This is the goal of the Project book; note in particular P.4 and P.5 covering tasks and deadlines. P.7 starts out at the beginning of the project as a blueprint for the requirements effort, and as this effort proceeds (stakeholder interviews, stakeholder workshops, competitive analysis, requirements writing …) can be regularly updated to report on how it went. (This feature is an example of treating the requirements repository and documents as a living, dynamic asset, as noted above.)
  • In the Environment book, constraints (E.3) are properties of the environment (the problem domain) imposed on the project and system. Effects (E.5) go the other way around, describing how the system may affect the environment. Invariants (E.6) do both. Assumptions (E.4) are properties that are taken for granted to simplify the construction of the system (for example, a maximum number of simultaneous users), as distinct from actual constraints.
  • The Goals book is intended for a less technical audience than the other books: it must be understandable to decision makers and non-IT-savvy stakeholders. It includes a short summary (G.4) of functionality, a kind of capsule version of the System book trimmed down to the essentials. Note the importance of specifying (in G.6) what aspects the system is not intended to address. The Goals book can include some (G.5) usage scenarios expressed in terms meaningful to the book’s constituencies and hence remaining at a high level of generality.
  • Detailed usage scenarios will appear in the System book (S.4).  It is important to prioritize the functions (S.5), allowing a reasoned approach (rather than decisions made in a panic) if the project hits roadblocks and must sacrifice some of the functionality.

A naïve but still widely encountered view of requirements is that they serve to  “describe what the system will do” (independently of how it will do it). In the structure above, it corresponds to just one-fourth of the requirements effort, the System part. Work on requirements engineering in the past few decades has emphasized, among other concepts, the need to separate system and environment properties (Michael Jackson, Pamela Zave) and the importance of goals (Axel van Lamsweerde).

The plan reflects this richness of the requirements concept and I hope it can help many projects produce better requirements for better software.

References

[1] IEEE 830-1998, available here.

[2] Bertrand Meyer, Jean-Michel Bruel, Sophie Ebersold, Florian Galinier and Alexandr Naumchev: The Anatomy of Software Requirements, in TOOLS 2019, Springer Lecture Notes in Computer Science 11771, 2019, pages 10-40.

[3] Jean-Michel Bruel, Sophie Ebersold, Florian Galinier, Manuel Mazzara, Alexander Naumchev and Bertrand Meyer:  The Role of Formalism in System Requirements, in  Computing Surveys (ACM), vol. 54, no. 5, June 2021, pages 1-36, DOI: https://doi.org/10.1145/3448975, preprint available here.

RecycledA version of this article appeared earlier in the Communications of the ACM blog.

 

VN:F [1.9.10_1130]
Rating: 10.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +2 (from 2 votes)

Tomorrow (Thursday) noon EDT: ACM talk on requirements

In the software engineering family requirements engineering is in my experience the poor cousin, lagging behind the progress of other parts (such as design). I have been devoting attention to the topic in recent months and am completing a book on the topic.

Tomorrow (Thursday), I will be covering some of the material in a one-hour Tech Talk for ACM, with the title

The Four PEGS of Requirements Engineering

The time is Thursday, 4 March 2021, at noon EDT (New York) and 18 CET (Paris, Zurich etc.). Attendance is free but requires registration, on the event page  here.

Abstract:

Bad software requirements can jeopardize projects. There is a considerable literature on requirements, but practice is far behind: what passes for requirements in industry usually consists of a few use cases or user stories, which are useful but not sufficient as a solution. Can we fix requirements engineering (known in other circles as business analysis) so that it is no longer the weak link in software engineering?

I will present ongoing work intended to help industry produce more useful requirements. It includes precise definitions of requirements concepts and a standard plan for requirements specifications, intended to replace the venerable but woefully obsolete IEEE standard from 1998. The plan contains four books covering the four “PEGS” of requirements engineering (which I will explain). The approach builds on existing knowledge to define a practical basis for requirements engineering and provide projects with precise and helpful guidelines.

This is I think the fourth time I am giving talks in this venue (previous talks were about Design by Contract, Agile Methods and Concurrency).

VN:F [1.9.10_1130]
Rating: 6.4/10 (10 votes cast)
VN:F [1.9.10_1130]
Rating: +3 (from 9 votes)

Some contributions

Science progresses through people taking advantage of others’ insights and inventions. One of the conditions that makes the game possible is that you acknowledge what you take. For the originator, it is rewarding to see one’s ideas reused, but frustrating when that happens without acknowledgment, especially when you are yourself punctilious about citing your own sources of inspiration.

I have started to record some concepts that are widely known and applied today and which I believe I originated in whole or in part, whether or not their origin is cited by those who took them. The list below is not complete and I may update it in the future. It is not a list of ideas I contributed, only of those fulfilling two criteria:

  • Others have built upon them.  (If there is an idea that I think is great but no one paid attention to it, the list does not include it.)
  • They have gained wide visibility.

There is a narcissistic aspect to this exercise and if people want to dismiss it as just showing I am full of myself so be it. I am just a little tired of being given papers to referee that state that genericity was invented by Java, that no one ever thought of refactoring before agile methods, and so on. It is finally time to state some facts.

Facts indeed: I back every assertion by precise references. So if I am wrong — i.e. someone preceded me — the claims of precedence can be refuted; if so I will update or remove them. All articles by me cited in this note are available (as downloadable PDFs) on my publication page. (The page is up to date until 2018; I am in the process of adding newer publications.)

Post-publication note: I have started to receive some comments and added them in a Notes section at the end; references to those notes are in the format [A].

Final disclaimer (about the narcissistic aspect): the exercise of collecting such of that information was new for me, as I do not usually spend time reflecting on the past. I am much more interested in the future and definitely hope that my next contributions will eclipse any of the ones listed below.

Programming concepts: substitution principle

Far from me any wish to under-represent the seminal contributions of Barbara Liskov, particularly her invention of the concept of abstract data type on which so much relies. As far as I can tell, however, what has come to be known as the “Liskov Substitution Principle” is essentially contained in the discussion of polymorphism in section 10.1 of in the first edition (Prentice Hall, 1988) of my book Object-Oriented Software Construction (hereafter OOSC1); for example, “the type compatibility rule implies that the dynamic type is always a descendant of the static type” (10.1.7) and “if B inherits from A, the set of objects that can be associated at run time with an entity [generalization of variable] includes instances of B and its descendants”.

Perhaps most tellingly, a key aspect of the substitution principle, as listed for example in the Wikipedia entry, is the rule on assertions: in a proper descendant, keep the invariant, keep or weaken the precondition, keep or strengthen the postcondition. This rule was introduced in OOSC1, over several pages in section 11.1. There is also an extensive discussion in the article Eiffel: Applying the Principles of Object-Oriented Design published in the Journal of Systems and Software, May 1986.

The original 1988 Liskov article cited (for example) in the Wikipedia entry on the substitution principle says nothing about this and does not in fact include any of the terms “assertion”, “precondition”, “postcondition” or “invariant”. To me this absence means that the article misses a key property of substitution: that the abstract semantics remain the same. (Also cited is a 1994 Liskov article in TOPLAS, but that was many years after OOSC1 and other articles explaining substitution and the assertion rules.)

Liskov’s original paper states that “if for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for oz, then S is a subtype of T.” As stated, this property is impossible to satisfy: if the behavior is identical, then the implementations are the same, and the two types are identical (or differ only by name). Of course the concrete behaviors are different: applying the operation rotate to two different figures o1 and o2, whose types are subtypes of FIGURE and in some cases of each other, will trigger different algorithms — different behaviors. Only with assertions (contracts) does the substitution idea make sense: the abstract behavior, as characterized by preconditions, postconditions and the class invariants, is the same (modulo respective weakening and strengthening to preserve the flexibility of the different version). Realizing this was a major step in understanding inheritance and typing.

I do not know of any earlier (or contemporary) exposition of this principle and it would be normal to get the appropriate recognition.

Software design: design patterns

Two of the important patterns in the “Gang of Four” Design Patterns book (GoF) by Gamma et al. (1995) are the Command Pattern and the Bridge Pattern. I introduced them (under different names) in the following publications:

  • The command pattern appears in OOSC1 under the name “Undo-Redo” in section 12.2. The solution is essentially the same as in GoF. I do not know of any earlier exposition of the technique. See also notes [B] and [C].
  • The bridge pattern appears under the name “handle technique” in my book Reusable Software: The Base Component Libraries (Prentice Hall, 1994). It had been described several years earlier in manuals for Eiffel libraries. I do not know of an earlier reference. (The second edition of Object-Oriented Software Construction — Prentice Hall, 1997, “OOSC2” –, which also describes it, states that a similar technique is described in an article by Josef Gil and Ricardo Szmit at the TOOLS USA conference in the summer of 1994, i.e. after the publication of Reusable Software.)

Note that it is pointless to claim precedence over GoF since that book explicitly states that it is collecting known “best practices”, not introducing new ones. The relevant questions are: who, pre-GoF, introduced each of these techniques first; and which publications does the GoF cites as “prior art”  for each pattern. In the cases at hand, Command and Bridge, it does not cite OOSC1.

To be concrete: unless someone can point to an earlier reference, then anytime anyone anywhere using an interactive system enters a few “CTRL-Z” to undo commands, possibly followed by some “CTRL-Y” to redo them (or uses other UI conventions to achieve these goals), the software most likely relying on a technique that I first described in the place mentioned above.

Software design: Open-Closed Principle

Another contribution of OOSC1 (1988), section 2.3, reinforced in OOSC2 (1997) is the Open-Closed principle, which explained one of the key aspects of inheritance: the ability to keep a module both closed (immediately usable as is) and open to extension (through inheritance, preserving the basic semantics. I am mentioning this idea only in passing since in this case my contribution is usually recognized, for example in the Wikipedia entry.

Software design: OO for reuse

Reusability: the Case for Object-Oriented Design (1987) is, I believe, the first publication that clearly explained why object-oriented concepts were (and still are today — in Grady Booch’s words, “there is no other game in town”) the best answer to realize the goal of software construction from software components. In particular, the article:

  • Explains the relationship between abstract data types and OO programming, showing the former as the theoretical basis for the latter. (The CLU language at MIT originated from Liskov’s pioneering work on abstract data types, but was not OO in the full sense of the term, missing in particular a concept of inheritance.)
  • Shows that reusability implies bottom-up development. (Top-down refinement was the mantra at the time, and promoting bottom-up was quite a shock for many people.)
  • Explains the role of inheritance for reuse, as a complement to Parnas’s interface-based modular construction with information hiding.

Software design: Design by Contract

The contribution of Design by Contract is one that is widely acknowledged so I don’t have any point to establish here — I will just recall the essentials. The notion of assertion goes back to the work of Floyd, Hoare and Dijkstra in the sixties and seventies, and correctness-by-construction to Dijktra, Gries and Wirth, but Design by Contract is a comprehensive framework providing:

  • The use of assertions in an object-oriented context. (The notion of class invariant was mentioned in a paper by Tony Hoare published back in 1972.)
  • The connection of inheritance with assertions (as sketched above). That part as far as I know was entirely new.
  • A design methodology for quality software: the core of DbC.
  • Language constructs carefully seamed into the fabric of the language. (There were precedents there, but in the form of research languages such as Alphard, a paper design only, not implemented, and Euclid.)
  • A documentation methodology.
  • Support for testing.
  • Support for a consistent theory of exception handling (see next).

Design by Contract is sometimes taken to mean simply the addition of a few assertions here and there. What the term actually denotes is a comprehensive methodology with all the above components, tightly integrated into the programming language. Note in particular that preconditions and postconditions are not sufficient; in an OO context class invariants are essential.

Software design: exceptions

Prior to the Design by Contract work, exceptions were defined very vaguely, as something special you do outside of “normal” cases, but without defining “normal”. Design by Contract brings a proper perspective by defining these concepts precisely. This was explained in a 1987 article, Disciplined Exceptions ([86] in the list), rejected by ECOOP but circulated as a technical report; they appear again in detail in OOSC1 (sections 7.10.3 to 7.10.5).

Other important foundational work on exceptions, to which I know no real precursor (as usual I would be happy to correct any omission), addressed what happens to the outcome of an exception in a concurrent or distributed context. This work was done at ETH, in particular in the PhD theses  of B. Morandi and A. Kolesnichenko, co-supervised with S. Nanz. See the co-authored papers [345] and [363].

On the verification aspect of exceptions, see below.

Software design: refactoring

I have never seen a discussion of refactoring that refers to the detailed discussion of generalization in both of the books Reusable Software (1994, chapter 3) and Object Success (Prentice Hall, 1995, from page 122 to the end of chapter 6). These discussions describe in detail how, once a program has been shown to work, it should be subject to a posteriori design improvements. It presents several of the refactoring techniques (as they were called when the idea gained traction several years later), such as moving common elements up in the class hierarchy, and adding an abstract class as parent to concrete classes ex post facto.

These ideas are an integral part of the design methodology presented in these books (and again in OOSC2 a few later). It is beyond me why people would present refactoring (or its history, as in the Wikipedia entry on the topic) without referring to these publications, which were widely circulated and are available for anyone to inspect.

Software design: built-in documentation and Single-Product principle

Another original contribution was the idea of including documentation in the code itself and relying on tools to extract the documentation-only information (leaving implementation elements aside). The idea, described in detail in OOSC1 in 1988 (sections 9.4 and 9.5) and already mentioned in the earlier Eiffel papers, is that code should be self-complete, containing elements of various levels of abstraction; some of them describe implementation, but the higher-level elements describe specification, and are distinguished syntactically in such a way that tools can extract them to produce documentation at any desired level of abstraction.

The ideas were later applied through such mechanisms as JavaDoc (with no credit as far as I know). They were present in Eiffel from the start and the underlying principles, in particular the “Single Product principle” (sometimes “Self-Documentation principle”, and also generalized by J. Ostroff and R. Paige as “Single-Model principle”). Eiffel is the best realization of these principles thanks to:

  • Contracts (as mentioned above): the “contract view” of a class (called “short form” in earlier descriptions) removes the implementations but shows the relevant preconditions, postconditions and class invariants, given a precise and abstract specification of the class.
  • Eiffel syntax has a special place for “header comments”, which describe high-level properties and remain in the contract view.
  • Eiffel library class documentation has always been based on specifications automatically extracted from the actual text of the classes, guaranteeing adequacy of the documentation. Several formats are supported (including, from 1995 on, HTML, so that documentation can be automatically deployed on the Web).
  • Starting with the EiffelCase tool in the early 90s, and today with the Diagram Tool of EiffelStudio, class structures (inheritance and client relationships) are displayed graphically, again in an automatically extracted form, using either the BON or UML conventions.

One of the core benefits of the Single-Product principle is to guard against what some of my publications called the “Dorian Gray” syndrome: divergence of an implementation from its description, a critical problem in software because of the ease of modifying stuff. Having the documentation as an integral part of the code helps ensure that when information at some level of abstraction (specification, design, implementation) changes, the other levels will be updated as well.

Crucial in the approach is the “roundtripping” requirement: specifiers or implementers can make changes in any of the views, and have them reflected automatically in the other views. For example, you can graphically draw an arrow between two bubbles representing classes B and A in the Diagram Tool, and the code of B will be updated with “inherit A”; or you can add this Inheritance clause textually in the code of class B, and the diagram will be automatically updated with an arrow.

It is important to note how contrarian and subversive these ideas were at the time of their introduction (and still to some extent today). The wisdom was that you do requirements then design then implementation, and that code is a lowly product entirely separate from specification and documentation. Model-Driven Development perpetuates this idea (you are not supposed to modify the code, and if you do there is generally no easy way to propagate the change to the model.) Rehabilitating the code (a precursor idea to agile methods, see below) was a complete change of perspective.

I am aware of no precedent for this Single Product approach. The closest earlier ideas I can think of are in Knuth’s introduction of Literate Programming in the early eighties (with a book in 1984). As in the Single-product approach, documentation is interspersed with code. But the literate programming approach is (as presented) top-down, with English-like explanations progressively being extended with implementation elements. The Single Product approach emphasizes the primacy of code and, in terms of the design process, is very much yoyo, alternating top-down (from the specification to the implementation) and bottom-up (from the implementation to the abstraction) steps. In addition, a large part of the documentation, and often the most important one, is not informal English but formal assertions. I knew about Literate Programming, of course, and learned from it, but Single-Product is something else.

Software design: from patterns to components

Karine Arnout’s thesis at ETH Zurich, resulting in two co-authored articles ([255] and [257], showed that contrary to conventional wisdom a good proportion of the classical design patterns, including some of the most sophisticated, can be transformed into reusable components (indeed part of an Eiffel library). The agent mechanism (see below) was instrumental in achieving that result.

Programming, design and specification concepts: abstract data types

Liskov’s and Zilles’s ground-breaking 1974 abstract data types paper presented the concepts without a mathematical specification, using programming language constructs instead. A 1976 paper (number [3] in my publication list, La Description des Structures de Données, i.e. the description of data structures) was as far as I know one of the first to present a mathematical formalism, as  used today in presentations of ADTs. John Guttag was taking a similar approach in his PhD thesis at about the same time, and went further in providing a sound mathematical foundation, introducing in particular (in a 1978 paper with Jim Horning) the notion of sufficient completeness, to which I devoted a full article in this blog  (Are My Requirements Complete?) about a year ago. My own article was published in a not very well known journal and in French, so I don’t think it had much direct influence. (My later books reused some of the material.)

The three-level description approach of that article (later presented in English for an ACM workshop in the US in 1981, Pingree Park, reference [28]) is not well known but still applicable, and would be useful to avoid frequent confusions between ADT specifications and more explicit descriptions.

When I wrote my 1976 paper, I was not aware of Guttag’s ongoing work (only of the Liskov and Zilles paper), so the use of a mathematical framework with functions and predicates on them was devised independently. (I remember being quite happy when I saw what the axioms should be for a queue.) Guttag and I both gave talks at a workshop organized by the French programming language interest group in 1977 and it was fun to see that our presentations were almost identical. I think my paper still reads well today (well, if you read French). Whether or not it exerted direct influence, I am proud that it independently introduced the modern way of thinking of abstract data types as characterized by mathematical functions and their formal (predicate calculus) properties.

Language mechanisms: genericity with inheritance

Every once in a while I get to referee a paper that starts “Generics, as introduced in Java…” Well, let’s get some perspective here. Eiffel from its introduction in 1985 combined genericity and inheritance. Initially, C++ users and designers claimed that genericity was not needed in an OO context and the language did not have it; then they introduced template. Initially, the designers of Java claimed (around 1995) that genericity was not needed, and the language did not have it; a few years later Java got generics. Initially, the designers of C# (around 1999) claimed that genericity was not needed, and the language did not have it; a few years later C# and .NET got generics.

Genericity existed before Eiffel of course; what was new was the combination with inheritance. I had been influenced by work on generic modules by a French researcher, Didier Bert, which I believe influenced the design of Ada as well; Ada was the language that brought genericity to a much broader audience than the somewhat confidential languages that had such a mechanism before. But Ada was not object-oriented (it only had modules, not classes). I was passionate about object-oriented programming (at a time when it was generally considered, by the few people who had heard of it as an esoteric, academic pursuit). I started — in the context of an advanced course I was teaching at UC Santa Barbara — an investigation of how the two mechanisms relate to each other. The results were a paper at the first OOPSLA in 1986, Genericity versus Inheritance, and the design of the Eiffel type system, with a class mechanism, inheritance (single and multiple), and genericity, carefully crafted to complement each other.

With the exception of a Trellis-Owl, a  design from Digital Equipment Corporation also presented at the same OOPSLA (which never gained significant usage), there were no other OO languages with both mechanisms for several years after the Genericity versus Inheritance paper and the implementation of genericity with inheritance in Eiffel available from 1986 on. Eiffel also introduced, as far as I know, the concept of constrained genericity, the second basic mechanism for combining genericity with inheritance, described in Eiffel: The Language (Prentice Hall, 1992, section 10.8) and discussed again in OOSC2 (section 16.4 and throughout). Similar mechanisms are present in many languages today.

It was not always so. I distinctly remember people bringing their friends to our booth at some conference in the early nineties, for the sole purpose of having a good laugh with them at our poster advertising genericity with inheritance. (“What is this thing they have and no one else does? Generi-sissy-tee? Hahaha.”). A few years later, proponents of Java were pontificating that no serious language needs generics.

It is undoubtedly part of of the cycle of invention (there is a Schopenhauer citation on this, actually the only thing from Schopenhauer’s philosophy that I ever understood [D]) that people at some point will laugh at you; if it did brighten their day, why would the inventor deny them one of the little pleasures of life? But in terms of who laughs last, along the way C++ got templates, Java got generics, C# finally did too, and nowadays all typed OO languages have something of the sort.

Language mechanisms: multiple inheritance

Some readers will probably have been told that multiple inheritance is a bad thing, and hence will not count it as a contribution, but if done properly it provides a major abstraction mechanism, useful in many circumstances. Eiffel showed how to do multiple inheritance right by clearly distinguishing between features (operations) and their names, defining a class as a finite mapping between names and features, and using renaming to resolve any name clashes.

Multiple inheritance was made possible by an implementation innovation: discovering a technique (widely imitated since, including in single-inheritance contexts) to implement dynamic binding in constant time. It was universally believed at the time that multiple inheritance had a strong impact on performance, because dynamic binding implied a run-time traversal of the class inheritance structure, already bad enough for single inheritance where the structure is a tree, but prohibitive with multiple inheritance for which it is a directed acyclic graph. From its very first implementation in 1986 Eiffel used what is today known as a virtual table technique which guarantees constant-time execution of routine (method) calls with dynamic binding.

Language mechanisms: safe GC through strong static typing

Simula 67 implementations did not have automatic garbage collection, and neither had implementations of C++. The official excuse in the C++ case was methodological: C programmers are used to exerting manual control of memory usage. But the real reason was a technical impossibility resulting from the design of the language: compatibility with C precludes the provision of a good GC.

More precisely, of a sound and complete GC. A GC is sound if it will only reclaim unreachable objects; it is complete if it will reclaim all unreachable objects. With a C-based language supporting casts (e.g. between integers and pointers) and pointer arithmetic, it is impossible to achieve soundness if we aim at a reasonable level of completeness: a pointer can masquerade as an integer, only to be cast back into a pointer later on, but in the meantime the garbage collector, not recognizing it as a pointer, may have wrongly reclaimed the corresponding object. Catastrophe.

It is only possible in such a language to have a conservative GC, meaning that it renounces completeness. A conservative GC will treat as a pointer any integer whose value could possibly be a pointer (because it lies between the bounds of the program’s data addresses in memory). Then, out of precaution, the GC will refrain from reclaiming the objects at these addresses even if they appear unreachable.

This approach makes the GC sound but it is only a heuristics, and it inevitably loses completeness: every once in a while it will fail to reclaim some dead (unreachable) objects around. The result is a program with memory leaks — usually unacceptable in practice, particularly for long-running or continuously running programs where the leaks inexorably accumulate until the program starts thrashing then runs out of memory.

Smalltalk, like Lisp, made garbage collection possible, but was not a typed language and missed on the performance benefits of treating simple values like integers as a non-OO language would. Although in this case I do not at the moment have a specific bibliographic reference, I believe that it is in the context of Eiffel that the close connection between strong static typing (avoiding mechanisms such as casts and pointer arithmetic) and the possibility of sound and complete garbage collection was first clearly explained. Explained in particular around 1990 in a meeting with some of the future designers of Java, which uses a similar approach, also taken over later on by C#.

By the way, no one will laugh at you today for considering garbage collection as a kind of basic human right for programmers, but for a long time the very idea was quite sulfurous, and advocating it subjected you to a lot of scorn. Here is an extract of the review I got when I submitted the first Eiffel paper to IEEE Transactions on Software Engineering:

Systems that do automatic garbage collection and prevent the designer from doing his own memory management are not good systems for industrial-strength software engineering.

Famous last words. Another gem from another reviewer of the same paper:

I think time will show that inheritance (section 1.5.3) is a terrible idea.

Wow! I wish the anonymous reviewers would tell us what they think today. Needless to say, the paper was summarily rejected. (It later appeared in the Journal of Systems and Software — as [82] in the publication list — thanks to the enlightened views of Robert Glass, the founding editor.)

Language mechanisms: void safety

Void safety is a property of a language design that guarantees the absence of the plague of null pointer dereferencing.

The original idea came (as far as I know) from work at Microsoft Research that led to the design of a research language called C-omega; the techniques were not transferred to a full-fledged programming language. Benefiting from the existence of this proof of concept, the Eiffel design was reworked to guarantee void safety, starting from my 2005 ECOOP keynote paper (Attached Types) and reaching full type safety a few years later. This property of the language was mechanically proved in a 2016 ETH thesis by A. Kogtenkov.

Today all significant Eiffel development produces void-safe code. As far as I know this was a first among production programming languages and Eiffel remains the only production language to provide a guarantee of full void-safety.

This mechanism, carefully crafted (hint: the difficult part is initialization), is among those of which I am proudest, because in the rest of the programming world null pointer dereferencing is a major plague, threatening at any moment to crash the execution of any program that uses pointers of references. For Eiffel users it is gone.

Language mechanisms: agents/delegates/lambdas

For a long time, OO programming languages did not have a mechanism for defining objects wrapping individual operations. Eiffel’s agent facility was the first such mechanism or among the very first together the roughly contemporaneous but initially much more limited delegates of C#. The 1999 paper From calls to agents (with P. Dubois, M. Howard, M. Schweitzer and E. Stapf, [196] in the list) was as far as I know the first description of such a construct in the scientific literature.

Language mechanisms: concurrency

The 1993 Communications of the ACM paper on Systematic Concurrent Object-Oriented Programming [136] was certainly not the first concurrency proposal for OO programming (there had been pioneering work reported in particular in the 1987 book edited by Tokoro and Yonezawa), but it innovated in offering a completely data-race-free model, still a rarity today (think for example of the multi-threading mechanisms of dominant OO languages).

SCOOP, as it came to be called, was implemented a few years later and is today a standard part of Eiffel.

Language mechanisms: selective exports

Information hiding, as introduced by Parnas in his two seminal 1972 articles, distinguishes between public and secret features of a module. The first OO programming language, Simula 67, had only these two possibilities for classes and so did Ada for modules.

In building libraries of reusable components I realized early on that we need a more fine-grained mechanism. For example if class LINKED_LIST uses an auxiliary class LINKABLE to represent individual cells of a linked list (each with a value field and a “right” field containing a reference to another LINKABLE), the features of LINKABLE (such as the operation to reattach the “right” field) should not be secret, since LINKED_LIST needs them; but they should also not be generally public, since we do not want arbitrary client objects to mess around with the internal structure of the list. They should be exported selectively to LINKED_LIST only. The Eiffel syntax is simple: declare these operations in a clause of the class labeled “feature {LINKED_LIST}”.

This mechanism, known as selective exports, was introduced around 1989 (it is specified in full in Eiffel: The Language, from 1992, but was in the Eiffel manuals earlier). I think it predated the C++ “friends” mechanism which serves a similar purpose (maybe someone with knowledge of the history of C++ has the exact date). Selective exports are more general than the friends facility and similar ones in other OO languages: specifying a class as a friend means it has access to all your internals. This solution is too coarse-grained. Eiffel’s selective exports make it possible to define the specific export rights of individual operations (including attributes/fields) individually.

Language mechanisms and implementation: serialization and schema evolution

I did not invent serialization. As a student at Stanford in 1974 I had the privilege, at the AI lab, of using SAIL (Stanford Artificial Intelligence Language). SAIL was not object-oriented but included many innovative ideas; it was far ahead of its time, especially in terms of the integration of the language with (what was not yet called) its IDE. One feature of SAIL with which one could fall in love at first sight was the possibility of selecting an object and having its full dependent data structure (the entire subgraph of the object graph reached by following references from the object, recursively) stored into a file, for retrieval at the next section. After that, I never wanted again to live without such a facility, but no other language and environment had it.

Serialization was almost the first thing we implemented for Eiffel: the ability to write object.store (file) to have the entire structure from object stored into file, and the corresponding retrieval operation. OOSC1 (section 15.5) presents these mechanisms. Simula and (I think) C++ did not have anything of the sort; I am not sure about Smalltalk. Later on, of course, serialization mechanisms became a frequent component of OO environments.

Eiffel remained innovative by tackling the difficult problems: what happens when you try to retrieve an object structure and some classes have changed? Only with a coherent theoretical framework as provided in Eiffel by Design by Contract can one devise a meaningful solution. The problem and our solutions are described in detail in OOSC2 (the whole of chapter 31, particularly the section entitled “Schema evolution”). Further advances were made by Marco Piccioni in his PhD thesis at ETH and published in joint papers with him and M. Oriol, particularly [352].

Language mechanisms and implementation: safe GC through strong static typing

Simula 67 (if I remember right) did not have automatic garbage collection, and neither had C++ implementations. The official justification in the case of C++ was methodological: C programmers are used to exerting manual control of memory usage. But the real obstacle was technical: compatibility with C makes it impossible to have a good GC. More precisely, to have a sound and complete GC. A GC is sound if it will only reclaim unreachable objects; it is complete if it will reclaim all unreachable objects. With a C-based language supporting casts (e.g. between integers and pointers) and pointer arithmetic, it is impossible to achieve soundness if we aim at a reasonable level of completeness: a pointer can masquerade as an integer, only to be cast back into a pointer later on, but in the meantime the garbage collector, not recognizing it as a pointer, may have wrongly reclaimed the corresponding object. Catastrophe. It is only possible in such a language to have a conservative GC, which will treat as a pointer any integer whose value could possibly be a pointer (because its value lies between the bounds of the program’s data addresses in memory). Then, out of precaution, it will not reclaim the objects at the corresponding address. This approach makes the GC sound but it is only a heuristics, and it may be over-conservative at times, wrongly leaving dead (i.e. unreachable) objects around. The result is, inevitably, a program with memory leaks — usually unacceptable in practice.

Smalltalk, like Lisp, made garbage collection possible, but was not a typed language and missed on the performance benefits of treating simple values like integers as a non-OO language would. Although in this case I do not at the moment have a specific bibliographic reference, I believe that it is in the context of Eiffel that the close connection between strong static typing (avoiding mechanisms such as casts and pointer arithmetic) and the possibility of sound and complete garbage collection was first clearly explained. Explained in particular to some of the future designers of Java, which uses a similar approach, also taken over later on by C#.

By the way, no one will laugh at you today for considering garbage collection as a kind of basic human right for programmers, but for a long time it was quite sulfurous. Here is an extract of the review I got when I submitted the first Eiffel paper to IEEE <em>Transactions on Software Engineering:

Software engineering: primacy of code

Agile methods are widely and properly lauded for emphasizing the central role of code, against designs and other non-executable artifacts. By reading the agile literature you might be forgiven for believing that no one brought up that point before.

Object Success (1995) makes the argument very clearly. For example, chapter 3, page 43:

Code is to our industry what bread is to a baker and books to a writer. But with the waterfall code only appears late in the process; for a manager this is an unacceptable risk factor. Anyone with practical experience in software development knows how many things can go wrong once you get down to code: a brilliant design idea whose implementation turns out to require tens of megabytes of space or minutes of response time; beautiful bubbles and arrows that cannot be implemented; an operating system update, crucial to the project which comes five weeks late; an obscure bug that takes ages to be fixed. Unless you start coding early in the process, you will not be able to control your project.

Such discourse was subversive at the time; the wisdom in software engineering was that you need to specify and design a system to death before you even start coding (otherwise you are just a messy “hacker” in the sense this word had at the time). No one else in respectable software engineering circles was, as far as I know, pushing for putting code at the center, the way the above extract does.

Several years later, agile authors started making similar arguments, but I don’t know why they never referenced this earlier exposition, which still today I find not too bad. (Maybe they decided it was more effective to have a foil, the scorned Waterfall, and to claim that everyone else before was downplaying the importance of code, but that was not in fact everyone.)

Just to be clear, Agile brought many important ideas that my publications did not anticipate; but this particular one I did.

Software engineering: the roles of managers

Extreme Programming and Scrum have brought new light on the role of managers in software development. Their contributions have been important and influential, but here too they were for a significant part prefigured by a long discussion, altogether two chapters, in Object Success (1995).

To realize this, it is enough to read the titles of some of the sections in those chapters, describing roles for managers (some universal, some for a technical manager): “risk manager”, “interface with the rest of the world” (very scrummy!), “protector of the team’s sanity”, “method enforcer” (think Scrum Master), “mentor and critic”. Again, as far as I know, these were original thoughts at the time; the software engineering literature for the most part did not talk about these issues.

Software engineering: outsourcing

As far as I know the 2006 paper Offshore Development: The Unspoken Revolution in Software Engineering was the first to draw attention, in the software engineering community, to the peculiar software engineering challenges of distributed and outsourced development.

Software engineering: automatic testing

The AutoTest project (with many publications, involving I. Ciupa, A. Leitner, Y. Wei, M. Oriol, Y. Pei, M. Nordio and others) was not the first to generate tests automatically by creating numerous instances of objects and calling applicable operations (it was preceded by Korat at MIT), but it was the first one to apply this concept with Design by Contract mechanisms (without which it is of little practical value, since one must still produce test oracles manually) and the first to be integrated in a production environment (EiffelStudio).

Software engineering: make-less system building

One of the very first decisions in the design of Eiffel was to get rid of Make files.

Feldman’s Make had of course been a great innovation. Before Make, programmers had to produce executable systems manually by executing sequences of commands to compile and link the various source components. Make enabled them to instead  to define dependencies between components in a declarative way, resulting in a partial order, and then performed a topological sort to produce the sequence of comments. But preparing the list of dependencies remains a tedious task, particularly error-prone for large systems.

I decided right away in the design of Eiffel that we would never force programmers to write such dependencies: they would be automatically extracted from the code, through an exhaustive analysis of the dependencies between modules. This idea was present from the very the first Eiffel report in 1985 (reference [55] in the publication list): Eiffel programmers never need to write a Make file or equivalent (other than for non-Eiffel code, e.g. C or C++, that they want to integrate); they just click a Compile button and the compiler figures out the steps.

Behind this approach was a detailed theoretical analysis of possible relations between modules in software development (in many programming languages), published as the “Software Knowledge Base” at ICSE in 1985. That analysis was also quite instructive and I would like to return to this work and expand it.

Educational techniques: objects first

Towards an Object-Oriented Curriculum ( TOOLS conference, August 1993, see also the shorter JOOP paper in May of the same year) makes a carefully argued case for what was later called the Objects First approach to teaching programming. I would be interested to know if there are earlier publications advocating starting programming education with an OO language.

The article also advocated for the “inverted curriculum”, a term borrowed from work by Bernie Cohen about teaching electrical engineering. It was the first transposition of this concept to software education. In the article’s approach, students are given program components to use, then little by little discover how they are made. This technique met with some skepticism and resistance since the standard approach was to start from the very basics (write trivial programs), then move up. Today, of course, many introductory programming courses similarly provide students from day one with a full-fledged set of components enabling them to produce significant programs.

More recent articles on similar topics, taking advantage of actual teaching experience, are The Outside-In Method of Teaching Programming (2003) and The Inverted Curriculum in Practice (at ICSE 2006, with Michela Pedroni). The culmination of that experience is the textbook Touch of Class from 2009.

Educational techniques: Distributed Software Projects

I believe our team at ETH Zurich (including among others M. Nordio, J. Tschannen, P. Kolb and C. Estler and in collaboration with C. Ghezzi, E. Di Nitto and G. Tamburrelli at Politecnico di Milano, N. Aguirre at Rio Cuarto and many others in various universities) was the first to devise,  practice and document on a large scale (see publications and other details here) the idea of an educational software project conducted in common by student groups from different universities. It yielded a wealth of information on distributed software development and educational issues.

Educational techniques: Web-based programming exercises

There are today a number of cloud-based environments supporting the teaching of programming by enabling students to compile and test their programs on the Web, benefiting from a prepared environment (so that they don’t have to download any tools or prepare control files) and providing feedback. One of the first — I am not sure about absolute precedence — and still a leading one, used by many universities and applicable to many programming languages, is Codeboard.

The main developer, in my chair at ETH Zurich, was Christian Estler, supported in particular by M. Nordio and M. Piccioni, so I am only claiming a supporting role here.

Educational techniques: key CS/SE concepts

The 2001 paper Software Engineering in the Academy did a good job, I think, of defining the essential concepts to teach in a proper curriculum (part of what Jeannette Wing’s 2006 paper called Computational Thinking).

Program verification: agents (delegates etc.)

Reasoning about Function Objects (ICSE 2010, with M. Nordio, P. Müller and J. Tschannen) introduced verification techniques for objects representing functions (such as agents, delegates etc., see above) in an OO language. Not sure whether there were any such techniques before.

Specification languages: Z

The Z specification language has been widely used for formal development, particularly in the UK. It is the design of J-R Abrial. I may point out that I was a coauthor of the first publication on Z in English (1980),  describing a version that preceded the adaptation to a more graphical-style notation done later at Oxford. The first ever published description of Z, pertaining to an even earlier version, was in French, in my book Méthodes de Programmation (with C. Baudoin), Eyrolles, 1978, running over 15 pages (526-541), with the precise description of a refinement process.

Program verification: exceptions

Largely coming out of the PhD thesis of Martin Nordio, A Sound and Complete Program Logic for Eiffel (TOOLS 2009) introduces rules for dealing with exceptions in a Hoare-style verification framework.

Program verification: full library, and AutoProof

Nadia Polikarpova’s thesis at ETH, aided by the work of Carlo Furia and Julian Tschannen (they were the major contributors and my participation was less important), was as far as I know the first to produce a full functional verification of an actual production-quality reusable library. The library is EiffelBase 2, covering fundamental data structures.

AutoProof — available today, as a still experimental tool, through its Web interface, see here — relied on the AutoProof prover, built by the same team, and itself based on Microsoft Research’s Boogie and Z3 engines.

More

There are more concepts worthy of being included here, but for today I will stop here.

Notes

[A] One point of divergence between usual presentations of the substitution principle and the view in OOSC and my other publications is the covariance versus contravariance of routine argument types. It reflects a difference of views as to what the proper policy (both mathematically sound and practically usable) should be.

[B]  The GoF book does not cite OOSC for the command or bridge patterns. For the command pattern it cites (thanks to Adam Kosmaczewski for digging up the GoF text!) a 1985 SIGGRAPH paper by Henry Lieberman (There’s More to Menu Systems than Meets the Screen). Lieberman’s paper describes the notion of command object and mentions undoing in passing, but does not include the key elements of the command pattern (as explained in full in OOSC1), i.e. an abstract (deferred) command class with deferred procedures called (say) do_it and undo_it, then specific classes for each kind of command, each providing a specific implementation of those procedures, then a history list of commands supporting multiple-level undo and redo as explained in OOSC1. (Reading Lieberman’s paper with a 2021 perspective shows that it came tantalizingly close to the command pattern, but doesn’t get to it. The paper does talk about inheritance between command classes, but only to “define new commands as extensions to old commands”, not in the sense of a general template that can be implemented in many specific ways. And it does mention a list of objects kept around to enable recovery from accidental deletions, and states that the application can control its length, as is the case with a history list; but the objects in the list are not command objects, they are graphical and other objects that have been deleted.)

[C] Additional note on the command pattern: I vaguely remember seeing something similar to the OOSC1 technique in an article from a supplementary volume of the OOPSLA proceedings in the late eighties or early nineties, i.e. at the same time or slightly later, possibly from authors from Xerox PARC, but I have lost the reference.

[D] Correction: I just checked the source and learned that the actual Schopenhauer quote (as opposed to the one that is usually quoted) is different; it does not include the part about laughing. So much for my attempts at understanding philosophy.

 

VN:F [1.9.10_1130]
Rating: 8.7/10 (21 votes cast)
VN:F [1.9.10_1130]
Rating: +5 (from 11 votes)

Talk on requirements at UC Santa Barbara tomorrow

I am giving a “distinguished lecture” at the University of California, Santa Barbara, January 10 (Friday, tomorrow) at 14. The title is A Comprehensive Approach to Requirements Engineering.

The abstract and rest of the information are here.

I will spend the last few minutes of the talk discussing other current developments (verification, concurrency).

VN:F [1.9.10_1130]
Rating: 7.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Defining and classifying requirements (new publication)

Software engineering has improved a lot in the past couple of decades, but there remains an area where the old doomsday style of starting a software engineering paper (software crisis, everything is rotten…) still fits: requirements engineering. Just see the chasm between textbook advice and the practice of most projects.

I have written on requirements in this blog, including very recently, and will continue in forthcoming installments. For today I  want to point to a recent article [1],  presented at the newly revived TOOLS conference in October. It attempts to bring some order and rigor to the basic definitions in the field.

From the abstract:

Requirements engineering is crucial to software development but lacks a precise definition of its fundamental concepts. Even the basic definitions in the literature and in industry standards are often vague and verbose.

To remedy this situation and provide a solid basis for discussions of requirements, this work provides precise definitions of the fundamental requirements concepts and two systematic classifications: a taxonomy of requirement elements (such as components, goals, constraints…) ; and a taxonomy of possible relations between these elements (such as “extends”, “excepts”, “belongs”…).

The discussion evaluates the taxonomies on published requirements documents; readers can test the concepts in two online quizzes.

The intended result of this work is to spur new advances in the study and practice of software requirements by clarifying the fundamental concepts.

This version is a first step; we are aware of its limitations and are already revising the definitions and taxonomy. The project is aimed at providing a solid foundation for a delicate area of software engineering and it will take some time to get it completely right. Still, I think the paper as it is already introduces important concepts. I will within the next two weeks write a more detailed blog article summarizing some of them.

References

[1] Bertrand Meyer, Jean-Michel Bruel, Sophie Ebersold, Florian Galinier, Alexandr Naumchev, The Anatomy of Requirements, in TOOLS 51, Software Technology: Methods and Tools
Innopolis, Russia, October 15–17, 2019, pages 10-40, available here (Springer site, paywall) and here (arXiv draft).

VN:F [1.9.10_1130]
Rating: 10.0/10 (1 vote cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Are my requirements complete?

Some important concepts of software engineering, established over the years, are not widely known in the community. One use of this blog is to provide tutorials on such overlooked ideas. An earlier article covered one pertaining to project management: the Shortest Possible Schedule property . Here is another, this time in the area of requirements engineering, also based on a publication that I consider to be a classic (it is over 40 years old) but almost unknown to practitioners.

Practitioners are indeed, as in most of my articles, the intended audience. I emphasize this point right at the start because if you glance at the rest of the text you will see that it contains (horror of horrors) some mathematical formulae, and might think “this is not for me”. It is! The mathematics is very simple and my aim is practical: to shed light on an eternal question that faces anyone writing requirements (whatever the style, traditional or agile): how can I be sure that a requirements specification is complete?

To a certain extent you cannot. But there is better answer, a remarkably simple one which, while partial, helps.

Defining completeness

The better answer is called “sufficient completeness” and comes from the theory of abstract data types. It was introduced in a 1978 article by Guttag and Horning [1]. It is also implicit in a more down-to-earth document, the 1998 IEEE standard on how to write requirements [2].

There is nothing really new in the present article; in fact my book Object-Oriented Software Construction [3] contains an extensive discussion of sufficient completeness (meant to be more broadly accessible than Guttag and Horning’s scholarly article). But few people know the concepts; in particular very few practitioners have heard of sufficient completeness (if they have heard at all of abstract data types). So I hope the present introduction will be useful.

The reason the question of determining completeness of requirements seems hopeless at first is the natural reaction: complete with respect to what? To know that the specification is complete we would need a more general description of all that our stakeholders want and all the environment constraints, but this would only push the problem further: how do we know that such description itself is complete?

That objection is correct in principle: we can never be sure that we did not forget something someone wanted, or some property that the environment imposes. But there also exist more concrete and assessable notions of completeness.

The IEEE standard gives three criteria of completeness. The first states that “all requirements” have been included, and is useless, since it  runs into the logical paradox mentioned above, and is tautological anyway (the requirements are complete if they include all requirements, thank you for the information!). The second is meaningful but of limited interest (a “bureaucratic” notion of completeness): every element in the requirements document is numbered, every cross-reference is defined and so on. The last criterion is the interesting one: “Definition of the responses of the software to all realizable classes of input data in all realizable classes of situations”. Now this is meaningful. To understand this clause we need to step back to sufficient completeness and, even before that, to abstract data types.

Abstract data types will provide our little mathematical excursion (our formal picnic in the words of an earlier article) in our study of requirements and completeness. If you are not familiar with this simple mathematical theory, which every software practitioner should know, I hope you will benefit from the introduction and example. They will enable us to introduce the notion of sufficient completeness formally before we come back to its application to requirements engineering.

Specifying an abstract data type

 Abstract data types are the mathematical basis for object-oriented programming. In fact, OO programming but also OO analysis and OO design are just a realization of this mathematical concept at various levels of abstraction, even if few OO practitioners are aware of it. (Renewed reference to [3] here if you want to know more.)

An ADT (abstract data type) is a set of objects characterized not by their internal properties (what they are) but by the operations applicable to them (what they have), and the properties of these operations. If you are familiar with OO programming you will recognize that this is exactly, at the implementation level, what a class is. But here we are talking about mathematical objects and we do not need to consider implementation.

An example  of a type defined in this way, as an ADT, is a notion of POINT on a line. We do not say how this object is represented (a concept that is irrelevant at the specification level) but how it appears to the rest of the world: we can create a new point at the origin, ask for the coordinate of a point, or move the point by a certain displacement. The example is the simplest meaningful one possible, but it gives the ideas.

adt

An ADT specification has three part: Functions, Preconditions and Axioms. Let us see them (skipping Preconditions for the moment) for the definition of the POINT abstract data type.

The functions are the operations that characterize the type. There are three kinds of function, defined by where the ADT under definition, here POINT, appears:

  • Creators, where the type appears only among the results.
  • Queries, where it appears only among the arguments.
  • Commands, where it appears on both sides.

There is only one creator here:

new: → POINT

new is a function that takes no argument, and yields a point (the origin). We will write the result as just new (rather than using empty parentheses as in new ()).

Creators correspond in OO programming to constructors of a class (creation procedures in Eiffel). Like constructors, creators may have arguments: for example instead of always creating a point at the origin we could decide that new creates a point with a given coordinate, specifying it as INTEGER → POINT and using it as new (i) for some integer i (our points will have integer coordinates). Here for simplicity we choose a creator without arguments. In any case the new type, here POINT, appears only on the side of the results.

Every useful ADT specification needs at least one creator, without which we would never obtain any objects of the type (here any points) to work with.

There is also only one query:

x: POINT → INTEGER

 which gives us the position of a point, written x (p) for a point p. More generally, a query enables us to obtain properties of objects of the new type. These properties must be expressed in terms of types that we have already defined, like INTEGER here. Again there has to be at least one query, otherwise we could never obtain usable information (information expressed in terms of what we already know) about objects of the new type. In OO programming, queries correspond to fields (attributes) of a class and functions without side effects.

And we also have just one command:

move: POINT × INTEGER → POINT

a function that for any point p and integer i and yields a new point, move (p, i).  Again an ADT specification is not interesting unless it has at least one command, representing ways to modify objects. (In mathematics we do not actually modify objects, we get new objects. In imperative programming we will actually update existing objects.) In the classes of object-oriented programming, commands correspond to procedures (methods which may change objects).

You see the idea: define the notion of POINT through the applicable operations.

Listing their names and the types of their arguments types results (as in POINT × INTEGER → POINT) is not quite enough to specify these operations: we must specify their fundamental properties, without of course resorting to a programming implementation. That is the role of the second component of an ADT specification, the axioms.

For example I wrote above that new yields the origin, the point for which x = 0,  but you only had my word for it. My word is good but not good enough. An axiom will give you this property unambiguously:

x (new) = 0                                    — A0

The second axiom, which is also the last, tells us what move actually does. It applies to any point p and any integer m:

x (move (p, m)) = x (p) + m       — A1

In words: the coordinate of the point resulting from moving p by m is the coordinate of p plus m.

That’s it! (Except for the notion of precondition, which will wait a bit.) The example is trivial but this approach can be applied to any number of  data types, with any number of applicable operations and any level of complexity. That is what we do, at the design and implementation level, when writing classes in OO programming.

Is my ADT sufficiently complete?

Sufficient completeness is a property that we can assess on such specifications. An ADT specification for a type T (here POINT) is sufficiently complete if the axioms are powerful enough to yield the value of any well-formed query expression in a form not involving T. This definition contains a few new terms but the concepts are very simple; I will explain what it means through an example.

With an ADT specification we can form all kinds of expressions, representing arbitrarily complex specifications. For example:

x (move (move (move (new, 3), x (move (move (new, -2), 4))), -6))

This expression will yield an integer (since function x has INTEGER as its result type) describing the result of a computation with points. We can visualize this computation graphically; note that it involves creating two points (since there are two occurrences of new) and moving them, using in one case the current coordinate of one of them as displacement for the other. The following figure illustrates the process.

computation

The result, obtained informally by drawing this picture, is the x of P5, that is to say -1. We will derive it mathematically below.

Alternatively, if like most programmers (and many other people) you find it more intuitive to reason operationally than mathematically, you may think of the previous expression as describing the result of the following OO program (with variables of type POINT):

create p                                — In C++/Java syntax: p = new POINT();
create q
p.move (3)
q.move (-2)
q.move (4)
p.move (q.x)
p.move (-6)

Result := p.x

You can run this program in your favorite OO programming language, using a class POINT with new, x and move, and print the value of Result, which will be -1.

Here, however, we will stay at the mathematical level and simplify the expression using the axioms of the ADT, the same way we would compute any other mathematical formula, applying the rules without needing to rely on intuition or operational reasoning. Here is the expression again (let’s call it i, of type INTEGER):

ix (move (move (move (new, 3), x (move (move (new, -2), 4))), -6))

A query expression is one in which the outermost function being applied, here x, is a query function. Remember that a query function is one which the new type, here POINT, appears only on the left. This is the case with x, so the above expression i is indeed a query expression.

For sufficient completeness, query expressions are the ones of interest because their value is expressed in terms of things we already know, like INTEGERs, so they are the only way we can concretely obtain directly usable information the ADT (to de-abstract it, so to speak).

But we can only get such a value by applying the axioms. So the axioms are “sufficiently complete” if they always give us the answer: the value of any such query expression.

 Let us see if the above expression i satisfies this condition of sufficient completeness. To make it more tractable let us write  it in terms of simpler expressions (all of type POINT), as illustrated by the figure below:

p1 = move (new, 3)
p2= move (new, -2)
p3= move (p2, 4)
p4= move (p1, x (p3))
p5= move (p4, -6)
i = x (p5)

expression

(You may note that the intermediate expressions roughly correspond to the steps in the above interpretation of the computation as a program. They also appear in the illustrative figure repeated below.)

computation

Now we start applying the axioms to evaluating the expressions. Remember that we have two axioms: A0 tells us that x (new) = 0 and A1 that x (move (p, m)) = x (p) + m. Applying A1 to the definition the expression i yields

i = x (p4) – 6
= i4 – 6

if we define

i4 = x (p4)      — Of type INTEGER

We just have to compute i4. Applying A1 to the definion of p4 tells us that

i4 = x (p1) + x (p3)

To compute the two terms:

  • Applying A1 again, we see that the first term x (p1) is x (new) + 3, but then A0 tells us that x (new) is zero, so x (p1) is 3.
  • As to x (p3), it is, once more from A1, x (p2) + 4, and x (p2) is (from A1 then A0), just -2, so x (p3) is 2.

In the end, then, i4 is 5, and the value of the entire expression i = i4 – 6 is -1. Good job!

Proving sufficient completeness

The successful computation of i was just a derivation for one example, showing that in that particular case the axioms yield the answer in terms of an INTEGER. How do we go from one example to an entire specification?

The bad news first: like all interesting problems in programming, sufficient completeness of an ADT specification is theoretically undecidable. There is no general automatic procedure that will process an ADT specification and print out ““sufficiently complete” or “not sufficiently complete”.

Now that you have recovered from the shock, you can share the computer scientist’s natural reaction to such an announcement: so what. (In fact we might define the very notion of computer scientist as someone who, even before he brushes his teeth in the morning — if he brushes them at all — has already built the outline of a practical solution to an undecidable problem.) It is enough that we can find a way to determine if a given specification is sufficiently complete. Such a proof is, in fact, the computer scientist’s version of dental hygiene: no ADT is ready for prime time unless it is sufficiently complete.

The proof is usually not too hard and will follow the general style illustrated for our simple example.

We note that the definition of sufficient completeness said: “the axioms are powerful enough to yield the value of any well-formed query expression in a form not involving the type”. I have not defined “well-formed” yet. It simply means that the expressions are properly structured, with the proper syntax (basically the correct matching of parentheses) and proper number and types of arguments. For example the following are not well-formed (if p is an expression of type POINT):

move (p, 55(     — Bad use of parentheses.
move (p)            — Wrong number of arguments.
move (p, p)       — Wrong type: second argument should be an integer.

Such expressions are nonsense, so we only care about well-formed expressions. Note that in addition to new, x and move , an expression can use integer constants as in the example (although we could generalize to arbitrary integer expressions). We consider an integer constant as a query expression.

We have to prove that with the two axioms A0 and A1 we can determine the value of any query expression i. Note that since the only query functions is x, the only possible form for i, other than an integer constant, is x (p) for some expression p of type POINT.

The proof proceeds by induction on the number n of parenthesis pairs in a query expression i.

There are two base steps:

  • n = 0: in that case i can only be an integer constant. (The only expression with no parentheses built out of the ADT’s functions is new, and it is not a query expression.) So the value is known. In all other cases i will be of the form x (p) as noted.
  • n = 1: in that case p  can only be new, in other words i = x (new), since the only function that yields points, other than new, is move, and any use of it would add parentheses. In this case axiom A0 gives us the value of i: zero.

For the induction step, we consider i with n + 1 parenthesis pairs for n > 1. As noted, i is of the form x (p), so p has exactly n parenthesis pairs. p cannot be new (which would give 0 parenthesis pairs and was taken care of in the second base step), so p has to be of the form

p =  move (p’, i’)    — For expressions p’ of type POINT and i’ of type INTEGER.

implying (since i = x (p)) that by axiom A1, the value of i is

x (p’) + i’

So we will be able to determine the value of i if we can determine the value of both x (p’) and i’. Since p has n parenthesis pairs and p =  move (p’, i’), both p’ and i’ have at most n – 1 parenthesis pairs. (This use of n – 1 is legitimate because we have two base steps, enabling us to assume n > 1.) As a consequence, both x (p’) and i’ have at most n parenthesis pairs, enabling us to deduce their values, and hence the value of i, by the induction hypothesis.

Most proofs of sufficient completeness in my experience follow this style: induction on the number of parenthesis pairs (or the maximum nesting level).

Preconditions

I left until now the third component of a general ADT specification: preconditions. The need for preconditions arises because most practical specifications need some of their functions to be partial. A partial function from X to Y is a function that may not yield a value for some elements of X. For example, the inverse function on real numbers, which yields 1 / a for x, is partial  since it is not defined for a = 0 (or, on a computer, for non-zero but very small a).

Assume that in our examples we only want to accept points that lie in the interval [-4, +4]:

limited

 We can simply model this property by turning move into a partial function. It was specified above as

move: POINT × INTEGER → POINT

The ordinary arrow → introduces a total (always defined) function. For a partial function we will use a crossed arrow ⇸, specifying the function as

move: POINT × INTEGER ⇸ POINT

Other functions remain unchanged. Partial functions cause trouble: for f in X ⇸ Y we can no longer cheerfully use f (x) if f is a partial function, even for x of the appropriate type X. We have to make sure that x belongs to the domain of f, meaning the set of values for which f is defined. There is no way around it: if you want your specification to be meaningful and it uses partial functions, you must specify explicitly the domain of each of them. Here is how to do it, in the case of move:

move (p: POINT; d: INTEGER) require |x (p) + d | < 5    — where |…| is absolute value

To adapt the definition (and proofs) of sufficient completeness to the possible presence of partial functions:

  • We only need to consider (for the rule that axioms must yield the value of query expressions) well-formed expressions that satisfy the associated preconditions.
  • The definition must, however, include the property that axioms always enable us to determine whether an expression satisfies the associated preconditions (normally a straightforward part of the proof since preconditions are themselves query expressions).

Updating the preceding proof accordingly is not hard.

Back to requirements

The definition of sufficient completeness is of great help to assess the completeness of a requirements document. We must first regretfully note that for many teams today requirements stop at  “use cases” (scenarios) or  “user stories”. Of course these are not requirements; they only describe individual cases and are to requirements what tests are to programs. They can serve to check requirements, but do not suffice as requirements. I am assuming real requirements, which include descriptions of behavior (along with other elements such as environment properties and project properties). To describe behaviors, you will define operations and their effects. Now we know what the old IEEE standard is telling us by stating that complete requirements should include

definition of the responses of the software to all realizable classes of input data in all realizable classes of situations

Whether or not we have taken the trouble to specify the ADTs, they are there in the background; our system’s operations reflect the commands, and the effects we can observe reflect the queries. To make our specification complete, we should draw as much as possible of the (mental or explicit) matrix of possible effects of all commands on all queries. “As much as possible” because software engineering is engineering and we will seldom be able to reach perfection. But the degree of fullness of the matrix tells us a lot (possible software metric here?) about how close our requirements are to completeness.

I should note that there are other aspects to completeness of requirements. For example the work of Michael Jackson, Pamela Zave and Axel van Lamsweerde (more in some later article, with full references) distinguishes between business goals, environment constraints and system properties, leading to a notion of completeness as how much the system properties meet the goals and obey the constraints [4]. Sufficient completeness operates at the system level and, together with its theoretical basis, is one of those seminal concepts that every practicing software engineer or project manager should master.

References and notes

[1] John V. Guttag, Jim J. Horning: The Algebraic Specification of Abstract Data Types, in Acta Informatica, vol. 10, no. 1, pages 27-52, 1978, available here from the Springer site. This is a classic paper but I note that few people know it today; in Google Scholar I see over 700 citations but less than 100 of them in the past 8 years.

[2]  IEEE: Recommended Practice for Software Requirements Specifications, IEEE Standard 830-1998, 1998. This standard is supposed to be obsolete and replaced by newer ones, more detailed and verbose, but it remains the better reference: plain, modest and widely applied by the industry. It does need an update, but a good one.

[3] Bertrand Meyer, Object-Oriented Software Construction, 2nd edition, Prentice Hall, 1997. The discussion of sufficient completeness was in fact already there in the first edition from 1988.

[4] With thanks to Elisabetta Di Nitto from Politecnico di Milano for bringing up this notion of requirements completeness.

Recycled A version of this article was first published on the Communications of the ACM blog.

 

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Formality in requirements: new publication

The best way to make software requirements precise is to use one of the available “formal” approaches. Many have been proposed; I am not aware of a general survey published so far. Over the past two years, we have been working on a comprehensive survey of the use of formality in requirements, of which we are now releasing a draft. “We” is a joint informal research group from Innopolis University and the University of Toulouse, whose members have been cooperating on requirements issues, resulting in publications listed  under “References” below and in several scientific events.

The survey is still being revised, in particular because it is longer than the page limit of its intended venue (ACM Computing Surveys), and some sections are in need of improvement. We think, however, that the current draft can already provide a solid reference in this fundamental area of software engineering.

The paper covers a broad selection of methods, altogether 22 of them, all the way from completely informal to strictly formal. They are grouped into five categories: natural language, semi-formal, automata- or graph-based, other mathematical frameworks, programming-language based. Examples include SysML, Relax, Statecharts, VDM, Eiffel (as a requirements notation), Event-B, Alloy. For every method, the text proposes a version of a running example (the Landing Gear System, already used in some of our previous publications) expressed in the corresponding notation. It evaluates the methods using a set of carefully defined criteria.

The paper is: Jean-Michel Bruel, Sophie Ébersold, Florian Galinier, Alexandr Naumchev, Manuel Mazzara and Bertrand Meyer: Formality in Software Requirements, draft, November 2019.

The text is available here. Comments on the draft are welcome.

References

Bertrand Meyer, Jean-Michel Bruel, Sophie Ebersold, Florian Galinier and Alexandr Naumchev: Towards an Anatomy of Software Requirements, in TOOLS 2019, pages 10-40, see here (or arXiv version here). I will write a separate blog article about this publication.

Alexandr Naumchev and Bertrand Meyer: Seamless requirements. Computer Languages, Systems & Structures 49, 2017, pages 119-132, available here.

Florian Galinier, Jean-Michel Bruel, Sophie Ebersold and Bertrand Meyer: Seamless Integration of Multirequirements, in Complex Systems, 25th International Requirements Engineering Conference Workshop, IEEE, pages 21-25, 2017, available here.

Alexandr Naumchev, Manuel Mazzara, Bertrand Meyer, Jean-Michel Bruel, Florian Galinier and Sophie Ebersold: A contract-based method to specify stimulus-response requirements, Proceedings of the Institute for System Programming, vol. 29, issue 4, 2017, pp. 39-54. DOI: 10.15514, available here.

Alexandr Naumchev and Bertrand Meyer: Complete Contracts through Specification Drivers., in 10th International Symposium on Theoretical Aspects of Software Engineering (TASE), pages 160-167, 2016, available here.

Alexandr Naumchev, Bertrand Meyer and Víctor Rivera: Unifying Requirements and Code: An Example, in PSI 2015 (Ershov conference, Perspective of System Informatics), pages 233-244, available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Sunrise was foggy today

Once you have learned the benefits of formally expressing requirements, you keep noticing potential ambiguities and other deficiencies [1] in everyday language. Most such cases are only worth a passing smile, but here’s one that perhaps can serve to illustrate a point with business analysts in your next requirements engineering workshop or with students in your next software engineering lecture.

As a customer of the Swiss telecommunications company Sunrise I receive an occasional “news” email. (As a customer of the Swiss telecommunications company Sunrise I would actually prefer that they spend my money improving  bandwidth,  but let us not digress.) Rather than raw marketing messages these are tips for everyday life, with the presumed intent of ingratiating the populace. For example, today’s message helpfully advises me on how to move house. The admirable advice starts (my translation):

10.7% of all Swiss people relocate every year. Is that your case too for next Autumn?

Actually no, it’s not my case (neither a case of being one of the “Swiss people” nor a case of intending to relocate this Fall). And, ah, the beauty of ridiculously precise statistics! Not 10.8% or 10.6%, mind you, no, 10.7% exactly! But consider the first sentence and think of something similar appearing in a requirements document or user story. Something similar does appear in such documents, all the time, leading to confusions for the programmers interpreting them and to bugs in the resulting systems. Those restless Swiss! Did you know that they include an itchy group, exactly 922,046 people (I will not be out-significant-digited!), who relocate every year?

Do not be silly, I hear you saying. What Sunrise is sharing of its wisdom is that every year a tenth of the Swiss population moves, but not the same tenth every year. Well, OK, maybe I am being silly. But if you think of a programmer reading such a statement about some unfamiliar domain (not one about which we can rely on common sense), the risk of confusion and consequent bugs is serious.

As [1] illustrated in detail, staying within the boundaries of natural language to resolve such possible ambiguities only results in convoluted requirements that make matters worse. The only practical way out is, for delicate system properties, to use precise language, also known technically as “mathematics”.

Here for example a precise formulation of the two possible interpretations removes any doubt. Let Swiss denote the set of Swiss people and  E the number of elements (cardinal) of a finite set E, which we can apply to the example because the set of Swiss people is indeed finite. Let us define slice as the Sunrise-official number of Swiss people relocating yearly, i.e. slice = Swiss ∗ 0.107 (the actual value appeared above). Then one interpretation of the fascinating Sunrise-official fact is:

{s: Swiss | (∀y: Year | s.is_moving (y))} = slice

In words: the cardinal of the set of Swiss people who move every year (i.e., such that for every year y they move during y) is equal to the size of the asserted population subset.

The other possible interpretation, the one we suspect would be officially preferred by the Sunrise powers (any formal-methods fan from Sunrise marketing reading this, please confirm or deny!), is:

∀y: Year | {s: Swiss | s.is_moving (y)} = slice

In words: for any year y, the cardinal of the set of Swiss people who move during y is equal to the size of the asserted subset.

This example is typical of where and why we need mathematics in software requirements. No absolutist stance here, no decree  that everything become formal (mathematical). Natural language is not going into retirement any time soon. But whenever one spots a possible ambiguity or imprecision, the immediate reaction should always be to express the concepts mathematically.

To anyone who has had a successful exposure to formal methods this reaction is automatic. But I keep getting astounded not only by  the total lack of awareness of these simple ideas among the overwhelming majority of software professionals, but also by their absence from the standard curriculum of even top universities. Most students graduate in computer science without ever having heard such a discussion. Where a formal methods course does exist, it is generally as a specialized topic reserved for a small minority, disconnected (as Leslie Lamport has observed [2]) from the standard teaching of programming and software engineering.

In fact all software engineers should possess the ability to go formal when and where needed. That skill is not hard to learn and should be practiced as part of the standard curriculum. Otherwise we keep training the equivalent of electricians rather than electrical engineers, programmers keep making damaging mistakes from misunderstanding ambiguous or inconsistent requirements, and we all keep suffering from buggy programs.

 

References

[1] Self-citation appropriate here: Bertrand Meyer: On Formalism in Specifications, IEEE Software, vol. 3, no. 1, January 1985, pages 6-25, available here.

[2] Leslie Lamport: The Future of Computing: Logic or Biology, text of a talk given at Christian Albrechts University, Kiel on 11 July 2003, available here.

VN:F [1.9.10_1130]
Rating: 9.8/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

The Formal Picnic approach to requirements

picnicRequirements engineering (studying and documenting what a software system should do, independently of how it will do it) took some time to be recognized as a key part of software engineering, since the early focus was, understandably, on programming. It is today a recognized sub-discipline and has benefited in the last decades from many seminal concepts. An early paper of mine, On Formalism in Specifications [1], came at the beginning of this evolution; it made the case for using formal (mathematics-based) approaches. One of the reasons it attracted attention is its analysis of the “seven sins of the specifier”: a list of pitfalls into which authors of specifications and requirements commonly fall.

One of the techniques presented in the paper has not made it into the standard requirements-enginering bag of tricks. I think it deserves to be known, hence the present note. There really will not be anything here that is not in the original article; in fact I will be so lazy as to reuse its example. (Current requirements research with colleagues should lead to the publication of new examples.)

Maybe the reason the idea did not register is that I did not give it a name. So here goes: formal picnic.

The usual software engineering curriculum includes, regrettably, no room for  field trips. We are jealous of students and teachers of geology or zoology and their occasional excursions: once in a while you put on your boots, harness your backpack, and head out to quarries or grasslands to watch pebbles or critters in flagrante, after a long walk with the other boys and girls and before all having lunch together in the wild. Yes, scientific life in these disciplines really is a picnic. What I propose for the requirements process is a similar excursion; not into muddy fields, but into the dry pastures of mathematics.

The mathematical picnic process starts with a natural-language requirements document. It continues, for some part of the requirements, with a translation into a mathematical version. It terminates with a return trip into natural language.

The formal approach to requirements, based on mathematical notations (as was discussed in my paper), is still controversial; a common objection is that requirements must be understandable by ordinary project stakeholders, many of whom do not have advanced mathematical skills. I am not entering this debate here, but there can be little doubt that delicate system properties can be a useful step, if only for the requirements engineers themselves. Mathematical notation forces precision.

What, then, if we want to end up with natural language for clarity, but also to take advantage of the precision of mathematics? The formal picnic answer is that we can use mathematics as a tool to improve the requirements. The three steps are:

  • Start: a natural-language requirements document. Typically too vague and deficient in other ways (the seven sins) to serve as an adequate basis for the rest of the software process, as a good requirements document should.
  • Picnic: an excursion into mathematics. One of the main purposes of a requirements process is to raise and answer key questions about the system’s properties. Using mathematics helps raise the right questions and obtain precise answers. You do not need to apply the mathematical picnic to the entire system: even if the overall specification remains informal, some particularly delicate aspects may benefit from a more rigorous analysis.
  • Return trip: thinking of the non-formalist stakeholders back home, we translate the mathematical descriptions into a new natural-language version.

This final version is still in (say) English, but typically not the kind of English that most people naturally write. It may in fact “sound funny”. That is because it is really just mathematical formulae translated back into English. It retains the precision and objectivity of mathematics, but is expressed in terms that anyone can understand.

Let me illustrate the mathematical picnic idea with the example from my article. For reasons that do not need to be repeated here (they are in the original), it discussed a very elementary problem of text processing: splitting a text across lines. The original statement of the problem, from a paper by Peter Naur, read:

Given a text consisting of words separated by BLANKS or by NL (new line) characters, convert it to a line-by-line form in accordance with the following rules: (1) line breaks must be made only where the given text has BLANK or NL; (2) each line is filled as far as possible as long as  (3) no line will contain more than MAXPOS characters.

My article then cited an alternative specification proposed in a paper by testing experts John Goodenough and Susan Gerhart. G&G criticized Naur’s work (part of the still relevant debate between proponents of tests and proponents of proofs such as Naur). They pointed out deficiencies in his simple problem statement above; for example, it says nothing about the case of a text containing a word of more than MAXPOS characters. G&G stated that the issue was largely one of specification (requirements) and went on to propose a new problem description, four times as long as Naur’s. In my own article, I had a field day taking aim at their own endeavor. (Sometime later I met Susan Gerhart, who was incredibly gracious about my critique of her work, and became an esteemed colleague.) I am not going to cite the G&G replacement specification here; you can find it in my article.

Since that article’s topic was formal approaches, it provided a mathematical statement of Naur’s problem. It noted that  the benefit of mathematical formalization is not just to gain precision but also to identify important questions about the problem, with a view to rooting out dangerous potential bugs. Mathematics means not just formalization but proofs. If you formalize the Naur problem, you soon realize that — as originally posed — it does not always have a solution (because of over-MAXPOS words). The process forces you to specify the conditions under which solutions do exist. This is one of the software engineering benefits of a mathematical formalization effort: if such conditions are not identified at the requirements level, they will take their revenge in the program, in the form of erroneous results and crashes.

You can find the mathematical specification (only one of several possibilities) in the article.  The discussion also noted that one could start again from that spec and go back to English. That was, without the name, the mathematical picnic. The result’s length is in-between the other two versions: twice Naur’s, but half G&G’s. Here it is:

Given are a non-negative integer MAXPOS and a character set including two “break characters” blank and newline. The program shall accept as input a finite sequence of characters and produce as output a sequence of characters satisfying the following conditions:
• It only differs from the input by having a single break character wherever the input has one or more break characters;
• Any MAXPOS + 1 consecutive characters include a newline;
• The number of newline characters is minimal.
If (and only if) an input sequence contains a group of MAXPOS + 1 consecutive nonbreak characters, there exists no such output. In this case, the program shall produce the output associated with the initial part of the sequence, up to and including the MAXPOS·th character of the first such group, and report the error.

This post-picnic version is the result of a quasi-mechanical retranscription from the mathematical specification in the paper.

It uses the kind of English that one gets after a mathematical excursion. I wrote above that this style might sound funny; not to me in fact, because I am used to mathematical picnics, but probably to others (does it sound funny to you?).

The picnic technique provides a good combination of the precision of mathematics and the readability of English. English requirements as ordinarily written are subject to the seven sins described in my article, from ambiguity and contradiction to overspecification and noise. A formalization effort can correct these issues, but yields a mathematical text. Whether we like it or not, many people react negatively to such texts. We might wish they learn, but that is often not an option, and if they are important stakeholders we need their endorsement or correction of the requirements. With a mathematical picnic we translate the formal text back into something they will understand, while avoiding the worst problems of natural-language specifications.

Practicing the Formal Picnic method also has a long-term benefit for a software team. Having seen first-hand that better natural-language specifications (noise-free and more precise) are possible, team members little by little learn to apply the same style to the English texts they write, even without a mathematical detour.

If the goal is high-quality requirements, is there any alternative? What I have seen in many requirements documents is a fearful attempt to avoid ambiguity and imprecision by leaving no stone unturned: adding information and redundancy over and again. This was very much what I criticized in the G&G statement of requirements, which attempted to correct the deficiencies of the Naur text by throwing ever-more details that caused ever more risks of entanglement. It is fascinating to see how every explanation added in the hope of filling a possible gap creates more sources of potential confusion and a need for even more explanations. In industrial projects, this is the process that leads to thousands-of-pages documents, so formidable that they end up (as in the famous Ariane-5 case) on a shelf where no one will consult them when they would provide critical answers.

Mathematical specifications yield the precision and uncover the contradictions, but they also avoid noise and remain terse. Translating them back into English yields a reasonable tradeoff. Try a formal picnic one of these days.

Acknowledgments

For numerous recent discussions of these and many other related topics, I am grateful to my colleagues from the Innopolis-Toulouse requirements research group: Jean-Michel Bruel, Sophie Ebersold, Florian Galinier, Manuel Mazzara and Alexander Naumchev. I remain grateful to Axel van Lamsweerde (beyond his own seminal contributions to requirements engineering) for telling me, six years after I published a version of [1] in French, that I should take the time to produce a version in English too.

Reference

Bertrand Meyer: On Formalism in Specifications, in IEEE Software, vol. 3, no. 1, January 1985, pages 6-25. PDF available via IEEE Xplore with account, and also from here. Adapted translation of an original article in French (AFCET Software Engineering newsletter, no. 1, pages 81-122, 1979).

(This article was originally published on the Comm. ACMM blog.)

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Accurately Analyzing Agility

  
Book announcement:

Agile! The Good, the Hype and the Ugly
Bertrand Meyer
Springer, 2014 (just appeared)
Book page: here.
Amazon page: here.
Publisher’s page: here

A few years ago I became fascinated with agile methods: with the unique insights they include; with the obvious exaggerations and plainly wrong advice they also promote; and perhaps most of all with the constant intermingling of these two extremes.

I decided to play the game seriously: I read a good part of the agile literature, including all the important books; I sang the song, became a proud certified Scrum Master; I applied many agile techniques in my own work.

The book mentioned above is the result of that study and experience. It is both a tutorial and a critique.

The tutorial component was, I felt, badly needed. Most of the agile presentations I have seen are partisan texts, exhorting you to genuflect and adopt some agile method as the secret to a better life. Such preaching has a role but professionals know there is no magic in software development.  Agile! describes the key agile ideas objectively, concretely, and as clearly as I could present them. It does not introduce them in a vacuum, like the many agile books that pretend software engineering did not exist before (except for a repulsive idea, the dreaded “waterfall”). Instead, it relates them to many other concepts and results of software engineering, to which they bring their own additions and improvements.

Unfortunately, not all the additions are improvements. Up to now, the field has largely been left (with the exception of Boehm’s and Turner’s 2005 “Guide for the Perplexed“) to propaganda pieces and adoring endorsements. I felt that software developers would benefit more from a reasoned critical analysis. All the more so that agile methods are a remarkable mix of the best and the worst; the book carefully weeds out — in the terminology of the title — the ugly from the hype and the truly good.

Software developers and managers need to know about the “ugly”: awful agile advice that is guaranteed to harm your project. The “hype” covers ideas that have been widely advertised as shining agile contributions but have little relevance to the core goals of software development. The reason it was so critical to identify agile ideas belonging to these two categories is that they detract from the “good”, some of it remarkably good. I would not have devoted a good part of the last five years to studying agile methods if I did not feel they included major contributions to software engineering. I also found that some of these contributions do not get, in the agile literature itself, the explanations and exposure they deserve; I made sure they got their due in the book. An example is the “closed-window rule”, a simple but truly brilliant idea, of immediate benefit to any project.

Software methodology is a difficult topic, on which we still have a lot to learn. I expect some healthy discussions, but I hope readers will appreciate the opportunity to discuss agile ideas in depth for the greater benefit of quality software development.

I also made a point of writing a book that (unlike my last two) is short: 190 pages, including preface, index and everything else.

The table of contents follows; more details and sample chapters can be found on the book page listed above.

Preface
1 OVERVIEW
     1.1 VALUES
     1.2 PRINCIPLES
          Organizational principles
          Technical principles
     1.3 ROLES
     1.4 PRACTICES
          Organizational practices
          Technical practices
     1.5 ARTIFACTS
          Virtual artifacts
          Material artifacts
     1.6 A FIRST ASSESSMENT
          Not new and not good
          New and not good
          Not new but good
          New and good!

2 DECONSTRUCTING AGILE TEXTS
     2.1 THE PLIGHT OF THE TRAVELING SEMINARIST
          Proof by anecdote
          When writing beats speaking
          Discovering the gems
          Agile texts: reader beware!
     2.2 THE TOP SEVEN RHETORICAL TRAPS
          Proof by anecdote
          Slander by association
          Intimidation
          Catastrophism
          All-or-nothing
          Cover-your-behind
          Unverifiable claims
          Postscript: you have been ill-served by the software industry!

&3 THE ENEMY: BIG UPFRONT ANYTHING
     3.1 PREDICTIVE IS NOT WATERFALL
     3.2 REQUIREMENTS ENGINEERING
          Requirements engineering techniques
          Agile criticism of upfront requirements
          The waste criticism
          The change criticism
          The domain and the machine
     3.3 ARCHITECTURE AND DESIGN
          Is design separate from implementation?
          Agile methods and design
     3.4 LIFECYCLE MODELS
     3.5 RATIONAL UNIFIED PROCESS
     3.6 MATURITY MODELS
          CMMI in plain English
          The Personal Software Process
          CMMI/PSP and agile methods
          An agile maturity scale

4 AGILE PRINCIPLES
     4.1 WHAT IS A PRINCIPLE?
     4.2 THE OFFICIAL PRINCIPLES
     4.3 A USABLE LIST
     4.4 ORGANIZATIONAL PRINCIPLES
          Put the customer at the center
          Let the team self-organize
          Maintain a sustainable pace
          Develop minimal software
          Accept change
     4.5 TECHNICAL PRINCIPLES
          Develop iteratively
          Treat tests as a key resource
          Do not start any new development until all tests pass
          Test first
          Express requirements through scenarios

5 AGILE ROLES
     5.1 MANAGER
     5.2 PRODUCT OWNER
     5.3 TEAM
          Self-organizing
          Cross-functional
     5.4 MEMBERS AND OBSERVERS
     5.5 CUSTOMER
     5.6 COACH, SCRUM MASTER
     5.7 SEPARATING ROLES

6 AGILE PRACTICES: MANAGERIAL
     6.1 SPRINT
          Sprint basics
          The closed-window rule
          Sprint: an assessment
     6.2 DAILY MEETING
     6.3 PLANNING GAME
     6.4 PLANNING POKER
     6.5 ONSITE CUSTOMER
     6.6 OPEN SPACE
     6.7 PROCESS MINIATURE
     6.8 ITERATION PLANNING
     6.9 REVIEW MEETING
     6.10 RETROSPECTIVE
     6.11 SCRUM OF SCRUMS
     6.12 COLLECTIVE CODE OWNERSHIP
          The code ownership debate
          Collective ownership and cross-functionality

7 AGILE PRACTICES: TECHNICAL
     7.1 DAILY BUILD AND CONTINUOUS INTEGRATION
     7.2 PAIR PROGRAMMING
          Pair programming concepts
          Pair programming versus mentoring
          Mob programming
          Pair programming: an assessment
     7.3 CODING STANDARDS
     7.4 REFACTORING
          The refactoring concept
          Benefits and limits of refactoring
          Incidental and essential changes
          Combining a priori and a posteriori approaches
     7.5 TEST-FIRST AND TEST-DRIVEN DEVELOPMENT
          The TDD method of software development
          An assessment of TFD and TDD

8 AGILE ARTIFACTS
     8.1 CODE
     8.2 TESTS
     8.3 USER STORIES
     8.4 STORY POINTS
     8.5 VELOCITY
     8.6 DEFINITION OF DONE
     8.7 WORKING SPACE
     8.8 PRODUCT BACKLOG, ITERATION BACKLOG
     8.9 STORY CARD, TASK CARD
     8.10 TASK AND STORY BOARDS
     8.11 BURNDOWN AND BURNUP CHARTS
     8.12 IMPEDIMENT
     8.13 WASTE, TECHNICAL DEBT, DEPENDENCY, DEPENDENCY CHARTS

9 AGILE METHODS
     9.1 METHODS AND METHODOLOGY
          Terminology
          The fox and the hedgehog
     9.2 LEAN SOFTWARE AND KANBAN
          Lean Software’s Big Idea
          Lean Software’s principles
          Lean Software: an assessment
          Kanban
     9.3 EXTREME PROGRAMMING
          XP’s Big Idea
          XP: the unadulterated source
          Key XP techniques
          Extreme Programming: an assessment
     9.4 SCRUM
          Scrum’s Big Idea
          Key Scrum practices
          Scrum: an assessment
     9.5 CRYSTAL
          Crystal’s Big Idea
          Crystal principles
          Crystal: an assessment

10 DEALING WITH AGILE TEAMS
     10.1 GRAVITY STILL HOLDS
     10.2 THE EITHER-WHAT-OR-WHEN FALLACY

11 THE UGLY, THE HYPE AND THE GOOD: AN ASSESSMENT OF THE AGILE APPROACH
     11.1 THE BAD AND THE UGLY
          Deprecation of upfront tasks
          User stories as a basis for requirements
          Feature-based development and ignorance of dependencies
          Rejection of dependency tracking tools
          Rejection of traditional manager tasks
          Rejection of upfront generalization
          Embedded customer
          Coach as a separate role
          Test-driven development
          Deprecation of documents
     11.2 THE HYPED
     11.3 THE GOOD
     11.4 THE BRILLIANT
Bibliography
Index

 

VN:F [1.9.10_1130]
Rating: 8.6/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Multirequirements (new paper)

 

As part of a Festschrift volume for Martin Glinz of the university of Zurich I wrote a paper [1] describing a general approach to requirements that I have been practicing and developing for a while, and presented in a couple of talks. The basic idea is to rely on object-oriented techniques, including contracts for the semantics, and to weave several levels of discourse: natural-language, formal and graphical.

Reference

[1] Bertrand Meyer: Multirequirements, to appear in Martin Glinz Festschrift, eds. Anne Koziolek and Norbert Scheyff, 2013, available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)