Archive for the ‘Requirements’ Category.

Talk on requirements at UC Santa Barbara tomorrow

I am giving a “distinguished lecture” at the University of California, Santa Barbara, January 10 (Friday, tomorrow) at 14. The title is A Comprehensive Approach to Requirements Engineering.

The abstract and rest of the information are here.

I will spend the last few minutes of the talk discussing other current developments (verification, concurrency).

VN:F [1.9.10_1130]
Rating: 7.0/10 (3 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Defining and classifying requirements (new publication)

Software engineering has improved a lot in the past couple of decades, but there remains an area where the old doomsday style of starting a software engineering paper (software crisis, everything is rotten…) still fits: requirements engineering. Just see the chasm between textbook advice and the practice of most projects.

I have written on requirements in this blog, including very recently, and will continue in forthcoming installments. For today I  want to point to a recent article [1],  presented at the newly revived TOOLS conference in October. It attempts to bring some order and rigor to the basic definitions in the field.

From the abstract:

Requirements engineering is crucial to software development but lacks a precise definition of its fundamental concepts. Even the basic definitions in the literature and in industry standards are often vague and verbose.

To remedy this situation and provide a solid basis for discussions of requirements, this work provides precise definitions of the fundamental requirements concepts and two systematic classifications: a taxonomy of requirement elements (such as components, goals, constraints…) ; and a taxonomy of possible relations between these elements (such as “extends”, “excepts”, “belongs”…).

The discussion evaluates the taxonomies on published requirements documents; readers can test the concepts in two online quizzes.

The intended result of this work is to spur new advances in the study and practice of software requirements by clarifying the fundamental concepts.

This version is a first step; we are aware of its limitations and are already revising the definitions and taxonomy. The project is aimed at providing a solid foundation for a delicate area of software engineering and it will take some time to get it completely right. Still, I think the paper as it is already introduces important concepts. I will within the next two weeks write a more detailed blog article summarizing some of them.

References

[1] Bertrand Meyer, Jean-Michel Bruel, Sophie Ebersold, Florian Galinier, Alexandr Naumchev, The Anatomy of Requirements, in TOOLS 51, Software Technology: Methods and Tools
Innopolis, Russia, October 15–17, 2019, pages 10-40, available here (Springer site, paywall) and here (arXiv draft).

VN:F [1.9.10_1130]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

Are my requirements complete?

Some important concepts of software engineering, established over the years, are not widely known in the community. One use of this blog is to provide tutorials on such overlooked ideas. An earlier article covered one pertaining to project management: the Shortest Possible Schedule property . Here is another, this time in the area of requirements engineering, also based on a publication that I consider to be a classic (it is over 40 years old) but almost unknown to practitioners.

Practitioners are indeed, as in most of my articles, the intended audience. I emphasize this point right at the start because if you glance at the rest of the text you will see that it contains (horror of horrors) some mathematical formulae, and might think “this is not for me”. It is! The mathematics is very simple and my aim is practical: to shed light on an eternal question that faces anyone writing requirements (whatever the style, traditional or agile): how can I be sure that a requirements specification is complete?

To a certain extent you cannot. But there is better answer, a remarkably simple one which, while partial, helps.

Defining completeness

The better answer is called “sufficient completeness” and comes from the theory of abstract data types. It was introduced in a 1978 article by Guttag and Horning [1]. It is also implicit in a more down-to-earth document, the 1998 IEEE standard on how to write requirements [2].

There is nothing really new in the present article; in fact my book Object-Oriented Software Construction [3] contains an extensive discussion of sufficient completeness (meant to be more broadly accessible than Guttag and Horning’s scholarly article). But few people know the concepts; in particular very few practitioners have heard of sufficient completeness (if they have heard at all of abstract data types). So I hope the present introduction will be useful.

The reason the question of determining completeness of requirements seems hopeless at first is the natural reaction: complete with respect to what? To know that the specification is complete we would need a more general description of all that our stakeholders want and all the environment constraints, but this would only push the problem further: how do we know that such description itself is complete?

That objection is correct in principle: we can never be sure that we did not forget something someone wanted, or some property that the environment imposes. But there also exist more concrete and assessable notions of completeness.

The IEEE standard gives three criteria of completeness. The first states that “all requirements” have been included, and is useless, since it  runs into the logical paradox mentioned above, and is tautological anyway (the requirements are complete if they include all requirements, thank you for the information!). The second is meaningful but of limited interest (a “bureaucratic” notion of completeness): every element in the requirements document is numbered, every cross-reference is defined and so on. The last criterion is the interesting one: “Definition of the responses of the software to all realizable classes of input data in all realizable classes of situations”. Now this is meaningful. To understand this clause we need to step back to sufficient completeness and, even before that, to abstract data types.

Abstract data types will provide our little mathematical excursion (our formal picnic in the words of an earlier article) in our study of requirements and completeness. If you are not familiar with this simple mathematical theory, which every software practitioner should know, I hope you will benefit from the introduction and example. They will enable us to introduce the notion of sufficient completeness formally before we come back to its application to requirements engineering.

Specifying an abstract data type

 Abstract data types are the mathematical basis for object-oriented programming. In fact, OO programming but also OO analysis and OO design are just a realization of this mathematical concept at various levels of abstraction, even if few OO practitioners are aware of it. (Renewed reference to [3] here if you want to know more.)

An ADT (abstract data type) is a set of objects characterized not by their internal properties (what they are) but by the operations applicable to them (what they have), and the properties of these operations. If you are familiar with OO programming you will recognize that this is exactly, at the implementation level, what a class is. But here we are talking about mathematical objects and we do not need to consider implementation.

An example  of a type defined in this way, as an ADT, is a notion of POINT on a line. We do not say how this object is represented (a concept that is irrelevant at the specification level) but how it appears to the rest of the world: we can create a new point at the origin, ask for the coordinate of a point, or move the point by a certain displacement. The example is the simplest meaningful one possible, but it gives the ideas.

adt

An ADT specification has three part: Functions, Preconditions and Axioms. Let us see them (skipping Preconditions for the moment) for the definition of the POINT abstract data type.

The functions are the operations that characterize the type. There are three kinds of function, defined by where the ADT under definition, here POINT, appears:

  • Creators, where the type appears only among the results.
  • Queries, where it appears only among the arguments.
  • Commands, where it appears on both sides.

There is only one creator here:

new: → POINT

new is a function that takes no argument, and yields a point (the origin). We will write the result as just new (rather than using empty parentheses as in new ()).

Creators correspond in OO programming to constructors of a class (creation procedures in Eiffel). Like constructors, creators may have arguments: for example instead of always creating a point at the origin we could decide that new creates a point with a given coordinate, specifying it as INTEGER → POINT and using it as new (i) for some integer i (our points will have integer coordinates). Here for simplicity we choose a creator without arguments. In any case the new type, here POINT, appears only on the side of the results.

Every useful ADT specification needs at least one creator, without which we would never obtain any objects of the type (here any points) to work with.

There is also only one query:

x: POINT → INTEGER

 which gives us the position of a point, written x (p) for a point p. More generally, a query enables us to obtain properties of objects of the new type. These properties must be expressed in terms of types that we have already defined, like INTEGER here. Again there has to be at least one query, otherwise we could never obtain usable information (information expressed in terms of what we already know) about objects of the new type. In OO programming, queries correspond to fields (attributes) of a class and functions without side effects.

And we also have just one command:

move: POINT × INTEGER → POINT

a function that for any point p and integer i and yields a new point, move (p, i).  Again an ADT specification is not interesting unless it has at least one command, representing ways to modify objects. (In mathematics we do not actually modify objects, we get new objects. In imperative programming we will actually update existing objects.) In the classes of object-oriented programming, commands correspond to procedures (methods which may change objects).

You see the idea: define the notion of POINT through the applicable operations.

Listing their names and the types of their arguments types results (as in POINT × INTEGER → POINT) is not quite enough to specify these operations: we must specify their fundamental properties, without of course resorting to a programming implementation. That is the role of the second component of an ADT specification, the axioms.

For example I wrote above that new yields the origin, the point for which x = 0,  but you only had my word for it. My word is good but not good enough. An axiom will give you this property unambiguously:

x (new) = 0                                    — A0

The second axiom, which is also the last, tells us what move actually does. It applies to any point p and any integer m:

x (move (p, m)) = x (p) + m       — A1

In words: the coordinate of the point resulting from moving p by m is the coordinate of p plus m.

That’s it! (Except for the notion of precondition, which will wait a bit.) The example is trivial but this approach can be applied to any number of  data types, with any number of applicable operations and any level of complexity. That is what we do, at the design and implementation level, when writing classes in OO programming.

Is my ADT sufficiently complete?

Sufficient completeness is a property that we can assess on such specifications. An ADT specification for a type T (here POINT) is sufficiently complete if the axioms are powerful enough to yield the value of any well-formed query expression in a form not involving T. This definition contains a few new terms but the concepts are very simple; I will explain what it means through an example.

With an ADT specification we can form all kinds of expressions, representing arbitrarily complex specifications. For example:

x (move (move (move (new, 3), x (move (move (new, -2), 4))), -6))

This expression will yield an integer (since function x has INTEGER as its result type) describing the result of a computation with points. We can visualize this computation graphically; note that it involves creating two points (since there are two occurrences of new) and moving them, using in one case the current coordinate of one of them as displacement for the other. The following figure illustrates the process.

computation

The result, obtained informally by drawing this picture, is the x of P5, that is to say -1. We will derive it mathematically below.

Alternatively, if like most programmers (and many other people) you find it more intuitive to reason operationally than mathematically, you may think of the previous expression as describing the result of the following OO program (with variables of type POINT):

create p                                — In C++/Java syntax: p = new POINT();
create q
p.move (3)
q.move (-2)
q.move (4)
p.move (q.x)
p.move (-6)

Result := p.x

You can run this program in your favorite OO programming language, using a class POINT with new, x and move, and print the value of Result, which will be -1.

Here, however, we will stay at the mathematical level and simplify the expression using the axioms of the ADT, the same way we would compute any other mathematical formula, applying the rules without needing to rely on intuition or operational reasoning. Here is the expression again (let’s call it i, of type INTEGER):

ix (move (move (move (new, 3), x (move (move (new, -2), 4))), -6))

A query expression is one in which the outermost function being applied, here x, is a query function. Remember that a query function is one which the new type, here POINT, appears only on the left. This is the case with x, so the above expression i is indeed a query expression.

For sufficient completeness, query expressions are the ones of interest because their value is expressed in terms of things we already know, like INTEGERs, so they are the only way we can concretely obtain directly usable information the ADT (to de-abstract it, so to speak).

But we can only get such a value by applying the axioms. So the axioms are “sufficiently complete” if they always give us the answer: the value of any such query expression.

 Let us see if the above expression i satisfies this condition of sufficient completeness. To make it more tractable let us write  it in terms of simpler expressions (all of type POINT), as illustrated by the figure below:

p1 = move (new, 3)
p2= move (new, -2)
p3= move (p2, 4)
p4= move (p1, x (p3))
p5= move (p4, -6)
i = x (p5)

expression

(You may note that the intermediate expressions roughly correspond to the steps in the above interpretation of the computation as a program. They also appear in the illustrative figure repeated below.)

computation

Now we start applying the axioms to evaluating the expressions. Remember that we have two axioms: A0 tells us that x (new) = 0 and A1 that x (move (p, m)) = x (p) + m. Applying A1 to the definition the expression i yields

i = x (p4) – 6
= i4 – 6

if we define

i4 = x (p4)      — Of type INTEGER

We just have to compute i4. Applying A1 to the definion of p4 tells us that

i4 = x (p1) + x (p3)

To compute the two terms:

  • Applying A1 again, we see that the first term x (p1) is x (new) + 3, but then A0 tells us that x (new) is zero, so x (p1) is 3.
  • As to x (p3), it is, once more from A1, x (p2) + 4, and x (p2) is (from A1 then A0), just -2, so x (p3) is 2.

In the end, then, i4 is 5, and the value of the entire expression i = i4 – 6 is -1. Good job!

Proving sufficient completeness

The successful computation of i was just a derivation for one example, showing that in that particular case the axioms yield the answer in terms of an INTEGER. How do we go from one example to an entire specification?

The bad news first: like all interesting problems in programming, sufficient completeness of an ADT specification is theoretically undecidable. There is no general automatic procedure that will process an ADT specification and print out ““sufficiently complete” or “not sufficiently complete”.

Now that you have recovered from the shock, you can share the computer scientist’s natural reaction to such an announcement: so what. (In fact we might define the very notion of computer scientist as someone who, even before he brushes his teeth in the morning — if he brushes them at all — has already built the outline of a practical solution to an undecidable problem.) It is enough that we can find a way to determine if a given specification is sufficiently complete. Such a proof is, in fact, the computer scientist’s version of dental hygiene: no ADT is ready for prime time unless it is sufficiently complete.

The proof is usually not too hard and will follow the general style illustrated for our simple example.

We note that the definition of sufficient completeness said: “the axioms are powerful enough to yield the value of any well-formed query expression in a form not involving the type”. I have not defined “well-formed” yet. It simply means that the expressions are properly structured, with the proper syntax (basically the correct matching of parentheses) and proper number and types of arguments. For example the following are not well-formed (if p is an expression of type POINT):

move (p, 55(     — Bad use of parentheses.
move (p)            — Wrong number of arguments.
move (p, p)       — Wrong type: second argument should be an integer.

Such expressions are nonsense, so we only care about well-formed expressions. Note that in addition to new, x and move , an expression can use integer constants as in the example (although we could generalize to arbitrary integer expressions). We consider an integer constant as a query expression.

We have to prove that with the two axioms A0 and A1 we can determine the value of any query expression i. Note that since the only query functions is x, the only possible form for i, other than an integer constant, is x (p) for some expression p of type POINT.

The proof proceeds by induction on the number n of parenthesis pairs in a query expression i.

There are two base steps:

  • n = 0: in that case i can only be an integer constant. (The only expression with no parentheses built out of the ADT’s functions is new, and it is not a query expression.) So the value is known. In all other cases i will be of the form x (p) as noted.
  • n = 1: in that case p  can only be new, in other words i = x (new), since the only function that yields points, other than new, is move, and any use of it would add parentheses. In this case axiom A0 gives us the value of i: zero.

For the induction step, we consider i with n + 1 parenthesis pairs for n > 1. As noted, i is of the form x (p), so p has exactly n parenthesis pairs. p cannot be new (which would give 0 parenthesis pairs and was taken care of in the second base step), so p has to be of the form

p =  move (p’, i’)    — For expressions p’ of type POINT and i’ of type INTEGER.

implying (since i = x (p)) that by axiom A1, the value of i is

x (p’) + i’

So we will be able to determine the value of i if we can determine the value of both x (p’) and i’. Since p has n parenthesis pairs and p =  move (p’, i’), both p’ and i’ have at most n – 1 parenthesis pairs. (This use of n – 1 is legitimate because we have two base steps, enabling us to assume n > 1.) As a consequence, both x (p’) and i’ have at most n parenthesis pairs, enabling us to deduce their values, and hence the value of i, by the induction hypothesis.

Most proofs of sufficient completeness in my experience follow this style: induction on the number of parenthesis pairs (or the maximum nesting level).

Preconditions

I left until now the third component of a general ADT specification: preconditions. The need for preconditions arises because most practical specifications need some of their functions to be partial. A partial function from X to Y is a function that may not yield a value for some elements of X. For example, the inverse function on real numbers, which yields 1 / a for x, is partial  since it is not defined for a = 0 (or, on a computer, for non-zero but very small a).

Assume that in our examples we only want to accept points that lie in the interval [-4, +4]:

limited

 We can simply model this property by turning move into a partial function. It was specified above as

move: POINT × INTEGER → POINT

The ordinary arrow → introduces a total (always defined) function. For a partial function we will use a crossed arrow ⇸, specifying the function as

move: POINT × INTEGER ⇸ POINT

Other functions remain unchanged. Partial functions cause trouble: for f in X ⇸ Y we can no longer cheerfully use f (x) if f is a partial function, even for x of the appropriate type X. We have to make sure that x belongs to the domain of f, meaning the set of values for which f is defined. There is no way around it: if you want your specification to be meaningful and it uses partial functions, you must specify explicitly the domain of each of them. Here is how to do it, in the case of move:

move (p: POINT; d: INTEGER) require |x (p) + d | < 5    — where |…| is absolute value

To adapt the definition (and proofs) of sufficient completeness to the possible presence of partial functions:

  • We only need to consider (for the rule that axioms must yield the value of query expressions) well-formed expressions that satisfy the associated preconditions.
  • The definition must, however, include the property that axioms always enable us to determine whether an expression satisfies the associated preconditions (normally a straightforward part of the proof since preconditions are themselves query expressions).

Updating the preceding proof accordingly is not hard.

Back to requirements

The definition of sufficient completeness is of great help to assess the completeness of a requirements document. We must first regretfully note that for many teams today requirements stop at  “use cases” (scenarios) or  “user stories”. Of course these are not requirements; they only describe individual cases and are to requirements what tests are to programs. They can serve to check requirements, but do not suffice as requirements. I am assuming real requirements, which include descriptions of behavior (along with other elements such as environment properties and project properties). To describe behaviors, you will define operations and their effects. Now we know what the old IEEE standard is telling us by stating that complete requirements should include

definition of the responses of the software to all realizable classes of input data in all realizable classes of situations

Whether or not we have taken the trouble to specify the ADTs, they are there in the background; our system’s operations reflect the commands, and the effects we can observe reflect the queries. To make our specification complete, we should draw as much as possible of the (mental or explicit) matrix of possible effects of all commands on all queries. “As much as possible” because software engineering is engineering and we will seldom be able to reach perfection. But the degree of fullness of the matrix tells us a lot (possible software metric here?) about how close our requirements are to completeness.

I should note that there are other aspects to completeness of requirements. For example the work of Michael Jackson, Pamela Zave and Axel van Lamsweerde (more in some later article, with full references) distinguishes between business goals, environment constraints and system properties, leading to a notion of completeness as how much the system properties meet the goals and obey the constraints [4]. Sufficient completeness operates at the system level and, together with its theoretical basis, is one of those seminal concepts that every practicing software engineer or project manager should master.

References and notes

[1] John V. Guttag, Jim J. Horning: The Algebraic Specification of Abstract Data Types, in Acta Informatica, vol. 10, no. 1, pages 27-52, 1978, available here from the Springer site. This is a classic paper but I note that few people know it today; in Google Scholar I see over 700 citations but less than 100 of them in the past 8 years.

[2]  IEEE: Recommended Practice for Software Requirements Specifications, IEEE Standard 830-1998, 1998. This standard is supposed to be obsolete and replaced by newer ones, more detailed and verbose, but it remains the better reference: plain, modest and widely applied by the industry. It does need an update, but a good one.

[3] Bertrand Meyer, Object-Oriented Software Construction, 2nd edition, Prentice Hall, 1997. The discussion of sufficient completeness was in fact already there in the first edition from 1988.

[4] With thanks to Elisabetta Di Nitto from Politecnico di Milano for bringing up this notion of requirements completeness.

Recycled A version of this article was first published on the Communications of the ACM blog.

 

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Formality in requirements: new publication

The best way to make software requirements precise is to use one of the available “formal” approaches. Many have been proposed; I am not aware of a general survey published so far. Over the past two years, we have been working on a comprehensive survey of the use of formality in requirements, of which we are now releasing a draft. “We” is a joint informal research group from Innopolis University and the University of Toulouse, whose members have been cooperating on requirements issues, resulting in publications listed  under “References” below and in several scientific events.

The survey is still being revised, in particular because it is longer than the page limit of its intended venue (ACM Computing Surveys), and some sections are in need of improvement. We think, however, that the current draft can already provide a solid reference in this fundamental area of software engineering.

The paper covers a broad selection of methods, altogether 22 of them, all the way from completely informal to strictly formal. They are grouped into five categories: natural language, semi-formal, automata- or graph-based, other mathematical frameworks, programming-language based. Examples include SysML, Relax, Statecharts, VDM, Eiffel (as a requirements notation), Event-B, Alloy. For every method, the text proposes a version of a running example (the Landing Gear System, already used in some of our previous publications) expressed in the corresponding notation. It evaluates the methods using a set of carefully defined criteria.

The paper is: Jean-Michel Bruel, Sophie Ébersold, Florian Galinier, Alexandr Naumchev, Manuel Mazzara and Bertrand Meyer: Formality in Software Requirements, draft, November 2019.

The text is available here. Comments on the draft are welcome.

References

Bertrand Meyer, Jean-Michel Bruel, Sophie Ebersold, Florian Galinier and Alexandr Naumchev: Towards an Anatomy of Software Requirements, in TOOLS 2019, pages 10-40, see here (or arXiv version here). I will write a separate blog article about this publication.

Alexandr Naumchev and Bertrand Meyer: Seamless requirements. Computer Languages, Systems & Structures 49, 2017, pages 119-132, available here.

Florian Galinier, Jean-Michel Bruel, Sophie Ebersold and Bertrand Meyer: Seamless Integration of Multirequirements, in Complex Systems, 25th International Requirements Engineering Conference Workshop, IEEE, pages 21-25, 2017, available here.

Alexandr Naumchev, Manuel Mazzara, Bertrand Meyer, Jean-Michel Bruel, Florian Galinier and Sophie Ebersold: A contract-based method to specify stimulus-response requirements, Proceedings of the Institute for System Programming, vol. 29, issue 4, 2017, pp. 39-54. DOI: 10.15514, available here.

Alexandr Naumchev and Bertrand Meyer: Complete Contracts through Specification Drivers., in 10th International Symposium on Theoretical Aspects of Software Engineering (TASE), pages 160-167, 2016, available here.

Alexandr Naumchev, Bertrand Meyer and Víctor Rivera: Unifying Requirements and Code: An Example, in PSI 2015 (Ershov conference, Perspective of System Informatics), pages 233-244, available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (7 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)

Sunrise was foggy today

Once you have learned the benefits of formally expressing requirements, you keep noticing potential ambiguities and other deficiencies [1] in everyday language. Most such cases are only worth a passing smile, but here’s one that perhaps can serve to illustrate a point with business analysts in your next requirements engineering workshop or with students in your next software engineering lecture.

As a customer of the Swiss telecommunications company Sunrise I receive an occasional “news” email. (As a customer of the Swiss telecommunications company Sunrise I would actually prefer that they spend my money improving  bandwidth,  but let us not digress.) Rather than raw marketing messages these are tips for everyday life, with the presumed intent of ingratiating the populace. For example, today’s message helpfully advises me on how to move house. The admirable advice starts (my translation):

10.7% of all Swiss people relocate every year. Is that your case too for next Autumn?

Actually no, it’s not my case (neither a case of being one of the “Swiss people” nor a case of intending to relocate this Fall). And, ah, the beauty of ridiculously precise statistics! Not 10.8% or 10.6%, mind you, no, 10.7% exactly! But consider the first sentence and think of something similar appearing in a requirements document or user story. Something similar does appear in such documents, all the time, leading to confusions for the programmers interpreting them and to bugs in the resulting systems. Those restless Swiss! Did you know that they include an itchy group, exactly 922,046 people (I will not be out-significant-digited!), who relocate every year?

Do not be silly, I hear you saying. What Sunrise is sharing of its wisdom is that every year a tenth of the Swiss population moves, but not the same tenth every year. Well, OK, maybe I am being silly. But if you think of a programmer reading such a statement about some unfamiliar domain (not one about which we can rely on common sense), the risk of confusion and consequent bugs is serious.

As [1] illustrated in detail, staying within the boundaries of natural language to resolve such possible ambiguities only results in convoluted requirements that make matters worse. The only practical way out is, for delicate system properties, to use precise language, also known technically as “mathematics”.

Here for example a precise formulation of the two possible interpretations removes any doubt. Let Swiss denote the set of Swiss people and  E the number of elements (cardinal) of a finite set E, which we can apply to the example because the set of Swiss people is indeed finite. Let us define slice as the Sunrise-official number of Swiss people relocating yearly, i.e. slice = Swiss ∗ 0.107 (the actual value appeared above). Then one interpretation of the fascinating Sunrise-official fact is:

{s: Swiss | (∀y: Year | s.is_moving (y))} = slice

In words: the cardinal of the set of Swiss people who move every year (i.e., such that for every year y they move during y) is equal to the size of the asserted population subset.

The other possible interpretation, the one we suspect would be officially preferred by the Sunrise powers (any formal-methods fan from Sunrise marketing reading this, please confirm or deny!), is:

∀y: Year | {s: Swiss | s.is_moving (y)} = slice

In words: for any year y, the cardinal of the set of Swiss people who move during y is equal to the size of the asserted subset.

This example is typical of where and why we need mathematics in software requirements. No absolutist stance here, no decree  that everything become formal (mathematical). Natural language is not going into retirement any time soon. But whenever one spots a possible ambiguity or imprecision, the immediate reaction should always be to express the concepts mathematically.

To anyone who has had a successful exposure to formal methods this reaction is automatic. But I keep getting astounded not only by  the total lack of awareness of these simple ideas among the overwhelming majority of software professionals, but also by their absence from the standard curriculum of even top universities. Most students graduate in computer science without ever having heard such a discussion. Where a formal methods course does exist, it is generally as a specialized topic reserved for a small minority, disconnected (as Leslie Lamport has observed [2]) from the standard teaching of programming and software engineering.

In fact all software engineers should possess the ability to go formal when and where needed. That skill is not hard to learn and should be practiced as part of the standard curriculum. Otherwise we keep training the equivalent of electricians rather than electrical engineers, programmers keep making damaging mistakes from misunderstanding ambiguous or inconsistent requirements, and we all keep suffering from buggy programs.

 

References

[1] Self-citation appropriate here: Bertrand Meyer: On Formalism in Specifications, IEEE Software, vol. 3, no. 1, January 1985, pages 6-25, available here.

[2] Leslie Lamport: The Future of Computing: Logic or Biology, text of a talk given at Christian Albrechts University, Kiel on 11 July 2003, available here.

VN:F [1.9.10_1130]
Rating: 9.8/10 (8 votes cast)
VN:F [1.9.10_1130]
Rating: 0 (from 0 votes)

The Formal Picnic approach to requirements

picnicRequirements engineering (studying and documenting what a software system should do, independently of how it will do it) took some time to be recognized as a key part of software engineering, since the early focus was, understandably, on programming. It is today a recognized sub-discipline and has benefited in the last decades from many seminal concepts. An early paper of mine, On Formalism in Specifications [1], came at the beginning of this evolution; it made the case for using formal (mathematics-based) approaches. One of the reasons it attracted attention is its analysis of the “seven sins of the specifier”: a list of pitfalls into which authors of specifications and requirements commonly fall.

One of the techniques presented in the paper has not made it into the standard requirements-enginering bag of tricks. I think it deserves to be known, hence the present note. There really will not be anything here that is not in the original article; in fact I will be so lazy as to reuse its example. (Current requirements research with colleagues should lead to the publication of new examples.)

Maybe the reason the idea did not register is that I did not give it a name. So here goes: formal picnic.

The usual software engineering curriculum includes, regrettably, no room for  field trips. We are jealous of students and teachers of geology or zoology and their occasional excursions: once in a while you put on your boots, harness your backpack, and head out to quarries or grasslands to watch pebbles or critters in flagrante, after a long walk with the other boys and girls and before all having lunch together in the wild. Yes, scientific life in these disciplines really is a picnic. What I propose for the requirements process is a similar excursion; not into muddy fields, but into the dry pastures of mathematics.

The mathematical picnic process starts with a natural-language requirements document. It continues, for some part of the requirements, with a translation into a mathematical version. It terminates with a return trip into natural language.

The formal approach to requirements, based on mathematical notations (as was discussed in my paper), is still controversial; a common objection is that requirements must be understandable by ordinary project stakeholders, many of whom do not have advanced mathematical skills. I am not entering this debate here, but there can be little doubt that delicate system properties can be a useful step, if only for the requirements engineers themselves. Mathematical notation forces precision.

What, then, if we want to end up with natural language for clarity, but also to take advantage of the precision of mathematics? The formal picnic answer is that we can use mathematics as a tool to improve the requirements. The three steps are:

  • Start: a natural-language requirements document. Typically too vague and deficient in other ways (the seven sins) to serve as an adequate basis for the rest of the software process, as a good requirements document should.
  • Picnic: an excursion into mathematics. One of the main purposes of a requirements process is to raise and answer key questions about the system’s properties. Using mathematics helps raise the right questions and obtain precise answers. You do not need to apply the mathematical picnic to the entire system: even if the overall specification remains informal, some particularly delicate aspects may benefit from a more rigorous analysis.
  • Return trip: thinking of the non-formalist stakeholders back home, we translate the mathematical descriptions into a new natural-language version.

This final version is still in (say) English, but typically not the kind of English that most people naturally write. It may in fact “sound funny”. That is because it is really just mathematical formulae translated back into English. It retains the precision and objectivity of mathematics, but is expressed in terms that anyone can understand.

Let me illustrate the mathematical picnic idea with the example from my article. For reasons that do not need to be repeated here (they are in the original), it discussed a very elementary problem of text processing: splitting a text across lines. The original statement of the problem, from a paper by Peter Naur, read:

Given a text consisting of words separated by BLANKS or by NL (new line) characters, convert it to a line-by-line form in accordance with the following rules: (1) line breaks must be made only where the given text has BLANK or NL; (2) each line is filled as far as possible as long as  (3) no line will contain more than MAXPOS characters.

My article then cited an alternative specification proposed in a paper by testing experts John Goodenough and Susan Gerhart. G&G criticized Naur’s work (part of the still relevant debate between proponents of tests and proponents of proofs such as Naur). They pointed out deficiencies in his simple problem statement above; for example, it says nothing about the case of a text containing a word of more than MAXPOS characters. G&G stated that the issue was largely one of specification (requirements) and went on to propose a new problem description, four times as long as Naur’s. In my own article, I had a field day taking aim at their own endeavor. (Sometime later I met Susan Gerhart, who was incredibly gracious about my critique of her work, and became an esteemed colleague.) I am not going to cite the G&G replacement specification here; you can find it in my article.

Since that article’s topic was formal approaches, it provided a mathematical statement of Naur’s problem. It noted that  the benefit of mathematical formalization is not just to gain precision but also to identify important questions about the problem, with a view to rooting out dangerous potential bugs. Mathematics means not just formalization but proofs. If you formalize the Naur problem, you soon realize that — as originally posed — it does not always have a solution (because of over-MAXPOS words). The process forces you to specify the conditions under which solutions do exist. This is one of the software engineering benefits of a mathematical formalization effort: if such conditions are not identified at the requirements level, they will take their revenge in the program, in the form of erroneous results and crashes.

You can find the mathematical specification (only one of several possibilities) in the article.  The discussion also noted that one could start again from that spec and go back to English. That was, without the name, the mathematical picnic. The result’s length is in-between the other two versions: twice Naur’s, but half G&G’s. Here it is:

Given are a non-negative integer MAXPOS and a character set including two “break characters” blank and newline. The program shall accept as input a finite sequence of characters and produce as output a sequence of characters satisfying the following conditions:
• It only differs from the input by having a single break character wherever the input has one or more break characters;
• Any MAXPOS + 1 consecutive characters include a newline;
• The number of newline characters is minimal.
If (and only if) an input sequence contains a group of MAXPOS + 1 consecutive nonbreak characters, there exists no such output. In this case, the program shall produce the output associated with the initial part of the sequence, up to and including the MAXPOS·th character of the first such group, and report the error.

This post-picnic version is the result of a quasi-mechanical retranscription from the mathematical specification in the paper.

It uses the kind of English that one gets after a mathematical excursion. I wrote above that this style might sound funny; not to me in fact, because I am used to mathematical picnics, but probably to others (does it sound funny to you?).

The picnic technique provides a good combination of the precision of mathematics and the readability of English. English requirements as ordinarily written are subject to the seven sins described in my article, from ambiguity and contradiction to overspecification and noise. A formalization effort can correct these issues, but yields a mathematical text. Whether we like it or not, many people react negatively to such texts. We might wish they learn, but that is often not an option, and if they are important stakeholders we need their endorsement or correction of the requirements. With a mathematical picnic we translate the formal text back into something they will understand, while avoiding the worst problems of natural-language specifications.

Practicing the Formal Picnic method also has a long-term benefit for a software team. Having seen first-hand that better natural-language specifications (noise-free and more precise) are possible, team members little by little learn to apply the same style to the English texts they write, even without a mathematical detour.

If the goal is high-quality requirements, is there any alternative? What I have seen in many requirements documents is a fearful attempt to avoid ambiguity and imprecision by leaving no stone unturned: adding information and redundancy over and again. This was very much what I criticized in the G&G statement of requirements, which attempted to correct the deficiencies of the Naur text by throwing ever-more details that caused ever more risks of entanglement. It is fascinating to see how every explanation added in the hope of filling a possible gap creates more sources of potential confusion and a need for even more explanations. In industrial projects, this is the process that leads to thousands-of-pages documents, so formidable that they end up (as in the famous Ariane-5 case) on a shelf where no one will consult them when they would provide critical answers.

Mathematical specifications yield the precision and uncover the contradictions, but they also avoid noise and remain terse. Translating them back into English yields a reasonable tradeoff. Try a formal picnic one of these days.

Acknowledgments

For numerous recent discussions of these and many other related topics, I am grateful to my colleagues from the Innopolis-Toulouse requirements research group: Jean-Michel Bruel, Sophie Ebersold, Florian Galinier, Manuel Mazzara and Alexander Naumchev. I remain grateful to Axel van Lamsweerde (beyond his own seminal contributions to requirements engineering) for telling me, six years after I published a version of [1] in French, that I should take the time to produce a version in English too.

Reference

Bertrand Meyer: On Formalism in Specifications, in IEEE Software, vol. 3, no. 1, January 1985, pages 6-25. PDF available via IEEE Xplore with account, and also from here. Adapted translation of an original article in French (AFCET Software Engineering newsletter, no. 1, pages 81-122, 1979).

(This article was originally published on the Comm. ACMM blog.)

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 1 vote)

Accurately Analyzing Agility

  
Book announcement:

Agile! The Good, the Hype and the Ugly
Bertrand Meyer
Springer, 2014 (just appeared)
Book page: here.
Amazon page: here.
Publisher’s page: here

A few years ago I became fascinated with agile methods: with the unique insights they include; with the obvious exaggerations and plainly wrong advice they also promote; and perhaps most of all with the constant intermingling of these two extremes.

I decided to play the game seriously: I read a good part of the agile literature, including all the important books; I sang the song, became a proud certified Scrum Master; I applied many agile techniques in my own work.

The book mentioned above is the result of that study and experience. It is both a tutorial and a critique.

The tutorial component was, I felt, badly needed. Most of the agile presentations I have seen are partisan texts, exhorting you to genuflect and adopt some agile method as the secret to a better life. Such preaching has a role but professionals know there is no magic in software development.  Agile! describes the key agile ideas objectively, concretely, and as clearly as I could present them. It does not introduce them in a vacuum, like the many agile books that pretend software engineering did not exist before (except for a repulsive idea, the dreaded “waterfall”). Instead, it relates them to many other concepts and results of software engineering, to which they bring their own additions and improvements.

Unfortunately, not all the additions are improvements. Up to now, the field has largely been left (with the exception of Boehm’s and Turner’s 2005 “Guide for the Perplexed“) to propaganda pieces and adoring endorsements. I felt that software developers would benefit more from a reasoned critical analysis. All the more so that agile methods are a remarkable mix of the best and the worst; the book carefully weeds out — in the terminology of the title — the ugly from the hype and the truly good.

Software developers and managers need to know about the “ugly”: awful agile advice that is guaranteed to harm your project. The “hype” covers ideas that have been widely advertised as shining agile contributions but have little relevance to the core goals of software development. The reason it was so critical to identify agile ideas belonging to these two categories is that they detract from the “good”, some of it remarkably good. I would not have devoted a good part of the last five years to studying agile methods if I did not feel they included major contributions to software engineering. I also found that some of these contributions do not get, in the agile literature itself, the explanations and exposure they deserve; I made sure they got their due in the book. An example is the “closed-window rule”, a simple but truly brilliant idea, of immediate benefit to any project.

Software methodology is a difficult topic, on which we still have a lot to learn. I expect some healthy discussions, but I hope readers will appreciate the opportunity to discuss agile ideas in depth for the greater benefit of quality software development.

I also made a point of writing a book that (unlike my last two) is short: 190 pages, including preface, index and everything else.

The table of contents follows; more details and sample chapters can be found on the book page listed above.

Preface
1 OVERVIEW
     1.1 VALUES
     1.2 PRINCIPLES
          Organizational principles
          Technical principles
     1.3 ROLES
     1.4 PRACTICES
          Organizational practices
          Technical practices
     1.5 ARTIFACTS
          Virtual artifacts
          Material artifacts
     1.6 A FIRST ASSESSMENT
          Not new and not good
          New and not good
          Not new but good
          New and good!

2 DECONSTRUCTING AGILE TEXTS
     2.1 THE PLIGHT OF THE TRAVELING SEMINARIST
          Proof by anecdote
          When writing beats speaking
          Discovering the gems
          Agile texts: reader beware!
     2.2 THE TOP SEVEN RHETORICAL TRAPS
          Proof by anecdote
          Slander by association
          Intimidation
          Catastrophism
          All-or-nothing
          Cover-your-behind
          Unverifiable claims
          Postscript: you have been ill-served by the software industry!

&3 THE ENEMY: BIG UPFRONT ANYTHING
     3.1 PREDICTIVE IS NOT WATERFALL
     3.2 REQUIREMENTS ENGINEERING
          Requirements engineering techniques
          Agile criticism of upfront requirements
          The waste criticism
          The change criticism
          The domain and the machine
     3.3 ARCHITECTURE AND DESIGN
          Is design separate from implementation?
          Agile methods and design
     3.4 LIFECYCLE MODELS
     3.5 RATIONAL UNIFIED PROCESS
     3.6 MATURITY MODELS
          CMMI in plain English
          The Personal Software Process
          CMMI/PSP and agile methods
          An agile maturity scale

4 AGILE PRINCIPLES
     4.1 WHAT IS A PRINCIPLE?
     4.2 THE OFFICIAL PRINCIPLES
     4.3 A USABLE LIST
     4.4 ORGANIZATIONAL PRINCIPLES
          Put the customer at the center
          Let the team self-organize
          Maintain a sustainable pace
          Develop minimal software
          Accept change
     4.5 TECHNICAL PRINCIPLES
          Develop iteratively
          Treat tests as a key resource
          Do not start any new development until all tests pass
          Test first
          Express requirements through scenarios

5 AGILE ROLES
     5.1 MANAGER
     5.2 PRODUCT OWNER
     5.3 TEAM
          Self-organizing
          Cross-functional
     5.4 MEMBERS AND OBSERVERS
     5.5 CUSTOMER
     5.6 COACH, SCRUM MASTER
     5.7 SEPARATING ROLES

6 AGILE PRACTICES: MANAGERIAL
     6.1 SPRINT
          Sprint basics
          The closed-window rule
          Sprint: an assessment
     6.2 DAILY MEETING
     6.3 PLANNING GAME
     6.4 PLANNING POKER
     6.5 ONSITE CUSTOMER
     6.6 OPEN SPACE
     6.7 PROCESS MINIATURE
     6.8 ITERATION PLANNING
     6.9 REVIEW MEETING
     6.10 RETROSPECTIVE
     6.11 SCRUM OF SCRUMS
     6.12 COLLECTIVE CODE OWNERSHIP
          The code ownership debate
          Collective ownership and cross-functionality

7 AGILE PRACTICES: TECHNICAL
     7.1 DAILY BUILD AND CONTINUOUS INTEGRATION
     7.2 PAIR PROGRAMMING
          Pair programming concepts
          Pair programming versus mentoring
          Mob programming
          Pair programming: an assessment
     7.3 CODING STANDARDS
     7.4 REFACTORING
          The refactoring concept
          Benefits and limits of refactoring
          Incidental and essential changes
          Combining a priori and a posteriori approaches
     7.5 TEST-FIRST AND TEST-DRIVEN DEVELOPMENT
          The TDD method of software development
          An assessment of TFD and TDD

8 AGILE ARTIFACTS
     8.1 CODE
     8.2 TESTS
     8.3 USER STORIES
     8.4 STORY POINTS
     8.5 VELOCITY
     8.6 DEFINITION OF DONE
     8.7 WORKING SPACE
     8.8 PRODUCT BACKLOG, ITERATION BACKLOG
     8.9 STORY CARD, TASK CARD
     8.10 TASK AND STORY BOARDS
     8.11 BURNDOWN AND BURNUP CHARTS
     8.12 IMPEDIMENT
     8.13 WASTE, TECHNICAL DEBT, DEPENDENCY, DEPENDENCY CHARTS

9 AGILE METHODS
     9.1 METHODS AND METHODOLOGY
          Terminology
          The fox and the hedgehog
     9.2 LEAN SOFTWARE AND KANBAN
          Lean Software’s Big Idea
          Lean Software’s principles
          Lean Software: an assessment
          Kanban
     9.3 EXTREME PROGRAMMING
          XP’s Big Idea
          XP: the unadulterated source
          Key XP techniques
          Extreme Programming: an assessment
     9.4 SCRUM
          Scrum’s Big Idea
          Key Scrum practices
          Scrum: an assessment
     9.5 CRYSTAL
          Crystal’s Big Idea
          Crystal principles
          Crystal: an assessment

10 DEALING WITH AGILE TEAMS
     10.1 GRAVITY STILL HOLDS
     10.2 THE EITHER-WHAT-OR-WHEN FALLACY

11 THE UGLY, THE HYPE AND THE GOOD: AN ASSESSMENT OF THE AGILE APPROACH
     11.1 THE BAD AND THE UGLY
          Deprecation of upfront tasks
          User stories as a basis for requirements
          Feature-based development and ignorance of dependencies
          Rejection of dependency tracking tools
          Rejection of traditional manager tasks
          Rejection of upfront generalization
          Embedded customer
          Coach as a separate role
          Test-driven development
          Deprecation of documents
     11.2 THE HYPED
     11.3 THE GOOD
     11.4 THE BRILLIANT
Bibliography
Index

 

VN:F [1.9.10_1130]
Rating: 8.6/10 (9 votes cast)
VN:F [1.9.10_1130]
Rating: +1 (from 3 votes)

Multirequirements (new paper)

 

As part of a Festschrift volume for Martin Glinz of the university of Zurich I wrote a paper [1] describing a general approach to requirements that I have been practicing and developing for a while, and presented in a couple of talks. The basic idea is to rely on object-oriented techniques, including contracts for the semantics, and to weave several levels of discourse: natural-language, formal and graphical.

Reference

[1] Bertrand Meyer: Multirequirements, to appear in Martin Glinz Festschrift, eds. Anne Koziolek and Norbert Scheyff, 2013, available here.

VN:F [1.9.10_1130]
Rating: 10.0/10 (4 votes cast)
VN:F [1.9.10_1130]
Rating: +4 (from 4 votes)