A carefully designed Result
In the Eiffel user discussion group [1], Ian Joyner recently asked:
A lot of people are now using Result as a variable name for the return value in many languages. I believe this first came from Eiffel, but can’t find proof. Or was it adopted from an earlier language?
Proof I cannot offer, but certainly my recollection is that the mechanism was an original design and not based on any previous language. (Many of Eiffel’s mechanisms were inspired by other languages, which I have always acknowledged as precisely as I could, but this is not one of them. If any earlier language had this convention, in which case a reader will certainly tell me, I was not aware of it then and am not aware of it now.)
The competing conventions are a return instruction, as in C and languages based on it (C++, Java, C#), and Fortran’s practice, also used in Pascal, of using the function name as a variable within the function body. Neither is satisfactory. The return instruction suffers from two deficiencies:
- It is an extreme form of goto, jumping out of a function from anywhere in its control structure. The rest of the language sticks to one-entry, one-exit structures, as I think all languages should.
- In most non-trivial cases the return value is not just a simple formula but has to be computed through some algorithm, requiring the declaration of a local variable just to denote that result. In every case the programmer must invent a name for that variable and, in a typed language, include a declaration. This is tedious and suggests that the language should take care of the declaration for the programmer.
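The two deficiencies can be sketched side by side. Python is used here only as neutral notation, and the function names and the search task are hypothetical illustrations, not taken from the discussion. The first version exits from inside the loop; the second keeps a single exit, but only by making the programmer invent and initialize a result variable by hand.

```python
def first_negative_early_return(values):
    """Return the index of the first negative value, or -1 if there is none."""
    for index, value in enumerate(values):
        if value < 0:
            return index  # leaves the loop and the function in one jump
    return -1


def first_negative_single_exit(values):
    """Same behavior, written with one entry and one exit."""
    found_at = -1  # hand-made stand-in for a language-defined result variable
    index = 0
    while found_at == -1 and index < len(values):
        if values[index] < 0:
            found_at = index
        index += 1
    return found_at  # the only exit point
```

Both functions compute the same value; the difference lies in where control can leave the routine and in who has to supply the result variable.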
The Fortran-Pascal convention does not combine well with recursion (which Fortran for a long time did not support). In the body of the function, an occurrence of the function’s name can denote the result, or it can denote a recursive call; conventions can be defined to remove the ambiguity, but they are messy, especially for a function without arguments: in function f, does the instruction
f := f + 1
add one to the value of the function’s result as computed so far, as it would if f were an ordinary variable, or to the result of calling f recursively?
Another problem with the Fortran-Pascal approach is that in the absence of a language-defined rule for variable initialization a function can return an undefined result, if some path has failed to initialize the corresponding variable.
The Eiffel design addresses these problems. It combines several ideas:
- No nesting of routines. This condition is essential because without it the name Result would be ambiguous. In all Algol- and Pascal-like languages it was considered really cool to be able to declare routines within routines, without limitation on the depth of recursion. I realized that in an object-oriented language such a mechanism was useless and in fact harmful: a class should be a collection of features (services offered to the rest of the world), and it would be confusing to define features within features. Simula 67 offered such a facility; I wrote an analysis of inter-module relations in Simula, including inheritance and all the mechanisms retained from Algol such as nesting (I am trying to find that document, and if I do I will post it in this blog); my conclusion was that the result was too complicated and that the main culprit was nesting. Requiring classes to be flat structures was, in my opinion, one of the most effective design decisions for Eiffel.
- Language-defined initialization. Even a passing experience with C and C++ shows that uninitialized variables are one of the major sources of bugs. Eiffel introduced a systematic rule for all variables, including Result, and it is good to see that some subsequent languages such as Java have retained that convention. For a function result, it is common to ignore the default case and rely on the standard initialization, as in if “interesting case” then Result := “interesting value” end, with no else clause (I like this convention, but some people prefer to make all cases explicit).
- One-entry, one-exit blocks; no goto in overt or covert form (break, continue etc.).
- Design by Contract mechanisms: postconditions usually need to refer to the result computed by a function.
The convention is then simple: in any function, you can use a language-defined local variable Result, declared for you with the type that you specified for the function result; you can use it as a normal variable, and the result returned by any particular call will be the final value of the variable on exit from the function body.
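For readers who do not know Eiffel, the convention can be approximated as follows. This is only an emulation in Python under stated assumptions, not Eiffel's mechanism: the `ensure` decorator and the `positive_part` function are hypothetical names invented for the sketch, and the explicit `result = 0` line stands in for what Eiffel's language-defined initialization would do automatically.

```python
import functools


def ensure(postcondition):
    """Hypothetical helper: check postcondition(result, *args) on exit."""
    def decorate(function):
        @functools.wraps(function)
        def checked(*args):
            result = function(*args)
            if not postcondition(result, *args):
                raise AssertionError(
                    f"postcondition of {function.__name__} violated")
            return result
        return checked
    return decorate


# The postcondition refers to the function's result by name.
@ensure(lambda result, n: result >= 0 and result >= n and result in (0, n))
def positive_part(n):
    result = 0      # written by hand here; implicit in the Eiffel design
    if n > 0:       # the "interesting case", with no else branch
        result = n
    return result   # single exit: the final value of result
```

Calling `positive_part(-3)` takes the default path and `positive_part(5)` takes the interesting one; in both cases the postcondition is checked against the final value of `result`.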
The convention has been widely imitated, starting with Delphi and most recently in Microsoft’s “code contracts”, a kind of poor man’s Design by Contract emulation, achieved through libraries. Code contracts require a Result notation to denote the function result in a postcondition, although this notation is unrelated to the mechanisms of target languages such as C#. As the example of Eiffel’s design illustrates, a programming language is a delicate construction where all elements should fit together; the Result convention relies on many other essential concepts of the language, and in turn makes them possible.
Reference
[1] Eiffel Software discussion group, here.
Delphi demonstrates that an automatically-declared Result variable does in fact work with nested functions. It relies, however, on the Pascal-style scoping rules for identifier names: ‘Result’ refers to the innermost scope. Eiffel eschews these scoping rules because they could cause confusion (though in my personal experience it was never a problem in Delphi). Eiffel’s flat scoping rules wouldn’t have allowed nested routines in combination with ‘Result’.
Certainly there’s little need for nested routines in an object-oriented language. In Delphi, I only ever used them in order to avoid polluting the class’s interface with features that would be needed only by a single routine. Whenever I found myself writing such nested routines, however, it was a pretty clear indication that I ought to extract the implementation of that routine to a new class.
It’s interesting that, in the 21st century, it is now possible in fact to write nested routines in Eiffel. We have in-line agents! Some people believe that adding in-line agents to Eiffel was a mistake because their services are unavailable to the rest of the world. Personally, I like in-line agents; I just wish that there was a more concise syntax for using them. For example, as the article notes, C#’s “Code Contracts” are a clunky emulation of Design by Contract (i.e., clunky compared to Eiffel, although compared to any other DbC emulator that I’ve ever used they are excellent). I find that “Code Contracts” are better than Eiffel’s DbC in one respect, namely that I routinely find myself writing contracts that iterate over collections with ease. With Eiffel I rarely do so. This is because the syntax of C#’s lambda expressions is so convenient. Sure it looked like it came from Mars when I first saw it, but once I got the hang of it I suddenly felt free to write more complete contracts. The syntax of Eiffel’s in-line agents is too verbose. If Eiffel had an operator to simplify in-line agents, then I think a lot more interesting contracts would get written.
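To make the point about concise lambdas concrete, here is a rough analogy in Python rather than C# or Eiffel. The helper `holds_for_all` and the function `squares` are hypothetical names made up for this sketch; nothing here reproduces the actual Code Contracts API.

```python
def holds_for_all(items, predicate):
    """Hypothetical helper: true when predicate holds for every item."""
    return all(predicate(item) for item in items)


def squares(numbers):
    """Return the square of each number, checking a collection-wide contract."""
    result = [n * n for n in numbers]
    # Postcondition over the whole collection, written as a one-line lambda.
    if not holds_for_all(result, lambda square: square >= 0):
        raise AssertionError("postcondition violated: negative square")
    if len(result) != len(numbers):
        raise AssertionError("postcondition violated: length changed")
    return result
```

When the predicate costs one short line to write, a contract that quantifies over a collection is no harder to state than one about a single value.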
You’ve also touched on something that is a constant irritant to me in languages other than Eiffel:
* The ‘return’ instruction “is an extreme form of goto, jumping out of a function from anywhere in its control structure.”
* Eiffel has “One-entry, one-exit blocks; no goto in overt or covert form (break, continue etc.).”
I know this. I was taught this by reading Wirth in the mid-1980s, and with practice it became second nature to me to write structured code. But almost the only other programmers I have ever worked with who seemed to understand this were Eiffel programmers. Everyone else believes that return, break and continue are okay, because although everybody knows that Dijkstra considered ‘goto’ harmful they don’t believe he was talking about these other jump instructions. Even Martin Fowler’s otherwise excellent “Refactoring” book contains ten infuriating pages in which he sets up straw-man examples of badly written structured code, which he then proceeds to “improve” by returning from the middle of routines. (The final versions are indeed clearer, but only because he got rid of the code’s other defects on the way to injecting a bit of spaghetti.) Needless to say, it is tedious to spend one’s entire career reading code written by colleagues, many of whom are great programmers, but who don’t understand that they shouldn’t jump out of the middle of a routine or a loop. I’ve just started working on a new project where all of the code (in Java) was written by a really smart programmer, but … (groan) … I’ve already run into dozens of cases where he returns from the middle of routines. I’ve given up arguing the issue with people, because I’ve never managed to find an article online to support my conviction; and OOSC2 disappointingly glosses over the issue, apparently assuming that it was settled, once and for all, decades ago.
Is there a good explanation supporting our conviction that break and continue and return are just covert forms of goto?