Attached by default?
Opinions requested! See at end.
A void call, during the execution of an object-oriented program, is a call of the standard OO form
x·some_routine (…) /CALL/
where x, a reference, happens to be void (null) instead of denoting, as expected, an object. The operation is not possible; it leads to an exception and, usually, a crash of the program. Void calls are also called “null pointer dereferencing”.
One of the major advances in Eiffel over the past years has been the introduction of attached types, entirely removing the risk of void calls. The language mechanisms, extending the type system, make void-call avoidance a static property, part of type checking: just as the compiler will prevent you from assigning a boolean value to an integer variable, so will it flag your program if it sees a risk of void call. Put the other way around, if your program passes compilation, you have the guarantee that its executions will never produce a void call. Attached types thus remove one of the major headaches of programming, what Tony Hoare [1] called his “one-billion-dollar mistake”:
I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W) [2]. My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
Thanks to attached types, Eiffel programmers can sleep at night: their programs will not encounter void calls.
To benefit from this advance, you must declare variables accordingly, as either attached (never void after initialization) or detachable (possibly void). You must also write the program properly:
- If you declare x attached, you must ensure in the rest of the program that before its first use x will have been attached to an object, for example through a creation instruction create x.
- If you declare x detachable, you must make sure that any call of the above form /CALL/ happens in a context where x is guaranteed to be non-void; for example, you could protect it by a test if x /= Void then… or, better, an “object test”.
Code satisfying these properties is called void-safe.
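As an illustration of the two rules above, here is a minimal sketch, assuming a compiler setting in which an unmarked declaration means attached. The class PERSON echoes the declarations discussed in this article, but its features (name, spouse, marry, spouse_name) and the root class APPLICATION are hypothetical names introduced only for the example.

```eiffel
-- Sketch only: each class would normally live in its own file.

class
	PERSON

create
	make

feature -- Initialization

	make (a_name: STRING)
			-- Set `name' to `a_name'.
		do
			name := a_name
		end

feature -- Access

	name: STRING
			-- Attached: every creation procedure must set it.

	spouse: detachable PERSON
			-- Detachable: may legitimately be void.

	spouse_name: STRING
			-- Name of `spouse', or a placeholder if there is none.
		do
			if attached spouse as s then
					-- Object test: `s' is guaranteed non-void in this branch.
				Result := s.name
			else
				Result := "(none)"
			end
		end

feature -- Element change

	marry (other: PERSON)
			-- Make `other' the spouse.
		do
			spouse := other
		end

end

class
	APPLICATION

create
	make

feature -- Initialization

	make
			-- Create two persons and query the spouse before and after `marry'.
		local
			p, q: PERSON
		do
			create p.make ("Ann")
			create q.make ("Bob")
			print (p.spouse_name + "%N")
			p.marry (q)
			print (p.spouse_name + "%N")
		end

end
```

The call s.name is accepted because the object test binds s to a value known to be attached. A direct spouse.name, without such protection, is the kind of call the void-safety rules are meant to reject.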
Void safety is the way to go: who wants to worry about programs, even after they have been thoroughly tested and have seemingly worked for a while, crashing at unpredictable times? The absence of null-pointer-dereferencing can be a statically enforced property, as the experience of Eiffel now demonstrates; and that is what it should be. One day, children will think void-safely from the most tender age, and their great-grandparents will tell them, around the fireplace during long and scary winter nights, about the old days when not everyone was programming in Eiffel and even those who did were worried about the sudden null-pointer-dereferencing syndrome. To get void safety through ordinary x: PERSON declarations, you had (children, hold your breath) to turn on a compiler option!
The transition to void safety was neither fast nor easy; in fact, it has taken almost ten years. Not everyone was convinced from the beginning, and we have had to improve and simplify the mechanism along the way to make void-safe programming practical. Compatibility has been a key issue throughout: older classes are generally not void-safe, but in a language that has been around for many years and has a large code base of operational software it is essential to ensure a smooth transition. Void safety has, from its introduction, been controlled by a compiler option:
- With the option off, old code will compile as it used to do, but you do not get any guarantee of void safety. At execution time, a void call can still cause your program to go berserk.
- With the option on, you get the guarantee: no void calls. To achieve this goal, you have to make sure the classes obey the void safety rules; if they do not, the compiler will reject them until you fix the problem.
In the effort to reconcile the compatibility imperative with the inexorable evolution to void safety, the key decisions have affected default values for compiler options and language conventions. Three separate decisions, in fact. Two of the defaults have already been switched; the question asked at the end of this article addresses the switching of the last remaining one.
The first default governed the void-safety compiler option. On its introduction, void-safety was off by default; the mechanism had to be turned on explicitly, part of the “experimental” option that most EiffelStudio releases offer for new, tentative mechanisms. That particular decision changed a year ago, with version 7.3 (May 2013): now void safety is the default. To include non-void-safe code you must mark it explicitly.
The second default affects a language convention: the meaning of a standard declaration. A typical declaration, such as
x: PERSON /A/
says that at run time x denotes a reference which, if not void, will be attached to an object of type PERSON. In pre-void-safety Eiffel, as in today’s other typed OO languages, the reference could occasionally become void at run time; in other words, x was detachable. With the introduction of void safety, you could emphasize this property by specifying it explicitly:
x: detachable PERSON /B/
You could also specify that x would never be void by declaring it attached, asking the compiler to guarantee this property for you (through its application of the void-safety rules to all operations involving x). The explicit form in this case is
x: attached PERSON /C/
In practical programming, of course, you do not want to specify attached or detachable all the time: you want to use the simple form /A/ as often as possible. Originally, since we were starting from a non-void-safe language, compatibility required /A/ to mean /B/ by default. But it turns out that “attached” really is the dominant case: most references should remain attached at all times and Void values should be reserved for important but highly specialized cases such as terminating linked data structures. So the simple form should, in the final state of the language, mean /C/. That particular default was indeed switched early (version 7.0, November 2011) for people using the void-safety compiler option. As a result, the attached keyword is no longer necessary for declarations such as the above, although it remains available. Everything is attached by default; when you want a reference that could be void (and are prepared to bear the responsibility for convincing the compiler that it won’t when you actually use it in a call), you declare it as detachable; that keyword remains necessary.
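The three forms can be put side by side in a short sketch, again assuming the void-safe setting in which /A/ means /C/. The class name DEFAULTS_DEMO and the variable names are hypothetical; STRING stands in for PERSON only to keep the example self-contained.

```eiffel
class
	DEFAULTS_DEMO

create
	make

feature -- Initialization

	make
			-- Show which assignments between the three forms are accepted.
		local
			a: STRING
				-- Form /A/: attached by default (assumed setting).
			b: detachable STRING
				-- Form /B/: may be void.
			c: attached STRING
				-- Form /C/: same meaning as `a', spelled out.
		do
			a := "hello"
			c := a
				-- Accepted: attached source, attached target.
			b := a
				-- Accepted: an attached source conforms to a detachable target.
			-- a := b
				-- Would be rejected: detachable source, attached target.
			if attached b as l_b then
				a := l_b
					-- Accepted: `l_b' is known to be attached here.
			end
			print (a + c + "%N")
		end

end
```

The commented-out assignment is the interesting one: going from detachable to attached is where you bear the responsibility of convincing the compiler, here through an object test.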
There remains one last step in the march to all-aboard-for-void-safety: removing the “detachable by default” option, that is to say, the compiler option that will make /A/ mean /B/ (rather than /C/). It is only an option, and not the default; but still it remains available. Do we truly need it? The argument for removing it is that it simplifies the specification (the fewer options the better) and encourages everyone, even more than before, to move to the new world. The argument against is to avoid disturbing existing projects, including their compiler control files (ECFs).
The question looms: when do we switch the defaults? Some of us think the time is now; specifically, the November release (14.11) [4].
Do you think the option should go? We would like your opinion. Please participate in the Eiffelroom poll [5].
References and note
[1] C.A.R. Hoare: Null References: The Billion Dollar Mistake, abstract of talk at QCon London, 9-12 March 2009, available here.
[2] (BM note) As a consolation, before Algol W, LISP already had NIL, which is the null pointer.
[3] Bertrand Meyer, Alexander Kogtenkov and Emmanuel Stapf: Avoid a Void: The Eradication of Null Dereferencing, in Reflections on the Work of C.A.R. Hoare, eds. C. B. Jones, A.W. Roscoe and K.R. Wood, Springer-Verlag, 2010, pages 189-211, available here.
[4] EiffelStudio version numbering changed in 2014: from a classic major_number.minor_number to a plain year.month, with two principal releases, 5 and 11 (May and November).
[5] Poll on switching the attachment defaults: at the bottom of the Eiffelroom page here (direct access here).
Attached by default is great, but detachable objects should have this represented in the type system using the (attached only) Option type. By itself, this is a small improvement, but higher-order functions like map, bind and filter and a monad library that provides numerous useful functions like join, liftM2, sequence and traverse make a huge difference. Once functional programming aspects are introduced in the next release, hopefully this can be created and be practical enough to improve the situation.
Interesting Eiffel feature. There is a similar language construct in Ada (since the 2005 revision), the non-null pointers (which can be used for any pointer type, including objects and functions):
type F is access function (X: Float) return Float;
Fn: not null F := Sqrt'Access;
http://www.adaic.org/resources/add_content/standards/05rat/html/Rat-3-2.html
A pointer (“access” in Ada terminology) variable or type can be declared as “not null” when it must always point to an adequate value. So the compiler is able to reject those programs when this is not the case, and to remove the null dereference check when handling non-null pointers.
I think that the main difference with respect to Eiffel attached types is that the Ada compiler does not reject a program that lacks an explicit check before converting a regular pointer into a non-null pointer. Instead, the compiler automatically inserts a not-null check, which raises a run-time exception if it fails.
There is also the GCC-specific function attribute ‘nonnull’, to specify that some of the arguments to the function should not be NULL pointers:
#include <stddef.h>  /* for size_t */

extern void *
my_memcpy (void *dest, const void *src, size_t len)
    __attribute__((nonnull (1, 2)));
In any case, this only raises a compiler warning when the compiler discovers at compilation time that one of those arguments is NULL, so the range of problems it can detect is limited.