The Modes and Uses of Scientific Publication

Publication is about helping the advancement of humankind. Of course.

Let us take this basis for granted and look at the other, possibly less glamorous aspects.

Publication has four modes: Publicity; Exam; Business; and Ritual.

1. Publication as Publicity

The first goal of publication is to tell the world that you have discovered something: “See how smart I am!” (and how much smarter than all the others out there!). In a world devoid of material constraints for science, or where the material constraints are handled separately, as in 19th-century German universities where professors were expected to fund their own labs, this would be the only mode and use of publication. Science today is a more complex edifice.

A good sign that Publication as Publicity is only one of the modes is that with today’s technology we could easily skip all the others. If all we cared about were to make our ideas and results known, we would simply put out our papers on ArXiv or just our own Web page. But almost no one stops there; researchers submit to conferences and journals, demonstrating how crucial the other three modes are to the modern culture of science.

2. Publication as Exam

Academic careers depend on a publication record. Actually this is not supposed to be the case; search and tenure committees are officially interested in “impact,” but any candidate is scared of showing a short publication list where competitors have tens or (commonly) hundreds of items.

We do not just publish; we want to be chosen for publication. Authors are proud of the low acceptance rates of conferences at which their papers have been accepted; in the past few years it has in fact become common practice, in publication lists attached to CVs, to list this percentage next to each accepted article. Acceptance rates are carefully tracked; see for example [2] for software engineering.

As Jeff Naughton has pointed out [1], this mode of working amounts to giving researchers the status of students forced to take exams again and again. Maybe that part is inevitable; the need to justify ourselves anew every morning may be an integral part of being a scientist, especially one funded by other people’s money. Two other consequences of this phenomenon are, I believe, more damaging.

The first risk directly affects the primary purpose of publication (remember the advancement of humankind?): a time-limited review process with low acceptance rates implies that some good papers get rejected and some flawed ones accepted. Everyone in software engineering knows (and recent PC chairs have admitted) that getting a paper accepted at the International Conference on Software Engineering is in part a lottery; with an acceptance rate hovering around 13%, this is inevitable. The mistakes occur both ways: papers accepted or even getting awards, then shown a few months later to be inaccurate; and innovative papers getting rejected because some sentence rubbed the referees the wrong way, or some paper was not cited. With a 4-month review cycle, and the next deadline coming several months later, the publication of a truly important result can be delayed significantly.

The second visible damage is publication inflation. Today’s research environment channels productive research teams towards an LPU (Least Publishable Unit) publication practice, causing an explosion of small contributions and the continuous decrease of the ratio of readers to writers. When submitting a paper I have always had, as my personal goal, to be read; but looking at the overall situation of computer science publication today suggests that this is not the dominant view: the overwhelming goal of publication is publication.

3. Publication as Business

Publishing requires an infrastructure, and money plays a role. Conferences in particular are a business. They have a budget to balance, not always an easy task, although a truly successful conference can be a big money-maker for its sponsor, commercial or non-profit. The financial side of conference publication has its consequences on authors: if you do not pay your fees, not only will you be unable to participate, but your paper will not be published.

One can deplore these practices, in particular their effect on authors from less well-endowed institutions, but they result from today’s computer science publication culture with its focus on the conference, what Lance Fortnow has called “A Journal in a Hotel”.

Sometimes the consequences border on the absurd. The ASE conference (Automated Software Engineering) accepts some contributions as “short papers”. Fair enough. At ASE 2009, “short paper” did not mean a shorter conference presentation but the permission to put up a poster and stand next to it for a while and answer passersby’s questions. For that privilege — and the real one: a publication in the conference volume — one had to register for the conference. ASE 2009 was in New Zealand, the other end of the world for a majority of authors. I ceded to the injunction: who was I to tell the PhD student whose work was the core of the submission, and who was so happy to have a paper accepted at a well-ranked conference, that he was not going to be published after all? But such practices are dubious. It would be more transparent to set up an explicit pay-for-play system, with page charges: at least the money would go to a scientific society or a university. Instead we ended up funding (in addition to the conference, which from what I heard was an excellent experience) airlines and hotels.

What makes such an example remarkable is that a reasonable justification exists for every one of its components: a highly selective refereeing process to maintain the value of the publication venue; limiting the number of papers selected for full presentation, to avoid a conference with multiple parallel tracks (and the all too frequent phenomenon of conference sessions whose audience consists of the three presenters plus the session chair); making sure that authors of published papers actually attend the event, so that it is a real conference with personal encounters, not just an opportunity to increment one’s publication count. The concrete result, however, is that authors of short papers have the impression of being ransomed without getting the opportunity to present their work in a serious way. Literally seconds before I was going to hit the “publish” button for the present article, an author of an accepted short paper for ASE 2012 (where the process appears similar) sent an email to complain, triggering a new discussion. We clearly need to find better solutions to resolve the conflicting criteria.

4. Publication as Ritual

Many of the seminal papers in science, including some of the most influential in computer science, defy classification and used a distinctive, one-of-a-kind style. Would they stand a chance in one of today’s highly ranked conferences, such as ICSE in software or VLDB in databases? It’s hard to guess. Each community has developed its own standard look-and-feel, so that after a while all papers start looking the same. They are like a classical mass with its Te Deum, Agnus Dei and Kyrie Eleison. (The “Te Deum” part is, in a conference submission, spread throughout the paper, in the form of adoring citations of the program committee members’ own divinely inspired articles, good for their H-indexes if they bless your own offering.)

All empirical software engineering papers, for example, have the obligatory “Threats to Validity” section, which has developed into a true art form. The trick is the same as in the standard interview question “What can you say about your own deficiencies?”, to which every applicant knows the key: describe a personality trait so that you superficially appear self-critical but in reality continue boasting, as in “sometimes I take my work too much to heart” [3]. The “Threats to Validity” section follows the same pattern: you try to think of all possible referee objections, the better to refute them.

Another part of the ritual is the “related work” section, treacherous because you have to make sure not to omit anything that a PC member finds important; also, you must walk a fine line between criticizing existing research too much, which could offend someone, and not enough, which enables the referee to say that you are not bringing anything significantly new. I often wonder who, besides the referees, reads those sections. But here too it is easier to lament than to fault the basic idea or propose better solutions. We do want to avoid wasting our time on papers whose authors are not aware of previous work. The related work section allows referees to perform this check. Its importance in the selection process has, however, grown out of proportion. It is one thing to make sure that a paper is state-of-the-art, but another to reject it (as often happens) because it fails to cite a particular contribution whose results would not directly affect its own. Here we move from the world of the rational to the world of the ritual.

An extreme and funny recent example (funny to me, not necessarily to the coauthors) is a rejection from APSEC 2011, the Asia-Pacific Software Engineering Conference, based on one review (the others were positive) that stated: “How novel is this? Are [there] not any cloud-based IDEs out there that have [a] similar awareness model integrated into their CM? This is something the related work [section] fails to describe precisely.” [4] The ritual here becomes bizarre: as far as we know, no existing system discusses a similar model; the reviewer too does not know of any; but he blasts the paper all the same for not citing work that he thinks must have been done by someone, somehow, somewhere. APSEC is a fine conference (it has to be, from the totally unbiased criterion that it accepted another one of our submissions this year!) and this particular paper may or may not have been ready for publication; judge it for yourself [5]. Such examples suggest, however, that the ritual of computer science publication has its limits.

Publicity, Exam, Business, Ritual: to which one of the four modes of publication are you most attuned? Oh, sorry, I forgot: in your case, it is solely for the advancement of humankind.

References and notes

[1] Jeffrey F. Naughton, DBMS Research: First 50 Years, Next 50 Years, slides of keynote at 26th IEEE International Conference on Data Engineering, 2010, available at lazowska.cs.washington.edu/naughtonicde.pdf .

[2] Tao Xie, Software Engineering Conferences, at people.engr.ncsu.edu/txie/seconferences.htm .

[3] I once saw on French TV a hilarious interview of an entrepreneur who had started a software company in Vietnam, where job candidates just did not know “the code”, and moved on, in response to such a question, to tell the interviewer about being rude to their mother and all the other horrible things they had done in their lives.

[4] The words in brackets were not in the review but I added them for clarity.

[5] Martin Nordio, H.-Christian Estler, Carlo A. Furia and Bertrand Meyer: Collaborative Software Development on the Web, available at arxiv.org/abs/1105.0768 .

(This article was first published on the CACM blog in September 2011.)

The story of our field, in a few short words

 

(With all dues to [1], but going up from four to five as it is good to be brief yet not curt.)

At the start there was Alan. He was the best of all: built the right math model (years ahead of the real thing in any shape, color or form); was able to prove that no one among us can know for sure if his or her loops — or their code as a whole — will ever stop; got to crack the Nazis’ codes; and in so doing kind of saved the world. Once the war was over he got to build his own CPUs, among the very first two or three of any sort. But after the Brits had used him, they hated him, let him down, broke him (for the sole crime that he was too gay for the time or at least for their taste), and soon he died.

There was Ed. Once upon a time he was Dutch, but one day he got on a plane and — voilà! — the next day he was a Texan. Yet he never got the twang. The first topic that had put him on  the map was the graph (how to find a path, as short as can be, from a start to a sink); he also wrote an Algol tool (the first I think to deal with all of Algol 60), and built an OS made of many a layer, which he named THE in honor of his alma mater [2]. He soon got known for his harsh views, spoke of the GOTO and its users in terms akin to libel, and wrote words, not at all kind, about BASIC and PL/I. All this he aired in the form of his famed “EWD”s, notes that he would xerox and send by post along the globe (there was no Web, no Net and no Email back then) to pals and foes alike. He could be kind, but often he stung. In work whose value will last more, he said that all we must care about is to prove our stuff right; or (to be more close to his own words) to build it so that it is sure to be right, and keep it so from start to end, the proof and the code going hand in hand. One of the keys, for him, was to use as a basis for ifs and loops the idea of a “guard”, which does imply that the very same code can in one case print a value A and in some other case print a value B, under the watch of an angel or a demon; but he said this does not have to be a cause for worry.

At about that time there was Wirth, whom some call Nick, and Hoare, whom all call Tony. (“Tony” is short for a list of no less than three long first names, which makes for a good quiz at a party of nerds — can you cite them all from rote?) Nick had a nice coda to Algol, which he named “W”; what came after Algol W was also much noted, but the onset of Unix and hence of C cast some shade over its later life. Tony too did much to help the field grow. Early on, he had shown a good way to sort an array real quick. Later he wrote that for every type of unit there must be an axiom or a rule, which gives it an exact sense and lets you know for sure what will hold after every run of your code. His fame also comes from work (based in part on Ed’s idea of the guard, noted above) on the topic of more than one run at once, a field that is very hot today as the law of Moore nears its end and every maker of chips has moved to  a mode where each wafer holds more than one — and often many — cores.

Dave (from the US, but then at work under the clime of the North) must not be left out of this list. In a paper pair, both from the same year and both much cited ever since,  he told the world that what we say about a piece of code must only be a part, often a very small part, of what we could say if we cared about every trait and every quirk. In other words, we must draw a clear line: on one side, what the rest of the code must know of that one piece; on the other, what it may avoid to know of it, and even not care about. Dave also spent much time to argue that our specs must not rely so much on logic, and more on a form of table.  In a later paper, short and sweet, he told us that it may not be so bad that you do not apply full rigor when you chart your road to code, as long as you can “fake” such rigor (his own word) after the fact.

Of UML, MDA and other such TLAs, the less be said, the more happy we all fare.

A big step came from the cold: not just one Norse but two, Ole-J (Dahl) and Kris, came up with the idea of the class; not just that, but all that makes the basis of what today we call “O-O”. For a long time few would heed their view, but then came Alan (Kay), Adele and their gang at PARC, who tied it all to the mouse and icons and menus and all the other cool stuff that makes up a good GUI. It still took a while, and a lot of hit and miss, but in the end O-O came to rule the world.

As to the math basis, it came in part from MIT — think Barb and John — and the idea, known as the ADT (not all TLAs are bad!), that a data type must be known at a high level, not from the nuts and bolts.

There also is a guy with a long first name (he hates it when they call him Bert) but a short last name. I feel a great urge to tell you all that he did, all that he does and all that he will do, but much of it uses long words that would seem hard to fit here; and he is, in any case, far too shy.

It is not all about code and we must not fail to note Barry (Boehm), Watts, Vic and all those to whom we owe that the human side (dear to Tom and Tim) also came to light. Barry has a great model that lets you find out, while it is not yet too late, how much your tasks will cost; its name fails me right now, but I think it is all in upper case.  At some point the agile guys — Kent (Beck) and so on — came in and said we had got it all wrong: we must work in pairs, set our goals to no more than a week away, stand up for a while at the start of each day (a feat known by the cool name of Scrum), and dump specs in favor of tests. Some of this, to be fair, is very much like what comes out of the less noble part of the male of the cow; but in truth not all of it is bad, and we must not yield to the urge to throw away the baby along with the water of the bath.

I could go on (and on, and on); who knows, I might even come back at some point and add to this. On the other hand I take it that by now you got the idea, and even on this last day of the week I have other work to do, so ciao.

Notes

[1] Al’s Famed Model Of the World, In Words Of Four Signs Or Fewer (not quite the exact title, but very close): find it on line here.

[2] If not quite his alma mater in the exact sense of the term, at least the place where he had a post at the time. (If we can trust this entry, his true alma mater would have been Leyde, but he did not stay long.)

Fun with Bayes

 

Try this: go to translate.google.com, choose Russian as the source and English as the target languages. In the input field, type “Андрей Иванович мне писал” or, if you do not have a Cyrillic keyboard, the transliteration into the Latin alphabet, “Andrej Ivanovich mne pisal”, which (unless you uncheck the default option) will be automatically transcribed into Cyrillic as you go. The correct English translation appears: “Andrey wrote to me”.

Correct yes, but partial: the input did not read “Andrey” but “Andrey Ivanovich”. Russians have a first name, a last name and also a “patronymic”  based on the father’s name: our Andrey’s father is or was called Ivan. Following the characters in, say, War and Peace, would be next to impossible without patronymics (it’s hard enough with them). On the other hand, English usually omits the patronymic, so if you are translating something simpler than a Tolstoy novel it is reasonable for an automated tool to yield “Andrey” as the translation of “Andrey Ivanovich”. In some cases, depending on the context, it gives “Andrei”, and in some others the anglicized “Andrew”.

Google Translate has yet another translation for “Andrey Ivanovich”. Assume that you want to be specific; maybe you know two people called Andrey and use the patronymic to distinguish between them. You want to say, for example,

       I have in mind not Andrey Nikolaevich, but Andrey Petrovich.

You can enter this as “Ja imeju v vidu ne Andreja Nikolaevicha, no Andreja Petrovicha”, or copy-paste the Cyrillic: Я имею в виду не Андрея Николаевича, но Андрея Петровича. (Note for Russian speakers: the word expected after the comma is of course а, but Google Translate, knowing that а can mean “and” in other contexts, translates it here with the opposite of the intended meaning. This is why I use но, which sounds strange but is understandable, and is correctly translated.) Try it now and see what comes out on the English side:

      I do not mean Kolmogorov, but Andrei Petrovich.

Google Translate, in other words, has another translation for “Andrey Nikolaevich”:

       Kolmogorov

The great mathematician A.N. Kolmogorov (1903-1987) indeed had this first name and this patronymic; to conclude that anyone with these names is also called Kolmogorov is not, however, a step that most of us are prepared to take, especially those of us with a (living) friend called Andrey Nikolaevich.

A favorite of Russians is “Dobroje Utro, Dmitri Anatolevich”, meaning “Good morning, Dmitri Anatolevich”, which Google translates into “Good morning, Mr. President”. I will let you figure this one out.

All the translations cited were, by the way, obtained on the date of this post; algorithms can change. For other examples, see this page, in Russian (thanks to Sergey Velder for bringing it to my attention).

What happened? Automatic translation has made great progress in recent years, largely as a result of the switch from structural, precise techniques based on linguistic theory to approximate methods based on statistics. These methods rely on an immense corpus of existing human translations, accessible on the Internet, and apply Bayesian techniques to match every text element to the most frequently encountered translation of a similar phrase in existing translations. This switch has caused a revolution in translation, making it possible to get approximate equivalents. Personally I find them most useful for a language I do not know at all: if I want to read a Web page in Korean, I can get its general idea, which I could not have done fifteen years ago without finding a native speaker. For a language that I know imperfectly, the help is less clear, because the translations are almost never entirely right; in fact they are almost always, beyond the level of simple phrases, grammatically incorrect.

With Bayesian techniques it is understandable why “Andrey Nikolaevich” sometimes comes out in English as “Kolmogorov”: he is probably the most famous of all Andrey Nikolaevichs in Google’s database of Russian-English translation pairs. If you do not know the database, the behavior is mysterious, as you cannot usually guess whether the translation in a particular context will be “Andrey”, “Andrei”, “Andrew”, “Andrey N.” or “Kolmogorov”, the five variants that I have seen (try your own experiments!). Some cases are predictable once you know that the techniques are statistical: if you include the word “Teorema” (theorem) anywhere close, you are sure to get “Kolmogorov”. But usually there is no obvious clue.
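To make the mechanism concrete, here is a deliberately tiny sketch in Python. It is not how Google Translate works internally (I do not know its database or its algorithms); it only assumes the general idea stated above, namely picking the rendering seen most often for a phrase in a corpus of existing translations. The corpus, its counts and the names `CORPUS`, `build_table` and `most_likely` are all invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (Russian phrase, English rendering) pairs.
# The pairs and their frequencies are made up; they are not real data.
CORPUS = [
    ("Андрей Николаевич", "Kolmogorov"),
    ("Андрей Николаевич", "Kolmogorov"),
    ("Андрей Николаевич", "Kolmogorov"),
    ("Андрей Николаевич", "Andrey Nikolaevich"),
    ("Андрей Петрович", "Andrei Petrovich"),
    ("Андрей Петрович", "Andrei Petrovich"),
    ("Андрей Петрович", "Andrew"),
]


def build_table(pairs):
    """Count how often each source phrase was rendered as each target."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def most_likely(table, phrase):
    """Return (rendering, estimated probability) for the most frequent
    rendering of `phrase`, or None if the phrase was never seen."""
    counts = table.get(phrase)
    if not counts:
        return None
    target, hits = counts.most_common(1)[0]
    return target, hits / sum(counts.values())


table = build_table(CORPUS)
print(most_likely(table, "Андрей Николаевич"))  # ('Kolmogorov', 0.75)
```

With these invented counts the famous bearer of the name wins simply because he dominates the sample; nothing in the lookup knows that a first name and patronymic do not determine a surname.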

Statistical techniques are great but such examples, beyond the fun, show their limits. I truly hope that in the future they can be combined with more exact techniques based on sound linguistics.

Postscript: are you bytypal?

Perhaps I should explain why I use Google Translate with Russian as the source language. I do not use it for translation, but I do need it to type texts in Russian. I could use a Cyrillic keyboard, but I don’t because I am a very fast touch typist on the English (QWERTY) keyboard. (Learning to type at a professional level early in life was one of the most useful skills I ever acquired — not as useful as grammar, set theory or axiomatic semantics, but far more useful than separation logic.) So it is convenient for me to type in Latin letters, say “Dostojeksky”, and rely on a tool that immediately transliterates into the Cyrillic equivalent, here Достоевский . Then I can copy-paste the result into, for example, an email to a Russian-speaking recipient.
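For the curious, the basic transliteration step can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions, not the algorithm of Translit or Google Translate: the mapping table `TABLE` is hypothetical and incomplete, and the function `transliterate` simply tries the longest Latin letter group first.

```python
# Hypothetical, incomplete Latin-to-Cyrillic table (illustration only).
TABLE = {
    "shch": "щ", "zh": "ж", "kh": "х", "ch": "ч", "sh": "ш", "ts": "ц",
    "ja": "я", "ju": "ю", "jo": "ё",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е", "z": "з",
    "i": "и", "j": "й", "k": "к", "l": "л", "m": "м", "n": "н", "o": "о",
    "p": "п", "r": "р", "s": "с", "t": "т", "u": "у", "f": "ф", "y": "ы",
}
MAX_KEY = max(len(key) for key in TABLE)


def transliterate(text):
    """Greedy longest-match transliteration; unknown characters pass through."""
    out = []
    i = 0
    while i < len(text):
        for size in range(MAX_KEY, 0, -1):
            chunk = text[i:i + size]
            cyrillic = TABLE.get(chunk.lower())
            if cyrillic is not None:
                # Preserve capitalization of the first Latin letter.
                out.append(cyrillic.upper() if chunk[0].isupper() else cyrillic)
                i += len(chunk)
                break
        else:
            # No table entry (space, punctuation, unmapped letter): copy as is.
            out.append(text[i])
            i += 1
    return "".join(out)


print(transliterate("Andrej Ivanovich mne pisal"))  # Андрей Иванович мне писал
```

A table-driven scheme like this one never second-guesses the typist, which is precisely the property whose absence is lamented below.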

I used to rely on a tool that does exactly this, Translit (www.translit.ru); I have of late found Google Translate generally more convenient because it does not just transliterate but relies on its database to correct some typos. I do not need the translation (except possibly to check that what I wrote makes sense), but I see it anyway; that is how I ran into the Bayesian fun described above.

As a matter of fact the transliteration tool is good but, as often with software from Google, only  “almost” right. Sometimes it simply refuses to transliterate what I wrote, because it insists on its own misguided idea of what I meant. The Auto Correct option of Microsoft Office has the good sense, when it wrongly corrects your input and  you retype it, to obey you the second time around; but Google Translate’s transliteration facility does not seem to have any such policy: it sticks to its own view, right or wrong. As a consequence it has occasionally taken me a good five minutes of fighting the tool to enter a single word. Such glitches might be removed over time, but at the moment they are sufficiently annoying that I am thinking of teaching myself to touch-type in Cyrillic.

Is this possible? Initially I learned to type on the French (AZERTY) keyboard and I had to unlearn it, since otherwise a Q would occasionally come out as an A, a Z as a W and a semicolon as an M. I know bilingual people, but none who have programmed themselves to touch-type on different keyboards. Anyone out there willing to comment on the experience of bytypalism?

Stendhal on abstraction

This week we step away from our usual sources of quotations (the Hoares and Dijkstras and Knuths) in favor of an author who might seem like an unlikely inspiration for a technology blog: Stendhal. A scientist may, like anyone else, be fascinated by Balzac, Flaubert, Tolstoy or Dostoevsky, but they live in an entirely different realm; Stendhal is the mathematician’s novelist. Not particularly through the themes of his works (as could be the case with Borges or Eco), but because of their clear structure and elegant style, impeccable in its conciseness and razor-like in its precision. Undoubtedly his writing was shaped by his initial education; he prepared for the entrance exam of the then very young École Polytechnique, although at the last moment he yielded instead to the call of the clarion.

The scientific way of thinking was not just an influence on his writing; he understood the principles of scientific reasoning and knew how to explain them. Witness the following text, which explains just about as well as anything I know the importance of abstraction. In software engineering (see for example [1]), abstraction is the key talent, a talent of a paradoxical nature: the basic ideas take a few minutes to explain, and a lifetime to master. In this effort, going back to the childhood memories of Henri Beyle (Stendhal’s real name) is not a bad start.

Stendhal’s Life of Henri Brulard is an autobiography, only thinly disguised as a novel (compare the hero’s name with the author’s). In telling the story of his morose childhood in Grenoble, the narrator grumbles about the incompetence of his first mathematics teacher, a Mr. Dupuy, who taught mathematics “as a set of recipes to make vinegar” (comme une suite de recettes pour faire du vinaigre) and tells how his father found a slightly better one, Mr. Chabert. Here is the rest of the story, already cited in [2]. The translation is mine; you can read the original below, as well as a German version. Instead of stacks and circles (or a university’s commencement day, see last week’s posting), the examples invoke eggs and cheese, but wouldn’t you agree that this paragraph is as good a definition of abstraction, directly applicable to software abstractions, and specifically to abstract data types and object abstractions (yes, it does discuss “objects”!), as any other?

So I went to see Mr. Chabert. Mr. Chabert was indeed less ignorant than Mr. Dupuy. Through him I discovered Euler and his problems on the number of eggs that a peasant woman brings to the market where a scoundrel steals a fifth of them, then she leaves behind the entire half of the remainder and so forth. This opened my mind; I glimpsed what it means to use the tool called algebra. I’ll be damned if anyone had ever explained it to me; endlessly Mr. Dupuy spun pompous sentences on the topic, but never did he say this one simple thing: it is a division of labor, and like every division of labor it creates wonders by allowing the mind to concentrate all its forces on just one side of objects, on just one of their qualities. What difference it would have made if Mr. Dupuy had told us: This cheese is soft or it is hard; it is white, it is blue; it is old, it is young; it is mine, it is yours; it is light or it is heavy. Of so many qualities, let us only consider the weight. Whatever that weight is, let us call it A. And now, no longer thinking of cheese, let us apply to A everything we know about quantities. Such a simple thing; and yet no one was explaining it to us in that far-away province [3]. Since that time, however, the influence of the École Polytechnique and Lagrange’s ideas may have trickled down to the provinces.

References

[1] Jeff Kramer: Is abstraction the key to computing?, in Communications of The ACM, vol. 50, 2007, pages 36-42.
[2] Bertrand Meyer and Claude Baudoin: Méthodes de Programmation, Eyrolles, 1978, third edition, 1982.
[3] No doubt readers from Grenoble, site of great universities and specifically one of the shrines of French computer science, will appreciate how Stendhal calls it  “that backwater” (cette province reculée).

Original French text

J’allai donc chez M. Chabert. M. Chabert était dans le fait moins ignare que M. Dupuy. Je trouvai chez lui Euler et ses problèmes sur le nombre d’œufs qu’une paysanne apportait au marché lorsqu’un méchant lui en vole un cinquième, puis elle laisse toute la moitié du reste, etc., etc. Cela m’ouvrit l’esprit, j’entrevis ce que c’était que se servir de l’instrument nommé algèbre. Du diable si personne me l’avait jamais dit ; sans cesse M. Dupuy faisait des phrases emphatiques sur ce sujet, mais jamais ce mot simple : c’est une division du travail qui produit des prodiges comme toutes les divisions du travail et permet à l’esprit de réunir toutes ses forces sur un seul côté des objets, sur une seule de leurs qualités. Quelle différence pour nous si M. Dupuy nous eût dit : Ce fromage est mou ou il est dur ; il est blanc, il est bleu ; il est vieux, il est jeune ; il est à moi, il est à toi ; il est léger ou il est lourd. De tant de qualités ne considérons absolument que le poids. Quel que soit ce poids, appelons-le A. Maintenant, sans plus penser absolument au fromage, appliquons à A tout ce que nous savons des quantités. Cette chose si simple, personne ne nous la disait dans cette province reculée ; depuis cette époque, l’École polytechnique et les idées de Lagrange auront reflué vers la province.

German translation (by Benjamin Morandi)

Deshalb ging ich zu Herrn Chabert. In der Tat war Herr Chabert weniger ignorant als Herr Dupuy. Bei ihm fand ich Euler und seine Probleme über die Zahl von Eiern, die eine Bäuerin zum Markt brachte, als ein Schurke ihr ein Fünftel stahl, sie dann die Hälfte des Restes hinterliess u.s.w. Es hat mir die Augen geöffnet. Ich sah was es bedeutet, das Algebra genannte Werkzeug zu benutzen. Unaufhörlich machte Herr Dupuy emphatische Sätze über dieses Thema, aber niemals dieses einfache Wort: Es ist eine Arbeitsteilung, die wie alle Arbeitsteilungen Wunder herstellt und dem Geist ermöglicht seine Kraft ganz auf eine einzige Seite von Objekten zu konzentrieren, auf eine Einzige ihrer Qualitäten. Welch Unterschied für uns, wenn uns Herr Dupuy gesagt hätte: Dieser Käse ist weich oder er ist hart; er ist weiss, er ist blau; er ist alt, er ist jung; er gehört dir, er gehört mir; er ist leicht oder er ist schwer. Bei so vielen Qualitäten betrachten wir unbedingt nur das Gewicht. Was dieses Gewicht auch sei, nennen wir es A. Jetzt, ohne unbedingt weiterhin an Käse denken zu wollen, wenden wir auf A alles an, was wir über Mengen wissen. Diese einfache Sache sagte uns niemand in dieser zurückgezogenen Provinz; von dieser Epoche an werden die École Polytechnique und die Ideen von Lagrange in die Provinz zurückgeflossen sein.


From duplication to duplicity: a short history of the technology of knowledge transfer

1440:

Gutenberg discovers the secret of producing, out of one text, many books for the benefit of many people.

2007:

Von und Zu Guttenberg discovers the secret of producing, out of many texts, one book for the benefit of one person.

 

 


The great programming haiku competition

In a few weeks I will again be teaching my Introductory Programming course at ETH, based for the first time on the published “Touch of Class” textbook [1]. For fun (mine if no one else’s) every lecture will conclude with a haiku summarizing the topic.

I made up a few, given below, and am opening a competition for more. Every proposal should be submitted in the form of a comment to this post. Every winner’s haiku and name will appear in the course slides, and on the special Programming Haiku page that will be added to the book’s site. There are four rules:

  • The contribution has to be a proper haiku: “three unrhymed lines of five, seven, and five syllables”.
  • It must summarize the principal concept of a chapter or main section of the textbook or, better yet, of one of the course’s lectures; see [2] for the lecture plan.
  • It must give the book reference (chapter or section) or lecture number or both.
  • The prize committee’s members are secret and its judgments final.

Here, for a start, are my own examples.

Proof of the undecidability of the halting problem

Section 7.5 of the book; lecture 5.2.

If it stops, it loops,
Yet if it looped, it would stop.
Sad contradiction.

Recursion

Chapter 14, especially section 14.3; lecture 9.1.


Often, I call you.
But when the going gets tough,
I will call myself.

Topological sort

Chapter 15; lectures 11.1 and 11.2.


Partial to total?
With the right data structures,
O of m plus n.
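
For readers wondering where the “m plus n” comes from, here is a minimal Python sketch of the classical predecessor-counting approach to topological sort. It is an illustration only, not code from the book or the lectures; the names `topological_sort`, `successors` and `predecessor_count` are assumptions made up for this example. With n elements and m ordering constraints, every element enters the queue at most once and every constraint is examined once, hence O(m + n).

```python
from collections import deque


def topological_sort(n, constraints):
    """Return a total order of elements 0..n-1 compatible with the
    partial order given as a list of (before, after) pairs."""
    # The "right data structures": successor lists and predecessor counts.
    successors = [[] for _ in range(n)]
    predecessor_count = [0] * n
    for before, after in constraints:
        successors[before].append(after)
        predecessor_count[after] += 1

    # Elements with no predecessor can be produced immediately.
    ready = deque(x for x in range(n) if predecessor_count[x] == 0)
    order = []
    while ready:
        x = ready.popleft()
        order.append(x)
        # Removing x frees each of its successors from one constraint.
        for y in successors[x]:
            predecessor_count[y] -= 1
            if predecessor_count[y] == 0:
                ready.append(y)

    if len(order) < n:
        raise ValueError("the constraints contain a cycle")
    return order
```

For example, `topological_sort(4, [(0, 1), (0, 2), (1, 3), (2, 3)])` yields an order in which 0 comes first and 3 last.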

Dynamic binding

Section 16.3; lecture 8.1.


O-O programmers:
How many to screw a bulb?
None whatsoever.

Deferred classes

Section 16.5; lecture 8.1.


Do not implement!
Though for a truly Zen spec
You need a contract.

References

[1] Touch of Class: An Introduction to Programming Well Using Objects and Contracts, Springer Verlag, 2009. See Amazon page (still wrongly says the book is not yet published).

[2] “Introduction to Programming” course at ETH Zurich, Fall 2009: course page. This does not have the slides yet, but you can see last year’s slides in last year’s page.


Long AND clear?

(Originally a Risks forum posting, 1998.)

Although complaints about Microsoft Word’s eagerness to correct what it sees as mistakes are not new in the Risks forum, I think it is still useful to protest vehemently the way recent versions of Word promote the dumbing down of English writing by flagging (at least when you use their default options) any sentence that, according to some mysterious criterion, it deems too long, even if the sentence is made of several comma- or semicolon-separated clauses, and even though it is perfectly obvious to anyone, fan of Proust or not, that clarity is not a direct function of length, since it is just as easy to write obscurely with short sentences as with longish ones and, conversely, quite possible to produce an absolutely limpid sentence that is very, very long.


Computer technology: making mozzies out of betties

Are you a Beethoven or a Mozart? If you’ll pardon the familiarity, are you more of a betty or more of a mozzy? I am a betty. I am not referring to my musical abilities but to my writing style; actually, not the style of my writings (I haven’t completed any choral fantasies yet) but the style of my writing process. Mozart is famous for impeccable manuscripts; he could be writing in a stagecoach bumping its way through the Black Forest, on the kitchen table in the miserable lodgings of his second, ill-fated Paris trip, or in the antechamber of Archbishop Colloredo: no matter, the score comes out immaculate, not reflecting any of the doubts, hesitations and remorse that torment mere mortals.

 Mozart

Beethoven’s music, note-perfect in its final form, came out of a very different process. Manuscripts show notes overwritten, lines struck out in rage, pages torn apart. He wrote and rewrote and gave up and tried again and despaired and came back until he got it the way it had to be.

Beethoven

How I sympathize! I seldom get things right the first time, and when I had to use a pen and paper I almost never could produce a clean result; there always was one last detail to change. As soon as I could, I got my hands on typewriters, which removed the effects of ugly handwriting, but did not solve the problem of second thoughts followed by third thoughts and many more. Only with computers did it become possible to work sensibly. Even with a primitive text editor, the ability to try out ideas then correct and correct and correct is a profound change of the creation process. Once you have become used to the electronic medium, using a pen and paper seems as awkward and insufferable as, for someone accustomed to driving a car, being forced to travel in an ox cart.

This liberating effect, the ability to work on your creations as a sculptor kneading an infinitely malleable material, is one of the greatest contributions of computer technology. Here we are talking about text, but the effect is just as profound on other media, as any architect or graphic artist will testify.

The electronic medium does not just give us more convenience; it changes the nature of writing (or composing, or designing). With paper, for example, there is a great practical difference between introducing new material at the end of the existing text, which is easy, and inserting it at some unforeseen position, which is cumbersome and sometimes impossible. With computerized tools, it doesn’t matter. The change of medium changes the writing process and ultimately the writing: with paper the author ends up censoring himself to avoid practically painful revisions; with software tools, you work in whatever order suits you.

Technical texts, with their numbered sections and subsections, are another illustration of the change: with a text processor you do not need to come up with the full plan first, in an effort to avoid tedious renumbering later. You will use such a top-down scheme if it fits your natural way of working, but you can use any other one you like, and renumber the existing sections at the press of a key. And just think of the pain it must have been to produce an index in the old days: add a page (or, worse, a paragraph, since it moves the following ones in different ways) and you would have to recheck every single entry.

Recent Web tools have taken this evolution one step further, by letting several people revise a text collaboratively and concurrently (and, thanks to the marvels of longest-common-subsequence algorithms and the resulting diff tools, retreat to an earlier version if in our enthusiasm to change our design we messed it up). Wikis and Google Docs are the most impressive examples of these new techniques for collective revision.
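
To make the mention of longest-common-subsequence algorithms concrete, here is a minimal Python sketch of how a diff between two versions of a text can be derived from an LCS table, and how the earlier version can be recovered from it. It is a textbook dynamic-programming illustration, not a description of how any particular wiki or diff tool is implemented; the names `lcs_table`, `diff` and `revert` are hypothetical.

```python
def lcs_table(a, b):
    """Return t where t[i][j] is the length of the longest common
    subsequence of a[i:] and b[j:]."""
    t = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                t[i][j] = t[i + 1][j + 1] + 1
            else:
                t[i][j] = max(t[i + 1][j], t[i][j + 1])
    return t


def diff(old, new):
    """Return an edit script as (tag, line) pairs:
    ' ' = kept, '-' = only in old, '+' = only in new."""
    t = lcs_table(old, new)
    i = j = 0
    script = []
    while i < len(old) and j < len(new):
        if old[i] == new[j]:
            script.append((" ", old[i]))
            i += 1
            j += 1
        elif t[i + 1][j] >= t[i][j + 1]:
            script.append(("-", old[i]))  # dropping old[i] keeps the LCS intact
            i += 1
        else:
            script.append(("+", new[j]))  # new[j] was inserted
            j += 1
    script.extend(("-", line) for line in old[i:])
    script.extend(("+", line) for line in new[j:])
    return script


def revert(script):
    """Recover the earlier version: keep everything except insertions."""
    return [line for tag, line in script if tag != "+"]
```

Given two versions as lists of lines, `revert(diff(old, new))` gives back `old`, which is the “retreat to an earlier version” in miniature.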

Whether used by a single writer or in a collaborative development, computer tools have changed the very process of creation by freeing us from the tyranny of physical media and driving to zero the logistic cost of one or a million changes of mind. For the betties among us, not blessed with an inborn ability to start at A, smoothly continue step by step, and end at Z, this is a life-changer. We can start where we like, continue where we like, and cover up our mistakes when we discover them. It does not matter how messy the process is, how many virtual pages we tore away, how much scribbling it took to bring a paragraph to a state that we like: to the rest of the world, we can present a result as pristine as the manuscript of a Mozart concerto.

These advances are not appreciated enough; more importantly, we do not take enough advantage of them. It is striking, for example, to see that blogs and other Web pages too often remain riddled with typos and easily repairable mistakes. This is undoubtedly because the power of computer technology tempts us to produce ever more documents and, in the euphoria, to neglect the old ones. But just as importantly, that technology empowers us to go back and improve. The old schoolmaster’s advice, revise and revise again [1], can no longer be dismissed as an invitation to fruitless perfectionism; it is right, it is fun to apply, and at long last it is feasible.

Reference

 

[1] “Vingt fois sur le métier remettez votre ouvrage” (Twenty times back to the loom shall you bring your design), Nicolas Boileau
