The waves of publication

(This article first appeared in the Communications of the ACM blog.)

The very concept of publication has changed, half of its traditional meaning having disappeared in hardly more than a decade. Or, to put it differently (if you will accept the metaphor, explained below), it has lost its duality: no longer particle, just wave.

Process and product

Some words ending with ation (atio in Latin) describe a change of state: restoration, dilatation. Others describe the state itself, or one of its artifacts: domination, fascination. And yet others play both roles: decoration can denote either the process of embellishing (she works in interior decoration), or an element of the resulting embellishment (Christmas tree decoration).

Since at least Gutenberg, publication has belonged to that last category: both process and artifact. A publication is an artifact, such as an article or a book accessible to a community of readers. We are referring to that view when we say “she has a long publication list” or “Communications of the ACM is a prestigious publication”. But the word also denotes a process, built from the verb “publish” the same way “restoration” is built from “restore” and “insemination” from “inseminate”: the publication of her latest book took six months.

The thesis of this article is that the second view of publication will soon be gone, and its purpose is to discuss the consequences for scientists.

Let me restrict the scope: I am only discussing scientific publication, and more specifically the scientific article. The situation for books is less clear; for all the attraction of the Kindle and other tablets, the traditional paper book still has many advantages and it would be risky to talk about its demise. For the standard scholarly article, however, electronic media and the web are quickly destroying the traditional setup.

That was then . . .

Let us step back a bit to what publication, the process, was a couple of decades ago. When you wrote something, you could send it by post to your friends (Edsger Dijkstra famously turned this idea into his modus operandi, regularly xeroxing his “EWD” memos [1] to a few dozen people), but if you wanted to make it known to the world you had to go through the intermediation of a PUBLISHER (the mere word was enough to overwhelm you with awe). That publisher, either a non-profit organization or a commercial house, was in charge not only of selecting papers for a conference or journal but of bringing the accepted ones to light. Once you got the paper accepted, there began a long and tedious process of preparing the text to the publisher’s specifications and correcting successive versions of “galley proofs.” That step could be painful for papers having to do with programming, since in the early days typesetters had no idea how to lay out code. A few months or a couple of years later, you received a package in the mail and proudly opened the journal or proceedings at the page where YOUR article appeared. You would also, usually for a fee, receive fifty or so separately printed (tirés à part) reprints of just your article, typeset the same way but more modestly bound. Ah, the discreet charm of 20th-century publication!

. . . and this is now

Cut to today. Publishers stopped doing the typesetting for you long ago. They impose the format, obligingly give you LaTeX, Word or FrameMaker templates, and you take care of everything. We have moved to WYSIWYG publishing: the version you write is the version you submit through a site such as EasyChair or CyberChair and the version that, after correction, will be published. The middlemen have been cut out.

We moved to this system because technology made it possible, and also because of the irresistible lure, for publishers, of saving money (even if, in the long term, they may have removed some of the very reasons for their existence). The consequences of this change go, however, far beyond money.

Integrating change

To understand how fundamentally the stage has changed, let us go back for a moment to the old system. It has many advantages, but also limitations. Some are obvious, such as the amount of work required, involving several people, and the delay from paper completion to paper publication. But in my view the most significant drawback has to do with managing change. If after publication you find a mistake, you must convince the journal to include an erratum: a new mini-article, subject to the same process. That requirement is reasonable enough but the scheme does not support a significant mode of scientific writing: working repeatedly on a single article and progressively refining it. This is not the “LPU” (Least Publishable Unit) style of publishing, but a process of studying an important idea or research project and aiming towards the ideal paper about it by successive approximation. If six months after the original publication of an article you have learned more about the topic and how to present it, the publication strategy is not obvious: resubmit it and risk being accused of self-plagiarism; avoid repetition of basic elements, making the article harder to read independently; artificially increase differences. This conundrum is one of the legitimate sources of the LPU phenomenon: faced with the choice between freezing material and repeating it, people end up publishing it bit by bit.

Now back again to today. If you are a researcher, you want the world to know about your ideas as soon as they are in a clean form. Today you can do this easily: no need to photocopy page after page and lick postage stamps on envelopes the way Dijkstra did; just generate a PDF and put it on your Web page or (to help establish a record if a question of precedence later comes up) on ArXiv. Just to make sure no one misses the information, tweet about it and announce it on your Facebook and LinkedIn pages. Some authors do this once the paper has been accepted, but many start earlier, at the time of submission or even before. I should say here that not all disciplines allow such author behavior; in biology and medicine in particular publishers appear to limit authors’ rights to distribute their own texts. Computer scientists would not tolerate such restrictions, and publishers, whether nonprofit or commercial, largely leave us alone when we make our work available on the Web.

But we are talking about far more than copyright and permissions (in this article I am in any case staying out of these emotionally and politically charged issues, open access and the like, and concentrating on the effect of technology changes on the process of publishing and the publication culture). The very notion of publication has changed. The process part is gone; only the result remains, and that result can be an evolving product, not a frozen artifact.

Particle, or wave?

Another way to describe the difference is that a traditional publication, for example an article published in a journal, is like a particle: an identifiable material object. With the ease of modification, a publication becomes more like a wave, which allows an initial presentation to propagate to successively wider groups of readers:

[Figure: Waves of publication]

Maybe you start with a blog entry, then you register the first version of the work as a technical report in your institution or on ArXiv, then you submit it to a workshop, then to a conference, then a version of record in a journal.

In the traditional world of publication each of these would have to be made sufficiently different to avoid the accusation of plagiarism. (There is some tolerance, for example a technical report is usually not considered prior publication, and it is common to submit an extended version of a conference paper to a journal — but the journal will require that you include enough new material, typically “at least 30%.”)

For people who like to polish their work repeatedly, that traditional model is increasingly hard to accept. If you find an error, or a better way to express something, or a complementary result, you just itch to make the change here and now. And you can. Not on a publisher’s site, but on your own, or on ArXiv. After all, one of the epochal contributions of computer technology, not heralded loudly enough, is, as I argued in another blog article [2], the ease with which we can change, extend and refine our creations, developing like a Beethoven and releasing like a Mozart.

The “publication as product” becomes an evolving product, available at every step as a snapshot of the current state. This does not mean that you can cover up your mistakes with impunity: archival sites use “diff” techniques to maintain a dated record of successive versions, so that in case of doubt, or of a dispute over precedence, one can assess beyond doubt who released what statement when. But you can make sure that at any time the current version is the one you like best. Often, it is better than the official version on the conference or journal site, which remains frozen forever in the form it had on the day of its release.
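To make the idea of a dated record concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of how ArXiv or any other archive actually works: the names `Version`, `VersionLog`, `release`, `content_at` and `diff` are hypothetical, and the sample theorem text and dates are invented for the example. The sketch keeps every released snapshot with its date and a fingerprint, so that one can ask what the paper said at a given time and how two versions differ.

```python
import difflib
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Version:
    released_at: datetime  # when this snapshot was released
    digest: str            # SHA-256 of the full text, usable as a fingerprint
    text: str              # the snapshot itself


@dataclass
class VersionLog:
    """Hypothetical append-only record of the successive versions of one paper."""
    versions: list = field(default_factory=list)

    def release(self, text, when=None):
        """Record a new snapshot; earlier snapshots are never altered."""
        # Assumption: all dates are timezone-aware, so they can be compared.
        when = when or datetime.now(timezone.utc)
        if self.versions and when < self.versions[-1].released_at:
            raise ValueError("releases must be recorded in chronological order")
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        version = Version(when, digest, text)
        self.versions.append(version)
        return version

    def content_at(self, when):
        """Return the version that was current at time `when`, or None."""
        current = None
        for version in self.versions:
            if version.released_at <= when:
                current = version
        return current

    def diff(self, old_index, new_index):
        """Line-by-line differences between two recorded versions."""
        old = self.versions[old_index].text.splitlines()
        new = self.versions[new_index].text.splitlines()
        return list(difflib.unified_diff(old, new, "old", "new", lineterm=""))


# Invented example: a claim released in January and corrected in March.
log = VersionLog()
log.release("Theorem 1 holds for all n.",
            datetime(2013, 1, 10, tzinfo=timezone.utc))
log.release("Theorem 1 holds for all n > 0.",
            datetime(2013, 3, 2, tzinfo=timezone.utc))

# What did the paper say on 1 February? The January version was still current.
february = log.content_at(datetime(2013, 2, 1, tzinfo=timezone.utc))
print(february.text)
print("\n".join(log.diff(0, 1)))
```

The author remains free to release a better version at any time, but the earlier snapshots stay in the log with their dates, which is what settles a question of precedence.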

What then remains of “publication as process”? Not much; in the end, a mere drag-and-drop from the work folder to the publication folder.

Well, there is an aspect I have not mentioned yet.

The sanction

Apart from its material side, now gone or soon to be gone, the traditional publication process has another role: what a recent article in this blog [3] called sanction. You want to publish your latest scientific article in Communications of the ACM not just because it will end up being printed and mailed, but because acceptance is a mark of recognition by experts. There is a whole gradation of prestige, well known to researchers in every particular field: conferences are better than workshops, some journals are as good as conferences or higher, some conferences are far more prestigious than others, and so on.

That sanction, that need for an independent stamp of approval, will remain (and, for academics, young academics in particular, is of ever growing importance). But now it can be completely separated from the publication process and largely separated (in computer science, where conferences are so important today) from the conference process.

Here then is what I think scientific publication will become. The researcher (the author) will largely be in control of his or her own text as it goes through the successive waves described above. A certified record will be available to verify that at time t the document d had the content c. Then at specific stages the author will submit the paper. Submit in the sense of appraisal and, if the appraisal is successful, certification. The submission may be to a conference: you submit your paper for presentation at this year’s ICSE, POPL or SIGGRAPH. (At the recent Dagstuhl publication culture workshop, Nicolas Holzschuch mentioned that some graphics conferences accept for presentation work that has already been published; isn’t this scheme more reasonable than the currently dominant practice of conference-as-publication?) You may also submit your work, once it reaches full maturity, to a journal. Acceptance does not have to mean that any trees get cut, that any ink gets spread, or even that any bits get moved: it simply enables the journal’s site to point to the article, and your site to add this mark of recognition.

There may also be other forms of recognition, social-network or Trip Advisor style: the community gets to pitch in, comment and assess. Don’t laugh too soon. Sure, scientific publication has higher standards than Wikipedia, and will not let the wisdom of the crowds replace the judgment of experts. But sometimes you want to publish for communication, not sanction; especially if you have the privilege of no longer being trapped in the publish-or-perish race you may simply want to make your research known, and you have little patience for navigating the meanders of conventional publication, genuflecting to the publications of PC members, and following the idiosyncratic conventional structure of the chosen conference community. Then you just publish and let the world decide.

In most cases, of course, we do need the sanction, but there is no absolute reason it should be tied to the traditional structures of journal publication and conference participation. There will be resistance, if only because of the economic interests involved; some of what we know today will remain, albeit with a different focus: conferences, as a place where the best work of the moment is presented (independently of its publication); printed books, as noted; and printed journals that bring real added value in the form of high-quality printing, layout and copy editing (and might still insist that you put on their site a copy of your paper rather than, or in addition to, a reference to your working version).

The trend, however, is irresistible. Publication is no longer a process, it is a product, increasingly under the control of the authors. As a product it is no longer a defined particle but a wave, progressively improving as it reaches successive classes of readership, undergoes successive steps of refinement and receives, informally from the community and officially from more or less prestigious sources, successive stamps of approval.

References

[1] Dijkstra archive at the University of Texas at Austin, here.

[2] Bertrand Meyer: Computer Technology: Making Mozzies out of Betties, article on this blog, 2 August 2009, available here.

[3] Bertrand Meyer: Conferences: Publication, Communication, Sanction, article on this blog, 6 February 2013, available here.

Conferences: Publication, Communication, Sanction

(This article was first published in the Communications of the ACM blog.)

A healthy discussion is taking place in the computer science community on our publication culture. It was spurred by Lance Fortnow’s 2009 article [1]; now Moshe Vardi has taken the lead to prepare a report on the topic, following a workshop in Dagstuhl in November [2]. The present article and one that follows (“The Waves of Publication”) are intended as contributions to the debate.

One of the central issues is what to do with conferences. Fortnow had strong words for the computer science practice of using conferences as the selective publication venues, instead of relying on journals as traditional scientific disciplines do. The criticism is correct, but if we look at the problem from a practical perspective it is unlikely that top conferences will lose their role as certifiers of quality. This is not a scientific matter but one of power. People in charge of POPL or OOPSLA have decisive sway over the careers (one is tempted to say the lives) of academics, particularly young academics, and it is a rare situation in human affairs that people who have critical power voluntarily renounce it. Maybe the POPL committee will see the light: maybe starting in 2014 it will accept all reasonable papers somehow related to “principles of programming languages”, turn the event itself into a pleasant multi-track community affair where everyone in the field can network, and hand over the selection and stamp-of-approval job to a journal such as TOPLAS. Dream on; it is not going to happen.

We should not, however, remain stuck with the status quo and all its drawbacks. That situation is unsustainable. As a single illustration, consider the requirement, imposed by all conferences, that having a paper pass the refereeing process is not enough: you must also register. A couple of months before the conference, authors of accepted papers (at least, they thought their paper was accepted) receive a threatening email telling them that unless they register and pay, their paper will not be published after all. Now assume an author, in a field where a conference is the top token of recognition, has his visa application rejected by the country of the conference (a not so uncommon situation) and does not register. (Maybe he does not mind paying the fee, but he does not want to lie by pretending he is going to attend when he knows he will not.) He has lost his opportunity for publication and perhaps severely harmed his career. What have such requirements to do with science?

To understand what can be done, we need to analyze the role of conferences. In an earlier article [3] I described four “modes and uses” of publication: Publication, Exam, Business and Ritual. From the organizers’ viewpoint, ignoring the Business and Ritual aspects although they do play a significant role, a conference has three roles: Publication, Communication and Sanction. The publication part corresponds to the proceedings of the conference, which makes articles available to the community at large, not just the conference attendees. The communication part only addresses the attendees: it includes the presentation of papers as well as all other interactions made possible by being present at a conference. The sanction part (corresponding to the “exam” part of the more general classification) is the role of a renowned conference as a stamp of approval for the best work of the moment.

What we should do is separate these roles. A conference can play all three roles, but it can also select two of them, or even just one. A well-established, prestigious conference will want to retain its sanctioning role: accepted papers get the stamp of approval. It will also remain an event, where people meet. And it may distribute proceedings. But the three roles can also be untied:

  • Publication is the least critical, and can easily be removed from the other two, since everything will be available on the Web. In fact the very notion of proceedings is quickly becoming fuzzy: more and more conferences save money by not distributing printed proceedings to attendees, sometimes not printing any proceedings at all; and some even spare themselves the production of a proceedings-on-a-stick, putting the material on the Web instead. A conference may still decide to have its own proceedings, or it might outsource that part to a journal. Each conference will make these decisions based on its own culture, tradition, ambition and constraints. For authors, the decision does not particularly matter: what counts are the sanction, which is provided by the refereeing process, and the availability of their material to the world, which will be provided in any scenario (at least in computer science where we have, thankfully, the permission to put our papers on our own web sites, an acquired right that our colleagues from other disciplines do not all enjoy).
  • Separating sanction from communication is a natural step. Acceptance and participation are two different things.

Conference organizers should not be concerned about lost revenue: most authors will still want to participate in the conference, and will get the funding since institutions are used to paying for travel to present accepted papers; some new participants might come, attracted by more interaction-oriented conference styles; and organizers can replace the requirement to register with a choice between registering and paying a publication fee.

Separating the three roles does not mean that any established conference renounces its sanctioning status, acquired through the hard work of building the conference’s reputation, often over decades. But everyone gets more flexibility. Several combinations are possible, such as:

  • Sanction without communication or publication: papers are submitted for certification through peer-review, they are available on the Web anyway, and there is no need for a conference.
  • Publication without sanction or communication: an author puts a paper on his web page or on a self-publication site such as ArXiv.
  • Sanction and communication without publication: a traditional selective conference, which does not bother to produce proceedings.
  • Communication without sanction: a working conference whose sole aim is to advance the field through presentations and discussions, and accepts any reasonable submission. It may be by invitation (a kind of advance sanction). It may have proceedings (publication) or not.

Once we understand that the three roles are not inextricably tied, the stage is clear for the removal of some impediments to a more effective publication culture. Some, not all. The more general problem is the rapidly changing nature of scientific publication, what may be called the concentric waves of publication. That will be the topic of the next article.

References

[1] Lance Fortnow: Time for Computer Science to Grow Up, in Communications of the ACM, Vol. 52, no. 8, pages 33-35, 2009, available here.

[2] Dagstuhl: Perspectives Workshop: Publication Culture in Computing Research, see here.

[3] Bertrand Meyer: The Modes and Uses of Scientific Publication, article on this blog, 22 November 2011, see here.

Your IP: does Google care?

 

A search for my name on Google Scholar [1] shows, at the top of the resulting list, my book Object-Oriented Software Construction [2], with over 7800 citations in the scientific literature. Very nice (thanks, and keep those citations coming!).

That top result is a link to a pirated version [3] of the full content — 1350 pages or so — at an organization in Indonesia, “Institut Teknologi Telkom”, whose logo bears the slogan “Center of Excellence in ICT”. The text has been made available, along with the entire contents of several other software engineering textbooks, in a directory helpfully called “ebooks”, apparently by a user with the initials “kms”. I think I know his full name but attempts at emailing him failed. I wrote a couple of times to the site’s webmaster, who does not respond.

Needless to say, the work is copyrighted and that online copy is not authorized. (I realize that to some people the very idea of protecting intellectual property is anathema, but I, not they, wrote the book, and for the time being it is not public property.)

At least Google could avoid directing people to a pirated text as the first answer to a query about my publications. I was able to bring the issue to the attention of someone at Google; that result is already something of a miracle, as anyone who tries to interact with a human being regarding a Google-related problem can testify. The history of that interaction, which was initially about something else, might serve as the subject for another article. The person refused to do anything and pointed me to an online tool [4] for removing search results.

Navigating the tool proved to be an obstacle course, starting with the absence of Google Scholar among the Google products listed (I inquired and was told to use “Web Search”). Interestingly, to use this service, you have to be logged in as a Gmail user; I do have a Gmail account, but I know several people, including a famous computer scientist, who refuse to open one out of fear for their privacy. Think of the plight of someone who has a complaint against Google results affecting his privacy, and to lodge that complaint must first register as a Google user! I did not have that problem but still had to get through the rest of the course. (It includes one of those “Captchas” that are so good at preventing automatic tools from deciphering the words that humans can’t read them either; I have pretty good eyesight and still I had to try five times. Fodder for yet another article.) But I succeeded, sent my request, and got an automatic acknowledgement. Then…

Then nothing. No answer. The search results remain the same. No one seems to care.

Here is a little thought experiment. Imagine you violated Google’s IP, for example by posting some Google proprietary code on your Web page. Now I have a hunch that they would respond faster. Much faster. This is all pure speculation of course, and I am not advising anyone to try the experiment for real. Pure speculation.

In the meantime, maybe I can at least use the opportunity for some self-promotion. The book is actually pretty good, I think. You can buy it at Amazon [5] for $97.40, a bit less for a used copy. But why pay? Google invites you to read it for free. Just follow any of the links they obligingly provide at [1].

References

[1] Result of a search for author:”b meyer” on Google Scholar: see here.

[2] Bertrand Meyer: Object-Oriented Software Construction, 2nd edition, Prentice Hall, 1997. See the book’s page at Eiffel Software here and the Wikipedia entry here. Note that either would be appropriate for Google Scholar to identify the book.

[3] Bootlegged version of [2] here.

[4] Google: “Removing content from Google”, page available here.

[5] Amazon book page for [2]: here.

New LASER proceedings

Springer has just published in the tutorial sub-series of Lecture Notes in Computer Science a new proceedings volume for the LASER summer school [1]. The five chapters are notes from the 2008, 2009 and 2010 schools (a previous volume [2] covered earlier schools). The themes range over search-based software engineering (Mark Harman and colleagues), replication of software engineering experiments (Natalia Juristo and Omar Gómez), integration of testing and formal analysis (Mauro Pezzè and colleagues), and, in two papers by our ETH group, Is branch coverage a good measure of testing effectiveness (with Yi Wei and Manuel Oriol — answer: not really!) and a formal reference for SCOOP (with Benjamin Morandi and Sebastian Nanz).

The idea of these LASER tutorial books — which are now a tradition, with the volume from the 2011 school currently in preparation — is to collect material from the presentations at the summer school, prepared by the lecturers themselves, sometimes in collaboration with some of the participants. Reading them is not quite as fun as attending the school, but it gives an idea.

The 2012 school is in full preparation, on the theme of “Advanced Languages for Software Engineering” and with once again an exceptional roster of speakers, or should I say an exceptional roster of exceptional speakers: Guido van Rossum (Python), Ivar Jacobson (from UML to Semat), Simon Peyton-Jones (Haskell), Roberto Ierusalimschy (Lua), Martin Odersky (Scala), Andrei Alexandrescu (C++ and D), Erik Meijer (C# and LINQ), plus me on the design and evolution of Eiffel.

The preparation of LASER 2012 is under way, with registration now open [3]; the school will take place from Sept. 2 to Sept. 8 and, like its predecessors, in the wonderful setting on the island of Elba, off the coast of Tuscany, with a very dense technical program but time for enjoying the beach, the amenities of a 4-star hotel and the many treasures of the island. On the other hand not everyone likes Italy, the sun, the Mediterranean etc.; that’s fine too, you can wait for the 2013 proceedings.

References

[1] Bertrand Meyer and Martin Nordio (eds): Empirical Software Engineering and Verification, International Summer Schools LASER 2008-2010, Elba Island, Italy, Revised Tutorial Lectures, Lecture Notes in Computer Science 7007, Springer-Verlag, 2012, see here.

[2] Peter Müller (ed.): Advanced Lectures on Software Engineering, LASER Summer School 2007-2008, Lecture Notes in Computer Science 7007, Springer-Verlag, 2012, see here.

[3] LASER summer school information and registration form, http://se.ethz.ch/laser.

TOOLS 2012, “The Triumph of Objects”, Prague in May: Call for Workshops

Workshop proposals are invited for TOOLS 2012, The Triumph of Objects (tools.ethz.ch), to be held in Prague May 28 to June 1. TOOLS is a federated set of conferences:

  • TOOLS EUROPE 2012: 50th International Conference on Objects, Models, Components, Patterns.
  • ICMT 2012: 5th International Conference on Model Transformation.
  • Software Composition 2012: 10th International Conference.
  • TAP 2012: 6th International Conference on Tests And Proofs.
  • MSEPT 2012: International Conference on Multicore Software Engineering, Performance, and Tools.

Workshops, which are normally one or two days long, provide organizers and participants with an opportunity to exchange opinions, advance ideas, and discuss preliminary results on current topics. The focus can be on in-depth research topics related to the themes of the TOOLS conferences, on best practices, on applications and industrial issues, or on some combination of these.

SUBMISSION GUIDELINES

Submitting a proposal implies the organizers’ commitment to organize and lead the workshop personally if it is accepted. The proposal should include:

  • Workshop title.
  • Names and short bios of organizers.
  • Proposed duration.
  • Summary of the topics, goals and contents (guideline: 500 words).
  • Brief description of the audience and community to which the workshop is targeted.
  • Plans for publication if any.
  • Tentative Call for Papers.

Acceptance criteria are:

  • Organizers’ track record and ability to lead a successful workshop.
  • Potential to advance the state of the art.
  • Relevance of topics and contents to the topics of the TOOLS federated conferences.
  • Timeliness and interest to a sufficiently large community.

Please send the proposals to me (Bertrand.Meyer AT inf.ethz.ch), with a Subject header including the words “TOOLS WORKSHOP”. Feel free to contact me if you have any questions.

DATES

  • Workshop proposal submission deadline: 17 February 2012.
  • Notification of acceptance or rejection: as promptly as possible and no later than February 24.
  • Workshops: 28 May to 1 June 2012.

 

Various interviews

Over the past few months I have given a few interviews to Russian news outlets on technology- and software-related issues. Here are the links I have.

In September, Mikhail Saprykin interviewed me [1] for Kommersant (the main Russian business daily) on a question that worries everyone in technology and academia: the brain drain.

In early November, at the SECR conference in Moscow where I gave a keynote [2], Natalia Dubova from Open Systems, the principal applied publication on software issues (which published translations of many of my articles over the years), interviewed me on the theme of software reliability and Eiffel [3].

On the same occasion, Internet University, which recently published the translation [4] of my introductory programming textbook Touch of Class [5], recorded a video conversation [6] between Prof. Vladimir Billig from Tver Technical University and me. Vladimir (pictured here a few weeks later) is the book’s translator; he had already translated the second edition of Object-Oriented Software Construction.

[Photo: Vladimir Billig]

On December 19 I was interviewed with Dmitry Grishin, head of mail.ru, one of the biggest Russian internet companies, by Alexander Belanovskiy at the radio station “Echo Moskvy” in Moscow, for Echonet, the station’s technology program. The interview will air, I was told, on January 15.

On the occasion of a talk I gave on December 19 at the Technical University of Tver, a historic city at the junction of the Volga (appearing on the far right in the picture) and the Tverska, I was interviewed on two separate TV stations (one of them Russia 1); I didn’t get to see the broadcasts, but if anyone finds them on the Web I will be grateful for the links.

[Photos: Tver, November 2011; Tver house; Tver church]

References

[1] Interview by Mikhail Saprykin in Kommersant, 20 September 2011, available here.

[2] Keynote at Software Engineering Conference Russia, available here.

[3] Interview by Natalia Dubova in Otkrytye Systemy (Open Systems), vol. 10, no. 21, December 2011, available here.

[4] Potchustvuj Klass: translation by Vladimir Billig of Touch of Class [5], book page available here.

[5] Bertrand Meyer: Touch of Class: Learning to Program Well, Using Objects and Contracts, Springer Verlag, 2009, book page available here.

[6] Video interview with Vladimir Billig, available here.