The biggest software-induced disaster ever
In spite of the brouhaha surrounding the Affordable Care Act, the US administration and its partisans seem convinced that “the Web site problems will be fixed”.
That is doubtful. All reports suggest that the fix is not a matter of replacing a checkbox with a menu or buying a few more servers. The analysis, design and implementation are wrong, and the sites will not work properly any time soon.
Barring sabotage (for which we have seen no evidence), this can only be the result of incompetence. An insurance exchange? Come on. Any half-awake group of developers could program it over breakfast.
Who chose the contractors?
When the problems first surfaced a few weeks ago, anyone with experience and guts would have done the right thing: fire all the companies responsible for the mess and start from scratch with a dedicated, competent and well-managed team.
The latest published promise is that by the end of the month “four out of five” of the people trying to register will manage to do it. Nice. Imagine succeeding only 80% of the time when trying to make a purchase at Amazon.
And that is only an optimistic goal.
The people building the site do not have infinite time. In fact, the process is crucially time-driven: if people do not get health coverage in time, they will be fined. But what if they cannot get coverage because the Web sites do not respond, or mess up?
Consider for a second another strictly time-driven project: on January 1, 2002, twelve countries switched to a common currency, with the provision that their previous national currencies would lose legal-tender status barely two months later. The IT infrastructure had to work on the appointed day. It did. How come Europe could implement the Euro in time and the US cannot get a basic health exchange to work?
Here is a possible scenario: the sites do not work (cannot handle the load, give inconsistent results). A massive wave of protests ensues, boosted by those who were against universal health coverage in the first place. Faced with popular revolt and with the evidence, the administration announces that the implementation of the universal mandate — the enforcement of the fines — is delayed by a year. In a year much can happen; opposition grows and the first exchanges are an economic disaster since the “young healthy adults” feel no pressure to enroll. The law fades into oblivion. Americans do not get universal health care for another generation. Show me it is not going to happen.
The software engineering lessons here are clear: hire competent companies; faced with a complicated system, implement the essential functions first, but stress-test them; deploy step by step, with the assurance that whatever is deployed works.
The exact reverse strategy was applied. As a result, we face the prospect of a software disaster that will dwarf Y2K and other famous mishaps; a disaster that software engineering textbooks will feature for decades to come.
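Two of the lessons above, stress-testing the essential functions and deploying only what is known to work, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `enroll` is a hypothetical stand-in for a single essential function, and `stress_test`, `ready_to_deploy` and the 99.9% threshold are invented names and values, not anything taken from the actual exchange software.

```python
from concurrent.futures import ThreadPoolExecutor


def enroll(applicant_id: int) -> bool:
    """Hypothetical stand-in for one essential function (registration)."""
    return applicant_id >= 0


def stress_test(func, requests, workers=50):
    """Call func concurrently on every request and return the success rate."""

    def safe_call(request):
        # A crash counts as a failed request, exactly as a user would see it.
        try:
            return bool(func(request))
        except Exception:
            return False

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(safe_call, requests))
    return sum(results) / len(results) if results else 0.0


def ready_to_deploy(func, requests, required_rate=0.999):
    """Deployment gate: release only if the stress test meets the threshold.

    The 99.9% default is an assumed value chosen for illustration.
    """
    return stress_test(func, requests) >= required_rate
```

Under such a gate, a function that fails one request in five, the “four out of five” target, would simply not be released: its measured success rate of 0.8 falls far below the threshold.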