Healthcare.gov screenshot

Obamacare: Where Did 500 Million Lines of Code Come From?

Anyone who knows even a little about enterprise software has to be horrified by the events surrounding the rollout of the Affordable Care Act. Not so much by the obvious flaws in the enrollment process, though they are bad enough, but by the inane statements of politicians, the glaring lack of knowledge in both the tech and general media, and even the statements of a lot of people who should know better.

An example: It has now become commonly accepted knowledge that Healthcare.gov consists of 500 million lines of code. And where does this wisdom come from? Digging into its origin, it appears the first reference was in an Oct. 20 New York Times article whose last paragraph read:

According to one specialist, the Web site contains about 500 million lines of software code. By comparison, a large bank’s computer system is typically about one-fifth that size.

No indication of who that specialist was, no indication of his or her credentials, and no reason why anonymity was granted for uttering what is presented as a statement of fact. And the reader is given no explanation of why a bank computer system is a relevant standard of comparison. (The same article cited a “specialist,” not clearly the same one, as saying that 5 million lines of code might need to be rewritten, with an equal lack of provenance.)

Meaningless metrics. In fact, “lines of code” is meaningless as a measure of complexity or anything else. For example, this is a completely valid (assuming that there’s a matching { somewhere) and common line of code in the C language:

}

On the other hand, a sufficiently clever Perl programmer could probably compress a million lines of code into a single incomprehensible and totally undebuggable, but functional, line. To a considerable extent, the number of lines of code required for a task is a function of the programming language used and the style of the coder as much as anything else. But as a general rule, using more lines rather than fewer for a given task produces code that is easier to read and easier to debug, especially if a lot of those lines are comments explaining how the code works.
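To make the point concrete, here is a minimal Python sketch that counts the lines in two made-up C snippets that do exactly the same job (summing the integers 1 through n). The snippets and the helper names `physical_lines` and `brace_only_lines` are illustrative assumptions, not anything taken from Healthcare.gov.

```python
# Two hypothetical C snippets with identical behavior: sum the integers 1..n.
COMPACT_C = "int sum(int n){int s=0;for(int i=1;i<=n;i++)s+=i;return s;}"

VERBOSE_C = """\
/* Return the sum of the integers 1..n. */
int sum(int n)
{
    int s = 0;

    /* Add each integer in turn. */
    for (int i = 1; i <= n; i++)
    {
        s += i;
    }

    return s;
}
"""


def physical_lines(source: str) -> int:
    """Count every line, including blank lines and comments."""
    return len(source.splitlines())


def brace_only_lines(source: str) -> int:
    """Count lines that contain nothing but a single brace."""
    return sum(1 for line in source.splitlines() if line.strip() in ("{", "}"))


if __name__ == "__main__":
    print("compact version:", physical_lines(COMPACT_C), "line(s)")
    print("verbose version:", physical_lines(VERBOSE_C), "line(s)")
    print("brace-only lines in verbose version:", brace_only_lines(VERBOSE_C))
```

The same function comes out as 1 line or 13 lines depending purely on style, and 4 of those 13 lines are nothing but a brace. A raw line count says little about how much work the code does.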

Of course, the argument about the number of lines was high-level debate compared to Texas Republican Joe Barton’s attack on Healthcare.gov’s alleged assault on privacy. At a hearing Oct. 25, Barton got extremely exercised about “code” that stated that users of the site had “no reasonable expectation of privacy.” Of course, if Barton (or his staff) actually knew how to read HTML, they would have realized that the offending line was in a block of code that had been commented out and thus was of no significance. (For the curious, this slightly blurry PDF of the code shows comment markers at lines 1406 and 1411.)
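For readers who have not looked at HTML source, the following Python sketch shows why commented-out markup does not count. It uses the standard library's `html.parser.HTMLParser`, which reports text inside `<!-- ... -->` through a separate comment callback rather than as page content. The sample markup is hypothetical and is not the actual Healthcare.gov source. The class and function names are likewise made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page source for illustration only; NOT the real site markup.
SAMPLE_HTML = """\
<p>Welcome to the marketplace.</p>
<!--
<p>You have no reasonable expectation of privacy.</p>
-->
<p>Continue to enrollment.</p>
"""


class VisibleTextCollector(HTMLParser):
    """Separate text a browser would render from text hidden in comments."""

    def __init__(self):
        super().__init__()
        self.visible = []    # text nodes that are part of the page
        self.commented = []  # raw contents of <!-- ... --> blocks

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.visible.append(text)

    def handle_comment(self, data):
        self.commented.append(data.strip())


def split_visible_and_commented(html: str):
    """Return (visible_text, commented_out_text) for an HTML string."""
    parser = VisibleTextCollector()
    parser.feed(html)
    parser.close()
    return parser.visible, parser.commented


if __name__ == "__main__":
    visible, commented = split_visible_and_commented(SAMPLE_HTML)
    print("rendered:", visible)
    print("commented out:", commented)
```

In this sketch the privacy sentence shows up only in the commented-out list, which is the sense in which such a line is "of no significance" to what a visitor actually sees.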

Was a waterfall to blame? One of the stranger analyses of Healthcare.gov’s woes blames the use of “waterfall” rather than “agile” development. Agile and waterfall are two different approaches to organizing major software projects, and their devotees can argue their relative virtues with the fervor of religious fanatics. But no one has ever demonstrated the superiority of one to the other in all cases, and no one, including developer Larry Fitzpatrick in the cited article, has shown that waterfall was responsible for the problems or even that it was the technique used.

Instead of a lot of random speculation, we could use some serious investigation of what really happened in the development of Healthcare.gov. This isn’t going to come from politicians, who are more interested in scoring points than fixing anything, but it might come from some hard journalistic work.

I’ll throw out my own question. An article by Sarah Kliff of The Washington Post (her expertise is healthcare, not tech, but she has been doing by far the best coverage of Healthcare.gov) explains the error-ridden process by which applicant information is transferred from the government to insurance companies. The key is a standard insurance industry form called an 834 that is transmitted using an antiquated technique called electronic data interchange. Why was the system based on EDI rather than the current approach of storing the data in XML files and transferring it via an API? Was it the old-fashioned government? Or was EDI all the insurance companies could handle (the same process is widely used to communicate between employers and insurance companies)? Might not Healthcare.gov have presented an opportunity to modernize the process rather than enshrine early 1990s technology?
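To give a feel for the difference in formats, here is a rough Python sketch that splits an EDI-style string into segments and re-expresses it as XML. It assumes the X12 convention of `*` between elements and `~` at the end of each segment. The sample string is made up and heavily simplified, and it is not a complete or valid 834. The XML element names (`enrollment`, `segment`, `element`) are hypothetical and do not come from any real schema.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily simplified X12-style fragment; NOT a complete or valid 834.
SAMPLE_EDI = "INS*Y*18~NM1*IL*1*DOE*JANE~DMG*D8*19800101*F~"


def parse_segments(edi: str, element_sep: str = "*", segment_end: str = "~"):
    """Split an EDI-style string into a list of segments (lists of elements)."""
    segments = []
    for raw in edi.split(segment_end):
        raw = raw.strip()
        if raw:
            segments.append(raw.split(element_sep))
    return segments


def segments_to_xml(segments) -> str:
    """Render parsed segments as XML using hypothetical element names."""
    root = ET.Element("enrollment")
    for seg_id, *elements in segments:
        seg = ET.SubElement(root, "segment", id=seg_id)
        for position, value in enumerate(elements, start=1):
            el = ET.SubElement(seg, "element", position=str(position))
            el.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    parsed = parse_segments(SAMPLE_EDI)
    print(parsed)
    print(segments_to_xml(parsed))
```

In the EDI form, the meaning of each value depends entirely on its position within the segment, so a dropped or misplaced delimiter silently shifts everything after it. The XML form is bulkier but labels each value. Whether that difference explains any of the 834 errors is exactly the kind of thing that would need real reporting to establish.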

This sort of question, not random speculation, is what analysis of the Obamacare mess should be focusing on.

 

 

Published by

Steve Wildstrom

Steve Wildstrom is a veteran technology reporter, writer, and analyst based in the Washington, D.C. area. He created and wrote BusinessWeek’s Technology & You column for 15 years. Since leaving BusinessWeek in the fall of 2009, he has written his own blog, Wildstrom on Tech, has contributed to corporate blogs, including those of Cisco and AMD, and consults for major technology companies.
