Following the news – Code inspection is 80 percent faster than testing

During the CAST 2014 conference in New York I participated in a workshop by Laurent Bossavit and Michael Bolton called “Thinking critically about numbers – Defense against the dark arts”. Inspired by this workshop I took a look at one of the Dutch sites addressing news about software testing, www.testnieuws.nl. This is the second post to come out of my curiosity.

On May 23, 2014 Testnieuws hosted an article “Code inspectie is 80 procent sneller dan testen” (translated: “Code inspection is 80 percent faster than testing”). The article itself provides little more substantiation for the claim than a reference to research by IfSQ. Both the claim and its use as a header seem to serve only to grab the reader's attention. The article ends with an invitation to read more, which leads to what I think is the actual article, “Status ICT-projecten vaak compleet onduidelijk” (translated: “Status of ICT projects often completely unclear”). This article describes that projects, especially government projects, need more objective information and need to get it earlier; that way it is possible to determine the status of a project. Andres Ramirez, Managing Partner of the OSQR Group, states that “better software leads to better projects”, “better source code leads to better software” and “the quality of source code can be objectively assessed by the guidelines from the Institute for Software Quality”. The last quote explains the IfSQ abbreviation used earlier.

A little further in the article the claim is used and even extended: “Code inspection is 80 percent faster than testing, and finding and repairing code is much cheaper than testing”. Ramirez also adds: “Research by IfSQ shows that regular code inspection during the production process ensures that software can be changed more easily. Inspected software is 90 percent cheaper to maintain.” I am choosing to ignore these last claims for now and proceed to IfSQ to look into their research.

The IfSQ page “Research Findings Relevant to the IfSQ Standards” hosts about 50 or so references to articles and research results, divided into the sections “Why should you inspect software?”, “When should you inspect software?” and “What should you look for?”. Noticeably, the focus is strongly on code quality and, in my opinion, therefore not really on software quality as such. There also seems to be a need to position code inspection opposite to testing, as suggested by titles like:

The second title points to a page with the title “Inspection is 80% faster than testing”, which indicates I am on the right track. The page, however, only repeats “Code reading detected about 80% more faults per hour than testing.” and provides two non-IfSQ sources for it without further argumentation. The sources are:

So, at least in this case, the so-called research findings by IfSQ do not point to research executed by IfSQ themselves, nor were they involved, as both sources are quite old and IfSQ was only established much later, in 2005. The next step is to identify which of the two sources holds the quote.

The first article was easily found. In summary, the article describes a scientific study that applies an experimentation methodology to compare three (then) state-of-the-practice testing techniques: a) code reading by stepwise abstraction, b) functional testing using equivalence partitioning and boundary value analysis, and c) structural testing using 100 percent statement coverage. It compares three aspects of software testing: fault detection effectiveness, fault detection cost and classes of faults detected. It focused on unit testing code, using a limited set of specific programs, known errors and a mix of academics and professional developers.
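
To make the second and third technique a bit more concrete, here is a minimal sketch using a made-up grading function (not one of the programs from the study); technique a, code reading by stepwise abstraction, is a manual activity and has no code counterpart here. The functional cases follow from the specification's partitions and boundaries, while full statement coverage is already satisfied by a much smaller set.

```python
def grade(score):
    """Toy exam-grading function; assumed spec: valid scores 0-100, pass at 55 or higher."""
    if score < 0 or score > 100:
        raise ValueError("score out of range")
    if score >= 55:
        return "pass"
    return "fail"

# b) Functional testing: cases derived from the specification using equivalence
#    partitioning (invalid-low, fail, pass, invalid-high) and boundary value
#    analysis (values at and just beyond each boundary).
functional_cases = [
    (-1, ValueError), (0, "fail"), (54, "fail"),
    (55, "pass"), (100, "pass"), (101, ValueError),
]

# c) Structural testing: 100 percent statement coverage is already reached with
#    far fewer cases, because it only requires every statement to run once.
structural_cases = [(-1, ValueError), (30, "fail"), (70, "pass")]

def run(cases):
    for value, expected in cases:
        try:
            actual = grade(value)
        except ValueError:
            actual = ValueError
        assert actual == expected, (value, actual, expected)

run(functional_cases)
run(structural_cases)
print("both toy suites pass")
```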

Although it found differences between the three test techniques, with in some instances an identifiable hierarchy of code reading, functional testing and structural testing, none of the results came anywhere near the claim of being 80% faster. So my conclusion is that this article cannot be a valid source for this claim.

I could only find the second article at IEEE, and as a result it could only be read by buying it. Setting aside my initial dislike of paying for information, especially when it is this old, I tried to buy it. Unfortunately the payment module did not accept my Dutch credit card. As a result I stuck to a number (4) of abstracts and a course summary in which the article was used.

The second article came closer to the IfSQ description of code inspection in describing how it is done, what is needed for it and what it can measure. Still, none of the abstracts said anything about being faster than testing. They did mention percentages around 80% for defects found by software inspections. This, to me, is a different claim. It sounds like a ‘leaky’ inference was made and, worse, another attempt to gain credibility by bringing testing into disrepute.

Following the news

During the CAST 2014 conference in New York I participated in a workshop by Laurent Bossavit and Michael Bolton called “Thinking critically about numbers – Defense against the dark arts”. This workshop addressed the usage of numbers and measurements, and the ability to influence their perception by choosing a certain representation, telling only part of the information or leaving out context. Inspired by this workshop I took a look at one of the Dutch sites addressing news about software testing, www.testnieuws.nl.

On Testnieuws I found an article about software errors invalidating school exams, “Eindexamen ongeldig door softwarefout” (translated: “Final exam invalid due to software error”). The story describes that the school inspection declared 1372 school exams invalid due to software errors, especially software errors in the VMBO (a Dutch school type) Math exam. Being a tester, I was curious whether I could find out what had happened. The article, written by Marco van der Spek, referenced a newspaper article in “De Limburger”. Both articles were exactly the same, so my first conclusion was that in this case it was more a matter of posting an article than writing one. This is in line with the general practice of the site, as it is more a collector of news than an actual writer of news stories. (To my knowledge the site is only manned by a few part-timers.)

Since the newspaper's only indirect reference was a mention of the school inspection, my search now focused on looking for the source document there. On their site I found the original press release. The press release added much more detail with regard to all the exams, both written and digital, and differentiated the 1372 number as follows:

  • 127 cases of technical problems (film not running, computer not working correctly)
  • 112 cases of non-technical problems (running out of time, fire alarm, usage of wrong materials)
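
Subtracting both categories from the total (all figures taken directly from the press release) gives: 1372 − 127 − 112 = 1133.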

That brought the number of potential Math exam problems down to 1133. The press release also mentioned what the problem with the Math exam was. It was not that the digital exam had produced obvious software errors or that functionality had not been available. The exam had offered the use of a built-in calculator that handled negative numbers differently than many of the calculators VMBO students had used during the school year. The school inspection had ruled that this had put the students at a disadvantage and offered them the possibility to redo the exam if they so wished. 1133 students made use of this offer.

So what does this mean?

First, the number of 1372 school exams is arbitrary and for the most part based on the number of students that took up the offer to redo the Math exam. This could just as well have been half or double that number. So mentioning that specific number does not add value to the story.

Secondly, I see an oracle problem with regard to the conclusion that the software was in error. Based on the information I cannot tell whether the built-in calculator actually produced wrong answers when using negative numbers, or whether the calculators the students used during the school year did. (Assuming there is another oracle, in the form of a scientifically established rule for using negative numbers, we could determine which of the two produced incorrect answers.)
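
The press release does not say what the actual difference was, so purely as a hypothetical illustration of the oracle problem: one classic way two calculators can disagree on negative numbers is precedence, whether −3² means (−3)² or −(3²). Neither behaviour shows an error message, yet they give different answers.

```python
# Hypothetical illustration only; the press release does not describe the actual
# difference between the built-in calculator and the students' own calculators.

def sign_binds_first(x):
    """Calculator style A: reads '-x^2' as (-x)^2, the minus sign belongs to the number."""
    return (-x) ** 2

def power_binds_first(x):
    """Calculator style B (also Python's own precedence rule): reads '-x^2' as -(x^2)."""
    return -(x ** 2)

for x in (2, 3, 5):
    print(f"-{x}^2  ->  style A: {sign_binds_first(x):>3}   style B: {power_binds_first(x):>4}")

# Without an agreed oracle (a convention both sides accept as the rule for negative
# numbers), neither output on its own tells us which calculator is "in error".
```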

Finally, I see a case of shallow agreement on what a software error is. For some, a software error is something that occurs when the software runs into a situation it cannot handle, or cannot produce the data, and responds by showing an error message. Others see a software error when the software, in this case the calculator, is not functioning according to its specifications. The built-in calculator may or may not have been functioning according to its specifications; we cannot tell based on the available information.

I do like the school inspection's response to the situation. They did not call any of it a software error but only mentioned that 1372 digital exams were declared invalid. They did, however, see the potential disadvantage for the VMBO students, which I think is an excellent oracle, and offered them a solution by allowing them to redo the exam.