About Arborosa

Arborosa is a software tester, walking fanatic, father of two and husband living in Utrecht in the Netherlands. This blog is intended to display my thoughts and opinions on software testing, books, blogs, experiences and anything else that I find interesting enough to write and publish about.

Following the news – Code inspection is 80 percent faster than testing

During the CAST 2014 conference in New York I participated in a workshop by Laurent Bossavit and Michael Bolton called “Thinking critically about numbers – Defense against the dark arts”. Inspired by this workshop I took a look at one of the Dutch sites addressing news about software testing, www.testnieuws.nl. This is the second post to come out of my curiosity.

On May 23, 2014 Testnieuws hosted an article “Code inspectie is 80 procent sneller dan testen” (translated: “Code inspection is 80 percent faster than testing”). The article itself provides little more substantiation for the claim than a reference to research by IfSQ. Both the claim and its use as a headline seem to serve only to grab the reader's attention. The article ends with an invitation to read more about it, and this leads to what I think is the actual article, “Status ICT-projecten vaak compleet onduidelijk” (translated: “Status of ICT projects often completely unclear”). This article argues that projects, especially government projects, need more objective information and need it earlier, so that the status of a project can be determined. Andres Ramirez, Managing Partner of the OSQR Group, states that “better software leads to better projects” and “better source code leads to better software” and “the quality of source code can be objectively assessed by the guidelines from the Institute for Software Quality”. The last quote explains the IfSQ abbreviation used earlier.

A little further in the article the claim is used and even extended: “Code inspection is 80 percent faster than testing, and finding and repairing code is much cheaper than testing”. Ramirez also adds: “Research by IfSQ shows that regular code inspection during the production process ensures that software can be changed more easily. Inspected software is 90 percent cheaper to maintain.” I am choosing to ignore these last claims for now and proceed to IfSQ to look into their research.

The IfSQ – Research Findings Relevant to the IfSQ Standards page hosts about 50 references to articles and research results, divided into the sections “Why should you inspect software?”, “When should you inspect software?” and “What should you look for?”. Noticeably the focus is strongly on code quality and, in my opinion, therefore not really on software quality as such. There also seems to be a need to position code inspection in opposition to testing, as suggested by titles like:

The second title points to a page with the title “Inspection is 80% faster than testing”, which indicates I am on the right track. The page however only repeats “Code reading detected about 80% more faults per hour than testing.” and provides two non-IfSQ sources for it without further argumentation. The sources are:

So, at least in this case, the so-called research findings by IfSQ do not point to research executed by IfSQ themselves. Nor can IfSQ have been involved, as both sources are quite old and IfSQ was established much later, in 2005. The next step is to identify which of the two sources holds the quote.

The first article was easily found. In summary, the article describes a scientific study that applies an experimentation methodology to compare three (then) state-of-the-practice testing techniques: a) code reading by stepwise abstraction, b) functional testing using equivalence partitioning and boundary value analysis, and c) structural testing using 100 percent statement coverage. It compares three aspects of software testing: fault detection effectiveness, fault detection costs and classes of faults detected. It focussed on unit testing code using a limited set of specific programs, known errors and a mix of academics and professional developers.

Although it found differences between the three test techniques, in some instances with an identifiable hierarchy of code reading, functional testing and structural testing, none of the results came anywhere near the claim of being 80% faster. So my conclusion is that this article cannot be a valid source for this claim.

I could only find the second article at IEEE, where it could only be read by buying it. Setting aside my initial dislike of paying for information, especially if it is so old, I tried to buy it. Unfortunately the payment module did not like my Dutch credit card. As a result I stuck to a number (4) of abstracts and a summary of a course in which the article was used.

The second article came closer to the IfSQ description of code inspection in describing how it is done, what is needed for it and what it can measure. Still, none of the abstracts said anything about being faster than testing. They did mention percentages around 80% for defects found by software inspections. This to me is a different claim. It sounds to me as if a ‘leaky’ inference was made and, worse, another attempt to gain credibility by bringing testing into disrepute.
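To make the difference between these readings of “80%” concrete, here is a minimal arithmetic sketch. It assumes, purely for illustration, that “faster” would be read as “takes less time per fault found”; the function names are my own and do not come from IfSQ or from either source.

```python
def time_saved_fraction(rate_ratio: float) -> float:
    """Fraction of time saved per fault when faults are found rate_ratio times as fast."""
    return 1 - 1 / rate_ratio


def rate_ratio_for_time_saved(fraction: float) -> float:
    """How many times as fast you must find faults to save the given fraction of time."""
    return 1 / (1 - fraction)


# "80% more faults per hour" means a rate ratio of 1.8.
print(round(time_saved_fraction(1.8), 3))        # 0.444

# "80% less time per fault" would need a rate ratio of about 5.
print(round(rate_ratio_for_time_saved(0.8), 3))  # 5.0
```

Under that assumption, finding 80% more faults per hour saves about 44% of the time per fault, not 80%. A statement that inspections find around 80% of the defects is a detection share and says nothing about speed at all.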

Following the news

During the CAST 2014 conference in New York I participated in a workshop by Laurent Bossavit and Michael Bolton called “Thinking critically about numbers – Defense against the dark arts”. This workshop addressed the usage of numbers and measurements, and the ability to influence their perception by choosing a certain representation, telling only part of the information or leaving out context. Inspired by this workshop I took a look at one of the Dutch sites addressing news about software testing, www.testnieuws.nl.

On Testnieuws I found an article about software errors invalidating school exams, “Eindexamen ongeldig door softwarefout” (translated: “Final exam invalid due to software error”). The story describes that the school inspection declared 1372 school exams invalid due to software errors, especially software errors in the VMBO (a Dutch school type) Math exam. Being a tester, I was curious whether I could find out what had happened. The article, written by Marco van der Spek, referenced a newspaper article in “De Limburger”. Both articles were exactly the same, so my first conclusion was that in this case it was more a matter of posting an article than of writing one. This is in line with the general practice of the site, as it is more a collector of news than an actual writer of news stories. (To my knowledge the site is only manned by a few part-timers.)

Since the newspaper's only indirect reference was a mention of the school inspection, my search now focussed on finding the source document there. On their site I found the original press release. The press release added much more detail with regard to all the exams, both written and digital, and differentiated the 1372 number as follows:

  • 127 cases of technical problems (film not running, computer not working correctly)
  • 112 cases of non-technical problems (running out of time, fire alarm, usage of wrong materials)

That brought the number of potential Math exam problems down to 1133. The press release also mentioned what the problem with the Math exam was. It was not that the digital exam had produced obvious software errors or that functionality had not been available. The exam had offered the use of a built-in calculator that handled negative numbers differently than many of the calculators VMBO students had used during the school year. The school inspection had ruled that this had put the students at a disadvantage and offered them the possibility to redo the exam if they so wished. 1133 students made use of this offer.
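The subtraction behind the 1133 figure can be checked with a few lines. This is only a sketch of the arithmetic; the variable names are my own, and the numbers are the ones from the press release quoted above.

```python
total_invalid = 1372          # exams declared invalid
technical_problems = 127      # film not running, computer not working correctly
non_technical_problems = 112  # running out of time, fire alarm, wrong materials

# What remains after removing the two explained categories.
remaining = total_invalid - technical_problems - non_technical_problems
print(remaining)  # 1133

# Share of the headline number that comes from the Math exam retakes.
share = remaining / total_invalid
print(f"{share:.1%}")
```

Roughly five out of six of the invalid exams in the headline number are therefore retakes of the Math exam.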

So what does this mean?

First, the number of 1372 school exams is arbitrary and for the most part based on the number of students that used the offer to redo the Math exam. This could just as well have been half or double that number. So mentioning that specific number does not add value to the story.

Secondly, I see an oracle problem with regard to the conclusion that the software was in error. Based on the information I cannot tell whether the built-in calculator actually produced wrong answers when using negative numbers, or whether the calculators the students used during the school year did. (Assuming there is another oracle, in the form of a scientifically established rule for using negative numbers, we could tell which of the two produced incorrect answers.)

Finally, I see a case of shallow agreement on what a software error is. For some, a software error is something that occurs when the software runs into a situation where it cannot handle or produce the data and responds by showing an error message. Others see a software error when the software, in this case the calculator, is not functioning according to specifications. The built-in calculator may or may not have been functioning according to its specifications; we cannot tell based on the information.

I do like the school inspection's response to the situation. They did not call any of it a software error but only mentioned that 1372 digital exams were declared invalid. They did however see the potential disadvantage for the VMBO students, which I think is an excellent oracle, and offered a solution for them to redo the exam.
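A minimal sketch of this oracle problem. The press release does not say how the two calculators differed, so the example below is an assumption of mine: it uses one well-known way in which calculators can disagree on negative numbers, namely how they read “minus n squared”. Both functions are hypothetical and do not model the actual exam software.

```python
def calculator_a(n: int) -> int:
    """Assumed behaviour: reads 'minus n squared' as -(n squared)."""
    return -(n ** 2)


def calculator_b(n: int) -> int:
    """Assumed behaviour: reads 'minus n squared' as (-n) squared."""
    return (-n) ** 2


def compare(n: int) -> dict:
    """Compare both calculators on the same input; neither serves as the oracle."""
    a, b = calculator_a(n), calculator_b(n)
    return {"a": a, "b": b, "agree": a == b}


print(compare(3))  # {'a': -9, 'b': 9, 'agree': False}
```

The comparison only tells us that the two calculators disagree. Deciding which one is wrong needs a third oracle, such as an agreed mathematical convention or the specification of the exam.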

Seven questions – What questions do I have?

The previous two questions helped you to find why testing is necessary, what information you need to answer the first question (business value) and which test ideas help you deliver meaningful and relevant information. This post now extends this to areas that help you identify the circumstances in which you will have to do your work. It ends with a little advice that you should not take things for granted especially if you do not understand them.

DID-A-TEST

Originally called Jean-Paul’s test, this mnemonic represents a set of surveying questions that help you identify your working conditions. Once you have the answers to these questions you should check whether, and if so how, they influence your ability to test and your ability to give more or less rich information to your stakeholders. You can use these questions to identify boundaries and constraints to your testing possibilities and address them, or at least be aware of them and make others aware of them. These questions are by no means exhaustive, but in my opinion they form a good starting point in exploring your test context.

Are the Developers available?

Developers are physically close to or far from you. They are more or less available in time, or more or less organizationally accessible to testers. The ability or inability to work together with development can influence your risk assessments, your insight into risk areas, your knowledge about development solutions and what is or is not covered by development testing activities. Additionally, when addressing developers it is good to know the preferences and willingness of each developer with regard to working with testers.

How soon do you have access to Information?

Of course you can use the FEW HICCUPPS mnemonic (James Bach, Michael Bolton) to improve and expand your test ideas, but gathering information about the intended product or solution is a main starting point and important reference to work with. So getting access to the sources of information or even better being involved in the information gathering should start as soon as possible.

Do you control the test Data?

My interpretation of test data here is wide, in the sense that I do not only mean the ability to enter different types of inputs, in different variations and quantities. I also mean the ability to set up and load data sets that create test scenarios, and the ability to set or remove states in the software. Being able to control the data helps in speeding up test execution, in creating typical test situations and in quickly repeating a test case if necessary.

Having control of the test data is only one side of the story. The other side of the story is that you need to find the right ‘Trigger Data’ to use. Trigger Data is any data item, set of data or data state specifically created and used to invoke, enable or execute your test case (scenario).
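As an illustration, here is a small sketch of controlled test data that contains trigger data. The free-shipping rule, the threshold and all names are hypothetical and invented for this example only.

```python
# Hypothetical rule for illustration only: orders of 50.00 or more ship free.
FREE_SHIPPING_THRESHOLD_CENTS = 5000
STANDARD_SHIPPING_CENTS = 495


def shipping_cost(order_total_cents: int) -> int:
    """Return the shipping cost in cents for a given order total."""
    if order_total_cents >= FREE_SHIPPING_THRESHOLD_CENTS:
        return 0
    return STANDARD_SHIPPING_CENTS


# Controlled test data: a data set defined up front so the scenario is repeatable.
# The entries marked "trigger" are the ones that invoke the free-shipping branch.
test_data = {
    "ordinary order": 2000,
    "just below threshold": 4999,
    "trigger: exactly at threshold": 5000,
    "trigger: above threshold": 7500,
}

results = {name: shipping_cost(total) for name, total in test_data.items()}
for name, cost in results.items():
    print(f"{name}: {cost}")
```

Without the two trigger entries the free-shipping branch would never run, however much other data you control.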

Are the Analysts available?

Like that of the developers, the availability of the (business) analysts, both physically and in time, has an impact on the way you can interact with them. And like the developers, analysts will have preferences and are more or less willing to work with testers. The impact of this might however be larger, as analysts are often the first source of information about the product's intended functionality and its means of satisfying the stakeholders' needs and wants. They are often also a sort of gatekeeper in communicating with business stakeholders. In that sense they can make a tester's life more or less easy, especially if testers are not expected to go outside of the project's boundaries.

Are the (other) Testers available?

In my experience, working as the only tester on a project has an impact both on the way you work and, to some extent, on the quality of your work. Being able to pair, share thoughts or just have a chat with another tester can help you reconsider your work and develop new or different test ideas. The tester doesn’t necessarily have to be in your team to have this effect. Having other testers in your team brings the benefit (and sometimes burden) of being able to divide work, get fast feedback on test ideas or test results, and focus on or divert away from your strengths and weaknesses as a tester.

Do you have a quiet work Environment?

This question addresses two different aspects. The first aspect is the infrastructure. Do you know what its components are? Do you have a separate test environment? And if so, are you its only user? Do you know how to get access to it? Are you allowed to change it yourself or do you need others to do it for you? Is your test environment similar to the real production environment?

Secondly, it addresses the circumstances of your workplace. Do you work in isolation, in cubicles, or in a large open-plan office? Is your work uninterrupted or are you (in)voluntarily involved in other work processes and activities? Does that influence your performance and well-being? What the influences are obviously depends on you as a person and on the real circumstances. But it is wise to take note and consider possible consequences. There are many studies in this field. Here are a few articles that might trigger your interest: “Designing the work environment for worker health and productivity” by Jacqueline C. Vischer; “Interrupt Mood” by Brian Tarbox; or “Where does all that time go” by Michael Bolton.

Are the Stakeholders (that matter) available?

Stakeholders come in many forms and shapes, but they have one thing in common. They are in some way involved in the creation and/or use of the software solution. That not only means they need to be informed about the product; it also means that they have expectations and opinions about the product itself, what it is used for, and what the product needs to be able to do to make it valuable to them. As a tester you should identify these expectations and opinions and tailor your information about the product so that it is meaningful to them.

In theory the effort you put into gathering, tailoring and presenting that information is based on how much the stakeholder matters to the product, the project and, to some extent, to you the tester. I say in theory because to do so in practice the stakeholders need to be available and accessible. If they are not, or if reaching them is difficult, you should take the extra time and effort into account in your testing and test reporting.

Is there (mandatory) Tooling?

There are many types of tools available in the market to capture requirements, store test cases, log test execution or manage bugs. And likewise there are many tools available to use during testing. As a tester you need to find out which tools there are, which tools you are allowed to use, and which tools are mandatory to use. You might not know all the tools you are faced with, or be unable to use a tool that you already know and like. In that case you will have to get used to the ‘new’ tooling and learn to use it. Additionally, many tools have inbuilt workflows and processes that take time away from actual testing. As a tester you should be aware of this and take it into account when testing.

Poutsma principle

Whenever I start on a new test assignment or pick up a new work item I need to search and find its purpose, its meaning and I need to understand how the chosen requirements offer a solution to the problem that is solved. Sometimes that is really easy.

Say you visit the 36th International Carrot Conference before going to CAST 2013. You come home and decide to sell carrots for hungry rabbits online, and you want to vary the amount of carrots or differentiate the type of carrot for different breeds of rabbits. You will need something like a drop-down list or input field to identify the different rabbit breeds. And except for the sudden urge to sell carrots, this is fairly easy to understand and test.

If however you are asked to test the software implementation of calculating results for a new Credit Risk Model used by an international bank, you will have a lot more to understand. In that case I remind myself of the Poutsma Principle:

If something is too complex to understand, it must be wrong.

I use this principle to remind myself to keep asking questions until I either understand it or accept the argumentation for it as proof. In either case it helps me to break down requirements to a level that makes me confident enough to start testing, and daring enough to also use my personal addition to the principle:

And it is your job (as a tester) to prove it wrong.

If you want to know more about the Poutsma Principle you can follow this link.