Reference test

Regression Testing

A while ago a new developer approached me to discuss some changes he wanted to deploy to the test environment. He had refactored parts of the application's workflow management and redesigned a number of reports. We talked about it and he stressed that these were major changes that would take a lot of work to test. While we continued discussing I commented: “That’s okay. It will take some time, but we have a regression test set just for this that we can work with and that will help us…”

He looked at me in total bewilderment and uttered: “Didn’t you get anything of what I said? These are major changes. No regression test will cover that! They will all fail!”

I looked at him and realized…

He was right

His understanding is based on a common perception where regression tests mean something like running the same step by step tests as before to see that they have the same step by step results as before. And if that were the case he would obviously be right. There was no way I could execute the tests the same way as before simply because the workflow method and GUI functionality had changed dramatically. Nor could the results be exactly the same given the design changes he had made.

But I was right too

His reaction made me realize that over the years my understanding of a regression test set has changed dramatically. I don’t see test cases as something that provides a step by step description. I see test cases as a doable and executable follow-up of a test idea. A test case to me consists of the following elements:

  • A test idea; what do you want to learn, verify/validate or investigate?
  • A feature; which part of the test object do you investigate?
  • Test activities; a way to exercise your test idea and see how the software behaves
  • Trigger data; data that should trigger the behavior you want to see
  • An oracle; something that tells you how to value the information or data that the behavior produces

With that understanding I looked at the changes and concluded that the existing test cases were still useful. Even though the workflow management was refactored, I could still apply the test ideas we had used before. In essence the same features were affected and my oracles should still be valid. Adjusting the test activities and trigger data, in this case to match the workflow changes, is something I tend to do anyway. With these adjustments I try to make re-running test cases more effective, as variation adds new chances of finding new or different bugs.
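As a rough illustration, the five elements and the kind of adjustment described here could be sketched in Python as follows. This is only a sketch under assumptions: the class name `ReferenceTestCase`, the helper `order_is_approved` and all example values are hypothetical and do not come from the actual project.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple


@dataclass
class ReferenceTestCase:
    """Hypothetical container for the five elements of a test case."""
    test_idea: str                    # what do you want to learn or verify?
    feature: str                      # which part of the test object?
    test_activities: Tuple[str, ...]  # how the idea is exercised
    trigger_data: dict                # data that should trigger the behavior
    oracle: Callable[[dict], bool]    # judges the observed behavior


def order_is_approved(observed: dict) -> bool:
    # Hypothetical oracle: the observed order status must be "approved".
    return observed.get("status") == "approved"


original = ReferenceTestCase(
    test_idea="An order above the limit needs a manager's approval",
    feature="workflow management",
    test_activities=("submit order in old wizard", "approve as manager"),
    trigger_data={"amount": 10_000},
    oracle=order_is_approved,
)

# After the refactoring only the activities and trigger data are adjusted;
# the test idea, the feature and the oracle are carried over unchanged.
adapted = replace(
    original,
    test_activities=("submit order in new workflow screen", "approve as manager"),
    trigger_data={"amount": 12_500},
)
```

The point of the sketch is that `adapted` is a new way of exercising the same idea: nothing in it forces the old steps to be replayed, and the tester still decides how to vary the activities and data.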

With the reports the situation was slightly different. Test ideas, features, test activities and trigger data could stay more or less the same, but my oracles, the current template reports, had changed. The reports now had a new layout but their content had changed only minimally.
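One way to picture an oracle that survives a layout change is to compare report content rather than report appearance. The sketch below assumes, purely for illustration, that a report can be reduced to `label: value` lines; the function `report_content` and both sample reports are hypothetical.

```python
def report_content(lines):
    """Reduce a rendered report to its content: a mapping of label to value."""
    content = {}
    for line in lines:
        if ":" not in line:
            continue  # skip titles, rules and other layout-only lines
        label, value = line.split(":", 1)
        content[label.strip().lower()] = value.strip()
    return content


# Hypothetical template report in the old layout.
old_layout = [
    "=== Order report ===",
    "Customer: ACME",
    "Total: 120.00",
]

# The same information in a redesigned layout.
new_layout = [
    "ORDER REPORT",
    "------------",
    "total:     120.00",
    "customer:  ACME",
]

same_content = report_content(old_layout) == report_content(new_layout)
```

Here `same_content` is `True` although the two reports look nothing alike. Where the content itself has changed, even minimally, the template report still has to be updated by someone who understands why it changed.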

Reference test

To avoid future confusion I came up with a new name for my tests which, as far as I am aware, has not been used in software testing before. I will call these tests Reference Tests.

A Reference Test, then, is a test where, under similar circumstances, similar input results in a similar outcome when evaluated against an identical oracle.

I am aware that ‘similar’ and ‘identical’ in this definition are relative concepts, and I have chosen them on purpose. They express that each time such a test is used, the tester needs to be aware of which information it tries to show or uncover, how it does that and to what purpose. This discourages mindless repetition or pass/fail blindness. It encourages thoughtful selection and execution of tests and deliberate evaluation of test results.






4 thoughts on “Reference test”

  1. Hi Arborosa,
    this is an interesting topic. My first thought when I saw ‘reference test’ and your definition of it was: OK, that differs from ‘regression testing’. And it sounds good.
    But then I searched the web for the phrase ‘regression testing’ and found, amongst others, the following definition on Wikipedia (Regression testing): Regression testing is a type of software testing that seeks to uncover new software bugs, or regressions, in existing functional and non-functional areas of a system after changes such as enhancements, patches or configuration changes, have been made to them.
    I know that this is only one definition and no one can say that it is the correct one. But it also contains the phrase ‘testing… after changes such as enhancements, patches or configuration changes …have been made’.
    Now I’m a little bit confused.
    Is that not exactly what the developer wanted to tell you when he mentioned ‘These are major changes…’?

    Now I can’t understand what the difference really is between ‘regression testing’ and your ‘reference test’.
    In both ‘reference test’ and ‘regression testing’ you have, as far as I understand, at least similar inputs.
    About ‘similar outcome’ and an ‘identical oracle’ I’m currently not clear. When the developer mentioned that there are major changes (requirements and/or GUI elements on the SUT), do you really then have a similar outcome based on the same oracle as before?
    But I find the idea not bad, trying to distinguish between such ‘regression tests’ and others (= ‘reference tests’)!
    And in the end, when your stakeholders do recognize the difference and it all works fine for your context, why not?
    I hope you don’t get me wrong. Maybe I got something wrong. But I like to talk about ‘new stuff’ in the world of software testing.
    kind regards, Ralf


    • Hi Ralf,

      I can see where the confusion comes from. In both the definition you use and in my Reference Test the starting point is the same. There is a change to the software. Both also have, more or less, the same purpose, to uncover bugs that the change might have brought about.

      So if the difference is not in its cause or in its aim then what is the difference between a regression test and a reference test? (I would imagine you ask.)

      The essential difference is in what is actually done to test. In my experience, and in many explanations of how to do regression testing that I know, the idea is to literally re-execute existing tests and then judge whether there is a bug by expecting the same outcome as before.
      In a reference test, however, there is no test to literally re-execute. There is a test idea that defines what you want to learn, verify/validate or investigate. There is the (un)changed feature you want to investigate and there is some basic information on how to exercise the test idea. This information is not a step-by-step description but information on what tooling to use, which access points (API, DBase, page objects, URL, etc.) there are and how to approach them.
      So rather than rerunning a test that was made before, the aim is to re-investigate the test idea behind the test.

      Sometimes a change is not only a change to the code, like it was in my case, but also a change to the functionality. In that case I would evaluate my reference test to verify that the test ideas are still valid and assess whether I need to change my oracle. But if the changes are not so dramatic, a reference test is likely to be sufficient. Reference tests take more time than a regular regression test, as one needs to re-think the method and the data, but in my experience they have a higher value in discovering new information about the software.

      I hope this brings some more clarity.



  2. Hello Jean-Paul,

    your idea, which I really like, reminds me of something we used in a former project:
    We distinguished between abstract tests and concrete tests. The abstract tests were close to your reference tests: we decided what we were going to test on a functional basis, regardless of the actual steps necessary to perform the tests (in a nutshell: answering “what” and only a very rough “where”). Concrete tests (what, where, how and test data) were created later on for some abstract ones, but not all. The background was that we had no clue how requirements were going to be implemented GUI-wise, but still had to deliver test cases beforehand. This sometimes meant that, when doing regression tests, concrete tests could fail without the abstract ones failing (because of a GUI change, for example). This worked out pretty well for us.

    Best regards,


  3. Hi Jean-Paul,

    How are you? I came across your blog and I was wondering if you would be interested in guest blogging on

    Adding a blog post to TEST Huddle is easy as we have an upload resource option available on the site here:

    The sooner you upload your blog, the sooner we could add it to the blog schedule.

    I look forward to hearing from you,
    Kind regards,

