Bias and the Human Side of Software Testing

  July 18, 2013

Bias is everywhere. That is my unbiased opinion.

But most discussions about bias lean toward the negative impact of someone or some process being biased. They should, because bias is, by definition:

a : bent, tendency

b : an inclination of temperament or outlook; especially : a personal and sometimes unreasoned judgment : prejudice

c : an instance of such prejudice

d (1) : deviation of the expected value of a statistical estimate from the quantity it estimates (2) : systematic error introduced into sampling or testing by selecting or encouraging one outcome or answer over others

Prejudice? Deviation? Error? Those are strong words.

We live in a world that desperately wants there to be a middle ground where everyone can meet, free of any bias, and find common ground. That’s a utopian view that has little to do with reality. And looking at the above definition, it’s not just a philosophical or political issue; it’s also a statistical one.
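To make the statistical sense of the definition (d) concrete, here is a minimal sketch in Python. It is an illustration, not something from the dictionary entry: the four-value population and the sample size of 2 are assumptions chosen so every possible sample can be enumerated exactly. It shows that an estimator which divides by n systematically underestimates the population variance, and that the gap between its expected value and the true value is exactly what definition (d)(1) calls bias.

```python
from fractions import Fraction
from itertools import product


def mean(values):
    """Exact arithmetic mean of a sequence of Fractions."""
    return sum(values, Fraction(0)) / len(values)


def variance_divide_by_n(values):
    """Variance computed by dividing by n (a biased estimator on samples)."""
    m = mean(values)
    return sum(((v - m) ** 2 for v in values), Fraction(0)) / len(values)


# Hypothetical population, small enough to enumerate every sample.
population = [Fraction(x) for x in (1, 2, 3, 4)]
true_variance = variance_divide_by_n(population)

# Every equally likely sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))
expected_estimate = mean([variance_divide_by_n(s) for s in samples])

# Bias = expected value of the estimate minus the quantity it estimates.
bias = expected_estimate - true_variance
print(true_variance, expected_estimate, bias)
```

The bias here is negative and does not shrink no matter how many times the sampling is repeated, which is the point: more data collected the same way does not remove a systematic error.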

Bias exists whether we choose to acknowledge it or not. It is the choice to acknowledge its existence that alters how we view the world and experience our lives and, in the case of software, can affect the effectiveness of testing.

Types of Bias and Impact on Software Testing

The existence of bias is a hot topic in the world of software testing. Debate rages on about the types of bias there are, how they are introduced and, most importantly, how they can be prevented so results are as clean as possible.

Here’s an admittedly non-exhaustive overview of various types of testing bias. I am quick to label it non-exhaustive because this is a blog post, and volumes upon volumes are being produced on each type of bias as we speak. Also, since bias can be subjective, new takes on the subject are being developed all the time. Let us know if we should delve into each one more extensively in the future.

The following examples of a few types of bias come from Del Dewar’s “The Tao of Test” blog, which he attributed to various sources. The list includes, but is not limited to:

Observational Bias: This can occur early in a product life cycle. When examining a text list of a product’s requirements, the existence of wireframes can skew results because it can cause testers to miss elements of the text requirements that don’t get a fair shake in the wireframes. In other words, results are biased because a picture may not be worth a thousand words in a software tester’s world.

Reporting Bias: The way data is reported can bias the reaction to that data. Data can be presented in many ways, which is both the curse and the beauty of it. This example from blogger Pete Houghton shows the potential trouble:

“For example what if the issue is: that a website has several serious issues when viewed in a particular web browser, but not in a more 'mainstream' browser. When this issue is presented to the decision maker - How could it be presented?

A) Users of Browser XYZ ... can't play/view the video

B) A browser used by < 1% of our users ... can't play/view the video”

His point is that while the second option seems to give more information, it doesn’t address the real problem. It only addresses the result of the problem. If not explored further, the problem itself could be missed completely.

Survivorship Bias: This is an interesting concept that Houghton explores as well. He illustrates how advertising messages can be biased with the example of a company saying, “99% of our customers are satisfied.” Sure, that sounds nice and reads well, but the story behind the 99% could be that it is 99% of those who fought through the myriad issues with a product and survived the process to get what they can from the product or service. This is interesting in that we are wired to hear a 99% rating as a positive, but when you peel back the layers it may not be positive at all.

In testing, this kind of mindset could easily skew results and allow problems to persist even though, on the outside, everything appears well.

Confirmation Bias: A paper by researchers at a Turkish university defines confirmation bias as follows:
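The arithmetic behind the “99%” claim can be sketched in a few lines of Python. Every number below is hypothetical and chosen only for illustration; none of it comes from Houghton’s post. The sketch also assumes the worst case, that none of the customers who gave up were satisfied, so the overall figure is a lower bound rather than a measurement.

```python
# Hypothetical figures, for illustration only.
started = 1000    # customers who began using the product
survived = 200    # customers who got past the issues and could be surveyed
satisfied = 198   # surveyed customers who said they were satisfied

# The rate the advertisement reports: satisfied among survivors only.
survivor_rate = satisfied / survived

# The rate among everyone who started, assuming (worst case) that
# no customer who dropped out was satisfied.
overall_rate = satisfied / started

print(f"Satisfied among survivors: {survivor_rate:.1%}")
print(f"Satisfied among all who started: {overall_rate:.1%}")
```

Both figures are computed from the same 198 people. The only thing that changes is the denominator, which is the question a tester should ask whenever a pass rate looks suspiciously good.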

“During all levels of software testing, the goal should be to fail the code to discover software defects and hence to increase software quality. However, software developers and testers are more likely to choose positive tests rather than negative ones. This is due to the phenomenon called confirmation bias which is defined as the tendency to verify one’s own hypotheses rather than trying to refute them.”

Confirmation bias is far-reaching and can include almost any definition of bias you would like. As one writer put it, “this is the big daddy of all biases since it has so many variations”. Those variations, as laid out in a presentation by Michael Bolton, include assimilation bias, belief perseverance, availability bias, anchoring bias, recency bias, congruency bias and experimenter bias. And that’s only a partial list!

As pointed out earlier, it is impossible to cover the full spectrum of potential bias that could impact software testing in a single blog post. There is one more worth looking at, though, and it is harder to define because it addresses the human side of software testing. It’s almost more of a marketing term, but it’s hard to imagine that it doesn’t exist.

Emotional or Experimenter Bias: What if the tester has a particular bias toward the subject of what is being tested? What if there are personal issues that limit a tester’s ability to be neutral? Maybe there is a violent edge to a piece of software and the tester is against that kind of thing. What if the software being tested was for adult entertainment? What if it had to do with something as current and hot as the NSA scandal?

It’s this last form of bias that could be the most insidious threat to the success of a software testing project. It’s also the one that may not be identifiable and, even if it were, could it be measured?
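The difference between the positive and negative tests mentioned in the quoted definition can be sketched as follows. The function parse_age, its 0 to 150 range, and the inputs are all hypothetical, invented here purely for illustration. The positive tests confirm the hypothesis “the code works”; the negative tests try to refute it.

```python
def parse_age(text):
    """Hypothetical function under test: parse a human age from user input."""
    value = int(text.strip())  # int() raises ValueError on non-integer text
    if not 0 <= value <= 150:
        raise ValueError(f"age out of range: {value}")
    return value


def rejects(text):
    """Return True if parse_age refuses the input with a ValueError."""
    try:
        parse_age(text)
    except ValueError:
        return True
    return False


# Positive tests: inputs we expect to work. These are the tests that
# confirmation bias nudges us to write first, and often exclusively.
positive_results = {
    "42": parse_age("42") == 42,
    " 7 ": parse_age(" 7 ") == 7,
}

# Negative tests: inputs chosen to make the code fail.
negative_inputs = ["", "abc", "-1", "151", "4.5"]
negative_results = {bad: rejects(bad) for bad in negative_inputs}

print(positive_results)
print(negative_results)
```

A suite that contains only the first dictionary would pass even if the range check were deleted entirely, which is how a biased selection of tests lets defects through.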

What To Do About It

So what have we accomplished here? We have defined bias and identified only a few of the countless variations of bias that could impact the results of software tests. In some ways, it may feel as if there is no hope for completely removing bias from software testing.

Many talk about the removal of bias in testing. One area we didn’t examine earlier is organizational bias about testing, which simply means you work in an environment where testing is undervalued or not respected. If that doesn’t skew a tester’s point of view, nothing will, and it can lead to many other forms of bias, including those discussed earlier.

How is that possible, you ask? In an interview at stickyminds.com, Keith Klain, head of Barclays Capital Global Test Centre and keeper of the Quality Remarks blog, puts the onus on testers themselves, stating:

“If people are ignoring the information being produced by the testing team, in my opinion – that’s the test team’s fault. Testing produces some of the most vital information to make business decisions about risk, release dates, and coverage – how can that information be ignored! Speak the language of your project to understand what 'value' means to your business. When you align your testing strategy and reporting methods to those, I guarantee you will not be ignored. In our organization, the responsibility of ensuring testing gets the focus it deserves lies with the test team, and no one else.”

To sum it up, Klain is saying that testers can be more vocal and stand their ground to gain respect, which might limit bias, even bias toward their own roles.

Regarding software test bias in general, here’s another point of view. In their book “Lessons Learned in Software Testing”, Cem Kaner, James Bach and Bret Pettichord offer:

“You can’t avoid these biases. They are, to a large extent, hard-wired into our brains. What you can do is manage the biases. For instance, just studying biases and practicing at becoming aware of them, you can become better equipped to compensate for them in your thinking. Diversity is also a protection against too much bias. If multiple testers brainstorm tests together, that can minimize the impact of any one tester’s bias.”

As with any question that has any number of opinions and options as answers, we will ultimately need to decide for ourselves.

The answer could very well be that there is no way to completely remove bias from testing. That’s a hard pill to swallow, particularly for personality types that are used to addressing complex issues, finding solutions and getting concrete answers, and that are not comfortable with loose ends.

One way to approach this is from the other side of the computer screen. While end users don’t like buggy software, they are also more forgiving than those in the testing industry may think. In other words, they realize that perfect doesn’t exist, especially in the world of technology. Even users of products with cult-like followings, like Apple’s, see flaws in their favorite things, but being survivors (meaning they fought through their issues and became part of the 99% who are ‘happy’), they accept the imperfections.

That’s not to say that every effort shouldn’t be made to remove any form of bias from a software testing environment. It should. The expectation, however, is that an acceptable amount of bias will make its way through. The job of every tester is to mitigate its impact and ensure that the end product is performing at its optimal level with the fewest hiccups possible. Will the process be perfect? Since human beings are involved, I would say no, and maybe that’s good enough.

What are your thoughts?
