Managing Risk in Software Testing

How does your team come up with a software testing strategy? For many teams, the answer is "we don't," at least not really. Instead, the team does what someone else defined a long time ago, perhaps something like run all the test cases, then rerun the ones that fail once there is a fix. Other teams haven't thought about the question at all and release as soon as the feature testing is done. These strategies can work for a while, but eventually managers will wonder why the release is taking so long, get angry calls from customers about bugs, or start asking questions about "coverage." Not all test ideas are created equal. Some are more likely to find problems, and some matter because you don't want a problem appearing in a critical part of the software. Here are a few practical ideas for prioritizing testing work based on risk, and for talking about that risk with the rest of the team.

How to identify risk

Most software testers are only vaguely aware of product risk. They just test everything and report bugs. For example, a tester might be working on a new text field and enter something like a string with 256 characters, or a string with special characters in it, and when the submit button is clicked, boom: a buffer overflow error pops up in the browser to let them know the field wasn't designed to handle more than 255 characters, thanks to the intersection of the variable type and the field size in the database.

Both of those 'quick attacks' above are designed to expose a certain type of risk, a way the product could fail and become less valuable to a customer. They are also emergent: the testers weren't really aware of the risk until it was in their face. Perhaps, during the testing, some core use cases emerge and the testers get the idea of protecting those use cases.
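As a rough sketch, here is what those two attacks might look like once they are captured as automated checks. This assumes pytest, and save_text_field is a hypothetical stand-in for whatever code actually backs the form; the 255-character limit comes from the example above.

    import pytest

    MAX_LENGTH = 255  # assumed database column size from the example above


    def save_text_field(value: str) -> str:
        """Hypothetical stand-in for the real form handler; returns 'ok' or an error."""
        if len(value) > MAX_LENGTH:
            return "error: value is too long"
        return "ok"


    def test_256_characters_is_rejected_gracefully():
        # Boundary attack: one character past the column size should produce a
        # friendly validation message, not a buffer overflow in the browser.
        assert save_text_field("x" * 256) == "error: value is too long"


    @pytest.mark.parametrize("value", ["<script>alert(1)</script>", "O'Brien", "名前", "🙂"])
    def test_special_characters_are_accepted(value):
        # Special-character attack: these inputs should be handled cleanly,
        # not crash the page or corrupt the data.
        assert save_text_field(value) == "ok"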

Discovering risk ahead of time is a communication exercise.

At the release level, there are a couple of places to start. Step one is to look at everything that changed over the past couple of weeks and make a list. Once that list exists, the testers can talk with the programmers about what those changes impact. A change to a date field might alter how data is viewed in other parts of the software, how it is stored or reported on, or how a page displays. After the programmers describe the dependency tree of a change, start asking about general concerns and gut feelings. There are always other consequences to think about.
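One way to get that first list, assuming the team's history lives in git, is a small script along these lines. The grouping by top-level directory is just a rough proxy for "area of the product"; the point is to produce something concrete to walk through with the programmers.

    import subprocess
    from collections import Counter

    # List every file touched in the last two weeks of commits.
    log = subprocess.run(
        ["git", "log", "--since=2 weeks ago", "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    ).stdout

    # Tally changes by top-level directory as a rough stand-in for product area.
    areas = Counter(
        line.split("/")[0]
        for line in log.splitlines()
        if line.strip()
    )

    for area, touches in areas.most_common():
        print(f"{touches:4d} changes  {area}")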

Things are a little more granular at the feature-testing level. Testers loosely think about risk for features in terms of 'test cases', the series of steps it will take to expose some type of problem in the software. It might be more productive to leave the steps out of the equation and think in terms of test ideas, scenarios, or flows.

Test ideas are much more general.

Instead of asking what happens when someone enters 123456789 into an age field, a test idea might zero in on the risk that the software doesn't correctly handle things that aren't real ages, perhaps with a few examples. Test cases have a tendency to restrict people's behavior. When people see a set of explicit steps, many will just follow them without question. Explicit steps also tend to convince management that testing is completely procedural, and that once the test cases exist the people running them are interchangeable. Talking about risk and ideas instead of cases and steps opens testing up to personal judgment and variability, which is where the deep bugs are often hidden.
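As a minimal sketch of that idea in pytest, the risk statement becomes one parametrized check with a handful of examples. parse_age here is a hypothetical stand-in for the real field's validation, and the example values are illustrative, not an exhaustive procedure.

    import pytest


    def parse_age(value: str) -> int:
        """Hypothetical stand-in for the real age-field validation."""
        age = int(value)              # non-numeric input raises ValueError here
        if not 0 <= age <= 130:
            raise ValueError(f"not a plausible age: {value!r}")
        return age


    # The test idea, "the software doesn't correctly handle things that aren't
    # real ages," expressed as a few examples rather than a scripted procedure.
    @pytest.mark.parametrize("bad_age", ["123456789", "-1", "0.5", "forty", ""])
    def test_values_that_are_not_real_ages_are_rejected(bad_age):
        with pytest.raises(ValueError):
            parse_age(bad_age)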

Ranking and taking action

Creating a list of risks is a pretty good start, but it can get much better. The next step is to make that list visible and sorted by priority.

Doing that requires two things: a big wall that people walk by often, and a roll of painter's tape.

First, create a few rows on the wall with the tape. Each row represents an area of the product, or a team, or any other meaningful way to think about a part of the work. Once the frame is there, write down the risks on sticky notes, one risk per note, and put them in columns. The team can arrange them initially based on what they think is most important. Important can mean the feature will be used by a sensitive customer, or that it touches a delicate part of the product. The point is to get the team thinking about what should be tested and when.
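For teams that also want a sortable copy of the wall, the same structure fits in a few lines of Python. This is only a sketch; the areas, risks, and priority numbers below are made-up placeholders for real sticky notes.

    from itertools import groupby

    # One entry per sticky note: the row (area), the risk, and its current rank.
    risks = [
        {"area": "reporting", "risk": "new accounting report totals drift", "priority": 1},
        {"area": "reporting", "risk": "CSV export drops non-ASCII names", "priority": 3},
        {"area": "search", "risk": "library upgrade changes result ranking", "priority": 2},
        {"area": "admin", "risk": "management tool demo breaks on empty data", "priority": 1},
    ]

    # Sort within each area by priority, then print the wall as text.
    risks.sort(key=lambda r: (r["area"], r["priority"]))
    for area, notes in groupby(risks, key=lambda r: r["area"]):
        print(area.upper())
        for note in notes:
            print(f"  {note['priority']}. {note['risk']}")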

This first draft of the risk calendar usually has some flaws.

The list comes from only one point of view, the testers', and because of that it could use a little redecorating. This is where having the list in the hallway becomes important. Development managers will walk by and see what is happening, people from the product team will pass on their way to meetings and get a view of the release work, and the CEO might even get a peek when she isn't busy rustling up the next round of funding or visiting a customer. Each of these people will have their own perspective on what is important.

The product person might recognize that three customers are waiting on a new accounting report to go live and bump that up to the top of the risk calendar. The project architect might see an Elasticsearch upgrade and move it up a few slots because of how many parts of the software use that one library; any problem there would be seen all over the place. The CEO might notice that testing for a new management tool is at the bottom, but move it up because there is an important product demo next week. Everyone adds their own opinion to the risk calendar based on what they know about the software, the schedule, and the customer. The only thing to watch out for here is too-many-cooks syndrome: one person might accidentally cancel out another person's opinion without knowing the reasoning behind it.

This risk calendar doesn't arrive perfect before the first build. Instead, it emerges and changes with time. One feature might be nearly done when what is important changes altogether overnight. Priorities can change without public lists, but when the list is there, people can see the change happen and have a conversation about it. Hopefully, questions like "Why are you working on this? I thought you were working on reporting" go away completely.

Software testing is all about discovering risk: what will go wrong when the customer uses the product, and how that will make them question its value. Thinking about testing up front and organizing some of the testing work around those ideas can completely change a team's view of what is important.

Start by creating a risk list, and then see where it takes you. 

Additional Resources