Test Automation Pitfalls: Don't Get Trapped!

  August 19, 2015

Automation has been around for a long time now, at least since the 1950s. At the insistence of management at every software company I've worked with, I've been trying to make it useful for the past decade.

That game is a tricky one. One common strategy, for example, is to decide what the automation architecture will be, then go do it. If that first guess (made without any experiments) is wrong, you could end up with an expensive project that doesn't do much, or isn't used, or can't keep its promises.

 I've learned some important lessons in that time; I'll share some of those here so you don't have to wait 10 years to learn them yourself.

 It Isn't Actually Testing

The hardest conversations I've had around automation projects have been with angry managers -- usually about why a bug wasn't found. Sometimes the answer is that a check wasn't designed appropriately, but often the answer is that automation isn't really testing.

Testing -- investigation, learning, and making decisions and judgements about a product -- is an important part of creating automation. I tend to find a lot of bugs during that part of the process. Once the tooling is complete -- coded, scripted or recorded -- the tool can run. When the tool runs, it is not doing that investigation and learning. All the computer actually does is drive the software and then ask a series of yes and no questions to determine the results. This idea of specific-action/inspect-result is something Michael Bolton calls checking, not testing, and I've tried to use that distinction here. Checking is a great tool for change detection, for catching regressions when new changes break old code, but the only bugs it can find are the ones you anticipated ahead of time.
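To make that concrete, here is a minimal sketch of a check in Python. The discount function and the expected value are made up for illustration; they stand in for driving a real product.

    # A minimal "check": perform one pre-scripted action, then ask a
    # single yes/no question about the outcome. The discount function
    # is a stand-in for exercising the real product.
    def apply_discount(price_cents, percent):
        return price_cents * (100 - percent) // 100

    def check_ten_percent_discount():
        result = apply_discount(2000, 10)
        # The only judgement made at run time is this comparison,
        # and it was decided before the check ever ran.
        assert result == 1800, "expected 1800, got %s" % result

    if __name__ == "__main__":
        check_ten_percent_discount()
        print("check passed")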

 When I get a new bit of software to test, usually there is information coming at me from many different places -- product specifications and user stories, the product managers, the programmers that wrote the code, and similar functionality in the same product or even competing products. All of this comes together and helps me decide what is currently the most important thing to focus on, the types of problems the customer will care about and where I might find them, and where I should take my testing next.

 Taking in new information and making decisions about what to do and which observations are important is part of the magic of testing.

 Again, the automation alone can't do that. Automation strategies are based on defined procedures. When the test is running, new information isn't considered or even observed.

 Critical Mass

Automation strategies often start with something like a smoke test, probably in the user interface (UI), that can be run with every build or at a minimum every release. From there, maybe you progress to working with product managers to figure out a few more scenarios that absolutely must work before you can even think about shipping. After that, things get bigger and bigger, sometimes under the guise of reducing the time it takes to do regression testing, or cutting that time out altogether.

At some point during this progression, every automation project hits an important threshold. You get so many tests that running them takes a long time. The feedback loop starts to slow; eventually the results arrive the next business day instead of the next hour. Maybe you start running tests in parallel on a couple of servers to speed things up. That leads to a different problem: you have so many tests that every minute is taken up fixing tests that broke because of product changes.

A few companies -- some of which you know well and whose products you probably use daily -- have tried an "automate all the tests" strategy. These strategies are pendulums. They swing from one extreme, an army of testers spending day after day running highly detailed tests, to the other extreme of teams of programmers developing and running suites of automated checks in all the different layers of the product. The side with an army of testers has problems with how long it takes to get a product delivered. The "automate everything" side eventually hits a point where there are so many tests to maintain that testing anything new means hiring one more programmer.

 The Technical Tester

Test automation is basically creating a second software product that runs against the product you are selling. The growth and development of both are important. At some point, sooner or later in the project, you'll find something that needs to be calculated -- a date that should be "today's date plus two, except Monday if it overlaps a weekend, except Tuesday if that Monday is a holiday." Someone has to write some code, the algorithm, for the check to figure out if the software passes or fails.
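As a rough sketch, the algorithm for that date rule might look something like this in Python. The holiday list is a made-up placeholder; the real rule and the real dates would come from the product team.

    from datetime import date, timedelta

    # Hypothetical holiday list; the real one would come from the product's rules.
    HOLIDAYS = {date(2015, 9, 7)}  # e.g., Labor Day

    def expected_due_date(start):
        """Start date plus two days, rolled to Monday if that lands on a
        weekend, and to Tuesday if that Monday is a holiday."""
        due = start + timedelta(days=2)
        while due.weekday() >= 5:      # 5 = Saturday, 6 = Sunday
            due += timedelta(days=1)   # roll forward to Monday
        while due in HOLIDAYS:
            due += timedelta(days=1)   # roll past the holiday
        return due

    # The check compares what the product displays to this computed value.
    assert expected_due_date(date(2015, 9, 4)) == date(2015, 9, 8)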

 Sooner or later, someone has to program, at least a little. Technical know-how remains important, even if you have the full support of the development group. Doing automation at anything more than a trivial level requires a special type of tester that understands at least how to read and modify a little code.

If you are working in the UI, you'll need to know about the document object model (DOM), along with JavaScript or whatever front-end technology is in use, to create tests that can withstand changes like a button being moved around on the page. Under that, at the service layer, understanding the REST architectural style and a little about how web services work is important. Under the service layer, we are almost exclusively testing with code.
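For example, a UI check that locates a button by a stable attribute rather than by its position has a better chance of surviving a redesign. Here is a minimal sketch, assuming Selenium's Python bindings; the URL and the data-test-id attribute are made up for illustration.

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    driver.get("https://example.com/login")  # placeholder URL

    # Locating by a stable attribute survives the button being moved or
    # restyled; locating by position or a layout-dependent XPath does not.
    submit = driver.find_element(By.CSS_SELECTOR, "[data-test-id='login-submit']")
    submit.click()
    driver.quit()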

Demand is growing for testers with a good amount of technical ability, or for developers with some working knowledge of testing (who are actually interested in testing), but outside of tech hubs like Silicon Valley, and maybe Seattle or Boston, these people can be hard to find.

 False Positives

False positives, or tests that fail when there really is no product problem, are a disease for automation suites. This problem is most common when the user interface is in flux and the tool focuses on it. For example: a check for login will fail when you add a new required field to the login form. Well, of course it "failed" -- you didn't fill in the field! Each time this happens, a person has to read the failure logs, find which test failed, rerun the offending script (sometimes multiple times) and spend time analyzing the failure. Sometimes these failures are caused by timing issues, where the tool is trying to click buttons or enter text on a page that hasn't completely loaded.
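For the timing case, the usual fix is to wait explicitly for the element to be ready instead of clicking the moment the script reaches that line. A minimal sketch, again assuming Selenium's Python bindings, with a hypothetical URL and login-submit id:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Firefox()
    driver.get("https://example.com/login")  # placeholder URL

    # Wait (up to 10 seconds) until the button is actually clickable,
    # instead of clicking immediately and failing on a half-loaded page.
    button = WebDriverWait(driver, 10).until(
        EC.element_to_be_clickable((By.ID, "login-submit"))
    )
    button.click()
    driver.quit()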

Other times the product has changed in ways a script can't cope with, and the script has rightly failed. This kind of failure requires someone to take action -- updating the test script -- but it still takes up time and doesn't point to a problem customers would actually care about.

 One company I worked with had a large automation strategy that focused on the user interface. Every day before leaving for the evening, I'd type the magic words into a command line to kick off a set of checks that would run for a few hours and then email out an HTML file with the pass and fail count and a little snippet of the stack trace for the tests that failed.

On a good day, I'd get to the office in the morning and see that somewhere around 80 per cent of the tests had passed. I'd spend about half of a day like that rerunning the checks that failed, trying to figure out what had happened. On days that weren't so good, I'd get in and find that something like 80 per cent of the tests had failed. The managers had of course already seen the email and replied demanding answers. There were two possible scenarios: either something had happened to the test environment that caused everything to go bad, or something sweeping had changed in the product and affected a lot of tests.

The bottom line was that we had an automation strategy returning information so inconsistently that we couldn't trust the results, whether they were good or bad.

 Getting Away With Fewer Testers

 The next step after signing on to automate as much as possible is usually trying to slowly transition the testing staff to other parts of the company like support, development, or product management. Maybe the testers that were originally there aren't explicitly fired or laid off, but over time there are none left. When this happens, all of the testing skill is cycled out of the company and what is left is two teams of programmers, one writing code for the product that gets sold and the other writing code to test the product.

 This directly helps to get more test automation but it also sells off all of the testing skill, or at least all of the people devoted to studying testing, in the organization. Instead of a team that can find ways the product might fail, there is a team focused on confirming that the product does work through specific examples.

 There is a saying: be careful what you wish for, you might get it.

That's just a handful of the problems I have seen teams struggle with when it comes to tooling -- the worst ones, the ones that repeated most often. Next time, we will talk about some real strategies you can use to avoid these traps and develop a successful and useful automation project.