Lame Excuses for Not Load Testing: A Cautionary Tale
We all know the phrases. They are trite and lifeless, but still we use them:
- “I had a bad feeling that day, and I should’ve listened to it”
- “I had been dreaming all week of it!”
- “It was on my list of things to do that day….”
- “I always play that number…”
In the performance management sector especially, ignoring an opportunity to test and dismissing it with an excuse can be quite dangerous. Here are some of the lamest excuses that have become common in the sector. Have you ever used any of these?
“The odds of that scenario happening on my network are one in a million.”
Thanksgiving night 2012. The e-retailers of the specialty mall brands have been testing their sites since the summer, waiting for this night. The situations have all been rehearsed, and problem escalation procedures are already in the run books, just in case. This is the Big Game, the starting gate of high season, where bonuses are made or lost for performance managers. Zero downtime allowed until January 1. For two of those managers in particular, tonight is going to go in remarkably different directions.
For three hours, from 8 PM to 11 PM, one mall clothing retailer’s site encounters a problem severe enough to take it offline. Its competitor’s site will, as a result, experience an hour or two of traffic that probably exceeds 150% of the highest level they had anticipated for that night. Months later, some analysts in the retail sector will credit this night as the turning point in the battle between these two competitors, as the brand that lost its site that night has since lost significant market share to the other.
OK, so your odds of being dealt a royal flush in a given hand of five-card poker are about 1 in 650,000. Your odds of dying in an airplane crash are widely held at about 1 in 355,000. And your odds of hitting the big prize in the average lotto are around 1 in 2 million, with the huge payouts being more remote. But when they start talking about the big jackpot on the radio, you still buy a ticket. Because no matter what the size of the jackpot, one thing is for sure: your odds go up remarkably when you go from playing zero tickets to playing one ticket.
Conversely, the “payoff” in the performance arena for hitting that one-in-a-million glitch in prime time is big winnings… for your competitor. And with a product such as LoadUIWeb Pro, setting up and running new tests, even the ones that come to you in a bad dream, is quick. These tests can easily be run from the cloud as well, imitating the thousands of shoppers who are going to hit your site. With it this easy, why the lame excuses?
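To put a number on that "one in a million," here is a minimal sketch in Python. It rests on assumptions the story above does not state: that the odds apply to each request independently, and that the request counts shown are purely illustrative. The function name is made up for this example.

```python
import math


def chance_of_at_least_one(one_in: float, events: int) -> float:
    """Chance that a 1-in-`one_in` glitch shows up at least once
    across `events` independent requests (an assumed model)."""
    p = 1.0 / one_in
    # 1 - (1 - p) ** events, written with log1p/expm1 so that very
    # small probabilities do not lose precision.
    return -math.expm1(events * math.log1p(-p))


if __name__ == "__main__":
    # Illustrative request volumes, not measurements from any real site.
    for requests in (10_000, 1_000_000, 5_000_000):
        chance = chance_of_at_least_one(1_000_000, requests)
        print(f"{requests:>9,} requests: {chance:.1%} chance of the glitch")
```

Under those assumptions, the glitch becomes more likely than not once traffic reaches the same order of magnitude as the odds themselves, which is exactly the kind of night a retailer is hoping for.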
“We hired a load testing service last year, and they certified our network then. I am confident.”
Wow, that’s odd. Didn’t you just say that you released a new application last month? Has your network operations team done any grooming, software upgrades or other infrastructure enhancements in that year?
Even if your application has remained exactly the same as when your service was tested, and as far as you know your NOC team has let the infrastructure remain static, stale routes and dead links happen. Traffic can increase across the segments of the infrastructure your application uses, simply because the discovery features in routing protocols found ‘your’ segments less congested. A slow memory leak on a server can finally reach the point where it is impacting production processing.
Just like on your car, periodic preventative maintenance is recommended to catch these small degradations in service before they become bigger, more expensive issues. With LoadUIWeb Pro, periodic maintenance can be as easy as clicking one button in the product and letting a test run. The report compiles automatically, and you can then compare the results to previous maintenance tests. You might even be able to bird-dog that failing firewall for the NOC operators before they see it themselves.
How long do you go between checking the oil in your car?
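The idea of comparing each maintenance run to the previous ones can be sketched in a few lines of Python. This is not how LoadUIWeb Pro builds its reports; it is a generic illustration that assumes you can export response times in milliseconds from each run. The function names, the 20% tolerance, and the sample values are all hypothetical.

```python
from statistics import quantiles


def p95(samples_ms):
    """95th percentile of a list of response times (needs 2+ samples)."""
    # quantiles(..., n=20) returns 19 cut points; the last is the 95th.
    return quantiles(samples_ms, n=20)[-1]


def has_regressed(baseline_ms, current_ms, tolerance=0.20):
    """True when the current run's p95 is more than `tolerance`
    (an assumed 20% by default) slower than the baseline's p95."""
    return p95(current_ms) > p95(baseline_ms) * (1 + tolerance)


if __name__ == "__main__":
    # Made-up response times standing in for two exported test runs.
    last_quarter = [110, 120, 118, 125, 130, 115, 122, 128, 119, 121]
    this_quarter = [140, 155, 150, 162, 171, 149, 158, 166, 152, 160]
    if has_regressed(last_quarter, this_quarter):
        print("p95 response time has degraded; time to look under the hood")
    else:
        print("p95 response time is within tolerance of the last run")
```

Comparing a high percentile rather than the average is a deliberate choice here: slow degradations such as a memory leak tend to show up in the slowest requests first.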
“We’ve got high-availability and backups. Heck, our backups have backups!”
When I performed disaster recovery consulting, my favorite trick was to call a meeting with all the “important” people on the project. I would then describe a disaster that just that second took out the very conference room where we were meeting. I always "got" a bunch of the strategic people on the project, but I usually "got" a lot of the key tactical people too.
We live in an age where we have seen the unexpected, where mega-storms flood subway tunnels in Manhattan, and volcanoes in Iceland disrupt trans-Atlantic airline travel. We never know when the unavailability of one part of our infrastructure might cause stress on other portions of our system. However, if we have the right testing tools, we can assure ourselves that we have thought of virtually everything that can hit our application and knock it over, and that we have tested our attempts to harden the system against those hits. The right testing tool has to be robust enough to model the nightmares we build for it to measure, but those nightmare scenarios also have to be quick and easy to set up, run, modify, and re-run.
Developers of testing tools everywhere are still working on the “Turn-back-time” option for their products. Until that feature becomes GA, Performance Managers will have to rely on a proactive test strategy to keep their worst dreams from escaping off the pillow and into production systems. Lame excuses, like indigestion on Thanksgiving night, should be avoided at all costs.