By the time any software development project nears completion, it likely will have gone through numerous tests, particularly in an Agile environment where testing and development happen concurrently. But no matter how many tests you’ve run, once your application is nearly complete, there’s really only one way to know whether or not your software can handle the actual demands your army of end users will soon be placing on it. It’s called load testing.
As the best known and most commonly conducted type of performance testing, load testing involves applying ordinary stress to a software application or IT system to see if it can perform as intended under normal conditions. It is related to its bigger, more brutal cousin, stress testing, but load testing ensures that a given function, program, or system can simply handle what it’s designed to handle, whereas stress testing is about overloading things until they break, applying unrealistic or unlikely load scenarios. Both practices can play important roles in determining exactly how well a given piece of frontend software, such as a website, or a backend system, such as the Apache server hosting that site, can deal with the actual loads they’re likely to encounter through regular use. Stress testing deliberately induces failures so that you can analyze the risk involved at the breaking points, and then, perhaps, choose to tweak programs to make them break more gracefully. Stress testing is useful for preparing for the unexpected and determining exactly how far a given system can be pushed, exploring the outer limits of performance capacity. But when it comes to simply making sure that a software application or physical network can endure the user requests and actions it is likely to encounter in ordinary circumstances, load testing is the right method for the task.
Of course, if your application isn’t actually ready for the expected demands, a test that was launched as a load test can turn into a stress test while it’s running. Once the load starts causing things to break, from that moment on you are, by definition, stressing the system. This is the main reason the terms are so often confused: the exact same test may turn out to be a load test in some situations and a stress test in others.
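To make that distinction concrete, here is a minimal Python sketch of how a ramped test run might be labeled after the fact. Everything in it is an assumption made for illustration: the `StepResult` type, the 1% error threshold, and the sample numbers are invented, not taken from any particular tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StepResult:
    users: int         # simulated concurrent users during this step
    error_rate: float  # fraction of requests that failed, 0.0 to 1.0


def first_breaking_step(steps: List[StepResult],
                        max_error_rate: float) -> Optional[StepResult]:
    """Return the first step whose error rate exceeds the threshold, if any."""
    for step in steps:
        if step.error_rate > max_error_rate:
            return step
    return None


def describe_run(steps: List[StepResult], max_error_rate: float = 0.01) -> str:
    """Label a run that was planned as a load test up to the expected peak."""
    broke = first_breaking_step(steps, max_error_rate)
    if broke is None:
        return "load test: the system held at every step"
    return f"stress test from {broke.users} users onward"


# Invented results for a ramp up to an expected peak of 1,000 users.
ramp = [StepResult(250, 0.0), StepResult(500, 0.002), StepResult(1000, 0.08)]
print(describe_run(ramp))  # prints "stress test from 1000 users onward"
```

The same ramp, stopped at 500 users, would be reported as an ordinary load test, which is exactly why the two terms blur in practice.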
Understanding Load Testing
Load testing is about creating production simulations with an application or system that is as close as possible to a finished product, ready to be deployed and exposed to the masses. Using specialized testing software, dev teams can answer questions like “Is my system doing what I expect under these conditions?” and “Is its performance good enough?” As the Microsoft guide Performance Testing Guidance for Web Applications states:
A load test enables you to measure response times, throughput rates, and resource-utilization levels, and to identify your application’s breaking point, assuming that the breaking point occurs below the peak load condition.
Here, “below the peak load condition” simply suggests, again, a testing methodology that falls within the parameters of a load test as opposed to a stress test (which, by definition, tests a system at and beyond peak load). Load testing can identify system lag, page-load issues, and anything else that might go awry when multiple users access an application or bombard a system with sudden traffic. These are things that are easily overlooked in a development and testing environment, where code is often checked with individual users in mind. Mix in a hundred or a thousand people trying to access the software or issue commands more or less simultaneously, however, and problems that might not have been detected in solo use cases can suddenly come to light in all their buggy glory.
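The measurements named in the quote above (response times and throughput under simultaneous users) can be sketched in a few lines of Python. This is a toy harness under stated assumptions, not a substitute for a real load-testing tool: `fake_request` is a hypothetical stand-in that merely sleeps, and threads only approximate real users.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request() -> None:
    """Hypothetical stand-in for one user action, such as an HTTP call."""
    time.sleep(0.01)


def timed_call(action) -> float:
    """Return how long one call to `action` takes, in seconds."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def run_load_test(action, users: int, requests_per_user: int) -> dict:
    """Run `action` from `users` concurrent threads and summarize the timings."""
    def user_session(user_id: int) -> list:
        return [timed_call(action) for _ in range(requests_per_user)]

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=users) as pool:
        sessions = list(pool.map(user_session, range(users)))
    wall_time = time.perf_counter() - wall_start

    latencies = sorted(t for session in sessions for t in session)
    p95_index = max(0, int(len(latencies) * 0.95) - 1)
    return {
        "requests": len(latencies),
        "throughput_per_s": len(latencies) / wall_time,
        "median_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
        "max_s": latencies[-1],
    }
```

Calling `run_load_test(fake_request, users=20, requests_per_user=5)` returns a summary of 100 timed requests. Swapping `fake_request` for a function that exercises the real system is where the single-user blind spots described above would start to show up.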
As just one example, let’s say you’re developing a new online voting platform, and you’d like for it to be able to handle potentially up to 10,000 user submissions per minute during peak load times. While developing the software, you may have performed unit tests as the code was being written, plus periodic regression tests to make sure you didn’t break existing functionality with each new modification as development progressed, but at what point did you begin testing for multiple users? At what point did you begin testing the program to accept hundreds or even thousands of overlapping field entries, form submissions, and other commands?
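For a sense of scale, the voting-platform target can be turned into per-second figures. The sketch below uses Little's Law (average requests in flight equals arrival rate times time in the system); the 0.5-second average response time is an assumed figure chosen only for illustration.

```python
def required_rate_per_second(submissions_per_minute: float) -> float:
    """Convert a per-minute target into a per-second arrival rate."""
    return submissions_per_minute / 60.0


def in_flight_requests(rate_per_second: float, avg_response_s: float) -> float:
    """Little's Law: average requests in flight = arrival rate x time in system."""
    return rate_per_second * avg_response_s


rate = required_rate_per_second(10_000)
concurrent = in_flight_requests(rate, avg_response_s=0.5)  # 0.5 s is assumed
print(f"{rate:.1f} submissions/s, about {concurrent:.0f} in flight at once")
# prints "166.7 submissions/s, about 83 in flight at once"
```

In other words, a test that submits one form at a time never comes close to the roughly 80 overlapping submissions the platform would have to juggle at peak under that assumption.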
Technically speaking, load testing of an application can’t be performed until a project is nearly at the end of its production cycle, when actual user engagement and system performance can be accurately simulated and put to the test. It’s analogous to a car: you can repair and test the engine, but if the engine isn’t installed yet, you can’t test the car’s performance on the road. And yet, if at any earlier point in a software development project you were able to test a specific component in a focused way for load (such as testing for backend performance issues, simultaneous user input, endurance of input over extended periods of time, or anything else that could put stress on your system and cause lag, memory leaks, or broken functionality), then you were, of course, already engaging in a limited form of load testing and already on the lookout for the effects of multi-user engagement with your system. Performance-testing expert Scott Barber, one of the coauthors of the aforementioned Microsoft resource, prefers to call the testing of only a few users’ input on an incomplete system “multi-user functional testing.” Proper load testing, again, requires a nearly complete system, and generally demands the use of testing software that can simulate users by the thousands.
But there’s an exception to every rule. While multiple users are clearly an issue where Internet applications are concerned—from smartphone GPS apps to online multiplayer video games—load testing can also be performed on systems without multiple users, because multiple users are not the only possible source of load. Sometimes load is the result of large files, intense calculations, or even poor network connectivity. Think about opening a PDF in Acrobat, for instance, or a PSD in Photoshop. Load comes into play wherever a system encounters stress. Do the files open quickly enough? If the files are too large, will the application crash? Exactly what criteria do you use, anyway, to judge whether or not your application is opening a file “quickly enough”? Is it acceptable if it opens just as it should but takes five minutes to do so? Who sets the criteria for a system’s ideal load capacity and on what basis? Where does a load tester draw the line between a user’s subjective preference and a system’s objective functionality?
A good load tester often needs something more than software engineering and testing expertise: a solid grounding in the psychology of user experience.
The Future of Load Testing: Getting Inside the Heads of End Users
The ultimate purpose of load testing, and of performance testing in general, is always to mitigate risk, be it risk to your software’s successful functionality, risk to your end users’ sanity, or risk to your company’s bottom line. Naturally, all three of these are intimately intertwined, so it’s important to know how they relate to each other and where you, as a developer or tester, can intervene for the greater good. Let us dare to suggest that if you focus on mitigating the middle risk, the one to user sanity, the other two will usually fall into place, and that many load-testing issues actually boil down, in the end, more to users’ perception than to specific ideal page-load times and other technical stats.
Indeed, while specialized software is typically necessary to run repeated load tests, interpreting the data is not as straightforward as it may seem, because the users behind the numbers are complex human beings. For example, if someone arrives on a website that clearly hosts nothing but text, they’re going to expect it to load instantly and have very low tolerance for it taking more than a second or two, but if they’re expecting an embedded video to load, they’re going to be far more forgiving when it takes a bit of time. And that’s because we’ve all been trained, at least since the advent of broadband, to expect these things. As one delves into the psychology of user experience, these factors get more subtle, with people actually preferring uniformly consistent sluggishness within a site, for example, to sites that load faster overall but have internally inconsistent speeds. So without a working knowledge of this psychological dimension, truly understanding your users’ desires and expectations, no amount of load-testing data alone is automatically going to help you modify your software in the ways that will give you the biggest perceptual bang for your buck.
In other words, if you don’t understand human psychology, knowing how a user acts and reacts, you’re unlikely to generate a very realistic load test and, worse, you’re probably going to misinterpret the results. That’s why, when performing load testing, it’s important to be able to simulate the actual end-user experience as closely as possible, replicating what someone will likely encounter as a site or application approaches peak load, analyzing the test results, and then doing one’s best to reduce any of the unpleasantness surrounding that experience by modifying the system accordingly. As the pace of production cycles increases every day, software companies can save time and money not by fixing everything that can conceivably be improved to streamline a high-load situation, but instead by simply focusing on the specific impediments to a smooth and efficient user experience.
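One way to put a number on the consistency point made above is to report the spread of page-load times alongside the average. The Python sketch below does only that; it measures consistency and says nothing on its own about what users prefer. The two sets of timings are invented for illustration.

```python
import statistics


def summarize(page_times_s: list) -> dict:
    """Return the mean, spread, and relative spread of page-load times."""
    mean = statistics.mean(page_times_s)
    spread = statistics.pstdev(page_times_s)
    return {"mean_s": mean, "stdev_s": spread, "cv": spread / mean}


# Hypothetical page-load times in seconds, invented for illustration only.
steady_site = [2.0, 2.1, 1.9, 2.0, 2.0]
uneven_site = [0.5, 0.6, 4.0, 0.5, 0.9]

steady = summarize(steady_site)
uneven = summarize(uneven_site)
print(f"steady: mean {steady['mean_s']:.2f} s, cv {steady['cv']:.2f}")
print(f"uneven: mean {uneven['mean_s']:.2f} s, cv {uneven['cv']:.2f}")
```

In this made-up data the uneven site is faster on average but far less predictable, which is the kind of result that averages alone would hide.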
Written by: Tom Huston