How to Fix a Flaky Selenium Suite

  February 15, 2018

Selenium has changed the game for many software teams, but it doesn’t come without its many challenges. One of the most common headaches for testers is recurring flaky tests, or tests that fail randomly when there hasn’t been any change to the code.

It’s no surprise that this can get annoying since flaky tests affect progress. After a while, it can significantly slow down teams’ productivity when they are no longer able to differentiate between a test that has a bug and one that they’re wasting time on because of an invalid fail.

Eventually, when a suite is flaky enough, testers may start ignoring the results altogether, which of course isn’t a good option either. To put an end to the flakiness, we have a few tips to keep your builds stable.

  1. Find the source - When your tests start going flaky, you first have to investigate why. To do that, it helps to be familiar with the common causes of flaky tests. Oftentimes, it can be something really simple, such as bad locators. Craig Schwarzwald stresses the importance of locators that are unique, descriptive, and unlikely to change. If a locator doesn’t meet those criteria, it’s a bad locator, which means your tests are likely to fail. However, there are many other factors that can lead to flakiness. Richard Bradshaw says that lazy manual testing practices, poor knowledge of tools, incomplete data, and bad assertions have all been culprits of flakiness, which goes to show that it simply comes back to how well your tests are written.
  2. Isolate and fix - Once you understand the root cause, it’s much easier to isolate the test and debug it. You want to do this as soon as possible because the longer a flaky test sits in your suite, the more likely you are to attract more flaky tests. As Angie Jones points out, all it takes is one flaky test to ruin the rest of a build. She advises separating tests into different paths: one for stable tests that only fail when something is wrong, and one for unstable tests with flaky fails. This way you can divide attention between the red builds that need to be fixed or rewritten and the green builds that developers should focus on.
  3. Document flakiness - Good testing includes three things -- documentation, documentation, and more documentation. Just because you’ve debugged a flaky test doesn’t mean you should be done with it. Documenting will make sure that the team is aware of the problems that caused the flakiness and has the resources to go back and fix it should the problem arise again. Additionally, if you have many flaky tests, or a flaky test whose cause of failure is hard to determine, documentation helps you recognize patterns that will prove critical for debugging. Nebojša Stričević says that you should document every flaky test in your ticketing system and add information as you discover it so your team can collectively figure out how to fix the suite. It’ll also do you well to keep organized reports before flakiness is found so that you can rely on them when you run into an issue rather than trying to gather all your information once a bug is found.
  4. Use page object pattern - We’ve waxed poetic about page objects before, but they are especially useful when it comes to maintaining your Selenium suites. That’s because the page object model will help you make robust testing frameworks that are resistant to small tweaks in the UI. The clean separation between test code, page-specific code, and layout plus the single repository for the services or operations offered by the page allows any modifications required due to UI changes to all be made in one place. Nikolay Advolodkin goes into great detail about how page object patterns can often be the key to stabilizing test automation. If you truly want to eliminate flakiness, page objects will get you on the right track.
  5. Stabilize your environment - One common reason that your tests are unstable is that your environment is unstable. If you have a browser crash, server bug, or bad network latency, it’s going to affect your tests. This may be unsurprising, but it’s a commonly overlooked aspect of keeping your Selenium tests stable. You want to make sure you look carefully at the testing environment and separate automation from the rest of QA for maintainability purposes. Emanuil Slavov reports that stabilizing his test environment through this method of separation helped him go from a pass rate of 50 percent to 100 percent.
  6. Wait for it - Incorrectly using waits and sleeps will be a sure-fire way to have your tests switch from green to red. Not using them will make your life even more difficult. This is because page loads are unpredictable -- some aspects will load quicker than others. If you’re trying to automate an action on a feature that hasn’t been loaded yet, you’re going to have inconsistent results. In order to avoid flakiness, differentiate between implicit waits, explicit waits, and sleeps, and add them to your script appropriately. Alan Richardson covers this aspect of synchronization as one of the top causes for failing automation -- if your tests pass without waits, it’s probably because you got lucky. He says that state-based synchronization is the answer to making it work.
  7. Frameworks are your friend - Frameworks aren’t just helpful in writing tests, they’re also essential for maintaining a stable suite. Joe Colantonio says that pretty much any time you’re working with automation, you should have a framework in place for reuse. By following the guidelines that frameworks provide, your tests will be more repeatable and maintainable for your entire team.
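
To make tips 1, 4, and 6 a little more concrete, here is a minimal Python sketch of a page object that keeps its locators in one place and waits on element state instead of sleeping for a fixed time. It is an illustration under stated assumptions, not a definitive implementation: the LoginPage class, its locator values, and the wait_until helper are all hypothetical names invented for this example, and the driver is assumed to be anything that offers Selenium-style find_element(by, value) and elements with is_displayed(), send_keys(), and click().

```python
import time


class WaitTimeout(Exception):
    """Raised when a condition does not become true within the timeout."""


def wait_until(condition, timeout=10.0, poll=0.25, ignored=(Exception,)):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper for this sketch. Errors listed in `ignored` are treated
    as "not ready yet" and retried (for example, an element missing from the DOM).
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
        except ignored:
            result = None
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll)


class LoginPage:
    """Hypothetical page object: tests call log_in() and never touch locators."""

    # Assumed locators for an imaginary login form. Keeping them here means a
    # UI change needs one edit in this class, not one edit per test.
    USERNAME = ("id", "username")
    PASSWORD = ("id", "password")
    SUBMIT = ("css selector", "button[type='submit']")

    def __init__(self, driver, timeout=10.0, poll=0.25):
        self.driver = driver
        self.timeout = timeout
        self.poll = poll

    def _visible(self, locator):
        """Return the element once it is present and displayed."""
        def condition():
            element = self.driver.find_element(*locator)
            return element if element.is_displayed() else None

        return wait_until(condition, self.timeout, self.poll)

    def log_in(self, username, password):
        # Each action waits for the state it needs before interacting.
        self._visible(self.USERNAME).send_keys(username)
        self._visible(self.PASSWORD).send_keys(password)
        self._visible(self.SUBMIT).click()
```

A test would then read something like LoginPage(driver).log_in("alice", "secret"), with no locators or sleeps in the test itself. In a real suite you would most likely reach for Selenium’s own WebDriverWait and expected_conditions rather than a hand-rolled helper; the polling loop is spelled out here only to show what state-based synchronization does.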

Don’t Be Flaky

Flaky tests can have a variety of causes, but at the end of the day, it’s up to you as the tester to find the root and dissect the issue. Fortunately, everyone deals with flakiness from time to time, so there are many resources dedicated to helping you identify what went wrong.

Thinking about how to prevent flakiness before you start seeing red will be the difference between automation that actually helps you meet your goals and flakiness that just slows you down.