Getting Started with Mobile Test Automation

Getting started with web-based test automation is relatively straightforward: Install a tool, create some test cases, press the “run” button, and wait. Once the first test runs, extending the tests into Continuous Integration and reporting is as easy as putting together legos.

Assembling a mobile test automation solution can feel more like assembling a puzzle. There are too many pieces to track, the final picture is unclear, and most of the pieces seem not to fit together well. Even deciding where to start can be a challenge.

Today we’ll explain the puzzle pieces of mobile test automation, show how to fit them together, and provide a plan for getting started. We’ll call our puzzle pieces components of a solution. There are many different ways to implement a component - buy, build, or customize an open-source package. Because these are puzzle pieces, not legos, the important thing is to find the right fit between components.

The Puzzle Pieces In Mobile Test Automation

In regular test automation, manual steps are optional. It is possible to manually update the website, install a test database, point the automation at those systems, watch the tests run, and interpret the results.

Doing the same work by hand for a mobile application is exhausting even to read about. First, the application needs to be compiled, likely on both a PC and a Mac; then the human builders need to plug the devices, one by one, into a USB port and install the application. The same setup needs to happen on any web server serving APIs to the software, plus the database. Each combination of tablet, phone, and operating system needs to be plugged in and have the tests run; this means either a lab environment with a dozen computers, or a lot of waiting and switching.

Effective mobile test automation needs to be a pipeline. Here are the pieces of that pipeline:

Build System. The software doesn’t just need to be compiled; it needs to be compiled automatically, on a schedule, with the compiled code stored and versioned. This makes fixing defects as easy as finding the version where the defect was introduced, and reviewing the exact code changes involved.
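
As a sketch, the whole build step can be a short script that the scheduler runs. This one assumes an Android project with the standard Gradle wrapper; the artifact path and naming scheme are illustrative, not prescriptive:

# nightly_build.py - a minimal sketch of an automated, versioned build step.
# Assumes an Android project with the standard Gradle wrapper.
import os
import shutil
import subprocess

# Compile the app; check=True fails the script (and the pipeline) on error.
subprocess.run(["./gradlew", "assembleDebug"], check=True)

# Record which commit produced this binary, so a defect can be traced back
# to the exact code changes involved.
commit = subprocess.run(
    ["git", "rev-parse", "--short", "HEAD"],
    capture_output=True, text=True, check=True,
).stdout.strip()

# Store the compiled code under a versioned name.
os.makedirs("artifacts", exist_ok=True)
shutil.copy("app/build/outputs/apk/debug/app-debug.apk",
            f"artifacts/app-{commit}.apk")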

Unit tests. Unit tests not only detect defects early, they pinpoint exactly where the problem is. Unit testing reduces the number of defects that escape to be caught by end-to-end testing, which has a slower feedback loop and a higher maintenance cost. Typically the build system, unit tests, and other components are tied together by Orchestration, described later.
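
To make the "pinpoint" claim concrete, here is a minimal sketch in Python; root_word() is a hypothetical function standing in for a search feature's stemmer:

# test_root_word.py - sketch of a unit test that pinpoints a failure.
import unittest

def root_word(term: str) -> str:
    # Hypothetical stemmer used by the search feature: strips a trailing "y".
    return term[:-1] if term.endswith("y") else term

class RootWordTest(unittest.TestCase):
    def test_strips_trailing_y(self):
        # If this fails, the defect is in root_word() itself - no need to
        # debug a multi-step end-to-end search scenario to find it.
        self.assertEqual(root_word("Jabberwocky"), "Jabberwock")

if __name__ == "__main__":
    unittest.main()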

Provisioning. To provision is to “supply something for use.” In the case of mobile test automation, that means obtaining the phones and tablets to run the software on, installing the software, and registering the device back to some central system to be able to run tests against it.
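
A provisioning step can be scripted. The sketch below covers Android only, using the real adb command-line tool; the central registration endpoint is a made-up placeholder:

# provision.py - sketch of Android device provisioning with adb.
import subprocess
import requests  # assumed available for the registration call

APK = "artifacts/app-latest.apk"
REGISTRY = "https://test-lab.example.com/devices"  # hypothetical service

# "adb devices" prints a header line, then one serial number per device.
out = subprocess.run(["adb", "devices"], capture_output=True, text=True,
                     check=True)
serials = [line.split()[0]
           for line in out.stdout.splitlines()[1:] if line.strip()]

for serial in serials:
    # -r reinstalls the app, keeping its data between runs.
    subprocess.run(["adb", "-s", serial, "install", "-r", APK], check=True)
    # Register the device back to a central system so tests can target it.
    requests.post(REGISTRY, json={"serial": serial})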

APIs and API tests. Most modern mobile applications also use APIs that sit on servers and need provisioning. That typically means creating a web server, installing the latest API software on it, putting test data into a test database, and running API tests with a tool that finds errors quickly. The APIs and the mobile software itself are two key puzzle pieces -- if either is on the wrong version, the two will generate errors. Releasing the two as a set is one common approach, but the unpredictable release process of the app stores, and of customers in obtaining upgrades, can make this unrealistic. Instead, version the APIs, and have the software itself call a specific version of the API to ensure backward compatibility.
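
An API test against a pinned version can be a few lines. In this sketch the host, path, and response shape are assumptions, chosen to match the search example later in this article:

# test_search_api.py - sketch of an API test pinned to a versioned endpoint.
import requests

BASE = "https://api.example.com/v1"  # the app calls /v1 explicitly

def test_exact_search_returns_seeded_pages():
    resp = requests.get(f"{BASE}/search",
                        params={"q": "Jabberwocky", "exact": "true"},
                        timeout=10)
    assert resp.status_code == 200
    # Twelve seeded test pages should come back for an exact search.
    assert len(resp.json()["results"]) == 12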

Executing Tests. This is one of the two pieces most people think of - the actual tool to run the tests. Your mobile testing tool needs to provide output in a format another tool can understand - what failed and when it failed. Some tools provide screenshots or a video of the test run as an aid in debugging. Others can compare images or portions of the window. The best mobile testing tools have a low switching cost, find the kinds of errors that slip through development for this particular application, and provide the debugging aids this application needs.

The Tests Themselves. Someone has to actually create the mobile tests and check them into version control. Unless these are versioned along with the source code, the system can have version conflicts where the tests do not match the code. Even if they are versioned along with the source code, a change in the source without a corresponding change in the tests will result in an “error” that needs to be “debugged” and “fixed.” How the tests are expressed is also important. The ability to organize tests into groups or “suites” using tagging or a file structure can be very powerful. Mobile tests can be expressed as code, in compilable near-English, in a grid, or in a visualization - a visualization often allows easy step-through debugging and enables loop constructs. Here’s a simple example of a search test using a grid. The test first searches for “Jabberwocky” with exact matching on, and does not find the “Jabberwock” page (though it does find the other 12 seeded test pages); it then switches to matching root words, and finds a match for “Jabberwock.”

Action           Variables                         Expected
Log In           $userID, $password                Hello $user
Search-Exact-On  TRUE
Search           Jabberwocky                       12 results
Verify_Text      Hast Thou Slain the Jabberwock?   FALSE
Verify_Link      Jabberwock                        FALSE
Search-Exact-On  FALSE
Search           Jabberwocky                       14 results
Verify_Text      Hast Thou Slain the Jabberwock?   TRUE
Verify_Link      Jabberwock                        TRUE

The commands “Log In”, “Search-Exact-On”, and “Search” are tied directly to this application. They are called “domain commands” and are stored in a domain layer.

The Domain Layer. The near-English, grid, and visualization layers all require something underneath - the “plumbing” that defines the domain commands. This allows search to be defined in one place. If the search button changes in some way that “breaks” the tests, then the change to the search command only needs to be made once. A domain layer also makes tests more concise, even understandable to business people. That makes the tests an executable specification - a living example of what the software should do. Even if the tests are expressed in raw code, it usually makes sense to create a code library of commands that looks and acts very much like a domain layer.
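
In code, a domain layer might be a small class wrapping the raw driver. This sketch uses the Appium Python client; the locator values and method names are assumptions:

# domain.py - sketch of a domain layer over a raw Appium driver.
from appium.webdriver.common.appiumby import AppiumBy

class SearchDomain:
    def __init__(self, driver):
        self.driver = driver  # an already-connected Appium session

    def search(self, term):
        # If the search UI changes, only this method changes; every test
        # that says "search" keeps working.
        box = self.driver.find_element(AppiumBy.ACCESSIBILITY_ID, "search_box")
        box.send_keys(term)
        self.driver.find_element(AppiumBy.ACCESSIBILITY_ID, "search_go").click()

    def set_exact_match(self, enabled):
        # The Search-Exact-On command from the grid above.
        toggle = self.driver.find_element(AppiumBy.ACCESSIBILITY_ID,
                                          "exact_match")
        if (toggle.get_attribute("checked") == "true") != enabled:
            toggle.click()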

A final component of the domain layer is the locators used to find elements. The location of a button, the name of a text box - these things change frequently. Moving a common button (such as a “tag” button) could cause many domain-layer tests to fail. Some mobile testing tools have an “object repository” to store user interface elements - this means that when an object changes, it only needs to be changed in one place. It also means the software can use meaningful names like “Login Button” instead of code like:

//android.widget.ScrollView[0]/android.widget.RelativeLayout[0]/android.widget.TextView[0]/

An object repository can have multiple entries per object, making it possible to reuse the exact same mobile test on Android, iOS, BlackBerry, Windows Mobile, or another operating system - even if the locators are radically different.
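
At its simplest, an object repository can be a lookup table keyed by platform; the locators below are invented for illustration:

# objects.py - sketch of a tiny object repository keyed by platform.
LOCATORS = {
    "Login Button": {
        "android": ("accessibility id", "login_button"),
        "ios": ("accessibility id", "loginButton"),
    },
}

def find(driver, platform, name):
    # Tests ask for "Login Button"; only the repository knows the locator,
    # so the same test runs unchanged on each operating system.
    by, value = LOCATORS[name][platform]
    return driver.find_element(by, value)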

Reporting. Some test tools provide multiple views of results; others provide a big block of text. Searching through a thousand “okay” messages for the five errors is a bit like searching for a needle in a haystack. When a mobile test tool produces a large amount of text output, it often makes sense to create a reporting layer that can provide multiple views: pass/fail results, dashboard results, drill-down results, and so on. Some orchestration systems provide this.
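
Assuming the test tool can emit JUnit-style XML (most can, natively or through a converter), a first-cut reporting layer can be very small:

# summarize.py - sketch of a reporting layer over raw tool output.
import xml.etree.ElementTree as ET

def summarize(path="results.xml"):
    root = ET.parse(path).getroot()
    cases = list(root.iter("testcase"))
    failures = [case.get("name") for case in cases
                if case.find("failure") is not None]
    print(f"{len(cases) - len(failures)}/{len(cases)} passed")
    for name in failures:  # the five needles, without the haystack
        print(f"FAIL: {name}")

if __name__ == "__main__":
    summarize()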

Manage Test Execution Time. Testing a large number of systems takes a long time. Going for exhaustive coverage makes it take even longer. Some companies triage tests, running only a small number of the most important tests during the continuous loop and running more overnight. Others triage which devices to run on, covering the larger set only overnight. Others rent a grid and run multiple tests at the same time. If these tests write back to a database, there may be timing issues - unless each web server also has its own database.
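
Tagging makes this kind of triage cheap. For example, with pytest markers (a real pytest feature; the marker name and the tests here are invented), the continuous loop runs only the tagged subset while the overnight job runs everything:

# Sketch of suite triage with pytest markers.
# Continuous loop:  pytest -m smoke
# Overnight run:    pytest
# (Register the "smoke" marker in pytest.ini to avoid warnings.)
import pytest

@pytest.mark.smoke
def test_login():
    ...  # one of the dozen most important checks

def test_rarely_used_settings_screen():
    ...  # only worth the device time overnight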

Notifications. When things break, there are people who need to know. If the system runs one build for every change, then a failure can be traced back to when the error was introduced. Supervisors and testers are also often interested in the results of a test run. The most common way to do this is by email, providing a report that is appropriate to the level of the stakeholder. Sometimes executives want to be notified; what they need is Red/Green/Yellow and trending.
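
A notification step can lean on the standard library. The mail host and addresses below are placeholders, and the summary text would come from the reporting layer:

# notify.py - sketch of an email notification step.
import smtplib
from email.message import EmailMessage

def notify(summary: str, failed: bool):
    msg = EmailMessage()
    # Executives mostly want Red/Green at a glance; the body carries detail.
    msg["Subject"] = ("RED: " if failed else "GREEN: ") + "nightly mobile run"
    msg["From"] = "ci@example.com"
    msg["To"] = "team@example.com"
    msg.set_content(summary)
    with smtplib.SMTP("mail.example.com") as smtp:
        smtp.send_message(msg)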

Orchestration. Sometimes called “Continuous Integration”, the orchestration layer binds all these pieces together. The most common way to do this is with a set of commands that run on a schedule or after an event, like a code check-in. The CI system checks out the latest code, performs a build, calls the provisioning scripts, sets up the web servers, tests the APIs, then calls the tool to run the tests, takes the output of the tests as input, publishes reports and sends out notifications. Most puzzle pieces can be called from the command-line, so the simplest way to create an orchestration layer is with a batch script - a dozen system calls running on a schedule. Most modern tools do much more, providing web-based history of builds, reporting, trending, some analytics, and event monitoring. Jenkins, an open-source continuous integration tool, can monitor version control and even run on an individual commit, or “push” of new code.
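
That “dozen system calls” script really can be that simple. The step scripts named below are the hypothetical sketches from earlier in this article:

# orchestrate.py - the whole pipeline as a handful of system calls,
# run on a schedule. Each step script is a hypothetical sketch from above.
import subprocess
import sys

STEPS = [
    ["git", "pull"],                       # check out the latest code
    ["python", "nightly_build.py"],        # compile and version the app
    ["python", "provision.py"],            # install on registered devices
    ["python", "-m", "pytest", "test_search_api.py"],  # API tests first
    ["python", "-m", "pytest", "-m", "smoke"],          # then UI smoke tests
    ["python", "summarize.py"],            # publish reports
]

for step in STEPS:
    if subprocess.run(step).returncode != 0:
        print("Pipeline failed at: " + " ".join(step), file=sys.stderr)
        sys.exit(1)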

Most of these puzzle pieces exist for traditional software, where they are less nuanced. For example, it is much easier to release new code and new APIs at the same time with web-based software than with native mobile, because native releases have to pass through the app stores.

Here, the new piece that mobile test automation adds is the mobile device itself.

Challenges with Mobile Testing

Some bugs can only be found by physically holding a device - for example, if the application drains battery power. Others are simple software bugs. Most mobile testing projects use a combination of physical devices and alternatives. The best tools can switch between these options at run time - running the same tests on simulators or emulators, on the desktop or in the cloud.

Simulators. A simulator is a computer program that can run on the same machine as the build system. To the user, a simulator looks like the device and interacts like the device, but it is just a surface layer, and it may behave differently than the actual device. A classic example of a simulator is “skinning” Safari on a Macintosh to “look like” an iPhone. Heavier simulators are full-blown Windows or Mac applications that look like the device - but they are still just desktop programs dressed up to look like the device. For example, in one iPhone simulator, holding down an application makes the apps “jiggle” to show what can be deleted - but the “X” does not appear in the corner the way it would on the physical device.

Emulators. Where simulators try to look like a device, emulators are the device, or at least as close as possible, running in software. An emulator creates a virtual machine, all the way down to the hardware, running inside the computer. Emulators need to boot up and make their own network connections. To do that, the software has to create an abstraction layer, recreating the actual chipset in software. As a result, emulators are slower and more awkward than simulators.

Cloud-based devices. Some vendors, like CrossBrowserTesting, offer simulators or emulators in the cloud. It is even possible to rent physical devices hooked to a video camera, either for testing or for automation. These options require another abstraction layer and more software, but they make running on-demand grid automation much easier.

Physical devices in the office. It should be possible to plug in a physical device and run the test automation through it. Testers can watch as the software runs and observe problems, root causes, and usability issues. There are also issues that are hard or nearly impossible to observe in an emulator - heat, battery drain, a sudden drop in bandwidth, memory or GPS issues - so it makes sense to keep some physical devices around the office.

New Wrinkles with Mobile Test Automation

Earlier we mentioned object locators. It is easy enough to find locators on a web-based system - open a web browser, right-click, choose the inspect option, or look for the locator through the developer tools. There is no right-click equivalent on mobile systems. Testers need some tool to find the identifiers and store them in the object repository.
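
One common workaround is to dump the current screen’s UI tree and read the identifiers from it. page_source is a real property on Appium (and Selenium-style) drivers, though the session setup is omitted here:

# Sketch: with no right-click "inspect" on a device, dump the UI tree.
def dump_ui_tree(driver, path="screen.xml"):
    # The XML lists each element's class, resource-id, and accessibility
    # id - the raw material for the object repository.
    with open(path, "w", encoding="utf-8") as f:
        f.write(driver.page_source)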

It isn’t always necessary to have all of these pieces. When starting out, it is possible to create tests in code, run them, and have an impressive demo. Eventually, though, the tests become unwieldy, brittle, and hard to maintain. The trick is to extend the architecture as the tests are built out.

Now it’s time to talk about getting started with mobile testing.

Getting Started with Mobile Testing

The Extreme Programming literature proposed the idea of a “spike” - a proof of concept that goes all the way through the architecture to demonstrate that the end result can work. The first test needs to prove the delivery pipeline can work, dealing with installing the software, setup, and test data, “warts and all.” The device for that first test might be a tethered physical device or an emulator running on the same computer; the point is to demonstrate that it is possible to switch out the devices and still have the test run. Likewise, the test might run against an existing test web server (for the APIs) and database (for the test data) - just make sure it is possible to create test servers in real time and pass them back to the test to run against. At the very least, have a plan to extend, have a command-line variable for the base website location, and so on.
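
Even a token gesture toward that plan helps. In this sketch the variable names and defaults are assumptions; the point is only that the “where” lives outside the tests:

# config.py - sketch of keeping the "where" out of the tests.
import os

BASE_URL = os.environ.get("TEST_BASE_URL", "https://staging.example.com")
DEVICE = os.environ.get("TEST_DEVICE", "emulator-5554")  # adb's default emulator

# Usage:
#   TEST_BASE_URL=https://qa.example.com TEST_DEVICE=<serial> \
#       python -m pytest -m smoke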

Which brings us to the first real hurdle to overcome: actually getting a single mobile test running. That means picking a tool. Research the bug database, if there is one, to find the kinds of defects that slip through development, and make sure the tool can find those sorts of bugs. Major categories of bugs include broken APIs, broken API/user-interface matchups, crashes, user interface layout issues, and state/memory issues. Platforms like ReadyAPI offer multiple API testing tools that can be used to validate the performance of a mobile app's API, as well as functional and security testing. Automated testing tools like TestComplete will enable you to run, manage, and analyze UI tests across a variety of devices and environments so you can ensure your user interface looks and behaves as expected. With TestComplete's integration with API tools like SoapUI, you can get visibility into your mobile API calls and shorten your feedback loops.

When selecting a mobile testing tool, also consider future automation issues. If the tool runs on Windows and the build pipeline is Mac or Linux, that could be a problem requiring at least a virtual machine. Mobile testing tools like CrossBrowserTesting give users access to a cloud-based device lab that hosts the mobile phones and tablets for you. With 1,500+ combinations of operating systems, screen resolutions and configurations, browsers, and device types, you can squash any UI layout issues or bugs that might occur by running one test in parallel across multiple environments. CrossBrowserTesting also has native visual testing capabilities, meaning you can take screenshots of your test runs and compare how they look across different devices, ensuring a seamless user experience and a correct mobile layout on every device you support.

Even with a low switching cost, the cost of the wrong tool can be significant. So try a few. Look at how they would fit, as puzzle pieces, into the delivery pipeline. Write the same three or four tests with multiple tools to see how the domain layer and object repository work - along with how easy the tests are to run, debug, and use in version control.

Once you’ve selected a tool, add more tests - but be careful. The first handful of tests should be build verification - the top dozen things customers need to do, combined with the top dozen things that tend to break frequently.

The temptation after that is to build more tests, to make a “real” test suite that can be used for regression testing. Don’t fall into that trap. Instead, look at automating provisioning. Once that’s done, add provisioning and testing to the build process. Then start monitoring the length of the test runs.

At that point, we are past “getting started with test automation.” Better still, we’ve filled in enough pieces of the puzzle to be able to step back and get a picture of the board. Other pieces will be easier to add, with less effort.

Final Author’s note: The use of the Lewis Carroll poem “Jabberwocky” as test data for search was instituted at Socialtext by Chris McMahon under the supervision of Ken Pier in 2007. Credit for who actually created the test was lost to time: Ken had passed away, and Chris had forgotten. The tests are still running, they use a domain layer, and the original author (Chris) was rediscovered by checking version control.