Mozilla Corporation prides itself on keeping the internet open and accessible to all. With the rise of smartphones and the mobile internet, it was a natural move for the Mozilla team to bring its Firefox Browser to Android and iOS users. Today, over a billion web browser users around the world enjoy the Firefox experience on desktop, mobile phone, or tablet.
Bringing the same level of Firefox Desktop experience to Firefox Mobile has always been a goal for Mozilla’s development teams. The automation team led by Joel Maher, Lead Automation & Tools at Mozilla, wanted to go through the full set of tests on a variety of physical phones with different combinations of hardware and software.
Unable to Scale Real Device Testing with a Homegrown Device Lab
The company started the Firefox Android project in 2012, and the automation team built many automation scripts to run against their in-house devices.
“It didn’t go well,” Joel admits. “The phones were distributed at different desks in the office, or to people working remotely. Running tests became very dependent on that person being around to fix the phones in case of any issues. We started moving to emulators.”
Without a robust real device testing solution, running tests on physical devices was impractical, so the team opted to test on emulators. In 2017, the year before they adopted BitBar Private Cloud, the team was hosting half a rack of devices at Mozilla’s Mountain View office.
“This time, we set up 30 Android phones connected to a couple of servers in a closet. We were able to run the tests, but devices would go offline on any given day,” explains Joel. “We didn’t have any paid staff to look at the devices. When some phones went offline, it always required somebody hands-on to get into the closet and spend a half hour getting those phones online. It wasn’t a robust setup, and we were frustrated with that. So we started to consider other options.”
Putting BitBar to Work Out Unique Testing Requirements
Joel’s team looked at a variety of options, including Google Firebase and Amazon Device Farm. However, none of these could meet Mozilla’s unique testing needs.
“Our web browser tests aren’t built using APIs. Instead, all our commands are built into a browser, and we need to load a webpage from a server. That doesn’t work with the typical way to run APIs on a phone,” explains Joel. “What we’re testing is whether we can load this web page and if it can render properly. Because of that, everything depends upon a certain framework or a list of different automation frameworks.”
In addition to support for custom frameworks, Joel’s team needed rooted phones to make tests run more smoothly. Given BitBar’s previous success with custom OS flashing (Firefox OS) on Android devices for Mozilla, Joel’s team reached out to the BitBar team.
“We didn’t have Dockerized containers or high-quality hubs connecting them. We got by with what we had, as we didn’t have time to experiment and make things better,” says Joel. “Then we worked with [then BitBar CTO] Sakari Rautiainen, who helped us set up a Docker image container connected directly to a phone in their Private Cloud. It ended up being great.”
The proof of concept (POC) with BitBar got Joel’s team up and running much faster, and the resulting process turned out to be more reliable and repeatable.
Realizing the Full Potential of BitBar’s Technologies for CI/CD and Full Automation
Every week, Joel’s team has over a thousand builds in the integration and release branches. They also have a “try server” where developers test their changes before they get checked in. With around 2,500 builds in any given week, there’s a lot of testing. To improve test efficiency and support its continuous integration and release processes, Mozilla created Taskcluster, easy-to-configure and robust scheduling software that helps to distribute test runs.
“For a full set of tests against one Android build,” Joel explains, “we would launch a hundred emulator jobs and 40 tasks on BitBar’s real devices. If we were to run all of them sequentially, that would take 75 hours to complete – 60 hours for the emulators and 15 hours for real devices – without repeating them. Mozilla created Taskcluster to distribute multiple tests across available Workers [a Worker could be an emulator or a phone at BitBar] at the same time. Now we’ve taken that 75 hours down to 5 hours to get through a test cycle.”
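The effect Joel describes can be illustrated with a small scheduling sketch. This is not Taskcluster's actual algorithm; it is a minimal greedy model that hands each job to whichever Worker frees up first. The job totals (100 emulator jobs over 60 hours, 40 real-device tasks over 15 hours) come from the quote above, while the equal job lengths and the pool sizes are assumptions made purely for illustration, so the result is not meant to reproduce the 5-hour figure.

```python
import heapq


def makespan_hours(job_hours, worker_count):
    """Greedy list scheduling: each job goes to the worker that frees up first.

    Returns the wall-clock time at which the last worker finishes.
    """
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    # Min-heap holding the time at which each worker becomes free.
    free_at = [0.0] * worker_count
    heapq.heapify(free_at)
    for hours in job_hours:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + hours)
    return max(free_at)


# Assumption: jobs within a pool are equally long (the source gives only totals).
emulator_jobs = [60 / 100] * 100  # 100 emulator jobs, 60 hours in total
device_jobs = [15 / 40] * 40      # 40 real-device tasks, 15 hours in total

# Running everything back to back takes the sum of all job lengths.
sequential = sum(emulator_jobs) + sum(device_jobs)

# Hypothetical pool sizes, chosen only for illustration.
emulator_wall = makespan_hours(emulator_jobs, worker_count=20)
device_wall = makespan_hours(device_jobs, worker_count=8)

# The two pools run side by side, so the slower one sets the cycle time.
parallel = max(emulator_wall, device_wall)

print(f"sequential: {sequential:.1f} h, parallel: {parallel:.1f} h")
```

In this model the cycle time is set by the slower pool, which is why adding Workers to the emulator pool and the device pool independently both matter. Real jobs vary in length, so actual cycle times depend on the longest individual tasks as well as on pool size.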
For real device testing, a proxy calls a BitBar API and inserts the Taskcluster test jobs into the proper queue. With Taskcluster managing test orchestration, and BitBar’s support for unlimited device concurrency, Joel’s team has brought 15 hours of real device testing down to an average of 2 hours.
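As a rough picture of the routing step, the sketch below keeps one first-in, first-out queue per device type. It is an in-memory stand-in only: the class name, method names, and device labels are hypothetical, and it does not call or imitate the real BitBar or Taskcluster APIs.

```python
from collections import defaultdict, deque


class QueueProxy:
    """Hypothetical stand-in for the proxy: routes tasks to per-device-type queues."""

    def __init__(self):
        # One FIFO queue per device type, created on first use.
        self.queues = defaultdict(deque)

    def submit(self, task_id, device_type):
        # A real proxy would call the device cloud's API here; this sketch
        # only records the task in an in-memory queue.
        self.queues[device_type].append(task_id)

    def next_task(self, device_type):
        # Return the oldest waiting task for this device type, or None if idle.
        queue = self.queues.get(device_type)
        return queue.popleft() if queue else None


proxy = QueueProxy()
proxy.submit("task-1", "phone-model-a")  # made-up task IDs and device labels
proxy.submit("task-2", "phone-model-b")
proxy.submit("task-3", "phone-model-a")

first = proxy.next_task("phone-model-a")
second = proxy.next_task("phone-model-a")
print(first, second)
```

Keeping a separate queue per device type means a backlog on one phone model does not hold up jobs that target another.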
To help Mozilla strengthen its full automation and continuous testing practice, BitBar hosts a Private Cloud of devices dedicated to Mozilla in its Santa Clara data center.
“As far as BitBar’s real device environment, we’re over 90% utilization during the peak hours,” says Joel. “During the core workday, from 6 a.m. to 8 p.m. Pacific Time, we’re guaranteed to have test jobs in the queue. We’ve always been running full automation. BitBar’s solution has enabled us to scale up our automated testing. We’re running a much higher throughput, and these tests are depended on by a much larger volume of people.”
Monitoring and Scaling to Achieve a Robust Environment
“It was hard to scale our testing efforts before switching to BitBar,” Joel adds. “We needed better organization, and we didn’t have good tools. If one of our servers needed updating, for example, all eight devices connected to it would go down. Now we run test automation on a hundred phones hosted by BitBar, and we have alerts on the phone status online. It’s a much more robust environment where it’s more self-healing, and we’ve got more eyes on it.”
Using the BitBar API, the automation team at Mozilla monitors the system and infrastructure and receives alerts on device status.
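A monitoring check of this kind might look like the sketch below. The `DeviceStatus` record and both helper functions are hypothetical; a real setup would fill the list from the device cloud's API, whose response format is not described in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceStatus:
    """Hypothetical status record; not the shape of any real API response."""
    name: str
    online: bool


def offline_alerts(statuses):
    """Return one alert message per device that is currently offline."""
    return [f"{status.name} is offline" for status in statuses if not status.online]


def online_fraction(statuses):
    """Return the share of devices that are online (0.0 for an empty pool)."""
    if not statuses:
        return 0.0
    return sum(1 for status in statuses if status.online) / len(statuses)


# Made-up sample data standing in for a status poll.
pool = [
    DeviceStatus("phone-01", online=True),
    DeviceStatus("phone-02", online=False),
    DeviceStatus("phone-03", online=True),
    DeviceStatus("phone-04", online=True),
]

alerts = offline_alerts(pool)
healthy = online_fraction(pool)
print(alerts, healthy)
```

Run on a schedule, a check like this replaces the manual trip to the device closet with a message that names the affected phone.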
The frustration with the homegrown testing solution led the team to think about outsourcing device hosting and management while allowing them to run test automation on real devices.