Automated Android testing with Calabash, Jenkins, Pipeline, and Amazon Device Farm: Part 1

Around two years ago we began playing with the Calabash library to see if we could automate some portion of our mobile release QA process. Previously we’d sent an APK file and a spreadsheet of steps to India for every release, then engaged in a weeks-long back-and-forth as we hashed out bugs with the application and test plan. This process had become costly and burdensome as our development team grew (and our feature set along with it). Moving to a UI-level automated testing suite offered some obvious advantages:

- We would save the money we’d been paying contractors to run through the tests.
- We would find out immediately when something was broken, rather than all at once during release.
  - This allows the person who caused the bug to handle it, rather than whoever is assigned to triage release bugs.
- We could test across a large set of devices and Android OS versions.
- We would be managing test code instead of managing a huge test spreadsheet.

Unfortunately the road to this gold standard was a long one, beset on all sides by the tyrannies of poor documentation and rapidly evolving technology. While writing and running Calabash tests was straightforward, this wasn’t even half the battle. For the tests to be useful we needed them to run and pass consistently, deliver meaningful and accessible reports, and plug into our notification workflows like Slack and GitHub. Oh, and write the tests.

This November we ran our first release with a significant portion of the test plan (more than 40 man hours) automated. We have about forty “scenarios” running in nightly builds on Amazon Web Services across three to five devices. Failure notifications end up in our Slack channel, and reports are emailed to us and stored on our Jenkins server. This is how we got there.

The Test Suite

Our test repository is a standard Calabash test suite. Most important are the files in the features/ folder. These contain Calabash tests written in Gherkin, a business-readable language for describing behavior. For example:


@Fixtures
Feature: Fixtures

  @AWS
  Scenario: Ensure that we error cleanly when missing a fixture
    Then I install the ccz app at "fixtures.ccz"
    Then I login with username "fixtures_fails" and password "123"
    Then I press start
    Then I select module "Fixtures"
    Then I select form "Fixtures Form"
    Then I wait for form to load
    Then Next
    # Should error
    Then I should see "Error Occurred"
    Then I should see "Make sure the"
    Then I should see "lookup table is available"

The biggest unit of a Calabash test is a “Feature”, which can contain multiple scenarios. The Android sandbox is shared within a feature, so if you add some persistent data in the first scenario it will still be available in the second. In this case we install a CommCare application, log in with a dummy user, and then run through some menus and a form. Eventually we expect to see a clean failure. We add tags (prefixed with ‘@’) to our tests, as these are the primary unit of control for organizing test runs. We’ll be using those later.
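To make the role of tags concrete, here is a minimal sketch in plain Ruby of how scenarios could be selected by tag. It is a toy parser written for illustration, not how Cucumber or Calabash actually implement tag filtering, and the helper names `scenarios_with_tags` and `select_by_tag` and the second scenario are hypothetical. It does mirror one real behavior: tags placed on a Feature are inherited by every scenario inside it.

```ruby
# Toy parser (illustration only): map each scenario name to its effective tags.
# Feature-level tags are inherited by every scenario in the feature.
def scenarios_with_tags(feature_text)
  feature_tags = []
  pending_tags = []
  scenarios = {}

  feature_text.each_line do |line|
    line = line.strip
    if line.start_with?("@")
      # Tag lines apply to whatever Feature or Scenario follows them.
      pending_tags.concat(line.split)
    elsif line.start_with?("Feature:")
      feature_tags = pending_tags
      pending_tags = []
    elsif line.start_with?("Scenario:")
      name = line.sub("Scenario:", "").strip
      scenarios[name] = feature_tags + pending_tags
      pending_tags = []
    end
  end
  scenarios
end

# Return the names of the scenarios that carry the given tag.
def select_by_tag(scenarios, tag)
  scenarios.select { |_name, tags| tags.include?(tag) }.keys
end

feature = <<~GHERKIN
  @Fixtures
  Feature: Fixtures

    @AWS
    Scenario: Ensure that we error cleanly when missing a fixture
      Then I should see "Error Occurred"

    Scenario: A hypothetical scenario without its own tag
      Then I should see "Done"
GHERKIN

puts select_by_tag(scenarios_with_tags(feature), "@AWS")
```

In a real run you would not write this yourself; Cucumber's `--tags` option performs the selection for you.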

We’ve also included in the repository some test resources that we need, such as CommCare applications, a spreadsheet, and even another Android application that we use for testing our mobile APIs.

The killer feature of Calabash is that this (weird sounding) plain English becomes a runnable test. “Will, this is amazing, how is this possible?” Well, Calabash gives us a number of so-called “canned steps” for free. These canned steps are enough to manipulate an Android device in just about any way possible. You can touch a TextView based on its String content or resource ID, use the soft keyboard, assert some text exists, etc. We use the “Then I should see [x]” step above. Calabash matches each step to a Ruby step definition, which in turn drives the Calabash test server running alongside your app on the device.
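The way a line of plain English ends up calling Ruby can be sketched with nothing but regular expressions. The following is a simplified illustration, not Calabash's or Cucumber's actual code: the `StepRegistry` class and the strings the blocks return are hypothetical stand-ins. Each step definition pairs a pattern with a block, and the quoted parts of the step text are captured and passed to the block as arguments. The leading keyword ("Then") is assumed to have been stripped before matching.

```ruby
# Simplified illustration of regex-based step matching (hypothetical class).
class StepRegistry
  def initialize
    @steps = {}
  end

  # Register a block under a pattern, much like a step definition.
  def define(pattern, &block)
    @steps[pattern] = block
  end

  # Run the first definition whose pattern matches the step text,
  # passing the captured groups to its block.
  def run(text)
    @steps.each do |pattern, block|
      match = pattern.match(text)
      return block.call(*match.captures) if match
    end
    raise ArgumentError, "Undefined step: #{text}"
  end
end

registry = StepRegistry.new

registry.define(/^I should see "([^"]*)"$/) do |expected|
  # A real step would query the device here; we just describe the action.
  "assert screen contains: #{expected}"
end

registry.define(/^I select module "([^"]*)"$/) do |name|
  "tap the module named: #{name}"
end

puts registry.run('I should see "Error Occurred"')
# prints: assert screen contains: Error Occurred
```

A step with no matching pattern raises an error, which corresponds to the "undefined step" report you get when a Gherkin line has no definition behind it.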

For more control and cleaner code we specify our own steps as well. For example, each of our tests must begin with the installation of some CommCare app. So we defined a step to install an app from a local resource:

# perform offline install using ccz file pushed from repository
Then (/^I install the ccz app at "([^\"]*)"$/) do |path|
  press_menu_button()
  tap_when_element_exists("* {text CONTAINS[c] 'Offline install'}")
  push("features/resource_files/ccz_apps/%s" % path, "/sdcard/%s" % path)
  step("I enter \"storage/emulated/0/%s\" into input field number 1" % path)
  hide_soft_keyboard()

  # get around bug where the install button is disabled after entering text
  perform_action('set_activity_orientation', 'landscape')
  perform_action('set_activity_orientation', 'portrait')
  sleep 1

  tap_when_element_exists("* {text CONTAINS[c] 'Install App'}")
  wait_for_element_exists("* id:'edit_password'", timeout: 6000)
end

This ought to give a better idea of what Calabash actually runs on the application. Here we navigate to the install screen, push our test app from test resources into the application’s storage space, start the install, and then wait for the login screen to appear. At this level we’re writing Ruby rather than Gherkin, so the code is much less readable but more precise. Ideally our developers would be able to define enough robust steps that a technical non-developer could write their own tests. And indeed this has happened, with a member of our QA team writing a large portion of our test plan.

The commented portion in the middle hints at a persistent weakness of Calabash that we’ve discovered: we frequently run into UI quirks such as views becoming “invisible” due to screen rotation or keyboard focus. Most frustratingly, these issues are often inconsistent, so a test will pass a few times before failing when a device happens to run more slowly than before. We’ve always been able to work around these issues, and can usually hide the fix in a step, so this isn’t a showstopper. However, these bugs do slow down development.
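One general way to cope with timing-dependent failures is to poll for a condition instead of sleeping for a fixed interval. Calabash already ships waits of this kind (the step above uses `wait_for_element_exists`), so the following is only a plain-Ruby sketch of the underlying idea, with a hypothetical helper name `wait_until` and made-up defaults. It is not the fix we used for any particular bug.

```ruby
# Hypothetical helper: re-check a condition until it holds or time runs out.
# Returns the block's first truthy result; raises if the deadline passes.
def wait_until(timeout: 10, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "Condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end

# Usage: the counter stands in for a view query that only succeeds
# once a slow device has finished redrawing the screen.
attempts = 0
wait_until(timeout: 2, interval: 0.1) do
  attempts += 1
  attempts >= 3
end
puts "condition met after #{attempts} attempts"
```

Compared with a bare `sleep 1`, polling returns as soon as the condition holds on a fast device and keeps waiting on a slow one, which is the case that tends to produce intermittent failures.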

In Part 2 I’ll go into how these tests are packaged and run on Jenkins.
