End-to-End mobile testing on physical devices with Device Farm

AHMED ELSENOUSI
My Local Farmer Engineering
8 min read · Dec 27, 2022

TL;DR
This blog post demonstrates an opinionated approach to end-to-end mobile testing on real devices. It uses physical devices provided by Device Farm and integrates with a CI/CD pipeline. You can clone the companion code, then use and adapt it to build your own CI/CD pipeline with end-to-end mobile device testing.

One day we were reviewing our analytics data and realised that visits to one category of products were far below average. Looking closer, we noticed that the affected users had something in common: they were all using old Android device models.

Disclaimer
I Love My Local Farmer is a fictional company inspired by customer interactions with AWS Solutions Architects. Any stories told in this blog are not related to a specific customer. Similarities with any real companies, people, or situations are purely coincidental. Stories in this blog represent the views of the authors and are not endorsed by AWS.

Things didn’t make sense, and we couldn’t investigate any further without getting hold of one of those old phones and testing on it ourselves, but no one on our team owned one. Eventually we found a colleague who did and who volunteered to test. It turned out that this category failed to render and was missing from the list of choices. That explains why so many users on these phones never ordered from this section: they didn’t know it existed. The problem was a race condition caused by some unoptimised JavaScript logic. It only occurred when processing was slow enough, which is why we never saw it on newer devices.
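
To make the failure mode concrete, here is a deliberately simplified model of that kind of race. It is not our actual code: the function and field names, the 300 ms render point, the load times and the 4x slowdown are all hypothetical. It only illustrates how a category that finishes loading after the menu is rendered goes missing on a slow device while showing up fine on a fast one.

```typescript
// Hypothetical model of the race: the menu renders at a fixed point in time,
// and only categories that have finished loading by then are shown.
interface CategoryLoad {
  name: string;
  // Work needed to prepare the category, measured on a fast reference device.
  baseLoadMs: number;
}

// slowdownFactor models the device: 1 for a fast phone, larger for older hardware.
function renderedCategories(
  loads: CategoryLoad[],
  renderAtMs: number,
  slowdownFactor: number,
): string[] {
  return loads
    .filter((load) => load.baseLoadMs * slowdownFactor <= renderAtMs)
    .map((load) => load.name);
}

const loads: CategoryLoad[] = [
  { name: 'Vegetables', baseLoadMs: 40 },
  { name: 'Dairy', baseLoadMs: 120 }, // heavier processing, assumed for illustration
];

// Fast device: both categories are ready before the 300 ms render.
const onNewPhone = renderedCategories(loads, 300, 1); // ['Vegetables', 'Dairy']
// Old device, assumed 4x slower: 'Dairy' needs 480 ms and misses the render.
const onOldPhone = renderedCategories(loads, 300, 4); // ['Vegetables']
```

Nothing in this model is wrong on the fast device, which is exactly why this class of bug stays hidden until the code runs on slower hardware.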

We investigated Android device fragmentation and concluded that, for comprehensive test coverage, we needed to test on 20 different devices. This is when we discovered an AWS service called Device Farm, which gave us access to thousands of different types of phones. When playing around with the service, we could simply pick any phone type and control it remotely by hand. The interesting part was that these are not emulators but real physical devices.

Previously we had done some end-to-end testing, but the focus was usually on cross-browser compatibility. We had never considered testing on different hardware to catch performance-related issues. It would be good to detect such issues before deployment, so testing on different devices with Device Farm should be automated and part of our CI/CD pipeline.

Device Farm

Device Farm is an AWS service that offers 2500+ different mobile devices, running different OS versions, for testing both mobile and web apps.

To start, we have to create a project and select a pool of mobile devices we want our tests to run on.

Next we package (zip) our tests. We used Appium tests written in Java, which is one of the frameworks Device Farm supports. Then we upload the packaged tests. This last part sounded strange to us at first: why do we need to upload our test files? Shouldn’t we just run our tests against a Selenium endpoint and be done with it? Apparently not with mobile testing on Device Farm. All of its devices are physically located in the US, so if you are testing from far away, connection latency can make test scripts flaky and unreliable, for example through connection timeouts.

Running the tests from your own machine against remote devices is what we usually call the client-server model. Device Farm uses a server-side testing model instead: you upload your test scripts, and Device Farm runs them for you on a server that is physically close to the device. It then sends you back the results, avoiding any latency issues related to distance.
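
A rough back-of-the-envelope sketch shows why distance matters. Each WebDriver command is an HTTP request and response, so every command pays one network round trip. The command count and the round-trip times below are made-up assumptions for illustration, not measurements.

```typescript
// Rough estimate of the time a test spends purely waiting on the network.
// Each WebDriver command is one HTTP request/response, so it costs one round trip.
function networkOverheadMs(commandCount: number, roundTripMs: number): number {
  return commandCount * roundTripMs;
}

// Assumed numbers, for illustration only.
const commandsPerTest = 200;
const remoteRoundTripMs = 250; // test script far away from the device
const localRoundTripMs = 2; // test script running next to the device

const remoteOverheadMs = networkOverheadMs(commandsPerTest, remoteRoundTripMs); // 50000
const localOverheadMs = networkOverheadMs(commandsPerTest, localRoundTripMs); // 400
```

Under these assumptions a single test spends 50 seconds waiting on the network when driven from far away, versus under half a second when the script runs next to the device. The longer each command is in flight, the more likely one of them hits a timeout.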

After running our tests, we get access to the test results for each device, along with artefacts such as videos, screenshots and device logs. These are very useful when tests fail: you can use the logs to see which errors were thrown, then watch the videos or screenshots to see how they happened.

We can even see stats like CPU and memory usage.

OK, all this sounds good, but obviously we can’t do this manually every time we deploy. We need a way to automate all of it and make it part of our CI/CD pipeline. This is where a CodePipeline plugin comes into play.

CodePipeline Plugin

In CodePipeline we can configure multiple stages for different purposes. In our scenario we require three stages:

  1. Source — to fetch the tests from the source code
  2. Build — to build, package & zip the tests
  3. Testing — to trigger the tests

For each stage, CodePipeline has a list of preconfigured action types that you can use. I will go through each one next. But first, below is our architecture; we will be building all of our components using AWS CDK.

CDK

In CDK, each AWS resource is represented by a building block called a ‘Construct’. Here we need four of them:

1. CodeCommit

This creates our code repository.

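import { aws_codecommit } from 'aws-cdk-lib';

const repository = new aws_codecommit.Repository(this, 'Repo', {
  repositoryName: 'ilovemylocalfarmer',
  // Seed the 'main' branch with the contents of the local 'app' directory
  code: aws_codecommit.Code.fromDirectory('app', 'main'),
});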

2. CodeBuild

This creates our CodeBuild project and configures it to run the Maven command on the source code to package our tests. In the artefacts section we define which files should be stored for later use, which in our case is the zip file generated by our mvn command.

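import { aws_codebuild } from 'aws-cdk-lib';

const build = new aws_codebuild.Project(this, 'Build', {
  source: aws_codebuild.Source.codeCommit({ repository }),
  buildSpec: aws_codebuild.BuildSpec.fromObject({
    version: '0.2',
    artifacts: {
      // The buildspec expects a list of file paths here
      files: ['target/zip-with-dependencies.zip'],
    },
    phases: {
      build: {
        // Compile and package the tests into the zip file listed above
        commands: ['mvn clean package'],
      },
    },
  }),
});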

3. Device Farm

Unfortunately, CDK doesn’t provide a Construct for Device Farm, but luckily it doesn’t leave us hanging. CDK lets us create a custom resource for such situations: if a component can be created through API calls, we can use those calls in our custom resource. In the example below I use two Device Farm APIs (createProject and deleteProject) and place them in the appropriate lifecycle hooks that the custom resource provides (onCreate and onDelete). When we deploy our CDK code, createProject is called; when we destroy our infrastructure, deleteProject removes the Device Farm project so that the teardown is clean.

import { custom_resources } from 'aws-cdk-lib';

const devicefarm = new custom_resources.AwsCustomResource(this, 'DeviceFarm', {
  // Called on deploy: create the Device Farm project
  onCreate: {
    service: 'DeviceFarm',
    action: 'createProject',
    parameters: {
      name: 'mobile-test',
    },
    // Remember the project ARN so it can be passed to onDelete
    physicalResourceId: custom_resources.PhysicalResourceId.fromResponse('project.arn'),
  },
  // Called on destroy: delete the project created above
  onDelete: {
    service: 'DeviceFarm',
    action: 'deleteProject',
    parameters: {
      arn: new custom_resources.PhysicalResourceIdReference(),
    },
  },
  policy: custom_resources.AwsCustomResourcePolicy.fromSdkCalls({
    resources: custom_resources.AwsCustomResourcePolicy.ANY_RESOURCE,
  }),
});
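
To see what the custom resource does with these two hooks, here is a small stand-alone simulation. It does not use CDK or the AWS SDK: `FakeDeviceFarm` and `CustomResourceLifecycle` are hypothetical stand-ins, and the ARN format is invented. The point is only that the ARN returned by the create call is remembered as the physical resource ID and handed back to the delete call.

```typescript
// Hypothetical in-memory stand-in for the two Device Farm calls used above.
class FakeDeviceFarm {
  readonly projects = new Map<string, string>(); // arn -> name
  private counter = 0;

  createProject(params: { name: string }): { project: { arn: string } } {
    this.counter += 1;
    const arn = `arn:fake:project:${this.counter}`; // invented ARN format
    this.projects.set(arn, params.name);
    return { project: { arn } };
  }

  deleteProject(params: { arn: string }): void {
    this.projects.delete(params.arn);
  }
}

// Simplified model of the lifecycle: onCreate stores the physical resource ID,
// onDelete passes it back so the same project is removed.
class CustomResourceLifecycle {
  private readonly client: FakeDeviceFarm;
  private physicalResourceId?: string;

  constructor(client: FakeDeviceFarm) {
    this.client = client;
  }

  onCreate(name: string): string {
    const response = this.client.createProject({ name });
    this.physicalResourceId = response.project.arn; // like fromResponse('project.arn')
    return this.physicalResourceId;
  }

  onDelete(): void {
    if (this.physicalResourceId === undefined) {
      return; // nothing was created, nothing to clean up
    }
    this.client.deleteProject({ arn: this.physicalResourceId });
    this.physicalResourceId = undefined;
  }
}

const fakeClient = new FakeDeviceFarm();
const lifecycle = new CustomResourceLifecycle(fakeClient);

const createdArn = lifecycle.onCreate('mobile-test'); // simulates "deploy"
const projectsAfterDeploy = fakeClient.projects.size; // 1
lifecycle.onDelete(); // simulates "destroy"
const projectsAfterDestroy = fakeClient.projects.size; // 0
```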

We also need to create a pool of the types of devices we want to run our tests on. Below we create a device pool and explicitly restrict it to Android devices with an OS version older than 9.

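// Assumption: the project ARN comes from the custom resource created above
const deviceFarmArn = devicefarm.getResponseField('project.arn');
const devicePool = new custom_resources.AwsCustomResource(this, 'DevicePool', {
  onCreate: {
    service: 'DeviceFarm',
    action: 'createDevicePool',
    parameters: {
      name: 'Old Androids',
      maxDevices: 2,
      projectArn: deviceFarmArn,
      rules: [
        { attribute: 'PLATFORM', operator: 'EQUALS', value: '"ANDROID"' },
        { attribute: 'OS_VERSION', operator: 'LESS_THAN', value: '"9.0.0"' },
      ],
    },
    physicalResourceId: custom_resources.PhysicalResourceId.fromResponse('devicePool.arn'),
  },
  // Assumed cleanup hook and policy, mirroring the project resource above
  onDelete: {
    service: 'DeviceFarm',
    action: 'deleteDevicePool',
    parameters: {
      arn: new custom_resources.PhysicalResourceIdReference(),
    },
  },
  policy: custom_resources.AwsCustomResourcePolicy.fromSdkCalls({
    resources: custom_resources.AwsCustomResourcePolicy.ANY_RESOURCE,
  }),
});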
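
One detail that is easy to miss in the rules above is the double quoting: the value is `'"ANDROID"'`, not `'ANDROID'`, so the string carries its own double quotes. A small helper can produce that form with `JSON.stringify`. The `DeviceRule` type and the `stringRule` function are our own inventions for this sketch, not part of any AWS library.

```typescript
// Hypothetical helper type for this sketch (not an AWS SDK type).
interface DeviceRule {
  attribute: string;
  operator: string;
  value: string;
}

// Builds a rule whose value carries its own double quotes, as in the snippet above.
function stringRule(attribute: string, operator: string, rawValue: string): DeviceRule {
  return { attribute, operator, value: JSON.stringify(rawValue) };
}

const oldAndroidRules: DeviceRule[] = [
  stringRule('PLATFORM', 'EQUALS', 'ANDROID'),
  stringRule('OS_VERSION', 'LESS_THAN', '9.0.0'),
];

// The values now include the surrounding quotes: "ANDROID" and "9.0.0".
const ruleValues = oldAndroidRules.map((rule) => rule.value);
```

The resulting array could be passed as the `rules` parameter in place of the hand-written literals.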

4. CodePipeline

Now we create the pipeline that uses all of the above components in the stages we mentioned earlier. In the Construct we define the action type for each stage and point it to the related AWS service. I’ll go through each config below.

Stage 1 — Code Source

In the source stage, we use the CodeCommit action type provider and point it to our repo. We set the Git branch to ‘main’ and configure the output artefact (an artefact is a set of files that the pipeline stores in an S3 bucket). The files in the output artefact will be used by the next stage.

{
  name: 'Source',
  actions: [{
    name: 'RepositoryName',
    actionTypeId: {
      category: 'Source',
      owner: 'AWS',
      provider: 'CodeCommit',
      version: '1',
    },
    runOrder: 1,
    configuration: {
      BranchName: 'main',
      PollForSourceChanges: 'false',
      RepositoryName: 'ilovemylocalfarmer',
    },
    // Files fetched from the repo, handed to the Build stage
    outputArtifacts: [{
      name: 'srcOutput',
    }]
  }]
},

Stage 2 — Build

In the build stage, we use the CodeBuild action provider to trigger the CodeBuild project we created earlier. It packages and zips our tests and outputs the result to another artefact, making it available for the next stage to pick up.

{
  name: 'Build',
  actions: [{
    // Action names cannot contain spaces
    name: 'BuildTestPackage',
    actionTypeId: {
      category: 'Build',
      owner: 'AWS',
      provider: 'CodeBuild',
      version: '1',
    },
    configuration: {
      // Name of the CodeBuild project created earlier
      ProjectName: 'CodeBuildProjectName',
    },
    runOrder: 1,
    inputArtifacts: [{
      name: 'srcOutput',
    }],
    // Packaged tests, handed to the Test stage
    outputArtifacts: [{
      name: 'buildOutput',
    }]
  }]
},

Stage 3 — Test

Finally, the test stage uses the Device Farm action provider (the plugin we have been talking about), which performs all of the steps I described earlier. It takes the file from the previous output artefact, uploads it to Device Farm and runs the tests on the devices in the pool I created earlier.

{
  name: 'Test',
  actions: [
    {
      // Action names cannot contain spaces
      name: 'DeviceFarm',
      region: 'us-west-2',
      actionTypeId: {
        category: 'Test',
        owner: 'AWS',
        provider: 'DeviceFarm',
        version: '1',
      },
      configuration: {
        ProjectId: '9152545a-7b...',
        DevicePoolArn: 'arn:aws:devicefarm:us-west-2::devicepool:...',
        AppType: 'Web',
        // Path of the test package inside the input artefact
        Test: 'target/zip-with-dependencies.zip',
        TestType: 'APPIUM_WEB_JAVA_TESTNG',
      },
      runOrder: 1,
      inputArtifacts: [{
        name: 'buildOutput',
      }]
    }]
}
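
Since the three stages are connected only through artefact names, a typo in one of them breaks the chain. The sketch below is a plain TypeScript check, independent of CDK, that every input artefact was produced by an earlier stage. The `StageSketch` type and the `findMissingInputs` function are hypothetical names for illustration; they mirror only the artefact names from the stage definitions above.

```typescript
// Minimal shape of a stage for this check (hypothetical, not a CDK type).
interface StageSketch {
  name: string;
  inputArtifacts: string[];
  outputArtifacts: string[];
}

// Returns "stage:artifact" entries for inputs that no earlier stage produced.
function findMissingInputs(stages: StageSketch[]): string[] {
  const produced = new Set<string>();
  const missing: string[] = [];
  for (const stage of stages) {
    for (const input of stage.inputArtifacts) {
      if (!produced.has(input)) {
        missing.push(`${stage.name}:${input}`);
      }
    }
    for (const output of stage.outputArtifacts) {
      produced.add(output);
    }
  }
  return missing;
}

// The same artefact names as in the three stages above.
const pipelineStages: StageSketch[] = [
  { name: 'Source', inputArtifacts: [], outputArtifacts: ['srcOutput'] },
  { name: 'Build', inputArtifacts: ['srcOutput'], outputArtifacts: ['buildOutput'] },
  { name: 'Test', inputArtifacts: ['buildOutput'], outputArtifacts: [] },
];
const missingInputs = findMissingInputs(pipelineStages); // []

// A deliberately misspelled input name in the Test stage is reported.
const brokenStages: StageSketch[] = [
  { name: 'Source', inputArtifacts: [], outputArtifacts: ['srcOutput'] },
  { name: 'Build', inputArtifacts: ['srcOutput'], outputArtifacts: ['buildOutput'] },
  { name: 'Test', inputArtifacts: ['buildOuput'], outputArtifacts: [] },
];
const missingInBroken = findMissingInputs(brokenStages); // ['Test:buildOuput']
```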

The Pipeline

After we deploy the CDK app, we have a complete pipeline that pulls the test code from the repo, builds it, and runs the tests on real physical devices.

Conclusion

Performance-related issues can be a real pain and tricky to catch. This is why a solution like Device Farm gave us peace of mind; otherwise it would have been very difficult to think through the alternatives and their implications. If we were to buy a bunch of devices to keep around for testing, which ones would we buy? How would we automate them? How would we maintain them? In this example I demonstrated how we can automate e2e testing on real hardware devices instead. Ideally, the next steps would be to have this for multiple environments (dev, staging and prod) and to add a deployment stage to the pipeline that is blocked if tests in the previous environment fail. Here is the complete CDK code for this example. Enjoy!
