A Saga of Improvement in Android App Performance — Part 2
Introduction
In Part 1, we learned about the four-step improvement cycle of Monitor, Profile, Fix & Validate, and Deploy. We also looked at how production application performance is monitored at Tokopedia.
In this part of the blog series, we will continue with the Monitoring step and discuss the following processes:
- Measuring the performance of the pre-production (internal) builds.
- PR checker to validate the changes before merging.
Pre-Production Test Pipeline
We have an in-house Jenkins test pipeline that measures the performance of the application on the development and release candidate branches.
We first run the UI automation test on Firebase Test Lab, then download the logs and parse the performance data. We push the relevant data to a MySQL database, which is used to display pre-production performance graphs on the dashboard.
We run a performance job on our main development branch and release candidate branch every night, on multiple devices with different configurations.
The image below shows the architecture of the pre-production performance test pipeline.
Let’s understand the important steps in detail:
Jenkins Job — Tokopedia application
We run this job every night on the development and release candidate branches to build a production variant of the application with Firebase debug logs enabled and minification disabled (minification is disabled so that the debug logs are not removed).
To support this, we added configuration through a custom build parameter and inject build variables into the application's manifest file.
After successful completion, this job triggers a downstream Jenkins job to run the performance test.
Jenkins Job — Build Test App
We use a separate black-box test app to run the UI automation test against the Tokopedia application and measure the performance of the desired user flow. We build the test APK whenever we change the test app.
You can refer to the sample UI test app skeleton at the link below.
https://github.com/vishalgupta1987/android-ui-test-skeleton
Jenkins Job — Pre Production Performance Test
This job is responsible for running the performance test and saving the results. Let’s look at its internal steps in more detail:
- Download the last successful Tokopedia application and test application builds using the wget command.
wget --auth-no-challenge http://$USER:$PASS@<Jenkins-Server-IP>/job/<JOB_NAME>/lastSuccessfulBuild/artifact/<APK_FILE_NAME>
- Run the test on Firebase Test Lab using the gcloud run command.
# You can change the model name and OS version as needed
MODEL_NAME="walleye"
OS_VERSION="26"
# You need to modify --test-targets as needed
gcloud firebase test android run \
  --test-targets="class com.sampleapp.blackbox.test.SampleEspressoTest" \
  --app=universal.apk \
  --test=app-debug-androidTest.apk \
  --results-history-name='Daily Android Performance Test' \
  --device-ids=$MODEL_NAME \
  --os-version-ids=$OS_VERSION \
  --directories-to-pull=/sdcard/Android/data/<package_name>/files 2>&1 | tee -a gcloud_response.txt
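Since the nightly job covers several devices, one way to express that in a single invocation is gcloud's repeatable `--device model=...,version=...` flag. The following is a minimal sketch, not the pipeline's actual configuration: the `DEVICES` list, the second device entry, and the function names are assumptions made for illustration.

```shell
# Hypothetical device matrix: space-separated "<model>:<os_version>" pairs (example values).
DEVICES="walleye:26 blueline:28"

# Print one "--device model=...,version=..." flag per matrix entry, on a single line.
build_device_args() {
  args=""
  for entry in $1; do
    args="$args --device model=${entry%%:*},version=${entry##*:}"
  done
  # Trim the leading space left by the first concatenation.
  echo "${args# }"
}

# Defined but not invoked here: run the test once across the whole matrix.
run_on_matrix() {
  # The command substitution is left unquoted on purpose so that
  # the flags are split into separate arguments.
  gcloud firebase test android run \
    --app=universal.apk \
    --test=app-debug-androidTest.apk \
    $(build_device_args "$DEVICES")
}
```

Calling `build_device_args "$DEVICES"` on its own is a cheap way to check the generated flags before spending Test Lab quota.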
- Download the device logs and test results from the Firebase Test Lab default cloud storage bucket using the gsutil command below.
path=$(grep "Raw results will be stored in your GCS bucket at" gcloud_response.txt | sed 's/Raw results will be stored in your GCS bucket at \[https:\/\/console.developers.google.com\/storage\/browser\///' | sed 's/\/\]//')
path="gs://$path/$DEVICE_NAME/"
gsutil -m cp -r "$path" .
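The grep and two sed calls above can be folded into a single sed expression. This is a sketch under the assumption that gcloud prints the results line in the form implied by the pattern above, that is, `Raw results will be stored in your GCS bucket at [https://console.developers.google.com/storage/browser/<bucket>/<path>/]`. The function name `extract_gcs_path` is hypothetical.

```shell
# Print the "<bucket>/<path>" part of the results URL found in a gcloud log file.
# Assumes the line format implied by the grep pattern used in the pipeline script.
extract_gcs_path() {
  sed -n 's|.*Raw results will be stored in your GCS bucket at \[https://console\.developers\.google\.com/storage/browser/\(.*\)/].*|\1|p' "$1"
}
```

It could then be used as `path="gs://$(extract_gcs_path gcloud_response.txt)/$DEVICE_NAME/"`. If the line is missing, the function prints nothing, which makes a failed run easy to detect before calling gsutil.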
- Parse the Firebase performance logs from the device log file and insert them into the database. For example, the sample shell script below parses the TraceMetric logs and stores them in the MySQL database.
# Extract every TraceMetric line and drop its last two characters
grep "Logging TraceMetric" "$DEVICE_NAME/logcat" | rev | cut -c3- | rev | while read -r line ; do
  # Keep only the text after the "Logging TraceMetric - " marker
  line=${line#*Logging TraceMetric - }
  # The last space-separated field is the duration; everything before it is the trace name
  TRACE_NAME=${line% *}
  TRACE_TIME=${line##* }
  echo "$line"
  echo "$TRACE_NAME"
  echo "$TRACE_TIME"
  mysql --host="$DB_HOST" --port="$DB_PORT" --user="$DB_USER" --password="$DB_PASS" android-perf-stats -e "INSERT INTO \`$DB_TABLE_TRACE\` (\`id\`, \`trace_name\`, \`duration\`, \`test_matrix\`, \`branch_name\`, \`date\`, \`tag\`, \`os_version\`, \`device_name\`) VALUES (NULL, '$TRACE_NAME', '$TRACE_TIME', '$test_matrices_id', '$BRANCH', '$NOW', '$TAG', '$OS_VERSION', '$DEVICE_NAME')"
done
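Separating the parsing from the database insert makes the parsing step easy to check against a saved logcat file. The sketch below assumes that each relevant line ends with `Logging TraceMetric - <name> <duration>ms`, which is what the two-character trim in the script above suggests; the exact log format depends on the Firebase SDK version, and `parse_trace_metrics` is a hypothetical name.

```shell
# Read logcat text on stdin and print "<trace_name>,<duration>" for each trace.
# Assumes matching lines end with "Logging TraceMetric - <name> <duration>ms".
parse_trace_metrics() {
  sed -n 's/.*Logging TraceMetric - \(.*\) \([0-9.]*\)ms$/\1,\2/p'
}
```

A possible usage is `parse_trace_metrics < "$DEVICE_NAME/logcat" | while IFS=, read -r name duration; do ...; done`, with the insert statement placed inside the loop.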
- Send a Slack notification with the test status and performance results using the Slack webhooks API.
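The notification step can be sketched with curl and a Slack incoming webhook, which accepts a JSON body with a `text` field. This is an illustration only: `SLACK_WEBHOOK_URL` and both function names are assumptions, and the message is assumed to be a single line of plain text.

```shell
# Build the JSON body for a Slack incoming webhook from a plain-text message.
# Backslashes and double quotes are escaped; the message is assumed to be a single line.
build_slack_payload() {
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"text":"%s"}' "$escaped"
}

# Defined but not invoked here: post the message to the webhook.
# SLACK_WEBHOOK_URL is a hypothetical variable holding the webhook URL.
notify_slack() {
  curl -sS -X POST -H 'Content-type: application/json' \
    --data "$(build_slack_payload "$1")" "$SLACK_WEBHOOK_URL"
}
```

For example, `notify_slack "Nightly performance test finished on $BRANCH"` would post a one-line status message.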
Pre-Production Performance Dashboard
We built an internal dashboard in Data Studio to view the performance data. It uses MySQL as the data source and plots the statistics of important pages along with their health relative to the target.
We use color-coding to represent the metric health based on the target we define.
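The idea of mapping a metric to a health color can be sketched as a small function. The thresholds here are invented for illustration (green at or below the target, yellow within 10% above it, red beyond that) and are not the mapping used on the actual dashboard; `metric_health` is a hypothetical name.

```shell
# Classify a metric as green, yellow, or red relative to its target (lower is better).
# The 10% tolerance band is a made-up example, not the dashboard's real mapping.
metric_health() {
  awk -v value="$1" -v target="$2" 'BEGIN {
    if (value <= target)            print "green"
    else if (value <= target * 1.1) print "yellow"
    else                            print "red"
  }'
}
```

For instance, `metric_health "$TRACE_TIME" 1000` would classify a trace duration against a 1000 ms target.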
You can refer to the metrics health color-coding mapping & sample dashboard below:
PR Checker
We built a PR checker Jenkins job to measure the performance impact of the changes made in a particular branch.
It is triggered by the GitHub comment “@sonodabot check performance”, runs the performance test described above, and publishes the result back to the PR as a GitHub comment after completion.
This job saves the data with a different tag in the database.
We prepare the report by comparing the data with the base result from the daily build and calculating the percentage change in each metric. A shell script pulls all the data from the MySQL database to prepare the report. Based on the threshold values, the checker blocks the PR if performance degrades.
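The comparison against the daily baseline can be sketched as follows. This assumes that lower values are better (for example, durations), that the baseline is non-zero, and that a single threshold applies to every metric; `THRESHOLD_PERCENT` and the function names are hypothetical and do not reflect the real checker's configuration.

```shell
# Print the percentage change of a PR metric against its baseline, to two decimals.
# Arguments: <baseline> <pr_value>. The baseline is assumed to be non-zero.
change_percent() {
  awk -v base="$1" -v pr="$2" 'BEGIN { printf "%.2f", (pr - base) / base * 100 }'
}

# Succeed (exit 0) when the change stays within the threshold, fail otherwise.
# THRESHOLD_PERCENT is a hypothetical setting; it defaults to 5 here.
within_threshold() {
  awk -v change="$(change_percent "$1" "$2")" -v limit="${THRESHOLD_PERCENT:-5}" \
    'BEGIN { exit !(change <= limit) }'
}
```

A job could then fail the build with something like `within_threshold "$BASE_DURATION" "$PR_DURATION" || exit 1`, where both variables hold values read from the database.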
You can refer to the PR Checker sample report example below:
Summary
By now, we have covered the overall framework and test pipeline used to monitor and validate the pre-production performance of the Android application at Tokopedia.
In Part 3 of this series, I will cover how we identify and profile performance issues in the application.
Stay Tuned 🙌 Happy Reading 🙏
References
https://firebase.google.com/docs/test-lab/
https://www.jenkins.io/download/
https://cloud.google.com/sdk/gcloud/reference/firebase/test/android/run