Launched last Saturday (Mar 21, 2020), TraceTogether is a mobile app developed by GovTech for community-driven contact tracing to mitigate the spread of COVID-19.
Mar 24, 2020 Update: GovTech rolled out an update, from 1.0.24 to 1.0.30. New findings relevant to “#2–Data Records & SQLite Storage” and “#3–21 Day Data Retention Policy?” sections are appended below each section.
The release of TraceTogether piqued my interest as a security professional and technologist. GovTech published 9 geeky myth-busting facts you need to know about TraceTogether, hoping to debunk concerns over TraceTogether’s security and privacy. Those concerns came fast and furious, to nobody’s surprise, given how rampant fake news is and how much skepticism surrounds a state-built app.
As much as I appreciated GovTech’s attempt to explain the application and assure the public of its security and privacy controls, the article was unable to satisfy the technologist in me: I had to dig deeper. What better way than to revisit my old friends: apktool, dex2jar, jadx and jd-gui.
Instead of detailing a full step-by-step tear-down, I’ve chosen to selectively highlight facts I find interesting. I picked TraceTogether’s Android APK as the target as it is more accessible for d̵i̵s̵a̵s̵s̵e̵m̵b̵l̵y̵ inspection than iOS applications. Details within sections get progressively deeper, so simply skip ahead to the next section should it be too much to follow.
#1 — TraceTogether is Serverless
What stood out to me immediately was the absence of backend host strings (e.g. api.myawesomeapp.com) or API calls, which is (almost) always the first thing to look out for when inspecting applications. Developers of applications with significant resources or risk typically harden their applications by packaging critical functionality such as encryption and data transport in low level libraries and calling them via JNI, effectively protecting critical functionality from inspection with basic tooling and script kiddies.
However, that’s not the case with TraceTogether: all online functionality points to the Firebase SDK, from user authentication via OTP (PhoneAuthProvider) to record uploading with Firebase Storage.
I am impressed by the modernity of the application’s serverless architecture. Since backend development has been reduced to purely administrative functionality, the development team was able to bring the ‘frontend’ (the mobile applications) to market in a much shorter time, all while enjoying Google-grade scalability and reliability.
#2 — Data Records & SQLite Storage
It is probably no surprise that TraceTogether uses SQLite for record storage. The schema creation statements provide an interesting insight into the data that TraceTogether intends to collect.
Of particular interest is record_table, whose schema and field names hint at the storage of records of other TraceTogether users who were discovered in proximity: the data of interest for contact tracing should it be summoned by the authorities.
id, integer, primary key, autoincrement, not null
timestamp, integer, not null
v, integer, not null
msg, text, not null
org, text, not null
modelP, text, not null
modelC, text, not null
rssi, integer, not null
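For anyone who wants to poke at the structure without an Android device, the column list above can be rebuilt as a throwaway SQLite table. This is only a sketch reconstructed from the listed columns, not the statement shipped in the APK; the helper name `create_record_table` and the in-memory database are illustrative assumptions.

```python
import sqlite3

# Reconstructed from the column list above; the exact statement in the APK may differ.
CREATE_RECORD_TABLE = """
CREATE TABLE IF NOT EXISTS record_table (
    id        INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    timestamp INTEGER NOT NULL,
    v         INTEGER NOT NULL,
    msg       TEXT    NOT NULL,
    org       TEXT    NOT NULL,
    modelP    TEXT    NOT NULL,
    modelC    TEXT    NOT NULL,
    rssi      INTEGER NOT NULL
)
"""


def create_record_table(conn):
    """Create the table and return its column names in declaration order."""
    conn.execute(CREATE_RECORD_TABLE)
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    return [row[1] for row in conn.execute("PRAGMA table_info(record_table)")]


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
columns = create_record_table(conn)
print(columns)
```

Listing the columns back via `PRAGMA table_info` is a quick way to confirm the reconstructed table matches the field names above.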
From GovTech’s article (Myth #3):
When the app is running on your phone, it will create a temporary ID, generated by encrypting the User ID with a private key that is held by MOH. The temporary ID is then exchanged with nearby phones, and renewed regularly, making it impossible for anyone to identify or link the temporary IDs to you. The temporary ID can only be decrypted by MOH, with MOH’s privately-held key. Your phone will store the temporary IDs from nearby phones, together with information about the nearby phone’s model, Bluetooth signal strength, and time. All this information is stored locally on your phone, and not sent to MOH, unless you are contact traced.
From GovTech’s description of the data collected by TraceTogether, 1, 2, 8 and 9 are self-explanatory, with 8 and 9 being the Bluetooth signal strength. 3 thru 7 are the interesting columns, as they store cryptographic hashes and are described by GovTech as the temporary ID and the nearby phone’s model. From the code, the P and C suffixes appended to model are likely references to the Peripheral and Central device concepts in Bluetooth GATT (described in #4 — Jargons & Protocols).
I don’t happen to have an Android device on hand to perform dynamic analysis, which would have allowed inspection of actual records in the SQLite store: this writeup is based on pure static analysis! Static analysis on disassembled code can be laborious as high-level control flow and references are not necessarily preserved, especially with more modern languages (e.g. Kotlin): I found myself having to hunt down method prototypes based on parameter types and intuition when tracing flows for several features of interest.
Mar 23, 2020 Update: 1.0.30
Two new schema alteration statements were added that set default values for the v and org fields in the record_table schema. These changes hint at the purposes of those fields: my educated guess is that v signifies the payload version of the msg field, and org denotes the organisation.
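To illustrate why default values matter in a schema change, here is a sketch of one way such alteration statements could look in SQLite. It assumes, hypothetically, that the statements are of the `ALTER TABLE ... ADD COLUMN ... DEFAULT` form and that an older table lacked the v and org columns. The default values `1` and `'EXAMPLE_ORG'` are placeholders, not the values found in the APK.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical older table that predates the v and org columns.
conn.execute(
    "CREATE TABLE record_table ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "
    "timestamp INTEGER NOT NULL, "
    "msg TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO record_table (timestamp, msg) VALUES (?, ?)",
    (1584921600000, "opaque-temp-id"),
)

# Placeholder defaults: the real values are not reproduced in this writeup.
# SQLite requires a non-NULL default when adding a NOT NULL column.
conn.execute("ALTER TABLE record_table ADD COLUMN v INTEGER NOT NULL DEFAULT 1")
conn.execute(
    "ALTER TABLE record_table ADD COLUMN org TEXT NOT NULL DEFAULT 'EXAMPLE_ORG'"
)

# Rows written before the alteration now report the defaults.
migrated = conn.execute("SELECT v, org FROM record_table").fetchall()
print(migrated)
```

The point of the sketch is that rows stored by the older version stay readable after the upgrade, because the defaults fill in the fields they never had.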
#3 — 21 Day Data Retention Policy?
As far as I can tell, the TraceTogether Android App 1.0.24 published Mar 20, 2020 does not purge records after 21 days: there are no strings that could constitute queries to selectively purge records from the SQLite database. Nor is it likely that an equivalent feature was implemented with some obscure method that went unnoticed when I reviewed the application, though it’s not entirely impossible as my review covered a mishmash of code types.
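For reference, this is roughly the kind of statement one would expect to find if a retention purge existed. It is a hypothetical sketch, not code from the app: the function name `purge_old_records`, the cut-down table, and the assumption that timestamp holds epoch milliseconds are all mine.

```python
import sqlite3

RETENTION_DAYS = 21
MS_PER_DAY = 24 * 60 * 60 * 1000


def purge_old_records(conn, now_ms):
    """Delete rows older than the retention window; return how many were removed."""
    cutoff_ms = now_ms - RETENTION_DAYS * MS_PER_DAY
    cursor = conn.execute(
        "DELETE FROM record_table WHERE timestamp < ?", (cutoff_ms,)
    )
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
# Cut-down table: only the columns the purge needs (assumes epoch-millisecond timestamps).
conn.execute(
    "CREATE TABLE record_table ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, timestamp INTEGER NOT NULL)"
)

now_ms = 1_585_000_000_000  # arbitrary fixed "now" so the example is reproducible
conn.executemany(
    "INSERT INTO record_table (timestamp) VALUES (?)",
    [
        (now_ms - 30 * MS_PER_DAY,),  # older than 21 days
        (now_ms - 22 * MS_PER_DAY,),  # older than 21 days
        (now_ms - 1 * MS_PER_DAY,),   # recent
    ],
)

removed = purge_old_records(conn, now_ms)
remaining = conn.execute("SELECT COUNT(*) FROM record_table").fetchone()[0]
print(removed, remaining)
```

A string resembling that `DELETE ... WHERE timestamp < ?` clause is what a static search of the APK would be expected to turn up if such a purge were implemented this way.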
It may be possible that the developers plan for the feature to be incorporated via an update: there are hints of an update event intent.
In all honesty, it’s no biggie if records are stored beyond 21 days.
Mar 23, 2020 Update: 1.0.30
This remains unchanged: I did not find any change in code that suggests that records are purged after 21 days.
#4 — Jargons & Protocols: StreetPass, GATT, BLE
TraceTogether’s detection of other participating TraceTogether phones works by clever usage of Bluetooth Generic Attribute Profile (GATT) and Bluetooth Low Energy (BLE). This is the meat of TraceTogether. The algorithm and protocol behind this magic have been packaged and marketed as BlueTrace, and from the writeup on their official site the authors intend to make it available to third parties, nice.
Unfortunately, I wasn’t able to do a deep dive into the GATT and BLE components of BlueTrace: it’s a significant amount of code, the bulk of it didn’t disassemble cleanly, and I don’t have the time to examine it with the level of detail required. However, I’d like to highlight and applaud the development team’s novel use of the protocol, as BLE peripherals can only connect to one central device at a time: my hunch is that the algorithm effectively scans for all BLE peripherals in the proximity and cycles through each one for connection.
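The hunch above can be pictured with a toy simulation. This is purely illustrative and not BlueTrace’s actual algorithm: it uses no real Bluetooth API, and `FakePeripheral`, `cycle_peripherals` and the retry limit are hypothetical names and choices made up for the sketch.

```python
from collections import deque


class FakePeripheral:
    """Stand-in for a nearby BLE peripheral that accepts one central at a time."""

    def __init__(self, temp_id):
        self.temp_id = temp_id
        self.connected_to = None

    def connect(self, central_name):
        if self.connected_to is not None:
            return False  # busy with another central
        self.connected_to = central_name
        return True

    def read_temp_id(self):
        return self.temp_id

    def disconnect(self):
        self.connected_to = None


def cycle_peripherals(central_name, scan_results, max_attempts=3):
    """Visit each scanned peripheral in turn, holding one connection at a time."""
    queue = deque((peripheral, 0) for peripheral in scan_results)
    collected = []
    while queue:
        peripheral, attempts = queue.popleft()
        if peripheral.connect(central_name):
            collected.append(peripheral.read_temp_id())
            peripheral.disconnect()  # free it up for the next central
        elif attempts + 1 < max_attempts:
            queue.append((peripheral, attempts + 1))  # busy: retry later
    return collected


nearby = [FakePeripheral("id-a"), FakePeripheral("id-b"), FakePeripheral("id-c")]
nearby[1].connected_to = "other-phone"  # simulate a peripheral that stays busy
collected = cycle_peripherals("my-phone", nearby)
print(collected)
```

The busy peripheral is pushed to the back of the queue and eventually dropped after a few attempts, which is one plausible way to keep a single slow or occupied device from stalling the whole cycle.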
Interestingly, BlueTrace references ‘StreetPass’ internally for Data Parcels. StreetPass happens to be a Nintendo 3DS functionality for passive communication over Wi-Fi, functionally similar to BlueTrace.
#5 — Proudly Made in Singapore 🇸🇬
As a developer myself, I take every opportunity to amuse myself and my colleagues (I hope they are) by littering my code with a healthy sprinkle of colloquial flavour. This is observed with the following two code specimens:
I could have gone down the rabbit hole indefinitely but I had to stop and wrap up the exploration at some point. Although I didn’t manage to answer all the questions I set out with, I’ve validated the application sufficiently to trust it with confidence and I urge everyone to participate and nip this virus in the bud.