Building a new mobile Magnet.me application, and how a Content-Security-Policy issue on Android nearly killed it last-minute

To make it as easy as possible to find a job you love, Magnet.me has mobile apps live for both iOS and Android. Around 18 months ago we decided to redevelop our mobile application. The existing app had a clunky integration with our OAuth authentication layer (it did not support two-factor authentication, for example), and regularly got out of sync feature-wise with our web application.

Picking technologies

From our previous endeavours on mobile, we had learned the hard way how difficult it is to keep three platforms (web, iOS, and Android) in sync. Magnet.me has a relatively small engineering team, and tripling the number of frontends to maintain was definitely too much. It would also make it harder to run experiments. We therefore made it a requirement that we could:

  • Share code between platforms
  • Control the deployment pipeline

We quickly realized that for this to work, we’d need to drop fully native as a strategy, and reuse our existing web flows. As we’d be using web technology on mobile regardless, we decided to drop all native components, and go all-in on web! Given our frontend stack, we then settled on react-native for the remaining bits and pieces.

All in all, the agreed upon architecture looked like this:

  1. A start screen, built in React native,
  2. An authentication layer, which allowed signing in and registering (more on that later), and would support 2FA,
  3. A wrapper around our web application (more on that later too),
  4. Linking the magnet.me domain with our app,
  5. Push notifications

Integrating with OAuth

Magnet.me runs its own OAuth 2.0 server on https://oauth.magnet.me. We use this OAuth server throughout our stack, and mobile would not be an exception. Yet mobile development has some challenges, as the OAuth server needs to yield data back to the client, but cannot assess whether the client is the intended one.

On iOS, Universal Links can be used, which verify the domain connection both ways, but on Android we were stuck with deep linking on a scheme. It’s no real surprise we’d pick the scheme magnetme:, but that would be trivial to figure out for an attacker.

The problem here is that if another application on the same Android device also registers the scheme, it may be the one receiving the deep link. And that deep link would, if the authentication was successful, contain the access token of the authenticated user. Most undesirable!

We opted to remedy this by implementing PKCE, an extension to OAuth. It works relatively simply:

  1. Generate a random string with enough entropy,
  2. Send the user to OAuth as normal, but add a SHA-256 hash of said string to the request,
  3. When the request completes successfully, the deep link is called with an authorization code which we’d need to swap for an access token,
  4. However, to obtain the access token, the client needs to send the original random string along, so the server can verify that this client initiated the request: a malicious application could not possibly know this string!
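The steps above can be sketched roughly as follows. This is a minimal illustration that assumes Node's built-in crypto module (inside a React Native app a crypto library would be needed instead); the function names are hypothetical and not taken from our production code.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Step 1: a random code verifier. 32 random bytes become 43 base64url
// characters, the minimum length RFC 7636 allows for a verifier.
function generateCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// Step 2: the S256 code challenge is the base64url-encoded SHA-256 hash
// of the verifier. Only this value travels with the authorization request.
function deriveCodeChallenge(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}

// Step 4, server side: recompute the challenge from the verifier the client
// presents and compare it with the challenge stored when the flow started.
function verifierMatchesChallenge(verifier: string, storedChallenge: string): boolean {
  return deriveCodeChallenge(verifier) === storedChallenge;
}
```

An application that merely intercepts the deep link only ever sees the authorization code, never the verifier, so it cannot complete the exchange for an access token.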

One of our backend engineers took up the challenge of implementing PKCE and within days we had a working version running in production. Making the required changes in the React native application was easily done as well. Testing on iOS showed it worked, and we were fully confident we’d cracked this case.

At least we thought…

Why not properly test on Android as well, you might ask? Mind you, we were in the middle of a pandemic when all of this happened. Our office had closed and, due to the working-from-home situation, all of our (mobile) test devices got scattered throughout the country at the homes of different designers and developers. For mobile we had to use our own devices, which in this case happened to be two iOS devices and no Android. Lacking test devices, we tested Android in Android Studio.

Running the wrapper

Once the token was obtained using the method described above, we would store the token safely and load the application. As it is largely a wrapper-like application, most people would associate it with bad performance characteristics. We therefore went to great lengths to ensure that our application would feel as native as possible:

  1. All scrolling actions had to mimic native variants, so no “scrolling through jelly”,
  2. No browser actions should trigger, such as text selection,
  3. Backdrops had to be visible near the screen’s edges (Hi there lovely rounded iPhone),
  4. Native sharing should be used,
  5. Logging out in the web application should clear push notification registration,
  6. No delays when switching views.

Additionally we really wanted to control the deployment pipeline, so it was desirable that the app, when foregrounded, would update itself if possible.

Standing on the shoulders of react-native-webview, we were able to control most of the above by injecting specific JavaScript code into the WebView. Using this approach we fixed the scrolling, browser actions, and native sharing.
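To give an idea of what such an injected script can look like, here is a minimal sketch. The CSS rules and names are assumptions made for illustration, not the script we actually ship; it only covers text selection and overscroll.

```typescript
// Hypothetical CSS for two of the fixes: no text selection or long-press
// callouts (except in form fields), and no overscroll effect on the page.
const NATIVE_FEEL_CSS = [
  "* { -webkit-user-select: none; user-select: none; -webkit-touch-callout: none; }",
  "input, textarea { -webkit-user-select: text; user-select: text; }",
  "html, body { overscroll-behavior: none; }",
].join("\n");

// Wraps the CSS in a script that appends a <style> element to the loaded page.
// The trailing `true;` follows the react-native-webview guidance for injected scripts.
function buildInjectedScript(css: string): string {
  return [
    "(function () {",
    "  var style = document.createElement('style');",
    `  style.textContent = ${JSON.stringify(css)};`,
    "  document.head.appendChild(style);",
    "})();",
    "true;",
  ].join("\n");
}

const injectedScript = buildInjectedScript(NATIVE_FEEL_CSS);
```

The resulting string would be passed to the WebView through its injectedJavaScript prop.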

To ensure the UI would respond instantly to user input, we would request all UI JavaScript 2 seconds after the initial view had loaded. That would ensure that upon navigation the UI would update instantly, even if data was still being fetched in the background. This instantaneous update made the application feel much more native-like.

As React Native and the WebView could not easily communicate directly, we used postMessage between them. This took care of the backdrop handling and of unsubscribing from push notifications on logout.
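A minimal sketch of such a bridge on the native side is shown below. The message shapes and handler names are hypothetical; only the use of postMessage for backdrops and logout comes from our setup.

```typescript
// Hypothetical message shapes sent by the web application.
type BridgeMessage =
  | { type: "backdrop"; color: string }
  | { type: "logout" };

// Parses the string payload sent by the web page. Returns null for anything
// that is not valid JSON or not one of the known messages.
function parseBridgeMessage(raw: string): BridgeMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const msg = data as { type?: unknown; color?: unknown };
  if (msg.type === "logout") return { type: "logout" };
  if (msg.type === "backdrop" && typeof msg.color === "string") {
    return { type: "backdrop", color: msg.color };
  }
  return null;
}

// Handlers are passed in, so the native side decides what each message does.
interface BridgeHandlers {
  setBackdrop(color: string): void;
  unregisterPush(): void;
}

// Dispatches one raw message; returns false when the message was ignored.
function handleBridgeMessage(raw: string, handlers: BridgeHandlers): boolean {
  const msg = parseBridgeMessage(raw);
  if (msg === null) return false;
  if (msg.type === "backdrop") handlers.setBackdrop(msg.color);
  else handlers.unregisterPush();
  return true;
}
```

On the web side the page would call window.ReactNativeWebView.postMessage(JSON.stringify({ type: "logout" })), and the WebView's onMessage callback would pass event.nativeEvent.data to handleBridgeMessage.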

By this point we were ready to start internal testing to gather some feedback. We set up the flows in Google Play and the App Store, and the first internal builds were released to our colleagues in Design, Development, Marketing, Commercial, and Customer Success.

Testing has commenced

The first testing round started, and we were feeling pretty proud of ourselves. Not only had we ticked all requirements, we had done so in just 2 weeks! The Marketing and Commercial teams also loved the first builds, and very few bugs were reported.

Then we got the following message from a fellow developer on Slack:

When I try to log in, it opens the secure view to OAuth correctly. I then proceed to login, enter my 2FA code, click Confirm, and then nothing happens at all.

That seemed like a showstopper! We asked our colleague to retry several times, reinstall the application, and whatnot. For this colleague the issue persisted. Other colleagues with Android confirmed the issue.

We dove into the issue, and realized that almost all of our marketing and commercial colleagues were using iOS, and we actually had not yet received any responses from anyone with Android. This message was the first!

Our cross platform application is now non-functional on Android

We were startled: how could a platform-agnostic ™️ platform such as React Native have unique issues on a single platform (we’ve since learned React Native is not as agnostic as we had originally hoped)? We re-tested the flow in Android Studio: no issues. We re-re-tested the flow in Android Studio. No issues. We asked for logs of the device with issues: no mention of anything. This was not looking good.

Android Studio debugging apparently was not going to cut it; we needed our physical devices for debugging. We therefore mandated a company-wide Test Device Return To Office. Everyone with test devices was to either drop them off (if they lived close), or send them in by post. Given our office was still closed, I volunteered to work alone from there to collect all packages and devices. The deafening silence of an office, usually full of energy, felt like an omen.

Once all Android test devices were redistributed to the mobile folks, the hunt began. Test device 1 got the latest version from Android Studio, and we followed the exact reproduction steps, while watching carefully what output the debugger would generate.

It. completed. the. login. flawlessly.

Damnit.

Test device #2 and #3 got loaded in parallel. My nervous fingers were above each touch screen, ensuring the reproduction steps were performed correctly. Being 100% confident I entered everything correctly, I was horrorstruck when both devices closed their secure views and showed the intended job seeker home screens. No bug!

Desperate debugging

What could be another difference between the devices we were testing with, and those our colleagues used daily? We loaded the distributed builds on the devices, and all three encountered the bug. Fearing some Android build magic was to blame, we were now extremely sad. Yet the full device logs using production builds did not yield anything. It was as if the request never hit the OS to start with, as if the problem wasn’t with the mobile application…

If the problem did not lie with the application, it surely had to be something happening before the application would receive the deep link. That narrowed it down to either Android itself doing something weird, or the Chrome custom tab that Android uses for the authentication flow.

As we had some difficulty attaching to the Chrome custom tab in our production Android build, we instead decided to ship temporary exhaustive client logging. We traced the network logs more exhaustively, added all possible exception handlers in JavaScript we could think of, and added reporting to all tooling we had. We re-ran the reproduction steps.

A moment of Eureka

The JavaScript reporting gave no more information. The network logs all looked fine. Our security tooling, however, had triggered, although it looked like a false positive.

The indication was that the browser had blocked a form action towards magnetme://app/oauth2 . That’s exactly the endpoint we are using for the deep link, but a quick inspection showed that exactly this URI was part of the Content Security Policy we used on OAuth.

form-action magnetme://app/oauth 'self' https://oauth.magnet.me https://magnet.me

According to the docs, that seemed to be the exact way of constructing the redirect URI on Android (which we also used on iOS, where everything was functional). So it looked like some glitch.

Nonetheless our curiosity was piqued by the CSP in use, and after some careful consideration we shipped a feature which would allow us to temporarily switch our CSP to report-only mode. We triggered the feature in the backend and ran our reproduction steps for what seemed like the thousandth time. After filling in the required 2FA code and hitting Confirm, the authentication session closed and the web application loaded without issues.
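Such a switch boils down to sending the same policy under a different header name, so that the browser reports violations without blocking anything. A minimal sketch, in which the function and its boolean flag are hypothetical stand-ins for our backend feature:

```typescript
// Returns the header name and value for a policy. Both header names are the
// standard ones; the reportOnly flag is a hypothetical feature switch.
function cspHeader(policy: string, reportOnly: boolean): [string, string] {
  const name = reportOnly
    ? "Content-Security-Policy-Report-Only" // violations are reported, not blocked
    : "Content-Security-Policy"; // violations are blocked
  return [name, policy];
}
```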

Eure-fucking-ka!

The problem with the Content Security Policy

Switching the production CSP to reporting only would fix the problem across all devices. Yet for security reasons we liked the idea of having a CSP. So why would it block an exact match? And only on Chrome custom tabs, and not within the ASWebAuthenticationSession iOS uses?

With some simple tinkering we experimented with loosening the CSP to only include the magnetme: scheme, adding neither a host nor a path. It worked on iOS. And it worked on Android. Re-adding the host caused breakage again on Android.

In the end, with PKCE present as another layer of defence, we decided to relax the CSP to just include magnetme:, leaving out the host and path we’d ideally use.

TLDR

  1. When using OAuth2 for mobile authentication,
  2. Where deep linking is leveraged to pass a code back to the mobile application,
  3. Where a Content Security Policy is present,
  4. Ensure that only the custom scheme is present in the form-action value of the CSP

The custom scheme should include the colon : but no slashes, nor hostname, nor path. An example of a valid form-action would be form-action magnetme:, whereas a faulty one (on Android) would be form-action magnetme://host/path .
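A check like this could be automated. The helper below is a hypothetical sketch (not something we run in production) that flags custom-scheme sources in a form-action directive that contain more than the scheme and its colon.

```typescript
// Returns the sources of a form-action directive that use the given custom
// scheme but add more than `scheme:` (slashes, a host or a path).
function overSpecifiedSchemeSources(directive: string, scheme: string): string[] {
  const [name, ...sources] = directive.trim().split(/\s+/);
  if (name.toLowerCase() !== "form-action") return [];
  const prefix = `${scheme.toLowerCase()}:`;
  return sources.filter((source) => {
    const lower = source.toLowerCase();
    return lower.startsWith(prefix) && lower !== prefix;
  });
}
```

For the directive we started with, the helper returns magnetme://app/oauth; for the relaxed directive it returns an empty list.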

Epilogue

After this bug got squashed, we resumed internal testing (and removed all non-essential added logging). All lights being green, we decided to push both Android and iOS to their respective app stores. Both versions were well received, and currently have a combined user base of over 15% of our Monthly Active Users.

With the applications live for the last couple of months, we have achieved the goal of shipping to all platforms at the same time, and of controlling the deployment pipeline. All with minimal engineering effort.
