Onfido Hack Day (pt. 3) — Immunis

Published in

Onfido Product and Tech

5 min read · Dec 5, 2017

This is the third in a series of blog posts about the projects developed during the Onfido Hackathon in November. If you missed the previous parts, definitely check them out here and here.

Immunis

In this post we are going to talk about Immunis, an object security service meant to transparently scan all objects (files, documents, images, …) uploaded to our services, analyse them for malicious content and, if any is detected, block the files from being used in our pipeline, similar to how the immune system acts on external threats.

Goals

The traditional way to integrate an object security scanner (e.g. anti-virus) is to perform hard, in-line scans and drop, delete or quarantine an object if it is detected as malicious. That is where we are right now. This approach has advantages, such as increased security because objects detected as malicious are stopped immediately, but it also introduces additional latency and potential failure points which could heavily impact speed and stability, and therefore availability. We wanted to find an approach with close to zero impact on speed that still provides a good level of security.

Some of our objectives for the project included:

Ensure that all files we ingest have been checked for potential threats like malware or embedded exploit code before they are accessed by any user or potentially exploitable code in the pipeline.
It should have no practical impact on file upload and processing speed.
Do not introduce a new failure point: our pipeline needs to be able to work even if the object security service fails to operate properly, e.g. due to a bug or high load.
Minimise the changes required to integrate the object security service into our existing services.

Ideas

We had to find a way to access the file objects which are being uploaded with minimal change to our existing services. Some ideas we played around with:

Tap into the imago service (which handles our object uploads centrally) and do the object analysis triggered from there.
Add a reverse proxy which extracts all objects from the uploads and passes them to an asynchronous analyser.
Passive network taps which extract all objects from the network stream.

As we were limited in time and also wanted to play around with new toys (Go in this case), we decided to prototype the second idea with an HTTP/S reverse proxy written in Go, interfacing with ClamAV as the object security analyser.

Initial integration diagram of the object security service — Immunis:

Implementation

The actual implementation was pretty straightforward and consists of two parts.

Part 1 is a Docker container with the following installed: ClamAV and its dependencies, a small, 80-line reverse proxy written in Go, and Go bindings for communicating from the Go service to the local ClamAV Unix socket. It was important for us to avoid writing the extracted object to a file and instead stream the bytes straight into the analyser. This has multiple advantages: we don’t store potentially sensitive information on any volume, not even temporarily, and we save time by skipping the volume and passing on the byte stream directly.

Part 2 is an AWS Lambda function behind API Gateway, which serves two purposes: first, marking files as malicious/infected and, second, answering whether a specific file has been identified as malicious and therefore should no longer be used in the pipeline. Each file is identified by a checksum and all these ‘marks’ are stored in a DynamoDB table.
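The two operations of this API can be sketched as follows. This is only an illustration: an in-memory map stands in for the DynamoDB table, SHA-256 is an assumed choice of checksum (the actual algorithm is not important here), and the names `checksum`, `markStore`, `Mark` and `IsMalicious` are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sync"
)

// checksum returns the hex-encoded SHA-256 digest of data.
// The hash algorithm is an assumption made for this sketch.
func checksum(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// markStore is an in-memory stand-in for the table of 'marks',
// keyed by file checksum.
type markStore struct {
	mu        sync.RWMutex
	malicious map[string]bool
}

func newMarkStore() *markStore {
	return &markStore{malicious: make(map[string]bool)}
}

// Mark records that the file with the given checksum is malicious.
func (s *markStore) Mark(sum string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.malicious[sum] = true
}

// IsMalicious reports whether the checksum has been marked.
// An unknown checksum is reported as not malicious.
func (s *markStore) IsMalicious(sum string) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.malicious[sum]
}
```

Keying on a checksum means any service that holds the file bytes can compute the lookup key itself, without sharing an upload identifier with the proxy.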

The flow of this setup looks like this:

A new file upload (POST) request hits our load balancer and is forwarded to our transparent proxy living in a Docker container.
The Go proxy receives the request and starts two goroutines. One forwards the request to the next destination, our Onfido API services, and the other calls the ClamAV bindings and scans the stream of incoming bytes. The scan takes less than 500ms for the average file sizes we see. Once the scan ends, the result is parsed and pushed to the DynamoDB table via a simple GET request to our Immunis API.
Meanwhile, each component of the pipeline can make a brief GET request to our Immunis API to validate that a file is clear for processing. This validation is flexible and can happen at any point, even as soon as the file hits the Onfido API. It is also easy to switch behaviour at any point and make this proxy synchronous, so that every file scan has to complete before the file is processed any further.

This implementation comes with some risks. While we perform the asynchronous object security scan, the object is already being processed by the first set of microservices in our platform. Those services can check the status of the object via the Immunis API. Depending on its robustness and resilience, a specific service can decide to process the object before an Immunis result is available or, alternatively, stall the processing of that object until the security analysis result is available. On the flip side, this allows us to run the security analysis in parallel while our Onfido services retrieve the file, perform some sanity checks, store it in object storage and perform other standard tasks.

Further improvements for this system could include rewriting the Go proxy as an nginx module for even more flexibility in deployment and for decoupling the extraction of the object from the analysis itself. We could also extend the checks done on each file to be more thorough than just an antivirus scan. For instance, we could scan files with foremost to verify that no hidden files are appended, hand the objects over to BinaryAlert for YARA rule scanning, or use tools which verify the structure (or EXIF data?) of JPEG and PNG images.

The hackathon was a great excuse to test some of the improvement ideas we had in our backlog and we can’t wait to get this to production.

Written by Pawel S