Building resilient software at Goji

David Genn
Published in Technology @ Goji
Jun 16, 2017

At Goji we’re building a market-leading technology platform that enables Financial Intermediaries to give their clients access to investments in the peer-to-peer lending and alternative credit sectors.

The markets that we invest in are digitally native. Unlike platforms for more traditional investment products and asset classes, our platform has many systems connecting to it, and we in turn connect to many other systems. We need to guarantee the security of investors’ data and ensure rock-solid financial processing.

The importance of cyber security

We follow a number of industry best practices to protect against failure and data loss, and to recover well if the worst does happen.

Almost every day there appears to be a news article detailing an outage on the internet or a breach of customer data. For this reason, cyber and data security are high on the regulators’ agenda.

However much we do to guard against incidents, it is a question of when a failure will happen, not if. This means we need to build software on the assumption that failure will occur at some point, defend against it at every turn, and regularly test the measures we have in place.

We deploy our software to Amazon’s cloud environment (AWS), one of the largest hosting providers in the world. This not only saves us the cost of procuring physical hardware, but also allows us to build a highly fault-resilient architecture.

One of our key priorities is protecting customer data. To help us achieve this we use an AWS-managed MySQL database to store our investor data. This database replicates in real time between two distinct physical locations in Ireland, which gives us one layer of redundancy.

The data is also replicated to a sister database in Germany that we can switch to if a failure makes both databases in Ireland unavailable. Data is backed up nightly to AWS S3 storage and archived to AWS Glacier on a monthly basis. AWS S3 is an extremely reliable mechanism for storing data, and AWS Glacier provides durable long-term storage.

Any personally identifiable information within the database is encrypted using AES-256, an extremely secure symmetric encryption algorithm. This means that if the database is compromised, the data remains protected as long as the encryption keys are not also exposed. Passwords are never stored in a recoverable form: they are hashed using a one-way algorithm based on the Blowfish cipher.
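To make the idea of one-way password hashing concrete, here is a minimal Python sketch. It is an illustration only, not Goji's implementation. Python's standard library does not ship a Blowfish-based hash, so this example swaps in PBKDF2-HMAC-SHA256, which follows the same principle: a random salt per password, a deliberately slow derivation, and no way to reverse the stored value. The function names and the iteration count are assumptions.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for this sketch; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and the digest are stored. To check a login attempt, the same derivation is run again and the results are compared, so the original password never needs to be recovered.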

Software architecture

Just as the database runs in two distinct locations in Ireland with a backup in Germany, the application servers that power the Goji platform follow the same pattern. If the servers fail in one of the locations in Ireland, traffic is automatically routed to the other location in Ireland with zero downtime. If both locations in Ireland fail (extremely rare, but it has happened in one of the US AWS regions), we can fail over to Germany. This is a manual process, which we have completed within 30 minutes in the drills we run as part of Goji’s security tests.
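The routing rule described above can be sketched in Python. This is a simplified model under stated assumptions, not our actual load balancer configuration: the `Location` class, the `choose_location` function and the `failover_enabled` flag are all hypothetical names. Routing within the primary region is automatic, while the secondary region is only used once an operator has switched it on.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Location:
    name: str
    region: str
    healthy: bool = True


def choose_location(
    locations: List[Location],
    primary_region: str,
    failover_enabled: bool = False,
) -> Optional[Location]:
    """Pick where to send traffic.

    Healthy locations in the primary region always win. Locations in any
    other region are considered only when an operator has enabled failover.
    """
    for location in locations:
        if location.region == primary_region and location.healthy:
            return location
    if failover_enabled:
        for location in locations:
            if location.region != primary_region and location.healthy:
                return location
    return None  # nothing available: the platform is down until someone acts
```

Keeping the cross-region switch manual is a design choice in this sketch: it avoids flapping between regions on a transient fault, at the cost of needing a person in the loop.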

This architecture protects us against failures due to things like power loss, flood, or malicious access to our data, but a much more likely fault is a failure in one of our internal systems. Our platform is broken up into a number of small services so that if one part fails, the others can continue to work. For example, if the email service fails, we don’t want to stop people logging into the platform. Once the email service has been restored, any unsent emails can be processed, and the rest of the platform remains unaffected throughout. We also run multiple instances of each service so that if one fails, another can take its place.

This approach is known as a microservice architecture: many small, single-purpose services are composed together to form the whole system. It also allows us to build and deploy new features rapidly, as we only need to change and deploy the microservices that are affected.
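The email example can be sketched as a small outbox that queues messages while the email service is down and drains them once it is back. This is a minimal in-memory Python sketch, assuming a hypothetical `send` callable that raises an exception when the email service is unavailable. A real system would use a durable queue rather than process memory.

```python
from collections import deque
from typing import Callable, Deque


class EmailOutbox:
    """Queue emails so that callers never fail when the email service does."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send  # hypothetical client call; raises on failure
        self._pending: Deque[str] = deque()

    @property
    def pending(self) -> int:
        return len(self._pending)

    def submit(self, message: str) -> None:
        """Accept a message and try to deliver it. Never raises to the caller."""
        self._pending.append(message)
        self.flush()

    def flush(self) -> int:
        """Deliver queued messages in order; stop at the first failure.

        Returns the number of messages delivered in this call.
        """
        delivered = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except Exception:
                # Email service still down: keep the message and retry later.
                break
            self._pending.popleft()
            delivered += 1
        return delivered
```

Because `submit` swallows delivery failures, a login flow that triggers an email carries on regardless. Calling `flush` on a timer, or when the email service reports healthy again, works through the backlog.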

Cyber attacks

A very common kind of attack you see in the news is a Distributed Denial of Service (DDoS) attack. This is where a web application is swamped with a large number of requests, which prevents legitimate requests from being processed. To protect against this we use a Content Delivery Network (CDN) and a Web Application Firewall. The CDN allows us to spread incoming traffic over a large number of servers around the globe, and the firewall can block requests coming from certain countries or IP addresses if they are shown to be malicious. AWS Shield allows us to proactively monitor incoming traffic for malicious requests and block them before they reach our infrastructure.
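As a rough illustration of the IP-blocking part of a firewall rule, the following Python sketch checks a client address against a list of blocked ranges. It is not how AWS WAF or AWS Shield are implemented or configured. The `IpBlocklist` class is a hypothetical name, and the addresses used are reserved documentation ranges rather than real attackers.

```python
import ipaddress
from typing import Iterable


class IpBlocklist:
    """Reject requests whose source address falls inside a blocked range."""

    def __init__(self, cidrs: Iterable[str]) -> None:
        # Each entry is a CIDR range such as "203.0.113.0/24".
        self._networks = [ipaddress.ip_network(cidr) for cidr in cidrs]

    def is_blocked(self, address: str) -> bool:
        ip = ipaddress.ip_address(address)
        # Membership is False when the address and network versions differ,
        # so IPv4 and IPv6 ranges can share one list.
        return any(ip in network for network in self._networks)
```

A request handler would call `is_blocked` with the client address before doing any expensive work, and drop the request if it returns `True`.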

Testing

Every change to our software is run through a test suite containing thousands of automated tests to ensure that the new functionality performs as expected and existing functionality is not negatively impacted.

We regularly perform load tests and disaster recovery tests to ensure we are well drilled in handling outages. We are also subject to an external security penetration test, which gives us a third-party perspective on our vulnerabilities.

In a recent load test, our API served 99% of requests in under 400ms with a sustained throughput of 40 requests per second.
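For readers unfamiliar with these figures, here is a small Python sketch of how a "99% of requests in under X ms" number and a throughput number can be derived from raw load test samples. It uses the nearest-rank percentile method, which is one common definition among several. The function names are assumptions, and this is not the tooling we used for the test above.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def throughput(request_count: int, duration_seconds: float) -> float:
    """Average requests per second over the measured window."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return request_count / duration_seconds
```

Given a list of per-request latencies in milliseconds, the claim "99% of requests in under 400ms" holds when `percentile(latencies, 99) < 400`.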

Culture

We keep security and performance at the forefront of our minds when building new software components. We place great emphasis on culture when hiring developers. Humility and curiosity are essential characteristics if we’re to spot potential flaws in our platform and listen to feedback from others. Continual learning is essential if we’re to stay up to date with developments in this field. We run Friday ‘lunch and learn’ sessions where team members can share videos, podcasts and talks that help us think about new ideas and ways of working.

Security is never something you can guarantee completely, so we remain vigilant and proactive in building the most secure and stable platform we can.
