Putting Jetpacks On Our Membership Platforms: How the FT made message processing near real time in Salesforce.com
by Gareth Park
In 2015 the FT replaced its monolithic subscription and entitlements system with a platform of microservices and APIs (Find out more here). This provided the FT with a modern, scalable platform for managing our users and subscriptions on FT.com.
However, this left us with the challenge of providing comprehensive systems that fully support our Customer Services and Back Office operations teams. We’d previously integrated Salesforce, but updates were slow, with one feed of read-only data processed every eight hours. Even so, Salesforce still felt like the logical solution: as a CRM, it’s what it was born to do.
From this point we began planning a mechanism which could allow Customer data to flow freely between our new Platform and Salesforce, enabling Sales and Customer Service agents to action key business processes from within Salesforce itself. We wanted the power of the Salesforce Platform to be harnessed by our business users, so that they could use Salesforce to undertake actions on customers’ behalf, such as upgrading subscriptions. The microservices-based platform we’ve built uses Apache Kafka as its key mechanism for asynchronous inter-service communication.
To solve this challenge we opted for an event-driven synchronisation mechanism that kept a replica of subscriber information in Salesforce. It’s the kind of approach that gives most Engineers nightmares, but something we strongly believed was the best option. We tapped into our message broking infrastructure, using microservices to publish and consume messages, with a bespoke Java App (the Salesforce Events Bridge) developed to orchestrate messages between Salesforce.com and the Message Queue. The Salesforce side of this is described in more detail here.
When we implemented our plan it was hailed as a success thanks to a noticeable increase in speed: updates to and from Salesforce now took minutes (just five of them!) rather than hours (up to eight)!
Salesforce became a powerful one-stop tool for customer service agents to manage FT.com users. Our ability to quickly process messages was a triumph compared with previous ETL syncs. However, our users started requesting the holy grail of integration: real time updates.
Wait… Real Time Integration with Salesforce? Really?
To discover the best way to make this happen we did some experimenting, and soon hit a bottleneck. Messages were being pushed into Salesforce by the events bridge and stored in a Custom Object. A scheduled job would then pick up the messages and process them in a single batch. This was simple, but limited by the rate at which scheduled jobs can run and by the fact that a large batch is processed as one transaction, which gives a single dud message the opportunity to break an entire processing run.
One alternative could be an after insert trigger within Salesforce, which would process messages immediately. We purposely decided against this approach because:
- We want to guarantee the delivery of the message. If the platform hit a governor limit or runtime exception during processing, the entire transaction would be rolled back and the record would not be persisted. This would make monitoring processing and diagnosing issues within Salesforce extremely challenging. Consequently, developers would have to interrogate Kafka to find the lost messages.
- We want to control the process when there is a high influx of messages. Certain message types pushed from the platform can arrive in sudden, high-volume bursts, so we want to control the size of the batches we are processing.
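The two points above amount to a "persist first, process later" pattern. Here is a minimal sketch of that idea in Python rather than Apex, purely as an illustration: `StagedMessage`, `StagingStore` and `process_pending` are hypothetical names standing in for the Custom Object and the processing job, not the FT's actual code. Each message is recorded before any processing is attempted, pending messages are handled in batches of a chosen size, and a failing message is marked as failed instead of rolling back its neighbours.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class StagedMessage:
    # Hypothetical stand-in for one row in the staging Custom Object.
    id: int
    payload: str
    status: str = "PENDING"  # PENDING -> PROCESSED or FAILED
    error: str = ""


@dataclass
class StagingStore:
    # In-memory stand-in for the staging object. Persisting is a separate,
    # cheap step, so a message is recorded before processing is attempted.
    rows: List[StagedMessage] = field(default_factory=list)

    def persist(self, payload: str) -> StagedMessage:
        row = StagedMessage(id=len(self.rows) + 1, payload=payload)
        self.rows.append(row)
        return row

    def pending(self) -> List[StagedMessage]:
        return [r for r in self.rows if r.status == "PENDING"]


def chunks(rows: List[StagedMessage], size: int) -> Iterator[List[StagedMessage]]:
    # Split the pending rows into batches of at most `size`.
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def process_pending(store: StagingStore,
                    handler: Callable[[str], None],
                    batch_size: int) -> int:
    """Process pending rows in fixed-size batches; return the batch count."""
    batches = 0
    for batch in chunks(store.pending(), batch_size):
        batches += 1
        for row in batch:
            try:
                handler(row.payload)
                row.status = "PROCESSED"
            except Exception as exc:
                # A dud message is flagged for diagnosis and does not
                # undo the work done for the rest of the batch.
                row.status = "FAILED"
                row.error = str(exc)
    return batches
```

Because failed rows stay in the store with their error text, monitoring and diagnosis can happen where the data lives rather than by searching the message broker for lost messages.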
The asynchronous approach gifted us resilience but no extra speed. We examined this problem, put our thinking caps on, peered hard at Salesforce and found:
- Apex Jobs are just not designed for fast processing. We pushed them to run every minute by having them reschedule themselves, but this was still too slow.
- @Future methods would allow us to operate under a different set of governor limits. However, the execution time cannot be guaranteed.
So we began to look outside Salesforce…
There are a number of Node modules for integrating with Salesforce. In this case we used NForce, a node.js Salesforce REST API wrapper. The Node app orchestrates messages via REST calls to Salesforce (SF); a connected app was set up in SF to configure the security around this integration. There are no governor limits on the number of call-ins to the organisation, but there are limits on user connections per hour, with sessions managed using an Access Token. NForce methods allowed us to do this.
On the Salesforce side, messages are pulled from the custom object and the processing is delegated to Batch classes using the Apex Flex Queue, introduced in Spring ’15. This allows us to submit up to 100 batches for execution, where previously the limit was 5. Jobs are put into a holding state in the Apex flex queue, and when system resources become available they are moved to the batch job queue.
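The connection limit is the reason for holding on to a session rather than logging in for every message. The sketch below shows that idea in Python. It is an illustration under assumptions, not the NForce API: `SessionCache` is a hypothetical helper, and the `login` callable, the token lifetime and the safety margin are placeholders for whatever the real authentication call returns.

```python
import time
from typing import Callable, Optional, Tuple


class SessionCache:
    """Reuse one access token across calls and log in again only when needed."""

    def __init__(self,
                 login: Callable[[], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic,
                 margin: float = 60.0) -> None:
        # `login` is a hypothetical callable returning (token, lifetime_seconds).
        self._login = login
        self._clock = clock
        self._margin = margin  # refresh slightly before the assumed expiry
        self._token: Optional[str] = None
        self._expires_at = 0.0
        self.logins = 0  # how many times a new session was opened

    def token(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._login()
            self._token = token
            self._expires_at = now + lifetime
            self.logins += 1
        return self._token

    def invalidate(self) -> None:
        # Call this when the server rejects the token, to force a fresh login.
        self._token = None
```

With a cache like this, a burst of thousands of messages costs a handful of logins at most, so message volume and connection count are decoupled.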
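The holding behaviour can be pictured with a toy model. The Python sketch below is a simplification for illustration, not Salesforce's implementation: `FlexQueueModel` is a hypothetical class, the holding limit of 100 is the figure quoted above, and the limit of 5 jobs in the batch job queue at once is an assumption based on the earlier figure.

```python
from collections import deque
from typing import Deque, List


class FlexQueueModel:
    """Toy model: jobs wait in a holding queue until a batch slot frees up."""

    def __init__(self, holding_limit: int = 100, active_limit: int = 5) -> None:
        self.holding_limit = holding_limit
        self.active_limit = active_limit  # assumed size of the batch job queue
        self.holding: Deque[str] = deque()
        self.active: List[str] = []

    def submit(self, job_id: str) -> None:
        # Reject the job only when the holding queue itself is full.
        if len(self.holding) >= self.holding_limit:
            raise RuntimeError("holding queue is full")
        self.holding.append(job_id)
        self._promote()

    def complete(self, job_id: str) -> None:
        # A finished job frees a slot, so the oldest held job moves up.
        self.active.remove(job_id)
        self._promote()

    def _promote(self) -> None:
        # Move held jobs to the batch queue, oldest first, while slots remain.
        while self.holding and len(self.active) < self.active_limit:
            self.active.append(self.holding.popleft())
```

In this model a sudden burst of submissions does not fail once the active slots are taken. The extra jobs simply wait in order and drain as capacity becomes available.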
By leveraging the inbuilt queue mechanisms of the platform, the solution scales naturally when SF receives high volumes of messages. The end result is that messages are pushed to Salesforce from Kafka, processed within seconds, and the relevant records are created in *near* real-time for the end user (even during times of high traffic). Customer Service Agents and FT.com users benefit hugely from having near real-time updates flow between Salesforce.com and the Platform. In short, this work has led to a faster and even better level of service on offer from FT.com.