Designing for High-Volume Writes in Salesforce

Salesforce Architects
7 min read · Jun 15, 2021


In the digital era, systems and devices are generating data at high velocity and on a massive scale. Before deriving business insights from this data in your CRM platform, you must first have a reliable way of getting rapidly generated data into the platform in accordance with governor limits, including limits on CPU time, database reads, and database writes.

An earlier post covered designing for high-volume reads in Salesforce. This post focuses on high-volume writes and the buffering mechanisms that you’ll want to consider when your system requires a large volume of API calls to write data to Salesforce. It includes design patterns, a reference architecture, and design considerations for building systems that are highly scalable, resilient, and designed to seamlessly handle high-volume writes — tens of millions of concurrent API writes.

Business use cases

In part, the motivation for this post stems from our recent experience designing a system to handle COVID-19 vaccine appointments. As you would expect, the system needed to handle a huge surge in traffic when the vaccine became available, and thus, it needed to perform high-volume writes. Of course, applications that call for high-volume writes are common to many industries and business functions, including claims management (where a natural disaster can cause a spike in filed claims), case management (any service outage, from electric power to streaming video, can lead to surges in customer service cases), and sales (where promotional campaigns can bring in a flood of new customers), to name a few. Beyond these examples, there are many more, including applications that rely on data from clickstreams, IoT devices, or any source that is regularly and rapidly generating streams of data.

Design patterns

When designing for high-volume writes in Salesforce, the key principle is to ensure that constraints (as defined by governor limits) on database writes, database reads, and CPU time are not violated. This can be achieved by following a well-known design pattern: decoupling. In this context, decoupling refers to separating components — for example, Salesforce and a service that is producing data to be written into Salesforce — so that they do not directly interact with one another. Streams and queues are two popular mechanisms for decoupling components.

Streams
Data streaming systems, such as Apache Kafka or Amazon Kinesis, are used for collecting, storing, and processing real-time continuous data. Data sources send the data for ingestion and processing by data sinks, which consume the data. A key capability of data streams in this design pattern is the ability to control the flow of data — through throttling or buffering — so that it can be written to Salesforce while remaining within governor limits.
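As a concrete illustration of the buffering idea, the sketch below groups an incoming record stream into fixed-size batches. The batch size of 200 matches the maximum number of records the Salesforce sObject Collections REST API accepts per call, so each batch maps to one API write; everything else here is an illustrative sketch, not part of any stream vendor's SDK.

```python
from itertools import islice

def batch_records(stream, batch_size=200):
    """Buffer a continuous record stream into fixed-size batches.

    200 records matches the per-call maximum of the Salesforce
    sObject Collections API, so one batch corresponds to one write.
    """
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
```

A consumer can then drain batches at a controlled pace, writing each one as a single API call instead of one call per record.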

Queues
Message queues — such as AWS Simple Queue Service (SQS), Anypoint MQ, or Azure Queue Storage — make it easy to decouple and scale microservices, distributed systems, and serverless applications. Streams and queues enable you to send, store, and receive messages between software components at any volume, without losing messages or requiring the receiving services to be always available. Building applications from individual components that each perform a discrete function improves scalability and reliability and is the best practice design for modern applications.

You can use either streams or queues to facilitate high-volume writes to Salesforce; your specific use case will help determine which option is the better fit. Here are some key differences to consider (note that these are not hard-and-fast distinctions; the specific stream or queue technology you choose may not fall neatly into line with these generalizations):

  • Data availability: With streams, data is available in near real time; queues typically have more latency.
  • Scale: Both patterns offer built-in auto-scaling mechanisms.
  • Message ordering: With streams, the data is ordered; with queues, it is not always ordered.
  • Replay: Streams support message replay; queues often do not.
  • Message size: Streams generally support larger messages (on the order of 1 MB), while queue messages are typically limited to around 256 KB.
  • Message delivery: With streams, multiple consumers can receive the same message; with queues, once a consumer has processed a message, that message is removed and no other consumer can read it.
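The decoupling behavior described above can be sketched in miniature with Python's standard-library queue: a fast producer fills a bounded queue while slower workers drain it. The `writer` callable stands in for whatever component would perform the Salesforce API call; this is an in-process illustration of the pattern, not a substitute for SQS or Kinesis.

```python
import queue
import threading

def run_decoupled(messages, writer, workers=2):
    """Decouple a fast producer from slower consumers via a queue.

    The bounded queue absorbs bursts and back-pressures the producer,
    so `writer` (standing in for the Salesforce-writing component)
    is never driven faster than it can drain the backlog.
    """
    q = queue.Queue(maxsize=1000)

    def consume():
        while True:
            msg = q.get()
            if msg is None:      # sentinel: shut this worker down
                q.task_done()
                return
            writer(msg)
            q.task_done()

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for t in threads:
        t.start()
    for msg in messages:         # producer side: bursts are buffered
        q.put(msg)
    for _ in threads:            # one sentinel per worker
        q.put(None)
    q.join()
    for t in threads:
        t.join()
```

Note that, matching the queue semantics listed above, each message is delivered to exactly one consumer and ordering across workers is not guaranteed.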

Reference architecture

The reference architecture patterns shown here incorporate Salesforce (for building the business applications) and AWS services (for processing the data streams). Similar architectures can be built with Salesforce and Heroku/MuleSoft, Microsoft Azure, Google Cloud, or other technology combinations.

In this architecture, Salesforce provides the platform for building CRM business applications. For example, Salesforce can be used to create cases, assign cases to service representatives, store summarized data, view it using reports and dashboards, handle platform events, send emails/notifications, and perform various functions on the data using business rules.

The two flows shown in the above architecture use streams to handle data generated by clickstream feeds and IoT devices. In this example, Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose perform the stream processing.

In the architecture above, a queue provides the buffering function, acting as a bridge between a high-speed data producer and a low-speed data consumer. In this example, the queue is provided by AWS SQS running on the AWS platform, and a Lambda function sends the data to the Salesforce REST endpoint.
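To make the Lambda step concrete, here is a hedged sketch of how a handler might transform a batch of SQS messages into a Salesforce sObject Collections request body. The composite endpoint path follows the documented format (the API version shown is an assumption), and the `Subject`/`Origin` field names are illustrative stand-ins for your actual Case schema; the network call itself is omitted.

```python
import json

# Documented path format for the sObject Collections API;
# the version segment (v52.0) is an assumption — use your org's version.
SF_COLLECTIONS_PATH = "/services/data/v52.0/composite/sobjects"

def build_case_payload(sqs_event):
    """Turn an SQS-triggered Lambda event into an sObject Collections body.

    Field names below are hypothetical examples; the API accepts at
    most 200 records per call, hence the slice.
    """
    records = []
    for msg in sqs_event["Records"]:
        body = json.loads(msg["body"])
        records.append({
            "attributes": {"type": "Case"},
            "Subject": body["subject"],
            "Origin": body.get("origin", "Web"),
        })
    return {"allOrNone": False, "records": records[:200]}
```

The Lambda function would then POST this payload to the Salesforce REST endpoint with an OAuth bearer token, one call per batch rather than one call per message.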

Other components in this architecture include:

  • Identity Provider: The identity provider is used to verify the user’s identity and issue a token to use platform resources. In AWS, this function is performed by Cognito. Cloud-based identity providers such as Okta, Ping Identity, and Azure AD perform a similar function.
  • API Gateway: The API gateway is the access point to one or more APIs that are deployed in the platform. The access point checks if the request is valid and grants access to the resources. Some API gateways also perform the throttling function (for example, MuleSoft API Manager / Gateway). The MuleSoft API Manager provides a throttling capability that can restrict the number of API calls that can be made through the API Gateway. In AWS, this function is performed by Amazon API Gateway.
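The throttling capability mentioned for the API gateway is commonly implemented as a token bucket: tokens refill at a fixed rate up to a cap, and each request consumes one token or is rejected. The sketch below is a minimal in-process illustration of that mechanism, not the actual MuleSoft or AWS implementation.

```python
import time

class TokenBucket:
    """Minimal token-bucket throttle, the mechanism behind typical
    API-gateway rate limiting: `rate` tokens refill per second up to
    `capacity`; each request consumes one token or is rejected."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity        # start full
        self.clock = clock            # injectable clock for testing
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Requests arriving faster than the refill rate are rejected (or, in a gateway, answered with HTTP 429), which keeps downstream writes within governor limits.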
  • Storage: Streaming data is ephemeral, and it has to be eventually stored in a permanent data store if it is to be kept. This can be done by AWS S3 or Redshift, or in a database such as Heroku Postgres.
  • Microservices: Microservices perform key functions such as invoking APIs, writing to a data store, or performing computations. In AWS, this is performed by Lambda functions. Other cloud providers have similar implementations for serverless functions. MuleSoft Anypoint Studio provides various components for batch processing, logging, transformations, and so on.
  • Machine Learning Models: Machine learning (ML) models can be built using a platform such as AWS SageMaker, trained on sample data, and deployed at an endpoint. Other tools used to build ML models include TensorFlow and PyTorch.

Additional design considerations

Other than database and CPU considerations, writing data at scale involves a range of other design considerations:

  • High Availability: Take steps to ensure the system will operate continuously with zero or minimal downtime.
  • Multi-Region Support: Use multiple regions to support fault-tolerant applications with easy failover to a backup region.
  • DDoS Protection: Plan for handling distributed denial-of-service (DDoS) attacks on your online services so they are not overwhelmed by maliciously generated traffic.
  • Security: Implement industry-standard security practices, including API key-based authentication, OAuth access tokens, policy-based access control, rate limiting, and two-way TLS authentication.
  • Monitoring: Continuously monitor systems to detect performance degradation, errors, and critical issues.
  • Scalability: Create an elastic architecture that can automatically scale as needed.
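One resilience technique implied by the availability and monitoring points above is retrying transient write failures with exponential backoff and jitter, so a struggling endpoint is not hammered by synchronized retries. The sketch below is a generic wrapper under that assumption; `call` stands in for any flaky API write.

```python
import random
import time

def with_backoff(call, attempts=5, base=0.5, sleep=time.sleep):
    """Retry a flaky operation with exponential backoff plus jitter.

    Re-raises the last exception if all attempts fail. `sleep` is
    injectable so tests can skip the real delays.
    """
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Delay doubles each attempt; random jitter desynchronizes
            # retries from many concurrent clients.
            delay = base * (2 ** attempt) * (1 + random.random())
            sleep(delay)
```

In production you would typically retry only on errors known to be transient (timeouts, throttling responses) rather than on every exception.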

Conclusion

The tremendous scalability that Salesforce provides requires the enforcement of governor limits to maintain performance across the shared system. Use the patterns and concepts covered in this post to design systems capable of high-volume writes within those limits. And, if you also need to handle large volumes of API calls to read data from Salesforce, be sure to follow the guidance covered in Designing for High-Volume Reads in Salesforce.

About the Authors

Varad Vardarajan is Director of Product Management — Microservices Platform. He has more than two decades of industry experience in healthcare, financial services, and manufacturing. He is passionate about working with customers to help them define and execute their business transformation.

Tushar Jadhav is an architect with the Salesforce Customer Success Group. He has almost a decade of experience working with customers on CRM transformation using digital technologies.

Suchin Rengan is a VP, Strategic Account Architect with the Salesforce Customer Success Group. He leads a team of architects that works with Salesforce’s largest and most complex customers to meet their scale needs.
