Scaling to Meet Demand

Undoubtedly, scalability is one of the main technical challenges for computer engineers, especially those who deal directly with reliability issues.

For companies like Talkdesk, the reliability of the systems and infrastructure that support its software is essential to providing innovative services and solutions. To guarantee high quality standards, the entire platform must be highly reliable, avoiding excessive processing time in servers, databases, algorithms, and software, and running without crashes or unexpected errors. Scalability is a decisive step on the road to success through stability.

Scalability is the capacity of a system, network or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth. It allows a system, network or process to keep operating even as its conditions of use change. These use conditions are influenced by multiple factors, including:

  • Administrative: the ability for an increasing number of users to share the system
  • Functional: the ability to add new functionality with minimal effort
  • Global: the ability to maintain performance across all physical access points
  • Load: the ability to expand and contract the resource pool to accommodate demand
  • Generation: the ability to scale up by adopting new generations of components

These variables give a partial picture of the need for scale in the computing world, and they challenge engineers to build fully operable systems in a multivariable environment.

Database Scaling — Too much load

Frequently, your website or server receives thousands or even millions of requests each day. When you’re facing this kind of traffic load, scaling your database could be a possible solution. To alleviate the workload, you could use a replication strategy in which two secondary databases are connected to the primary database. The data from the primary database is replicated to the secondary databases to divide the workload. Since the most common requests are reads rather than writes, the primary database handles the writes while the secondary databases serve the read requests.
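To make the read/write split concrete, here is a minimal sketch in Python. The connection strings and the routing rule are illustrative assumptions, not a description of Talkdesk’s actual setup:

```python
import random

# Hypothetical connection strings; in practice these come from configuration.
PRIMARY = "postgresql://primary.db.internal:5432/app"
REPLICAS = [
    "postgresql://replica-1.db.internal:5432/app",
    "postgresql://replica-2.db.internal:5432/app",
]

def route_query(sql: str) -> str:
    """Send writes to the primary; spread reads across the replicas."""
    is_read = sql.lstrip().lower().startswith("select")
    return random.choice(REPLICAS) if is_read else PRIMARY

# Writes always land on the primary...
assert route_query("INSERT INTO calls VALUES (1)") == PRIMARY
# ...while reads are balanced across the replicas.
assert route_query("SELECT * FROM calls") in REPLICAS
```

Real routers (such as ProxySQL or pgpool-II, or a framework’s database layer) do this more carefully, but the principle is the same.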

Replication does not necessarily make individual writes faster, but by moving read traffic off the primary it decreases the overall workload, which can improve write times.

Another important consideration, and a potential drawback, is replication lag, which occurs when a secondary database cannot keep up with the updates happening on the primary. In these cases the unapplied changes accumulate in the secondary database’s relay logs, and the secondary’s copy of the data drifts increasingly far from the primary’s.
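One common way to watch for this (a sketch under assumptions, not a prescription) is a heartbeat row: write a timestamp on the primary and check how stale it looks on the replica. The `heartbeat` table and the DB-API-style cursors here are hypothetical:

```python
import time

def write_heartbeat(primary_cursor) -> None:
    """Stamp the current time on the primary; replication carries it to replicas."""
    primary_cursor.execute(
        "UPDATE heartbeat SET ts = ? WHERE id = 1", (time.time(),)
    )

def replication_lag_seconds(replica_cursor) -> float:
    """Lag is how stale the replicated timestamp looks on the replica."""
    replica_cursor.execute("SELECT ts FROM heartbeat WHERE id = 1")
    (ts,) = replica_cursor.fetchone()
    return time.time() - ts
```

If the measured lag keeps growing, reads from that replica serve increasingly stale data.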

This is a good general architecture concept to follow if your database is struggling with too much load.

Database Scaling — Too much data

If your problem is related to the amount of data and there is not enough free space to store it all, you might consider a sharded database architecture, which distributes or partitions large tables across multiple databases based on ranges of values in a key field. This way, the database can be scaled horizontally across a cluster of separate database servers.
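The core of sharding is the routing rule that maps a key to a shard. Here is a minimal range-based sketch in Python; the boundaries and shard names are made-up examples:

```python
import bisect

# Hypothetical range boundaries on the key field (e.g. a customer id).
# shard-0 holds ids below 1,000,000, shard-1 below 2,000,000, and so on.
BOUNDARIES = [1_000_000, 2_000_000, 3_000_000]
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(customer_id: int) -> str:
    """Route a row to the shard whose key range contains its id."""
    return SHARDS[bisect.bisect_right(BOUNDARIES, customer_id)]

print(shard_for(42))         # shard-0
print(shard_for(2_500_000))  # shard-2
```

Because the same key always maps to the same shard, lookups by key know exactly which server to ask.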

In horizontally scaled data storage, scalability is defined by the maximum storage cluster size that guarantees full data consistency, meaning there is only one valid version of the stored data regardless of the number of redundant physical copies. Clusters that provide “lazy” redundancy by updating copies asynchronously are called “eventually consistent.”

One potential drawback of sharding is the complexity it introduces as the data is divided and the database grows. Queries become more complex, and joins in particular get harder to perform because the data is fragmented across shards.

Scaling Hardware and Software — Two sides of the same coin

Replication and sharding are scaling solutions that address problems at the database infrastructure level. But databases are not the only element in the equation for improving overall performance. Another component is physical hardware improvement, accomplished through vertical and/or horizontal upgrades.

Scaling vertically means adding more hardware resources (CPU, RAM) to an existing machine or infrastructure. Horizontal scaling is achieved by adding more machines to your pool of resources.

In addition to hardware upgrades, the software itself can be scaled up or optimized to increase efficiency and practicality. When an increase in performance is needed, the addition of a new node to a system may be an easier and more cost-effective method than performance tuning to improve existing node capacities.

Here are some common techniques for improving performance (a caching sketch follows the list):

  • Store results of common operations to avoid repeating work
  • Reuse data when possible
  • Avoid complex operations during the request-response cycle
  • Don’t make requests from the client for things you already have
  • Move complex and inefficient algorithms off the main thread
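The first two items amount to caching: compute once, reuse many times. Here is a minimal sketch using Python’s `functools.lru_cache`; `expensive_report` is a hypothetical stand-in for a slow query or heavy computation:

```python
import time
from functools import lru_cache

@lru_cache(maxsize=1024)
def expensive_report(agent_id: int) -> str:
    # Stand-in for a slow database query or heavy computation.
    time.sleep(1)
    return f"report for agent {agent_id}"

start = time.time()
expensive_report(42)  # the first call pays the full cost
expensive_report(42)  # the repeat call is served from the cache
print(f"two calls took {time.time() - start:.1f}s")  # ~1.0s rather than ~2.0s
```

The same idea scales out to shared caches such as Redis or memcached when many servers need the stored results.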

Of course, simply increasing memory is a tried and true remedy for many problems. Memory is inexpensive to add to a server and often much faster to access than disk or network. The effect of faster storage tiers is particularly noticeable in PC boot times: a PC that boots from an SSD will almost certainly start faster than one that boots from an HDD, since solid-state storage outpaces spinning disks. The same is true for data: opening a file already present on your PC’s storage is much faster than fetching it over the network.

Glitches

Sometimes during the scaling process you may encounter a problem, roadblock or “glitch.” How does your system behave when it cannot scale further? Consider implementing backpressure and/or load shedding when you encounter a glitch.

Backpressure

It’s a way of telling data to wait because the system is busy. Have you ever gone to a coffee shop, seen a line out the door, and decided to go elsewhere? That’s an example of backpressure.
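In code, backpressure often shows up as a bounded buffer: when the buffer is full, producers must wait or walk away. Here is a minimal sketch with Python’s standard library; the queue size and timeout are arbitrary assumptions:

```python
import queue

# A bounded queue: once it is full, producers feel the "line out the door."
jobs: queue.Queue = queue.Queue(maxsize=100)

def submit(job) -> bool:
    """Try to enqueue work, waiting briefly if the system is backed up."""
    try:
        jobs.put(job, timeout=0.5)  # the blocking wait is the backpressure
        return True
    except queue.Full:
        return False  # the caller can retry later or go elsewhere
```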

Load shedding

Load shedding means acknowledging an inability to respond to all requests. This is the nuclear option: set aggressive timeouts in your web server and let the excess requests fail, even if that means returning blank pages. Some data is likely to be lost.
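Here is a minimal sketch of shedding at the application level, using a semaphore as a concurrency limit; the capacity value and the `process` stub are assumptions for illustration:

```python
import threading

MAX_IN_FLIGHT = 200  # assumed capacity limit for this server
in_flight = threading.Semaphore(MAX_IN_FLIGHT)

def process(request) -> str:
    return f"handled {request}"  # stand-in for the real request handler

def handle(request):
    # Shed load up front: refuse immediately at capacity instead of queueing.
    if not in_flight.acquire(blocking=False):
        return 503, "overloaded, try again later"
    try:
        return 200, process(request)
    finally:
        in_flight.release()

print(handle("call-1"))  # (200, 'handled call-1') while under the limit
```

Returning an explicit 503 lets well-behaved clients back off and retry, which usually degrades more gracefully than letting connections hang.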

The issue of scalability in computer science is very important for enterprises like Talkdesk. The company’s focus is on ensuring the reliability of its systems and software solutions, and this depends heavily on scalability as a key resource for meeting client demand. It is particularly important to keep the error rate as low as possible when creating the software that will shape what we have on our desks tomorrow.

About the author: Sean Martin is a senior software engineer at Talkdesk, with 11 years of experience at Silicon Valley companies.
