If you work in a software development organization, you probably deal with various software development applications. These include core applications like source control, bug tracking, and continuous integration. In a small organization (fewer than 500 engineers), scalability is not typically a concern. Most applications in this area perform quite well when the number of users or the amount of concurrent activity remains small. This article is for organizations at the other end of the spectrum: those that experience the challenges of using software development lifecycle (SDLC) applications that were not designed to meet the needs of a large organization.
The term “commodity hardware” is nearly synonymous with horizontal scalability. When the hardware is inexpensive and seldom fault tolerant, you need to design your software to provide fault tolerance. It needs to be capable of distributing the workload across multiple systems and expanding to include more servers as usage demands. The idea of “commodity software” is similar: if the application is not designed to scale horizontally with increased load, you must distribute the projects and users across multiple instances.
Organizations commonly make the mistake of trying to deploy single monolithic application instances and scale them vertically to support the entire company. Unfortunately, most applications are not designed for large-scale deployments, so performance begins to degrade as the amount of concurrent activity increases. This is usually a result of application design limitations, and no amount of optimization or tuning is going to change that.
Of course these problems rarely present themselves during initial deployment because the application data is relatively small and the activity on the system is low. If performance degradation is gradual, it may be possible to project the resource usage forward to anticipate future requirements, but sometimes the degradation is severe or exponential. The best way to mitigate the risk of performance problems in the future is to decompose the application into smaller instances. But how do you know when an application needs to be decomposed for horizontal scaling? The answer is easy: always. Assume that an application can’t scale and you won’t be disappointed.
Here are some common factors that lead to scalability problems:
- Storing application data on the filesystem
- Loading large amounts of data in memory to render in the user interface
- Executing poorly tuned, highly relational database queries
In order to decompose the application, you must determine whether it is possible to decompose its usage. This requires an understanding of how users, projects, and teams are structured within your organization. The diagram below compares team-based organization in a single application instance with an alternative deployment that breaks the application into three instances.
There are several things to consider when breaking up an application:
- Logical grouping of projects and teams
- Reporting and communication across instances
- Licensing and operational costs
Projects and Teams
The first challenge to decomposing any application is deciding how to distribute the projects, teams, and users. There is no magic formula. You must understand how users interact with the application and then model your environment accordingly. For software development organizations, it makes sense to decompose your application instances around software and organizational structures. Try to allocate teams along technology, organization, or product boundaries.
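As an illustration of allocating along a product boundary, the sketch below groups teams into one instance per product and flags any instance that would exceed a user budget. The team names, user counts, and the 250-user threshold are invented for the example. The right boundary and threshold depend on how your users actually interact with the application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Team:
    name: str
    product: str  # grouping boundary; could equally be a technology or org unit
    users: int


def plan_instances(teams, max_users):
    """Assign each team to one instance per product and list the products
    whose combined user count exceeds max_users."""
    plan = defaultdict(list)
    for team in teams:
        plan[team.product].append(team)
    oversized = sorted(
        product
        for product, members in plan.items()
        if sum(t.users for t in members) > max_users
    )
    return dict(plan), oversized


teams = [
    Team("payments-api", "payments", 120),
    Team("payments-web", "payments", 90),
    Team("search-core", "search", 300),
    Team("mobile-ios", "mobile", 60),
]
plan, oversized = plan_instances(teams, max_users=250)
for product, members in plan.items():
    print(product, [t.name for t in members])
print("needs a further split:", oversized)
```

An instance flagged as oversized is a candidate for a second boundary, for example splitting by technology within the product.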
In order to ensure consistency across instances, you may need to develop internal tools which can be used to deploy new application instances and create new teams, projects, and users within each instance. By using abstractions to model the administrative activities across application instances, you can establish an infrastructure which can manage an arbitrary number of application instances or application types.
Examples of this internal tooling include:
- Containerization (Kubernetes)
- Configuration Management (Ansible, Puppet, Chef)
- Administrative tools for invoking application REST APIs
With these tools in place, it should be possible to deploy a new application instance in minutes. Once the instance is running, the internally developed administration tools can be used to configure the application. As usage increases, it may be necessary to create a self-service user interface to allow users to create new teams and projects on their own.
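One way to picture the abstraction described in this section is a small interface that every application type implements, plus a provisioning routine that only talks to that interface. This is a sketch, not a real product API: `InstanceAdmin`, `InMemoryAdmin`, `provision`, and the example URL are hypothetical names, and a real implementation would call the application's own REST API where the in-memory class updates a dictionary.

```python
from abc import ABC, abstractmethod


class InstanceAdmin(ABC):
    """Administrative operations that every application instance must support."""

    @abstractmethod
    def create_team(self, team: str) -> None: ...

    @abstractmethod
    def create_project(self, team: str, project: str) -> None: ...


class InMemoryAdmin(InstanceAdmin):
    """Stand-in for a REST-backed implementation; records state locally."""

    def __init__(self, base_url: str):
        self.base_url = base_url  # where a real client would send requests
        self.teams: dict[str, list[str]] = {}

    def create_team(self, team: str) -> None:
        self.teams.setdefault(team, [])  # idempotent: safe to run repeatedly

    def create_project(self, team: str, project: str) -> None:
        if project not in self.teams[team]:
            self.teams[team].append(project)


def provision(admin: InstanceAdmin, layout: dict[str, list[str]]) -> None:
    """Apply the same desired layout to any instance, whatever its type."""
    for team, projects in layout.items():
        admin.create_team(team)
        for project in projects:
            admin.create_project(team, project)


instance = InMemoryAdmin("https://tracker-payments.example.internal")
provision(instance, {"payments-api": ["gateway", "ledger"]})
provision(instance, {"payments-api": ["gateway", "ledger"]})  # re-run changes nothing
print(instance.teams)
```

Keeping the operations idempotent means the same routine can back both scripted deployments and a later self-service interface.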
Cross-application Reporting and Communication
When multiple instances of an application exist, it becomes more difficult to manage integration or communication across instances. It also becomes more difficult to provide transparency across multiple instances. In my previous article, “Transparent Software Development,” I describe the use of Kafka as a message bus to expose information and events about each application in the SDLC environment. This same principle applies when deploying multiple instances of an application. The use of event messages makes it easier to collect data and metrics, trigger events, or integrate with other applications.
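For event messages to be useful across instances, each one needs to say which application and which instance produced it. The sketch below shows one possible envelope. The field names, the `sdlc.<application>.<event_type>` topic convention, and the sample values are assumptions for illustration, not a format taken from the earlier article. The encoded bytes would be handed to whichever Kafka producer client you use.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class SdlcEvent:
    application: str  # e.g. "bugtracker"
    instance: str     # which of the decomposed instances emitted the event
    event_type: str   # e.g. "issue_created"
    payload: dict
    timestamp: str    # ISO 8601, UTC


def make_event(application, instance, event_type, payload, now=None):
    """Build an event, stamping it with the current UTC time unless one is given."""
    now = now or datetime.now(timezone.utc)
    return SdlcEvent(application, instance, event_type, payload, now.isoformat())


def topic_for(event):
    """Hypothetical convention: one topic per application and event type, so
    consumers see every instance of an application on the same topic."""
    return f"sdlc.{event.application}.{event.event_type}"


def encode(event):
    """Serialize to UTF-8 JSON bytes, the form a message bus typically carries."""
    return json.dumps(asdict(event), sort_keys=True).encode("utf-8")


event = make_event("bugtracker", "payments", "issue_created", {"id": 1042})
print(topic_for(event))
print(encode(event))
```

Because the instance name travels inside the message, a reporting consumer can aggregate across instances or filter down to one without knowing how the deployment is split.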
Licensing and Operational Cost
The software development process is typically composed of some combination of commercial and open source software. Each commercial product will have unique licensing terms that determine licensing and support costs for the purchase and ongoing use of the software. Licensing terms may be based on the number of users, number of instances of the software, or the operational environment where it runs (e.g. number of CPUs, amount of RAM, etc.). When scaling an application, it is critical to consider the licensing and operational costs of the software to ensure that the most cost-effective architectural and operational decisions can be made.
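The effect of the licensing model on a decomposition decision is easy to see with some arithmetic. The sketch below compares one large instance with three smaller ones under the three kinds of terms described above. Every price, user count, and CPU count is invented for the example, and real contracts often combine these models or add volume tiers.

```python
def per_user_cost(users, price_per_user):
    return users * price_per_user


def per_instance_cost(instances, price_per_instance):
    return instances * price_per_instance


def per_cpu_cost(instances, cpus_per_instance, price_per_cpu):
    return instances * cpus_per_instance * price_per_cpu


USERS = 1500  # hypothetical headcount, unchanged by how instances are split

# Hypothetical deployments: one 32-CPU server versus three 8-CPU servers.
single = {"instances": 1, "cpus_per_instance": 32}
split = {"instances": 3, "cpus_per_instance": 8}

for name, d in (("single", single), ("split", split)):
    print(
        name,
        "per-user:", per_user_cost(USERS, 100),
        "per-instance:", per_instance_cost(d["instances"], 20_000),
        "per-CPU:", per_cpu_cost(d["instances"], d["cpus_per_instance"], 2_000),
    )
```

With these made-up numbers, splitting is cost-neutral under per-user terms, three times more expensive under per-instance terms, and cheaper under per-CPU terms, which is why the terms need to be checked before choosing an architecture.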
Although open source software may not have a licensing cost associated with it, there could be a higher operational cost due to a lack of paid support or quality documentation. Depending on the software project, your operational staff may need to invest more time becoming experts in the software or investigating problems. The same is also true for custom tools and software developed internally.
In addition to the application licensing, there may be licensing costs associated with the operating systems or hardware. Management and administration of the hosting environment carries a cost as well, so ensure that you choose an environment that you have the knowledge and ability to manage efficiently at scale.
Deploying multiple application instances can be an effective way to ensure that performance does not degrade as application usage increases, even for applications that were never designed to operate at an enterprise scale. In order to be successful, it is important to make thoughtful decisions about the infrastructure and internal tooling that is used to manage the application instances. By keeping your applications small and lean, you will be better able to predict application performance and ensure that the application can continue to scale well into the future.