Merging RPA and SRE to Automate Processes, Improve Reliability and Increase Productivity

Timothy Tan
Published in DBS Tech Blog
May 31, 2023

Challenges remain, but creating a Centre of Excellence can reduce toil and workloads.

By Jose Netto, Timothy Tan, Wong Zhen Wei, Joe Chin, Benjamin Hong

One of the realities of software engineering is that toil is inadvertently introduced into all workflows. For example, the more code we write, the more technical debt we introduce. The more compliance and release processes we implement, the more forms employees need to fill in. And the more applications we develop, the more effort technologists need to put in to support them. I could go on, but you get the gist.

Toil can be eased, but it is sometimes difficult to remove. While teams need to learn how to live with the presence of toil, DBS is always looking at it through a “tech-tinted” lens, asking ourselves, “How can we make this better with technology?”

This isn’t to say that DBS is free from toil. What we’ve done is take a few major steps to remove as much toil as we can using Robotic Process Automation (RPA). One of the places where we’ve found strong synergy is in embedding RPA into our core practice of Site Reliability Engineering (SRE).

What is RPA, and how does SRE come into the picture?

The core idea of RPA is to enable organisations to automate repetitive, manual, and time-consuming tasks by using software bots to mimic human actions, such as mouse clicks and keyboard inputs. These actions can be chained to perform meaningful business tasks like logins or creating a trade, allowing us to build full-fledged business process workflows.
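To make the idea of chaining actions concrete, here is a minimal Python sketch. It is not how any particular RPA tool works: `Workflow`, `type_text`, and `click` are hypothetical names invented for illustration, and the "actions" only update a dictionary rather than drive a real screen.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A step receives the shared context, performs one UI-like action,
# and returns the updated context.
Step = Callable[[Dict[str, str]], Dict[str, str]]


@dataclass
class Workflow:
    """A named chain of steps (hypothetical, for illustration only)."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def run(self, context: Dict[str, str]) -> Dict[str, str]:
        # Execute the steps in order, passing the context along the chain.
        for step in self.steps:
            context = step(context)
        return context


def type_text(field_name: str, value: str) -> Step:
    """Stand-in for a keyboard input into a named field."""
    def step(ctx: Dict[str, str]) -> Dict[str, str]:
        return {**ctx, field_name: value}
    return step


def click(button: str) -> Step:
    """Stand-in for a mouse click on a named button."""
    def step(ctx: Dict[str, str]) -> Dict[str, str]:
        return {**ctx, "last_clicked": button}
    return step


# Two simple actions chained into one business task: a login.
login = Workflow("login", [type_text("username", "svc_bot"), click("Sign in")])
result = login.run({})
```

Because each step has the same shape, larger workflows can be assembled by concatenating the step lists of smaller ones.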

In this regard, RPA is suitable for operational tasks that are done repetitively. One such example is seen in Investment & Trading Technology (ITT), which builds software, applications, services, and tools to support Treasury & Markets (T&M). Certain teams within ITT perform specific, well-defined tasks at fixed intervals. These include daily work that requires manual intervention, such as consolidating daily reports and onboarding new users.

At DBS, SRE combines software engineering with systems administration to ensure that our systems are reliable, fault-tolerant, and secure. It focusses on reducing the risk of human error. Here, RPA becomes a valuable tool in our tech arsenal to drive more efficient and reliable systems that deliver value to our users and the bank.

The Role of RPA in SRE

There are many types of work where RPA contributes to SRE. Examples include:

1. Automating repetitive tasks

RPA reduces the amount of manual and repetitive tasks, allowing SRE teams to focus on more value-adding and strategic initiatives. Examples of bots that were created for our teams include those that handle daily application health checks, and even assist in responding to customer inquiries.

2. Streamlining incident management and resolution processes

More likely than not, SRE teams already have well-defined processes in place for scenarios such as incident escalation or issue closure. Letting an RPA bot handle these workflows minimises human error.

3. Improving response time and reducing downtime

Incidents need to be handled in a timely manner, following pre-determined SLAs (Service Level Agreements). An RPA bot can reduce the overall turnaround time by providing an immediate first response as soon as an incident or event is triggered, removing the delay that arises when the on-call engineer is already preoccupied with another urgent task.
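The first-response idea can be sketched in a few lines of Python. This is an illustrative model only: the `Incident` class, the function names, and the 15-minute acknowledgement SLA are assumptions made for the example, not details of our incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    """Hypothetical incident record, for illustration only."""
    id: str
    opened_at: datetime
    acknowledged_at: Optional[datetime] = None


def first_response(incident: Incident, now: datetime) -> Incident:
    """Acknowledge the incident if nobody has done so yet."""
    if incident.acknowledged_at is None:
        incident.acknowledged_at = now
    return incident


def within_sla(incident: Incident, sla: timedelta) -> bool:
    """True if the incident was acknowledged within the SLA window."""
    if incident.acknowledged_at is None:
        return False
    return incident.acknowledged_at - incident.opened_at <= sla


# Assumed SLA value, chosen only to make the example concrete.
ACK_SLA = timedelta(minutes=15)

incident = Incident("INC-1", opened_at=datetime(2023, 5, 31, 9, 0, 0))
# The bot reacts seconds after the trigger, regardless of who is on call.
first_response(incident, now=datetime(2023, 5, 31, 9, 0, 5))
met = within_sla(incident, ACK_SLA)
```

The point of the sketch is that the acknowledgement time no longer depends on whether an engineer is free at that moment.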

Why ITT Uses RPA

As ITT is continuously churning out solutions for T&M, we noticed an ever-increasing demand as T&M sought to deliver more value-adding solutions for its customers.

Taking a third-party’s perspective, we can either distribute the management of new applications to existing employees or alleviate their workload by hiring more SRE staff.

However, neither approach is feasible in the long run, as we would either burn out existing employees or find ourselves growing the SRE team without end. This led us to experiment with RPA.

Figure 1: Structure of ITT’s RPA Centre of Excellence

Setting up ITT’s RPA Centre of Excellence

The high-level objectives of the team were to evangelise and promote RPA usage within the ITT support teams. To effectively do this, we set up an RPA Centre of Excellence, then agreed on the standards and direction of the use of RPA.

We structured the RPA Centre of Excellence by identifying and empowering colleagues to either be a core team member, or a subject matter expert (SME). Core team members are tasked with setting up proper processes, guidelines, and infrastructure within ITT, so that teams can easily onboard and create their scripts.

SMEs identify suitable use-cases and work closely with the core team members to automate these scenarios. These SMEs come from SRE teams, as they are heavily involved in the day-to-day work and know best which scenarios require automation. This puts them in a unique position to evangelise and promote the usage of RPA within their own teams.

Because of this structure, our RPA Centre of Excellence cuts across the entire ITT department, enabling us to foster close communication and to drive central initiatives such as training and best practices.

An ITT Success Story: Application Health Check

One of the teams the Centre of Excellence worked with identified an application health check as a suitable use-case for RPA automation.

Prior to the RPA implementation, the health check verification took at least eight steps to complete:

Figure 2: The multi-step application health check for ITT’s software

We drew out the entire workflow to visualise the work that was done, enabling teams to first get clarity on what was currently being performed manually. Reviewing the workflow made us realise that even if certain manual steps took just a few minutes to complete, they needed to be performed very frequently, sometimes even every hour.

While designing the script to enable automation, the team had to weave in a few considerations:

· The RPA script had to be created in a manner that allowed for easy extensibility in the future to accommodate other use-cases

· The steps, functions, and variable names had to be structured in a modular way, to facilitate reusability across other teams

This scenario had many upstream and downstream dependencies. This meant that the team had to be cognisant that some steps in the workflow could not be completed purely with the RPA tool. A mix of RPA scripts along with .bat script files written by other teams had to be carefully incorporated into the final solution to work seamlessly.
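One way to picture this mix of modular steps and externally owned scripts is the Python sketch below. It is an assumption-laden illustration rather than our actual implementation: `script_check` and `run_checks` are hypothetical helpers, the check names are made up, and small Python one-liners stand in for the .bat files so that the example runs on any platform.

```python
import subprocess
import sys
from typing import Callable, List, Tuple

# Each check is a (name, callable) pair; the callable returns True on success.
Check = Tuple[str, Callable[[], bool]]


def script_check(command: List[str]) -> Callable[[], bool]:
    """Wrap an external script (for example a .bat file) as a check.

    The check passes when the script exits with code 0.
    """
    def run() -> bool:
        return subprocess.run(command, capture_output=True).returncode == 0
    return run


def run_checks(checks: List[Check]) -> List[str]:
    """Run every check and return the names of those that failed."""
    failed = []
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            # A check that crashes is treated as a failed check.
            ok = False
        if not ok:
            failed.append(name)
    return failed


# Stand-ins for scripts owned by other teams (hypothetical).
passing_script = [sys.executable, "-c", "raise SystemExit(0)"]
failing_script = [sys.executable, "-c", "raise SystemExit(1)"]

checks = [
    ("service responds", lambda: True),
    ("upstream feed", script_check(passing_script)),
    ("downstream queue", script_check(failing_script)),
]
failures = run_checks(checks)
```

Keeping every check behind the same small interface is what lets another team add its own step without touching the runner.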

Over the course of 3 months, we successfully completed and deployed the application health check script. And because the script required no human intervention, the team could increase the frequency of health checks from being done once every hour, to once every 15 minutes.
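As a quick back-of-the-envelope illustration of what that change in frequency means, the snippet below works out the number of checks per day at each interval. The helper name is ours, and the only inputs are the two intervals mentioned above.

```python
MINUTES_PER_DAY = 24 * 60


def checks_per_day(interval_minutes: int) -> int:
    """Number of health checks that fit in a day at a fixed interval."""
    return MINUTES_PER_DAY // interval_minutes


hourly = checks_per_day(60)          # manual cadence: once every hour
quarter_hourly = checks_per_day(15)  # automated cadence: once every 15 minutes
increase = quarter_hourly // hourly  # how many times more checks per day
```

Going from 24 to 96 checks a day also means that a fault occurring just after a check waits at most 15 minutes, instead of up to an hour, before the next check runs.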

Advantages of implementing RPA in SRE

Figure 3: Advantages of implementing RPA in SRE

As outlined a few sections above, there are a few noteworthy benefits in implementing RPA.

1. Reduced manual toil

SRE teams can focus on higher-level tasks and on developing new tools and services. RPA enables a flywheel of progress, where an increasing number of tasks get automated, freeing up time to work on even more automation projects.

2. Creating a joyful workplace — better employee engagement

A continuous improvement culture is created, as pain points are continuously being addressed, and SRE teams are empowered to create their own bots, driving value in a new and different way.

3. Improved customer satisfaction

As tickets and queries are handled faster and more reliably, we see an increase in end-user satisfaction.

4. Increased cost savings

Reduced time in incident and ticket management results in cost savings for the organisation.

Challenges of implementing RPA in SRE

As much as benefits have been derived from RPA, we have also met with headwinds.

1. Integration with existing systems
At DBS, we pride ourselves in having a rich ecosystem of applications and tools that are developed in-house, such as IAM and secrets management. This meant that before we automated our first scenario, we had to first spend time integrating RPA with these systems. This ensured that SRE teams only needed to focus on automating the applications in question, and not worry about these upstream and downstream systems.

2. Automating modern GUI applications
As RPA is primarily targeted towards non-technical folks, there is a core reliance on the inbuilt capabilities of the RPA tool to identify and manipulate screen objects. As software tech stacks get increasingly modern and complex (e.g. widget-based applications, Shadow DOM), there will be situations where some coding still needs to be done for the bot to function properly.

3. Bot Maintenance in an Agile culture

Figure 4: Source: https://xkcd.com/1319/

As application teams deploy at an increasing cadence, the RPA bot needs to be updated regularly to keep up with changes. This inadvertently creates unexpected toil for the SRE team, who must constantly ensure that their bots are working and fix them if they aren’t. The reality is that there is an unspoken, hidden cost to maintaining the bots, no matter how good the use case is.

4. Governance and Compliance
As much as we’d like to automate everything, we are also aware that certain controls and processes are purposefully kept manual for gatekeeping purposes, ensuring governance and compliance standards are adhered to. For instance, it may not be wise to create an RPA bot to automatically deploy code into production. As such, SRE teams cannot afford to look at workflows from a purely technical perspective (‘Can this be automated?’) and must be able to put on different hats during feasibility assessments (‘Should we automate this?’).
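The two questions can be treated as separate gates in a feasibility assessment. The Python sketch below is a simplified illustration under our own assumptions: the `Candidate` fields and the three outcomes are hypothetical, and a real assessment would weigh many more factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A workflow being considered for automation (hypothetical model)."""
    name: str
    technically_automatable: bool  # 'Can this be automated?'
    is_manual_control: bool        # deliberately manual for governance reasons


def assess(candidate: Candidate) -> str:
    """Apply the technical gate first, then the governance gate."""
    if not candidate.technically_automatable:
        return "not feasible"
    if candidate.is_manual_control:
        # 'Should we automate this?' The answer is no, even though we could.
        return "keep manual"
    return "automate"


report = Candidate("daily report consolidation", True, False)
deploy = Candidate("production code deployment", True, True)

report_decision = assess(report)
deploy_decision = assess(deploy)
```

Separating the gates makes it explicit that a workflow can pass the technical check and still be rejected on governance grounds.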

Best Practices for implementing RPA in SRE

· Start small, clock fast wins
The best form of empowerment came from automating our first RPA bot, as outlined above. Even though it was a relatively simple use-case, it encouraged us to relook at the low-hanging fruit that could be automated, think deeper, and dream bigger about where else we could apply RPA bots.

· Create a culture of collaboration and learning
RPA was a topic that most had vague ideas about, but lacked a detailed understanding of. We got a small team of six developers together with the primary objective of sharing knowledge widely and rapidly. This enabled everyone to get up to speed, and to internalise that they were in this together.

· Engage your stakeholders
I’ve written a lot about the SRE teams and developers who stand to benefit from RPA implementation, but equally important are the key stakeholders: the senior management team and the SRE Team Leads. Many conversations were held to help them understand the benefits of this initiative. With their support, we easily sidestepped potential issues.

The Journey Continues in ITT

In the last couple of years, there were instances where other teams in ITT reached out to the RPA Centre of Excellence first, as they heard from others about the benefits of RPA.

The work is far from over — there are still bugs to fix, internal banking software and tools to integrate with, and problems to resolve. There is still so much left to do, but we’ve come a long way. Teams are taking initiative to create bots to ease toil, which is the goal of RPA in SRE.

Conclusion

We’ve described the roles that RPA can play in the discipline of SRE, and how the synergy can drive improvements in many different areas, fundamentally changing the way teams and organisations think about work. RPA has the potential to realise these easily, making it an important technology for organisations to consider adopting.

We have also described the challenges that we faced when implementing RPA. With far-sightedness and a strategic approach, organisations are most likely to overcome them and enjoy the benefits.

As technology gets increasingly complex, and as it permeates deeper into all organisations, the clarion call to identify ways to improve the way we support and serve our customers gets louder.

In addition to the articles already referenced throughout this post, here is some further reading that benefitted us on the journey toward understanding how RPA could be applied in DBS:

· Eliminating Toil
· How to Reduce Toil with SRE and Automation, Maria Homann
· Using RPA for DevOps Process Automation
· The Evolution of Automation at Google

About the Authors

Jose Netto is a Senior Vice President in Investment & Trading Technology at DBS Singapore. He heads the development teams whose core focus is on developing SRE and Test Automation Solutions. He is also designated Senior Principal Engineer focussing on enterprise architecture for financial applications.

Timothy Tan is an Assistant Vice President in Investment & Trading Technology at DBS Singapore. He leads development teams whose core focus is on developing internal solutions and tools to improve developer productivity and quality of life.

Wong Zhen Wei is an application developer at DBS Singapore. He is passionate about creating innovative applications to improve productivity and efficiency.

Benjamin Hong is an application developer who focusses on customer evangelism and empowerment, to make sure that RPA truly benefits teams and helps to make work joyful amongst our ITT colleagues.

Joe Chin is an application developer who is always keen to explore new implementations of Robotic Process Automation. He finds joy in assisting end-users in developing their own unique solutions.
