Peak Readiness, a license to play.

Darren Flynn
Published in adidoescode, Dec 10, 2021

With the holiday season comes the excitement of celebration, food and shopping! The e-commerce business eagerly awaits this time of year and, with globalization, the peaks now tend to spread throughout the year and across regions.

In sport, performance in key moments separates the champions from the contenders. In the digital world, the performance of your e-commerce and app business during peak periods separates the great from the good. To be champions, everybody involved with the team, from the backroom staff to the players, needs to be great together. So, let’s begin with one of the most important things: get yourself a champion team!

From an engineering point of view, peak periods are exciting times: systems are stretched to cater to volumes far beyond anything experienced during ‘normal’ business. For some, this is the pinnacle of panic; for others, it’s a license to play. Systems and solutions have limitations, and peak necessitates overcoming these obstacles. Leaps of creativity are required as people collaborate across internal and external teams and partners to co-create innovative solutions for resilience, scalability, security, and performance.

Peak performance is an outcome, one that starts with ensuring your team understands the ‘why’ and has the freedom to design the ‘how’ and build the ‘what’. For e-commerce today the ‘what’ is your foundation: ideally multi-cluster, cloud-native applications consisting of idempotent microservices that are geographically dispersed across multiple availability zones in the cloud. This architectural approach minimizes the risk of a geographic or localized outage impacting your business, and the idempotent design guards against a service reacting incorrectly when it receives the same message more than once (a common occurrence with geographically dispersed services communicating across the internet).
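To make the idea of idempotency concrete, here is a minimal Python sketch of a service that can safely receive the same message twice. It is an illustration only: the `OrderService` class, the `idempotency_key` parameter and the in-memory storage are assumptions made for this example, not a description of any particular platform.

```python
from dataclasses import dataclass, field


@dataclass
class OrderService:
    """Toy in-memory order service (hypothetical) that deduplicates by key."""

    # Maps an idempotency key to the order id already created for that key.
    _seen: dict = field(default_factory=dict)
    _orders: list = field(default_factory=list)

    def create_order(self, idempotency_key: str, sku: str, quantity: int) -> str:
        # A retried or duplicated message carries the same key, so the
        # original result is returned instead of creating a second order.
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]
        order_id = f"order-{len(self._orders) + 1}"
        self._orders.append({"id": order_id, "sku": sku, "quantity": quantity})
        self._seen[idempotency_key] = order_id
        return order_id

    def order_count(self) -> int:
        return len(self._orders)
```

Calling `create_order` twice with the same key returns the same order id and leaves a single order on record. In a distributed setup the key store would need to be shared and durable, which this sketch deliberately leaves out.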

Fast is never fast enough. To ensure an optimal customer experience, content needs to be as close to the consumer as possible. Content delivery (presentation to consumers) should be optimized and cached via CDN and indexing services that are scaled and located in line with your customer populations and business needs.

A journey starts with a single step, and to ensure that the steps are going in the right direction it’s important to know the goal, check in often and re-orientate when necessary. Planning and preparing for peak means aligning with the business and technology teams on the list of essential requirements and priorities. Of course, everything depends on capacity and feasibility and is orientated around two critical priorities: Growth and Stability. If you are not seeing improvements in these areas on your journey, then course correct.

eCom: The field of play

Growth

Peak readiness requires collaboration and trust: systems and technology must perform at scale, business teams need to estimate the required volumes accurately, and operational teams need to provide the products and processes to achieve this. Clear ownership, value-driven outcomes and aligned effort are essential.

1: Estimate the scale.

Volume estimation and planning by period, market, campaign and ideally day and hour, so that teams are in a position to react, adapt and prioritize effort, requires a deep understanding of market conditions and a strong data analytics capability. Delivering this information in a timely manner fosters the trust required to focus on designing for peaks.

2: Prioritise the business needs.

Customer-facing teams in the markets should identify not just the pricing and product mix that will drive the volumes; they must also define and prioritise the capabilities and campaigns that will inspire browsers to buy and attract visitors to their eCom and App ecosystems. Given that there are always more things to do than time to do them, the challenge is also to align on what will not be done.

3: Scale up your operational teams

Achieving peak sales requires significant increases in visitors and customers, and while great solution design improves the ratio of orders to service calls, Consumer Service and Business Operations teams must be scaled up with trained staff to manage the increased customer and order volumes. To further enhance services, RPA (robotic process automation) could be implemented to automate recurring tasks, and AI chatbots could be trained to augment service and provide seamless, human-like interaction for customers, with answers to commonly occurring queries.

4: Strengthen your SCM and logistics.

Equally critical: once customers have been inspired to purchase and the solutions have been scaled to create and process the orders, the stock must be available for timely delivery. The supply chain, from factory floor through DC (distribution center) to end consumer, has to function and deliver the correct product to the consumer consistently.

Stability

A culture of openness and trust across internal and external team members and partners is of huge benefit when looking to improve critical areas such as security, resilience, scalability, and performance. The people who know the limitations and areas of concern are the teams themselves, and their trust is essential to identifying what should be improved. Mindset is everything: teams should be comfortable calling out problems (especially when they don’t personally have the solution). Structures should encourage a culture of value and customer first, excellence always. If not, problems might be hidden or ignored as edge cases. It’s absolutely critical that teams collaborate early to identify where effort is required; it may be in the gray areas, the intersections.

5: Identify the constraints and risks.

Teams should critically analyze their areas and dependencies and work together to identify potential improvements. If you don’t have a solid architectural foundation, then work on this first. If you do, then preparing for peak is playing “what if”: injecting problem scenarios and then mitigating them, initially reactively but, through learning, progressing to proactive auto-correction. A blameless culture, coupled with a strong focus on agile delivery of value, encourages teams to stretch the boundaries of possibility. Both are essential to this type of chaos engineering, and the improved resilience and consistent customer value prove the benefit.

6: Validate that the environment scales.

Teams and partners need to carry out load testing across all internal and external components. Even if auto-scaling is already in place, applications should be pre-scaled to minimize the risk of resource constraints and guarantee resource availability for volumes above the maximum expected. Application auto-scaling facilitates fast reaction to changes in resource requirements. However, customer volume increases during global peaks are often related to global events; consequently, every business may be placing the same demands on resources at the same time. So, auto-scale is good, but pre-scale across clusters and geographies might be better.
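As a rough illustration of pre-scaling, the Python sketch below turns an expected peak into a replica count with headroom and spreads it across clusters. The function names, the headroom percentage and the even split are all assumptions for the example; a real split would follow customer populations and load-test results.

```python
def replicas_for_peak(peak_rps: int, per_replica_rps: int, headroom_pct: int) -> int:
    """Replicas needed to serve peak_rps plus headroom_pct percent extra."""
    if per_replica_rps <= 0:
        raise ValueError("per_replica_rps must be positive")
    target = peak_rps * (100 + headroom_pct)
    capacity = per_replica_rps * 100
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target // capacity)


def split_across_clusters(total: int, clusters: list) -> dict:
    """Spread replicas as evenly as possible (an assumed policy)."""
    if not clusters:
        raise ValueError("at least one cluster is required")
    base, extra = divmod(total, len(clusters))
    # The first `extra` clusters take one additional replica each.
    return {name: base + (1 if i < extra else 0) for i, name in enumerate(clusters)}
```

For example, an expected peak of 12,000 requests per second, 150 requests per second per replica (a figure that would come from load testing) and 30% headroom gives 104 replicas, which an even split over three clusters distributes as 35, 35 and 34.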

7: Have solutions for when things go wrong.

A chain is only as strong as its weakest link, so once you’ve identified your weak links it’s time to mitigate. eCom is not just about how well everything works when things are perfect; it’s also about how consistently well things work when there are problems. Some problems will stop the process, while others can be mitigated by deferring work or degrading service; it’s vital to know which is which. Product Landing Pages (PLPs) and Product Detail Pages (PDPs) are critical components of the customer experience, so measure load times and optimise where necessary. If not already in place, consider prioritising graceful degradation options that provide an acceptable level of service should customers experience delays. This could mean optimizing long-running, customer-facing browser tasks or switching to static rather than dynamic content if thresholds are breached. It may also be worth validating and, if necessary, enhancing circuit breakers to ensure that degradation of service is managed in a consistent and controlled manner. Additionally, identify options for both security and performance improvements; some changes could bring optimizations above 50%.
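The circuit breaker and static fallback ideas can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `CircuitBreaker` class, its thresholds and the `primary`/`fallback` callables are hypothetical, and the half-open state of a full circuit breaker is reduced here to a simple reset after a cool-down.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Simplified breaker: open after N consecutive failures, reset after a cool-down."""

    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock  # injectable so the cool-down can be tested
        self._failures = 0
        self._opened_at: Optional[float] = None

    def is_open(self) -> bool:
        if self._opened_at is None:
            return False
        if self._clock() - self._opened_at >= self.reset_after_s:
            # Cool-down elapsed: close the breaker and let traffic through again.
            self._opened_at = None
            self._failures = 0
            return False
        return True

    def call(self, primary: Callable[[], str], fallback: Callable[[], str]) -> str:
        # While open, skip the struggling dependency and serve degraded content.
        if self.is_open():
            return fallback()
        try:
            result = primary()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            return fallback()
        self._failures = 0
        return result
```

Here `primary` might render a dynamic product page and `fallback` might return a cached static version. Customers always get a response, and the failing dependency gets time to recover.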

Payment is the point of pain for the customer: if a person commits to a purchase, you certainly don’t want issues during or after authorization to stop or reverse the sale. Having multiple PSPs (payment service providers), risk partners and acquirers, with seamless routing of payments across your external partners, may not be possible for all, but irrespective of your size a strategy to mitigate issues should be predefined and ready to implement if necessary. At the very least, consider a process to disable payment methods when thresholds are breached and a mechanism to clearly communicate the service impact to your customers.
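One possible shape for the "disable payment methods when thresholds are breached" process is sketched below in Python. Everything in it is an assumption for illustration: the `PaymentMethodMonitor` class, the rolling window, the failure-rate threshold and the wording of the customer notice.

```python
from collections import deque
from typing import Optional


class PaymentMethodMonitor:
    """Tracks recent authorization results per payment method (hypothetical)."""

    def __init__(self, window: int = 50, max_failure_rate: float = 0.2,
                 min_samples: int = 10) -> None:
        self.window = window
        self.max_failure_rate = max_failure_rate
        self.min_samples = min_samples
        self._results: dict = {}

    def record(self, method: str, success: bool) -> None:
        # Keep only the most recent `window` results for each method.
        self._results.setdefault(method, deque(maxlen=self.window)).append(success)

    def failure_rate(self, method: str) -> float:
        results = self._results.get(method)
        if not results:
            return 0.0
        return results.count(False) / len(results)

    def is_enabled(self, method: str) -> bool:
        # Too few samples to judge: leave the method available.
        if len(self._results.get(method, ())) < self.min_samples:
            return True
        return self.failure_rate(method) <= self.max_failure_rate

    def checkout_options(self, methods: list) -> tuple:
        """Return the enabled methods and an optional notice for customers."""
        enabled = [m for m in methods if self.is_enabled(m)]
        disabled = [m for m in methods if not self.is_enabled(m)]
        notice: Optional[str] = None
        if disabled:
            notice = "Temporarily unavailable: " + ", ".join(disabled)
        return enabled, notice
```

Note that a disabled method receives no new traffic in this sketch, so it would never recover on its own. A real process would also need a way to re-enable it, for example a manual switch or periodic test transactions.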

8: Align the technical needs with your business partners

Once priority requests have been analyzed and reviewed by the product and platform teams and by any partners, it is common sense to create backlogs so that effort is focused where it matters: on delivery of value. However, it is also critical to ensure that your business and external partners appreciate the necessity, as their understanding of the priorities may be different. Share information to create a common understanding and shared purpose; ‘we’ are stronger than ‘us’ and ‘them’. By channeling focus in this manner, you will be able to sub-divide work, mitigate obstacles and focus on continuous delivery of value.

9: Focus on Release Excellence

The world doesn’t stop, and neither can progress. Every change is a risk, so it is best to work together to mitigate it. Give teams the responsibility: they are the experts, so let them define what to test and when to release (within guidelines). Objectively measure release impact and guide teams to continuously improve both product stability and release experience. Create a shared, unified global release calendar and make everything that will impact your ecosystem visible to all (including internal and external releases and major events). This allows teams to focus on proactive planning and mitigate risk during releases. If releases are successful, celebrate; if not, conduct blameless postmortems to learn and reduce the chance of issues recurring.

10: Make everything visible

It’s difficult to manage what you can’t measure and impossible to react if you don’t know what’s happening. None of the above is possible without observability, so designing and implementing a comprehensive observability and alerting capability across the entire Digital Ecosystem is essential. This is a journey: once you’ve provided end-to-end visibility of both the business process and the underlying technology components, work to continuously improve. Two critical dimensions are granularity and timeliness; by focusing on different levels of each you should be able to automatically alert the appropriate teams when either gradual or sudden degradation occurs. It makes sense to enhance this capability in line with your service quality expectations: tuning alerts to reduce noise, linking process guides to alerts to increase resolution speed, and implementing AI-driven alerting are all areas to consider.
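The difference between sudden and gradual degradation can be shown with a small Python sketch over a series of latency samples. The function name, window sizes and ratios are assumptions chosen for the example; real alerting rules would be tuned to your own service quality expectations.

```python
from statistics import mean
from typing import Optional


def detect_degradation(samples: list, baseline_size: int = 5, recent_size: int = 5,
                       sudden_ratio: float = 2.0,
                       gradual_ratio: float = 1.3) -> Optional[str]:
    """Classify positive latency samples (oldest first) as 'sudden', 'gradual' or None."""
    if len(samples) < baseline_size + recent_size:
        return None  # not enough data to compare windows
    latest, previous = samples[-1], samples[-2]
    # Sudden: the newest sample jumps sharply relative to the one before it.
    if latest >= sudden_ratio * previous:
        return "sudden"
    # Gradual: the recent average has drifted well above the early baseline.
    baseline = mean(samples[:baseline_size])
    recent = mean(samples[-recent_size:])
    if recent >= gradual_ratio * baseline:
        return "gradual"
    return None
```

A flat series returns `None`, a series that ends in a spike returns `"sudden"`, and a slow climb returns `"gradual"`. Each outcome could be routed to a different team or linked to a different process guide.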

In Ireland, there is an old saying, “Maith an t’osnaigh is leath na h’obair” (A good start is half the work), and hopefully the thoughts above give you just that: a good start. Peak performance is an outcome, and with a strong team collaborating to achieve a shared purpose, there is no limit to the peaks that can be scaled. Of course, it’s all common sense: surround yourself with great people, empower them and trust them, give them a license to play and get out of the way. So if you didn’t do it this year, don’t worry: do the postmortem, learn, and make sure to give it the best start next year!

Disclaimer: The views, thoughts, and opinions expressed in the text belong solely to the author, and do not represent the opinion, strategy or goals of the author’s employer, organization, committee or any other group or individual.
