How to provoke a stabilization roadmap? — PART 1

Daniel Duarte
Published in Agile Insider
9 min read · Feb 18, 2023

Connecting business concepts with technical context to strengthen the relationship with the team and identify solutions.

Attention! This article is for us product people, to help us bring changes to the business roadmap that will benefit us. AT NO TIME is the idea that you develop technical skills.

There comes a moment in product development when we need to stop and evaluate what we have already implemented. Is our software stable? Are we causing crises in production? Are we using the latest industry standards? Are we leaving technical debt that could endanger services? Are we vulnerable to cyberattacks? The questions are endless, and they can be summarized in a single theme: the stabilization roadmap.

This type of roadmap is mostly technical and lies well beyond our usual knowledge, which generally concerns business and markets.

I have already mentioned in several other articles how important the relationship of trust between Product and IT is. And yes, it helps a lot when the team has this good relationship, so that the people who develop proactively tell us what is wrong, or what could be better, in the code. However, this is not always possible, and this is where we in product at least have to know that certain tools exist (emphasis on “exist”). At no time should we operate these tools to build backlogs ourselves.

One practice that helps here is foreign to most product people: it’s called a “Spike”.

Origin of the Spike and why you should be a fan of it

A Spike is a type of exploration Enabler Story. It comes originally from XP (Extreme Programming) but became famous through SAFe (Scaled Agile Framework). It is used whenever a developer needs to gain more knowledge about a tool, a problem, or a market solution, in order to reduce risk during implementation or to give more confidence in the proposed solution. It is perfect for cases where you are dealing with a legacy platform, new teams, or new members, or when we need to understand what problems exist in already implemented solutions that do not necessarily have to do with business definitions.

Because an act of exploration takes time, it must be planned for in the iteration and adopted as an exceptional measure, not as a routine. After all, every delivery produced by a squad carries risk; if a Spike is attached to every delivery, we go back to the waterfall era and go against the concept of “fail fast and fix fast”.

Many authors argue that a Spike should happen while the sprint is under way, “without disturbing the deliverables” (SOARES, 2021). However, according to Leffingwell (2011), SAFe itself takes the opposite view: since a Spike is a type of Story, it should at least be planned as an exploration activity within the iteration. Its ultimate goal is to generate Stories, so if it ends before the iteration does, those Stories can be worked on right away; otherwise they will be prioritized later.

In my experience applying Spikes, when the team agrees that one developer will do the Spike and present the findings to the team at the end, and that developer is not interrupted, you can get good results in about 2 days. However, it is important to know that this is not a rule. Depending on what you are exploring, it can take much longer or much less.

Important: some authors, such as Sohrab Salimi, argue that Spike is a synonym for POC (proof of concept). However, they are completely different things. As said, the Spike exists to gain knowledge and avoid risks, while the POC exists to prove, on a smaller scale, that a solution works or that a concept is in fact valid before moving on to the definitive solution in production.

Improvement actions on the development pipeline

Imagine a bread factory. The CEO of that factory certainly doesn’t know which machines are on the production line, but if the factory doesn’t have the best machines, or isn’t using them in the best way, do you agree that production is probably compromised or inefficient? The development pipeline of a piece of software is no different.

Frequent, integrated issue meetings

Although there are several tools that can help identify the biggest problems a solution may have, sometimes the simplest thing is to gather all the developers in a room and ask whether we have technical debt, vulnerabilities, or parts of the code that can be improved or refactored.

Automated tests

This is a modern practice in which scripts are written so that tests run automatically after the software is developed. It brings agility to deliveries, but it depends on a long and time-consuming step of writing those scripts.

“There is no way to automate delivery to users if there is a manual and time-consuming step in the delivery process” (REHKOPF)

In my experience, at least the main journeys and functionalities of your solution should be covered by automated tests. It is worth finding out from the team whether they are and, if not, prioritizing that work in the pipeline.
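To make the idea concrete for a conversation with the team, here is a minimal sketch of what an automated test for a main journey can look like. Everything in it is hypothetical: the `checkout_total` function, the prices, and the coupon rule are assumptions made up for illustration, not taken from any real product.

```python
# Hypothetical example: the "checkout" journey of an online store.
# Function names, prices, and the coupon rule are illustrative assumptions.

def checkout_total(prices, coupon_percent=0):
    """Return the order total after applying a percentage coupon."""
    if not 0 <= coupon_percent <= 100:
        raise ValueError("coupon_percent must be between 0 and 100")
    subtotal = sum(prices)
    return round(subtotal * (1 - coupon_percent / 100), 2)


def test_total_without_coupon():
    assert checkout_total([10.0, 5.5]) == 15.5


def test_total_with_coupon():
    assert checkout_total([100.0], coupon_percent=10) == 90.0


def test_invalid_coupon_is_rejected():
    try:
        checkout_total([100.0], coupon_percent=150)
    except ValueError:
        return  # the journey correctly refused the invalid coupon
    raise AssertionError("an invalid coupon should be rejected")


def run_journey_tests():
    """Run every check and report which ones passed or failed."""
    tests = [
        test_total_without_coupon,
        test_total_with_coupon,
        test_invalid_coupon_is_rejected,
    ]
    results = {}
    for test in tests:
        try:
            test()
            results[test.__name__] = "passed"
        except AssertionError:
            results[test.__name__] = "failed"
    return results


if __name__ == "__main__":
    for name, outcome in run_journey_tests().items():
        print(f"{name}: {outcome}")
```

In practice a team would likely use a test framework and run these checks on every change. The point for us in product is only that each vital journey can have checks like these, and that we can ask which journeys already do.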

Gain: agility in the implementation of new solutions and improvements in those flows considered vital.

Log File

These are files that record system occurrences and events. Depending on how they are designed, they can track not only a user accessing your application, but also the call and response of each system involved in that operation.

This is one of the points where I recommend the most care and attention from product professionals. Although we do not control or manipulate the logs ourselves, it is good to provoke the development team to isolate, from time to time, all the error logs produced over a given period. This can be rich material for us to build a new backlog.

I’ve worked in places where this topic was taken so seriously that a squad was formed just to analyze all the error logs that appeared in production. This work was important because, as it’s not a common subject for us in business, we had no idea that 10% of all daily accesses to our applications produced error logs. We had about 400,000 accesses per day, so 40,000 of them had error logs whose effect on the customer no one knew. We didn’t know whether customers were logged out, whether we lost sales, whether it was a false positive, or anything else.

A tip is to set aside some time with the development team to analyze the most frequent error logs, promote Spikes to analyze their causes, their effects, and how to fix them, and then prioritize the fixes in normal sprints.
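As a rough illustration of that analysis, the sketch below counts which errors appear most often and what share of sessions hit at least one error. The log format (timestamp, level, session id, message) and the sample lines are assumptions invented for this example; real logs will look different, and a monitoring tool would normally do this work.

```python
from collections import Counter

# Hypothetical log lines; the format (timestamp, level, session id, message)
# is an assumption for illustration, not the format of any specific tool.
SAMPLE_LOGS = [
    "2023-01-04T10:00:01 INFO session=a1 PageLoaded",
    "2023-01-04T10:00:02 ERROR session=a1 PaymentTimeout",
    "2023-01-04T10:00:03 INFO session=b2 PageLoaded",
    "2023-01-04T10:00:04 ERROR session=c3 PaymentTimeout",
    "2023-01-04T10:00:05 ERROR session=c3 CartNotFound",
    "2023-01-04T10:00:06 INFO session=d4 PageLoaded",
]


def summarize_errors(lines):
    """Return (share of sessions with at least one error, error counts by message)."""
    all_sessions = set()
    sessions_with_errors = set()
    errors_by_message = Counter()
    for line in lines:
        _timestamp, level, session, message = line.split(maxsplit=3)
        all_sessions.add(session)
        if level == "ERROR":
            sessions_with_errors.add(session)
            errors_by_message[message] += 1
    rate = len(sessions_with_errors) / len(all_sessions) if all_sessions else 0.0
    return rate, errors_by_message


def affected_accesses(daily_accesses, error_rate):
    """Estimate how many daily accesses are touched by errors."""
    return round(daily_accesses * error_rate)


if __name__ == "__main__":
    rate, errors = summarize_errors(SAMPLE_LOGS)
    print(f"Sessions with errors: {rate:.0%}")
    for message, count in errors.most_common():
        print(f"{message}: {count}")
    # The figures from the story above: 10% of about 400,000 daily accesses.
    print(affected_accesses(400_000, 0.10))
```

The ranked list of messages is the kind of output that can seed Spikes: the most frequent error is usually the first candidate to investigate.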

New Relic is a great example of a tool that can help with this monitoring of environments, logs, and events.

Gain: simplification of the problem analysis process during crises, or identification of new errors that may be occurring in production.

Version control

Especially if your company is starting to grow and several developers are working on the code, it is important to study solutions that record who changed the code that is in production, and when.

GitHub is one of the great examples of platforms that can help in this regard.

Gain: security and control over who modified the code and when.

Bug analysis

There is software that analyzes the developed code, looking for weaknesses, gaps, and signs of possible bugs as the code is integrated. Some of these tools even provide scores for the quality of that code.

For these tools, a goal that is in the hands of the business, and that we can provoke, is to raise the code quality scores for the main journeys that customers navigate.

FindBugs, PMD, and SonarQube are examples of software that performs this service.

Gain: knowing in advance the weaknesses, gaps, and points of improvement in the code.

Vulnerability analysis

The risk of cyberattacks has become a frequent problem in web and application development. Some software can analyze the code from development to production and point out to the team, preventively, what can be improved or which practices were left out during design.

Veracode is one of the best-known vulnerability analysis tools.

Gain: Mitigate the risk of cyberattacks.

Front-end data capture (Keylogger)

This is a form of data capture that records all keyboard actions. It is commonly used by spyware, but also by companies such as e-commerce businesses and others that work with a search bar and need to understand what the user is typing on screen.

It differs from simple tagging in analytics software, as its implementation requires specific code and specific destinations to determine where to capture the data entered by the user and where the information will be stored.

When I worked at a company where my squad was responsible for a search engine, keyloggers helped us understand the terms entered by users and how to improve search results for them.

Gain: knowing fully the data entered by the user on screen, in order to improve the experience, search results, or taxonomy.

Web page performance tool

There are tools that evaluate the performance of websites and web applications, reporting the score each page achieves on different aspects against market benchmarks.

I highlight Google Lighthouse, available for free in every Google Chrome browser, which scores a loaded page on 5 aspects: SEO, loading performance, accessibility, PWA responsiveness, and the development best practices recommended by Google.

I’ve had teams where there were constant complaints about the performance of websites and web applications that no one could ever isolate. By promoting a Spike with this tool, we found that our site averaged a score of 40 (out of a maximum of 100). Since Google provides not only the score but also the improvements that can be made, we ran sprints to correct the problems Google pointed out, and the improvement was easily perceived by the stakeholder.
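The path from a report to a backlog can be sketched in a few lines. The example below assumes the report has been exported as JSON with a `categories` section (scores from 0 to 1) and an `audits` section (individual checks). The sample data is hand-written for illustration, and the 0.9 threshold is an arbitrary assumption, so treat this as a sketch rather than a description of how any team actually processes Lighthouse output.

```python
# A trimmed, hand-written sample shaped like a Lighthouse JSON export.
# The values are invented; only the overall layout is assumed to match.
SAMPLE_REPORT = {
    "categories": {
        "performance": {"title": "Performance", "score": 0.40},
        "accessibility": {"title": "Accessibility", "score": 0.85},
        "seo": {"title": "SEO", "score": 0.92},
    },
    "audits": {
        "render-blocking-resources": {
            "title": "Eliminate render-blocking resources",
            "score": 0.2,
        },
        "uses-optimized-images": {
            "title": "Efficiently encode images",
            "score": 0.5,
        },
        "document-title": {
            "title": "Document has a <title> element",
            "score": 1,
        },
        # Informational audits carry no score and are skipped below.
        "network-requests": {"title": "Network Requests", "score": None},
    },
}


def category_scores(report):
    """Convert each category score from the 0-1 scale to the familiar 0-100 scale."""
    return {
        category["title"]: round(category["score"] * 100)
        for category in report["categories"].values()
        if category.get("score") is not None
    }


def backlog_candidates(report, threshold=0.9):
    """List the titles of audits scoring below the threshold, worst first."""
    failing = [
        (audit["score"], audit["title"])
        for audit in report["audits"].values()
        if audit.get("score") is not None and audit["score"] < threshold
    ]
    return [title for _score, title in sorted(failing)]


if __name__ == "__main__":
    print(category_scores(SAMPLE_REPORT))
    for title in backlog_candidates(SAMPLE_REPORT):
        print("Backlog candidate:", title)
```

Sorting the failing audits from worst to best gives a first, rough ordering for the correction sprints; the team would still need to weigh effort and customer impact.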

Gain: measure the performance and quality of the pages proactively, generating a backlog of corrections and improvements.

Conclusion

If you are now wondering what to do to improve the stability of your platform, there are paths we can follow to start creating backlog items, but it will depend on your adoption and acceptance of the Spike culture and on open communication with the development team.

In addition to being open to understanding that the code may have problems and that we need to solve them, when there are known problems and we don’t know where to start, it is worth asking whether the quality standards of the pipeline favor eliminating those problems.

Questions we can still ask ourselves:

1. Do we have test automation processes in place?

2. Do we have enough logs to work with if there is a crisis?

3. Or are they already indicating that we have problems in the code?

4. Do we have full control of the versions we are changing in the code?

5. Do we have software that helps us predict bugs and vulnerabilities?

6. Do I know how my websites and web applications are performing?

Many may question whether this will add value to the business, but every efficient pipeline optimizes things that will actually affect our stakeholders’ lives.

References

LEFFINGWELL, Dean. Agile Software Requirements: Lean Requirements Practices for Teams, Programs, and the Enterprise. Addison Wesley, 2011. Available at < https://www.scaledagileframework.com/spikes/#:~:text=Spikes%20are%20a%20type%20of,investigation%2C%20exploration%2C%20and%20prototyping. > Accessed on 12/31/2022 at 00:17;

SOARES, Daniela, 1 — Is it wrong to have spikes within the sprint that the team is working on, since they are tasks that are not scored due to lack of details?. Alura [forum]. November 2021. Available at <https://cursos.alura.com.br/forum/topico-1-e-errado-ter-spikes-dentro-da-sprint-que-o-time-esta-trabalhando-visto -which-are-tasks-not-scored-due-to-lack-of-details-163227 > Accessed 12/31/2022 at 00:24;

SALIMI, Sohrab. Spike. Agile Academy [Portal]. Available at <https://www.agile-academy.com/en/agile-dictionary/spike/> Accessed on 12/31/2022 at 00:38;

REHKOPF, Max. Automated software testing. Atlassian [Portal]. Available at <https://www.atlassian.com/us/continuous-delivery/software-testing/automated-testing> Accessed on 12/31/2022 at 00:50;

LOCAWEB. Discover 7 tools and websites that assess code quality. LocaWeb [Blog]. November 2021. Available at <https://blog.locaweb.com.br/temas/codigo-aberto/conheca-3-ferramentas-e-sites-que-avaliam-a-qualidade-do-codigo/> Accessed on 12/31/2022 at 10:50 am;

NASCIMENTO, Anderson. What is keylogger?. CanalTech [Portal]. July 2014. Available at <https://canaltech.com.br/seguranca/O-que-e-keylogger/> Accessed on 12/31/2022 at 10:31 am;

Unknown author. Google Lighthouse: User Guide to Audit and Optimize Websites. SAN Internet [Blog]. January 2023. Available at <https://blog.saninternet.com/google-lighthouse> Accessed on 1/4/2023 at 11:15 PM;

MIGUEL, Robertson. How to monitor logs at different levels with the best tools. Vindi [Blog]. Available at <https://blog.vindi.com.br/logs-monitoramento/> Accessed on 01/04/2023 at 23:26;

Daniel Duarte
Agile Insider

10 years in digital products and leadership. Specialist in innovation, payment methods, edtech, insurtechs, and market as a services.