The new IT Project Triangle

Tristan Henry · Published in The Startup · May 6, 2020 · 14 min read

During my career, I have worked on dozens of digital projects in several consulting companies. All these projects either start from scratch or add new functionalities to an existing solution, but the main equation is the same: “a list of X features must be delivered for $Y within Z months”.

The list of features is stated by the client, then the sales/expert team provides the cost based on the effort, and the project/resource managers share the timeline. Unfortunately, the constraints of the equation (scope, budget and timeline) are based on early hypotheses which often turn out to be either slightly different or totally incorrect. Reviewing the initial equation requires the team to revise the list of features, the budget or the timeline. This rework is a Cornelian dilemma with no good answer for any of the parties.

In the following sections, I’ll review why focusing only on the initial constraints creates tension, leading teams to deliver new functionalities that may not satisfy the business goal and may deteriorate the solution currently in place. Then I’ll propose a new set of indicators to better deal with schedule issues, monitor the added value and ensure the project’s stability.

The initial vision

Building a project requires finding the right balance between three constraints:

  • Time
  • Cost
  • Quality

A Proof of Concept sacrifices quality to get a quick result at a minimum cost. If security and stability are core to a project, the implementation will require significant time and cost.

However, an experienced consulting company has to deliver a standard level of quality; a quick-and-dirty approach is not conceivable. The three constraints therefore become:

  • Time
  • Cost
  • Scope

Projects are too optimistic

Once the right constraints are established for a project, the default assumption is that the scope will be delivered on time for the budgeted cost. However, this assumption relies on the following facts:

  1. The scope is complete and perfectly understood
  2. The effort is correctly estimated
  3. The timeline is properly set up

The more experienced all the parties are, the more fully the scope is described and the more accurate the cost estimation is. However, humankind tends to be too optimistic and the duration is very often underestimated. A 2017 PMI survey indicates that 49% of IT projects are late. The same survey shows that 31% of projects did not reach their intended business goal, and that 43% of IT projects exceeded their initial budget.

Dealing with issues

Given that a significant share of projects establish unrealistic constraints, alternative options must be prepared to deal with potential issues, e.g. managing the fact that a feature’s development is taking longer than expected. With three constraints it’s possible to act on the scope, the timeline or the cost (if the resources are available). This kind of discussion can take ages and degrade the project dynamic. Fortunately, there are several options to deal with those issues more efficiently.

Reducing the complexity

In order to deal with issues more efficiently, one constraint can be fixed. This helps the decision-making process by offering fewer possible actions. Here are some examples:

  • A fixed timeline approach helps to focus on the work to do before the critical deadline. If unscheduled work appears it will affect the scope or the cost (if additional resources are available)
  • A fixed price or fixed team project helps to focus on what can be done within the budget. This approach stresses the remaining constraints, meaning that issues will result in delivering late, with a reduced scope or with a lower quality
  • A fixed scope defined by the initial statement focuses the effort on how to deliver the list of features within time and budget. If an issue occurs, the project will end up delivering late and/or over budget

These approaches help manage issues by limiting the measures available to deliver the project given the constraints. Some projects decide to fix more than one constraint, such as scope and cost. This extreme approach is really challenging since it pushes all the pressure onto the one remaining constraint. In all cases, reducing the complexity does not reduce the tensions on the project. In the end, the solution will require sacrificing part of the scope, increasing the budget, delivering behind schedule, or a mix of these.

Become Agile

Nowadays, agile methodologies are the mainstream approach on IT projects (PMI’s surveys). The main idea is to deliver more often to ensure that the scope and its added value are aligned with the business goal. Generally with an agile team, the time and the effort/team are fixed for each iteration, but the scope will vary depending on the estimates’ correctness. An agile ceremony called the retrospective occurs at the end of each sprint to review what went well, what issues were encountered and how the next iterations can be improved.

Agile approaches reduce the complexity by fixing the time and cost constraints for each sprint. It means that if issues occur, the scope is automatically reduced. If retrospectives are well executed, estimations improve over time, making it more likely that the planned scope is delivered by the end of each sprint. However, agile approaches do not always take the system’s durability into consideration, nor do they guarantee that each new sprint makes the product better than the previous one.

Limitations

Reducing the complexity by fixing constraints or becoming agile is an aid to manage issues, but it does not help to deal with:

  • Risk management
  • Scope reduction
  • Added value measure
  • Performance degradation

Indeed, by stressing only one or two constraints, the teams end up always having the same conversation: “Why did we remove that feature?”, “Why do we need to push the timeline?” and “Why do we have this additional cost?”. These questions are genuine requests, but they are based on the initial estimation of effort and schedule, which is only a first understanding of the project, not a golden rule to follow in order to deliver added value.

Finding actionable indicators

In order to shift the conversation and focus on progressive indicators, it’s possible to use a new project triangle. It contains three main aspects:

  • Assessing the business impact
  • Maximizing the timeline accuracy
  • Assuring the technical stability

Assessing the business impact

Unit added value

The most important element of all projects is to deliver value. There are many ways to bring added value to the customer, and some, such as brand reputation, are difficult to measure. Here we are focusing on the measurable components:

  • The initial cost
  • The operational cost
  • The additional revenue

A project can often face the following situation: a new feature was released after the deadline and over the forecasted cost. This situation is frustrating for all the parties, but the lifecycle of a feature does not stop after the initial release. We need to evaluate the big picture by understanding whether the time and cost spent are worth it in the long run. In other words, the sentence should move from “We delivered a feature with a one-month delay and it cost 150% of the initial budget” to “We delivered a feature with a one-month delay, the feature is used by 10% of the users and the additional revenue generated will cover the development cost within 6 months”.

Not all features translate directly into revenue, but it’s still possible to isolate their costs.
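
To make this concrete, here is a minimal sketch of the payback calculation behind that sentence. All figures are hypothetical and simply mirror the example above; a real analysis would plug in the project’s own numbers.

```python
# Minimal sketch: estimating a feature's payback period.
# All figures are hypothetical, mirroring the example above.

def payback_months(initial_cost: float,
                   monthly_operational_cost: float,
                   monthly_additional_revenue: float) -> float:
    """Months needed for the feature's net revenue to cover its initial cost."""
    net_monthly_gain = monthly_additional_revenue - monthly_operational_cost
    if net_monthly_gain <= 0:
        raise ValueError("The feature never pays for itself at current rates")
    return initial_cost / net_monthly_gain

# A feature delivered at 150% of a planned $20,000 budget (hypothetical):
print(payback_months(initial_cost=30_000,
                     monthly_operational_cost=500,
                     monthly_additional_revenue=5_500))  # -> 6.0 months
```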

Software Development Tool & Continuous Deployment

To evaluate the development cost, the project should be able to accurately track the time spent on each feature. The real time spent can differ from the initial estimate, but most software development tools make it possible to track the time and the cost of each feature accurately and independently.

Understanding the cost of each feature is important to assess its business value. There are two mistakes to avoid. The first one is merging the time spent on all the features during the iteration’s development phase. Indeed, in some cases the overall time spent on a development iteration is really close to the first estimates. In reality, this is often due to a fortuitous equilibrium between the features that were underestimated and the ones that were overestimated. The second issue comes with the testing and deployment phases. These activities often group several features, which makes the end-to-end effort for each feature more difficult to understand.

It’s mandatory to be as granular as possible to get the real development cost. Features must be as independent as possible, from their development to their release. Continuous Delivery/Deployment approaches help maintain this granularity by tracking and treating each feature independently.
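
As an illustration, here is a minimal sketch of per-feature cost tracking, assuming a simple time-log export and a hypothetical blended hourly rate; most software development tools can produce similar data.

```python
# Minimal sketch: computing the real development cost of each feature
# from a time-tracking log. The log format and rate are assumptions.
from collections import defaultdict

HOURLY_RATE = 100  # hypothetical blended team rate, in dollars

work_log = [  # (feature, phase, hours): one entry per tracked task
    ("payment-method", "development", 24),
    ("payment-method", "testing", 8),
    ("payment-method", "deployment", 2),
    ("wishlist", "development", 40),
    ("wishlist", "testing", 12),
]

cost_per_feature = defaultdict(float)
for feature, phase, hours in work_log:
    # Keep testing and deployment attached to their feature instead of
    # merging them into a shared bucket, so the end-to-end cost stays visible.
    cost_per_feature[feature] += hours * HOURLY_RATE

for feature, cost in sorted(cost_per_feature.items()):
    print(f"{feature}: ${cost:,.0f}")  # payment-method: $3,400 / wishlist: $5,200
```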

FinOps

When the initial cost of development is known, it’s important to understand the impact of running this new feature. Some functionality could be inexpensive to develop but require a significant infrastructure or operational cost.

FinOps is a contraction of the words Finance and Operations. The main goal of this approach is to better manage the running cost (generally in the cloud). Legacy software architectures are often monolithic, meaning there is one big system handling all the operations. In that case it’s difficult to evaluate the cost of triggering one particular feature. However, thanks to new architectures based on microservices, the cost of running internal services can be evaluated at a more granular level. Outside of the custom services, an external service can be billed based on the number of API calls. Cloud-based applications using microservices can also take this approach, following two distinct cases:

  • Standard microservices where it’s possible to estimate the cost of each service
  • Serverless microservices where it’s possible to know the cost of each call

An example would be the following: on an e-commerce website, a new payment method is implemented. The external payment provider costs $1 per transaction, and a dedicated serverless microservice function is created to handle it.
At the end of the month, the cost of running this new feature is the external provider cost ($1 times X transactions) plus the cloud cost of executing the microservice X times.
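
Written as code, the formula looks like this; the per-invocation cloud price is a hypothetical serverless figure, and the $1 fee comes from the example above.

```python
# Minimal sketch of the running-cost formula described above.

def monthly_running_cost(transactions: int,
                         provider_fee_per_transaction: float = 1.00,
                         cloud_cost_per_invocation: float = 0.0002) -> float:
    """External provider fees plus the cloud cost of executing the
    serverless microservice once per transaction."""
    return transactions * (provider_fee_per_transaction + cloud_cost_per_invocation)

print(f"${monthly_running_cost(transactions=10_000):,.2f}")  # -> $10,002.00
```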

A/B testing

Once the cost of the initial development and the running cost are known, it’s possible to understand whether this new feature is used, attracts more people or generates more revenue. In order to do that, A/B testing should be run by proposing both versions of the solution to the customers: one version without the new feature and one version with it. After a period of several days to several weeks, the team will have gathered enough information to understand the value of the new feature.

A/B testing is a simple way to gather feedback on each feature. Analytics tools highlight whether the feature is used and what the potential additional value/revenue is. Behavior tracking is an additional way to understand the change brought by a particular feature on the customer journey in detail.
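
As a minimal sketch, comparing the two versions can be as simple as the following; the visitor and conversion counts are hypothetical, and a real analysis would also check statistical significance before drawing conclusions.

```python
# Minimal sketch: comparing conversion rates between the two A/B versions.
# All counts are hypothetical.

def conversion_rate(conversions: int, visitors: int) -> float:
    return conversions / visitors

control = conversion_rate(conversions=480, visitors=10_000)  # without the feature
variant = conversion_rate(conversions=552, visitors=10_000)  # with the feature

uplift = (variant - control) / control
print(f"Control: {control:.2%}, Variant: {variant:.2%}, uplift: {uplift:+.1%}")
# -> Control: 4.80%, Variant: 5.52%, uplift: +15.0%
```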

Websites like Booking.com run 1,000 experiments at the same time (HBR article). This approach makes it possible to test each new idea (not only the HiPPO’s, the highest paid person’s opinion) and to get tangible data indicating how each version performs.

Maximizing the timeline accuracy

5 Whys

At the time of the initial hypothesis, the effort and cost to develop the desired scope were estimated. Based on the available resources it’s possible to set a timeline. This first timeline is generally only valid for a short period of time, because plenty of internal or external factors quickly force the project to adapt, which often translates into delays.

When it comes to explaining delays, the answers are often understandable, but the questioning is usually not detailed enough to understand the root causes and take appropriate actions.
For example:

The release has been delayed due to a critical bug in a feature.
- “Why has the release been delayed by 2 days?”
- “The release has been delayed because unexpected issues were raised during the testing phase”.

Issues can happen once, but if the same issue occurs again in a project it means that no proper actions were taken to handle it. In order to set up these actions, it is mandatory to understand the root cause of the issue in detail.

The 5 Whys method was developed by Sakichi Toyoda, the Japanese inventor and industrialist behind Toyota, whose ideas started the reflection around lean methodology. The 5 Whys approach helps to provide a better understanding of an issue and to get actionable solutions by asking “why?” five times. If we take the previous example, the 5 Whys could have highlighted the following:

The release has been delayed due to a critical bug in a feature.
- “Why has the release been delayed by 2 days?”
- “The release has been delayed because unexpected issues were raised during the testing phase”.
- “Why was this bug raised so late, during the testing phase?”
- “The bug was not caught during the initial development”.
- “Why was this bug allowed to move to the testing phase?”
- “The developer was new on the project and did not set up all the unit tests”.
- “Why was the developer not aware of the policy to cover all new feature code with unit tests?”
- “The developer was introduced to the team without a proper onboarding”.

Thanks to this approach, we know that with a proper onboarding the bug could have been revealed by a unit test and fixed before moving to the testing phase.

Plan A

The mitigation plan is the list of actions that prevents a risk from happening; that’s Plan A. As we’ve seen before, a project has three main components: time, cost and quality/scope. To prevent a component from deviating, it is mandatory to list potential risks and their related actions. The team must then schedule the execution of these actions in advance.
Example:

  • Risk: New developers not properly onboarded could affect the quality/timeline of a project
  • Preventive action: ensure that all new developers are properly onboarded and that they understand the project practices and policies

If an issue is raised and reviewed by the 5 Whys it must be either part of the mitigation plan or part of the contingency plan described in the next section.

Plan B

Not all risks can be handled by a mitigation plan and preventive actions. When that is the case, it is important to ensure that the team knows what to do when the risk materializes. The contingency plan describes actions to cure the risk after it happens; that’s Plan B. The actions must be prepared but not executed.
Example (sketched as a data structure after the list):

  • Risk: A server shutdown causing service unavailability
  • Preparation: Set up and test rolling out a backup server
  • Monitoring: Error codes reported by the solution
  • Curative action: Roll out the backup server and redirect the incoming requests
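
As a minimal sketch, the entry above could be stored as a small data structure so the prepared (but not executed) curative action lives next to its risk and monitoring trigger; the field names are my own assumptions.

```python
# Minimal sketch: a contingency plan entry as a data structure.
from dataclasses import dataclass

@dataclass
class ContingencyEntry:
    risk: str
    preparation: str
    monitoring_trigger: str
    curative_action: str

plan_b = [
    ContingencyEntry(
        risk="A server shutdown causing service unavailability",
        preparation="Set up and test rolling out a backup server",
        monitoring_trigger="Error codes reported by the solution",
        curative_action="Roll out the backup server and redirect incoming requests",
    ),
]

for entry in plan_b:
    print(f"When '{entry.monitoring_trigger}' fires: {entry.curative_action}")
```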

Assuring the technical stability

Monitoring transactions and performances

A project is made of several layers used by a multitude of use cases. Adding new features is an essential part of the business because it provides more value to the customer and potentially generates more revenue. However, adding more complexity can deteriorate the overall performance and user experience in the long run. Walmart found that every 100 ms improvement in transaction speed led to a 1% improvement in revenue (New Relic article). Conversely, increasing the transaction time by 1 second could directly impact the customer journey and the website revenue.

In order to handle the performance impact, each new feature must be considered within the overall customer journey. Imagine a new feature that adds 100 ms of loading time. This feature is only used by 10% of the users but is part of the main customer journey. This means that 90% of customers will experience a degraded journey with no added value.

Thanks to a shorter iteration cycle and continuous delivery/deployment, new features can be delivered with a higher frequency (up to several times a day). Even if each individual feature has an insignificant impact on the performance, the solution can suffer from speed degradation in the long run.

In order to efficiently monitor the performance and understand the impact on the customer journey, it’s important to be able to:

  • Reach an availability close to 100%
    A solution could be fast and accurate, but it is useless if no one can reach it
  • Ensure the correctness of transactions
    Be sure that calls to all services are working, especially when there are dependencies with external providers
  • Evaluate the perceived load time on main customer journeys
    Be sure that customers can navigate as soon as possible on the main parts of the solution

When a significant decrease in performance is noticed, an action must be initiated to handle the situation. It could be moving the feature to an alternative user journey or re-engineering the solution to make it more efficient.
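
As an illustration, here is a minimal availability and load-time probe for a main customer journey; the URL and the 1-second alert threshold are hypothetical, and a production setup would rely on a dedicated monitoring tool.

```python
# Minimal sketch: probing availability and perceived load time.
import time
import urllib.request

JOURNEY_URL = "https://example.com/checkout"  # hypothetical main journey page
ALERT_THRESHOLD_SECONDS = 1.0                 # hypothetical alert threshold

def probe(url: str) -> None:
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            status = response.status
    except OSError:
        print(f"UNAVAILABLE: {url}")  # availability issue: no one can reach it
        return
    elapsed = time.monotonic() - start
    if status != 200 or elapsed > ALERT_THRESHOLD_SECONDS:
        print(f"ALERT: {url} answered in {elapsed:.2f}s (status {status})")
    else:
        print(f"OK: {url} answered in {elapsed:.2f}s")

probe(JOURNEY_URL)
```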

Measuring and maintaining quality

There are several ways of evaluating the quality of a project, e.g. ensuring proper training, documentation, inspections, etc. Here, let’s put the emphasis on testing, and particularly on the following steps:

  • Unit testing
    Testing the specific code related to the feature
  • Integration testing
    Testing the dependency (component, integration, API) linked to the specific feature code
  • Acceptance testing
    Testing the feature against the requirements with no prior knowledge of its inner logic (black box testing)

Automated unit tests should be part of each feature’s development because they ensure that the specific code is working as required. All the tests can be executed on demand and before moving to the next phase, ensuring that the new piece of code does not create any regression on previous functionalities. It’s also possible to analyse the unit test code coverage to make sure that all the logic of the solution is properly tested.
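
For example, a unit test for the hypothetical payment-fee logic used earlier could look like the following, written with pytest; the function under test is illustrative, not real project code.

```python
# Minimal sketch: an automated unit test for hypothetical feature code,
# runnable on demand (e.g. in a CI pipeline) with pytest.
import pytest

def transaction_fee(amount: float, fee_per_transaction: float = 1.00) -> float:
    """Hypothetical feature code: total charged for one transaction."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    return amount + fee_per_transaction

def test_fee_is_added_to_the_amount():
    assert transaction_fee(100.00) == 101.00

def test_non_positive_amounts_are_rejected():
    with pytest.raises(ValueError):
        transaction_fee(0)
```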

Once the unit tests are covered, automated component, integration and API testing can be performed. These tests make sure that everything is working well together. Integration testing can be performed in different environments, e.g. sandbox or pre-production.

Finally, the acceptance testing can be performed via automated GUI (Graphical User Interface) testing. These tests replicate user behavior when browsing the solution. It’s even possible to set up a robot to execute GUI tests on production in order to add a new set of monitoring metrics and alerts.
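
As a minimal sketch, such a GUI test could be written with a browser-automation library like Selenium; the URL, the selector and the confirmation text are all hypothetical.

```python
# Minimal sketch of an automated GUI acceptance test, assuming Selenium.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
try:
    driver.get("https://example.com/checkout")        # hypothetical page
    driver.find_element(By.ID, "pay-button").click()  # hypothetical selector
    # Black-box check: the confirmation appears, regardless of inner logic.
    assert "Order confirmed" in driver.page_source
finally:
    driver.quit()
```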

Thanks to the automation of unit, integration and acceptance testing, it is possible to assess the quality of the solution over time and to prevent regressions from appearing late in the manual testing process.

Being ready to scale

Monitoring the solution and its quality is mandatory to ensure the best service to the customers at all times. However, it’s also important to be prepared for a time when the solution might attract more customers. In order to do that, the solution must be ready to scale.

Horizontal scaling increases the capacity of services by adding new workers. This approach is available on most non-monolithic architectures such as microservices. The analogy would be the following: to absorb the additional customers attracted by a Black Friday sale in a brick-and-mortar shop, additional cashiers are deployed. This approach is the preferred one because it dynamically adapts the number of workers needed and it can deal with possible worker unavailability. Horizontal scaling works well with managed container cluster solutions such as Kubernetes. It’s also included under the hood of serverless solutions.

If an application is based on a monolithic architecture, there is a high chance that horizontal scaling cannot be enabled. If this is the case, it is possible to adopt a more restrictive approach called vertical scaling. If we go back to our brick-and-mortar Black Friday event, in order to deal with more customers the store needs to boost one single cashier so that they can handle more clients per minute, e.g. recruit the fastest cashier in the industry. This approach cannot be applied dynamically and it creates a single point of failure.

Many architectures are now hybrid: the front-end is able to scale horizontally while the back-end must stay on the same instance and can only scale vertically. Caching technologies can also help absorb most of the browsing traffic.

Once the scaling strategy is defined, it’s possible to load-test at several times the usual traffic and, if needed, adapt the solution to ensure that everything works under these conditions.
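
As a minimal sketch, a load test at several times the usual traffic could look like this; the endpoint and volumes are hypothetical, and dedicated load-testing tools go much further.

```python
# Minimal sketch: firing 5x the usual traffic at the solution to verify
# the scaling strategy. Endpoint and volumes are hypothetical.
import concurrent.futures
import urllib.request

TARGET = "https://example.com/"  # hypothetical endpoint
USUAL_REQUESTS, MULTIPLIER = 10, 5

def hit(url: str) -> int:
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except OSError:
        return 0  # count network errors as failures

total = USUAL_REQUESTS * MULTIPLIER
with concurrent.futures.ThreadPoolExecutor(max_workers=total) as pool:
    statuses = list(pool.map(hit, [TARGET] * total))

success = statuses.count(200) / len(statuses)
print(f"{success:.0%} of {len(statuses)} requests succeeded under {MULTIPLIER}x load")
```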

Final thoughts

Time, cost and scope are essential to start a project, but these indicators do not help drive added value. Focusing on measuring the business impact, maximizing the timeline accuracy using proactive plans, and assuring the technical stability of the solution will keep the whole team aligned to make the project work in the long run.
