How to assess, communicate and manage uncertainty and risk with Agility?

<< Overcoming the one weakness of OOP, comparing solutions | Home | The 3 inspiring principles of microservices >>

How to assess, communicate and manage uncertainty and risk with Agility?

The thought that disaster is impossible often leads to an unthinkable disaster.
Gerald M. Weinberg

Doubt is not a pleasant condition, but certainty is absurd.
Voltaire

Every time there’s a good business opportunity to develop a new product or evolve an existing one, executives want to know required investment amount, expected ROI and probability, and risk involved.

Therefore product and development teams are asked to figure out if the new product or product evolution is technically feasible, how long it will take to implement, how much it will cost, when will be finished, and how much risk and uncertainty is involved.
Then Project/Release/Iteration manager will present a release plan where time, effort and costs estimation are expressed with ranges. The degree of uncertainty and the level of risk determines amplitude of the estimations range.

As result, the degree of uncertainty and the level of risk will impact the investment decision, budgeting, governance of products portfolio, release plan, and external customer where one is involved.

How to assess and communicate risk and uncertainty

Visualising and communicating risk and uncertainty with agility it means to do that in a lightweight form that’s quick and simple to understand, easy to constantly and frequently update, and straightforward to iteratively enact.
This is very different from heavyweight approaches for documenting risks with long documents that quickly become outdated, often remain unread and very seldom if ever lead to timely action.

The lightweight assessment is
constantly and frequently updated
as new learnings and new information emerge

The assessment of risk and uncertainty is typically done after initial high-level identification of features and user stories and after initial high-level effort estimation. The assessment is then updated periodically as new learnings and new information emerge.
A high-level way to assess and visualise the overall degree of uncertainty and the level of risk consists in categorising the implementation of the new product or the evolution of the product as an Exploration type of work or as a Production type of work.

Investing more time detailing upfront requirements,
specifications and design wont reduce risk or uncertainty

- Production type of work is characterised by known problems and known solutions. It occurs when implementation work is similar to previous implementations. As a result it has very low degree of uncertainty and very low level of risk. Estimation ranges can be narrow because of low estimation errors expected.

- Exploration type of work is characterised by high level of uncertainties and unknowns in the problem and/or in the solution, and in the product/market fit. In addition to this, there could be plenty of things outside the area of control or influence that can change anytime impacting the success of the implementation and of the product. For this type of work, investing more time detailing upfront requirements, specifications and design wont reduce risk or uncertainty.

There’s a continuum between the very low risk and uncertainty production type of work and the high risk and uncertainty exploration type of work.

Different opinions and different points of view
are invaluable sources of information,
therefore are encouraged and explored

Product and development teams’ members that will do the actual work, should collectively vote where they think the current implementation work stands, placing each one a dot between the two extremes, discussing whenever they see differences among their votes and revoting after the discussion. In case of persistent differences in opinion among team members, it’s advisable to raise the level of risk & uncertainty. This evaluation can alternatively be done by team's Product Owner together with the Tech-Lead and the Project/Release/Iteration Manager.

Actual effort or scope can be up to 4 times the initial estimate

For an implementation work estimated to be on the extreme left-end, a pure exploration type of work, the estimation error can be up to 300% of the initial estimate. In other words the actual effort or scope can be 4 times the initial estimate. For a pure production type of work the estimation error can be 5-10%. While for an implementation work that stands in the middle, the estimation error can be up to the 50%. For examples see the Cone of uncertainty, based on research in the software industry and validated by NASA's Software Engineering Lab.

How to detail and explain risk and uncertainty

When executives and managers want to know reasons behind estimated overall risk and uncertainty level, when product and development teams want to improve the assessment accuracy, when they want to verify their assessment or present it in more detail, it’s useful to drill down into these three main risk and uncertainty components:

1) people and market
2) domain and requirements
3) technology and architecture

For each component the level of risk and uncertainty can be rated Low (green), Medium (amber) or High (red) and presented, for example, like this:

The following lists break down each of the three main risk and uncertainty components into multiple elements. While discussing the elements, new elements specific to teams’ and product context and circumstances could emerge and should be added to the list.

The elements in the lists below are placeholders
to provoke discussions and provide basic guidance

Note that the elements in the lists are placeholders to provoke discussions and provide basic guidance. A team experienced in dealing with risk, uncertainty and complexity would probably start with empty lists and populate them throughout discussions.

1) People and market

2) Domain and requirements

3) Technology and Architecture

Each element of the lists should be discussed one at time by the product and development teams to rate the level of risk and uncertainty.

The rating of each element should be given for example considering historical data, or absence of historical data, listening to professional judgment of team members, and considering the weight of each element in the current context and circumstances. Whenever there are differences among team members' estimate, there should be a discussion to learn from each other point of view, and after that a re-estimation. In case of persistent differences in opinion among team members, it’s advisable to raise the level of risk & uncertainty.

This approach should help to rate the overall level of risk and uncertainty for each of the three components and explain reasons behind each rating.

Every new information, findings and learnings
are used to verify, validate, update and enhance
results and decisions from previous stages

From those estimations and from the conversations that lead to the estimations, product and development teams should be able to re-vote where the current implementation work stands in the continuum between the production type of work and the exploration type of work, considering possible interactions among the components, and producing a more accurate assessment.

Tip: Keep it simple! Update the assessment frequently using new info and new learnings available over time.

How to manage high risk and high uncertainty

When work required to deliver a new product, or a product evolution, is expected to be mostly of Exploration type, estimation of costs, effort and time are expressed with extremely wide ranges and with estimation errors up to 300% of the initial estimate (for examples, see the Cone of uncertainty mentioned before). Therefore a linear upfront investment based on initial estimate for Exploration type of work is hazardous.

Small experiments and prototypes
are designed, built, deployed and tried out
with real users, early adopters and customers

in hours, days, or few weeks

In these cases it’s more convenient an initial exploration phase with the goal to identify and reduce main risks and uncertainties by exploring the problem space, testing assumptions, validating the solution, verifying product/market fit, clarifying the scope, and learning new info.

During the exploration phase an experimental approach is adopted: shortest possible experiments are designed and executed to gather data with minimum effort, and smallest possible prototypes are built, deployed and tried out involving as much as possible real users, early adopters and customers. Each is done in the space of hours, days, or few weeks. One of these experiments is the minimum viable product or MVP.

Investment decision and estimates
are finalized only after the end of the
exploration phase

An exploration phase ends only when risk and uncertainty are reduced enough so that a good, informed investment decision become possible. For an overall effort of one year, the exploration phase could last, for example, 2 months.

After exploration phase, sometimes estimates still have a wide range and estimation errors are up to 30-50%. I.e. the development time could be estimated in the range of 6 to 12 months.

Low priority requirements can be used as a safety net
to deal with residual risk and uncertainty

When this happens it’s particularly convenient to prioritise the requirements in the backlog using the MoSCoW prioritisation method in order to use the requirements classified Should as a safety net.

Here the best case scenario forecast, that assumes the highest velocity of the team, will include in the scope requirements classified as Must together with requirements classified as Should.
The worst case scenario forecast, that assumes the lowest velocity of the team, will include in the scope only requirements classified as Must.
As a result, in this example, the initial estimation range of 6 to 12 months, that is 6 months wide, is turned into a range of 6 to 8 months, only 2 months wide.

After a short period of time
from the beginning of the implementation work,
the investment decision and estimates
are verified against real progress

After about two months of work implementing the solution, it’s worth to observe the trend of team’s velocity:

When the velocity is stable or converging and is inside the ballpark, a new more accurate estimation can be done and plans can be updated accordingly.
When the velocity is stable outside the ballpark, this is a sign that investment decision should be re-evaluated.
When the velocity is diverging outside the ballpark, this is a sign that there could be still risks and uncertainties that need to be explored extending the exploration phase.

Conclusions

This lightweight, gradual, iterative approach to assess, communicate and manage uncertainty and risk is a simple and effective way to monitor and react timely to a variety of circumstances that impact investment decisions, budgeting, governance of products portfolio, and release planing. It also encourages and supports conversations on risk and uncertainty between executives, managers, product and development teams, and customers.

It is based on current lean/agile literature and on personal experience.

The mechanic of this approach is simple. In addition to it, few organisation cultural traits and leadership mindset characteristics are useful to make it work: transparency, trust, teamwork, and tolerance for experimentation.

If you are interested into the topics discussed in this post, those four suggested readings are for you:

Book: Lean Enterprise: How High Performance Organizations Innovate at Scale. Jez Humble, Joanne Molesky, and Barry O’Reilly. 2015
Article: 4 Significant Ways to Improve Your Ability to Innovate. Joanne Molesky. 2015
Book: Agile Project Management: Creating Innovative Products (2nd Edition). Jim Highsmith. 2009
Article: A Leader’s Framework for Decision Making. David J. Snowden, Mary E. Boone. 2007
Article: How to prioritize risks on your business model, Ash Maurya

Thanks to Maurizio Pedriale and Carlo Bottiglieri for their help in the review of the draft of this post.