Failure is closely aligned with Risk. Failure is what happens when we aren’t able to prevent a risk from manifesting. As we develop better ways to manage risks, we should also consider the means for dealing with and learning from failures.
Resilience is about preventing failures.
- analysis and prediction,
- contingency planning, and
- how does Resilience embrace uncertainty and accept that not everything can be predicted or monitored?
- how can failure be accepted as necessary for learning and, at the same time, be something we plan to prevent?
Response is about responding to a failure and continuing to operate whilst it happens.
- broad knowledge of situation,
- deep knowledge of the impact over time,
- careful consideration so as to not make the failure worse,
- prioritisation, and
- fast detection and early steps towards recovery.
- should we protect other work rather than prioritise dealing with the failure?
- should an assessment of impact be the deciding factor over the likelihood of the failure, and
- should dealing with failure be a whole team responsibility?
Recovery is about fixing the failure after it has happened.
- a plan,
- resources to put the plan into action, and
- testing to be embedded in process.
- how do we maintain a close feedback loop with the Response phase to get the solution right, and
- who should fix the failure, those involved in creating it or someone else?
Review is about learning from the failure.
- documentation of actions, and
- commitment and processes to make improvements to Resilience, Response and Recovery.
- how does it spread the learning,
- and is its aim to prevent future failures or encourage them?
How we approach Risk and Failure is essential for fostering psychological safety, enabling rapid experimentation, and making people awesome. We tend to assume that failure is the result of fault: a fault occurs which leads to a failure, and if there is a fault then there must be someone at fault, someone to blame, someone who did something wrong or different. If dealing with failures is underpinned by this assumption, then failure can never be looked upon as a positive. The best we can say is that at least the failure has been uncovered and fixed. And perhaps that is enough to drive some kind of iterative process of continuous improvement.