You’ve probably heard of Operational Resilience. No? Well, it’s your organisation’s ability to withstand incidents and issues and still operate. Those are my words – in the words of NIST, it’s:
“The ability of systems to resist, absorb, and recover from or adapt to an adverse occurrence during operation that may cause harm, destruction, or loss of ability to perform mission-related functions.”
If you have heard of it, it’s possible you have come across one of a number of rules and regulations around it. For example, here are two I come across:
- DORA (Digital Operational Resilience Act) – adopted by the European Parliament in November 2022, and due to apply from 17 January 2025.
- PS21/3 – Building operational resilience – the policy paper was published by the FCA in 2021 and came into force on 31 March 2022.
So, what is Operational Resilience, and is it really different from traditional business continuity planning? We'll use these rules and regulations to try to explain.
What is operational resilience?
Well, DORA says it comprises five ‘pillars’. These are as follows:
- ICT risk management - Financial entities shall have in place internal governance and control frameworks that ensure an effective and prudent management of all ICT risks.
- ICT-related incidents - Financial entities shall establish and implement an ICT-related incident management process to detect, manage and notify ICT-related incidents and shall put in place early warning indicators as alerts.
- Testing - Financial entities shall establish, maintain and review, with due consideration to their size, business and risk profiles, a sound and comprehensive digital operational resilience testing programme as an integral part of the ICT risk management framework.
- Managing ICT third-party risk - Financial entities shall manage ICT third-party risk as an integral component of ICT risk within their ICT risk management framework.
- Information sharing - Financial entities may exchange amongst themselves cyber threat information and intelligence.
As you might expect, there’s lots to do underneath each ‘pillar’, but you get the idea: the focus is on risk management and incident response, with information shared between entities to minimise the risk that a particular issue will spread.
The FCA takes a slightly different approach. It drops the ‘Digital’ to begin with. The focus of the FCA approach is:
- Understand your important business services. This is what you do for customers and clients.
- Set tolerance thresholds for your important business services. This is how bad your service would need to get before it caused ‘intolerable harm’ to consumers or posed a risk to financial stability.
- Of note, ‘Intolerable Harm’ would be one of the top 5 likely names of my death metal band, should I choose to start one.
There’s lots more in the FCA paper, but that is the heart of it. Understand what you do, understand how bad it would need to get before it caused 'intolerable harm' and manage to that.
While the approaches are different, the reality of Operational Resilience covers both areas, in my view. You have to understand the business and what you do for customers and clients. Setting tolerances for customers and clients really sharpens the mind when planning resilience. Most processes are digitally heavy, so managing the resilience risk inevitably means risk management and a strong focus on third parties. And things go wrong, so effective planning and testing will always be important.
So, what’s different from BCP?
There are probably a lot of words that are different, and a lot that will be familiar if you know business continuity planning. However, two things stand out for me as offering a different perspective.
- Supply chain focus. There’s a recognition that this is bigger than a single organisation, and that the supply chain, particularly the technology supply chain, is important. This will likely draw more non-financial services firms into regulation or oversight. It definitely puts pressure on outsourcers to have proper measures in place. We all kind of knew this, but both DORA and the FCA make this explicit.
- It’s not just about you. By this I mean that traditional business continuity planning focused, more often than not, on the organisation and its processes. Operational Resilience moves the focus from internal operations to what an organisation is doing to protect its customers and clients from intolerable harm. This is part of a growing trend to focus risk management on an entity’s obligations to someone else (not just its internal operation or regulators). GDPR was the same to me. Not many have really understood this yet (but financial services firms, particularly those in the UK, understand better than most).
Where do you start?
Essentially, both these requirements are about maintaining the continuity of your organisation. However, I have a growing sense that you’re maintaining for the market / your customers, not for yourselves. That’s new, and that’s different.
So, here are seven things you can do to start your operational resilience process:
- Understand your important business services. Essentially, list what you do for your customers or clients.
- List your important service metrics. How does your customer know you're offering the service? Is it a timeliness metric like a transaction time or response rate?
- Understand how bad these service metrics would need to be to cause harm to customers. How bad would your performance need to be?
- List the technology and services that support your important business services. You can add people, facilities, data and other gubbins to this too. If you've covered BCP, this will already be in your business impact assessment.
- Manage IT and third-party risk. The DORA text gives you some guidance here. There are many specialists who can also point you in the right direction. A tip: concentrate on the risk that the systems and suppliers supporting your important business services cannot deliver to the service level you have defined. I'll write more on risk assessments in the future.
- Define an incident response plan. Make sure you know how you'd detect a problem (i.e. your important business services are not meeting the standard you need) and how you'd mobilise and deliver a response to reduce the severity, impact and duration of the issue.
- Test, test, test. Practise your response through incident exercises. Challenge yourselves in different situations. Test to train. Test to fail, and improve each time. It doesn't need to take long every time, but still test.
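If it helps to see the first few steps written down, here is a minimal sketch in Python of what a register of important business services might look like. It is an illustration only, not something taken from DORA or the FCA paper: the `BusinessService` class, the two helper functions, and every service name, metric and tolerance value are made up for the example. It also assumes a "higher is worse" metric, such as hours of delay or downtime.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessService:
    """One important business service and its tolerance (hypothetical model)."""
    name: str
    metric: str        # what the customer experiences, e.g. hours to process a payment
    tolerance: float   # worst acceptable value before customers are harmed
    dependencies: list[str] = field(default_factory=list)  # systems, suppliers, people

    def breaches_tolerance(self, observed: float) -> bool:
        # Assumes a "higher is worse" metric such as a delay or outage duration.
        return observed > self.tolerance


def services_in_breach(services, observations):
    """Return the names of services whose observed metric exceeds tolerance."""
    return [
        s.name
        for s in services
        if s.name in observations and s.breaches_tolerance(observations[s.name])
    ]


def services_affected_by(services, dependency):
    """Return the names of services that rely on a given system or supplier."""
    return [s.name for s in services if dependency in s.dependencies]


# Illustrative register; every name and number here is invented.
register = [
    BusinessService(
        "Customer payments", "hours to process a payment", 4.0,
        ["core banking platform", "payments gateway supplier"],
    ),
    BusinessService(
        "Online account access", "hours of portal downtime", 2.0,
        ["customer portal", "cloud hosting supplier", "core banking platform"],
    ),
]

# Latest observed value for each service's metric (also invented).
observed = {"Customer payments": 6.5, "Online account access": 0.5}

print(services_in_breach(register, observed))
print(services_affected_by(register, "core banking platform"))
```

The first helper is the detection question from the incident response step: which services are currently outside tolerance? The second is the dependency question from the risk step: if one system or supplier fails, which customer-facing services are exposed? A spreadsheet can do the same job; the point is that services, metrics, tolerances and dependencies sit together in one place.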
If you want to know more, or want to discuss further, please get in touch or sign up to our updates below.