Lesson 1 — Risk Assessment for AI Initiatives

Module 3, Unit 3 | Lesson 1 of 3

By the end of this lesson, you will be able to:

Identify and articulate AI-specific risks in testable, evidence-led terms (K13, K24, S3)

Build a risk register that scores likelihood and impact on consistent scales and prioritises by exposure (K7, S3, S4)

Choose between Avoid, Mitigate, Accept, and Transfer responses, and defend the choice (K24, S5, B1)

Recognise interdependencies and recalibrate residual risk as the system moves into operation (K13, B3)

Why risk work is different in AI

A familiar pattern in AI projects: the system performs exactly as designed from a technical standpoint, and still creates unacceptable exposure. The model is accurate, the integration works, the dashboards turn green — and yet the deployment ends up withdrawn because the data was used without a clear legal basis, or the outputs disadvantaged a group nobody had thought to test for, or the decision pathway shifted accountability somewhere it was never meant to sit.

That is what makes AI risk work different from traditional project risk. The list of things that can go wrong extends past delivery delays and cost overruns into operational, legal, ethical, governance, and reputational territory — frequently all at once. Treating risk as a separate workstream that runs alongside delivery is no longer adequate. It has to be embedded in the design.

This lesson covers the discipline that makes risk visible, prioritisable, and governable: how to identify risks honestly, how to score them consistently, how to choose a response strategy you can defend, and how to keep the register alive once the system is in operation.

Identifying risks in testable terms

Risk identification is the structured process of working out what could prevent the initiative from succeeding — and what harm could arise if it does. The objective is to surface risks that are genuinely material, not hypothetical anxieties.

The most common mistake at this stage is vagueness. "Data quality issues" is not a risk; it is a topic. "Inconsistent upstream data capture in the customer onboarding system could reduce model accuracy below the operational threshold of 92%, leading to misclassification of high-risk applications" is a risk. The difference matters because vague risks cannot be scored, owned, or controlled — they only get mentioned in steering meetings.

A useful structure is a complete causal chain: if [trigger], then [risk event], resulting in [impact]. Three sentences in that shape are worth more than a page of general concerns.
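The causal-chain shape can be enforced mechanically. The sketch below is a minimal illustration in Python, assuming a hypothetical helper named risk_statement (it is not part of the project workbook or any tool referenced in this lesson). It refuses to build a statement when any of the three parts is missing.

```python
def risk_statement(trigger: str, event: str, impact: str) -> str:
    """Build a risk statement in the if / then / resulting-in form.

    Hypothetical helper for illustration only; the name and wording
    are assumptions, not a prescribed format.
    """
    parts = {"trigger": trigger, "event": event, "impact": impact}
    # A statement with a blank part is a topic, not a risk.
    missing = [name for name, text in parts.items() if not text.strip()]
    if missing:
        raise ValueError(f"Incomplete risk statement, missing: {', '.join(missing)}")
    return (
        f"If {trigger.strip()}, then {event.strip()}, "
        f"resulting in {impact.strip()}."
    )


statement = risk_statement(
    trigger="upstream data capture in the customer onboarding system is inconsistent",
    event="model accuracy falls below the operational threshold of 92%",
    impact="misclassification of high-risk applications",
)
print(statement)
```

A label such as "data quality issues" cannot pass this check, because it supplies no trigger, event, or impact to fill the three slots.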

Good identification draws on three sources of evidence: internal experience (incident reports, audit findings, retrospectives from previous deployments); structured analysis (the SWOT analysis covered in Unit 2 Lesson 2 on scope, which surfaces internal weaknesses and external threats); and external learning (regulator reports, industry case studies, and technical publications on common AI failure modes such as bias from unrepresentative training data, model drift, automation bias, unclear accountability, and insufficient documentation).

The risk register

A risk register is a structured record of exposure designed so that decisions can be traced, defended, and updated as the initiative evolves. It does three things at once: makes risk visible, makes responsibility explicit, and makes trade-offs governable.

A credible register holds qualitative description and quantitative scoring side by side. Qualitative entries capture the nature, mechanism, and context of each risk. Quantitative entries allow prioritisation through likelihood and impact scoring on consistent scales — plus a confidence rating, because some likelihood estimates rest on solid evidence and others on guesswork that should be flagged as such.

At minimum, each register entry should capture: the risk statement (in the if/then/resulting form above), the type, the trigger, the consequence, current controls already in place, the proposed response, the assigned owner, the monitoring signal, the monitoring cadence, and the residual risk after controls have been applied. The point is not to maximise columns; it is to make sure every risk on the register is actionable rather than decorative.
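One way to keep entries actionable is to treat the minimum fields as a fixed record and check for blanks. The sketch below assumes a hypothetical RiskEntry class whose field names simply mirror the list above. The example values marked as illustrative are invented for the demonstration.

```python
from dataclasses import dataclass, fields


@dataclass
class RiskEntry:
    """One row of a risk register; field names are illustrative assumptions."""
    statement: str
    risk_type: str
    trigger: str
    consequence: str
    current_controls: str
    proposed_response: str
    owner: str
    monitoring_signal: str
    monitoring_cadence: str
    residual_risk: str

    def missing_fields(self) -> list[str]:
        """Return the names of any fields left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


entry = RiskEntry(
    statement=(
        "If upstream data capture in the customer onboarding system is "
        "inconsistent, then model accuracy falls below 92%, resulting in "
        "misclassification of high-risk applications."
    ),
    risk_type="data",
    trigger="inconsistent upstream data capture",
    consequence="misclassification of high-risk applications",
    current_controls="pre-deployment validation",  # illustrative value
    proposed_response="mitigate",
    owner="",  # deliberately left blank to show the check
    monitoring_signal="model accuracy against the 92% threshold",
    monitoring_cadence="weekly",  # illustrative value
    residual_risk="medium",  # illustrative value
)
print(entry.missing_fields())  # ['owner']
```

An entry that returns a non-empty list here is the decorative kind: it names a risk without giving anyone the means to act on it.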

Doing this with AI
An LLM is a quick way to generate a first-pass list of AI-specific risks for your initiative — feed it your problem statement and scope and ask for risks across operational, technical, ethical, regulatory, and reputational categories. The honest follow-up matters more than the first draft: "Now identify three risks I might be motivated not to see — risks where surfacing them would slow down or weaken the case for this project." Models are usefully indifferent to your delivery pressure.

Scoring likelihood and impact

Once risks are identified, prioritisation requires consistent scoring. Two scales are used. Likelihood answers the question "How probable is this event during the lifecycle of the initiative?" Impact answers the question "If it happens, how damaging would it be?" Multiplied together, they produce an exposure score that lets you rank the register and decide where to spend mitigation effort first.

The five-point scales below are typical of those used in many UK organisations. The exact wording matters less than the discipline of using one definition consistently across every risk on the register; otherwise the exposure scores stop being comparable.

Likelihood scale

Score	Level	Description
1	Very Low	Unlikely to occur; would require unusual circumstances.
2	Low	Possible but unlikely; may occur only in exceptional cases.
3	Medium	Reasonable possibility; could occur during the lifecycle.
4	High	Probable; likely to occur if not actively addressed.
5	Very High	Almost certain; expected to occur, possibly more than once.

Impact scale

Score	Level	Operational and governance impact
1	Negligible	Minor variation absorbed within normal operations; no regulatory or reputational consequence.
2	Minor	Limited disruption affecting a small group; isolated fairness or compliance concern; minor rework.
3	Moderate	Noticeable operational disruption; fairness or compliance exposure affecting a defined segment; some reputational impact.
4	Major	Significant disruption; material fairness or bias issue; regulatory breach; loss of auditability; substantial reputational damage.
5	Catastrophic	Widespread disruption; severe fairness or safety harm; major regulatory action; loss of stakeholder trust; organisational liability.

Exposure = Likelihood × Impact. A risk scored 4 (high likelihood) and 3 (moderate impact) has an exposure of 12. The number is not a prediction. It is a comparative tool that lets the team and the governance owners see, on a single page, which risks are most pressing and where the tightest controls need to sit.
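The calculation is simple enough to sketch in a few lines of Python. The function name exposure and the sample risks are assumptions made for illustration. Only the first pair of scores (4 and 3, giving 12) comes from the example above.

```python
def exposure(likelihood: int, impact: int) -> int:
    """Return likelihood x impact, with both scores on the 1-5 scales."""
    for name, score in (("likelihood", likelihood), ("impact", impact)):
        if score not in range(1, 6):
            raise ValueError(f"{name} must be between 1 and 5, got {score}")
    return likelihood * impact


# (risk, likelihood, impact); the second and third rows are invented examples
risks = [
    ("Inconsistent upstream data capture", 4, 3),
    ("Unclear legal basis for a dataset", 2, 5),
    ("Staff override model outputs", 3, 2),
]

# Rank the register from highest to lowest exposure
ranked = sorted(risks, key=lambda r: exposure(r[1], r[2]), reverse=True)
for name, likelihood, impact in ranked:
    print(f"{exposure(likelihood, impact):>2}  {name}")
```

The ranking is only as comparable as the scales behind it, which is why the validation step rejects any score outside the agreed 1 to 5 range.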

💬 Reflection

If you scored every risk on your current register honestly, would the highest-exposure ones be the ones that are actually getting most of the team's attention? In most projects the answer is no — attention follows whoever is loudest, not whichever risk has the highest exposure score. The register is the corrective.

Choosing a response: Avoid, Mitigate, Accept, Transfer

Scoring tells you which risks matter most. The response strategy is what you actually do about them. There are four options, and the choice is the moment risk work stops being descriptive and becomes governance.

Avoid. Change the design or scope so the risk condition cannot arise. In AI work, avoidance often appears as a design decision: not automating high-impact decisions where errors would have severe consequences, excluding a dataset whose legal basis is unclear, restricting the system to decision-support rather than full automation, or delaying deployment until governance infrastructure matures. Avoidance is the right choice when the potential harm is severe and you cannot confidently reduce exposure through controls.

Mitigate. Introduce safeguards that reduce either the likelihood or the impact. In AI deployments this typically means a layered combination: rigorous validation before deployment, continuous monitoring of model performance in operation, drift detection, human-in-the-loop oversight, audit logging, and a fallback procedure (revert to manual handling if model confidence drops below an agreed threshold). Mitigation only works if controls are layered — a single safeguard rarely holds.

Accept. Tolerate the risk after evaluating likelihood, impact, and the cost of further controls. Acceptance is not the absence of action. It requires a documented rationale (why the residual exposure is acceptable), a named owner, and a monitoring cadence so the decision can be revisited if circumstances change. The register entry stands as evidence that the choice was made deliberately rather than by drift.

Transfer. Shift part of the financial or operational consequence to another party — typically through insurance, supplier agreements, service-level commitments, or contractual liability clauses. Transfer can reduce financial exposure, but it does not remove governance responsibility. The organisation deploying the AI system remains accountable for how it behaves, even when a vendor built it. Treating transfer as a way to make accountability someone else's problem is one of the most common errors in AI risk work.

The honest version of this choice always asks two questions. Is this strategy proportionate to the harm and to our ability to control or monitor it? And can I defend why I rejected the next-best alternative? If both answers hold up, the choice will hold up.
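The fallback procedure described under Mitigate (revert to manual handling if model confidence drops below an agreed threshold) can be sketched as a simple routing rule. Everything named here is an assumption for illustration: the function route_decision, the record layout, and especially the threshold of 0.80, which in practice would be agreed with the governance owners.

```python
CONFIDENCE_THRESHOLD = 0.80  # assumed value; the real figure is a governance decision


def route_decision(prediction: str, confidence: float,
                   threshold: float = CONFIDENCE_THRESHOLD) -> dict:
    """Route a model output to automation or to manual handling.

    Returns a record suitable for appending to an audit log.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be between 0 and 1, got {confidence}")
    if confidence < threshold:
        # Fallback: the model's output is withheld and a person decides.
        return {"route": "manual_review", "prediction": None, "confidence": confidence}
    return {"route": "automated", "prediction": prediction, "confidence": confidence}


print(route_decision("approve", 0.95))
print(route_decision("approve", 0.55))
```

This is one safeguard, not a mitigation strategy on its own. It would sit alongside validation, drift detection, and audit logging, in line with the layering point above.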

Residual risk and recalibration

Risk management does not stop when a response is selected. Once controls are operational, the register has to be reassessed against evidence: have the controls actually changed the conditions that created the risk, or are they ceremonial? The remaining exposure after controls is the residual risk, and it is unavoidable in almost every initiative. The goal is not to eliminate every exposure but to ensure remaining risk is visible, proportionate, and actively monitored.
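Reassessment can be made concrete by scoring each risk twice, once before controls and once after, and comparing the two. The sketch below assumes a hypothetical ScoredRisk class. The risks and scores are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScoredRisk:
    name: str
    inherent: tuple[int, int]  # (likelihood, impact) before controls
    residual: tuple[int, int]  # (likelihood, impact) re-scored after controls

    @property
    def inherent_exposure(self) -> int:
        return self.inherent[0] * self.inherent[1]

    @property
    def residual_exposure(self) -> int:
        return self.residual[0] * self.residual[1]

    @property
    def controls_changed_exposure(self) -> bool:
        return self.residual_exposure < self.inherent_exposure


# Illustrative register; all scores are assumptions
register = [
    ScoredRisk("Model drift after deployment", inherent=(4, 4), residual=(2, 4)),
    ScoredRisk("Unclear accountability for overrides", inherent=(3, 4), residual=(3, 4)),
    ScoredRisk("Bias from unrepresentative training data", inherent=(3, 5), residual=(2, 5)),
    ScoredRisk("Insufficient documentation", inherent=(3, 2), residual=(1, 2)),
]

# The three risks a sponsor should still be watching
top_residual = sorted(register, key=lambda r: r.residual_exposure, reverse=True)[:3]
# Risks whose controls have not moved the score and so deserve a closer look
unchanged = [r.name for r in register if not r.controls_changed_exposure]

print([r.name for r in top_residual])
print(unchanged)
```

An unchanged score does not prove a control is ceremonial, but it is a prompt to ask for evidence that the control is doing anything.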

Recalibration matters more in AI than in most other contexts because system behaviour can change after deployment. Models drift. Data distributions shift. Users adapt their behaviour around the system in ways nobody designed for. Controls that look robust in testing can prove weaker once the system is operating at scale. Treating the register as a living governance instrument — revisited on a defined cadence — is what turns it from a planning artefact into something that protects the organisation in operation.

Interdependencies

Risks rarely occur in isolation. An interdependency exists when one risk increases the likelihood, impact, or visibility of another. A single underlying weakness can generate several downstream risks that look unrelated when assessed individually.

A common example: poor upstream data quality looks like a technical issue affecting model accuracy. But inaccurate training data also produces unreliable predictions, which increases monitoring noise, which raises fairness concerns, which reduces user trust, which causes staff to override outputs, which generates operational inefficiency. What started as data governance becomes operational and reputational risk simultaneously.

Practically, this means annotating the register to show which risks share root causes, which can trigger or amplify others, and which mitigations would reduce several risks at once. Ignoring interdependencies produces misleading confidence — every risk looks manageable in isolation, but their combined effect can be much larger than the sum of the parts.
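Annotating which risks can trigger others amounts to recording a small directed graph. The sketch below encodes the data quality chain from the example above and walks it to list everything downstream of a given risk. The TRIGGERS mapping and the downstream function are hypothetical names chosen for illustration.

```python
from collections import deque

# Each risk maps to the risks it can trigger or amplify
TRIGGERS = {
    "poor upstream data quality": ["unreliable predictions"],
    "unreliable predictions": ["monitoring noise"],
    "monitoring noise": ["fairness concerns"],
    "fairness concerns": ["reduced user trust"],
    "reduced user trust": ["staff overrides"],
    "staff overrides": ["operational inefficiency"],
}


def downstream(risk: str, triggers: dict[str, list[str]]) -> list[str]:
    """Return every risk reachable from the given risk, in discovery order."""
    found: list[str] = []
    queue = deque(triggers.get(risk, []))
    while queue:
        current = queue.popleft()
        # Skip repeats and loops back to the starting risk
        if current == risk or current in found:
            continue
        found.append(current)
        queue.extend(triggers.get(current, []))
    return found


print(downstream("poor upstream data quality", TRIGGERS))
```

A risk with a long downstream list is a candidate root cause: a mitigation applied there is likely to reduce several entries on the register at once.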

Owners and monitoring cadence

A risk without an owner is a risk that nobody will act on. Each entry on the register needs a named individual with both the visibility of the risk and the authority to do something about it. In AI initiatives, ownership is often distributed across operational, technical, and governance domains — which is fine, as long as the boundaries are explicit. Risks fall through gaps when responsibility is described in plural: "the team will monitor."

Monitoring cadence should be proportionate to exposure. High-exposure risks may warrant weekly review of model performance metrics, drift indicators, override rates, and exception logs. Lower-exposure risks can sit on a monthly or quarterly cycle. The point is consistency: a defined rhythm turns monitoring from a reactive activity into a routine governance habit.
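Tying cadence to exposure can be expressed as a simple lookup. The sketch below is one possible rule, not a standard. The function name review_cadence and the cut-off scores of 15 and 8 are assumptions that each organisation would set for itself.

```python
def review_cadence(exposure: int) -> str:
    """Map an exposure score (1-25) to a review rhythm.

    The cut-offs are assumed values for illustration only.
    """
    if not 1 <= exposure <= 25:
        raise ValueError(f"exposure must be between 1 and 25, got {exposure}")
    if exposure >= 15:
        return "weekly"
    if exposure >= 8:
        return "monthly"
    return "quarterly"


for score in (20, 12, 4):
    print(score, review_cadence(score))
```

Writing the rule down, in code or in the register itself, is what makes the rhythm consistent: the cadence changes when the score changes, not when someone remembers to ask.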

Project Activity — Complete section 4.3: risk assessment

Open the Module 3 Project workbook and complete section 4.3 Risk assessment. Build the risk register for the same initiative you scoped in Unit 2.

Identify at least five risks across technical, data, operational, legal, ethical, financial, adoption, and reputational categories.
Write each risk in testable terms: cause, event, and impact. Avoid vague labels such as "data risk" or "user resistance".
Score likelihood and impact, then choose a response strategy: avoid, mitigate, accept, or transfer.
Name one owner for each risk and define the monitoring cadence.
After responses, identify the top three residual risks. These are the risks your sponsor should still be watching.

Project Checklist

Section 4.3 includes at least five project-specific risks.
Each risk is written as a scenario with cause, event, and impact.
Likelihood and impact scores are justified by evidence, comparators, or clear assumptions.
Every risk has one owner with authority to act.
Each response strategy is realistic and proportionate.
I have identified interdependencies between risks where one risk could trigger another.
The top three residual risks are named after responses, not before.
Monitoring cadence is tied to risk exposure and AI-specific signals such as drift, overrides, incidents, and fairness measures where relevant.


⏭️ Up next — Lesson 2: With risks identified and responses planned, Lesson 2 turns to the financial expression of the project. You will examine what a credible whole-life budget contains, how to structure it across initial, delivery, and operational costs, and why an under-budget project is rarely a good thing.
