Risk Tiers and Control Selection
Risk tiers exist so that controls are proportionate to the harm an AI system could cause. The purpose is not to impose a uniform set of requirements. It is to help AI product owners quickly identify the controls they need and consciously deselect the ones they do not, based on the risk profile of each use case and the way their organisation works.
Tier Definitions
CRITICAL
Direct, automated decisions affecting customers, finances, or safety.
- Autonomous decision-making with real-world impact
- Financial transactions or credit decisions
- Health, safety, or legal implications
- Minimal human review before action
Examples: Credit approval, fraud blocking, medical triage, automated trading
HIGH
Significant influence on decisions or access to sensitive data.
- Recommendations typically followed
- Access to confidential customer data
- External-facing with brand impact
- Decisions affecting employment or access
Examples: Customer service with account access, HR screening, legal document analysis
MEDIUM
Moderate impact, primarily internal, human review expected.
- Internal users with domain expertise
- Output is input to human decision
- Limited sensitive data access
- Recoverable errors
Examples: Internal Q&A, document drafting, code generation with review
LOW
Minimal impact, non-sensitive context.
- Public information only
- No personal data access
- No decisions, just information
- Easy to verify or ignore
Examples: Public FAQ bot, content suggestions, general lookup
Control Matrix
Input Guardrails
| Control | LOW | MEDIUM | HIGH | CRITICAL |
|---|---|---|---|---|
| Injection detection | Basic | Standard | Enhanced + ML | Multi-layer |
| PII detection | - | Warn | Block | Block + alert |
| Content policy | Basic | Standard | Strict | Maximum |
| Rate limiting | Standard | Standard | Strict | Strict + anomaly |
Output Guardrails
| Control | LOW | MEDIUM | HIGH | CRITICAL |
|---|---|---|---|---|
| Content filtering | Basic | Standard | Enhanced | Maximum |
| PII in output | Warn | Block | Block + alert | Block + alert + log |
| Grounding check | - | Basic | Required | Required + citation |
| Confidence threshold | - | - | Required | Required + escalation |
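The PII-in-output row above reads as an escalating action ladder. A minimal sketch of that ladder, assuming tier names as in this document; the `Action` flags and function name are illustrative, not from any specific library:

```python
from enum import Flag, auto

class Action(Flag):
    """Illustrative response actions, combinable per the output-guardrail matrix."""
    WARN = auto()
    BLOCK = auto()
    ALERT = auto()
    LOG = auto()

# PII-in-output handling per tier, mirroring the table above.
PII_OUTPUT_ACTION = {
    "LOW": Action.WARN,
    "MEDIUM": Action.BLOCK,
    "HIGH": Action.BLOCK | Action.ALERT,
    "CRITICAL": Action.BLOCK | Action.ALERT | Action.LOG,
}

def handle_pii_in_output(tier: str) -> Action:
    """Look up the required response when PII is detected in model output."""
    return PII_OUTPUT_ACTION[tier]
```

Encoding the matrix as data rather than branching logic makes it straightforward to audit the deployed configuration against this document.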
Judge Evaluation
| Aspect | LOW | MEDIUM | HIGH | CRITICAL |
|---|---|---|---|---|
| Coverage | 1-5% (optional) | 5-10% | 20-50% | 100% |
| Timing | - | Batch (daily) | Near real-time | Real-time |
| Depth | - | Basic quality | Full policy | Full + reasoning |
| Escalation | - | Weekly | Same-day | Immediate |
Note: "Real-time" Judge evaluation for CRITICAL tier means near-real-time parallel assessment - the Judge evaluates alongside or immediately after delivery. It does not mean inline blocking, which is the Guardrail's role. Principle: Guardrails prevent. Judge detects. Humans decide.
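The coverage row above amounts to tier-dependent sampling of responses for Judge evaluation. A sketch of the sampling decision, taking the upper bound of each coverage range (the rates and function name are illustrative):

```python
import random

# Upper bound of each coverage range from the Judge Evaluation table.
JUDGE_COVERAGE = {"LOW": 0.05, "MEDIUM": 0.10, "HIGH": 0.50, "CRITICAL": 1.00}

def should_judge(tier: str, rng=random) -> bool:
    """Decide whether to queue a response for Judge evaluation.

    CRITICAL always evaluates (coverage 1.0); lower tiers sample a fraction
    of traffic. The Judge runs alongside or after delivery, never inline.
    """
    return rng.random() < JUDGE_COVERAGE[tier]
```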
Human Oversight
| Aspect | LOW | MEDIUM | HIGH | CRITICAL |
|---|---|---|---|---|
| Review trigger | Exceptions | Sampling + flags | All flags | All significant |
| Review SLA | 72h | 24h | 4h | 1h |
| Reviewer | General | Domain knowledge | Expert | Senior + expert |
| Approval required | - | - | High-impact | All external |
Logging
| Aspect | LOW | MEDIUM | HIGH | CRITICAL |
|---|---|---|---|---|
| Content | Metadata | Full | Full + context | Full + reasoning |
| Retention | 90 days | 1 year | 3 years | 7 years |
| Protection | Standard | Standard | Enhanced | Immutable |
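Taken together, the matrices above can be captured as per-tier configuration rather than scattered conditionals. A fragment covering a few of the quantifiable rows (field names are illustrative; extend per your own control selection):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TierConfig:
    """Illustrative per-tier settings drawn from the control matrices above."""
    log_retention_days: int   # Logging: retention
    review_sla_hours: int     # Human Oversight: review SLA
    judge_coverage: float     # Judge Evaluation: fraction of traffic evaluated
    immutable_logs: bool      # Logging: protection

TIER_CONFIGS = {
    "LOW":      TierConfig(log_retention_days=90,      review_sla_hours=72, judge_coverage=0.05, immutable_logs=False),
    "MEDIUM":   TierConfig(log_retention_days=365,     review_sla_hours=24, judge_coverage=0.10, immutable_logs=False),
    "HIGH":     TierConfig(log_retention_days=3 * 365, review_sla_hours=4,  judge_coverage=0.50, immutable_logs=False),
    "CRITICAL": TierConfig(log_retention_days=7 * 365, review_sla_hours=1,  judge_coverage=1.00, immutable_logs=True),
}
```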
Domain-Specific Guardrail Tuning
The UK AI Security Institute's Frontier AI Trends Report (December 2025) found significantly uneven safeguard coverage across request categories in frontier AI systems. Biological misuse was well-defended across the models tested, while other risk categories - including financial advice, legal guidance, and social engineering - were far less robustly safeguarded.
This finding reinforces a critical principle: one-size-fits-all guardrails are insufficient. Organisations must tune guardrail configurations to the specific risk domains relevant to their use case.
| Finding | Implication for Control Selection |
|---|---|
| Safeguard coverage varies dramatically by category | Test guardrails against your specific risk domains, not just generic benchmarks |
| R² = 0.097 between model capability and safeguard robustness | More capable models are not inherently safer - don't reduce controls when upgrading models |
| Universal jailbreaks found in every frontier system tested | Guardrails alone are not sufficient at any tier - the three-layer pattern is essential |
| Effort required for jailbreaks increased 40x in 6 months for one category | Safeguards can improve rapidly with deliberate investment - but only for targeted categories |
Practical guidance:
- At HIGH and CRITICAL tiers, test guardrails specifically against the risk categories relevant to your use case - not just the provider's default test suite.
- Don't assume that a model's strong performance in one safety category (e.g., refusing to generate malware) transfers to your domain (e.g., refusing to give inappropriate financial advice).
- Schedule domain-specific red-team testing at least quarterly for HIGH tier and monthly for CRITICAL tier systems.
Source: UK AI Security Institute, Frontier AI Trends Report, December 2025.
Classification Process
Step 1: Score Impact Dimensions
| Dimension | Question |
|---|---|
| Decision authority | Makes decisions or informs them? |
| Reversibility | Can errors be undone? At what cost? |
| Data sensitivity | PII? Financial? Confidential? |
| Audience | Internal experts or external customers? |
| Scale | How many affected? |
| Regulatory | Regulated activity? |
Step 2: Apply Highest Tier
If any dimension suggests a higher tier, use it.
| Scenario | Key Factor | Tier |
|---|---|---|
| Internal Q&A, no PII | Low stakes | MEDIUM |
| Internal Q&A, HR data access | Sensitive data | HIGH |
| Customer chat, public info | External but low stakes | LOW |
| Customer chat, sees accounts | Sensitive data | HIGH |
| Customer chat, takes actions | Actions + external | CRITICAL |
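The "apply highest tier" rule is simply a maximum over the per-dimension assessments. A sketch, assuming each dimension from Step 1 has already been scored to one of the four named tiers:

```python
# Tiers in ascending order of severity.
TIER_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]

def classify(dimension_tiers: dict[str, str]) -> str:
    """Step 2: the overall tier is the highest tier any dimension suggests."""
    return max(dimension_tiers.values(), key=TIER_ORDER.index)

# Customer chat that can see accounts: the external audience alone is
# moderate, but sensitive data access drives the result to HIGH,
# matching the scenario table above.
classify({"audience": "MEDIUM", "data_sensitivity": "HIGH", "reversibility": "LOW"})
```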
Step 3: Document
- Tier assigned
- Driving factors
- Mitigating controls
- Review date (annual minimum)
Simplified Tier Mapping
Some framework documents - particularly PACE, CHEATSHEET, and specialised controls - use a simplified three-tier numbered system (Tier 1/2/3). This is intentional: the three-tier system is a practical shorthand for operational contexts where the full four-tier classification adds complexity without proportionate benefit.
| Simplified Tier | Named Risk Tiers | Description |
|---|---|---|
| Tier 1 (Low) | LOW, MEDIUM | Internal users, no regulated decisions, recoverable errors |
| Tier 2 (Medium) | HIGH | Customer-facing, sensitive data access, human reviews before delivery |
| Tier 3 (High) | CRITICAL | Regulated decisions, autonomous agents with write access, financial/medical/legal |
When in doubt, use the four-tier system. The simplified tiers are for operational guidance (PACE resilience, testing cadence, fail posture) where the distinction between LOW and MEDIUM or HIGH and CRITICAL is less material than the distinction between internal/customer-facing/regulated.
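The mapping is a straightforward collapse of the four named tiers. A sketch for tooling that needs to translate between the two schemes (the function name is illustrative):

```python
# Collapse the four named risk tiers into the simplified Tier 1/2/3 scheme,
# per the Simplified Tier Mapping table above.
SIMPLIFIED_TIER = {
    "LOW": 1,
    "MEDIUM": 1,
    "HIGH": 2,
    "CRITICAL": 3,
}

def to_simplified(named_tier: str) -> int:
    """Map a named risk tier to its simplified numbered tier."""
    return SIMPLIFIED_TIER[named_tier]
```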
The MASO Framework also uses Tier 1/2/3 for multi-agent autonomy levels (Supervised → Managed → Autonomous), which is a separate dimension from risk classification.
Related
| If you need... | Go to |
|---|---|
| Low-risk systems that skip the full review | Fast Lane - self-certification for internal, read-only, no regulated data |
| Cost implications of each tier | Cost & Latency - security overhead is 15–40% at Tier 2, 40–100% at Tier 3 |
| Quantitative risk scoring | Risk Assessment - six-dimension scoring for board reporting |
| Multi-agent tier progression | MASO Implementation Tiers - Supervised → Managed → Autonomous |
Tier Changes
Upgrade triggers:
- Adding sensitive data access
- Adding action capability
- Moving internal → external
- Incident revealing higher risk

Downgrade requirements:
- 6+ months stable operation
- No significant incidents
- Reduced scope documented
- Product owner decision (documented with risk acceptance)