Changelog¶
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased]¶
Planned¶
- Cost model with production data
- Platform-specific implementation guides (detailed)
- Case studies from production deployments
- Judge accuracy benchmarks from real deployments
- Epistemic risk detection algorithm specifications
[0.9.1] - 2026-03-09¶
Added¶
- MASO 2.0 Anticipated Changes (
maso/maso-2.0-anticipated-changes.md) - Forward-looking analysis of six AI capability trajectories that will stress or break the current framework, with architectural responses and a three-phase roadmap. Covers: Judge ceiling (primary models exceeding evaluation capability), human oversight scaling (transaction review becoming untenable), session boundary dissolution (persistent/ambient agents), multi-agent emergent behaviors (fleet-level unanticipated states), AI-vs-AI adversarial dynamics (machine-speed offense vs. human-speed defense), and regulatory divergence (conflicting jurisdictional requirements). Defines MASO 2.0 priority roadmap across three phases: Extend (0–6 months), Architect (6–18 months), and Paradigm Shift (18–36 months).
Changed¶
- MASO README (
maso/README.md) - Added MASO 2.0 Anticipated Changes section with evolution vector summary table and roadmap overview. Positioned between Stress Testing and What's Next sections.
Rationale¶
The v0.9.0 threat intelligence review identified three structural gaps in the current framework. This analysis extends that work forward: given observable AI capability trajectories, where will the framework's current architecture become insufficient, and what architectural changes must be planned now? Six vectors were identified; each is assessed for timeline, confidence, impact on current controls, and specific MASO 2.0 response. The deeper finding: the framework's "constrain regardless" principle becomes more important, not less, as evaluation and oversight become harder. Infrastructure-level action-space constraints are the control layer that scales with model capability.
[0.9.0] - 2026-03-09¶
Added¶
-
Output Evaluator (
extensions/technical/output-evaluator.md) - New solution architecture document addressing three structural blind spots identified through 2025–2026 threat intelligence analysis. Provides a three-layer evaluation model operating at real-time (per-action), session (per-task), and campaign (cross-session) timescales. Includes action classification, pre-action Judge prompt structure, intent coherence scoring, session state tracking, cross-session correlation, cost/latency impact analysis, and tiered implementation guide (Tier 1–3). -
Session-Level Intent Analysis (
core/agentic.md§6) - Major addition. New core control addressing task decomposition attacks, where adversaries break malicious goals into individually benign sub-tasks that pass all per-interaction controls. Adds session-level action sequence analysis, intent coherence tracking, cumulative scope monitoring, and session-level circuit breakers with aggregate thresholds. Driven by: Anthropic September 2025 espionage disclosure (80–90% autonomous campaign via task decomposition), CrowdStrike documentation of adversary frameworks designed for sub-task decomposition. -
Synchronous Pre-Action Evaluation (
core/agentic.md§7) - Major addition. New core control extending the Judge with a synchronous pre-action evaluation mode for agentic systems. Risk-based action routing: low-risk actions proceed with async evaluation; elevated-risk actions require synchronous Judge approval before execution. Includes action classification taxonomy, architecture diagram, and cost/latency tradeoff analysis. Driven by: CrowdStrike fastest eCrime breakout time (27 seconds) vs. async Judge evaluation cycle (500ms–5s + queue), creating a structural prevention gap for agentic systems. -
Tool and Integration Supply Chain (
core/agentic.md§8) - Major addition. Elevated supply chain and integration layer controls from MASO extensions to core agentic controls. Includes minimum controls (provenance verification, tool output sanitisation, network isolation, permission scoping, behavior monitoring, dependency scanning) and MCP-specific controls (server allowlisting, schema validation, capability restriction, update verification). Driven by: Cisco 2025 finding that attackers target integration components over models; 43% MCP server vulnerability rate; CVE-2025-6514 (CVSS 10.0 RCE via MCP); fake npm MCP package and GitHub issue injection attacks. -
Threat Intelligence Review (
maso/threat-intelligence/threat-intelligence-review.md) - Comprehensive analysis of 2025–2026 threat landscape covering CrowdStrike, Microsoft, Anthropic, Cisco, Kiteworks, and academic research. 18 priority recommendations across framework domains. Identifies three structural gaps: task decomposition defeating per-interaction controls, speed asymmetry making async Judge a liability for agentic contexts, and integration layer as primary attack surface.
Changed¶
- Agentic Controls (
core/agentic.md) - Extended Judge for Agents criteria with session coherence and trace integrity. Updated Key Takeaways from 8 to 11 items, adding aggregate intent detection, pre-action evaluation, and integration layer security.
Rationale¶
Threat intelligence analysis (CrowdStrike 2025 Global Threat Report, Microsoft Digital Defense Report 2025, Anthropic September 2025 disclosure, Cisco AI Defense Report 2025, Kiteworks 2025 Data Exposure Report, AISI Frontier AI Trends December 2025) revealed three architectural gaps in the framework's control model:
-
Task decomposition defeats per-interaction controls. The three-layer pattern evaluates at the interaction level. Adversaries, including AI-orchestrated campaigns, decompose malicious goals into sub-tasks that individually pass every layer. This is an architectural gap, not a tuning problem.
-
Speed asymmetry makes async Judge evaluation a liability for agentic systems. The 27-second breakout window is shorter than async Judge evaluation cycles. For agents with tool access, post-action detection is forensic, not preventive.
-
Integration layer is the primary attack surface but was positioned as an extension. MCP vulnerability rates (43%), critical CVEs (CVSS 10.0), and supply chain attacks through tool packages demonstrate that tool supply chain security is a prerequisite for agentic deployments, not an optional add-on.
These are the first major additions to core controls since v0.5.0. The framework's core architecture (Guardrails → Judge → Human) remains sound for per-interaction threats. These additions extend it to handle campaign-level, speed-critical, and supply-chain threats that the 2025–2026 landscape has made operationally relevant.
[0.8.4] - 2026-03-01¶
Added¶
- Why Containment Beats Evaluation - New position paper page articulating the constrain-regardless architecture for sandbagging-resistant AI governance. Covers closed-loop vs open-loop control, action-space vs reasoning-space constraints, compound defence by design, enterprise-owned Judge positioning, and honest acknowledgement of limits. Aligns the framework with the published position paper.
Changed¶
- Architecture Overview - Added closed-loop control system framing, constrain-regardless language, and enterprise-owned Judge positioning.
- Foundations README - Added action-space constraint language, enterprise-owned Judge emphasis, and compound defence by design framing.
- Evaluation Integrity Risks - Reframed from "mitigate sandbagging" to "sandbagging is operationally irrelevant as a breakout vector" under the constrain-regardless architecture.
- Containment Through Declared Intent - Added constrain-regardless framing, action-space vs reasoning-space distinction, parental analogy, and closed-loop control system language.
[0.8.3] - 2026-03-01¶
Added¶
- Containment Through Declared Intent - New insights page articulating the framework's unified defence thesis. Declared intent is the organising principle that gives every layer (guardrails, Judge, monitoring, human oversight, PACE) its reference point. Covers how intent flows through the defence stack, downstream intent awareness between agents, the confinement model, and honest assessment of where the theory breaks down.
[0.8.2] - 2026-03-01¶
Added¶
- CoSAI Principles Alignment - Updated References with full alignment mapping to CoSAI's Principles for Secure-by-Design Agentic Systems (July 2025). Demonstrates how the framework's three-layer pattern, PACE resilience, and MASO controls operationalise CoSAI's three principles.
- Producer/Implementer/Principal Accountability - Added supply chain accountability model to the AI Governance Operating Model. Complements the three-lines governance model with a supply chain lens for distributing responsibility across technology producers, service implementers, and human principals. Credit to CoSAI.
- SLSA-Style Supply Chain Provenance - Added to ET-04 (MCP as Attack Surface). Recommends adapting SLSA for verifiable provenance of agent and model artifacts: signed build provenance, content hashes for tool definitions, and verifiable chain from source to deployment.
[0.8.1] - 2026-03-01¶
Added¶
- E-Commerce 10K Stress Test - New MASO stress test scenario: 6 agent types, 10,000 concurrent customers, 60,000 agent instances on EKS. Complements the 100-agent breadth test with a depth-and-volume test covering type-level observability, risk-tiered Judge sampling, exception-driven human oversight, Kubernetes-native identity, PCI scope containment, two-level PACE, graduated shutdown, and volume-based compound attacks.
Changed¶
- 100-Agent Stress Test - Added cross-reference to the new e-commerce stress test in the relationship table.
[0.8.0] - 2026-03-01¶
Added¶
- Evaluation Integrity Risks - New insights page covering sandbagging, evaluation evasion, and the AISI finding that black-box monitors lose accuracy on harder tasks. Includes mitigations: canary interactions, evaluation signature elimination, multi-model cross-validation, behavioral consistency monitoring.
- AISI 5-Level MCP Autonomy Classification - Added to The MCP Problem page. Maps MCP server autonomy levels (read-only through unbounded autonomous) to MASO implementation tiers, with specific guidance for financial services.
- Self-Replication Capabilities (ET-09) - New emerging threat in MASO threat intelligence. Documents RepliBench progression from 5% to 60% success rates (2023–2025) and implications for kill switch architecture.
- Capability Acceleration (ET-10) - New emerging threat documenting the ~8-month doubling time for autonomous task complexity, with implications for control framework evolution cadence.
- Persuasion Scaling Risk - Added to Emerging Trends. Documents AISI finding that persuasive capability increases with model scale while accuracy decreases, and post-training amplifies persuasion more than scaling.
- Emotional Dependence Data - Added to Emerging Trends. Documents AISI finding on 33% AI emotional support usage and 30x negative sentiment spikes during outages. PACE implications for service continuity.
- AISI Reference - Added UK AI Security Institute Frontier AI Trends Report (December 2025) to References & Sources page with full finding summary and cross-references to all updated pages. Added 6 new key statistics.
Changed¶
- Risk Tiers - Added Domain-Specific Guardrail Tuning section with AISI evidence for uneven safeguard coverage across risk categories (R² = 0.097 capability-safety correlation, 40x jailbreak effort increase for targeted categories).
- When the Judge Can Be Fooled - Expanded sandbagging section with AISI empirical data on black-box monitor accuracy degradation. Added cross-reference to new Evaluation Integrity Risks page.
- Open-Weight Models - Added AISI confirmation that safeguards can be "quickly and cheaply removed" and 4–8 month open-to-closed source capability gap data. Strengthens case for runtime monitoring as primary control.
- Emerging Threats - Updated threat landscape summary table to include ET-09 and ET-10.
Rationale¶
The UK AI Security Institute's Frontier AI Trends Report (December 2025) is the strongest empirical backing for runtime behavioral security from a government body. Its findings directly validate the framework's core thesis (runtime controls matter more than deployment-time security alone) and provide concrete data points that strengthen multiple framework sections. Credit given throughout via source attribution.
[0.7.0] - 2026-02-15¶
Added¶
- MATURITY.md - Honest assessment of framework validation status
- Four-level validation model (production, incident, standards, pattern consistency)
- Explicit documentation of known gaps
- Call for pilot partners and peer review
- VALIDATED-AGAINST.md - Control-by-control incident validation
- Controls mapped to known real-world incidents with confidence ratings
- Evidence strength ratings (Strong: 3+ incidents, Moderate: 1–2, Threat-modelled: 0)
- Validation coverage map by MASO domain
- Top 5 most-validated controls identified
- EVOLUTION.md - Narrative history of framework development
- Decision rationale for every major version
- What drove each change (incidents, feedback, architectural shifts)
- Timeline from v0.1.0 (Dec 2025) through current
Changed¶
- Updated site navigation to include Credibility section (Maturity, Validated Against, Evolution)
- Changelog now links to narrative Evolution page for context
Rationale¶
The framework is comprehensive but has no production deployments. Rather than ignoring this gap, these additions address it directly: honest status assessment, retroactive alignment against real incidents, and a living record of how the framework evolves in response to real-world events. Credibility comes from transparency, not claims.
[0.6.0] - 2026-02-08¶
Changed¶
- Renamed: AI Security Blueprint → Enterprise AI Security Framework
- Better reflects the content scope (governance, compliance, org structure)
- "Blueprint" implied buildable artifacts; "Framework" is accurate
- Later renamed to AI Runtime Security (February 2026)
Added¶
- IMPLEMENTATION_GUIDE.md - New practical guide with working code
- Input guardrails (regex + Bedrock + NeMo examples)
- Output guardrails (PII, forbidden phrases, structured validation)
- Model-as-Judge (prompts, sampling strategies, async processing)
- Human-in-the-loop queue (Redis implementation, FastAPI endpoints)
- Telemetry and logging (structured logs, Prometheus metrics)
- Complete request flow example
- Test suite templates (unit tests, red team inputs)
- ~1,500 lines of copy-paste-ready Python
Rationale¶
Reality check revealed the framework was thought leadership, not a buildable blueprint. Now there are two clear paths: - Implementors: Start with IMPLEMENTATION_GUIDE.md (code) - Architects/Governance: Use the full Framework (strategy)
[0.5.0] - 2026-02-07¶
Changed¶
- Major restructure: Core + Extensions model
- New
/core/folder with 5 essential documents: - README.md - Overview and quick start
- risk-tiers.md - Classification and control selection
- controls.md - Guardrails, Judge, HITL combined
- agentic.md - Agent-specific controls
- checklist.md - Implementation tracking
- New
/extensions/folder for reference material: - regulatory/ - ISO 42001, EU AI Act, banking
- technical/ - Bypass prevention, infrastructure, metrics
- templates/ - Playbooks, assessments
- examples/ - Worked examples
- Root README now serves as navigation hub
- Previous detailed documents preserved in extensions
Rationale¶
Framework had grown to 48 files. Core + Extensions model provides clear "start here" path (5 docs) while preserving depth for those who need it.
[0.4.1] - 2026-02-06¶
Added¶
- Bypass Prevention document - comprehensive guide to preventing and detecting control circumvention across 5 bypass categories (guardrails, intent, agentic, architectural, process)
- Technical Controls document - network, firewall, WAF, AI gateway, DLP, proxy, endpoint, cloud, and IAM controls for infrastructure-level enforcement
- 14 new SVG diagrams:
- bypass-taxonomy.svg - 5 bypass categories visual
- defence-in-depth.svg - 8-layer control stack
- technical-controls-architecture.svg - infrastructure overview
- ai-gateway-architecture.svg - gateway internals
- network-zones.svg - network segmentation
- agent-sandbox.svg - infrastructure constraints
- action-validator-flow.svg - action validation pipeline
- tool-output-sanitiser.svg - tool output handling
- canary-testing.svg - control verification programme
- dlp-inspection-points.svg - 4 DLP layers
- casb-ai-classification.svg - sanctioned/tolerated/blocked apps
- bypass-learning-loop.svg - continuous improvement cycle
- infra-vs-instruction.svg - enforcement comparison
- multi-layer-input-validation.svg - input processing pipeline
Changed¶
- Updated bypass-prevention.md and technical-controls.md to reference SVG diagrams instead of ASCII art
- Clarified lifecycle scope in README - framework is operationally focused (deployment → operation → incident response), not full AI/ML lifecycle
[0.4.0] - 2026-02-05¶
Added¶
- AI Incident Response Playbook - 10 playbooks for AI-specific incidents
- Vendor Assessment Questionnaire - comprehensive due diligence template
- Operational Metrics document - KPIs, dashboards, alerting thresholds
- Data Retention Guidance - requirements by tier and jurisdiction
- Templates README - index of all templates
- Standard repo files: CODE_OF_CONDUCT.md, GOVERNANCE.md, LICENSE (MIT), SECURITY.md
Changed¶
- Updated README with Templates section and new document links
- Moved "Threats" section to "Threats and Risks" with expanded content
[0.3.0] - 2026-02-05¶
Added¶
- Novel AI Risks document - 12 risks unique to AI systems
- Support Systems Risk document - operational risks that matter most
- Banking Cyber Risks document - top 10 banking risks through AI lens
- Feeder systems analysis with diagram
- 10 new controls: AI.3.4, AI.5.4, AI.6.4, AI.6.5, AI.7.4, AI.8.5, AI.9.5, AI.10.6, AI.13.4
- Support systems risk heat map SVG
- Banking AI feeder systems diagram SVG
- Model card template
- Reference materials (glossary, bibliography)
- Future work roadmap
Changed¶
- Strengthened AI.4.2 (Testing) with statistical testing for non-determinism
- Strengthened AI.6.2 (Model Validation) with bias testing and continuous validation
- Strengthened AI.6.3 (Model Monitoring) with degradation detection
- Strengthened AI.7.1 (Input Guardrails) with semantic analysis and RAG filtering
- Strengthened AI.7.2 (Output Guardrails) with grounding checks
- Strengthened AI.7.3 (Guardrail Maintenance) with semantic adversarial testing
- Strengthened AI.8.1 (Judge Evaluation) with hallucination and override detection
- Strengthened AI.8.2 (Sampling Strategy) with baseline integration
- Strengthened AI.9.1 (HITL) with automation bias mitigation
- Strengthened AI.11.1 (Logging) with full context capture
- Strengthened AI.13.1 (Vendor Assessment) with training data practices
- Strengthened AI.14.1 (Training) with cognitive bias training
- Strengthened AG.2.3 (Scope Enforcement) with outcome boundaries
- Updated README with new documentation links
Fixed¶
- XML entity escaping in SVG files (ampersand encoding)
[0.2.0] - 2026-01-15¶
Added¶
- Agentic Controls (AG.1-AG.4) for autonomous AI systems
- AG.2.5 Tool Protocol Security for MCP, function calling
- ISO 42001 alignment document
- EU AI Act crosswalk
- Platform integration guide (Bedrock, Databricks, Foundry)
- Control selection guide
- Tube map visualisation
- Multiple architecture diagrams
Changed¶
- Expanded risk tier definitions
- Enhanced HITL model documentation
- Improved Judge model selection guidance
[0.1.0] - 2025-12-01¶
Added¶
- Initial framework release
- Three-layer control model (Guardrails, Judge, HITL)
- AI control families AI.1-AI.16
- Risk tiering framework (CRITICAL/HIGH/MEDIUM/LOW)
- Model-as-Judge pattern and operating model
- HITL operating model
- ISO 27001 alignment
- OWASP LLM Top 10 threat mapping
- Implementation guide
- Maturity model
- Example implementations (customer service, document assistant, credit decision)
Categories¶
Addedfor new featuresChangedfor changes in existing functionalityDeprecatedfor soon-to-be removed featuresRemovedfor now removed featuresFixedfor any bug fixesSecurityfor vulnerability fixes