Skip to content

Microsoft Agent Governance Toolkit Patterns

Purpose: Reference patterns for implementing AI agent runtime security using Microsoft's open-source Agent Governance Toolkit (AGT), released April 2026 under the MIT licence. Status: Reference patterns. AGT is new (v0.1.x at time of writing). Evaluate maturity for your use case.

Overview

The Agent Governance Toolkit is a seven-package system covering policy enforcement, identity, execution sandboxing, reliability engineering, compliance automation, plugin lifecycle, and RL training governance. It is the first open-source toolkit to address all 10 OWASP Agentic Top 10 risks with deterministic, sub-millisecond policy enforcement.

Available in Python, TypeScript, Rust, Go, and .NET. Framework integrations exist for LangChain, CrewAI, Google ADK, and Microsoft Agent Framework.

Architecture Mapping

Framework Zone AGT Component Function
Zone 1 - Ingress Agent OS Stateless policy engine intercepts every agent action before execution (p99 < 0.1ms)
Zone 2 - Runtime Agent Runtime Execution rings, saga orchestration, kill switch
Zone 3 - Evaluation Agent OS (policy rules) Deterministic policy evaluation against declared manifests
Zone 5 - Control Plane Agent Compliance, Agent Marketplace Compliance grading, plugin lifecycle, trust-tiered gating
Zone 6 - Logging Agent SRE SLOs, error budgets, structured telemetry
Cross-cutting Agent Lightning RL training governance with policy-enforced runners

Agent OS: Policy Enforcement

Agent OS is the core component. It functions as a stateless policy engine that intercepts every agent action before execution.

How it maps to AIRS controls

AIRS Control Agent OS Feature
TOOL-01 (Manifests) Agents declare permitted tools in a manifest. Agent OS rejects any tool call not in the manifest
TOOL-02 (Gateway) All tool invocations transit Agent OS. Agents cannot self-authorise
TOOL-03 (Parameters) Parameter constraints defined in policy, enforced at interception
TOOL-04 (Classification) Actions classified by reversibility. Irreversible actions require human approval
TOOL-05 (Rate Limits) Per-agent, per-tool rate limiting with configurable windows
NET-02 (Bypass Prevention) Architecture enforces that all actions pass through the policy engine

Configuration pattern

# agent-policy.yaml
agent:
  id: customer-service-agent
  manifest:
    tools:
      - name: search_knowledge_base
        max_calls_per_minute: 30
        parameters:
          query:
            type: string
            max_length: 500
      - name: create_ticket
        classification: reversible
        max_calls_per_minute: 10
      - name: issue_refund
        classification: irreversible
        requires_human_approval: true
        max_amount: 100.00
    denied_tools:
      - execute_code
      - send_email

Agent Runtime: Execution Rings

Agent Runtime introduces execution rings modelled on CPU privilege levels. Each ring restricts what the agent can do.

Ring Privilege Level Use Case
Ring 0 Full system access Platform operators only. Not for agents
Ring 1 Scoped tool access with approval gates Production agents handling sensitive operations
Ring 2 Read-only tools, no side effects Information retrieval agents, research assistants
Ring 3 Sandboxed execution, no external access Code generation, data analysis, testing

AIRS mapping

AIRS Control Execution Rings Feature
SAND-01 (Isolation) Ring assignment determines isolation level
IAM-02 (Least Privilege) Agents assigned to the minimum ring for their task
IAM-04 (Tool Constraints) Ring determines which tool categories are available

Saga orchestration

Multi-step agent operations use the saga pattern: each step has a compensating action. If any step fails or is rejected, previous steps are automatically rolled back.

This maps to IR-04 (Rollback) and addresses ASI08 (Cascading Failures) in the OWASP Agentic Top 10.

Kill switch

Emergency agent termination halts all in-flight operations and triggers compensating actions. This implements the circuit breaker pattern described in the AIRS PACE model.

Agent SRE: Reliability Engineering

Agent SRE applies site reliability engineering practices to agent systems.

SRE Concept Agent Application AIRS Mapping
SLOs Target success rates, latency bounds, error rates per agent LOG-05 (Drift Detection)
Error budgets Agents with exhausted error budgets are automatically throttled or paused Circuit Breaker
Circuit breakers Open on SLO violation, close after recovery period PACE resilience model
Chaos engineering Inject tool failures, latency, and policy violations to test agent resilience Red team methodology
Progressive delivery Canary agent deployments with automatic rollback on SLO regression Deployment controls

Agent Compliance: Automated Governance

Agent Compliance automates compliance verification against regulatory frameworks.

Framework AGT Coverage
EU AI Act High-risk obligations mapping, evidence collection
HIPAA PHI handling verification for healthcare agents
SOC 2 Control evidence for Type II audits
OWASP Agentic Top 10 All 10 risk categories assessed and graded

Compliance grading

Each agent receives a compliance grade (A through F) based on:

  • Policy coverage (are all required controls in place?)
  • Enforcement verification (are controls actually enforced, not just declared?)
  • Evidence completeness (can you prove compliance to an auditor?)

This maps to the AIRS framework's maturity levels and supports the regulatory mapping requirements in the compliance section.

Agent Marketplace: Plugin Lifecycle

Agent Marketplace governs how agents discover and use plugins (skills, tools, integrations).

Feature What It Does AIRS Mapping
Ed25519 signing Cryptographic verification of plugin authorship SUP-01 (Provenance)
Manifest verification Plugin declares permissions, data access, and network requirements TOOL-01, SUP-05
Trust-tiered gating Plugins categorised by trust level; higher-tier plugins require more verification SUP-02 (Risk Assessment)
Lifecycle management Version pinning, deprecation tracking, vulnerability alerts SUP-07, SUP-08

This directly addresses the agent supply chain risks demonstrated by the OpenClaw incident, where 1 in 5 packages in an agent skill registry were malicious.

Agent Lightning: RL Training Governance

Agent Lightning governs reinforcement learning training workflows.

Feature What It Does AIRS Relevance
Policy-enforced runners RL training runs within policy boundaries; reward functions cannot bypass safety constraints Training-time safety extension
Reward shaping Governance-aware reward signals that penalise policy violations Addresses reward hacking risk
Zero violation target Training halts on any policy violation Circuit breaker for training

This is relevant to the emergent misalignment risk documented in When Learning Goes Wrong, where RL training spontaneously produced misaligned behaviour.

Framework Integration Patterns

AGT integrates via each framework's native extension points:

Framework Integration Method
LangChain Callback handlers on chain execution
CrewAI Task decorators wrapping agent actions
Google ADK Plugin system hooks
Microsoft Agent Framework Middleware pipeline injection

LangChain example

from agent_governance_toolkit import AgentOS
from langchain.callbacks import BaseCallbackHandler

class AGTCallback(BaseCallbackHandler):
    def __init__(self, policy_path: str):
        self.policy = AgentOS.load_policy(policy_path)

    def on_tool_start(self, tool, input_str, **kwargs):
        decision = self.policy.evaluate(tool.name, input_str)
        if decision.denied:
            raise PolicyViolation(decision.reason)

OWASP Agentic Top 10 Coverage

OWASP Risk AGT Component Coverage
ASI01 Agent Goal Hijack Agent OS (policy enforcement) Tool-level containment limits hijack impact
ASI02 Tool Misuse Agent OS (manifests, parameters) Declared tool boundaries enforced deterministically
ASI03 Identity Abuse Agent Runtime (execution rings) Ring-based privilege prevents escalation
ASI04 Supply Chain Agent Marketplace (signing, gating) Cryptographic provenance for all plugins
ASI05 Code Execution Agent Runtime (Ring 3 sandbox) Sandboxed execution with no external access
ASI06 Memory Poisoning Agent Runtime (saga rollback) State rollback on detected corruption
ASI07 Inter-Agent Comms Agent OS (per-hop policy) Policy evaluation on every inter-agent message
ASI08 Cascading Failures Agent SRE (circuit breakers) SLO-based circuit breaking with error budgets
ASI09 Trust Exploitation Agent OS (approval gates) Irreversible actions require human confirmation
ASI10 Rogue Agents Agent SRE + Agent OS Behavioural SLOs detect deviation; kill switch terminates

Limitations

  • New project (v0.1.x). Production maturity is unproven. Evaluate carefully before deploying to HIGH or CRITICAL systems
  • Policy, not detection. Agent OS enforces declared policies. It does not perform semantic evaluation (no Judge equivalent). Pair with the AIRS Judge layer for full coverage
  • No multimodal support. Current policies are text and tool-action based. Image, voice, and video inputs are not covered
  • Framework lock-in risk. Deep integration with a specific agent framework may make migration difficult. Use the abstraction layer where possible