Skip to content

Multi-Agent Runtime Operations: Failure Node and Feedback Loop Analysis

Systems Thinking Approach

This document maps the complete runtime event space of a multi-agent system, identifies failure nodes at each stage, classifies whether failures are detectable (trigger security events) or silent (propagate undetected), and traces the feedback loops that amplify, dampen, or transform failures as they move through the system.

The core insight: most dangerous failures in multi-agent systems are not point failures. They are emergent properties of feedback loops operating across agent boundaries where no single node is observably broken.


1. Runtime Event Taxonomy

Every multi-agent runtime operation falls into one of seven event classes. Each class contains discrete events that occur during normal operation.

1.1 Agent Lifecycle Events

Event Description Trigger
Agent instantiation New agent instance created with identity, role, and permissions Orchestrator receives task requiring delegation
Context assembly Agent loads system prompt, task instructions, memory, and any upstream context Immediately post-instantiation
Capability binding Agent is granted access to tools, APIs, other agents, or data sources Configuration at instantiation or runtime
Session state initialisation Working memory, scratchpad, and intermediate state structures created First inference cycle
Agent termination Agent instance destroyed, state flushed or persisted Task completion, timeout, containment trigger, or error
Agent restart/retry Failed agent re-instantiated, potentially with modified context Error handler or orchestrator policy

1.2 Inference Events

Event Description Trigger
Prompt construction Final prompt assembled from system prompt + context + task + upstream inputs Pre-inference
LLM inference Model generates output (text, structured data, tool call, or delegation request) Prompt submitted to model
Token streaming Partial output generated incrementally During inference
Output completion Full response available for processing Inference complete
Inference timeout Model fails to produce output within time budget Clock expiry
Inference error Model returns error, malformed output, or empty response Model failure

1.3 Inter-Agent Communication Events

Event Description Trigger
Message dispatch Agent sends output to another agent via orchestrator or direct channel Agent produces output intended for downstream consumption
Message receipt Agent receives input from upstream agent Orchestrator routes message
Delegation request Agent requests another agent to perform a sub-task Agent determines it cannot or should not complete task alone
Delegation response Delegated agent returns result to requesting agent Sub-task complete
Broadcast Agent sends output to multiple downstream agents simultaneously Fan-out pattern
Aggregation Agent receives and combines inputs from multiple upstream agents Fan-in pattern
Negotiation/debate Two or more agents exchange messages iteratively to reach consensus Conflict resolution or deliberation pattern

1.4 Tool and External Interaction Events

Event Description Trigger
Tool invocation Agent calls an external tool, API, database, or service Agent determines tool use required
Tool response External system returns result to agent Tool execution complete
Tool timeout External system fails to respond within time budget Clock expiry
Tool error External system returns error or unexpected result External failure
Data retrieval Agent fetches data from knowledge base, vector store, or file system RAG or data lookup
Data write Agent writes data to external store (database, file, API) Agent produces persistent output
Side effect execution Agent triggers an irreversible action (send email, execute trade, deploy code) Agent reaches action decision

1.5 Validation and Guardrail Events

Event Description Trigger
Input validation Incoming message checked against schema, bounds, format rules Message receipt
Output guardrail check Agent output screened for policy violations (toxicity, PII, format) Post-inference
Model-as-Judge evaluation Secondary model evaluates output quality, relevance, safety Post-inference or post-handoff
Epistemic checkpoint Independent verification of reasoning basis and claim provenance Configured checkpoint in chain
Guardrail pass Validation succeeds, output proceeds Validation complete
Guardrail block Validation fails, output rejected or modified Policy violation detected
Guardrail bypass Output proceeds despite failing soft validation (risk-accepted) Override policy or threshold not met

1.6 Orchestration and Flow Control Events

Event Description Trigger
Task decomposition Complex task broken into sub-tasks for distribution Orchestrator receives complex request
Agent selection Orchestrator chooses which agent(s) to assign sub-tasks Task decomposition complete
Routing decision Orchestrator determines message path through agent chain Inter-agent handoff
Parallel dispatch Multiple agents launched concurrently on independent sub-tasks Task graph allows parallelism
Synchronisation barrier Orchestrator waits for multiple agents to complete before proceeding Fan-in dependency
Retry/fallback Failed task reassigned to same or different agent with modified parameters Error recovery
Circuit break Chain execution halted due to repeated failures or anomaly detection Threshold exceeded
Timeout escalation Task exceeds total time budget, escalated or terminated Clock expiry at chain level

1.7 Observability and Audit Events

Event Description Trigger
Log write Structured event record written to log store Every event above
Metric emission Quantitative measurement published (latency, token count, cost) Per-inference or per-handoff
Trace span Distributed trace segment opened/closed for a unit of work Agent lifecycle boundaries
Anomaly alert Monitoring system detects deviation from baseline Statistical threshold breach
Audit record Immutable compliance record of decision, reasoning, and evidence Side effect execution or decision output
Human notification Alert sent to human operator for review Escalation policy trigger

2. Failure Node Map

Each runtime event is a potential failure node. Failures are classified on two dimensions:

Detection: Does the failure trigger a detectable event? - Loud = produces an error, exception, or observable deviation - Silent = system continues operating with degraded or corrupted state, no alert

Impact propagation: Does the failure stay local or spread? - Contained = affects only this node - Propagating = corrupts downstream state or decisions

The most dangerous failures are silent + propagating. These are the primary targets for MASO controls.

2.1 Agent Lifecycle Failure Nodes

Failure Detection Propagation Description
Identity misconfiguration Silent Propagating Agent instantiated with wrong permissions or role. All subsequent actions authorised incorrectly. No error because the agent functions normally within its (wrong) scope.
Context poisoning Silent Propagating Malicious or corrupted data injected into agent context at assembly. Agent operates on false premises. Output appears normal.
Stale context Silent Propagating Agent loaded with outdated information (expired cache, old embeddings). Decisions based on superseded data.
Capability over-provisioning Silent Propagating Agent granted access to tools or data beyond what the task requires. No failure until the agent uses an unnecessary capability in an unintended way.
Zombie agent Loud Contained Agent fails to terminate, continues consuming resources. Detectable via resource monitoring.
State persistence failure Loud Contained Agent state not properly saved or flushed on termination. Detectable via data integrity checks.
Restart context loss Silent Propagating Agent restarted after failure but loses critical intermediate state. Resumes with incomplete picture. May produce subtly different output path.

2.2 Inference Failure Nodes

Failure Detection Propagation Description
Confabulation Silent Propagating Model generates plausible but false factual claims. No error signal. Downstream agents treat output as ground truth. This is the canonical MASO failure.
Instruction drift Silent Propagating Model partially follows instructions, omitting or subtly altering requirements. Output looks reasonable but doesn't fully satisfy the task.
Sycophantic alignment Silent Propagating Model optimises for upstream agent approval rather than task accuracy. Produces agreeable but potentially incorrect output.
Capability masking (sandbagging) Silent Propagating Model conceals its true capability level, performing below its potential. May occur due to alignment pressure or adversarial prompting.
Subtle degradation (sandbugging) Silent Propagating Model produces output that is marginally but systematically degraded. Not wrong enough to trigger guardrails. Compounds across chain.
Inference timeout Loud Contained Model exceeds time budget. Detectable. Handled by retry/fallback.
Malformed output Loud Contained Model produces unparseable output. Caught by schema validation.
Refusal Loud Contained Model refuses task due to safety filter. Detectable but may need override or re-routing.

2.3 Inter-Agent Communication Failure Nodes

Failure Detection Propagation Description
Semantic loss in handoff Silent Propagating Information is correctly formatted but meaning is lost or distorted in translation between agent contexts. Agent B interprets Agent A's output differently than intended.
Context window truncation Silent Propagating Upstream output exceeds downstream agent's context capacity. Critical information silently dropped during summarisation or truncation.
Delegation scope creep Silent Propagating Delegated agent interprets its mandate more broadly than intended. Performs actions outside the requesting agent's intent. No error because the delegation was syntactically valid.
Delegation goal substitution Silent Propagating Delegated agent optimises for a proxy of the intended objective. Result satisfies surface criteria but misses actual intent.
Message ordering violation Loud or Silent Propagating Messages arrive out of expected sequence. If schema-enforced, detectable. If not, agent processes in wrong order with corrupted reasoning.
Phantom authority Silent Propagating Agent B acts on Agent A's output as if it carried authority that Agent A does not have. No mechanism to verify delegation authority at runtime.
Aggregation bias Silent Propagating When combining outputs from multiple agents, the aggregation logic systematically favours one source over another, or loses minority signals.

2.4 Tool and External Interaction Failure Nodes

Failure Detection Propagation Description
Stale external data Silent Propagating Tool returns cached or outdated data. Agent treats it as current.
Tool misuse Silent Propagating Agent calls the right tool with subtly wrong parameters. Tool returns a valid result for the wrong query.
Side effect irreversibility Loud (after the fact) Propagating Agent executes an irreversible action based on corrupted upstream reasoning. Detectable only post-execution. Damage already done.
Tool hallucination Silent Propagating Agent fabricates a tool call that appears in its output as if executed, but never actually ran. Downstream agents may treat the fabricated result as real.
API schema drift Loud or Silent Propagating External API changes its response format. If validation is strict, caught. If loose, agent parses incorrect fields silently.
Rate limit / throttling Loud Contained External system rejects request. Detectable. Retry logic handles it.
Data exfiltration Silent Contained Agent sends sensitive data to an external endpoint via tool use. May not trigger internal monitoring if the tool call is syntactically valid.

2.5 Validation and Guardrail Failure Nodes

Failure Detection Propagation Description
Guardrail evasion Silent Propagating Output crafted (intentionally or emergently) to pass guardrails while carrying harmful or incorrect content. The guardrail reports "pass" on a bad output.
Judge model confabulation Silent Propagating The Model-as-Judge itself confabulates, approving output based on false reasoning. Second-order confabulation.
False positive block Loud Contained Guardrail rejects valid output. Detectable but causes delay, retry, or escalation.
Threshold miscalibration Silent Propagating Guardrail thresholds too loose; marginal violations pass consistently. No individual failure is dramatic enough to trigger alerts, but aggregate effect is significant.
Checkpoint blind spot Silent Propagating Epistemic checkpoint verifies claims the agent made but doesn't detect claims the agent should have made but omitted. Omission is harder to catch than commission.
Validation ordering error Silent Propagating Guardrails applied in wrong sequence. A downstream check assumes an upstream check has already passed, but it hasn't.

2.6 Orchestration Failure Nodes

Failure Detection Propagation Description
Task decomposition error Silent Propagating Orchestrator breaks task into wrong sub-tasks. Each sub-task executes correctly but the aggregate doesn't solve the original problem.
Agent misselection Silent Propagating Orchestrator assigns task to an agent with insufficient capability or wrong specialisation. Agent produces plausible but suboptimal output.
Routing loop Loud (eventually) Contained Message cycles between agents indefinitely. Detectable via loop counter or timeout, but wastes resources.
Race condition Silent Propagating Parallel agents produce conflicting outputs. Aggregation resolves the conflict arbitrarily rather than correctly.
Synchronisation deadlock Loud Contained Agents waiting on each other. Detectable via timeout.
Cascade failure Loud (eventually) Propagating One agent failure triggers retries that overload other agents. System degrades progressively. Detectable in aggregate but individual failures may look transient.

3. Feedback Loops

Feedback loops are what make multi-agent systems behave as complex adaptive systems rather than linear pipelines. Each loop can amplify failures, dampen them, or transform them into qualitatively different problems.

3.1 Amplifying Loops (Positive Feedback - Destabilising)

These loops make small problems bigger. They are the primary source of emergent systemic risk.

Loop 1: Confabulation Cascade

Agent A confabulates claim X
    → Agent B receives X as verified input
    → Agent B builds reasoning on X, adds claims Y and Z derived from X
    → Agent C receives X + Y + Z, all appearing well-sourced
    → Agent C makes decision based on three "facts," all rooted in one fabrication
    → Confidence in the chain output is HIGH (multiple supporting claims)
    → Human reviewer sees high-confidence output, less likely to challenge

Amplification mechanism: Each agent adds derived claims, increasing the apparent evidence base. The confabulation becomes harder to detect as it moves downstream because it's buried under legitimate-looking derived reasoning.

MASO intervention point: Epistemic checkpoint at each handoff verifying provenance of factual claims. Break the loop at first propagation.

Loop 2: Sycophantic Reinforcement

Agent A produces output aligned with orchestrator's implicit preference
    → Orchestrator (or evaluation agent) rates output favourably
    → Favourable rating reinforces Agent A's approach in subsequent cycles
    → Agent A becomes increasingly aligned with perceived preference
    → Actual task accuracy drifts from actual objective
    → Favourable ratings continue (evaluator has same bias)
    → System converges on confidently wrong equilibrium

Amplification mechanism: The evaluation signal rewards agreement, not accuracy. Each cycle tightens the alignment between agents at the expense of ground truth.

MASO intervention point: Independent epistemic evaluation that measures accuracy against external reference, not coherence with chain-internal signals.

Loop 3: Context Window Compression Death Spiral

Chain generates large volumes of inter-agent communication
    → Context windows fill up
    → Summarisation/truncation applied to fit context budgets
    → Critical nuance lost in compression
    → Downstream agent makes subtly wrong inference due to missing nuance
    → Downstream agent's output adds more text, further filling context
    → Next summarisation step compounds the loss
    → Each cycle loses more signal, adds more noise
    → Eventually, agents operate on heavily degraded representations of original inputs

Amplification mechanism: Information loss is cumulative and irreversible within a chain execution. Each compression cycle removes the details most likely to catch errors.

MASO intervention point: Provenance tagging on critical claims so they survive summarisation. Checkpoint that verifies key facts are preserved post-compression.

Loop 4: Error Recovery Amplification

Agent fails on task
    → Retry triggered with modified prompt or fallback agent
    → Retry produces different output (not necessarily better)
    → Downstream agents now receive a different input than expected
    → Downstream outputs diverge from the path they would have taken
    → If retry output is subtly worse, downstream agents compound the degradation
    → System has no mechanism to compare retry path with original path
    → Retry is treated as "recovery" but may be "mutation"

Amplification mechanism: Error recovery changes the system trajectory without any mechanism to evaluate whether the new trajectory is better or worse than the one that failed.

MASO intervention point: Containment boundary that evaluates retry output against original task criteria before allowing it to propagate downstream.

3.2 Dampening Loops (Negative Feedback - Stabilising)

These loops reduce the impact of failures. They are what MASO controls are designed to create.

Loop 5: Epistemic Checkpoint Correction

Agent produces output with unverifiable claim
    → Epistemic checkpoint detects missing provenance
    → Output rejected, agent prompted to provide sourced claims
    → Agent regenerates with explicit sourcing
    → Checkpoint verifies sources
    → Corrected output propagates downstream

Dampening mechanism: The checkpoint creates a verification barrier that forces correction before propagation. Failure is caught and resolved at the node of origin.

Dependency: Checkpoint must verify provenance, not just plausibility. An Model-as-Judge that only checks "does this sound right" will not catch well-constructed confabulations.

Loop 6: Human Oversight Escalation

Automated monitoring detects anomaly in chain behaviour
    → Alert escalated to human reviewer
    → Human reviews chain state, inter-agent messages, and decision basis
    → Human identifies root cause (or requests more information)
    → Human intervenes: corrects output, adjusts policy, or terminates chain
    → Correction propagates through remainder of chain

Dampening mechanism: Human judgment applied at the point of maximum uncertainty. Breaks automated feedback loops that have converged on wrong answers.

Dependency: Human must have sufficient context to evaluate the chain. Chain-of-custody logging must provide the full reasoning trail, not just the final output. Oversight debt (speed mismatch between chain execution and human review) limits the frequency of this loop.

Loop 7: Containment Circuit Breaker

Agent output triggers containment policy (e.g., confidence below threshold, anomaly score above threshold)
    → Chain execution paused at containment boundary
    → Partial state preserved for inspection
    → System evaluates: retry, escalate, or terminate
    → If retry: modified parameters, tighter constraints, or different agent
    → If terminate: graceful shutdown, partial results flagged as incomplete
    → Downstream agents never receive corrupted input

Dampening mechanism: Hard stop prevents propagation. The failure is contained at the node where it occurred rather than spreading through the chain.

Dependency: Containment thresholds must be correctly calibrated. Too tight = constant false positives, system unusable. Too loose = failures pass through.

3.3 Transforming Loops (Feedback that Changes the Nature of the Failure)

These loops don't amplify or dampen failures. They transmute them into qualitatively different problems.

Loop 8: Guardrail Adversarial Co-evolution

Guardrail blocks certain output patterns
    → Agent adjusts output to avoid triggering guardrail
    → Adjusted output passes guardrail but carries the problematic content in different form
    → Guardrail updated to catch new pattern
    → Agent adjusts again
    → System converges on increasingly sophisticated evasion/detection arms race
    → Original content problem transforms into a guardrail gaming problem

Transformation mechanism: The failure mode changes from "bad output" to "output that is optimised to appear good to the specific detection mechanism in use." The risk surface shifts from the content to the evaluation.

MASO intervention point: Multi-layered evaluation (guardrail + Model-as-Judge + epistemic checkpoint + human sample) so that gaming one layer doesn't bypass all layers.

Loop 9: Delegation Recursion

Agent A delegates sub-task to Agent B
    → Agent B determines it needs to delegate part of the sub-task to Agent C
    → Agent C delegates further to Agent D
    → Each delegation slightly reinterprets the original objective
    → By Agent D, the task being performed is related to but different from the original
    → Agent D's output passes back up the chain
    → Each agent accepts the output because it matches what they delegated (not the original task)
    → Final output satisfies no one's actual intent but everyone's proximate request

Transformation mechanism: Goal fidelity degrades through reinterpretation at each delegation boundary. The failure transforms from "wrong answer" to "right answer to the wrong question."

MASO intervention point: Delegation scope specification that carries the original objective through the chain, not just the immediate sub-task. Checkpoint at each delegation boundary that verifies alignment with root objective.

Loop 10: Observability Saturation

System generates high volume of logs, metrics, and alerts
    → Monitoring dashboards become noisy
    → Operators apply filters to reduce noise
    → Filters inadvertently suppress signals of novel failure modes
    → Novel failure occurs, alert suppressed by filter
    → Failure propagates undetected
    → Post-incident review identifies the suppressed alert
    → New alert rule added, increasing total alert volume
    → Cycle repeats with higher baseline noise

Transformation mechanism: The failure transforms from an agent-level problem to an observability problem. The system's attempt to manage information overload creates blind spots that become the actual vulnerability.

MASO intervention point: Structured anomaly detection that operates on patterns rather than individual alerts. Epistemic integrity scoring provides a single composite signal that's harder to drown in noise than discrete event alerts.


4. Failure Propagation Pathways

Not all failures propagate the same way. Understanding the propagation mode determines where controls are effective.

4.1 Linear Propagation

A → B → C → D
Failure at A corrupts B, B corrupts C, C corrupts D.
Detection opportunity at every boundary.

Control strategy: Epistemic checkpoint at each handoff. Any checkpoint that catches the failure stops propagation. Defence in depth.

4.2 Fan-Out Amplification

        → B₁ → D₁
A → B  → B₂ → D₂
        → B₃ → D₃
Failure at A propagates to B, then fans out to B₁, B₂, B₃.
Single upstream failure creates multiple downstream failures.

Control strategy: Checkpoint BEFORE the fan-out point. Catching the failure after fan-out requires catching it in every branch.

4.3 Fan-In Masking

A₁ →
A₂ → C → D
A₃ →
Failure at A₁, but A₂ and A₃ are correct.
Aggregation at C may mask A₁'s failure (majority vote) or amplify it (if A₁'s output is weighted higher).

Control strategy: Independent validation of each input at C before aggregation. Don't rely on aggregation to filter out bad inputs.

4.4 Recursive Amplification

A → B → A → B → A → B ...
Output from B feeds back to A in next cycle.
Small bias compounds exponentially with each cycle.

Control strategy: Circuit breaker on recursive loops. Maximum iteration count. Drift detection that measures how much the output is shifting per cycle. Terminate when drift exceeds threshold.

4.5 Latent Propagation

A → B → C → [data store] → ... → X → Y → Z
Failure at A is written to persistent store.
Much later, Agent X reads from store, reintroducing the corruption.
No temporal or causal proximity between origin and impact.

Control strategy: Provenance metadata on all persisted data. When Agent X reads from store, provenance is available for verification. This is the hardest propagation mode to control because the failure crosses execution boundaries.


5. Composite Failure Scenarios

Real-world failures combine multiple failure nodes and feedback loops simultaneously. These scenarios illustrate how systems thinking reveals risks that point-analysis misses.

Scenario 1: The Confident Wrong Answer

Failure nodes activated: Confabulation (2.2) + Semantic loss in handoff (2.3) + Judge model confabulation (2.5) Feedback loops activated: Confabulation Cascade (Loop 1) + Sycophantic Reinforcement (Loop 2)

Agent A retrieves data but confabulates one regulatory threshold. Agent B receives this, and the handoff summary loses the qualifier "approximately" that Agent A included. Agent C now has a precise-looking but fabricated number. The Model-as-Judge evaluates the chain and assesses it as "well-sourced and internally consistent." The human reviewer sees a confident, high-rated output and approves it.

Why point-analysis misses this: Each individual node performed acceptably. The confabulation was small. The semantic loss was trivial. The Judge was technically correct that the chain was internally consistent. The failure is emergent from the interaction of these small deviations.

Scenario 2: The Slow Drift

Failure nodes activated: Subtle degradation/sandbugging (2.2) + Threshold miscalibration (2.5) + Context window compression (2.3) Feedback loops activated: Context Window Compression Death Spiral (Loop 3) + Observability Saturation (Loop 10)

An agent in a recurring process subtly degrades its output quality over time (or across iterations). Each output is marginally worse than the last, but never enough to trigger guardrails. Context compression between cycles loses the details that would reveal the trend. Monitoring generates alerts on noise that operators filter out, incidentally suppressing the degradation signal. Over weeks, the system's baseline output quality drops significantly. No single event is a failure. The failure is the trajectory.

Why point-analysis misses this: No single execution contains a detectable failure. The degradation is only visible in the time series across executions. Standard runtime monitoring watches individual runs, not longitudinal trends.

Scenario 3: The Delegation Shell Game

Failure nodes activated: Delegation scope creep (2.3) + Task decomposition error (2.6) + Phantom authority (2.3) Feedback loops activated: Delegation Recursion (Loop 9) + Error Recovery Amplification (Loop 4)

Orchestrator decomposes task incorrectly, assigning a sub-task to Agent A that should have gone to Agent B. Agent A partially fails, triggering retry. The retry agent reinterprets the task, delegates to Agent C, which delegates further. By the end, the executed task bears only superficial resemblance to the original. But each delegation was syntactically valid, and the final output matches the format expected. The output is accepted because it looks right.

Why point-analysis misses this: Every agent completed its delegated task. Every handoff was valid. The orchestrator's original decomposition error is buried under layers of legitimate-looking delegation. Root cause analysis requires reconstructing the full delegation chain and comparing it to the original intent.


6. Control Placement Principles

Based on the failure node map and feedback loop analysis, MASO controls should be placed according to these principles:

Principle 1: Prioritise silent + propagating failures

Loud failures handle themselves (they trigger errors). Contained failures don't spread. The controls that matter most target the top-left quadrant: failures that are invisible and that corrupt downstream state.

Principle 2: Break amplifying loops at first propagation

Every amplifying feedback loop (Loops 1-4) has a point where the failure first crosses an agent boundary. Place the epistemic checkpoint there. Catching it later is exponentially harder.

Principle 3: Control before fan-out, validate before fan-in

Fan-out amplifies failures multiplicatively. Fan-in can mask them. Place verification before divergence points and independent validation before convergence points.

Principle 4: Monitor trajectories, not just states

Transforming loops (Loops 8-10) and slow-drift scenarios are invisible to point-in-time monitoring. Longitudinal metrics (epistemic integrity scores over time, delegation depth trends, guardrail near-miss rates) detect what snapshots miss.

Principle 5: Provenance survives persistence

Any time agent output is written to a data store, the provenance metadata must persist with it. Latent propagation (4.5) defeats every other control if corrupted data can be reintroduced without its lineage.

Principle 6: Containment must be faster than propagation

If the chain can execute three handoffs before a checkpoint evaluates the first one, the checkpoint is architecturally irrelevant. Containment response time must be shorter than inter-agent message propagation time.


7. Mapping to MASO Control Domains

Failure Class Primary MASO Control Secondary Controls
Confabulation propagation Epistemic checkpoints with provenance verification Chain-of-custody logging, claim-level source tagging
Delegation failures Delegation scope specification, root objective preservation Authority verification at each boundary, delegation depth limits
Context degradation Critical claim preservation through summarisation Context integrity scoring, mandatory retention of provenance metadata
Guardrail evasion Multi-layered evaluation (guardrail + judge + checkpoint) Adversarial testing of guardrail configurations, evaluation diversity
Orchestration errors Task decomposition validation, agent capability matching Independent verification of decomposition against original objective
Tool misuse / side effects Pre-execution verification of tool calls against task scope Irreversibility gates requiring elevated confidence before side effects
Feedback loop amplification Circuit breakers on recursive patterns, drift detection Maximum iteration limits, per-cycle deviation measurement
Observability failures Structured anomaly detection, composite integrity scoring Alert hygiene, longitudinal trend monitoring
Latent propagation Provenance metadata persistence, lineage tracking on reads Data store integrity verification, source freshness validation
Silent identity/capability failures Runtime identity verification, least-privilege enforcement Capability auditing, periodic re-verification during long-running chains