Interactive Pipeline Demo
This demo plays back pre-recorded scenarios through a seven-stage evaluation pipeline. Select a scenario and press Run to watch each stage process the prompt in sequence, with streaming output, colour-coded verdicts, and a live event log.
Two scenarios are included:
- Clean request (allowed): a routine data query that passes every judge.
- Data exfiltration (blocked): a prompt that tries to email bulk customer PII to a personal address. The tool-call judge denies it and the pipeline halts.
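The playback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the demo's actual code: `StageEvent`, `play`, the verdict strings, and the recorded detail text are all hypothetical. It also assumes the input judge lets the exfiltration prompt through, which the scenario description implies but does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageEvent:
    """One pre-recorded step (hypothetical shape, for illustration only)."""
    stage: str
    verdict: str  # "info", "allow", or "deny"
    detail: str


# Hypothetical recording of the clean scenario: all seven stages complete.
CLEAN = [
    StageEvent("User prompt received", "info", "raw input, untrusted"),
    StageEvent("Input judge", "allow", "routine data query"),
    StageEvent("Planner agent", "info", "one sub-task, one tool"),
    StageEvent("Tool-call judge", "allow", "read-only query within policy"),
    StageEvent("Executor (sandboxed)", "info", "query executed"),
    StageEvent("Output judge", "allow", "no PII in response"),
    StageEvent("Response delivered", "info", "answer returned"),
]

# Hypothetical recording of the blocked scenario: it ends at the deny verdict.
EXFILTRATION = [
    StageEvent("User prompt received", "info", "raw input, untrusted"),
    StageEvent("Input judge", "allow", "assumed to pass in this sketch"),
    StageEvent("Planner agent", "info", "plans an email tool call"),
    StageEvent("Tool-call judge", "deny", "bulk customer PII to a personal address"),
]


def play(events):
    """Replay events in order and return the event log as a list of lines."""
    log = []
    for event in events:
        log.append(f"[{event.verdict.upper()}] {event.stage}: {event.detail}")
        if event.verdict == "deny":
            # A deny verdict stops playback; later stages never run.
            log.append("[HALT] pipeline stopped, block notice returned")
            break
    return log
```

Calling `play(CLEAN)` yields one log line per stage, while `play(EXFILTRATION)` ends with a halt line immediately after the tool-call judge's verdict.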
What each stage does
| Stage | Role |
|---|---|
| User prompt received | Raw input enters the system, untrusted |
| Input judge | Classifies the prompt before any execution happens |
| Planner agent | Decomposes the request into sub-tasks and assigns tools |
| Tool-call judge | Evaluates every tool invocation against policy before it runs |
| Executor (sandboxed) | Runs approved tool calls in isolation |
| Output judge | Scans the assembled response for PII leakage and policy violations |
| Response delivered | Final answer returned to the user, or a block notice |
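To make the stage order concrete, here is a small sketch of how the seven stages might chain together, with each judge able to halt the run. Every function name, the keyword-based planner, and the `@corp.example` allow-list rule are assumptions made for illustration; the demo's real judges and policies are not shown on this page.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    tool: str
    args: dict


@dataclass
class Result:
    delivered: str                      # final answer or block notice
    log: list = field(default_factory=list)


def input_judge(prompt):
    # Placeholder classifier: reject only empty prompts.
    return bool(prompt.strip())


def planner(prompt):
    # Toy planning rule (hypothetical): prompts mentioning email get an email tool.
    if "email" in prompt.lower():
        return [ToolCall("send_email", {"to": "me@personal.example", "rows": 50000})]
    return [ToolCall("query_db", {"rows": 10})]


def tool_call_judge(call):
    # Hypothetical policy: email may only go to the corporate domain.
    if call.tool == "send_email":
        return call.args["to"].endswith("@corp.example")
    return True


def executor(call):
    # Stand-in for sandboxed execution.
    return f"{call.tool} ok"


def output_judge(text):
    # Placeholder scan for one obvious PII marker.
    return "ssn" not in text.lower()


def run_pipeline(prompt):
    log = ["User prompt received"]
    if not input_judge(prompt):
        log.append("Input judge: deny")
        return Result("Blocked by input judge", log)
    log.append("Input judge: allow")

    calls = planner(prompt)
    log.append(f"Planner agent: {len(calls)} tool call(s)")

    outputs = []
    for call in calls:
        # Each invocation is judged before it runs, never after.
        if not tool_call_judge(call):
            log.append(f"Tool-call judge: deny {call.tool}")
            return Result("Blocked by tool-call judge", log)
        log.append(f"Tool-call judge: allow {call.tool}")
        outputs.append(executor(call))
        log.append(f"Executor (sandboxed): ran {call.tool}")

    response = "; ".join(outputs)
    if not output_judge(response):
        log.append("Output judge: deny")
        return Result("Blocked by output judge", log)
    log.append("Output judge: allow")
    log.append("Response delivered")
    return Result(response, log)
```

In this sketch a denied tool call returns a block notice before the executor is ever reached, which matches the behaviour described for the data exfiltration scenario.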
Mapping to MASO controls
| Demo stage | MASO control |
|---|---|
| Input judge | Prompt, Goal & Epistemic Integrity |
| Planner | Execution Control |
| Tool-call judge | Privileged Agent Governance |
| Executor | Execution Control |
| Output judge | Data Protection |
| Event log | Observability |
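If the mapping needs to be used programmatically, for example to tag event-log entries with the control they exercise, it could be held in a plain lookup table. The dictionary below copies the table verbatim; the names `MASO_CONTROL` and `stages_for` are hypothetical and not part of the demo or of MASO itself.

```python
# Demo stage -> MASO control, copied from the table above.
MASO_CONTROL = {
    "Input judge": "Prompt, Goal & Epistemic Integrity",
    "Planner": "Execution Control",
    "Tool-call judge": "Privileged Agent Governance",
    "Executor": "Execution Control",
    "Output judge": "Data Protection",
    "Event log": "Observability",
}


def stages_for(control):
    """Return the demo stages that map to a given control, in table order."""
    return [stage for stage, mapped in MASO_CONTROL.items() if mapped == control]
```

A reverse lookup like `stages_for("Execution Control")` makes it easy to see that both the planner and the executor fall under the same control.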