Memory and Context Controls¶

Securing what the model remembers - across turns, sessions, and users.

This document uses the simplified three-tier system (Tier 1/2/3). See Risk Tiers - Simplified Tier Mapping for the mapping to LOW/MEDIUM/HIGH/CRITICAL.

The Problem¶

The three-layer pattern evaluates individual requests and responses. But AI systems accumulate context:

Within a conversation - each turn adds to the context window
Across conversations - persistent memory, session history, user profiles
Across users - shared embeddings, cached responses, fine-tuned models

A single request-response pair may be safe. The accumulated context may not be.

Threat Model¶

Threat	Vector	Impact
Gradual context poisoning	Early turns inject instructions that influence later turns	Model behavior changes over a long conversation without triggering per-turn guardrails
Cross-session leakage	Persistent memory or shared cache surfaces User A's data for User B	Data breach - potentially regulated data
Memory manipulation	Injecting false "memories" via conversation that persist across sessions	Ongoing manipulation of model behavior for a user
Context window overflow	Filling the context with irrelevant content to push out system instructions	Guardrail bypass - system prompt "forgotten"
Accumulated PII	Individual turns are PII-free but the conversation as a whole builds a profile	Privacy violation - model holds more personal data than any individual turn reveals

Controls¶

1. Session Isolation¶

Every user session must be isolated. No shared state between users unless explicitly designed and controlled.

Requirement	Implementation
Separate context per user	Each user gets their own conversation thread - no shared context window
Separate memory per user	Persistent memory is scoped to the authenticated user
No shared cache for generated responses	Response caching (if used) must be keyed to user + input, not input alone
Session timeout	Conversations expire after inactivity - context is not preserved indefinitely

2. Context Window Hygiene¶

Control	What It Does
System prompt anchoring	Re-inject system instructions at intervals in long conversations, not just at the start
Context summarisation	Periodically summarise old turns and replace verbose history with summaries
Turn limits	Maximum number of turns per conversation before requiring a new session
Token budget monitoring	Alert when context window approaches capacity, because model behavior degrades near limits. This is a security control, not just a performance concern: attention dilution weakens system prompt adherence, increases hallucination rates, and degrades instruction-following. In multi-agent systems, this becomes a dual failure path when the Model-as-Judge's context also fills (see MASO OP-04 for the full risk treatment). Recommended thresholds: 70% (info), 85% (warning), 95% (critical/fail-closed)

Context Rotation (Tier 2+)¶

When token budget monitoring (above) indicates an agent is approaching context limits, context rotation preserves the agent's work while restoring guardrail effectiveness:

Checkpoint the essential structured state: current goal, active constraints, accumulated decisions, pending actions, as typed structured fields (JSON schema), not free-text summaries
Flush the context window
Resume with the original system prompt + structured checkpoint

The critical design decision: use structured state, not natural language summaries. Summarisation introduces semantic drift: "must" becomes "should," "never exceed 5%" becomes "keep low," qualifiers vanish. Requirements soften through each summarisation cycle. Typed fields preserve constraint precision.

In multi-agent systems, context rotation must be coordinated with the Model-as-Judge. If the Judge is also approaching its context limit, rotate the Judge independently, since correlated exhaustion (agent and Judge both degraded) is a PACE escalation trigger. See MASO Execution Control EC-2.16, EC-2.17 for the full control specification.

3. Persistent Memory Controls¶

For systems that maintain memory across sessions (user preferences, conversation history, learned context):

Control	What It Does
Memory content filtering	Apply guardrails to content before it's written to persistent memory
Memory access control	Only the authenticated user (or authorised system) can read their memory
Memory expiry	Set TTLs on stored memories - not everything should persist forever
Memory audit trail	Log what's written to and read from persistent memory
User memory controls	Users can view, edit, and delete their stored memories
Memory injection prevention	Validate that persistent memories are genuine (from real conversations) not injected
Semantic deduplication	Detect and prevent accumulation of near-duplicate memory entries that could indicate poisoning; semantically similar entries are merged or flagged for review before storage

4. Accumulated Context Evaluation¶

Don't just evaluate individual turns. Periodically evaluate the full conversation context.

Trigger	Action
Every N turns (e.g., 10)	Run the Judge on the full conversation, not just the latest turn
Context window >50% full	Check for context poisoning patterns (repeated instructions, topic drift)
User requests sensitive action	Evaluate the full conversation for manipulation patterns before allowing the action

5. Cross-Session Data Governance¶

Requirement	Implementation
Data classification	Classify persistent memory content with the same scheme used for other data stores
Retention policies	Apply your existing data retention policies to conversation history and memory
Right to deletion	Implement memory deletion that actually deletes - not just soft-delete
Encryption	Encrypt persistent memory at rest and in transit - same controls as any data store

Architecture Patterns¶

Stateless (Recommended for Tier 1–2)¶

No persistent memory. Each conversation starts fresh. Context exists only within the session.

Simplest to secure
No cross-session risks
Users may find it frustrating for repeated tasks

Stateful with Scoped Memory (Tier 2–3)¶

Persistent memory scoped to user, with explicit controls.

Memory is a separate data store with its own access controls
Memory content is filtered before storage and before retrieval
Memory has TTLs and audit trails

Shared Knowledge Base (Tier 3 - requires careful design)¶

Shared embeddings or knowledge that multiple users access (e.g., company FAQ, product documentation).

Shared content must be read-only for end users
Ingestion pipeline is controlled (see RAG Security)
User-specific context is never written to shared stores

Behavioral Learning and Preference Data¶

The controls above address what the model remembers - context windows, persistent memory, shared knowledge. But some systems are designed to learn from user behavior: adapting communication style, building preference profiles, personalising recommendations based on interaction history.

This is a different threat surface. Memory controls govern storage and retrieval. Behavioral learning controls govern what you choose to extract, model, and act on.

The Problem¶

A system that learns customer preferences builds a behavioral profile. Over time, that profile becomes:

Quasi-identifying - Writing style, reading patterns, product preferences, and interaction timing can re-identify a user even without explicit PII
Inferential - The system can infer sensitive attributes (financial situation, health concerns, emotional state) from behavioral signals the user didn't intend to share
Self-reinforcing - Recommendation engines create feedback loops: the system shows you what it thinks you want, you interact with it, and that interaction confirms its model - even if the model was wrong
Poisonable - An adversary can inject false preference signals to manipulate future recommendations (showing a user competing products, biasing pricing, or shifting trust)

Decision Framework¶

Before building a preference-learning system, answer these questions:

Question	If you can't answer clearly, stop
What are you learning, and why?	"User preferences" is too vague. Define exactly which signals you extract (product categories browsed, response length preference, time-of-day patterns) and the business purpose for each
Does the user know?	Transparency isn't optional. The user should understand what the system has learned about them, in language they can read - not a JSON dump
Can the user see, correct, and delete what you've learned?	If you store a preference model, users need the ability to inspect it, dispute incorrect inferences, and request deletion. This is regulatory in many jurisdictions and good practice in all of them
Is the learned data more sensitive than the raw data?	Individual page views are low-sensitivity. An inferred health concern derived from browsing patterns is high-sensitivity. Classify the output of your learning, not just the input
How do you detect preference poisoning?	If an attacker can shift your model of a user by injecting interactions, your recommendation engine becomes an attack surface. Define baselines and anomaly detection for profile changes
What's your feedback loop risk?	If the system recommends → user clicks → system reinforces, you can converge on a narrow model that doesn't reflect the user's actual preferences. Build in diversity or exploration mechanisms

What the Framework Covers¶

Your existing controls from this document and the broader framework apply to the infrastructure of behavioral learning:

Framework control	How it applies to preference learning
Persistent memory controls (Section 3 above)	Preference data is persistent memory. Apply the same controls: TTLs, access scoping, content filtering, injection prevention, user memory controls
Accumulated PII (Threat Model above)	Behavioral profiles are the canonical example. Individual interactions are low-risk; the accumulated profile is high-risk
Cross-session data governance (Section 5 above)	Preference data flows across sessions. Apply the same isolation, classification, and access controls
Data retention (Data Retention Guidance)	Preference data needs retention limits. Define how long you keep learned preferences and how you purge them
Judge evaluation (Controls)	Your Judge can evaluate whether recommendations are appropriate, whether the system is over-personalising, and whether inferred preferences are plausible

What the Framework Does Not Cover¶

The policy decisions - what to learn, when to ask consent, how to explain inferences - are domain-specific. The framework gives you the security and governance infrastructure. You need domain expertise and legal guidance for:

Consent design - What does meaningful consent look like for behavioral learning? (Not "I agree to terms." Granular, revocable, specific.)
Explainability - How do you present a learned preference model to a non-technical user in a way they can understand and act on?
Differential privacy - How do you learn aggregate patterns without exposing individual behavior? (Research-stage for most enterprises, but critical at scale.)
Fairness and bias in recommendations - If your preference model correlates with protected characteristics, your recommendations may discriminate. This is a fairness problem, not just a security problem.

Offramps - Go Here Next¶

Topic	Resource	Why
Profiling under GDPR	ICO Guidance on Profiling and Automated Decision-Making	Defines when behavioral profiling requires explicit consent, a right to object, and human review. Directly applicable if you serve UK/EU users
GDPR transparency requirements	GDPR Articles 13–14 (right to be informed), Article 15 (right of access), Article 22 (automated individual decision-making)	What you must disclose about automated profiling. Your legal team should map these to your preference learning system
CCPA right to know and delete	California Attorney General CCPA Resources	If you serve California residents, they have the right to know what data you've collected (including inferences) and to request deletion
NIST Privacy Framework	NIST Privacy Framework 1.0	Maps privacy risk management to your existing NIST AI RMF alignment. The "Identify-P" and "Control-P" functions apply directly to preference data
Recommendation system fairness	Your AI/ML team's fairness evaluation tools (Fairlearn, AI Fairness 360, What-If Tool)	Test whether your preference model produces discriminatory recommendations. The framework's Judge can flag outliers, but fairness testing requires dedicated tooling
Consent management platforms	Your privacy/compliance team's consent management documentation (OneTrust, Cookiebot, TrustArc, or equivalent)	The mechanism for capturing, storing, and honouring user consent for behavioral learning. Don't build this from scratch

The framework's role: Secure the storage, access, and lifecycle of preference data using existing memory and data protection controls. Detect anomalies in preference profiles. Evaluate recommendation quality through the Judge layer.

Your responsibility: Decide what to learn, get informed consent, provide transparency and user control, and ensure fairness. These are design and policy decisions, not security controls - but they determine whether your security controls are protecting the right things.