Memory Privacy & Data Protection
How to protect personal and organizational memory data in LLM agent systems — covering both academic research and engineering practice.
Research Papers
MEXTRA: Memory Extraction Attacks on LLM Agents
Paper: 2502.13172
Black-box prompt injection that leaks stored private memories from agent systems. The attack constructs adversarial queries that cause the agent to return raw memory content, bypassing access controls. Key findings:
- Attack success rate varies by retrieval mechanism — vector similarity is more vulnerable than keyword-based retrieval
- Multi-turn attacks are harder to detect than single-turn
- Implication for us: every retrieval query must be tenant-scoped at the SQL level, not just the application level. Our `WHERE tenant_type = $1 AND tenant_id = $2` pattern is the correct defense — it prevents cross-tenant leakage regardless of prompt injection
MaRS & FiFA: Privacy-Aware Forgetting Policies
Paper: 2512.12856
Hybrid forgetting policies that balance memory retention with privacy. MaRS (Memory-Augmented Retrieval System) combines time decay with importance weighting, achieving a composite score of 0.911 while preserving privacy constraints.
- Time decay: Older memories get reduced confidence unless frequently accessed
- Importance weighting: High-access, high-confidence memories resist decay
- FiFA (Forgetting-in-Favor-of-Accuracy): Selective forgetting that removes noise while keeping signal
- Implementation mapping: Our consolidation layer directly implements this pattern — `confidence -= 0.15` for stale memories, floor at 0.1, with promotion for high-access memories
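The decay rule above can be sketched as a pure function. The access-count threshold for "high-access" promotion is an assumed parameter for illustration, not a value given in the source:

```typescript
// Sketch of the decay rule: stale memories lose 0.15 confidence,
// floored at 0.1; high-access memories are exempt from decay.
function decayConfidence(
  confidence: number,
  isStale: boolean,
  accessCount: number,
  promotionThreshold = 5 // assumed cutoff for "high-access"
): number {
  if (!isStale || accessCount >= promotionThreshold) return confidence;
  return Math.max(0.1, confidence - 0.15);
}
```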
DMAS Privacy: Distributed Multi-Agent Memory
Paper: 2601.07978
Privacy in distributed multi-agent systems. Compares Mem0 and Graphiti under network constraints and privacy requirements.
- Federated memory: Each agent maintains local memory, sharing only aggregated insights
- Differential privacy for shared learning: Adding calibrated noise when propagating learning across tenants
- Relevance: Our Layer 6 (layered learning) propagates only anonymized aggregates (arm selection counts), never memory content — this is the correct approach per DMAS findings
A-MemGuard: Proactive Memory Poisoning Defense
Paper: 2510.02373
Defense against memory poisoning attacks via consensus validation and dual-memory architecture.
- Consensus validation: New memories must be consistent with existing knowledge before integration
- Dual-memory: Separate working memory (volatile) from long-term memory (validated)
- Anomaly detection: Flag memories that significantly deviate from established patterns
- Potential implementation: Add a validation step to memory creation that checks semantic consistency with existing memories in the same project
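A hedged sketch of such a validation step, checking cosine similarity between a candidate memory's embedding and existing memories in the same project. The 0.3 threshold and the function names are illustrative assumptions:

```typescript
// Sketch: accept a new memory only if its embedding is consistent
// with at least one existing memory in the same project.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function passesConsensus(
  candidate: number[],
  existing: number[][],
  threshold = 0.3 // assumed similarity floor
): boolean {
  if (existing.length === 0) return true; // nothing to validate against yet
  return existing.some((e) => cosine(candidate, e) >= threshold);
}
```

Memories that fail the check could be routed to the quarantined state rather than rejected outright, preserving them for audit.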
Engineering Practices
1. Tenant Isolation
Every SQL query is scoped by (tenant_type, tenant_id). This is enforced at the data access layer, not the application layer — there is no code path that can query memories without tenant scoping.
```sql
-- Every query follows this pattern
SELECT * FROM memories
WHERE tenant_type = $1
  AND tenant_id = $2
  AND ...
```
Cross-tenant data leakage is structurally impossible at the database level.
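One way a data access layer can make the scope non-optional is to require it in the query builder's signature. A minimal sketch — `TenantScope` and `scopedQuery` are illustrative names, not the actual implementation:

```typescript
// Sketch: the tenant scope is a required argument, so no call site
// can construct a memories query without it.
type TenantScope = { tenantType: string; tenantId: string };

function scopedQuery(
  scope: TenantScope,
  extraWhere = ""
): { sql: string; params: string[] } {
  // The tenant predicate is always emitted first; caller conditions
  // are appended and cannot replace it.
  const sql =
    "SELECT * FROM memories WHERE tenant_type = $1 AND tenant_id = $2" +
    (extraWhere ? ` AND ${extraWhere}` : "");
  return { sql, params: [scope.tenantType, scope.tenantId] };
}
```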
2. Encryption at Rest
- Neon Postgres: Storage encrypted with AES-256 by the hosting provider
- Cloudflare R2: Objects encrypted at rest by default
- Application-level encryption: Not yet implemented. Potential improvement: encrypt sensitive memory content with tenant-specific keys before storage
3. Encryption in Transit
All connections use TLS:
- Hyperdrive to Neon Postgres (TLS 1.3)
- Workers to R2 (internal Cloudflare network, encrypted)
- Client to API (HTTPS required)
4. Access Control
Multi-layered authentication:
- Web users: Clerk JWT session tokens, verified against JWKS endpoint
- API clients: Scoped API tokens (Bearer auth)
- MCP clients: OAuth 2.1 PKCE flow with dynamic client registration
- Cron jobs: Internal system identity, no external credentials
5. Memory Lifecycle & Right to Restriction
The `state` field on memories supports GDPR-compatible lifecycle management:
| State | Behavior |
|---|---|
| `active` | Normal retrieval, visible in all queries |
| `quarantined` | Excluded from retrieval, preserved for audit |
| `superseded` | Replaced by a newer memory, kept for history |
Quarantine supports the GDPR "right to restriction of processing" — data remains but is excluded from active use.
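Retrieval then reduces to filtering on state. A sketch, assuming the three state names in the table above:

```typescript
type MemoryState = "active" | "quarantined" | "superseded";

interface Memory {
  id: string;
  state: MemoryState;
}

// Sketch: only active memories are eligible for retrieval;
// quarantined and superseded rows stay in storage for audit/history.
function retrievable(memories: Memory[]): Memory[] {
  return memories.filter((m) => m.state === "active");
}
```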
6. Audit Trail
The `memory_events` table logs every significant action:
- Memory creation, update, deletion
- Derivation (regex and LLM)
- Consolidation (merge, decay, promote)
- Graph extraction
- Arena evaluation
Each event records: actor ID, timestamp, event type, and event data. The table intentionally has no foreign key to memories — audit history outlives the data it describes.
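An illustrative shape for one such event row — the field names are assumptions based on the description above:

```typescript
// Sketch: an audit event references the memory by id inside eventData
// only. With no foreign key, the row survives deletion of the memory.
interface MemoryEvent {
  actorId: string;
  timestamp: string; // ISO 8601
  eventType: "create" | "update" | "delete" | "derive" | "consolidate";
  eventData: Record<string, unknown>;
}

function makeEvent(
  actorId: string,
  eventType: MemoryEvent["eventType"],
  eventData: Record<string, unknown>
): MemoryEvent {
  return { actorId, timestamp: new Date().toISOString(), eventType, eventData };
}
```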
7. PII Detection (Planned)
Not yet implemented. Potential approach:
- Scan memory content on write for PII patterns (email, phone, SSN, credit card)
- Auto-tag detected PII with `tags: ["pii:email", "pii:phone"]`
- Optional: auto-quarantine memories with sensitive PII
- Use regex patterns for structured PII, LLM classification for unstructured
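The regex half of this plan might look like the following sketch. The patterns are simplified illustrations, not production-grade validators (a real card-number check would include a Luhn digit test, for example):

```typescript
// Sketch: structured-PII detection on write via simple patterns,
// returning the tags to attach to the memory.
const PII_PATTERNS: Record<string, RegExp> = {
  "pii:email": /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/,
  "pii:phone": /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/,
  "pii:ssn": /\b\d{3}-\d{2}-\d{4}\b/,
};

function detectPii(content: string): string[] {
  return Object.entries(PII_PATTERNS)
    .filter(([, re]) => re.test(content))
    .map(([tag]) => tag);
}
```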
8. Right to Deletion
Current implementation:
- Soft delete: state changed to `superseded`; memory excluded from retrieval
- Audit tombstone: memory events preserved even after memory deletion
Potential improvement:
- Hard delete cascade: Remove memory + embeddings + entity links, leave only audit tombstone
- Tenant wipe: Delete all data for a tenant (GDPR "right to erasure")
9. Embedding Privacy
Embeddings are nominally one-way projections, but they can still leak information:
- Nearest-neighbor attacks: Querying the embedding space to reconstruct original text
- Membership inference: Determining if specific content exists in the memory store
Mitigations:
- Per-tenant embedding isolation: embeddings are scoped by `(tenant_type, tenant_id)` — vector search cannot cross tenant boundaries
- Differential privacy noise (planned): add calibrated Gaussian noise to embeddings before storage
- Model choice: text-embedding-3-small produces lower-dimensional representations that are harder to invert
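The planned noise step can be sketched with a Box-Muller Gaussian sampler. The `sigma` value here is an illustrative scale only — calibrating it to a formal privacy budget is the hard part and is not shown:

```typescript
// Sketch: add zero-mean Gaussian noise to an embedding before storage.
function gaussianSample(): number {
  // Box-Muller transform from two uniform samples.
  const u1 = Math.random() || Number.MIN_VALUE; // avoid log(0)
  const u2 = Math.random();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

function addNoise(embedding: number[], sigma = 0.01): number[] {
  // sigma is an assumed scale, not a calibrated privacy parameter.
  return embedding.map((x) => x + sigma * gaussianSample());
}
```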
10. Cross-Tenant Learning Safety
Layer 6 (layered learning) propagates knowledge across the hierarchy: global, tenant, project.
Safety guarantees:
- Only anonymized aggregates are propagated (arm selection counts, average confidence scores)
- No memory content, titles, or metadata crosses tenant boundaries
- Learning propagation uses the `platform_metrics` table with `scope` isolation
- A tenant's learning only affects their own new projects' defaults
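One way to express the aggregate-only guarantee is structurally: the payload type that crosses tenant boundaries simply has no field that could carry memory content. A sketch with illustrative type and field names:

```typescript
// Sketch: the only shape propagated across the hierarchy. There is no
// field for titles, content, or metadata, so content leakage is
// prevented by construction rather than by filtering.
interface LearningAggregate {
  scope: "global" | "tenant" | "project";
  armSelectionCounts: Record<string, number>;
  avgConfidence: number;
}

function aggregate(
  scope: LearningAggregate["scope"],
  selections: string[],
  confidences: number[]
): LearningAggregate {
  const counts: Record<string, number> = {};
  for (const arm of selections) counts[arm] = (counts[arm] ?? 0) + 1;
  const avg = confidences.length
    ? confidences.reduce((a, b) => a + b, 0) / confidences.length
    : 0;
  return { scope, armSelectionCounts: counts, avgConfidence: avg };
}
```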
Threat Model
| Threat | Current Mitigation | Gap |
|---|---|---|
| Cross-tenant data leakage | SQL-level tenant scoping | None — structurally prevented |
| Prompt injection memory extraction | Tenant-scoped retrieval | Consider rate limiting on retrieval |
| Memory poisoning | None | Consensus validation (A-MemGuard) |
| Embedding inversion | Per-tenant isolation | Differential privacy noise |
| PII in memory content | None | PII detection on write |
| Insider threat (API key compromise) | Scoped API tokens | Key rotation, usage monitoring |
| Stale sensitive data | Quarantine state | Hard delete cascade |
References
- MEXTRA: arxiv.org/abs/2502.13172
- MaRS/FiFA: arxiv.org/abs/2512.12856
- DMAS Privacy: arxiv.org/abs/2601.07978
- A-MemGuard: arxiv.org/abs/2510.02373
- OWASP Top 10 for LLM Applications: owasp.org/www-project-top-10-for-large-language-model-applications