Memory Privacy & Data Protection
How to protect personal and organizational memory data in LLM agent systems — covering both academic research and engineering practice.
Research Papers
MEXTRA: Memory Extraction Attacks on LLM Agents
Paper: 2502.13172
Black-box prompt injection that leaks stored private memories from agent systems. The attack constructs adversarial queries that cause the agent to return raw memory content, bypassing access controls. Key findings:
- Attack success rate varies by retrieval mechanism — vector similarity is more vulnerable than keyword-based retrieval
- Multi-turn attacks are harder to detect than single-turn
- Implication for us: every retrieval query must be tenant-scoped at the SQL level, not just the application level. Our `WHERE tenant_type = $1 AND tenant_id = $2` pattern is the correct defense — it prevents cross-tenant leakage regardless of prompt injection
MaRS & FiFA: Privacy-Aware Forgetting Policies
Paper: 2512.12856
Hybrid forgetting policies that balance memory retention with privacy. MaRS (Memory-Augmented Retrieval System) combines time decay with importance weighting, achieving a composite score of 0.911 while preserving privacy constraints.
- Time decay: Older memories get reduced confidence unless frequently accessed
- Importance weighting: High-access, high-confidence memories resist decay
- FiFA (Forgetting-in-Favor-of-Accuracy): Selective forgetting that removes noise while keeping signal
- Implementation mapping: Our consolidation layer directly implements this pattern — `confidence -= 0.15` for stale memories, floor at 0.1, with promotion for high-access memories
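The decay rule above can be sketched as a pure function. The access-count threshold for "high-access" promotion is an assumed parameter for illustration, not a value given in the source:

```typescript
// Sketch of the decay rule: stale memories lose 0.15 confidence,
// floored at 0.1; high-access memories are exempt from decay.
function decayConfidence(
  confidence: number,
  isStale: boolean,
  accessCount: number,
  promotionThreshold = 5 // assumed cutoff for "high-access"
): number {
  if (!isStale || accessCount >= promotionThreshold) return confidence;
  return Math.max(0.1, confidence - 0.15);
}
```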
DMAS Privacy: Distributed Multi-Agent Memory
Paper: 2601.07978
Privacy in distributed multi-agent systems. Compares Mem0 and Graphiti under network constraints and privacy requirements.
- Federated memory: Each agent maintains local memory, sharing only aggregated insights
- Differential privacy for shared learning: Adding calibrated noise when propagating learning across tenants
- Relevance: Our Layer 6 (layered learning) propagates only anonymized aggregates (arm selection counts), never memory content — this is the correct approach per DMAS findings
A-MemGuard: Proactive Memory Poisoning Defense
Paper: 2510.02373
Defense against memory poisoning attacks via consensus validation and dual-memory architecture.
- Consensus validation: New memories must be consistent with existing knowledge before integration
- Dual-memory: Separate working memory (volatile) from long-term memory (validated)
- Anomaly detection: Flag memories that significantly deviate from established patterns
- Potential implementation: Add a validation step to memory creation that checks semantic consistency with existing memories in the same project
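A hedged sketch of such a validation step, checking cosine similarity between a candidate memory's embedding and existing memories in the same project. The 0.3 threshold and the function names are illustrative assumptions:

```typescript
// Sketch: accept a new memory only if its embedding is consistent
// with at least one existing memory in the same project.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function passesConsensus(
  candidate: number[],
  existing: number[][],
  threshold = 0.3 // assumed similarity floor
): boolean {
  if (existing.length === 0) return true; // nothing to validate against yet
  return existing.some((e) => cosine(candidate, e) >= threshold);
}
```

Memories that fail the check could be routed to the quarantined state rather than rejected outright, preserving them for audit.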
Engineering Practices
1. Tenant Isolation
Every SQL query is scoped by (tenant_type, tenant_id). This is enforced at the data access layer, not the application layer — there is no code path that can query memories without tenant scoping.
```sql
-- Every query follows this pattern
SELECT * FROM memories
WHERE tenant_type = $1
  AND tenant_id = $2
  AND ...
```
Cross-tenant data leakage is structurally impossible at the database level.
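One way a data access layer can make the scope non-optional is to require it in the query builder's signature. A minimal sketch — `TenantScope` and `scopedQuery` are illustrative names, not the actual implementation:

```typescript
// Sketch: the tenant scope is a required argument, so no call site
// can construct a memories query without it.
type TenantScope = { tenantType: string; tenantId: string };

function scopedQuery(
  scope: TenantScope,
  extraWhere = ""
): { sql: string; params: string[] } {
  // The tenant predicate is always emitted first; caller conditions
  // are appended and cannot replace it.
  const sql =
    "SELECT * FROM memories WHERE tenant_type = $1 AND tenant_id = $2" +
    (extraWhere ? ` AND ${extraWhere}` : "");
  return { sql, params: [scope.tenantType, scope.tenantId] };
}
```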
2. Encryption at Rest
- Neon Postgres: Storage encrypted with AES-256 by the hosting provider
- Cloudflare R2: Objects encrypted at rest by default
- Application-level encryption: Not yet implemented. Potential improvement: encrypt sensitive memory content with tenant-specific keys before storage
3. Encryption in Transit
All connections use TLS:
- Hyperdrive to Neon Postgres (TLS 1.3)
- Workers to R2 (internal Cloudflare network, encrypted)
- Client to API (HTTPS required)
4. Access Control
Multi-layered authentication:
- Web users: Clerk JWT session tokens, verified against JWKS endpoint
- API clients: Scoped API tokens (Bearer auth)
- MCP clients: OAuth 2.1 PKCE flow with dynamic client registration
- Cron jobs: Internal system identity, no external credentials
5. Memory Lifecycle & Right to Restriction
The `state` field on memories supports GDPR-compatible lifecycle management:
| State | Behavior |
|---|---|
| `active` | Normal retrieval, visible in all queries |
| `quarantined` | Excluded from retrieval, preserved for audit |
| `superseded` | Replaced by a newer memory, kept for history |
Quarantine supports the GDPR "right to restriction of processing" — data remains but is excluded from active use.
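Retrieval then reduces to filtering on state. A sketch, assuming the three state names in the table above:

```typescript
type MemoryState = "active" | "quarantined" | "superseded";

interface Memory {
  id: string;
  state: MemoryState;
}

// Sketch: only active memories are eligible for retrieval;
// quarantined and superseded rows stay in storage for audit/history.
function retrievable(memories: Memory[]): Memory[] {
  return memories.filter((m) => m.state === "active");
}
```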
6. Audit Trail
The `memory_events` table logs every significant action:
- Memory creation, update, deletion
- Derivation (regex and LLM)
- Consolidation (merge, decay, promote)
- Graph extraction
- Arena evaluation
Each event records: actor ID, timestamp, event type, and event data. The table intentionally has no foreign key to memories — audit history outlives the data it describes.
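An illustrative shape for one such event row — the field names are assumptions based on the description above:

```typescript
// Sketch: an audit event references the memory by id inside eventData
// only. With no foreign key, the row survives deletion of the memory.
interface MemoryEvent {
  actorId: string;
  timestamp: string; // ISO 8601
  eventType: "create" | "update" | "delete" | "derive" | "consolidate";
  eventData: Record<string, unknown>;
}

function makeEvent(
  actorId: string,
  eventType: MemoryEvent["eventType"],
  eventData: Record<string, unknown>
): MemoryEvent {
  return { actorId, timestamp: new Date().toISOString(), eventType, eventData };
}
```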
7. PII Detection (Planned)
Not yet implemented. Potential approach:
- Scan memory content on write for PII patterns (email, phone, SSN, credit card)
- Auto-tag detected PII with `tags: ["pii:email", "pii:phone"]`
- Optional: auto-quarantine memories with sensitive PII
- Use regex patterns for structured PII, LLM classification for unstructured
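The regex half of this plan might look like the following sketch. The patterns are simplified illustrations, not production-grade validators (a real card-number check would include a Luhn digit test, for example):

```typescript
// Sketch: structured-PII detection on write via simple patterns,
// returning the tags to attach to the memory.
const PII_PATTERNS: Record<string, RegExp> = {
  "pii:email": /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/,
  "pii:phone": /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/,
  "pii:ssn": /\b\d{3}-\d{2}-\d{4}\b/,
};

function detectPii(content: string): string[] {
  return Object.entries(PII_PATTERNS)
    .filter(([, re]) => re.test(content))
    .map(([tag]) => tag);
}
```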
8. Right to Deletion
Current implementation:
- Soft delete: state changed to `superseded`; memory excluded from retrieval
- Audit tombstone: memory events preserved even after memory deletion
Potential improvement:
- Hard delete cascade: Remove memory + embeddings + entity links, leave only audit tombstone
- Tenant wipe: Delete all data for a tenant (GDPR "right to erasure")
9. Embedding Privacy
Embeddings are nominally one-way projections, but they can still leak information:
- Nearest-neighbor attacks: Querying the embedding space to reconstruct original text
- Membership inference: Determining if specific content exists in the memory store
Mitigations:
- Per-tenant embedding isolation: embeddings are scoped by `(tenant_type, tenant_id)` — vector search cannot cross tenant boundaries
- Differential privacy noise (planned): add calibrated Gaussian noise to embeddings before storage
- Model choice: text-embedding-3-small produces lower-dimensional representations that are harder to invert
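The planned noise step can be sketched with a Box-Muller Gaussian sampler. The `sigma` value here is an illustrative scale only — calibrating it to a formal privacy budget is the hard part and is not shown:

```typescript
// Sketch: add zero-mean Gaussian noise to an embedding before storage.
function gaussianSample(): number {
  // Box-Muller transform from two uniform samples.
  const u1 = Math.random() || Number.MIN_VALUE; // avoid log(0)
  const u2 = Math.random();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

function addNoise(embedding: number[], sigma = 0.01): number[] {
  // sigma is an assumed scale, not a calibrated privacy parameter.
  return embedding.map((x) => x + sigma * gaussianSample());
}
```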
10. Cross-Tenant Learning Safety
Layer 6 (layered learning) propagates knowledge across the hierarchy: global, tenant, project.
Safety guarantees:
- Only anonymized aggregates are propagated (arm selection counts, average confidence scores)
- No memory content, titles, or metadata crosses tenant boundaries
- Learning propagation uses the `platform_metrics` table with `scope` isolation
- A tenant's learning only affects their own new projects' defaults
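One way to express the aggregate-only guarantee is structurally: the payload type that crosses tenant boundaries simply has no field that could carry memory content. A sketch with illustrative type and field names:

```typescript
// Sketch: the only shape propagated across the hierarchy. There is no
// field for titles, content, or metadata, so content leakage is
// prevented by construction rather than by filtering.
interface LearningAggregate {
  scope: "global" | "tenant" | "project";
  armSelectionCounts: Record<string, number>;
  avgConfidence: number;
}

function aggregate(
  scope: LearningAggregate["scope"],
  selections: string[],
  confidences: number[]
): LearningAggregate {
  const counts: Record<string, number> = {};
  for (const arm of selections) counts[arm] = (counts[arm] ?? 0) + 1;
  const avg = confidences.length
    ? confidences.reduce((a, b) => a + b, 0) / confidences.length
    : 0;
  return { scope, armSelectionCounts: counts, avgConfidence: avg };
}
```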
Threat Model
| Threat | Current Mitigation | Gap |
|---|---|---|
| Cross-tenant data leakage | SQL-level tenant scoping | None — structurally prevented |
| Prompt injection memory extraction | Tenant-scoped retrieval | Consider rate limiting on retrieval |
| Memory poisoning | None | Consensus validation (A-MemGuard) |
| Embedding inversion | Per-tenant isolation | Differential privacy noise |
| PII in memory content | None | PII detection on write |
| Insider threat (API key compromise) | Scoped API tokens | Key rotation, usage monitoring |
| Stale sensitive data | Quarantine state | Hard delete cascade |
References
- MEXTRA: arxiv.org/abs/2502.13172
- MaRS/FiFA: arxiv.org/abs/2512.12856
- DMAS Privacy: arxiv.org/abs/2601.07978
- A-MemGuard: arxiv.org/abs/2510.02373
- OWASP Top 10 for LLM Applications: owasp.org/www-project-top-10-for-large-language-model-applications