Zep / Graphiti: Temporal Knowledge Graphs for Agent Memory

Zep's Graphiti is a temporal knowledge graph system designed for agent memory. Its standout feature is that it makes no LLM calls at retrieval time, achieving a P95 latency of roughly 300ms for memory lookups.

Core Architecture

Graphiti builds a knowledge graph where:

  • Nodes are entities (people, concepts, projects, tools).
  • Edges are relationships with temporal metadata (when the relationship was established, modified, or invalidated).
  • Episodes are conversation segments that sourced the knowledge.

Ingestion (Write Path - uses LLM)

  1. Extract entities and relationships from conversation using an LLM.
  2. Resolve entities against existing graph (deduplication + merging).
  3. Add temporal metadata (valid_from, valid_to, source_episode).
  4. Update graph edges and invalidate contradicted facts.
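Steps 2-4 can be sketched as follows, assuming step 1's LLM extraction has already produced (subject, predicate, object) triples. The function and field names are hypothetical, and for simplicity each (subject, predicate) pair is treated as single-valued, so a new object invalidates the old fact:

```python
from datetime import datetime, timezone

def ingest(triples, graph, episode_id):
    """Merge extracted (subject, predicate, object) triples into the graph."""
    now = datetime.now(timezone.utc)
    for subj, pred, obj in triples:
        # Step 2: entity resolution, deduping by normalized name.
        for name in (subj, obj):
            graph["entities"].setdefault(name.lower(), {"name": name})
        # Step 4: invalidate contradicted facts. A new object for the same
        # (subject, predicate) closes the old edge's validity window.
        for edge in graph["edges"]:
            if (edge["subj"] == subj.lower() and edge["pred"] == pred
                    and edge["valid_to"] is None
                    and edge["obj"] != obj.lower()):
                edge["valid_to"] = now
        # Step 3: attach temporal metadata and episode provenance.
        graph["edges"].append({
            "subj": subj.lower(), "pred": pred, "obj": obj.lower(),
            "valid_from": now, "valid_to": None,
            "source_episode": episode_id,
        })
```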

Retrieval (Read Path - no LLM)

  1. Parse the query into entity mentions using lightweight NLP (no LLM).
  2. Traverse the graph from matched entities.
  3. Filter by temporal validity (only return currently-valid facts).
  4. Rank by relevance and recency.
  5. Return structured results with provenance.
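A sketch of this read path, with plain token matching standing in for the lightweight NLP step, and edges represented as dicts carrying the temporal fields from the write path (all names illustrative):

```python
from datetime import datetime, timezone

def retrieve(query, edges, at_time=None):
    """LLM-free lookup: match entities, traverse, filter by validity, rank."""
    at_time = at_time or datetime.now(timezone.utc)
    mentions = {w.lower() for w in query.split()}      # 1. entity mentions
    hits = []
    for e in edges:                                    # 2. graph traversal
        if e["subj"] in mentions or e["obj"] in mentions:
            # 3. temporal filter: keep only facts valid at at_time
            if e["valid_from"] <= at_time and (
                    e["valid_to"] is None or at_time < e["valid_to"]):
                hits.append(e)
    # 4. rank (recency only here; a real ranker would also score relevance)
    hits.sort(key=lambda e: e["valid_from"], reverse=True)
    # 5. structured results with provenance
    return [{"fact": (e["subj"], e["pred"], e["obj"]),
             "since": e["valid_from"],
             "episode": e["source_episode"]} for e in hits]
```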

Key Design Decisions

  • LLM at write, not read: Expensive extraction happens once during ingestion; retrieval is pure graph traversal.
  • Temporal validity: Facts have time ranges, enabling "what was true at time T?" queries.
  • Episode-level provenance: Every fact traces back to the conversation that created it.
  • Incremental updates: New information updates the graph; it doesn't rebuild it.
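The temporal-validity design makes "what was true at time T?" an interval-containment check rather than an LLM call. A minimal illustration with made-up facts:

```python
from datetime import datetime, timezone

def valid_at(fact, t):
    """A fact holds at time t if t falls inside its validity window."""
    return fact["valid_from"] <= t and (
        fact["valid_to"] is None or t < fact["valid_to"])

# Hypothetical facts: Alice moved from ProjectX to ProjectY on 2024-06-01.
facts = [
    {"fact": "alice works_on projectx",
     "valid_from": datetime(2024, 1, 1, tzinfo=timezone.utc),
     "valid_to": datetime(2024, 6, 1, tzinfo=timezone.utc)},
    {"fact": "alice works_on projecty",
     "valid_from": datetime(2024, 6, 1, tzinfo=timezone.utc),
     "valid_to": None},
]

march = datetime(2024, 3, 1, tzinfo=timezone.utc)
print([f["fact"] for f in facts if valid_at(f, march)])
# prints ['alice works_on projectx']
```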

Performance

  • P95 retrieval latency: ~300ms (no LLM in the loop)
  • Graph queries: Sub-second even on large graphs
  • Write latency: Higher (1-3s) due to LLM extraction, but acceptable for background processing

Relevance to Memory Platform

Graphiti's approach informs several platform design decisions:

  • Separate write and read complexity: Our API can do expensive processing on create/update but keep retrieval fast.
  • Temporal awareness: Our created_at/updated_at + confidence scores serve a similar purpose; a graph layer could enhance this.
  • Provenance: Our session_id and source_type fields provide episode-level tracking.
  • No-LLM retrieval: Our FTS/SQL retrieval path is already LLM-free; the agent decides what to search for.
