# Architecture
cortex-engine is a cognitive layer that sits between your AI agent and its persistent memory. It handles storage, embeddings, retrieval, and tool routing — all exposed via the Model Context Protocol.
## System Overview

```
Your Agent (Claude, GPT, Gemini, etc.)
        |
        | MCP tools: observe, query, believe, wander, dream...
        v
+------------------------------------------+
|               cortex-engine              |
|                                          |
|  cognitive/  ← FSRS, dream, graph walk   |
|  engines/    ← consolidation, retrieval  |
|  stores/     ← SQLite | Firestore        |
|  providers/  ← embeddings + LLM          |
|  mcp/        ← tool definitions          |
+------------------------------------------+
        |                  |
     Storage          Embeddings
    (SQLite /        (built-in /
     Firestore)       Ollama /
                      OpenAI / Vertex)
```
## Module Layout
| Module | Role |
|--------|------|
| core | Foundational types, config, and shared utilities |
| engines | Cognitive processing: memory consolidation, FSRS, graph traversal |
| stores | Persistence layer — SQLite (local) and Firestore (cloud) |
| mcp | MCP server and tool definitions |
| cognitive | Higher-order cognitive operations: dream, wander, validate |
| triggers | Scheduled and event-driven triggers |
| bridges | Adapters for external services and APIs |
| providers | Embedding and LLM provider implementations |
| bin | Entry points: serve.js (HTTP + MCP), cli.js (admin CLI) |
## Storage Layer

cortex-engine supports two storage backends, switchable via config with no code changes.

### SQLite (default)

Local file storage. Zero config. The database lives at .fozikio/cortex.db by default.

Good for: local development, single-agent setups, offline use.

```
npx fozikio config --store sqlite
```
### Firestore

Cloud-native, real-time sync. Requires a GCP project with Firestore enabled and a service account key.

Good for: production agents, multi-device use, teams sharing agent memory.

```
npx fozikio config --store firestore
```
See Deployment for full Firestore setup.
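The "no code changes" claim implies both backends sit behind one interface. A minimal sketch of that pattern, with hypothetical names (`MemoryStore`, `openStore`, and the in-memory stand-ins are illustrations, not cortex-engine's actual API):

```typescript
// Both stores implement one interface, so the engine never branches
// on which backend is active; config alone decides.
interface MemoryStore {
  put(id: string, text: string): void;
  get(id: string): string | undefined;
}

class SqliteStore implements MemoryStore {
  private rows = new Map<string, string>(); // stand-in for the SQLite file
  put(id: string, text: string) { this.rows.set(id, text); }
  get(id: string) { return this.rows.get(id); }
}

class FirestoreStore implements MemoryStore {
  private docs = new Map<string, string>(); // stand-in for Firestore documents
  put(id: string, text: string) { this.docs.set(id, text); }
  get(id: string) { return this.docs.get(id); }
}

// Config selects the backend; callers see only MemoryStore.
function openStore(kind: "sqlite" | "firestore"): MemoryStore {
  return kind === "sqlite" ? new SqliteStore() : new FirestoreStore();
}
```

Swapping `--store sqlite` for `--store firestore` then changes which class `openStore` returns and nothing else.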
## Embedding Providers
Embeddings power semantic search — they turn text into vectors that can be compared by meaning, not just keywords.
| Provider | Quality | Setup | Cost |
|----------|---------|-------|------|
| built-in | Good | Zero config | Free |
| ollama | Better | Ollama installed locally | Free |
| openai | Excellent | API key | Pay per use |
| vertex | Excellent | GCP project | Pay per use |
The built-in provider uses all-MiniLM-L6-v2, bundled with the package — no external dependencies. It's the default and works well for most use cases.
Ollama is the recommended upgrade path. Pull nomic-embed-text or mxbai-embed-large for noticeably better semantic search.
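"Compared by meaning, not just keywords" comes down to cosine similarity between embedding vectors. A toy sketch (the three-dimensional vectors are fabricated stand-ins; real embeddings have hundreds of dimensions):

```typescript
// Cosine similarity compares vectors by direction: texts with related
// meaning map to vectors pointing roughly the same way, even when they
// share no keywords.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Toy vectors: "cat" and "kitten" point the same way; "invoice" doesn't.
const cat = [0.9, 0.1, 0.0];
const kitten = [0.8, 0.2, 0.1];
const invoice = [0.0, 0.1, 0.95];
console.log(cosine(cat, kitten) > cosine(cat, invoice)); // true
```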
## FSRS Scheduling
Memories don't decay uniformly. cortex-engine uses FSRS (Free Spaced Repetition Scheduler) — the same algorithm used by Anki — to manage salience scores.
Key behaviors:
- New observations start with moderate salience
- Retrieved memories get a salience boost (like practicing a flashcard)
- Unretrieved memories decay over time — but the decay rate depends on consolidation state
- Consolidated memories (post-dream) decay slower than raw observations
This means your agent's memory works like human memory: frequently used knowledge stays accessible, dormant knowledge fades but isn't gone.
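The four behaviors above can be sketched in a few lines. This is an illustration of the decay/boost dynamics, not the actual FSRS implementation; the `stability` field, decay curve, and the 0.3/1.5 constants are assumptions for demonstration:

```typescript
interface Memory {
  salience: number;   // 0..1, how retrievable the memory currently is
  stability: number;  // higher = slower decay; grows with consolidation
}

// Unretrieved memories decay exponentially; stability sets the rate.
function decay(m: Memory, days: number): Memory {
  return { ...m, salience: m.salience * Math.exp(-days / m.stability) };
}

// Retrieval acts like reviewing a flashcard: boost salience, grow stability.
function onRetrieve(m: Memory): Memory {
  return {
    salience: Math.min(1, m.salience + 0.3),
    stability: m.stability * 1.5,
  };
}

const raw: Memory = { salience: 0.5, stability: 5 };           // raw observation
const consolidated: Memory = { salience: 0.5, stability: 30 }; // post-dream
// Same idle period, but the consolidated memory retains far more salience.
console.log(decay(raw, 10).salience < decay(consolidated, 10).salience); // true
```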
## Dream Consolidation
dream() is the most important long-running operation. It runs a two-phase consolidation pipeline modeled on biological sleep:
### NREM Phase — Compression
- Cluster — group semantically similar observations using locally-adaptive thresholds that respect embedding space curvature (information geometry)
- Refine — within each cluster, identify the most salient representative observations
- Create — generate consolidated memory nodes that abstract across the cluster
### REM Phase — Integration
- Connect — link new consolidated memories to existing ones via graph edges
- Score — update salience scores based on connection density and goal alignment
- Abstract — create higher-order abstractions where meaningful patterns emerge
The result: individual observations become interconnected knowledge. An agent that runs dream() regularly develops genuine expertise over time.
```
# Run consolidation manually
dream()

# Or via CLI
npx fozikio maintain fix
```
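The NREM cluster step can be sketched with a greedy similarity grouping. This is a simplification: a single global `threshold` stands in for the locally-adaptive, curvature-aware thresholds described above, and the two-dimensional vectors are toy stand-ins:

```typescript
type Vec = number[];

function cos(a: Vec, b: Vec): number {
  let d = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    d += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  return d / (Math.sqrt(na) * Math.sqrt(nb));
}

// Cluster: assign each observation to the first cluster whose founding
// member is similar enough; otherwise it founds a new cluster.
function cluster(obs: Vec[], threshold = 0.8): Vec[][] {
  const clusters: Vec[][] = [];
  for (const o of obs) {
    const home = clusters.find(c => cos(c[0], o) >= threshold);
    if (home) home.push(o); else clusters.push([o]);
  }
  return clusters;
}

const observations: Vec[] = [
  [1, 0], [0.95, 0.05], // two near-duplicate observations
  [0, 1],               // an unrelated one
];
console.log(cluster(observations).length); // 2
```

The refine and create steps would then pick the most salient members of each cluster and emit one consolidated node per cluster.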
## Graph Structure and Retrieval
Memories are stored as a graph, not a flat list. Every observation is a node; link() and the dream process create edges between related nodes.
### Spreading Activation
When you call query(), cortex-engine doesn't just find the most similar vectors. It runs spreading activation from the matched nodes — traversing the graph to surface connected memories that wouldn't rank highly on vector similarity alone.
This is how wander() works: it starts from a seed node and follows edges through the memory graph, surfacing dormant connections and unexpected neighbors.
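A minimal sketch of spreading activation over an adjacency list (the decay rate and hop count are assumed parameters, not cortex-engine's actual values):

```typescript
// Activation starts at vector-matched seed nodes and propagates along
// edges with per-hop decay, so connected memories surface even when
// their own vector similarity to the query is low.
function spread(
  edges: Record<string, string[]>,
  seeds: Record<string, number>, // node -> initial activation
  decayRate = 0.5,
  hops = 2,
): Record<string, number> {
  const act: Record<string, number> = { ...seeds };
  let frontier = { ...seeds };
  for (let h = 0; h < hops; h++) {
    const next: Record<string, number> = {};
    for (const [node, a] of Object.entries(frontier)) {
      for (const nb of edges[node] ?? []) {
        next[nb] = (next[nb] ?? 0) + a * decayRate;
      }
    }
    for (const [node, a] of Object.entries(next)) {
      act[node] = (act[node] ?? 0) + a;
    }
    frontier = next;
  }
  return act;
}

// "paper" matches the query; "conference" and "city" surface via edges.
const graph = { paper: ["author", "conference"], conference: ["city"] };
const result = spread(graph, { paper: 1.0 });
console.log(result.conference); // 0.5  (one hop from the seed)
console.log(result.city);       // 0.25 (two hops)
```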
### Thousand Brains Voting
For complex queries, cortex-engine uses multi-anchor Thousand Brains voting: multiple anchor points in the graph each cast weighted votes for candidate memories. The final ranking reflects consensus across anchors, not just proximity to a single query vector.
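The voting mechanics can be sketched as a weighted tally (the data shapes and weights here are illustrative assumptions):

```typescript
// Each anchor scores candidates from its own vantage point; weighted
// votes are summed, so the final ranking reflects consensus rather than
// proximity to any single query vector.
interface Vote { candidate: string; score: number }

function tallyVotes(
  anchorVotes: { weight: number; votes: Vote[] }[],
): [string, number][] {
  const totals = new Map<string, number>();
  for (const { weight, votes } of anchorVotes) {
    for (const { candidate, score } of votes) {
      totals.set(candidate, (totals.get(candidate) ?? 0) + weight * score);
    }
  }
  return Array.from(totals.entries()).sort((x, y) => y[1] - x[1]);
}

// "m2" is no anchor's top pick, but wins on consensus across both anchors.
const ranking = tallyVotes([
  { weight: 1.0, votes: [{ candidate: "m1", score: 0.9 }, { candidate: "m2", score: 0.8 }] },
  { weight: 1.0, votes: [{ candidate: "m3", score: 0.9 }, { candidate: "m2", score: 0.8 }] },
]);
console.log(ranking[0][0]); // "m2"
```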
### GNN Neighborhood Aggregation
Deep retrieval uses graph neural network-style neighborhood aggregation — a memory's effective representation includes weighted contributions from its neighbors, making retrieval context-aware.
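One aggregation round looks roughly like this. The `alpha` mixing knob and edge weights are assumptions for illustration; real GNN layers also apply learned transformations, which this sketch omits:

```typescript
// A memory's effective embedding blends its own vector with a weighted
// mean of its neighbors' vectors, so retrieval sees the node in context.
function aggregate(
  self: number[],
  neighbors: { vec: number[]; weight: number }[],
  alpha = 0.7, // how much of the node's own vector to keep
): number[] {
  const totalW = neighbors.reduce((s, n) => s + n.weight, 0);
  return self.map((v, i) => {
    const nbMean = totalW === 0
      ? 0
      : neighbors.reduce((s, n) => s + n.weight * n.vec[i], 0) / totalW;
    return alpha * v + (1 - alpha) * nbMean;
  });
}

// A node near strongly-weighted neighbors drifts toward their meaning.
const effective = aggregate([1, 0], [
  { vec: [0, 1], weight: 2 },
  { vec: [0.5, 0.5], weight: 1 },
]);
console.log(effective); // [0.75, 0.25]
```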
## Belief Tracking
believe() stores positions, not just facts. Beliefs differ from observations in two key ways:
- Contradiction detection — when new evidence contradicts a belief, cortex-engine flags it rather than silently overwriting
- Temporal tracking — beliefs have a history. You can see when a position was held and what changed it
validate() checks a claim against all existing memories and beliefs, returning supporting and contradicting evidence with confidence scores.
predict() generates forward predictions from current beliefs — useful for testing whether an agent's worldview is consistent.
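The two belief behaviors can be sketched with a revision log. The `polarity` field is a deliberate simplification: the real system would infer contradiction from content, and all type and field names here are hypothetical:

```typescript
// A belief keeps its full revision history; a contradicting revision is
// flagged rather than silently overwriting the prior position.
interface Revision { statement: string; polarity: 1 | -1; at: number }

interface Belief {
  topic: string;
  current: Revision;
  history: Revision[];
  contradicted: boolean;
}

function revise(b: Belief, next: Revision): Belief {
  return {
    ...b,
    current: next,
    history: [...b.history, b.current], // temporal tracking: nothing is lost
    contradicted: b.contradicted || next.polarity !== b.current.polarity,
  };
}

let belief: Belief = {
  topic: "deploy-on-friday",
  current: { statement: "Friday deploys are fine", polarity: 1, at: 1 },
  history: [],
  contradicted: false,
};
belief = revise(belief, { statement: "Friday deploys caused incidents", polarity: -1, at: 2 });
console.log(belief.contradicted, belief.history.length); // true 1
```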
## Goal-Directed Cognition
goal_set() creates a desired future state in the memory graph. Goals have a special property: they generate forward prediction error — a signal that biases dream consolidation and wander() toward memories and connections relevant to the goal.
Practically: an agent with an active goal will surface more goal-relevant memories during consolidation, and wander() will tend toward goal-adjacent territory.
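One way to picture the bias is as a score adjustment during ranking. This is a guess at the mechanism, not cortex-engine's actual formula; the 0.3 bias weight and function name are invented for illustration:

```typescript
// Candidates similar to an active goal's embedding get a bonus, so
// consolidation and wander() drift toward goal-relevant territory.
function goalBiasedScore(
  baseScore: number,
  candidate: number[],
  goal: number[] | null,
  bias = 0.3,
): number {
  if (!goal) return baseScore; // no active goal: ranking is unchanged
  let dot = 0, nc = 0, ng = 0;
  for (let i = 0; i < candidate.length; i++) {
    dot += candidate[i] * goal[i];
    nc += candidate[i] ** 2;
    ng += goal[i] ** 2;
  }
  const sim = dot / (Math.sqrt(nc) * Math.sqrt(ng));
  return baseScore + bias * sim;
}

// With a goal set, a goal-aligned memory overtakes a slightly stronger one.
const goal = [1, 0];
console.log(goalBiasedScore(0.5, [1, 0], goal) > goalBiasedScore(0.55, [0, 1], goal)); // true
```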
## Graph Health

cortex-engine tracks two graph-level health metrics:

- Fiedler value (algebraic connectivity) — measures how well-integrated the knowledge graph is. A low Fiedler value means the graph has isolated clusters; knowledge isn't connecting. The health CLI command reports this.
- PE saturation — prediction error saturation. If an agent's beliefs stop generating meaningful prediction errors, it's a sign the identity model has become too rigid. The vitals_get() tool surfaces this signal.
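The Fiedler value is a standard quantity: the second-smallest eigenvalue of the graph Laplacian L = D - A, which is 0 exactly when the graph is disconnected and grows as the graph becomes better integrated. A self-contained sketch for a tiny undirected graph (how cortex-engine actually computes it is not documented here; this uses power iteration on c·I - L with the trivial all-ones eigenvector projected out):

```typescript
function fiedlerValue(n: number, edges: [number, number][]): number {
  // Build the Laplacian L = D - A.
  const L = Array.from({ length: n }, () => new Array(n).fill(0));
  for (const [i, j] of edges) {
    L[i][i]++; L[j][j]++; L[i][j]--; L[j][i]--;
  }
  // c exceeds lambda_max(L) (bounded by 2 * max degree), so the
  // eigenvalues of M = c*I - L are positive and order-reversed.
  const c = 2 * Math.max(...L.map((row, i) => row[i])) + 1;
  let v = Array.from({ length: n }, (_, i) => i + 1);
  for (let iter = 0; iter < 500; iter++) {
    // Project out the all-ones vector (L's eigenvalue-0 eigenvector),
    // so iteration converges to the next eigenvector instead.
    const mean = v.reduce((s, x) => s + x, 0) / n;
    v = v.map(x => x - mean);
    // One power-iteration step: v <- M v = c*v - L*v, then normalize.
    const w = L.map(row => row.reduce((s, lij, j) => s + lij * v[j], 0));
    v = v.map((x, i) => c * x - w[i]);
    const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
    v = v.map(x => x / norm);
  }
  // Rayleigh quotient of the converged unit vector gives lambda_2.
  const Lv = L.map(row => row.reduce((s, lij, j) => s + lij * v[j], 0));
  return v.reduce((s, x, i) => s + x * Lv[i], 0);
}

// Path 0-1-2: Laplacian eigenvalues are 0, 1, 3, so the Fiedler value is 1.
console.log(fiedlerValue(3, [[0, 1], [1, 2]]).toFixed(2)); // "1.00"
```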
## Namespaces
Every agent operates in a namespace — a scoped partition of the memory store. This is how multi-agent isolation works:
- SQLite: separate tables per namespace within the same database file
- Firestore: separate document collections per namespace
Agents share the same embedding engine and LLM provider configuration but have completely independent memory graphs.
```
# Serve a specific agent namespace
npx fozikio serve --agent researcher
```
## Safety Rules (Reflex)
cortex-engine ships with Reflex rules — portable YAML guardrails that install into your workspace:
| Rule | When It Fires | What It Does |
|------|--------------|-------------|
| cognitive-grounding | prompt submit | Nudges the agent to call query() before evaluation, design, or creation |
| observe-first | file write/edit | Warns if writing to memory directories without querying first |
| note-about-doing | prompt submit | Suggests capturing new threads of thought with thread_create() |
Rules live in reflex-rules/ as standard YAML. They work with Claude Code, Cursor, Codex, or any Reflex-compatible runtime.