Architecture

cortex-engine is a cognitive layer that sits between your AI agent and its persistent memory. It handles storage, embeddings, retrieval, and tool routing — all exposed via the Model Context Protocol.

System Overview

Your Agent (Claude, GPT, Gemini, etc.)
    |
    | MCP tools: observe, query, believe, wander, dream...
    v
+------------------------------------------+
|           cortex-engine                  |
|                                          |
|  cognitive/   ← FSRS, dream, graph walk  |
|  engines/     ← consolidation, retrieval |
|  stores/      ← SQLite | Firestore       |
|  providers/   ← embeddings + LLM         |
|  mcp/         ← tool definitions         |
+------------------------------------------+
         |               |
    Storage          Embeddings
   (SQLite /         (built-in /
   Firestore)         Ollama /
                    OpenAI / Vertex)

Module Layout

| Module | Role |
|--------|------|
| core | Foundational types, config, and shared utilities |
| engines | Cognitive processing: memory consolidation, FSRS, graph traversal |
| stores | Persistence layer — SQLite (local) and Firestore (cloud) |
| mcp | MCP server and tool definitions |
| cognitive | Higher-order cognitive operations: dream, wander, validate |
| triggers | Scheduled and event-driven triggers |
| bridges | Adapters for external services and APIs |
| providers | Embedding and LLM provider implementations |
| bin | Entry points: serve.js (HTTP + MCP), cli.js (admin CLI) |

Storage Layer

cortex-engine supports two storage backends, switchable via config with no code changes.

SQLite (default)

Local file storage. Zero config. The database lives at .fozikio/cortex.db by default.

Good for: local development, single-agent setups, offline use.

npx fozikio config --store sqlite

Firestore

Cloud-native, real-time sync. Requires a GCP project with Firestore enabled and a service account key.

Good for: production agents, multi-device use, teams sharing agent memory.

npx fozikio config --store firestore

See Deployment for full Firestore setup.

Embedding Providers

Embeddings power semantic search — they turn text into vectors that can be compared by meaning, not just keywords.

| Provider | Quality | Setup | Cost |
|----------|---------|-------|------|
| built-in | Good | Zero config | Free |
| ollama | Better | Ollama installed locally | Free |
| openai | Excellent | API key | Pay per use |
| vertex | Excellent | GCP project | Pay per use |

Built-in uses all-MiniLM-L6-v2 bundled with the package — no external dependencies. It's the default and works well for most use cases.

Ollama is the recommended upgrade path. Pull nomic-embed-text or mxbai-embed-large for noticeably better semantic search.
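To make "compared by meaning" concrete, here is a minimal cosine-similarity sketch — illustrative only, not cortex-engine's internal code:

```javascript
// Cosine similarity: 1.0 means same direction (same meaning),
// 0.0 means orthogonal (unrelated).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings" (real ones have 384+ dimensions).
console.log(cosineSimilarity([1, 0, 0], [1, 0, 0])); // 1
console.log(cosineSimilarity([1, 0, 0], [0, 1, 0])); // 0
```

Higher-quality providers produce embeddings whose similarity scores better track actual meaning; the comparison math stays the same.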

FSRS Scheduling

Memories don't decay uniformly. cortex-engine uses FSRS (Free Spaced Repetition Scheduler) — the same algorithm used by Anki — to manage salience scores.

Key behaviors:

  • New observations start with moderate salience
  • Retrieved memories get a salience boost (like practicing a flashcard)
  • Unretrieved memories decay over time — but the decay rate depends on consolidation state
  • Consolidated memories (post-dream) decay slower than raw observations

This means your agent's memory works like human memory: frequently used knowledge stays accessible, dormant knowledge fades but isn't gone.
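A toy illustration of these retention dynamics (heavily simplified — real FSRS uses fitted parameters and a more involved stability update, and these numbers are hypothetical):

```javascript
// Simplified FSRS-style salience: retrievability decays with time,
// scaled by a stability value that grows on each successful retrieval.
function retrievability(daysSinceReview, stability) {
  // Exponential forgetting curve: higher stability → slower decay.
  return Math.exp(-daysSinceReview / stability);
}

function onRetrieved(memory) {
  // Retrieval boosts stability, like practicing a flashcard.
  return { ...memory, stability: memory.stability * 1.5, daysSinceReview: 0 };
}

const raw = { stability: 2, daysSinceReview: 0 };           // raw observation
const consolidated = { stability: 10, daysSinceReview: 0 }; // post-dream memory

// After 7 idle days, the consolidated memory retains far more salience.
console.log(retrievability(7, raw.stability).toFixed(2));          // "0.03"
console.log(retrievability(7, consolidated.stability).toFixed(2)); // "0.50"
```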

Dream Consolidation

dream() is the most important long-running operation. It runs a two-phase consolidation pipeline modeled on biological sleep:

NREM Phase — Compression

  1. Cluster — group semantically similar observations using locally-adaptive thresholds that respect embedding space curvature (information geometry)
  2. Refine — within each cluster, identify the most salient representative observations
  3. Create — generate consolidated memory nodes that abstract across the cluster
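The clustering step might be sketched as a greedy single-pass grouping with a fixed cosine threshold — a simplification; the locally-adaptive thresholds from information geometry are not shown:

```javascript
// Greedy clustering: each observation joins the first cluster whose
// seed is similar enough, otherwise it starts a new cluster.
function cosine(a, b) {
  const dot = a.reduce((s, v, i) => s + v * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

function cluster(embeddings, threshold = 0.8) {
  const clusters = [];
  for (const e of embeddings) {
    const home = clusters.find((c) => cosine(c[0], e) >= threshold);
    if (home) home.push(e);
    else clusters.push([e]); // seed a new cluster with this observation
  }
  return clusters;
}

// Two tight groups of toy vectors → two clusters.
const groups = cluster([[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]]);
console.log(groups.length); // 2
```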

REM Phase — Integration

  1. Connect — link new consolidated memories to existing ones via graph edges
  2. Score — update salience scores based on connection density and goal alignment
  3. Abstract — create higher-order abstractions where meaningful patterns emerge

The result: individual observations become interconnected knowledge. An agent that runs dream() regularly develops genuine expertise over time.

# Run consolidation manually
dream()

# Or via CLI
npx fozikio maintain fix

Graph Structure and Retrieval

Memories are stored as a graph, not a flat list. Every observation is a node; link() and the dream process create edges between related nodes.

Spreading Activation

When you call query(), cortex-engine doesn't just find the most similar vectors. It runs spreading activation from the matched nodes — traversing the graph to surface connected memories that wouldn't rank highly on vector similarity alone.

This is how wander() works: it starts from a seed node and follows edges through the memory graph, surfacing dormant connections and unexpected neighbors.
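In outline, spreading activation looks like this (illustrative sketch; the decay and floor values are hypothetical):

```javascript
// Spreading activation: start with energy at seed nodes and let it
// flow along edges with decay; any node it reaches is "surfaced".
function spread(graph, seeds, { decay = 0.5, floor = 0.1 } = {}) {
  const activation = new Map(seeds.map((s) => [s, 1.0]));
  const queue = [...seeds];
  while (queue.length > 0) {
    const node = queue.shift();
    const energy = activation.get(node) * decay;
    if (energy < floor) continue; // too weak to propagate further
    for (const neighbor of graph[node] ?? []) {
      if ((activation.get(neighbor) ?? 0) < energy) {
        activation.set(neighbor, energy);
        queue.push(neighbor);
      }
    }
  }
  return activation;
}

// Toy memory graph: a → b → c, plus an isolated node d.
const graph = { a: ['b'], b: ['a', 'c'], c: ['b'], d: [] };
const result = spread(graph, ['a']);
console.log([...result.keys()]); // [ 'a', 'b', 'c' ]
```

Note that c is surfaced even though it shares no edge with the seed — it is reached through b, which is the point of graph traversal over pure vector ranking.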

Thousand Brains Voting

For complex queries, cortex-engine uses multi-anchor Thousand Brains voting: multiple anchor points in the graph each cast weighted votes for candidate memories. The final ranking reflects consensus across anchors, not just proximity to a single query vector.
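A sketch of how multi-anchor voting might work in outline (the anchor weights and scoring function here are hypothetical):

```javascript
// Each anchor scores every candidate; the final rank is the weighted
// sum of votes across anchors, not one query vector's opinion.
function rankByVotes(anchors, candidates, score) {
  const totals = new Map(candidates.map((c) => [c, 0]));
  for (const anchor of anchors) {
    for (const candidate of candidates) {
      totals.set(candidate,
        totals.get(candidate) + anchor.weight * score(anchor, candidate));
    }
  }
  return [...totals.entries()].sort((a, b) => b[1] - a[1]);
}

// Toy score: 1 if the candidate is in the anchor's neighborhood, else 0.
const anchors = [
  { weight: 2, near: new Set(['m1', 'm2']) },
  { weight: 1, near: new Set(['m2', 'm3']) },
];
const ranked = rankByVotes(anchors, ['m1', 'm2', 'm3'],
  (a, c) => (a.near.has(c) ? 1 : 0));
console.log(ranked[0][0]); // 'm2' — the consensus pick across both anchors
```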

GNN Neighborhood Aggregation

Deep retrieval uses graph neural network-style neighborhood aggregation — a memory's effective representation includes weighted contributions from its neighbors, making retrieval context-aware.
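In sketch form, a memory's effective vector might blend its own embedding with the mean of its neighbors' embeddings (the mixing weight here is hypothetical):

```javascript
// One round of neighborhood aggregation: the effective representation
// is a blend of the node's own embedding and its neighbors' mean.
function aggregate(embedding, neighborEmbeddings, alpha = 0.75) {
  if (neighborEmbeddings.length === 0) return embedding.slice();
  return embedding.map((v, i) => {
    const neighborMean =
      neighborEmbeddings.reduce((s, n) => s + n[i], 0) /
      neighborEmbeddings.length;
    return alpha * v + (1 - alpha) * neighborMean;
  });
}

// A node linked to neighbors in another region of embedding space
// has its effective vector pulled toward that region.
console.log(aggregate([1, 0], [[0, 1], [0, 1]])); // [ 0.75, 0.25 ]
```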

Belief Tracking

believe() stores positions, not just facts. Beliefs differ from observations in two key ways:

  1. Contradiction detection — when new evidence contradicts a belief, cortex-engine flags it rather than silently overwriting
  2. Temporal tracking — beliefs have a history. You can see when a position was held and what changed it
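A sketch of what temporal tracking implies for the data shape (field names are illustrative, not cortex-engine's actual schema):

```javascript
// A belief keeps its revision history instead of overwriting in place.
function createBelief(position, evidence) {
  return { position, evidence, history: [], heldSince: Date.now() };
}

function revise(belief, newPosition, newEvidence) {
  return {
    position: newPosition,
    evidence: newEvidence,
    // The previous position is archived, not lost.
    history: [...belief.history,
      { position: belief.position, heldSince: belief.heldSince }],
    heldSince: Date.now(),
  };
}

let b = createBelief('SQLite is enough for us', 'single-agent local setup');
b = revise(b, 'We need Firestore', 'team now shares agent memory');
console.log(b.history.length); // 1 — the old position is still inspectable
```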

validate() checks a claim against all existing memories and beliefs, returning supporting and contradicting evidence with confidence scores.

predict() generates forward predictions from current beliefs — useful for testing whether an agent's worldview is consistent.

Goal-Directed Cognition

goal_set() creates a desired future state in the memory graph. Goals have a special property: they generate forward prediction error — a signal that biases dream consolidation and wander() toward memories and connections relevant to the goal.

Practically: an agent with an active goal will surface more goal-relevant memories during consolidation, and wander() will tend toward goal-adjacent territory.

Graph Health

cortex-engine tracks two graph-level health metrics:

  • Fiedler value (algebraic connectivity) — measures how well-integrated the knowledge graph is. A low Fiedler value means the graph has isolated clusters: knowledge isn't connecting. The health CLI command reports this metric.

  • PE saturation — prediction error saturation. If an agent's beliefs stop generating meaningful prediction errors, it's a sign the identity model has become too rigid. The vitals_get() tool surfaces this signal.
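For intuition, the Fiedler value is the second-smallest eigenvalue of the graph Laplacian. A small sketch that estimates it by power iteration on a shifted Laplacian — a numerical illustration, not cortex-engine's implementation:

```javascript
// Fiedler value = second-smallest eigenvalue of the graph Laplacian
// L = D - A. A value near 0 means the graph splits into disconnected
// clusters; larger values mean better-integrated knowledge.
function fiedlerValue(adjacency, iterations = 500) {
  const n = adjacency.length;
  const degree = adjacency.map((row) => row.reduce((s, v) => s + v, 0));
  const c = 2 * Math.max(...degree); // shift so M = cI - L is positive
  // Apply M = cI - D + A to a vector.
  const applyM = (v) => v.map((_, i) =>
    (c - degree[i]) * v[i] +
    adjacency[i].reduce((s, a, j) => s + a * v[j], 0));
  // Project out the all-ones vector (L's trivial eigenvector for λ = 0).
  const project = (v) => {
    const mean = v.reduce((s, x) => s + x, 0) / n;
    return v.map((x) => x - mean);
  };
  let v = project(adjacency.map((_, i) => Math.sin(i + 1)));
  let eigen = 0;
  for (let k = 0; k < iterations; k++) {
    v = project(applyM(v));
    const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
    eigen = norm; // ||M v|| of a unit vector estimates c - λ2
    v = v.map((x) => x / norm);
  }
  return c - eigen;
}

// Connected path graph 1-2-3: Fiedler value 1. A graph made of two
// disconnected pairs would report ~0 instead.
const path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]];
console.log(fiedlerValue(path).toFixed(2)); // "1.00"
```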

Namespaces

Every agent operates in a namespace — a scoped partition of the memory store. This is how multi-agent isolation works:

  • SQLite: separate tables per namespace within the same database file
  • Firestore: separate document collections per namespace

Agents share the same embedding engine and LLM provider configuration but have completely independent memory graphs.

# Serve a specific agent namespace
npx fozikio serve --agent researcher

Safety Rules (Reflex)

cortex-engine ships with Reflex rules — portable YAML guardrails that install into your workspace:

| Rule | When It Fires | What It Does |
|------|---------------|--------------|
| cognitive-grounding | prompt submit | Nudges the agent to call query() before evaluation, design, or creation |
| observe-first | file write/edit | Warns if writing to memory directories without querying first |
| note-about-doing | prompt submit | Suggests capturing new threads of thought with thread_create() |

Rules live in reflex-rules/ as standard YAML. They work with Claude Code, Cursor, Codex, or any Reflex-compatible runtime.