Flagship Case Study
AI/Infrastructure

Grain Studios: Cortex

Self-Healing Memory Platform for AI Agents

Role
Architect & Developer
Platform
Python, PostgreSQL 16, MCP, Apple Silicon
Industry
AI Infrastructure
Focus
Agent Memory

Project Overview

Cortex is a self-healing memory platform for AI agents. Memories go in raw and self-organize — auto-embedded, auto-linked through cosine similarity, auto-clustered by project. Dual-path search merges Postgres full-text and ChromaDB semantic results, ranked with a recency bonus. A-MAC admission control quality-gates every write, and adaptive retrieval (Agentic RAG via LangGraph) reformulates queries when relevance is low. Cortex runs natively on Apple M4 Silicon with embedded ChromaDB (PersistentClient) — no Docker, no zombie containers — backed by three-tier self-healing: process (LaunchAgent auto-restart), application (graceful degradation, Redis write buffer), and infrastructure (warm standby, automated backups).

It is currently in production with 1,600+ memories and 14,200+ auto-generated edges across 30 projects, delivering sub-3 ms FTS query latency and 8 ms unified search (FTS + semantic + graph) on the native forge deployment, one node of a three-node Tailscale network.
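The dual-path merge with recency bonus can be sketched as follows. The hit shape, bonus weight, and 30-day half-life are assumptions for illustration, not Cortex's actual tuning:

```python
import math
import time

def merge_results(fts_hits, semantic_hits, now=None, half_life_days=30.0):
    """Merge full-text and semantic hits, then rank with a recency bonus.

    Each hit is a dict: {"id": str, "score": float, "created_at": epoch seconds}.
    Scores are assumed pre-normalized to [0, 1].
    """
    now = now or time.time()
    merged = {}
    for hit in fts_hits + semantic_hits:
        # When both paths return the same memory, keep the better base score.
        prev = merged.get(hit["id"])
        if prev is None or hit["score"] > prev["score"]:
            merged[hit["id"]] = dict(hit)
    for hit in merged.values():
        age_days = (now - hit["created_at"]) / 86400.0
        # Exponential-decay recency bonus: a brand-new memory gets up to +0.2.
        hit["rank"] = hit["score"] + 0.2 * math.exp(-age_days * math.log(2) / half_life_days)
    return sorted(merged.values(), key=lambda h: h["rank"], reverse=True)
```

Deduplicating by id before ranking is what lets exact-keyword and conceptual hits coexist in one result list.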

Key Features

  • Three-Tier Self-Healing — Process: LaunchAgent with KeepAlive, self-heal daemon monitors every 60s. Application: embedded ChromaDB eliminates zombie containers by architecture, Redis write buffer replays failed writes on recovery. Infrastructure: warm standby on sentinel, automated daily backups, Time Machine hourly snapshots.
  • Dual-Path Search — Every query hits both Postgres full-text search AND ChromaDB semantic vectors. Results merged and ranked with recency bonus. Exact keywords and conceptual similarity both work.
  • A-MAC Admission Control — Quality gate on every write. Scores incoming memories on novelty, specificity, and project relevance. LLM-based scoring via local qwen3.5:9b. Eliminated 288+ duplicate memories/day from health checks.
  • Auto-Linking Relationship Graph — Cosine similarity on every write discovers and connects related memories with typed edges (similar_to, superseded_by, contradicts, supports, depends_on). 14,200+ auto-generated edges.
  • Adaptive Retrieval (Agentic RAG) — LangGraph-powered query reformulation. Grades retrieved memories, retries with rephrased queries if relevance is low. Agents don't need perfect keywords.
  • Embedded ChromaDB — PersistentClient mode, no separate server process. If Cortex is up, vectors are up. The zombie container problem is eliminated by architecture, not monitoring.
  • Code Intelligence — tree-sitter AST parsing for Python: function signatures, class hierarchies, import graphs, call sites indexed into searchable memory.
  • Memory Tiers — Facts, decisions, learnings, preferences, procedural (workflows with success tracking), episodic (session-scoped, auto-expires), and code intelligence.
  • MCP + OAuth 2.1 — Model Context Protocol for native agent integration. Full RBAC, API key management, audit logging. Live dashboard at cortex.grainlabs.io.
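The process tier of the self-healing stack rests on launchd's `KeepAlive`. A minimal LaunchAgent plist sketch — the `io.grainlabs.cortex` label and binary path are illustrative assumptions, not Cortex's actual configuration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>io.grainlabs.cortex</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/cortex</string>
        <string>serve</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
</dict>
</plist>
```

With `KeepAlive` set, launchd restarts the process whenever it exits, which is what makes a killed Cortex come back without human intervention.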
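The A-MAC admission gate can be sketched as a weighted score over the three dimensions named above. The weights, threshold, and the fixed formula itself are assumptions for illustration — Cortex scores these dimensions with a local LLM rather than arithmetic:

```python
from dataclasses import dataclass

@dataclass
class AdmissionScore:
    novelty: float      # 0-1: distance from the nearest existing memory
    specificity: float  # 0-1: concrete detail vs. generic boilerplate
    relevance: float    # 0-1: fit to the target project

def admit(score: AdmissionScore, threshold: float = 0.5) -> bool:
    """Quality gate in the spirit of A-MAC: reject low-value writes
    (e.g. repetitive health-check chatter) before they reach storage.
    Weights and threshold are illustrative, not Cortex's actual values.
    """
    weighted = 0.5 * score.novelty + 0.3 * score.specificity + 0.2 * score.relevance
    return weighted >= threshold
```

Heavily weighting novelty is one plausible way to suppress the 288+ near-duplicate health-check memories per day the gate was built to stop.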
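Auto-linking on write reduces to a similarity scan against existing embeddings. A minimal sketch, assuming plain-list vectors, a 0.82 cutoff, and a default `similar_to` edge — Cortex's actual threshold and its assignment of other typed edges (superseded_by, contradicts, etc.) are not shown here:

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def auto_link(new_id, new_vec, store, threshold=0.82):
    """On every write, connect the new memory to existing memories whose
    embedding similarity clears the threshold. `store` maps memory id
    to embedding vector; returned edges are (src, type, dst, weight).
    """
    edges = []
    for mem_id, vec in store.items():
        if mem_id == new_id:
            continue
        sim = cosine(new_vec, vec)
        if sim >= threshold:
            edges.append((new_id, "similar_to", mem_id, round(sim, 3)))
    return edges
```

Because the scan runs at write time, the relationship graph accumulates with zero manual curation — the same property the Key Features list claims for the 14,200+ edges.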

Technical Approach

Built local-first with zero cloud dependencies. Every architectural decision prioritizes reliability: ChromaDB runs embedded inside the Cortex process (PersistentClient) so there's no separate server to zombie. A-MAC quality-gates writes at the door. The auto-linker builds the knowledge graph on every write without manual curation. Adaptive retrieval reformulates queries automatically when initial results are weak. Three-tier self-healing ensures the system recovers from process crashes, application degradation, and infrastructure failures without human intervention. Redis write buffer prevents data loss during restarts. The MCP interface makes Cortex feel like native memory to any agent — no special prompting needed.
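The adaptive-retrieval step reduces to a grade-and-retry loop. A plain-Python sketch of the flow Cortex runs via LangGraph, with `search`, `grade`, and `reformulate` as caller-supplied callables (the search backend, an LLM relevance grader, and an LLM query rewriter — names and the 0.6 relevance floor are illustrative):

```python
def adaptive_retrieve(query, search, grade, reformulate,
                      max_attempts=3, min_relevance=0.6):
    """Retrieve, grade the results, and retry with a rephrased query
    while relevance stays below the floor. Returns the final results,
    the query that produced them, and the number of attempts used.
    """
    current = query
    results = []
    for attempt in range(max_attempts):
        results = search(current)
        if grade(current, results) >= min_relevance:
            return results, current, attempt + 1
        # Low relevance: ask the rewriter for a better-phrased query.
        current = reformulate(current, results)
    return results, current, max_attempts
```

This is why agents don't need perfect keywords: a weak first query costs one extra round trip, not a failed lookup.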

Outcome

Cortex runs 24/7 natively on Apple M4 Silicon across 30 projects, serving as the shared memory layer for all AI agent interactions. Migrated from Docker on Intel to native on M4 — 2x faster queries, zero zombie containers. 1,600+ memories and 14,200+ auto-generated relationship edges. Three-node Tailscale network (forge primary, sentinel warm standby, scout mobile). Kill the process — it's back in 5 seconds. Every write is quality-gated, every query hits dual search paths, every relationship is discovered automatically. The system has fundamentally changed how AI sessions work: context carries forward, decisions persist, and agents build on prior knowledge instead of starting from zero.
