Grain Studios: Coquina
Self-Healing Memory Platform for AI Agents (formerly Cortex)
- Role: Architect & Developer
- Platform: Python, PostgreSQL 16, MCP, Apple Silicon
- Industry: AI Infrastructure
- Focus: Agent Memory
Project Overview
Coquina (formerly Cortex) is a self-healing memory platform for AI agents. The name comes from coquina, the shell rock the Spanish used to build Castillo de San Marcos in St. Augustine beginning in 1672. The fort is the only one of its kind in the US, still standing because coquina absorbs cannonball impacts instead of shattering: many small bonded shells distribute the shock. The same architecture applies here: many small memories, bonded by auto-linked relationships, resilient because of the bonds.

Memories go in raw and self-organize: auto-embedded, auto-linked through cosine similarity, and auto-clustered by project. Dual-path search merges Postgres full-text and ChromaDB semantic results, ranked with a recency bonus. A-MAC admission control quality-gates every write, and adaptive retrieval (Agentic RAG via LangGraph) reformulates queries when relevance is low.

The platform runs natively on Apple M4 Silicon with embedded ChromaDB (PersistentClient): no Docker, no zombie containers. Self-healing spans three tiers: process (LaunchAgent auto-restart), application (graceful degradation, Redis write buffer), and infrastructure (automated backups, Time Machine snapshots).

Coquina is currently running in production with 1,712+ memories and 21,853+ auto-generated edges across 29 projects, with sub-6ms FTS query latency and 2ms unified search (FTS + semantic + graph) on local hardware.
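The dual-path merge described above can be sketched roughly as follows. This is a minimal illustration, not the production code: the hit shape `(memory_id, score, created_at)`, the 30-day half-life, and the 0.1 bonus weight are all assumptions for the sketch.

```python
from datetime import datetime, timezone

def merge_results(fts_hits, semantic_hits, half_life_days=30.0):
    """Merge FTS and semantic hits and rank with a recency bonus.

    Each hit is a (memory_id, score, created_at) tuple — a hypothetical
    shape chosen for this sketch. When both paths return the same memory,
    the higher score wins, then every result gets an exponentially
    decaying recency bonus before the final sort.
    """
    combined = {}
    for hit_id, score, created_at in fts_hits + semantic_hits:
        best = combined.get(hit_id)
        if best is None or score > best[0]:
            combined[hit_id] = (score, created_at)

    now = datetime.now(timezone.utc)
    ranked = []
    for hit_id, (score, created_at) in combined.items():
        age_days = (now - created_at).total_seconds() / 86400
        bonus = 0.5 ** (age_days / half_life_days)  # halves every half_life_days
        ranked.append((hit_id, score + 0.1 * bonus))
    ranked.sort(key=lambda t: t[1], reverse=True)
    return ranked
```

With this shape, a fresh memory with a slightly lower raw score can outrank a year-old memory with a slightly higher one, which is the point of the recency bonus.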
Key Features
- Three-Tier Self-Healing — Process: LaunchAgent with KeepAlive; a self-heal daemon monitors every 60s. Application: embedded ChromaDB eliminates zombie containers by architecture; a Redis write buffer replays failed writes on recovery. Infrastructure: automated daily backups plus hourly Time Machine snapshots.
- Dual-Path Search — Every query hits both Postgres full-text search and ChromaDB semantic vectors. Results are merged and ranked with a recency bonus, so exact keywords and conceptual similarity both work.
- A-MAC Admission Control — A quality gate on every write scores incoming memories on novelty, specificity, and project relevance, with LLM-based scoring via local qwen3.5:9b. Eliminated 288+ duplicate memories per day from health checks.
- Auto-Linking Relationship Graph — Cosine similarity on every write discovers and connects related memories with typed edges (similar_to, superseded_by, contradicts, supports, depends_on). 21,853+ auto-generated edges.
- Adaptive Retrieval (Agentic RAG) — LangGraph-powered query reformulation grades retrieved memories and retries with rephrased queries when relevance is low. Agents don't need perfect keywords.
- Embedded ChromaDB — PersistentClient mode, no separate server process: if Coquina is up, the vectors are up. The zombie-container problem is eliminated by architecture, not monitoring.
- Code Intelligence — tree-sitter AST parsing for Python: function signatures, class hierarchies, import graphs, and call sites indexed into searchable memory.
- Memory Tiers — Facts, decisions, learnings, preferences, procedural (workflows with success tracking), episodic (session-scoped, auto-expires), and code intelligence.
- MCP + OAuth 2.1 — Model Context Protocol for native agent integration, with full RBAC, API key management, and audit logging. Live dashboard at coquina.studio.
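The auto-linking idea above — cosine similarity on every write, emitting typed edges — can be sketched in a few lines. The 0.82 threshold is an assumption for illustration, and only the `similar_to` edge type is derived here; the other types (superseded_by, contradicts, supports, depends_on) would require more than a similarity score in the real system.

```python
import math

SIMILAR_THRESHOLD = 0.82  # assumed cutoff; the source does not state one

def cosine(a, b):
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def auto_link(new_id, new_vec, existing):
    """Compare a new memory's embedding against every existing one and
    return typed edges for sufficiently similar pairs.

    `existing` maps memory_id -> embedding. Edges are
    (source_id, edge_type, target_id, similarity) tuples.
    """
    edges = []
    for mem_id, vec in existing.items():
        sim = cosine(new_vec, vec)
        if sim >= SIMILAR_THRESHOLD:
            edges.append((new_id, "similar_to", mem_id, round(sim, 4)))
    return edges
```

In production this comparison would run against an index rather than a full scan, but the write-time shape is the same: embed, compare, link.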
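The A-MAC admission gate can be illustrated as a weighted score over the three axes the feature list names. The weights and the 0.5 admit threshold are illustrative assumptions — the real gate scores with a local LLM rather than a fixed formula — but the shape of the decision is the same: low-novelty writes (like repeated health checks) fall below the bar and are rejected.

```python
from dataclasses import dataclass

@dataclass
class AdmissionScore:
    novelty: float      # 0..1: distance from the nearest existing memory
    specificity: float  # 0..1: penalizes vague or boilerplate content
    relevance: float    # 0..1: match to the active project

ADMIT_THRESHOLD = 0.5  # assumed cutoff, not from the source

def admit(score: AdmissionScore) -> bool:
    """Combine the three axes into one admission decision.

    Illustrative weighted mean; in the real system an LLM produces the
    per-axis scores, and this step decides admit vs. reject.
    """
    combined = (0.4 * score.novelty
                + 0.3 * score.specificity
                + 0.3 * score.relevance)
    return combined >= ADMIT_THRESHOLD
```

A near-duplicate health-check ping scores very low on novelty, so `admit(AdmissionScore(novelty=0.05, specificity=0.3, relevance=0.9))` is rejected even though it is on-project — which is how the gate kills the 288-per-day duplicate stream.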
Technical Approach
Built local-first with zero cloud dependencies. Every architectural decision prioritizes reliability: ChromaDB runs embedded inside the Coquina process (PersistentClient) so there's no separate server to zombie. A-MAC quality-gates writes at the door. The auto-linker builds the knowledge graph on every write without manual curation. Adaptive retrieval reformulates queries automatically when initial results are weak. Three-tier self-healing ensures the system recovers from process crashes, application degradation, and infrastructure failures without human intervention. Redis write buffer prevents data loss during restarts. The MCP interface makes Coquina feel like native memory to any agent — no special prompting needed.
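The Redis write-buffer idea — buffer a failed write instead of losing it, then replay once the database is back — reduces to a small pattern. This sketch stands in a plain Python list for the Redis list (LPUSH/RPOP in the real system) so it is self-contained; the `store.insert` interface is a hypothetical name for this illustration.

```python
import json

class WriteBuffer:
    """Replay-on-recovery sketch: writes that fail with a connection
    error are queued as JSON and drained once the store is healthy."""

    def __init__(self):
        self.queue = []  # stand-in for a Redis list

    def write(self, store, memory):
        """Try the write; on connection failure, buffer it instead."""
        try:
            store.insert(memory)
        except ConnectionError:
            self.queue.append(json.dumps(memory))

    def replay(self, store):
        """Drain the buffer in order once the store is reachable again.

        Returns the number of writes replayed.
        """
        replayed = 0
        while self.queue:
            store.insert(json.loads(self.queue.pop(0)))
            replayed += 1
        return replayed
```

The design choice is that durability lives outside the database process: a restart loses nothing, because the buffer survives the outage and the replay restores order.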
Outcome
Coquina runs 24/7 natively on Apple M4 Silicon across 29 projects, serving as the shared memory layer for all AI agent interactions. Migrated from Docker on Intel to native on M4 — 2x faster queries, zero zombie containers. 1,712+ memories and 21,853+ auto-generated relationship edges. Kill the process — it's back in 5 seconds. Every write is quality-gated, every query hits dual search paths, every relationship is discovered automatically. The system has fundamentally changed how AI sessions work: context carries forward, decisions persist, and agents build on prior knowledge instead of starting from zero.