Grain Studios: Forge Agent Runtime

Queue-Based Autonomous Worker System

Role: Architect & Developer
Platform: Python, Redis, macOS
Industry: AI Infrastructure
Focus: Task Orchestration

Project Overview

I spent every summer, every other winter, and every spring break in Washington State. The Mt. St. Helens blast zone was my motorcycle playground — miles of volcanic dead ground where nothing was supposed to grow back, and all of it did. That's the energy here: build in harsh conditions, make it self-sustaining, don't look back.

Forge is the Redis-backed runtime that orchestrates AI work on an Apple Silicon (M4) Mac. Queues coordinate GPU access, health monitoring, code review, research scanning, and overnight chain execution. All of it is autonomous, running as 81 macOS LaunchAgents with self-healing built in.

Every night it runs a 28-step DAG: GPU warmup, security review, code review, auto-fix with draft PRs, arXiv and ecosystem scouts, knowledge consolidation, perf benchmarks, and a morning briefing on my desk before I wake. 43 specialized workers carry it. Zero cloud — it all runs while I sleep.

Key Features

→GPU Locking — Redis SET NX EX mutex preventing concurrent GPU access, with automatic Ollama model unloading to free VRAM before training jobs.
→Deterministic Triage Gate — Pre-LLM scoring on the Amygdala worker that short-circuits to GREEN when all health checks pass, skipping expensive LLM inference entirely.
→Adapter Versioning — Semantic versioning with symlinked directories, enabling instant rollback of LoRA adapters to any previous version.
→Self-Healing Daemon — Proactive scanner runs every 60 seconds checking all 81 LaunchAgents, kickstarting any that crashed. Reactive listener handles service-level recovery (Ollama restart, SSD remount, queue flush). Max 3 attempts per signal with cooldown timers.
→Overnight Chain Orchestration — 28-step DAG running from GPU warmup through security review, code review, research scanning, ecosystem monitoring, knowledge consolidation, and morning briefing during off-hours. Weekly consolidation compresses knowledge on Sundays.
→43 Specialized Workers — Health checks, morning briefings, Amygdala threat assessment, GPU warmup, article scanning, arXiv research scout, nightly code review, PR digest, site analytics, ecosystem watch, evolution monitoring, weekly consolidation, atlas compilation, autonomous Claude agent, and sandbox builder (generates complete iOS apps via Ollama + xcodegen + xcodebuild) — each running as a macOS LaunchAgent daemon.
→Coquina Integration — All workers authenticate with Coquina API keys to store reports and findings as persistent memories, building institutional knowledge automatically.
→Slack Notifications — Real-time alerts for task completion, failures, and GPU contention pushed to Slack channels with Block Kit formatting.
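
To make the adapter versioning idea above concrete, here is a minimal sketch of symlink-based rollback. It is an illustration, not Forge's actual code: the directory layout (one `MAJOR.MINOR.PATCH` folder per adapter version), the `current` link name, and the function names are all assumptions.

```python
import os
from pathlib import Path


def parse_semver(name):
    """Return (major, minor, patch), or None if name is not MAJOR.MINOR.PATCH."""
    parts = name.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return None
    return tuple(int(p) for p in parts)


def list_versions(root):
    """List version directories under root, oldest first (numeric, not lexical)."""
    names = [
        p.name
        for p in Path(root).iterdir()
        if p.is_dir() and not p.is_symlink() and parse_semver(p.name)
    ]
    return sorted(names, key=parse_semver)


def activate(root, version, link_name="current"):
    """Point root/<link_name> at root/<version> by swapping a symlink."""
    root = Path(root)
    if not (root / version).is_dir():
        raise FileNotFoundError(f"no such adapter version: {root / version}")
    tmp = root / f".{link_name}.tmp"
    if tmp.is_symlink():
        tmp.unlink()  # leftover from an interrupted run
    # A relative target keeps the link valid if the whole tree is moved.
    os.symlink(version, tmp, target_is_directory=True)
    os.replace(tmp, root / link_name)  # rename swaps the link in one step on POSIX
    return root / link_name


def rollback(root, link_name="current"):
    """Re-point the link at the highest version below the active one."""
    root = Path(root)
    active = os.readlink(root / link_name)
    older = [v for v in list_versions(root) if parse_semver(v) < parse_semver(active)]
    if not older:
        raise LookupError("no earlier version to roll back to")
    activate(root, older[-1], link_name)
    return older[-1]
```

Anything that loads the adapter reads through the `current` link, so a rollback is one rename and no files are copied. Calling `activate` with an explicit version jumps to any earlier release, not just the previous one.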

Technical Approach

Built on Redis list queues: producers RPUSH tasks and workers BLPOP them, so work comes off in FIFO order. Ordered, reliable, no surprises. Each worker runs as its own LaunchAgent with a heartbeat, retry logic, and error handling. A GPU coordinator holds a mutex so inference and training never collide.
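
A rough sketch of how the queue and the GPU mutex could fit together, assuming a redis-py style client passed in as `r`. The key names, the 15-minute TTL, and the function names are hypothetical, and the Ollama unload step is left out.

```python
import json
import uuid

GPU_LOCK_KEY = "forge:gpu:lock"  # hypothetical key name
TASK_QUEUE = "forge:tasks"       # hypothetical queue name

# Delete the lock only if we still own it (compare-and-delete, run atomically).
RELEASE_SCRIPT = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""


def acquire_gpu(r, ttl_seconds=900):
    """Try to take the GPU lock; return an ownership token, or None if held."""
    token = uuid.uuid4().hex
    # SET key token NX EX ttl: succeeds only when the key does not exist yet.
    if r.set(GPU_LOCK_KEY, token, nx=True, ex=ttl_seconds):
        return token
    return None


def release_gpu(r, token):
    """Release the lock if this token still owns it; return True on success."""
    return r.eval(RELEASE_SCRIPT, 1, GPU_LOCK_KEY, token) == 1


def enqueue(r, task):
    """Append a JSON-encoded task to the tail of the queue."""
    r.rpush(TASK_QUEUE, json.dumps(task))


def work_once(r, handler, timeout=5):
    """Pop one task from the head of the queue and run it under the GPU lock.

    Returns "idle" (nothing queued), "busy" (GPU held, task re-queued), or "done".
    """
    item = r.blpop(TASK_QUEUE, timeout=timeout)
    if item is None:
        return "idle"
    _queue, payload = item
    token = acquire_gpu(r)
    if token is None:
        r.rpush(TASK_QUEUE, payload)  # GPU is taken: put the task back at the tail
        return "busy"
    try:
        handler(json.loads(payload))
        return "done"
    finally:
        release_gpu(r, token)
```

The EX expiry means a crashed holder cannot block the GPU forever, and the token check stops a slow worker from deleting a lock that has already expired and been taken by someone else. One caveat of plain BLPOP: a task popped by a worker that dies mid-run is gone unless something re-queues it.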

The 28-step overnight chain runs hands-off during off-peak hours, when the GPU would otherwise sit idle. I wake to a briefing that tells me exactly what the system did.
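
For a feel of what running a chain as a DAG means, here is a small sketch using Python's standard `graphlib`. The step names come from the list of nightly stages, but the dependency edges, the seven-step size, and the skip-on-failure policy are assumptions made for illustration; the real chain has 28 steps.

```python
from graphlib import TopologicalSorter

# step -> steps that must finish first (edges are illustrative assumptions)
CHAIN = {
    "gpu_warmup": set(),
    "security_review": {"gpu_warmup"},
    "code_review": {"gpu_warmup"},
    "auto_fix": {"code_review"},
    "arxiv_scout": {"gpu_warmup"},
    "knowledge_consolidation": {"security_review", "auto_fix", "arxiv_scout"},
    "morning_briefing": {"knowledge_consolidation"},
}


def run_chain(chain, run_step):
    """Run each step after its dependencies; skip anything downstream of a failure."""
    status = {}
    # static_order() yields every step only after all of its dependencies.
    for step in TopologicalSorter(chain).static_order():
        deps = chain.get(step, ())
        if any(status[dep] != "ok" for dep in deps):
            status[step] = "skipped"
            continue
        try:
            run_step(step)
            status[step] = "ok"
        except Exception:
            status[step] = "failed"
    return status
```

The returned status map is the raw material for a briefing: which steps ran, which failed, and which were skipped as a result. A production chain would probably let the briefing step run even after failures so that it can report them.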

Amygdala, the security worker, runs a deterministic triage gate before any model is called — rule-based scoring short-circuits to GREEN when health checks pass, and only real anomalies reach the LLM. Don't burn compute on problems math already solved.
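
The gate itself can be very small. This sketch assumes each health check reports a pass/fail flag plus a severity weight, and that the model call is injected as a plain function; `Check`, `Verdict`, `triage`, and `ask_llm` are hypothetical names, not Amygdala's real interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    ok: bool
    weight: int = 1  # hypothetical severity weight


@dataclass(frozen=True)
class Verdict:
    level: str
    score: int
    used_llm: bool


def triage(checks, ask_llm):
    """Return GREEN without any model call when every check passes.

    Otherwise hand only the failed checks (and their summed weight) to ask_llm,
    which is expected to return a level such as "YELLOW" or "RED".
    """
    failed = [c for c in checks if not c.ok]
    if not failed:
        return Verdict("GREEN", 0, used_llm=False)
    score = sum(c.weight for c in failed)
    return Verdict(ask_llm(failed, score), score, used_llm=True)
```

On a healthy night `ask_llm` is never invoked, so the cost of a routine check is a list comprehension instead of an inference pass.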

Outcome

Forge runs 43 autonomous workers — security council, code review, auto-fix, arXiv and article scouts, ecosystem watch, model scout, consolidation, perf benchmarks, app generation, and more. The 28-step chain runs every night and I don't touch it.

Self-healing keeps it alive without me: a 60-second daemon restarts crashed workers, tasks retry, GPU locks release clean. The deterministic triage gate killed the wasted LLM calls on routine health checks.
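
The restart policy can be sketched as a small budget object plus one scanner pass. The three-attempt cap mirrors the limit described for the daemon; the 300-second cooldown, the reset-on-recovery rule, and all names here are assumptions. `kickstart` shells out to `launchctl kickstart -k`, which only exists on macOS, so the scanner takes the restart function as a parameter.

```python
import os
import subprocess
import time


class RestartBudget:
    """Allow at most max_attempts restarts per label, spaced by a cooldown."""

    def __init__(self, max_attempts=3, cooldown_seconds=300, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self._attempts = {}  # label -> restarts attempted so far
        self._last = {}      # label -> time of the most recent attempt

    def allow(self, label):
        """Record and permit an attempt, or return False if over budget or cooling down."""
        now = self.clock()
        if self._attempts.get(label, 0) >= self.max_attempts:
            return False
        last = self._last.get(label)
        if last is not None and now - last < self.cooldown_seconds:
            return False
        self._attempts[label] = self._attempts.get(label, 0) + 1
        self._last[label] = now
        return True

    def reset(self, label):
        """Forget earlier failures once the agent is healthy again."""
        self._attempts.pop(label, None)
        self._last.pop(label, None)


def kickstart(label):
    """Restart one per-user LaunchAgent via launchctl (macOS only)."""
    target = f"gui/{os.getuid()}/{label}"
    result = subprocess.run(["launchctl", "kickstart", "-k", target], check=False)
    return result.returncode == 0


def scan_once(labels, is_running, budget, restart=kickstart):
    """One scanner pass: restart each stopped agent the budget still allows."""
    restarted = []
    for label in labels:
        if is_running(label):
            budget.reset(label)
        elif budget.allow(label) and restart(label):
            restarted.append(label)
    return restarted
```

Run on a 60-second timer, a pass like this restarts a crashed agent quickly but stops hammering one that keeps dying, which is the point where an alert is more useful than a fourth restart.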

Every worker writes its findings back to Coquina as searchable memory. The system remembers what it learned overnight. So do I.
