Case Study
AI/Infrastructure

Grain Studios: Forge Agent Runtime

Queue-Based Autonomous Worker System

Role: Architect & Developer
Platform: Python, Redis, macOS
Industry: AI Infrastructure
Focus: Task Orchestration

Project Overview

Forge is an event-driven worker system for orchestrating AI tasks on an Apple Silicon (M4) Mac. Redis-backed queues coordinate GPU access, LoRA adapter training, health monitoring, code reviews, and overnight chain orchestration, all running autonomously as 34 macOS LaunchAgents with self-healing capabilities. The system currently runs 23 specialized workers, including an AI safety gate, an arXiv research scanner, a nightly code reviewer, and an autonomous sandbox builder that generates complete iOS apps from natural language descriptions.

Key Features

  • GPU Locking — Redis SET NX EX mutex preventing concurrent GPU access, with automatic Ollama model unloading to free VRAM before training jobs.
  • ILION Safety Gate — Pre-LLM deterministic triage on the Amygdala worker that short-circuits to GREEN when all health checks pass, skipping expensive LLM inference entirely.
  • Adapter Versioning — Semantic versioning with symlinked directories, enabling instant rollback of LoRA adapters to any previous version.
  • Self-Healing Daemon — Proactive scanner runs every 60 seconds checking all 34 LaunchAgents, kickstarting any that crashed. Reactive listener handles service-level recovery (Ollama restart, SSD remount, queue flush). Max 3 attempts per signal with cooldown timers.
  • Overnight Chain Orchestration — Multi-stage pipeline that runs data collection, training, evaluation, and deployment sequentially during off-hours. Weekly consolidation step compresses knowledge on Sundays.
  • 23 Specialized Workers — LoRA training, health checks, morning briefings, Amygdala threat assessment, GPU warmup, article scanning, arXiv research scout, nightly code review, PR digest, weekly consolidation, atlas compilation, autonomous Claude agent, and sandbox builder (generates complete iOS apps via Ollama + xcodegen + xcodebuild) — each running as a macOS LaunchAgent daemon.
  • Cortex Integration — All workers authenticate with Cortex API keys to store reports and findings as persistent memories, building institutional knowledge automatically.
  • Slack Notifications — Real-time alerts for task completion, failures, and GPU contention pushed to Slack channels with Block Kit formatting.
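
The GPU Locking feature above can be sketched as follows. This is a minimal illustration, not Forge's actual code: the key name `forge:gpu:lock`, the one-hour TTL, and the function names are assumptions. It relies on the redis-py `set(..., nx=True, ex=...)` and `eval(...)` calls; a small in-memory stand-in (which does not simulate expiry) is included only so the example runs without a Redis server.

```python
import uuid

# Lua script: delete the key only if it still holds our token, so a worker
# whose lock has already expired cannot release a lock now owned by another.
RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
end
return 0
"""

def acquire_gpu_lock(client, key="forge:gpu:lock", ttl_seconds=3600):
    """Try to take the lock; return an ownership token, or None if it is held."""
    token = uuid.uuid4().hex
    # SET key token NX EX ttl: create only if absent, expire automatically.
    if client.set(key, token, nx=True, ex=ttl_seconds):
        return token
    return None

def release_gpu_lock(client, key, token):
    """Release the lock only if this token still owns it."""
    return client.eval(RELEASE_SCRIPT, 1, key, token) == 1

class InMemoryClient:
    """Demo-only stand-in for the two redis-py calls used above (no expiry)."""
    def __init__(self):
        self._data = {}

    def set(self, key, value, nx=False, ex=None):
        if nx and key in self._data:
            return None          # redis-py returns None when NX blocks the write
        self._data[key] = value
        return True

    def eval(self, script, numkeys, key, token):
        # Mimics RELEASE_SCRIPT: compare the stored token, then delete.
        if self._data.get(key) == token:
            del self._data[key]
            return 1
        return 0

client = InMemoryClient()   # with a real server: redis.Redis(host="localhost")
training_token = acquire_gpu_lock(client)    # training worker takes the GPU
inference_token = acquire_gpu_lock(client)   # None while training holds it
release_gpu_lock(client, "forge:gpu:lock", training_token)
```

The expiry guards against a crashed worker holding the GPU forever, and the token check keeps a slow worker from releasing a lock that has since passed to someone else.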

Technical Approach

Forge is designed around Redis BLPOP/RPUSH queues for reliable, ordered task processing: producers append tasks with RPUSH and workers block on BLPOP, so each queue is consumed first in, first out. Each worker is a LaunchAgent daemon with its own heartbeat, retry logic, and error handling. The GPU coordinator prevents resource contention between inference and training workloads. The overnight chain runs a full LoRA training pipeline, from data collection through model evaluation, without human intervention, using off-peak hours when the GPU is otherwise idle. The ILION safety gate on Amygdala eliminates unnecessary LLM calls by applying deterministic triage first, so only anomalies that survive rule-based filtering reach the language model.
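
The deterministic triage step described above might look roughly like this. It is a sketch under assumptions: `HealthCheck`, `Verdict`, `triage`, and `assess` are hypothetical names, and `llm_assess` stands in for whatever model call the Amygdala worker actually makes. Treating an empty check list as an anomaly is a choice made for this example, not something the source states.

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    GREEN = "green"
    ESCALATE = "escalate"

@dataclass(frozen=True)
class HealthCheck:
    name: str
    ok: bool
    detail: str = ""

def triage(checks):
    """Rule-based pass: return (verdict, failing_checks) without any model call."""
    failing = [c for c in checks if not c.ok]
    # GREEN only when there is at least one check and none of them failed.
    if checks and not failing:
        return Verdict.GREEN, []
    return Verdict.ESCALATE, failing

def assess(checks, llm_assess):
    """Run triage first; call the (expensive) model only for anomalies."""
    verdict, failing = triage(checks)
    if verdict is Verdict.GREEN:
        return "GREEN"            # short-circuit: llm_assess is never invoked
    return llm_assess(failing)    # only the failing checks reach the model
```

Because the gate is plain Python, the common all-healthy case costs a list comprehension instead of an inference request.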

Outcome

Forge runs 23 autonomous workers handling health checks, LoRA training, article scanning, arXiv research, threat assessment, knowledge consolidation, and system monitoring. The overnight chain has successfully trained and deployed multiple LoRA adapter versions autonomously. Self-healing capabilities mean the system recovers from failures without intervention — workers restart, tasks retry, and GPU locks release cleanly. The ILION safety gate cut unnecessary LLM inference on routine health checks, and Cortex integration means every worker's findings persist as searchable institutional memory. The architecture proved robust enough to serve as the foundation for all AI automation on the homelab.
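
The recovery behaviour (at most 3 attempts per signal, with cooldown timers) can be illustrated with a small attempt budget. This is a sketch, not the daemon's real implementation: `RecoveryBudget`, the 300-second cooldown, and `kickstart_command` are assumptions. The command it builds uses `launchctl kickstart -k` with a `gui/<uid>/<label>` service target, which is presumably how a crashed per-user LaunchAgent would be restarted.

```python
import time

class RecoveryBudget:
    """Allow at most max_attempts per signal, spaced by a cooldown."""

    def __init__(self, max_attempts=3, cooldown_seconds=300.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.cooldown_seconds = cooldown_seconds   # hypothetical default
        self._clock = clock
        self._state = {}   # signal -> (attempt_count, time_of_last_attempt)

    def try_acquire(self, signal):
        """Record an attempt and return True if one is allowed right now."""
        count, last = self._state.get(signal, (0, None))
        now = self._clock()
        if count >= self.max_attempts:
            return False   # budget exhausted: leave it for a human
        if last is not None and now - last < self.cooldown_seconds:
            return False   # still cooling down from the previous attempt
        self._state[signal] = (count + 1, now)
        return True

    def reset(self, signal):
        """Forget a signal once its service is healthy again."""
        self._state.pop(signal, None)

def kickstart_command(label, uid):
    """Build the launchctl invocation that restarts a per-user LaunchAgent."""
    return ["launchctl", "kickstart", "-k", f"gui/{uid}/{label}"]
```

A scanner loop would call `try_acquire(label)` before passing `kickstart_command(label, os.getuid())` to `subprocess.run`, and call `reset(label)` once the agent reports healthy.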
