Zero Daemons: How Event-Driven Architecture Cut Our CPU Usage by 93%
We just deployed an event-driven architecture that eliminated 18 background daemons from Cortex. The result? Our idle CPU dropped from ~15% to less than 1%. Response times went from 30-60 seconds to under a second. And we replaced traditional dashboards with AI-powered notebooks that actually understand your data.
The transformation in 30 minutes:
| Metric | Before | After | Improvement |
|---|---|---|---|
| Processes | 18 daemons | 0 | 100% reduction |
| CPU | ~15% | <1% | 93% reduction |
| Response | 30-60s | <1s | 30-60x faster |
The Problem with Daemons
Traditional monitoring and orchestration systems love daemons: background processes that sit idle 99% of the time, waking up periodically to check whether anything changed. It’s the polling pattern applied to everything:
# The old way: 18 of these running constantly
while true; do
  check_worker_health
  sleep 30
done

while true; do
  poll_routing_updates
  sleep 60
done

while true; do
  aggregate_metrics
  sleep 120
done
The problems compound:
- Wasted compute: 18 processes × 30-second loops = constant CPU churn
- Memory pressure: Each daemon holds state, connections, buffers
- Complexity: Startup ordering, health checks, crash recovery for each
- Scaling nightmare: More features = more daemons = more overhead
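A quick back-of-envelope makes the churn concrete. Simplifying the mixed 30/60/120-second intervals above to a uniform 30 seconds (an assumption for the sake of the arithmetic):

```shell
# Back-of-envelope: total daemon wakeups per day when 18 processes
# each poll on a 30-second loop. Purely illustrative arithmetic.
daemons=18
interval_s=30
wakeups_per_day=$(( daemons * 86400 / interval_s ))
echo "$wakeups_per_day wakeups/day"
```

That is over fifty thousand wakeups a day, the vast majority of which find nothing to do.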
The Event-Driven Alternative
Instead of processes watching for changes, we flip the model: changes announce themselves.
graph LR
A[Event Created] --> B[Event<br/>Validator]
B --> C{Valid?}
C -->|Yes| D[Event Queue]
C -->|No| E[Reject]
D --> F[Event<br/>Dispatcher]
F --> G{Route Event}
G --> H1[Handler 1]
G --> H2[Handler 2]
G --> H3[Handler N]
H1 --> I[Capture Output]
H2 --> I
H3 --> I
I --> J[Archive Event]
J --> K[Process Exits]
style A fill:#30363d,stroke:#58a6ff,stroke-width:2px
style B fill:#30363d,stroke:#58a6ff,stroke-width:2px
style C fill:#30363d,stroke:#f85149,stroke-width:2px
style D fill:#30363d,stroke:#00d084,stroke-width:2px
style E fill:#30363d,stroke:#f85149,stroke-width:2px
style F fill:#30363d,stroke:#58a6ff,stroke-width:2px
style G fill:#30363d,stroke:#f85149,stroke-width:2px
style H1 fill:#30363d,stroke:#58a6ff,stroke-width:2px
style H2 fill:#30363d,stroke:#58a6ff,stroke-width:2px
style H3 fill:#30363d,stroke:#58a6ff,stroke-width:2px
style I fill:#30363d,stroke:#58a6ff,stroke-width:2px
style J fill:#30363d,stroke:#58a6ff,stroke-width:2px
style K fill:#30363d,stroke:#00d084,stroke-width:2px
Zero processes running between events. Handlers spawn on-demand, exit when done.
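The idea can be sketched in a few lines of shell. The paths and filenames here are illustrative, not the actual Cortex layout: an event is a file, and the only process that ever runs is the one-shot run that handles it.

```shell
# Illustrative only: an event appears as a file; a one-shot run handles
# and archives it, then the process exits. Nothing stays resident.
EVENT_DIR=$(mktemp -d)

# A change "announces itself" by writing an event -- no daemon is watching
cat > "$EVENT_DIR/evt_001.json" <<'EOF'
{"event_type": "worker.completed", "payload": {"result": "success"}}
EOF

# One-shot dispatch: handle every pending event, then fall off the end
for event in "$EVENT_DIR"/*.json; do
  echo "handling $(basename "$event")"
  mv "$event" "$event.archived"
done
```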
17 Event Types, 7 Handlers
We defined a comprehensive event schema covering everything Cortex does:
graph TD
A[17 Event Types] --> B[Worker Events]
A --> C[Routing Events]
A --> D[Security Events]
A --> E[System Events]
A --> F[Task Events]
B --> B1[worker.started]
B --> B2[worker.completed]
B --> B3[worker.failed]
B --> B4[worker.heartbeat]
C --> C1[routing.decision]
C --> C2[routing.update]
C --> C3[routing.fallback]
D --> D1[security.scan_started]
D --> D2[security.vulnerability_found]
D --> D3[security.scan_completed]
E --> E1[system.startup]
E --> E2[system.shutdown]
E --> E3[system.health_check]
F --> F1[task.created]
F --> F2[task.assigned]
F --> F3[task.completed]
F --> F4[task.failed]
style A fill:#30363d,stroke:#58a6ff,stroke-width:3px
style B fill:#30363d,stroke:#58a6ff,stroke-width:2px
style C fill:#30363d,stroke:#58a6ff,stroke-width:2px
style D fill:#30363d,stroke:#f85149,stroke-width:2px
style E fill:#30363d,stroke:#00d084,stroke-width:2px
style F fill:#30363d,stroke:#58a6ff,stroke-width:2px
style B1 fill:#30363d,stroke:#8b949e,stroke-width:1px
style B2 fill:#30363d,stroke:#8b949e,stroke-width:1px
style B3 fill:#30363d,stroke:#8b949e,stroke-width:1px
style B4 fill:#30363d,stroke:#8b949e,stroke-width:1px
style C1 fill:#30363d,stroke:#8b949e,stroke-width:1px
style C2 fill:#30363d,stroke:#8b949e,stroke-width:1px
style C3 fill:#30363d,stroke:#8b949e,stroke-width:1px
style D1 fill:#30363d,stroke:#8b949e,stroke-width:1px
style D2 fill:#30363d,stroke:#8b949e,stroke-width:1px
style D3 fill:#30363d,stroke:#8b949e,stroke-width:1px
style E1 fill:#30363d,stroke:#8b949e,stroke-width:1px
style E2 fill:#30363d,stroke:#8b949e,stroke-width:1px
style E3 fill:#30363d,stroke:#8b949e,stroke-width:1px
style F1 fill:#30363d,stroke:#8b949e,stroke-width:1px
style F2 fill:#30363d,stroke:#8b949e,stroke-width:1px
style F3 fill:#30363d,stroke:#8b949e,stroke-width:1px
style F4 fill:#30363d,stroke:#8b949e,stroke-width:1px
Event Categories:
- Worker Events: worker.started, worker.completed, worker.failed, worker.heartbeat
- Routing Events: routing.decision, routing.update, routing.fallback
- Security Events: security.scan_started, security.vulnerability_found, security.scan_completed
- System Events: system.startup, system.shutdown, system.health_check
- Task Events: task.created, task.assigned, task.completed, task.failed
Each event carries structured metadata:
{
"event_id": "evt_20251201_093827_85c15a3d1640",
"event_type": "worker.completed",
"source": "development-master",
"timestamp": "2025-12-01T09:38:27-06:00",
"payload": {
"task_id": "task-12345",
"duration_ms": 4523,
"result": "success"
},
"metadata": {
"correlation_id": "corr-abc123",
"priority": "medium"
}
}
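As a rough illustration of what envelope validation involves (the real validator is JSON Schema based; this grep check and its field list are only a sketch of the idea):

```shell
# Hypothetical check that the required envelope fields are present.
# Field names are taken from the example above; not the real validator.
event='{"event_id":"evt_1","event_type":"worker.completed","source":"dev","timestamp":"2025-12-01T09:38:27-06:00","payload":{}}'

valid=true
for field in event_id event_type source timestamp payload; do
  echo "$event" | grep -q "\"$field\"" || { echo "missing: $field"; valid=false; }
done
$valid && echo "envelope ok"
```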
The Dispatcher Pattern
A single dispatcher replaces all 18 daemons:
graph TD
A[Dispatcher Start] --> B[Scan for<br/>Pending Events]
B --> C{Events Found?}
C -->|No| Z[Exit]
C -->|Yes| D[Validate Event<br/>Schema]
D --> E{Valid?}
E -->|No| F[Log Error]
E -->|Yes| G{Route by Type}
F --> B
G -->|worker.*| H1[Worker Handler]
G -->|routing.*| H2[Routing Handler]
G -->|security.*| H3[Security Handler]
G -->|system.*| H4[System Handler]
G -->|task.*| H5[Task Handler]
H1 --> I[Execute Handler]
H2 --> I
H3 --> I
H4 --> I
H5 --> I
I --> J[Capture Output]
J --> K[Record Metrics]
K --> L[Archive Event]
L --> B
style A fill:#30363d,stroke:#00d084,stroke-width:2px
style B fill:#30363d,stroke:#58a6ff,stroke-width:2px
style C fill:#30363d,stroke:#f85149,stroke-width:2px
style D fill:#30363d,stroke:#58a6ff,stroke-width:2px
style E fill:#30363d,stroke:#f85149,stroke-width:2px
style F fill:#30363d,stroke:#f85149,stroke-width:2px
style G fill:#30363d,stroke:#f85149,stroke-width:2px
style H1 fill:#30363d,stroke:#58a6ff,stroke-width:2px
style H2 fill:#30363d,stroke:#58a6ff,stroke-width:2px
style H3 fill:#30363d,stroke:#f85149,stroke-width:2px
style H4 fill:#30363d,stroke:#00d084,stroke-width:2px
style H5 fill:#30363d,stroke:#58a6ff,stroke-width:2px
style I fill:#30363d,stroke:#58a6ff,stroke-width:2px
style J fill:#30363d,stroke:#58a6ff,stroke-width:2px
style K fill:#30363d,stroke:#58a6ff,stroke-width:2px
style L fill:#30363d,stroke:#58a6ff,stroke-width:2px
style Z fill:#30363d,stroke:#00d084,stroke-width:2px
Process all pending events in milliseconds:
# Process all pending events
./scripts/events/event-dispatcher.sh
# Events are routed to appropriate handlers
# Handlers execute, capture output, archive
# Total runtime: milliseconds, not minutes
The dispatcher workflow:
- Scans for pending events
- Validates each against the schema
- Routes to the appropriate handler
- Captures output and metrics
- Archives processed events
- Exits (no daemon!)
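Those six steps fit in a few lines of shell. Everything here is an illustrative stand-in for the real dispatcher: the directory names, the sed-based type extraction (the real system validates against a JSON Schema), and the handler names.

```shell
# Illustrative one-shot dispatcher: scan, validate, route, archive, exit.
PENDING=$(mktemp -d)
ARCHIVE=$(mktemp -d)
echo '{"event_type":"worker.completed"}' > "$PENDING/evt_demo.json"  # demo event

for event in "$PENDING"/*.json; do
  [ -e "$event" ] || continue   # nothing pending: loop body never runs

  # "Validate": crude event_type extraction stands in for the schema check
  type=$(sed -n 's/.*"event_type" *: *"\([^"]*\)".*/\1/p' "$event")
  [ -n "$type" ] || { echo "invalid: $event" >&2; continue; }

  # Route by type prefix, mirroring the worker.*/routing.*/... patterns above
  case "$type" in
    worker.*)   handler=worker-handler ;;
    routing.*)  handler=routing-handler ;;
    security.*) handler=security-handler ;;
    system.*)   handler=system-handler ;;
    task.*)     handler=task-handler ;;
    *)          handler=default-handler ;;
  esac
  echo "routing $type -> $handler"

  mv "$event" "$ARCHIVE/"   # archive the processed event
done
# Script ends here: no while-true, no sleep, no daemon.
```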
The AI Notebook Revolution
Here’s where it gets interesting. Instead of building traditional dashboards (Grafana, Datadog, custom UIs), we built AI-powered notebooks using Marimo.
Why Notebooks Instead of Dashboards?
Traditional dashboards:
- Static visualizations you have to interpret
- Pre-defined queries you can’t modify
- Click through 10 panels to find what you need
- No intelligence—just data display
AI notebooks:
- Interactive analysis with live code
- Ask questions in natural language
- Modify queries on the fly
- AI explains what the data means
Three Analysis Notebooks
1. Routing Optimization (analysis/routing-optimization.py)
Analyzes MoE (Mixture of Experts) routing decisions:
- Which masters handle which task types?
- What’s the confidence distribution?
- Where are routing failures occurring?
- AI-suggested optimizations
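The confidence-distribution question, for instance, can be answered straight from the event log. A sketch against a made-up JSONL log (the confidence field name and the bucket cutoffs are assumptions for illustration, not the documented payload):

```shell
# Hypothetical: bucket routing.decision confidence values from a JSONL log.
log=$(mktemp)
printf '%s\n' \
  '{"event_type":"routing.decision","confidence":0.92}' \
  '{"event_type":"routing.decision","confidence":0.41}' \
  '{"event_type":"routing.decision","confidence":0.88}' > "$log"

histogram=$(awk -F'"confidence":' '/routing\.decision/ {
  c = $2 + 0                    # numeric coercion strips the trailing "}"
  bucket = (c >= 0.8) ? "high" : (c >= 0.5) ? "medium" : "low"
  count[bucket]++
}
END { for (b in count) print b, count[b] }' "$log")
echo "$histogram"
```

The notebooks do the same kind of aggregation interactively, with the AI layer explaining the shape of the distribution.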
2. Security Dashboard (analysis/security-dashboard.py)
Real-time vulnerability analysis:
- CVE severity breakdown
- Dependency risk scores
- Remediation priorities
- Trend analysis with AI insights
3. Worker Performance (analysis/worker-performance.py)
Worker health and efficiency:
- Task completion rates
- Duration distributions
- Failure pattern detection
- Capacity planning recommendations
Launch in One Command
./scripts/launch-marimo-dashboard.sh
# Opens interactive notebook with:
# - Live data from event logs
# - AI-powered analysis
# - Editable visualizations
# - Export to reports
Quarto Reports: AI-Generated Documentation
Beyond notebooks, we added Quarto reports for automated documentation:
Weekly Summary (reports/weekly-summary.qmd):
- Automated rollup of all events
- Performance trends
- Anomaly highlights
- AI-written executive summary
Security Audit (reports/security-audit.qmd):
- CVE inventory with remediation status
- Compliance checklist
- Risk assessment
- Generated remediation tickets
Cost Report (reports/cost-report.qmd):
- Resource utilization analysis
- Cost-per-task breakdown
- Optimization recommendations
- Budget forecasting
Generate any report with:
quarto render reports/weekly-summary.qmd
# Produces HTML/PDF with live data
Implementation: 4 Phases in Parallel
We built this in 30 minutes by running 4 phases simultaneously:
Phase 1: Event Infrastructure
- Event schema (17 types)
- Validator with JSON Schema
- Logger for creating events
- Dispatcher for processing
Phase 2: Handler Network
- 7 handlers replacing 18 daemons
- Pattern-matched routing
- Output capture
- Automatic archival
Phase 3: AI Observability
- 3 Marimo notebooks
- 3 Quarto reports
- Python environment setup
- GitHub Actions integration
Phase 4: System Integration
- Automated setup script
- Dashboard launcher
- Comprehensive test suite
- Full documentation
The Numbers
Before (Daemon Architecture):
- 18 processes running 24/7
- ~15% CPU idle usage
- 30-60s response to events
- Complex startup/shutdown
- Manual monitoring required
After (Event-Driven):
- 0 processes running idle
- <1% CPU idle usage
- <1s response to events
- Instant startup/shutdown
- AI-powered monitoring
Code Impact:
- 36 files added/modified
- 4,649 lines of new code
- 7 event handlers replace 18 daemons
- 17 event types covering all operations
- 3 AI notebooks for interactive analysis
- 3 Quarto reports for automated documentation
The Philosophy: Compute When Needed
The daemon model assumes you need continuous monitoring. But most systems are idle most of the time. Why burn CPU cycles checking “did anything change?” when the thing that changed can simply tell you?
Event-driven computing:
- Events announce changes → no polling needed
- Handlers run on-demand → no idle processes
- AI analyzes patterns → no manual monitoring
- Reports generate automatically → no dashboard building
This isn’t just an optimization. It’s a different way of thinking about system architecture. Compute happens when needed, not just in case.
Getting Started
The event-driven system is live and processing events. Here’s how to use it:
# 1. Set up automated processing
./scripts/setup-event-processing.sh
# 2. Create an event
./scripts/events/lib/event-logger.sh --create \
"worker.completed" "my-worker" '{"status":"ok"}' "task-1" "medium" > /tmp/event.json
./scripts/events/lib/event-logger.sh "$(cat /tmp/event.json)"
# 3. Process pending events
./scripts/events/event-dispatcher.sh
# 4. Launch AI dashboard
./scripts/launch-marimo-dashboard.sh
# 5. Generate reports
quarto render reports/weekly-summary.qmd
What’s Next
This event-driven foundation enables:
- Real-time streaming - WebSocket event delivery
- External integrations - GitHub webhooks, Slack alerts
- Machine learning - Pattern detection on event streams
- Distributed processing - Event routing across clusters
The daemon is dead. Long live the event.
“The best process is no process. Let the events tell you what happened.”
— Ryan Dahlberg, Ry-Ops, December 2025