
Zero Daemons: How Event-Driven Architecture Cut Our CPU Usage by 93%

Ryan Dahlberg
December 1, 2025 · 8 min read

We just deployed an event-driven architecture that eliminated 18 background daemons from Cortex. The result? Our idle CPU dropped from ~15% to less than 1%. Response times went from 30-60 seconds to under a second. And we replaced traditional dashboards with AI-powered notebooks that actually understand your data.

The transformation in 30 minutes:

Metric      Before        After    Improvement
Processes   18 daemons    0        100% reduction
CPU         ~15%          <1%      93% reduction
Response    30-60s        <1s      60x faster

The Problem with Daemons

Traditional monitoring and orchestration systems love daemons: background processes that sit idle 99% of the time, waking up periodically to check whether anything changed. It’s the polling pattern applied to everything:

# The old way: 18 of these running constantly
while true; do
    check_worker_health
    sleep 30
done

while true; do
    poll_routing_updates
    sleep 60
done

while true; do
    aggregate_metrics
    sleep 120
done

The problems compound:

  • Wasted compute: 18 processes × 30-second loops = constant CPU churn
  • Memory pressure: Each daemon holds state, connections, buffers
  • Complexity: Startup ordering, health checks, crash recovery for each
  • Scaling nightmare: More features = more daemons = more overhead
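The waste is easy to quantify with back-of-the-envelope arithmetic. A sketch, assuming a uniform 30-second poll interval (the real daemons used a mix of 30, 60, and 120 seconds):

```shell
# 18 daemons, each waking every 30 seconds, over one day (86,400 seconds)
daemons=18
interval=30
wakeups_per_day=$(( daemons * 86400 / interval ))
echo "$wakeups_per_day"   # 51840 wakeups a day, almost all finding nothing changed
```

Each wakeup is a context switch, a cache refill, and often a network round trip, regardless of whether anything happened.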

The Event-Driven Alternative

Instead of processes watching for changes, we flip the model: changes announce themselves.

graph LR
    A[Event Created] --> B[Event<br/>Validator]
    B --> C{Valid?}
    C -->|Yes| D[Event Queue]
    C -->|No| E[Reject]
    D --> F[Event<br/>Dispatcher]
    F --> G{Route Event}
    G --> H1[Handler 1]
    G --> H2[Handler 2]
    G --> H3[Handler N]
    H1 --> I[Capture Output]
    H2 --> I
    H3 --> I
    I --> J[Archive Event]
    J --> K[Process Exits]

    style A fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style B fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style C fill:#30363d,stroke:#f85149,stroke-width:2px
    style D fill:#30363d,stroke:#00d084,stroke-width:2px
    style E fill:#30363d,stroke:#f85149,stroke-width:2px
    style F fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style G fill:#30363d,stroke:#f85149,stroke-width:2px
    style H1 fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style H2 fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style H3 fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style I fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style J fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style K fill:#30363d,stroke:#00d084,stroke-width:2px

Zero processes running between events. Handlers spawn on-demand, exit when done.

17 Event Types, 7 Handlers

We defined a comprehensive event schema covering everything Cortex does:

graph TD
    A[17 Event Types] --> B[Worker Events]
    A --> C[Routing Events]
    A --> D[Security Events]
    A --> E[System Events]
    A --> F[Task Events]

    B --> B1[worker.started]
    B --> B2[worker.completed]
    B --> B3[worker.failed]
    B --> B4[worker.heartbeat]

    C --> C1[routing.decision]
    C --> C2[routing.update]
    C --> C3[routing.fallback]

    D --> D1[security.scan_started]
    D --> D2[security.vulnerability_found]
    D --> D3[security.scan_completed]

    E --> E1[system.startup]
    E --> E2[system.shutdown]
    E --> E3[system.health_check]

    F --> F1[task.created]
    F --> F2[task.assigned]
    F --> F3[task.completed]
    F --> F4[task.failed]

    style A fill:#30363d,stroke:#58a6ff,stroke-width:3px
    style B fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style C fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style D fill:#30363d,stroke:#f85149,stroke-width:2px
    style E fill:#30363d,stroke:#00d084,stroke-width:2px
    style F fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style B1 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style B2 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style B3 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style B4 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style C1 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style C2 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style C3 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style D1 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style D2 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style D3 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style E1 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style E2 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style E3 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style F1 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style F2 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style F3 fill:#30363d,stroke:#8b949e,stroke-width:1px
    style F4 fill:#30363d,stroke:#8b949e,stroke-width:1px

Event Categories:

  • Worker Events: worker.started, worker.completed, worker.failed, worker.heartbeat
  • Routing Events: routing.decision, routing.update, routing.fallback
  • Security Events: security.scan_started, security.vulnerability_found, security.scan_completed
  • System Events: system.startup, system.shutdown, system.health_check
  • Task Events: task.created, task.assigned, task.completed, task.failed
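Because every event type is namespaced as category.action, a handler can accept or reject an event with a simple prefix check. A minimal sketch (the function name is illustrative, not from the Cortex codebase):

```shell
# Accept only event types belonging to one of the five known categories
valid_event_type() {
    case "$1" in
        worker.*|routing.*|security.*|system.*|task.*) return 0 ;;
        *) return 1 ;;
    esac
}

valid_event_type "worker.completed" && echo "accepted"
valid_event_type "billing.created"  || echo "rejected"
```

The same prefix convention is what lets a single dispatcher route all 17 types with one pattern match, as shown in the next section.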

Each event carries structured metadata:

{
  "event_id": "evt_20251201_093827_85c15a3d1640",
  "event_type": "worker.completed",
  "source": "development-master",
  "timestamp": "2025-12-01T09:38:27-06:00",
  "payload": {
    "task_id": "task-12345",
    "duration_ms": 4523,
    "result": "success"
  },
  "metadata": {
    "correlation_id": "corr-abc123",
    "priority": "medium"
  }
}
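The real validator checks events against a JSON Schema. As a rough approximation of the field-presence part only, the required top-level keys can be probed with jq (assuming jq is installed; this is a sketch, not the project's validator):

```shell
# Succeed only if every required top-level field is present in the event JSON.
# jq -e sets the exit status from the boolean result of the expression.
has_required_fields() {
    jq -e 'has("event_id") and has("event_type") and has("source")
           and has("timestamp") and has("payload")' >/dev/null <<< "$1"
}
```

A full schema check would also validate field formats (the timestamp, the evt_ id prefix) and per-type payload shapes, which is what JSON Schema handles.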

The Dispatcher Pattern

A single dispatcher replaces all 18 daemons:

graph TD
    A[Dispatcher Start] --> B[Scan for<br/>Pending Events]
    B --> C{Events Found?}
    C -->|No| Z[Exit]
    C -->|Yes| D[Validate Event<br/>Schema]
    D --> E{Valid?}
    E -->|No| F[Log Error]
    E -->|Yes| G{Route by Type}
    F --> B

    G -->|worker.*| H1[Worker Handler]
    G -->|routing.*| H2[Routing Handler]
    G -->|security.*| H3[Security Handler]
    G -->|system.*| H4[System Handler]
    G -->|task.*| H5[Task Handler]

    H1 --> I[Execute Handler]
    H2 --> I
    H3 --> I
    H4 --> I
    H5 --> I

    I --> J[Capture Output]
    J --> K[Record Metrics]
    K --> L[Archive Event]
    L --> B

    style A fill:#30363d,stroke:#00d084,stroke-width:2px
    style B fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style C fill:#30363d,stroke:#f85149,stroke-width:2px
    style D fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style E fill:#30363d,stroke:#f85149,stroke-width:2px
    style F fill:#30363d,stroke:#f85149,stroke-width:2px
    style G fill:#30363d,stroke:#f85149,stroke-width:2px
    style H1 fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style H2 fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style H3 fill:#30363d,stroke:#f85149,stroke-width:2px
    style H4 fill:#30363d,stroke:#00d084,stroke-width:2px
    style H5 fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style I fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style J fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style K fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style L fill:#30363d,stroke:#58a6ff,stroke-width:2px
    style Z fill:#30363d,stroke:#00d084,stroke-width:2px

Process all pending events in milliseconds:

# Process all pending events
./scripts/events/event-dispatcher.sh

# Events are routed to appropriate handlers
# Handlers execute, capture output, archive
# Total runtime: milliseconds, not minutes

The dispatcher workflow:

  1. Scans for pending events
  2. Validates each against the schema
  3. Routes to the appropriate handler
  4. Captures output and metrics
  5. Archives processed events
  6. Exits (no daemon!)
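Steps 1-3 can be sketched as a prefix-matched dispatch loop. The directory layout, handler names, and jq usage below are assumptions for illustration, not the actual scripts:

```shell
# Map an event type to its handler by category prefix
route_event() {
    case "$1" in
        worker.*)   echo "worker-handler"   ;;
        routing.*)  echo "routing-handler"  ;;
        security.*) echo "security-handler" ;;
        system.*)   echo "system-handler"   ;;
        task.*)     echo "task-handler"     ;;
        *)          echo "unknown"          ;;
    esac
}

# Drain the pending queue, then fall through to exit: no resident process
for event in events/pending/*.json; do
    [ -e "$event" ] || break              # queue empty, nothing to do
    type=$(jq -r '.event_type' "$event")  # assumes jq for field extraction
    echo "dispatching $type to $(route_event "$type")"
done
```

The key property is the final break: when the queue is empty the loop body never runs, so an idle system costs nothing.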

The AI Notebook Revolution

Here’s where it gets interesting. Instead of building traditional dashboards (Grafana, Datadog, custom UIs), we built AI-powered notebooks using Marimo.

Why Notebooks Instead of Dashboards?

Traditional dashboards:

  • Static visualizations you have to interpret
  • Pre-defined queries you can’t modify
  • Click through 10 panels to find what you need
  • No intelligence—just data display

AI notebooks:

  • Interactive analysis with live code
  • Ask questions in natural language
  • Modify queries on the fly
  • AI explains what the data means

Three Analysis Notebooks

1. Routing Optimization (analysis/routing-optimization.py)

Analyzes MoE (Mixture of Experts) routing decisions:

  • Which masters handle which task types?
  • What’s the confidence distribution?
  • Where are routing failures occurring?
  • AI-suggested optimizations

2. Security Dashboard (analysis/security-dashboard.py)

Real-time vulnerability analysis:

  • CVE severity breakdown
  • Dependency risk scores
  • Remediation priorities
  • Trend analysis with AI insights

3. Worker Performance (analysis/worker-performance.py)

Worker health and efficiency:

  • Task completion rates
  • Duration distributions
  • Failure pattern detection
  • Capacity planning recommendations
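Metrics like completion rate fall straight out of the event stream. A hedged sketch that tallies worker outcomes from a list of event types on stdin (awk only; the real notebook reads the archived JSON events):

```shell
# Read event types on stdin; print completed / (completed + failed) as an
# integer percentage. Heartbeat and started events are ignored.
completion_rate() {
    awk '
        /^worker\.completed$/ { completed++ }
        /^worker\.failed$/    { failed++ }
        END {
            total = completed + failed
            if (total > 0) printf "%d\n", completed * 100 / total
            else           print 0
        }'
}

printf 'worker.completed\nworker.completed\nworker.failed\n' | completion_rate  # prints 66
```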

Launch in One Command

./scripts/launch-marimo-dashboard.sh

# Opens interactive notebook with:
# - Live data from event logs
# - AI-powered analysis
# - Editable visualizations
# - Export to reports

Quarto Reports: AI-Generated Documentation

Beyond notebooks, we added Quarto reports for automated documentation:

Weekly Summary (reports/weekly-summary.qmd):

  • Automated rollup of all events
  • Performance trends
  • Anomaly highlights
  • AI-written executive summary

Security Audit (reports/security-audit.qmd):

  • CVE inventory with remediation status
  • Compliance checklist
  • Risk assessment
  • Generated remediation tickets

Cost Report (reports/cost-report.qmd):

  • Resource utilization analysis
  • Cost-per-task breakdown
  • Optimization recommendations
  • Budget forecasting

Generate any report with:

quarto render reports/weekly-summary.qmd
# Produces HTML/PDF with live data

Implementation: 4 Phases in Parallel

We built this in 30 minutes by running 4 phases simultaneously:

Phase 1: Event Infrastructure

  • Event schema (17 types)
  • Validator with JSON Schema
  • Logger for creating events
  • Dispatcher for processing

Phase 2: Handler Network

  • 7 handlers replacing 18 daemons
  • Pattern-matched routing
  • Output capture
  • Automatic archival

Phase 3: AI Observability

  • 3 Marimo notebooks
  • 3 Quarto reports
  • Python environment setup
  • GitHub Actions integration

Phase 4: System Integration

  • Automated setup script
  • Dashboard launcher
  • Comprehensive test suite
  • Full documentation

The Numbers

Before (Daemon Architecture):

  • 18 processes running 24/7
  • ~15% CPU usage while idle
  • 30-60s response to events
  • Complex startup/shutdown
  • Manual monitoring required

After (Event-Driven):

  • 0 processes running while idle
  • <1% CPU usage while idle
  • <1s response to events
  • Instant startup/shutdown
  • AI-powered monitoring

Code Impact:

  • 36 files added/modified
  • 4,649 lines of new code
  • 7 event handlers replace 18 daemons
  • 17 event types covering all operations
  • 3 AI notebooks for interactive analysis
  • 3 Quarto reports for automated documentation

The Philosophy: Compute When Needed

The daemon model assumes you need continuous monitoring. But most systems are idle most of the time. Why burn CPU cycles checking “did anything change?” when the thing that changed can simply tell you?

Event-driven computing:

  • Events announce changes → no polling needed
  • Handlers run on-demand → no idle processes
  • AI analyzes patterns → no manual monitoring
  • Reports generate automatically → no dashboard building

This isn’t just an optimization. It’s a different way of thinking about system architecture. Compute happens when needed, not just in case.
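Something still has to invoke the dispatcher when events arrive. One possible wiring, offered purely as an assumption since setup-event-processing.sh may use a different mechanism, is a systemd path unit: systemd itself watches the queue directory, so no Cortex process exists until an event file appears, and the triggered service exits as soon as the queue drains.

```shell
# Illustrative systemd units (paths are hypothetical, not the real layout)
#
# /etc/systemd/system/cortex-events.path
#   [Path]
#   PathExistsGlob=/opt/cortex/events/pending/*.json
#   [Install]
#   WantedBy=multi-user.target
#
# /etc/systemd/system/cortex-events.service
#   [Service]
#   Type=oneshot
#   ExecStart=/opt/cortex/scripts/events/event-dispatcher.sh
```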

Getting Started

The event-driven system is live and processing events. Here’s how to use it:

# 1. Set up automated processing
./scripts/setup-event-processing.sh

# 2. Create an event
./scripts/events/lib/event-logger.sh --create \
    "worker.completed" "my-worker" '{"status":"ok"}' "task-1" "medium" > /tmp/event.json
./scripts/events/lib/event-logger.sh "$(cat /tmp/event.json)"

# 3. Process pending events
./scripts/events/event-dispatcher.sh

# 4. Launch AI dashboard
./scripts/launch-marimo-dashboard.sh

# 5. Generate reports
quarto render reports/weekly-summary.qmd

What’s Next

This event-driven foundation enables:

  1. Real-time streaming - WebSocket event delivery
  2. External integrations - GitHub webhooks, Slack alerts
  3. Machine learning - Pattern detection on event streams
  4. Distributed processing - Event routing across clusters

The daemon is dead. Long live the event.


“The best process is no process. Let the events tell you what happened.”

— Ryan Dahlberg, Ry-Ops, December 2025

#architecture #infrastructure #performance #Cortex #event-driven-architecture #ai-notebooks