The Learning Loop: How Cortex Improves Itself

Ryan Dahlberg
December 12, 2025 8 min read

Yesterday, we explored how Cortex routes tasks. Today, let’s dive into the magic that makes it self-improving: the learning loop.

The Core Insight

Most automation systems are static. They do exactly what you program them to do, forever.

Cortex is different. Every task execution makes it smarter.

Task 1: Confidence 0.70 → Success
Task 10: Confidence 0.82 → Success
Task 100: Confidence 0.92 → Success

The system learns.

The Learning Loop

Step 1: Execute Task

A task completes with these data points:

{
  "task_id": "task-001",
  "description": "Fix authentication bug",
  "routing": {
    "selected_master": "development-master",
    "confidence": 0.85,
    "alternatives": {
      "security-master": 0.65,
      "cicd-master": 0.15
    }
  },
  "execution": {
    "status": "completed",
    "duration_minutes": 18,
    "worker_id": "implementation-worker-001"
  },
  "outcome": {
    "success": true,
    "quality_score": 0.92,
    "tests_passing": true,
    "code_review_score": 0.89
  }
}

Step 2: Extract Patterns

Identify what made this task succeed:

function extractPattern(task) {
  return {
    // Task characteristics
    keywords: extractKeywords(task.description),
    category: categorizeTask(task.description),
    complexity: estimateComplexity(task.description),
    domains: identifyDomains(task.description),

    // Routing decision
    selected_master: task.routing.selected_master,
    routing_confidence: task.routing.confidence,

    // Outcome
    success: task.outcome.success,
    quality: task.outcome.quality_score,
    duration: task.execution.duration_minutes,

    // Metadata
    timestamp: new Date(),
    task_count: incrementTaskCount(task.routing.selected_master)
  };
}

Example Pattern:

{
  keywords: ['fix', 'authentication', 'bug'],
  category: 'bugfix',
  complexity: 'medium',
  domains: ['development', 'security'],
  selected_master: 'development-master',
  routing_confidence: 0.85,
  success: true,
  quality: 0.92,
  duration: 18,
  timestamp: '2025-11-26T10:18:00Z',
  task_count: 23
}
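The helper functions called by extractPattern are not shown in this post. As a rough idea of what the simplest one could look like, here is a minimal sketch of extractKeywords. The stop-word list and the tokenization rule are my assumptions for illustration, not the actual Cortex implementation.

```javascript
// Hypothetical stop-word list; the real list used by Cortex is not shown here.
const STOP_WORDS = new Set(['a', 'an', 'the', 'for', 'to', 'of', 'in', 'on', 'with', 'and']);

// Assumed behavior: lowercase the text, split on non-alphanumeric characters,
// then drop stop words and duplicates while keeping the original order.
function extractKeywords(description) {
  const tokens = description.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  return [...new Set(tokens.filter(token => !STOP_WORDS.has(token)))];
}

console.log(extractKeywords('Fix authentication bug'));
// → [ 'fix', 'authentication', 'bug' ]
```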

Step 3: Update Confidence

Adjust confidence for similar future tasks:

function updateConfidence(pattern, historicalPatterns) {
  // Find similar historical patterns
  const similar = historicalPatterns.filter(p =>
    similarity(p.keywords, pattern.keywords) > 0.7
  );

  if (similar.length === 0) {
    // New pattern, store as-is
    return pattern;
  }

  // Include the current task so the stats line up with sample_size below
  const observed = [...similar, pattern];

  // Calculate success rate for this pattern type
  const successfulTasks = observed.filter(p => p.success).length;
  const totalTasks = observed.length;
  const successRate = successfulTasks / totalTasks;

  // Calculate average quality
  const avgQuality = average(observed.map(p => p.quality));

  // New confidence = (success_rate * 0.7) + (avg_quality * 0.3)
  const newConfidence = (successRate * 0.7) + (avgQuality * 0.3);

  return {
    ...pattern,
    updated_confidence: newConfidence,
    sample_size: totalTasks,
    success_rate: successRate
  };
}

Confidence Evolution:

Task 1:  confidence: 0.70, success_rate: 1/1   = 1.00
Task 5:  confidence: 0.78, success_rate: 5/5   = 1.00
Task 10: confidence: 0.82, success_rate: 9/10  = 0.90
Task 20: confidence: 0.87, success_rate: 18/20 = 0.90
Task 50: confidence: 0.92, success_rate: 47/50 = 0.94
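updateConfidence relies on two helpers, similarity and average, that are not defined above. The sketch below fills them in under stated assumptions: similarity is taken to be Jaccard overlap between keyword lists (one plausible choice, not necessarily what Cortex uses), and blendConfidence is a hypothetical name for the 70/30 blend of success rate and average quality.

```javascript
// Assumed similarity measure: Jaccard overlap of two keyword lists (0 to 1).
function similarity(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  const shared = [...setA].filter(k => setB.has(k)).length;
  const union = new Set([...setA, ...setB]).size;
  return union === 0 ? 0 : shared / union;
}

function average(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Hypothetical helper: the same 70/30 blend used in updateConfidence.
function blendConfidence(tasks) {
  const successRate = tasks.filter(t => t.success).length / tasks.length;
  const avgQuality = average(tasks.map(t => t.quality));
  return successRate * 0.7 + avgQuality * 0.3;
}

// Made-up history: three successes and one failure.
const history = [
  { success: true, quality: 0.9 },
  { success: true, quality: 0.8 },
  { success: false, quality: 0.4 },
  { success: true, quality: 0.9 },
];

console.log(similarity(['fix', 'authentication', 'bug'], ['fix', 'authentication', 'token']));
console.log(blendConfidence(history).toFixed(2));
```

With plain Jaccard overlap, two three-word descriptions that differ by one word score 0.5, which is below the 0.7 threshold. A production similarity function would probably need stemming or keyword weighting to be more forgiving.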

Step 4: Store Knowledge

Persist the learned pattern:

// coordination/knowledge-base/routing-patterns.jsonl
{"keywords":["fix","auth","bug"],"master":"development","confidence":0.92,"sample_size":47}
{"keywords":["scan","CVE","vulnerability"],"master":"security","confidence":0.94,"sample_size":31}
{"keywords":["deploy","production","release"],"master":"cicd","confidence":0.89,"sample_size":19}

JSONL format enables:

  • Append-only writes (fast)
  • Easy pattern matching (grep)
  • Simple backup (just copy file)
  • Version control friendly
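To make the append-only idea concrete, here is a minimal Node.js sketch that writes patterns as one JSON object per line and reads them back. The function names appendPattern and loadPatterns are mine, and the demo writes to a temporary directory rather than the real knowledge base.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Append one pattern as a single JSON line (hypothetical helper name).
function appendPattern(file, pattern) {
  fs.appendFileSync(file, JSON.stringify(pattern) + '\n');
}

// Read every pattern back, skipping blank lines (hypothetical helper name).
function loadPatterns(file) {
  return fs.readFileSync(file, 'utf8')
    .split('\n')
    .filter(line => line.trim() !== '')
    .map(line => JSON.parse(line));
}

// Demo against a temp file so nothing real is touched.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cortex-'));
const file = path.join(dir, 'routing-patterns.jsonl');

appendPattern(file, { keywords: ['fix', 'auth', 'bug'], master: 'development', confidence: 0.92, sample_size: 47 });
appendPattern(file, { keywords: ['scan', 'CVE', 'vulnerability'], master: 'security', confidence: 0.94, sample_size: 31 });

console.log(loadPatterns(file).map(p => p.master));
// → [ 'development', 'security' ]
```

Because each write only appends a line, an interrupted write can damage at most the last record, and earlier lines stay readable.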

Step 5: Apply Learning

The next similar task uses the updated confidence:

New task: "Fix authentication token expiration"

Pattern matching finds:
  Pattern: "fix auth bug" (47 samples)
  Master: development-master
  Confidence: 0.92 (learned from history!)

Route with high confidence → Faster, better decisions
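The lookup step could be sketched like this. Everything here is an assumption for illustration: matchScore uses prefix matching so that a stored keyword like "auth" matches "authentication", and the 0.6 threshold is a placeholder value, not a Cortex setting.

```javascript
// Assumed scoring: fraction of the stored pattern's keywords that appear
// as a prefix of some keyword in the new task.
function matchScore(patternKeywords, taskKeywords) {
  if (patternKeywords.length === 0) return 0;
  const hits = patternKeywords.filter(pk =>
    taskKeywords.some(tk => tk.startsWith(pk.toLowerCase()))
  ).length;
  return hits / patternKeywords.length;
}

// Return the best-scoring stored pattern, or null if nothing clears the threshold.
function findBestPattern(taskKeywords, patterns, threshold = 0.6) {
  let best = null;
  for (const p of patterns) {
    const score = matchScore(p.keywords, taskKeywords);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { ...p, score };
    }
  }
  return best;
}

const patterns = [
  { keywords: ['fix', 'auth', 'bug'], master: 'development', confidence: 0.92, sample_size: 47 },
  { keywords: ['scan', 'CVE', 'vulnerability'], master: 'security', confidence: 0.94, sample_size: 31 },
  { keywords: ['deploy', 'production', 'release'], master: 'cicd', confidence: 0.89, sample_size: 19 },
];

// Keywords for "Fix authentication token expiration"
const match = findBestPattern(['fix', 'authentication', 'token', 'expiration'], patterns);
console.log(match.master, match.confidence);
// → development 0.92
```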

Real Learning Example

Let’s trace how Cortex learned to handle security tasks:

Initial State (Day 1)

Task: "Scan for CVE-2024-001"
Patterns: [] (no history)
Routing:
  security-master: 0.60 (domain match only)
  development-master: 0.45

Selected: security-master (0.60)
Result: Success (quality: 0.88, duration: 7 min)

Learned:
  "CVE scan" → security-master works!
  Sample size: 1

After 5 Similar Tasks (Day 3)

Task: "Scan for CVE-2024-005"
Patterns: 4 similar tasks (all successful)
Routing:
  security-master: 0.82 (domain + patterns)
  development-master: 0.35

Selected: security-master (0.82)
Result: Success (quality: 0.92, duration: 6 min)

Learned:
  "CVE scan" → security-master confidence: 0.82
  Success rate: 5/5 = 1.00
  Avg quality: 0.90
  Sample size: 5

After 20 Similar Tasks (Day 10)

Task: "Scan for CVE-2024-020"
Patterns: 19 similar tasks (18 successful, 1 failed)
Routing:
  security-master: 0.94 (high confidence!)
  development-master: 0.25

Selected: security-master (0.94)
Result: Success (quality: 0.95, duration: 5 min)

Learned:
  "CVE scan" → security-master confidence: 0.94
  Success rate: 19/20 = 0.95
  Avg quality: 0.92
  Avg duration: 6.2 min (getting faster!)
  Sample size: 20

The system learned: CVE scanning → Security Master, with 94% confidence.

Handling Failures

What happens when routing fails?

Failure Scenario

Task: "Implement OAuth2 integration"
Routing:
  development-master: 0.78

Result: FAILED (worker timeout)

Learning:
  Don't just penalize development-master
  Analyze WHY it failed

Failure Analysis

function analyzeFailure(task, result) {
  return {
    failure_type: result.failure_type, // timeout, error, quality
    root_cause: analyzeRootCause(task, result),
    was_routing_error: wasRoutingWrong(task, result),
    complexity_mismatch: wasTaskTooComplex(task, result)
  };
}

Example Analysis:

{
  failure_type: 'timeout',
  root_cause: 'task_complexity_too_high',
  was_routing_error: false, // Right master
  complexity_mismatch: true  // But task too complex
}

Adjusted Learning

// Don't penalize development-master for complex tasks
// Instead: Learn that OAuth tasks need more time/resources

Pattern updated:
  "OAuth integration" tasks:
    - Estimated complexity: HIGH (not medium)
    - Estimated duration: 60+ min (not 20 min)
    - Worker type: Implementation (with extended timeout)
    - Confidence: 0.75 (appropriate for complex task)
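One way to turn that analysis into a pattern update is sketched below. The function name applyFailure, the pattern fields, the 0.1 penalty, and the 3x duration multiplier are all hypothetical values chosen for illustration. Unlike the example above, this sketch leaves confidence unchanged when the routing was right.

```javascript
// Hypothetical update rule: penalize routing only when routing was the problem.
function applyFailure(pattern, analysis) {
  if (analysis.was_routing_error) {
    // Wrong master: lower routing confidence, but keep it above a 0.15 floor.
    return { ...pattern, confidence: Math.max(0.15, pattern.confidence - 0.1) };
  }
  if (analysis.complexity_mismatch) {
    // Right master, underestimated task: raise the estimates instead.
    return {
      ...pattern,
      complexity: 'high',
      estimated_duration: pattern.estimated_duration * 3,
      extended_timeout: true,
    };
  }
  return pattern; // Other causes: leave the pattern alone
}

const oauthPattern = {
  keywords: ['oauth', 'integration'],
  master: 'development',
  confidence: 0.78,
  complexity: 'medium',
  estimated_duration: 20, // minutes
};

const analysis = {
  failure_type: 'timeout',
  root_cause: 'task_complexity_too_high',
  was_routing_error: false,
  complexity_mismatch: true,
};

console.log(applyFailure(oauthPattern, analysis));
```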

Multi-Master Learning

Some tasks need multiple masters. Learn these patterns too:

Task: “Implement rate limiting with security audit”

Execution Flow:

1. Development-Master implements (30 min)
2. Security-Master audits (15 min)
3. CI/CD-Master deploys (10 min)

Total: 55 minutes
Success: true
Quality: 0.94

Learned Pattern:

{
  keywords: ['implement', 'security'],
  workflow_type: 'sequential_multi_master',
  masters: [
    {order: 1, master: 'development', duration: 30},
    {order: 2, master: 'security', duration: 15},
    {order: 3, master: 'cicd', duration: 10}
  ],
  total_duration: 55,
  success_rate: 1.0,
  sample_size: 1
}

Next Similar Task:

Task: "Implement authentication with security review"

Pattern match found!
Route automatically:
  1. Development (predicted: 30 min)
  2. Security (predicted: 15 min)
  3. CI/CD (predicted: 10 min)

Confidence: 0.85 (good pattern match)
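Turning a learned multi-master pattern into a routing plan could look roughly like this. planWorkflow is a hypothetical helper name, and the pattern shape is copied from the learned pattern above.

```javascript
const learned = {
  keywords: ['implement', 'security'],
  workflow_type: 'sequential_multi_master',
  masters: [
    { order: 1, master: 'development', duration: 30 },
    { order: 2, master: 'security', duration: 15 },
    { order: 3, master: 'cicd', duration: 10 },
  ],
};

// Build an ordered list of masters and a total time estimate from the pattern.
function planWorkflow(pattern) {
  const steps = [...pattern.masters].sort((a, b) => a.order - b.order);
  return {
    steps: steps.map(s => `${s.master}-master`),
    predicted_minutes: steps.reduce((sum, s) => sum + s.duration, 0),
  };
}

console.log(planWorkflow(learned));
// → { steps: [ 'development-master', 'security-master', 'cicd-master' ], predicted_minutes: 55 }
```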

Meta-Learning

Cortex also learns how to learn better:

Learning Rate Adjustment

function adjustLearningRate(master, recentPerformance) {
  // If master very stable, learn slower (more conservative)
  if (recentPerformance.variance < 0.05) {
    return 0.1; // Small adjustments
  }

  // If master unstable, learn faster (adapt quickly)
  if (recentPerformance.variance > 0.15) {
    return 0.3; // Larger adjustments
  }

  return 0.2; // Default
}
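The post does not show how the returned rate is used. A common approach, and the one I assume in this sketch, is an exponential moving average: move the stored confidence a fraction of the way toward each new observation. The names variance and applyUpdate are hypothetical.

```javascript
// Population variance of recent scores, e.g. as input for recentPerformance.variance.
function variance(values) {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
}

// Assumed update rule: move `current` a fraction `rate` of the way toward `observed`.
function applyUpdate(current, observed, rate) {
  return current + rate * (observed - current);
}

// A stable master (rate 0.1) barely moves; an unstable one (rate 0.3) adapts faster.
console.log(applyUpdate(0.80, 1.0, 0.1).toFixed(2)); // → 0.82
console.log(applyUpdate(0.80, 1.0, 0.3).toFixed(2)); // → 0.86

// Alternating success and failure gives the maximum variance for 0/1 scores.
console.log(variance([1, 0, 1, 0])); // → 0.25
```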

Pattern Decay

Old patterns lose relevance:

function applyTimeDecay(pattern, currentDate) {
  // timestamp may be stored as an ISO string, so parse it before subtracting
  const ageInDays = (currentDate - new Date(pattern.timestamp)) / (1000 * 60 * 60 * 24);

  // Exponential decay with a 30-day half-life: 50% relevance after 30 days
  const decayFactor = Math.pow(0.5, ageInDays / 30);

  return {
    ...pattern,
    effective_confidence: pattern.confidence * decayFactor
  };
}

Patterns from 60 days ago have 25% of their original weight.
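A quick check of the numbers, as a sketch: a 30-day half-life means the weight halves every 30 days, which is Math.pow(0.5, ageInDays / 30). By contrast, Math.exp(-ageInDays / 30) would leave only about 37% after 30 days. The helper name halfLifeDecay is mine.

```javascript
// Weight remaining after `ageInDays`, given a half-life in days.
function halfLifeDecay(ageInDays, halfLifeDays = 30) {
  return Math.pow(0.5, ageInDays / halfLifeDays);
}

for (const days of [0, 30, 60, 90]) {
  console.log(`${days} days: ${halfLifeDecay(days)}`);
}
// 0 days: 1
// 30 days: 0.5
// 60 days: 0.25
// 90 days: 0.125

// For comparison, e^(-age/30) at 30 days:
console.log(Math.exp(-1).toFixed(3)); // → 0.368
```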

Sample Size Weighting

Trust patterns with more samples:

function weightBySampleSize(pattern) {
  // Logarithmic weighting: 0 at one sample, capped at 1 from 100 samples up
  const rawWeight = Math.log(Math.max(1, pattern.sample_size)) / Math.log(100);
  const weight = Math.min(1, rawWeight);

  return {
    ...pattern,
    weighted_confidence: pattern.confidence * weight
  };
}

Sample size 1:   weight = 0.00 (don't trust yet)
Sample size 10:  weight = 0.50 (some trust)
Sample size 50:  weight = 0.85 (good trust)
Sample size 100: weight = 1.00 (full trust)
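The table values can be reproduced with a small sketch. It assumes the weight is log(n) / log(100), floored at 0 and capped at 1; sampleWeight and the fullTrustAt parameter are hypothetical names.

```javascript
// Logarithmic trust weight: 0 for a single sample, 1 at `fullTrustAt` samples or more.
function sampleWeight(sampleSize, fullTrustAt = 100) {
  if (sampleSize < 1) return 0;
  return Math.min(1, Math.log(sampleSize) / Math.log(fullTrustAt));
}

for (const n of [1, 10, 50, 100, 500]) {
  console.log(`Sample size ${n}: weight = ${sampleWeight(n).toFixed(2)}`);
}
// Sample size 1: weight = 0.00
// Sample size 10: weight = 0.50
// Sample size 50: weight = 0.85
// Sample size 100: weight = 1.00
// Sample size 500: weight = 1.00
```

The logarithm makes the first few samples count the most: going from 1 to 10 samples adds as much trust as going from 10 to 100.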

Continuous Improvement Metrics

Track learning effectiveness:

{
  "learning_metrics": {
    "total_patterns": 1247,
    "avg_confidence_increase": 0.18, // 18% improvement over time
    "routing_accuracy": {
      "week_1": 0.72,
      "week_2": 0.81,
      "week_3": 0.87,
      "week_4": 0.92
    },
    "confidence_correlation": 0.89, // High confidence = high success
    "pattern_reuse_rate": 0.73 // 73% of tasks match patterns
  }
}

Key Insight: Routing accuracy rose from 72% to 92% over 4 weeks, a gain of 20 percentage points (about 28% in relative terms)!
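The improvement figure is simple arithmetic on the weekly accuracy values. Here is a small sketch that computes the absolute and relative gain plus the week-over-week steps. summarizeGain is a hypothetical helper name.

```javascript
// Summarize how a series of accuracy values changed from first to last.
function summarizeGain(values) {
  const first = values[0];
  const last = values[values.length - 1];
  return {
    absolute_points: (last - first) * 100,            // percentage points
    relative_percent: ((last - first) / first) * 100, // relative to the starting value
    weekly_steps: values.slice(1).map((v, i) => v - values[i]),
  };
}

const gain = summarizeGain([0.72, 0.81, 0.87, 0.92]);
console.log(gain.absolute_points.toFixed(0));  // → 20
console.log(gain.relative_percent.toFixed(1)); // → 27.8
console.log(gain.weekly_steps.map(s => s.toFixed(2)));
// → [ '0.09', '0.06', '0.05' ]
```

The weekly steps shrink (9, 6, then 5 points), which is what you would expect as accuracy approaches its ceiling.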

The Virtuous Cycle

More tasks executed

More patterns learned

Better routing confidence

Higher success rate

More accurate patterns

Even better routing

(repeat)

This feedback loop is why Cortex keeps getting better over time.

Preventing Negative Learning

Outlier Detection

Don’t learn from anomalies:

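function isOutlier(task, historicalTasks) {
  // Compare only against similar tasks (same keyword threshold as Step 3)
  const similar = historicalTasks.filter(t =>
    similarity(t.keywords, task.keywords) > 0.7
  );
  if (similar.length < 2) return false; // Too little history to judge

  const durations = similar.map(t => t.duration);
  const avgDuration = average(durations);
  const stdDev = standardDeviation(durations);

  // If this task took 3+ standard deviations longer, it's an outlier
  return task.duration > (avgDuration + 3 * stdDev);
}

// Illustrative wrapper (the name is a placeholder) so the early return is valid
function learnFromTask(task, history) {
  if (isOutlier(task, history)) {
    // Don't update patterns with this data
    logAnomaly(task);
    return;
  }
  // Otherwise, continue with the normal pattern update
}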
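The average and standardDeviation helpers are not defined in this post. A minimal sketch follows, assuming population standard deviation (dividing by n, not n - 1). isDurationOutlier is a hypothetical, stripped-down version of the check that works directly on a list of past durations.

```javascript
function average(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Population standard deviation (an assumption; a sample version would divide by n - 1).
function standardDeviation(values) {
  const mean = average(values);
  return Math.sqrt(average(values.map(v => (v - mean) ** 2)));
}

// Hypothetical simplified check: is `duration` more than 3 standard deviations above the mean?
function isDurationOutlier(duration, pastDurations) {
  if (pastDurations.length < 2) return false; // Too little history to judge
  const limit = average(pastDurations) + 3 * standardDeviation(pastDurations);
  return duration > limit;
}

// Made-up durations in minutes: mean 6, standard deviation about 0.76.
const past = [5, 6, 7, 6, 5, 7, 6];
console.log(isDurationOutlier(8, past));  // → false (limit is about 8.27)
console.log(isDurationOutlier(30, past)); // → true
```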

Confidence Bounds

Never let confidence go too extreme:

function clampConfidence(confidence) {
  return Math.max(0.15, Math.min(0.95, confidence));
}

// Never below 15% (always some hope)
// Never above 95% (always some uncertainty)

Tomorrow’s Topic

Tomorrow, I’ll show you how to build a Coordinator Master from scratch - the complete implementation with code examples.

Key Takeaways

  1. Every task is a learning opportunity
  2. Confidence increases with successful patterns
  3. Failures teach what NOT to do
  4. Meta-learning optimizes the learning itself
  5. The system keeps getting better as it handles more tasks

The learning loop is what separates Cortex from static automation. It’s the difference between a tool and an evolving intelligence.

Learn More About Cortex

Want to dive deeper into how Cortex works? Visit the Meet Cortex page to learn about its architecture, capabilities, and how it scales from 1 to 100+ agents on-demand.


Part 7 of the Cortex series. Next: Building a Coordinator Master from Scratch

#Machine Learning #Self-Improvement #Learning Systems #Cortex