Building the Cortex Fabric: A Day of Infrastructure Engineering
The Problem We Started With
Today began with a familiar frustration: Claude Desktop couldn’t reliably connect to my MCP servers running on k3s. The infrastructure was there—13 MCP servers handling everything from Proxmox VM management to YouTube ingestion—but the connectivity was fragile. Port-forwards would break, sessions wouldn’t persist, and there was no shared context between Claude Desktop and the cortex-chat web interface.
The deeper issue? My architecture had grown organically but lacked connective tissue. Each piece worked in isolation; the pieces didn't work together as a fabric.
The Architecture Before
Claude Desktop ──(broken port-forwards)──► k3s MCP Servers
cortex-chat ──────────────────────────────► k3s MCP Servers
Claude Code ──────────────────────────────► (nothing)
Three clients, no shared state, no event streaming, no session continuity.
The Vision: A True Fabric
What I wanted was simple in concept but required careful design:
- Any client connects the same way - Whether it’s Claude Desktop on my Mac, the cortex-chat web app, or Claude Code in my terminal, they should all speak the same protocol
- Sessions persist across clients - Start a task in Claude Desktop, continue it in cortex-chat
- Events flow everywhere - When I edit a file locally, all connected clients know about it
- MCP access is unified - One gateway routes to all 14 MCP servers
The Solution: Cortex Fabric
We built three components today:
1. Fabric Gateway (runs on k3s)
The heart of the fabric. A FastAPI service with WebSocket support that:
- Accepts connections from any client type
- Routes MCP tool calls to the appropriate backend server
- Broadcasts events via Redis pubsub
- Manages sessions through the Memory Service
- Handles client heartbeats and stale connection cleanup
from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.websocket("/ws/fabric")
async def websocket_fabric(websocket: WebSocket, client_type: str):
    client = await gateway.connect_client(websocket, client_type)
    # All clients speak the same protocol from here
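Conceptually, the event side is just fan-out over Redis pub/sub. Here's a rough sketch of the publish path (not the actual gateway code; the Redis host and the fabric:events channel name are placeholders):

import json
import redis.asyncio as redis

r = redis.Redis(host="redis.cortex-system", port=6379)  # assumed in-cluster service name

async def broadcast(event_name: str, data: dict) -> None:
    # Publish once; each gateway's subscriber loop relays the message
    # to its connected WebSocket clients as an "event_broadcast" frame
    await r.publish("fabric:events", json.dumps({
        "type": "event_broadcast",
        "event": event_name,
        "data": data,
    }))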
The gateway knows about all 14 MCP servers:
- proxmox, unifi, kubernetes, github, github-security
- outline, n8n, cloudflare, sandfly, checkmk
- langflow, youtube-ingestion, youtube-channel, cortex-school
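Routing itself is little more than a lookup from server name to backend URL. The sketch below is illustrative only; the in-cluster service names and the /tools/call endpoint are assumptions, not the gateway's real interface:

import httpx

# Hypothetical routing table; the real service names and ports may differ
MCP_BACKENDS = {name: f"http://mcp-{name}.cortex-system:8080" for name in [
    "proxmox", "unifi", "kubernetes", "github", "github-security",
    "outline", "n8n", "cloudflare", "sandfly", "checkmk",
    "langflow", "youtube-ingestion", "youtube-channel", "cortex-school",
]}

async def route_mcp_call(server: str, tool: str, arguments: dict) -> dict:
    # Forward the tool call to the matching MCP backend and return its JSON result
    async with httpx.AsyncClient() as client:
        resp = await client.post(f"{MCP_BACKENDS[server]}/tools/call",
                                 json={"tool": tool, "arguments": arguments})
        resp.raise_for_status()
        return resp.json()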
2. Cortex Bridge (runs locally on Mac)
A lightweight Python service that bridges the gap between local development and the k3s cluster:
- Provides MCP stdio interface for Claude Desktop
- Maintains persistent WebSocket to Fabric Gateway
- Watches cortex-platform for file changes
- Auto-starts/resumes sessions
- Handles reconnection with exponential backoff
The bridge is ~600 lines of Python with only two dependencies: websockets and watchdog.
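The part worth sketching is the reconnect loop. It looks something like the following, using the websockets library; the client_type query parameter and the handle_message dispatcher are stand-ins, not the actual bridge code:

import asyncio
import websockets

FABRIC_URL = "wss://fabric.ry-ops.dev/ws/fabric?client_type=claude-desktop"  # query param is assumed

async def handle_message(message: str) -> None:
    # Placeholder for the bridge's real dispatcher (stdio relay, file events, sessions)
    print(message)

async def run_bridge():
    delay = 1
    while True:
        try:
            async with websockets.connect(FABRIC_URL) as ws:
                delay = 1  # reset backoff after a successful connection
                async for message in ws:
                    await handle_message(message)
        except (OSError, websockets.ConnectionClosed):
            await asyncio.sleep(delay)
            delay = min(delay * 2, 60)  # exponential backoff, capped at 60s

asyncio.run(run_bridge())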
3. Fabric Client (JavaScript library)
For cortex-chat and other browser-based clients:
const fabric = new FabricClient('wss://fabric.ry-ops.dev/ws/fabric');
await fabric.connect();
// Call any MCP tool
const pods = await fabric.callMcpTool('kubernetes', 'list_pods', {
  namespace: 'cortex-system'
});
// Subscribe to events
fabric.on('file_changed', (data) => {
  console.log('File changed:', data.path);
});
The Protocol
All clients speak the same WebSocket protocol:
// Client → Fabric
{"type": "mcp_call", "data": {"server": "proxmox", "tool": "list_vms"}}
{"type": "event", "data": {"name": "file_changed", "path": "..."}}
{"type": "session_start", "data": {"working_directory": "/..."}}
// Fabric → Client
{"type": "connected", "client_id": "uuid", "mcp_servers": ["proxmox", ...]}
{"type": "mcp_result", "result": {...}}
{"type": "event_broadcast", "event": "deployment_complete", "data": {...}}
The Architecture After
┌──────────────────────────────────────────────────────────────┐
│                        CORTEX FABRIC                          │
│                                                               │
│  Claude Desktop ──┐                                           │
│  cortex-chat ─────┼──► Fabric Gateway ──► MCP Servers         │
│  Claude Code ─────┘          │                                │
│                              ▼                                │
│                 Memory Service (sessions)                     │
│                       Redis (events)                          │
└──────────────────────────────────────────────────────────────┘
Other Work Today
Before building the fabric, we also:
- Created MCP wrappers for youtube-ingestion, youtube-channel, and cortex-school services
- Fixed quota issues - bumped ConfigMaps quota from 30 to 40
- Committed infrastructure files that were sitting untracked:
scripts/mcp-port-forward.sh, services/layer-activator/, and the build.sh and push.sh scripts under services/cortex-live/
- Updated .gitignore to exclude generated data directories
Deployment
Everything went through the GitOps pipeline:
- cortex-platform repo: Source code for bridge, gateway, and client library
- cortex-gitops repo: Kubernetes manifests for the fabric-gateway deployment
ArgoCD picks up changes within 3 minutes and syncs to the cluster.
What’s Next
- Test end-to-end connectivity - Once ArgoCD syncs the fabric-gateway
- Update cortex-chat - Import the fabric-client.js library
- Iterate on the protocol - Add more event types, improve error handling
Lessons Learned
- Fabric > Point-to-Point - Instead of solving each connection problem individually, building a unified layer pays dividends
- Client-Agnostic Design - Making all clients speak the same protocol simplifies everything
- GitOps is the Way - Push to git, let ArgoCD handle deployment, never kubectl apply manually
- Session Continuity Matters - Being able to pick up where you left off across clients is transformative
Built with Claude Code and deployed via ArgoCD to a k3s cluster running on Proxmox VMs. The infrastructure manages itself.