I built git-steer as an MCP server for autonomous GitHub repository management — it scans for CVEs, creates RFC issues, dispatches fix workflows, enforces branch protection policies, and keeps a fleet of repos in compliance. It’s powerful. It’s also, in its original form, kind of a menace to the GitHub API.
After running a full security sweep across my managed repos, I started seeing secondary rate limit errors in the logs. GitHub’s abuse detection had noticed. This post covers the eight-step hardening sprint I ran to get git-steer behaving like a good API citizen.
The Problem
git-steer’s security_sweep tool, when pointed at a large repo fleet, would:
- Fire N REST calls simultaneously to fetch Dependabot alerts (one per repo)
- Fire another N REST calls for code scanning alerts
- Fire N REST calls to list branches
- Resolve the installation owner with a paginated GET /installation/repositories call every time state was loaded
- Create RFC issues in rapid serial succession with no throttling
- Never cache any responses — every run re-fetched everything from scratch
For a small fleet this is fine. For anything beyond ~10 repos, you’re looking at a burst of 40–80 API calls in a few seconds. GitHub’s secondary rate limit triggers on write bursts and dense request windows — exactly what we were generating.
The Fix: Eight Steps
Step 1 — Throttling and Retry Plugins
The first change was the simplest and most impactful: wire @octokit/plugin-throttling and @octokit/plugin-retry into the Octokit instance.
The throttling plugin intercepts 429 (primary rate limit) and 403 (secondary rate limit) responses and automatically backs off before retrying. Rather than crashing on a rate limit hit, git-steer now waits out the Retry-After interval and continues. Primary rate limit hits retry up to 4 times; secondary hits always retry since GitHub’s guidance is to never skip them.
```typescript
import { Octokit } from "@octokit/rest";
import { throttling } from "@octokit/plugin-throttling";
import { retry } from "@octokit/plugin-retry";

const ThrottledOctokit = Octokit.plugin(throttling, retry).defaults({
  throttle: {
    onRateLimit(retryAfter, options, _octokit, retryCount) {
      console.warn(`Primary rate limit hit. Retry-After: ${retryAfter}s`);
      return retryCount < 4; // retry up to 4 times
    },
    onSecondaryRateLimit(retryAfter, options) {
      console.warn(`Secondary rate limit hit. Backing off ${retryAfter}s`);
      return true; // always retry secondary limits
    },
  },
});
```
Step 2 — Concurrency Caps
Raw parallelism was the root cause of burst abuse. I introduced p-limit with three named limiters:
- writeLimit(2) — mutations (blob creation, issue/PR creation, workflow dispatch)
- readLimit(8) — reads (alert scans, file fetches, list endpoints)
- searchLimit(1) — Search API calls (GitHub enforces a 10 req/min secondary limit)
Every bulk loop in security_sweep, security_scan, and security_digest now runs through these limiters. The write cap of 2 is conservative by design — GitHub’s secondary rate limit documentation specifically calls out back-to-back write bursts as the primary trigger.
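git-steer uses p-limit for this, but the mechanism is small enough to sketch inline. The following is a minimal hand-rolled equivalent (names and the example sweep loop are illustrative, not git-steer's actual code) that caps how many promises run concurrently:

```typescript
// Minimal stand-in for p-limit: returns a wrapper that allows at most
// `concurrency` tasks in flight at once, queueing the rest.
function limit(concurrency: number) {
  let active = 0;
  const queue: (() => void)[] = [];
  const next = () => {
    active--;
    queue.shift()?.(); // start the next queued task, if any
  };
  return <T>(fn: () => Promise<T>): Promise<T> =>
    new Promise<T>((resolve, reject) => {
      const run = () => {
        active++;
        fn().then(resolve, reject).finally(next);
      };
      active < concurrency ? run() : queue.push(run);
    });
}

// The three named limiters described above.
const writeLimit = limit(2);  // mutations
const readLimit = limit(8);   // reads
const searchLimit = limit(1); // Search API

// A bulk read loop then wraps each call, so at most 8 run at once.
async function sweepRepos<T>(repos: string[], fetchAlerts: (repo: string) => Promise<T>) {
  return Promise.all(repos.map((repo) => readLimit(() => fetchAlerts(repo))));
}
```

The key property is that `Promise.all` still fires every task eagerly, but the limiter's queue means only `concurrency` of them are actually issuing requests at any moment.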
Step 3 — ETag Cache Layer
The Contents API supports conditional requests via If-None-Match. When a file hasn’t changed since the last fetch, GitHub responds with 304 Not Modified — no body, and it counts at a reduced rate-limit weight.
I built an ETagCache class that:
- Stores the ETag header value alongside the decoded content and blob SHA on every successful GET
- Sends If-None-Match on subsequent requests for the same path
- Catches Octokit's RequestError { status: 304 } (it throws rather than returning normally) and returns the cached content
- Persists the full ETag map to state/cache.json on shutdown and restores it on startup, so cache hits work across process restarts — not just within a session
Over time, this means repeated state loads skip most file fetches entirely.
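The conditional-request flow can be sketched as follows. This is a simplified illustration, not git-steer's actual class: the HTTP layer is injected as a plain function that returns a status code, whereas the real implementation wraps Octokit, which throws a RequestError on 304 instead:

```typescript
interface CacheEntry { etag: string; content: string; sha: string }

interface FileResponse { status: number; etag?: string; content?: string; sha?: string }

// Illustrative ETag cache: hold the ETag per path, send If-None-Match,
// and serve cached content when the server answers 304 Not Modified.
class ETagCache {
  private entries = new Map<string, CacheEntry>();

  async get(
    path: string,
    fetchFile: (path: string, headers: Record<string, string>) => Promise<FileResponse>
  ): Promise<string> {
    const cached = this.entries.get(path);
    // Only send If-None-Match when we already hold an ETag for this path.
    const headers: Record<string, string> = cached ? { "If-None-Match": cached.etag } : {};
    const res = await fetchFile(path, headers);
    if (res.status === 304 && cached) {
      return cached.content; // 304 has no body; serve from cache
    }
    this.entries.set(path, { etag: res.etag ?? "", content: res.content ?? "", sha: res.sha ?? "" });
    return res.content ?? "";
  }

  // The real class serialises this map to state/cache.json on shutdown.
  toJSON() { return Object.fromEntries(this.entries); }
}
```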
Step 4 — GraphQL Batching
Three REST hotspots got replaced with single GraphQL round-trips:
Owner resolution — StateManager.load() was calling GET /installation/repositories and paginating through results just to read owner.login. Replaced with query { viewer { login } } — one call, no pagination.
Branch listing — branch_list and branch_reap each called REST paginated endpoints per repo. Replaced with a GraphQL refs query that fetches name, protection status, and last commit date in one round-trip. Includes a fallback to REST for repos with >100 branches to prevent silent truncation.
Dependabot alerts — security_sweep was making one REST call per repo. Replaced with a single aliased GraphQL query that fetches vulnerabilityAlerts for the entire repo batch simultaneously. Code scanning alerts have no GraphQL equivalent, so those remain as parallelised REST calls.
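The aliased-query trick in the Dependabot case is worth showing concretely. The sketch below builds one GraphQL document with a numbered alias per repo; the `repository` and `vulnerabilityAlerts` fields follow GitHub's GraphQL schema, while the helper name and alias scheme are mine, not git-steer's:

```typescript
// Build a single GraphQL query that fetches open vulnerability alerts
// for a whole batch of repos, one alias per repo, in one round-trip.
function buildAlertQuery(owner: string, repos: string[]): string {
  const fields = repos
    .map(
      (name, i) => `
  repo${i}: repository(owner: "${owner}", name: "${name}") {
    vulnerabilityAlerts(first: 100, states: OPEN) {
      nodes { securityVulnerability { severity package { name } } }
    }
  }`
    )
    .join("");
  return `query {${fields}\n}`;
}
```

One call replaces N, and the response keys (`repo0`, `repo1`, …) map straight back to the input batch.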
Step 5 — Rate Limit Budget Visibility
steer_status previously called GET /rate_limit live on every invocation and only returned the core bucket. I expanded this to:
- Fetch at startup and cache, refreshing every 30 minutes via setInterval
- Return all resource buckets (core, graphql, search, code_scanning, actions_runner_registration)
- Include percentRemaining per bucket
- Emit a warnings array for any bucket below 15% remaining
Now you can see where you stand before kicking off a large sweep.
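The per-bucket report is simple arithmetic over the cached /rate_limit payload. A sketch, assuming GitHub's bucket shape (`limit`, `remaining`, `reset`) and with the helper name and 15% threshold parameterised for illustration:

```typescript
interface Bucket { limit: number; remaining: number; reset: number }

// Derive percentRemaining per bucket plus a warnings array for any
// bucket under the threshold (15% in git-steer's case).
function budgetReport(resources: Record<string, Bucket>, warnBelow = 15) {
  const buckets: Record<string, { remaining: number; percentRemaining: number }> = {};
  const warnings: string[] = [];
  for (const [name, b] of Object.entries(resources)) {
    const pct = Math.round((b.remaining / b.limit) * 100);
    buckets[name] = { remaining: b.remaining, percentRemaining: pct };
    if (pct < warnBelow) warnings.push(`${name} bucket at ${pct}% remaining`);
  }
  return { buckets, warnings };
}
```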
Step 6 — Rate Limit Telemetry in the Audit Log
Every entry in audit.jsonl now carries rate limit context:
```json
{
  "ts": "2026-02-19T17:00:00Z",
  "action": "security_sweep",
  "result": "success",
  "rate_remaining": 4823,
  "rate_reset": "2026-02-19T17:59:00Z",
  "is_secondary_limit_hit": true,
  "retry_count": 2,
  "backoff_ms": 6000
}
```
This makes post-incident analysis straightforward — you can see exactly which tool calls were running hot and whether the throttle plugin had to intervene.
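The rate-limit fields come straight from the `x-ratelimit-*` response headers that Octokit exposes. A sketch of how one JSONL line could be assembled (the entry shape mirrors the example above; the helper and its `throttle` parameter are hypothetical names, not git-steer's API):

```typescript
// Build one audit.jsonl line, folding in rate-limit context from the
// x-ratelimit-* headers and the throttle plugin's retry bookkeeping.
function auditEntry(
  action: string,
  result: string,
  headers: Record<string, string>,
  throttle: { secondaryHit: boolean; retryCount: number; backoffMs: number }
): string {
  const entry = {
    ts: new Date().toISOString(),
    action,
    result,
    rate_remaining: Number(headers["x-ratelimit-remaining"] ?? -1),
    // x-ratelimit-reset is a Unix timestamp in seconds
    rate_reset: new Date(Number(headers["x-ratelimit-reset"] ?? 0) * 1000).toISOString(),
    is_secondary_limit_hit: throttle.secondaryHit,
    retry_count: throttle.retryCount,
    backoff_ms: throttle.backoffMs,
  };
  return JSON.stringify(entry); // one line of audit.jsonl
}
```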
Step 7 — Audit for git clone Paths
A full audit of the codebase confirmed no git clone operations anywhere. All remote file access uses the Contents API with ETag caching. The only execFileSync call in the codebase is the CodeRabbit CLI integration, which operates on a local working tree provided by the caller — not a clone triggered by git-steer.
Step 8 — Chunked Sweep with Cursor Persistence
The final piece addresses fleet scale. security_sweep now supports chunked execution:
```typescript
// First call — process 10 repos, save cursor
security_sweep({ chunkSize: 10, severity: "critical" })
// → { hasMore: true, nextIndex: 10, totalRepos: 47 }

// Subsequent calls — resume from cursor
security_sweep({ resume: true })
// → { hasMore: true, nextIndex: 20, totalRepos: 47 }

// ... repeat until hasMore: false
```
The cursor is persisted to the state repo between calls, so a sweep can safely span multiple MCP sessions. A skipRecentHours parameter implements polling-fallback behaviour — repos swept within the specified window are skipped, avoiding redundant scans when git-steer runs on a schedule.
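The cursor arithmetic behind this is straightforward to sketch. The real tool persists the cursor to the state repo between calls; this simplified version (field and function names illustrative) shows only the slicing and the skipRecentHours filter:

```typescript
interface SweepState {
  nextIndex: number;                      // cursor into the repo list
  lastSweptAt: Record<string, string>;    // repo -> ISO timestamp of last sweep
}

// Select the next chunk of repos, skipping any swept within the window.
function nextChunk(
  repos: string[],
  state: SweepState,
  chunkSize: number,
  skipRecentHours = 0
) {
  const cutoff = Date.now() - skipRecentHours * 3600_000;
  const chunk = repos
    .slice(state.nextIndex, state.nextIndex + chunkSize)
    .filter((repo) => {
      const last = state.lastSweptAt[repo];
      // skipRecentHours: drop repos swept inside the window
      return !(skipRecentHours > 0 && last && Date.parse(last) > cutoff);
    });
  const nextIndex = Math.min(state.nextIndex + chunkSize, repos.length);
  return { chunk, nextIndex, hasMore: nextIndex < repos.length, totalRepos: repos.length };
}
```

Note that the cursor always advances by chunkSize even when repos are skipped, so a scheduled run converges to hasMore: false in a bounded number of calls.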
Results
- Secondary rate limit errors: eliminated in testing
- API calls per full sweep: reduced by ~60% (GraphQL batching + ETags)
- Audit log now contains full rate limit telemetry for post-incident analysis
- Large fleets (50+ repos) can be swept incrementally without timeout risk
The changes are all in ry-ops/git-steer on main as of commit dd5dbee.
git-steer is an MCP server for autonomous GitHub repository management. It exposes tools for security scanning, branch governance, PR workflows, and RFC issue tracking.