# AI-Assisted Code Review Workflows
Code review is critical but time-consuming. Good reviewers catch bugs, suggest improvements, and ensure consistency—but they’re also a bottleneck. AI won’t replace human reviewers, but it can handle the tedious parts, freeing humans to focus on architecture, design decisions, and business logic.
After integrating AI into code review workflows across multiple teams, I’ve learned what works and what doesn’t. Let me share the patterns that deliver real value.
## The Right Role for AI in Code Review
AI should augment, not replace, human reviewers:
### What AI Does Well
- Catching common bugs (null checks, off-by-one errors)
- Identifying security vulnerabilities
- Enforcing style guidelines
- Suggesting documentation improvements
- Finding performance anti-patterns
- Checking test coverage completeness
### What Humans Still Do Better
- Evaluating architectural decisions
- Understanding business context
- Assessing API design
- Judging code maintainability
- Making trade-off decisions
- Mentoring junior developers
## Architecture: GitHub Action for PR Review
The foundation is a GitHub Action that triggers on pull requests:
```yaml
# .github/workflows/ai-code-review.yml
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize]

permissions:
  contents: read
  pull-requests: write

jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get PR diff
        id: diff
        run: |
          git fetch origin ${{ github.base_ref }}
          git diff origin/${{ github.base_ref }}...HEAD > pr.diff

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          pip install anthropic PyGithub gitpython

      - name: Run AI code review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          python .github/scripts/ai_code_review.py \
            --pr-number ${{ github.event.pull_request.number }} \
            --repo ${{ github.repository }} \
            --diff-file pr.diff
```
## The Review Script
Here’s a complete implementation:
````python
# .github/scripts/ai_code_review.py
import os
import argparse
from dataclasses import dataclass
from typing import List, Dict, Optional
from anthropic import Anthropic
from github import Github
import re
@dataclass
class ReviewComment:
path: str
line: int
severity: str # 'critical', 'warning', 'suggestion'
message: str
suggestion: Optional[str] = None
class AICodeReviewer:
def __init__(self, anthropic_key: str, github_token: str):
self.anthropic = Anthropic(api_key=anthropic_key)
self.github = Github(github_token)
def review_pr(
self,
repo_name: str,
pr_number: int,
diff_content: str
) -> List[ReviewComment]:
"""Analyze PR diff and generate review comments"""
# Parse diff into chunks
chunks = self._parse_diff(diff_content)
all_comments = []
for chunk in chunks:
comments = self._review_chunk(chunk)
all_comments.extend(comments)
return all_comments
def _parse_diff(self, diff_content: str) -> List[Dict]:
"""Parse unified diff into reviewable chunks"""
chunks = []
current_file = None
current_chunk = []
current_line = 0
for line in diff_content.split('\n'):
if line.startswith('diff --git'):
if current_chunk:
chunks.append({
'file': current_file,
'content': '\n'.join(current_chunk),
'start_line': current_line
})
current_chunk = []
# Extract filename
match = re.search(r'b/(.+)$', line)
current_file = match.group(1) if match else None
elif line.startswith('@@'):
# Parse line number
match = re.search(r'\+(\d+)', line)
current_line = int(match.group(1)) if match else 0
current_chunk.append(line)
elif current_file:
current_chunk.append(line)
if line.startswith('+') and not line.startswith('+++'):
current_line += 1
if current_chunk:
chunks.append({
'file': current_file,
'content': '\n'.join(current_chunk),
'start_line': current_line
})
return chunks
def _review_chunk(self, chunk: Dict) -> List[ReviewComment]:
"""Review a single diff chunk with AI"""
prompt = self._build_review_prompt(chunk)
response = self.anthropic.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=2048,
temperature=0.3, # Lower temperature for consistency
messages=[{
"role": "user",
"content": prompt
}]
)
# Parse AI response into structured comments
return self._parse_review_response(
response.content[0].text,
chunk['file']
)
def _build_review_prompt(self, chunk: Dict) -> str:
"""Build prompt for code review"""
return f"""You are an expert code reviewer. Analyze this code change and provide focused feedback.
File: {chunk['file']}
Changes:
```diff
{chunk['content']}
```

Review for:
- Bugs and potential errors
- Security vulnerabilities
- Performance issues
- Code style and best practices
- Missing error handling
- Documentation needs
Format your response as:
LINE <line_number> | SEVERITY <critical|warning|suggestion> | <message>
[SUGGESTION] <suggested fix> (optional, on the line immediately after the comment it refers to)

Only comment on actual issues. Be concise. No praise, just actionable feedback."""
def _parse_review_response(
self,
response: str,
filepath: str
) -> List[ReviewComment]:
"""Parse AI response into structured comments"""
comments = []
current_comment = None
for line in response.split('\n'):
line = line.strip()
if line.startswith('LINE'):
# Save previous comment if exists
if current_comment:
comments.append(current_comment)
# Parse new comment
match = re.match(
r'LINE (\d+) \| SEVERITY (\w+) \| (.+)',
line
)
if match:
current_comment = ReviewComment(
path=filepath,
line=int(match.group(1)),
severity=match.group(2).lower(),
message=match.group(3)
)
elif line.startswith('[SUGGESTION]') and current_comment:
current_comment.suggestion = line.replace('[SUGGESTION]', '').strip()
if current_comment:
comments.append(current_comment)
return comments
def post_review_comments(
self,
repo_name: str,
pr_number: int,
comments: List[ReviewComment]
):
"""Post review comments to GitHub PR"""
repo = self.github.get_repo(repo_name)
pr = repo.get_pull(pr_number)
# Group comments by severity
critical = [c for c in comments if c.severity == 'critical']
warnings = [c for c in comments if c.severity == 'warning']
suggestions = [c for c in comments if c.severity == 'suggestion']
# Post summary comment
summary = self._build_summary(critical, warnings, suggestions)
pr.create_issue_comment(summary)
# Post inline comments
commit = pr.get_commits().reversed[0]
for comment in comments:
body = self._format_comment(comment)
try:
pr.create_review_comment(
body=body,
commit=commit,
path=comment.path,
line=comment.line
)
except Exception as e:
print(f"Failed to post comment: {e}")
def _build_summary(
self,
critical: List[ReviewComment],
warnings: List[ReviewComment],
suggestions: List[ReviewComment]
) -> str:
"""Build review summary comment"""
parts = ["## 🤖 AI Code Review\n"]
if critical:
parts.append(f"### 🚨 Critical Issues ({len(critical)})\n")
for c in critical[:3]: # Show top 3
parts.append(f"- **{c.path}:{c.line}** - {c.message}")
if len(critical) > 3:
parts.append(f"\n*...and {len(critical) - 3} more*")
if warnings:
parts.append(f"\n### ⚠️ Warnings ({len(warnings)})\n")
for c in warnings[:3]:
parts.append(f"- **{c.path}:{c.line}** - {c.message}")
if len(warnings) > 3:
parts.append(f"\n*...and {len(warnings) - 3} more*")
if suggestions:
parts.append(f"\n### 💡 Suggestions ({len(suggestions)})")
if not any([critical, warnings, suggestions]):
parts.append("✅ No issues found!")
parts.append("\n---\n*This review was performed by AI. Please validate findings before acting on them.*")
return '\n'.join(parts)
def _format_comment(self, comment: ReviewComment) -> str:
"""Format individual comment for GitHub"""
emoji = {
'critical': '🚨',
'warning': '⚠️',
'suggestion': '💡'
}.get(comment.severity, '📝')
parts = [f"{emoji} **{comment.severity.upper()}**\n"]
parts.append(comment.message)
if comment.suggestion:
parts.append("\n**Suggested fix:**")
parts.append(f"```\n{comment.suggestion}\n```")
return '\n'.join(parts)
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--pr-number', required=True, type=int)
    parser.add_argument('--repo', required=True)
    parser.add_argument('--diff-file', required=True)
    args = parser.parse_args()

    # Read diff
    with open(args.diff_file, 'r') as f:
        diff_content = f.read()

    # Initialize reviewer
    reviewer = AICodeReviewer(
        anthropic_key=os.environ['ANTHROPIC_API_KEY'],
        github_token=os.environ['GITHUB_TOKEN']
    )

    # Review PR
    comments = reviewer.review_pr(
        repo_name=args.repo,
        pr_number=args.pr_number,
        diff_content=diff_content
    )

    # Post comments
    reviewer.post_review_comments(
        repo_name=args.repo,
        pr_number=args.pr_number,
        comments=comments
    )


if __name__ == '__main__':
    main()
````
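Before relying on the workflow, it helps to exercise the reviewer against a locally generated diff and inspect the structured comments directly. A minimal sketch, assuming the script above is importable as `ai_code_review`, both environment variables are exported, and `origin/main` is your comparison branch (all of which are assumptions about your setup):

```python
# Hypothetical local harness; not part of the workflow above
import os
import subprocess

from ai_code_review import AICodeReviewer  # assumes the script above is on PYTHONPATH

# Build a diff the same way the Action does, but against your local checkout
diff = subprocess.run(
    ["git", "diff", "origin/main...HEAD"],
    capture_output=True, text=True, check=True,
).stdout

reviewer = AICodeReviewer(
    anthropic_key=os.environ["ANTHROPIC_API_KEY"],
    github_token=os.environ["GITHUB_TOKEN"],
)

# Print findings instead of posting them to a PR
for c in reviewer.review_pr(repo_name="owner/repo", pr_number=0, diff_content=diff):
    print(f"{c.severity:<10} {c.path}:{c.line}  {c.message}")
```

Since `review_pr` never touches the GitHub API, the repository name and PR number are just placeholders here.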
## Advanced Patterns
### 1. Context-Aware Reviews
Include related files for better context:
```python
class ContextAwareReviewer(AICodeReviewer):
def _get_file_context(
self,
repo,
filepath: str,
pr_base_sha: str
) -> Optional[str]:
"""Get full file content for context"""
try:
content = repo.get_contents(filepath, ref=pr_base_sha)
return content.decoded_content.decode('utf-8')
        except Exception:
return None
def _review_chunk(self, chunk: Dict, repo, pr) -> List[ReviewComment]:
"""Enhanced review with file context"""
# Get full file for context
file_context = self._get_file_context(
repo,
chunk['file'],
pr.base.sha
)
prompt = self._build_contextual_prompt(chunk, file_context)
        # ... rest of review logic
```
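`_build_contextual_prompt` is left open above; one possible shape is to append a truncated copy of the file after the standard prompt. A sketch only, with an arbitrary 4,000-character cap (both the method body and the cap are assumptions, not part of the original class):

```python
from typing import Optional

# One possible _build_contextual_prompt for ContextAwareReviewer (illustrative)
def _build_contextual_prompt(self, chunk: dict, file_context: Optional[str]) -> str:
    """Append truncated full-file context to the standard review prompt."""
    prompt = self._build_review_prompt(chunk)
    if file_context:
        # Cap the context so large files don't dominate the token budget
        prompt += (
            "\n\nFull file (truncated) for context only:\n"
            f"{file_context[:4000]}"
        )
    return prompt
```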
### 2. Language-Specific Rules
Tailor review to programming language:
```python
class LanguageSpecificReviewer(AICodeReviewer):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.language_rules = {
'python': {
'checks': ['PEP 8', 'type hints', 'docstrings'],
'patterns': ['threading safety', 'exception handling']
},
'javascript': {
'checks': ['ESLint', 'async/await', 'null checks'],
'patterns': ['promise handling', 'React hooks']
},
'go': {
'checks': ['gofmt', 'error handling', 'goroutine leaks'],
'patterns': ['context usage', 'defer statements']
}
}
def _get_language(self, filepath: str) -> str:
"""Detect language from file extension"""
ext_map = {
'.py': 'python',
'.js': 'javascript',
'.ts': 'javascript',
'.go': 'go',
'.rs': 'rust',
'.java': 'java'
}
ext = os.path.splitext(filepath)[1]
return ext_map.get(ext, 'unknown')
def _build_review_prompt(self, chunk: Dict) -> str:
"""Build language-specific prompt"""
language = self._get_language(chunk['file'])
rules = self.language_rules.get(language, {})
base_prompt = super()._build_review_prompt(chunk)
if rules:
specific_checks = '\n'.join(f"- {check}" for check in rules.get('checks', []))
patterns = '\n'.join(f"- {p}" for p in rules.get('patterns', []))
return f"""{base_prompt}
Language-specific considerations for {language}:
{specific_checks}
Common patterns to check:
{patterns}
"""
        return base_prompt
```

### 3. Security-Focused Review
Dedicated security scanning:
````python
class SecurityReviewer(AICodeReviewer):
SECURITY_PATTERNS = {
'sql_injection': r'(execute|query)\s*\([^)]*\+',
'hardcoded_secret': r'(password|api_key|token|secret)\s*=\s*["\'][^"\']+["\']',
'command_injection': r'(exec|system|shell_exec|eval)\s*\(',
'path_traversal': r'open\([^)]*\+',
}
def _check_security(self, chunk: Dict) -> List[ReviewComment]:
"""Pattern-based security checks"""
comments = []
content = chunk['content']
lines = content.split('\n')
for i, line in enumerate(lines):
for pattern_name, pattern in self.SECURITY_PATTERNS.items():
if re.search(pattern, line):
comments.append(ReviewComment(
path=chunk['file'],
line=chunk['start_line'] + i,
severity='critical',
message=f"Potential {pattern_name.replace('_', ' ')}: {line.strip()}"
))
return comments
def _build_review_prompt(self, chunk: Dict) -> str:
"""Security-focused prompt"""
return f"""You are a security-focused code reviewer. Analyze this change for vulnerabilities.
File: {chunk['file']}
Changes:
```diff
{chunk['content']}
```

Check for:
- SQL injection vulnerabilities
- XSS (Cross-Site Scripting) risks
- Authentication/authorization issues
- Sensitive data exposure
- Cryptographic weaknesses
- Command injection
- Path traversal
- Race conditions
- Input validation issues
- Hardcoded secrets
Be thorough. Even "unlikely" vulnerabilities are worth mentioning.

Format as: LINE <line_number> | SEVERITY <critical|warning|suggestion> | <message>"""
````
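The regex patterns and the model review catch different classes of problems, and nothing above combines them. One way to do it, sketched against the `SecurityReviewer` class above, is to run both per chunk and drop model findings on lines the patterns already flagged:

```python
class CombinedSecurityReviewer(SecurityReviewer):
    """Illustrative: union pattern-based and model-based findings per chunk."""

    def _review_chunk(self, chunk):
        pattern_hits = self._check_security(chunk)
        ai_hits = super()._review_chunk(chunk)

        flagged_lines = {c.line for c in pattern_hits}
        # Keep every regex hit, plus model findings on lines not already flagged
        return pattern_hits + [c for c in ai_hits if c.line not in flagged_lines]
```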
## Filtering Noise
AI can be verbose. Filter low-value comments:
```python
class SmartReviewer(AICodeReviewer):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.min_confidence = 0.7
self.ignore_patterns = [
r'consider adding comments', # Too generic
r'variable name could be better', # Subjective
r'this works but', # Vague
]
def _should_post_comment(self, comment: ReviewComment) -> bool:
"""Filter low-value comments"""
# Skip suggestions for style in tests
if 'test' in comment.path and comment.severity == 'suggestion':
return False
# Skip generic advice
for pattern in self.ignore_patterns:
if re.search(pattern, comment.message.lower()):
return False
# Critical always posts
if comment.severity == 'critical':
return True
# Warnings need substance
if comment.severity == 'warning':
return len(comment.message) > 50
return True
def post_review_comments(self, repo_name, pr_number, comments):
"""Post only high-value comments"""
filtered = [c for c in comments if self._should_post_comment(c)]
        super().post_review_comments(repo_name, pr_number, filtered)
```
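Overlapping chunks can also report the same finding twice. A small de-duplication pass before posting, sketched here using the `ReviewComment` dataclass from the main script, keeps the PR tidy:

```python
from typing import List

from ai_code_review import ReviewComment  # the dataclass defined in the main script

def dedupe_comments(comments: List[ReviewComment]) -> List[ReviewComment]:
    """Drop exact-duplicate findings reported by more than one chunk."""
    seen = set()
    unique = []
    for c in comments:
        key = (c.path, c.line, c.message)
        if key not in seen:
            seen.add(key)
            unique.append(c)
    return unique
```

One natural place to hook this in is `post_review_comments`, right after the `_should_post_comment` filtering.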
## Cost Optimization
Reviews can get expensive. Optimize token usage:
```python
class CostEfficientReviewer(AICodeReviewer):
def __init__(self, *args, max_chunk_lines=100, **kwargs):
super().__init__(*args, **kwargs)
self.max_chunk_lines = max_chunk_lines
def _should_review_chunk(self, chunk: Dict) -> bool:
"""Skip reviewing certain changes"""
# Skip large chunks (likely generated code)
if chunk['content'].count('\n') > self.max_chunk_lines:
return False
# Skip binary files
if self._is_binary(chunk['file']):
return False
# Skip lock files, generated code
skip_files = ['package-lock.json', 'yarn.lock', 'go.sum', '.pb.go']
if any(skip in chunk['file'] for skip in skip_files):
return False
return True
def review_pr(self, repo_name, pr_number, diff_content):
"""Review only relevant chunks"""
chunks = self._parse_diff(diff_content)
reviewable = [c for c in chunks if self._should_review_chunk(c)]
print(f"Reviewing {len(reviewable)}/{len(chunks)} chunks")
all_comments = []
for chunk in reviewable:
comments = self._review_chunk(chunk)
all_comments.extend(comments)
return all_comments
def _is_binary(self, filepath: str) -> bool:
"""Check if file is binary"""
binary_extensions = [
'.png', '.jpg', '.jpeg', '.gif', '.pdf',
'.zip', '.tar', '.gz', '.bin', '.exe',
'.so', '.dylib', '.dll'
]
        return any(filepath.endswith(ext) for ext in binary_extensions)
```
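Skipping chunks helps, but an explicit per-PR budget makes spend predictable. A rough sketch; the 4-characters-per-token heuristic and the 50,000-token budget are assumptions, not measured numbers:

```python
from typing import Dict, List

MAX_REVIEW_TOKENS = 50_000  # illustrative per-PR budget

def estimate_tokens(text: str) -> int:
    # Rough heuristic: roughly 4 characters per token for code and diffs
    return len(text) // 4

def select_within_budget(chunks: List[Dict]) -> List[Dict]:
    """Keep chunks in order until the rough token budget is spent."""
    selected, spent = [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk['content'])
        if spent + cost > MAX_REVIEW_TOKENS:
            break
        selected.append(chunk)
        spent += cost
    return selected
```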
## Integration with Existing Tools
Combine AI with traditional linters:
```yaml
# .github/workflows/comprehensive-review.yml
name: Comprehensive Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  static-analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Traditional linters
      - name: Run pylint
        run: pylint **/*.py
        continue-on-error: true

      - name: Run mypy
        run: mypy .
        continue-on-error: true

      - name: Security scan
        uses: aquasecurity/trivy-action@master

  ai-review:
    needs: static-analysis
    runs-on: ubuntu-latest
    steps:
      # AI review from earlier
      - name: AI Code Review
        run: python .github/scripts/ai_code_review.py
```
## Measuring Effectiveness
Track review quality:
```python
class MetricsTrackingReviewer(AICodeReviewer):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.metrics = {
'comments_generated': 0,
'comments_resolved': 0,
'false_positives': 0,
'time_saved_minutes': 0
}
    def track_comment_resolution(self, repo_name: str, pr_number: int):
        """Track which comments were acted on"""
        repo = self.github.get_repo(repo_name)
pr = repo.get_pull(pr_number)
ai_comments = [
c for c in pr.get_issue_comments()
if 'AI Code Review' in c.body
]
for comment in ai_comments:
# Check if comment led to code changes
if self._comment_was_addressed(comment, pr):
self.metrics['comments_resolved'] += 1
def _comment_was_addressed(self, comment, pr) -> bool:
"""Heuristic to detect if comment was acted on"""
# Check for commits after comment
commits_after = [
c for c in pr.get_commits()
if c.commit.author.date > comment.created_at
]
if not commits_after:
return False
# Check if files mentioned in comment were changed
files_in_comment = re.findall(r'`([^`]+\.\w+)`', comment.body)
changed_files = [f.filename for c in commits_after for f in c.files]
        return any(f in changed_files for f in files_in_comment)
```
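Tracking only pays off if someone reads the numbers. A hedged sketch of a periodic report, where `recently_merged_prs` stands in for however you query recently merged PRs, and which assumes `comments_generated` is incremented wherever comments are posted:

```python
# Hypothetical periodic report, using the MetricsTrackingReviewer class above
import os

reviewer = MetricsTrackingReviewer(
    anthropic_key=os.environ["ANTHROPIC_API_KEY"],
    github_token=os.environ["GITHUB_TOKEN"],
)

for pr_number in recently_merged_prs:  # supplied by your own PR query
    reviewer.track_comment_resolution("owner/repo", pr_number)

m = reviewer.metrics
rate = m['comments_resolved'] / max(m['comments_generated'], 1)
print(f"AI comments resolved: {m['comments_resolved']} ({rate:.0%})")
```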
## Best Practices
### 1. Start Small
Don’t review everything at once:
```python
# Start with security only
reviewer = SecurityReviewer(...)

# Then add style checks
reviewer = LanguageSpecificReviewer(...)

# Finally, comprehensive reviews
reviewer = SmartReviewer(...)
```
### 2. Set Clear Expectations
Add this to your PR template:
```markdown
## AI Review Checklist

The AI reviewer will check for:

- [ ] Common bugs and errors
- [ ] Security vulnerabilities
- [ ] Performance issues
- [ ] Missing tests

Please review AI comments, but use your judgment. False positives happen.
```
### 3. Human Override Always
```python
# Allow developers to skip AI review
if '[skip-ai-review]' in pr.title:
    print("Skipping AI review as requested")
    exit(0)
```
### 4. Learn from False Positives
```python
# Track patterns that generate false positives
false_positive_patterns = load_false_positives()

def is_likely_false_positive(comment: ReviewComment) -> bool:
    return any(
        pattern in comment.message
        for pattern in false_positive_patterns
    )
```
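`load_false_positives()` is left undefined above; a minimal sketch, assuming the patterns live in a JSON file in the repository (the path is illustrative):

```python
import json
from pathlib import Path

def load_false_positives(path: str = ".github/ai-review/false-positives.json") -> list:
    """Load known false-positive message patterns; empty list if none recorded yet."""
    fp_file = Path(path)
    if not fp_file.exists():
        return []
    return json.loads(fp_file.read_text())
```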
## Key Takeaways
Effective AI code review requires:
- Clear scope - Focus AI on what it does best
- Noise filtering - Post only high-value comments
- Cost awareness - Skip generated files and large chunks
- Human context - Always allow override
- Continuous improvement - Track and learn from outcomes
AI won’t replace code review, but it will make reviews faster and catch more issues. Start with security checks, expand gradually, and always keep humans in the loop.
*Augmenting human reviewers, not replacing them. One PR at a time.*