GitHub Repository

Claude Code SRE Handbook

11 stars · Python

K8s watcher that investigates incidents and opens PRs (it can't merge)

by har-ki · Jun 16, 2026 · 2 points · 0 comments

AI Analysis

Mid · Niche Gem · Ship It

A reference implementation for AI SRE workflows, but it is a blog example, not a deployable tool.

Strengths
  • Broken K8s manifest scenarios provide concrete test cases for AI SRE workflows.
  • Benchmark runner outputs JSONL data for measuring AI incident response performance.
  • OTEL + ClickHouse demo setup is genuinely useful for observability testing.
Weaknesses
  • Watcher is reference code for a blog post, not a packaged tool you can install.
  • No clear deployment path; using it in production requires significant adaptation.
Target Audience

SREs and platform engineers exploring AI-assisted incident response

Similar Projects

Infrastructure · ●● Solid

RunbookAI – Stop scrolling dashboards at 3 a.m., let AI investigate

The project converts on-call triage into a hypothesis-driven agent that forms and prunes hypotheses, fetches evidence from CloudWatch/Kubernetes and your runbooks, and surfaces an investigation plus approval-gated remediation steps. I like the npx demo, read-only-by-default K8s stance, and built-in audit trail; the obvious caveat is its dependence on proprietary LLM keys and the ops work needed before trusting any mutating actions in production.

Solve My Problem · Niche Gem · Wizardry
EmTekker
104mo ago
Developer Tools · ●● Solid

Nightwatch, The open-source, read-only AI SRE

Read-only AI agent architecture prevents production accidents during incident response.

Big Brain · Ship It
egorferber
331015d ago