Reducing a 66-node dependency cycle to 13 in Scrapy

by pvizgenerator · Apr 23, 2026 · 2 points · 0 comments

AI Analysis

●● Solid · Big Brain · Wizardry

Dependency cycle analysis with iteration snapshots shows non-linear structural change patterns.

Strengths
  • Downloadable PViz snapshot bundles per iteration let you explore structural evolution yourself.
  • Distinguishes runtime SCC from conceptual SCC masked by TYPE_CHECKING guards.
  • Concrete metrics: 66→13 conceptual SCC (15 after a deliberate final regression), 23→2 runtime SCC across 68 tracked iterations.
Weaknesses
  • Showcase focuses on Scrapy refactor rather than PViz as a general-purpose tool.
  • No clear installation path or docs for applying this to your own codebase yet.
Target Audience

Maintainers of large Python codebases dealing with dependency cycles

Similar To

pyan · pydeps · CodeGraph

Post Description

I built PViz to help developers understand the structural shape of large codebases — surfacing module coupling, strongly connected components, and the edges driving them.
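For readers unfamiliar with the term, a strongly connected component (SCC) of an import graph is a set of modules in which every module can reach every other by following imports. The sketch below is a plain Tarjan implementation over a toy graph. It is not PViz's code, and the module names (`app.engine` and so on) are hypothetical rather than taken from Scrapy.

```python
def strongly_connected_components(graph):
    """Return the SCCs of a directed graph using Tarjan's algorithm.

    `graph` maps each module name to the modules it imports.
    Recursive, so it is only suitable for small illustrative graphs.
    """
    index_of = {}     # discovery order of each visited node
    lowlink = {}      # smallest discovery index reachable from the node
    stack = []        # nodes whose component is not yet finalized
    on_stack = set()
    components = []

    def visit(node):
        index_of[node] = lowlink[node] = len(index_of)
        stack.append(node)
        on_stack.add(node)
        for target in graph.get(node, ()):
            if target not in index_of:
                visit(target)
                lowlink[node] = min(lowlink[node], lowlink[target])
            elif target in on_stack:
                lowlink[node] = min(lowlink[node], index_of[target])
        if lowlink[node] == index_of[node]:
            # `node` is the root of a component: pop its members.
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in list(graph):
        if node not in index_of:
            visit(node)
    return components


# Hypothetical module names, not Scrapy's real import graph.
imports = {
    "app.engine": ["app.scheduler", "app.utils"],
    "app.scheduler": ["app.crawler"],
    "app.crawler": ["app.engine"],
    "app.utils": [],
}

sccs = strongly_connected_components(imports)
cycles = [c for c in sccs if len(c) > 1]
largest = max(sccs, key=len)

# Edges with both endpoints inside the largest SCC are the ones keeping it alive.
internal_edges = sorted(
    (src, dst)
    for src in largest
    for dst in imports.get(src, ())
    if dst in largest
)
```

In this toy graph the three modules that import each other in a loop form one SCC, while `app.utils` sits outside it. Removing any one of the three internal edges would dissolve the cycle, which is the kind of edge a refactor would target.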

To put it through its paces, I ran it on Scrapy with a concrete goal: reduce SCCs without changing runtime behavior.

Starting point → Final result:
- Largest conceptual (TYPE_CHECKING-masked) SCC: 66 → 15 nodes
- Runtime SCC: 23 → 2 nodes
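One way to read the runtime versus conceptual distinction: an import placed under `if TYPE_CHECKING:` never executes at runtime, so it adds no runtime edge, but the module still refers to the other module's types, so the conceptual edge remains. The sketch below uses the standard `ast` module to split a file's top-level imports into those two groups. It is an assumption about how such a split could be done, not a description of PViz's internals, and the `pkg.*` module names are hypothetical.

```python
import ast

# Hypothetical module source; the pkg.* names are illustrative only.
SOURCE = '''
from typing import TYPE_CHECKING

import pkg.settings
from pkg.http import Request

if TYPE_CHECKING:
    from pkg.crawler import Crawler
    import pkg.engine
'''


def _is_type_checking_test(test):
    """True for `TYPE_CHECKING` or `typing.TYPE_CHECKING`."""
    if isinstance(test, ast.Name):
        return test.id == "TYPE_CHECKING"
    if isinstance(test, ast.Attribute):
        return test.attr == "TYPE_CHECKING"
    return False


def _imported_modules(node):
    """Module names imported by a single statement (absolute imports only)."""
    if isinstance(node, ast.Import):
        return [alias.name for alias in node.names]
    if isinstance(node, ast.ImportFrom) and node.module:
        return [node.module]
    return []


def classify_imports(source):
    """Split top-level imports into runtime and TYPE_CHECKING-only sets.

    Simplification: only module-level statements are inspected, so
    function-local imports and `from . import x` forms are ignored.
    """
    runtime, type_only = set(), set()
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.If) and _is_type_checking_test(stmt.test):
            for inner in stmt.body:
                type_only.update(_imported_modules(inner))
        else:
            runtime.update(_imported_modules(stmt))
    return runtime, type_only


runtime, type_only = classify_imports(SOURCE)
conceptual = runtime | type_only  # conceptual graph keeps the guarded edges
```

Under this reading, the runtime graph is built from `runtime` alone and the conceptual graph from `conceptual`, which is why moving an import behind the guard can shrink the runtime SCC while leaving the conceptual SCC unchanged.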

I went in with no prior knowledge of the codebase. The refactor took 68 iterations and surfaced some non-obvious structural behaviors:

- Runtime coupling collapsed early (23 → 4 by iteration 17) while the conceptual graph stayed largely intact, suggesting that runtime and conceptual coupling respond to different kinds of changes.
- A ~24-iteration plateau (iterations 27–50) where the conceptual SCC held at 30 nodes, indicating a load-bearing architectural core that couldn’t be decomposed incrementally.
- A “kernel break” at iteration 51, where the crawler, engine, scraper, and spider middleware all exited the SCC in a single step: nonlinear progress after a long stall.
- A deliberate regression at the end (13 → 15): HTTP-layer coupling was identified as structurally necessary during testing and reinstated.

The full progression is documented through curated dependency snapshots across key iterations, along with test logs, a detailed analysis report, and compressed analysis bundles.

Happy to discuss if you find this interesting.
