GitHub Repository

XAI-driven augmentation & diagnostics for PyTorch vision - find model failures, fix with saliency-guided augmentation (ICD/AICD), prove with auditable reports.

21 stars · Python

BNNR – a closed-loop pipeline for improving vision models

by dominka·Apr 10, 2026·1 point·0 comments

AI Analysis

●● Solid · Big Brain · Niche Gem

XAI-driven model improvement loop, but Weights & Biases already tracks experiments better.

Strengths
  • OptiCAM and GradCAM integration explains why changes improved metrics, not just that they did
  • Parallel candidate strategy testing avoids blind hyperparameter tweaking
  • Live React + FastAPI dashboard with structured JSON reports for audit trails
Weaknesses
  • ML experiment tracking is crowded — W&B, MLflow, Neptune all do this with more features
  • Only 5 GitHub stars suggests early stage with unproven real-world adoption
Target Audience

Computer vision researchers, ML engineers improving vision models

Similar To

Weights & Biases · MLflow · Neptune

Post Description

Hi HN,

we’ve been working on computer vision models for a while, and one thing kept coming up: improving them is surprisingly unstructured.

You train a model, try a few augmentations, tweak some hyperparameters, run it again - and if the metric goes up, you keep it. But it’s often unclear why it improved, or whether the model is actually learning something better.

We kept running into questions like:
- Did this change actually help, or is it just noise?
- Is the model focusing on the right features?
- Are we improving generalization, or just overfitting differently?

So we built BNNR (Bulletproof Neural Network Recipe) - an open-source PyTorch toolkit that turns model improvement into a closed loop:
- Train a controlled baseline
- Explain what the model actually learned (via XAI methods like OptiCAM / GradCAM)
- Improve by testing candidate strategies in parallel
- Prove the result with structured comparisons

One thing we focused on is avoiding “blind” changes. Instead of committing to a single idea (e.g. an augmentation), BNNR evaluates multiple candidates and only keeps those that measurably improve a selected metric. The goal is to reduce manual trial-and-error, not replace control.
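The "evaluate several candidates, keep only the ones that beat the baseline" step can be sketched roughly as follows. This is a minimal illustration, not BNNR's actual API: `select_candidates`, `CandidateResult`, and the `min_gain` threshold are hypothetical names, and each candidate is assumed to be a callable that trains and evaluates a model and returns the selected metric (higher is better).

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CandidateResult:
    name: str
    score: float
    kept: bool


def select_candidates(
    baseline_score: float,
    candidates: Dict[str, Callable[[], float]],
    min_gain: float = 0.0,
) -> List[CandidateResult]:
    """Run every candidate and keep those beating the baseline by more than min_gain.

    Hypothetical helper: each callable stands in for "train with this
    strategy, evaluate on held-out data, return the metric".
    """
    names = list(candidates)
    # Candidates run concurrently; pool.map preserves the input order.
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(lambda n: candidates[n](), names))
    return [
        CandidateResult(n, s, (s - baseline_score) > min_gain)
        for n, s in zip(names, scores)
    ]


# Stubbed scores stand in for real training runs.
results = select_candidates(
    baseline_score=0.82,
    candidates={
        "icd": lambda: 0.84,
        "aicd": lambda: 0.81,
        "color_jitter": lambda: 0.823,
    },
    min_gain=0.005,
)
for r in results:
    print(r.name, r.score, "kept" if r.kept else "discarded")
```

A non-zero `min_gain` is one simple way to avoid keeping changes whose gain is within run-to-run noise; a real setup would more likely compare across several seeds.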

We also use explainability as part of the loop, not just for visualization.

For example, in one experiment a model classifying airplanes was mostly focusing on the sky background rather than the object itself. This kind of behavior is hard to spot from metrics alone. After applying a targeted modification based on the model’s attention, the focus shifted toward the airplane, and performance improved on held-out data.
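One way to put a number on "the model is looking at the sky, not the airplane" is to measure how much of the saliency mass falls inside the object. The sketch below assumes you already have a saliency map (from any CAM-style method) and a binary object mask, for example from segmentation annotations. The function name `saliency_inside_object` is hypothetical and not part of BNNR.

```python
import numpy as np


def saliency_inside_object(saliency: np.ndarray, object_mask: np.ndarray) -> float:
    """Fraction of non-negative saliency mass that lies inside the object mask."""
    sal = np.clip(saliency, 0.0, None)  # ignore negative attributions
    total = sal.sum()
    if total == 0:
        return 0.0
    return float(sal[object_mask.astype(bool)].sum() / total)


# Toy 4x4 example: the "airplane" occupies the central 2x2 block.
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

before = np.zeros((4, 4))
before[0, :] = 1.0        # strong attention on the top row ("sky")
before[1:3, 1:3] = 0.25   # weak attention on the object

after = np.zeros((4, 4))
after[0, :] = 0.25
after[1:3, 1:3] = 1.0     # attention now concentrated on the object

print(saliency_inside_object(before, mask))  # 0.2
print(saliency_inside_object(after, mask))   # 0.8
```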

Under the hood, some of the improvements are driven by XAI-based transformations:
- ICD (Intelligent Coarse Dropout) masks the most salient regions (what the model relies on too much), forcing it to learn from broader context
- AICD (Anti-ICD) does the opposite - it masks less relevant regions and keeps only what the model considers important

We don’t treat these as “magic augmentations”, but as ways to test hypotheses about what the model is actually using. BNNR works with its own augmentations as well as external libraries (e.g. Albumentations).
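To make the ICD / AICD idea concrete, here is a rough NumPy sketch of saliency-guided masking. It is an illustration under stated assumptions, not BNNR's implementation: the image is split into a `grid x grid` layout of cells, cells are ranked by mean saliency, and a fixed fraction is filled with a constant. The function name, the grid layout, `drop_fraction`, and `fill` are all hypothetical choices, and the saliency map is taken as a given input.

```python
import numpy as np


def saliency_guided_mask(image, saliency, grid=4, drop_fraction=0.25,
                         mode="icd", fill=0.0):
    """Mask grid cells of `image` chosen by mean saliency.

    image:    (H, W) or (H, W, C) array
    saliency: (H, W) array, higher = more important to the model
    mode:     "icd"  masks the MOST salient cells,
              "aicd" masks the LEAST salient cells.
    """
    if mode not in ("icd", "aicd"):
        raise ValueError("mode must be 'icd' or 'aicd'")
    h, w = saliency.shape
    if h % grid or w % grid:
        raise ValueError("H and W must be divisible by grid")
    ch, cw = h // grid, w // grid

    # Mean saliency per cell, shape (grid, grid).
    cell_scores = saliency.reshape(grid, ch, grid, cw).mean(axis=(1, 3))

    n_drop = int(round(drop_fraction * grid * grid))
    order = np.argsort(cell_scores, axis=None)  # ascending, flattened indices
    chosen = order[::-1][:n_drop] if mode == "icd" else order[:n_drop]

    cell_mask = np.zeros(grid * grid, dtype=bool)
    cell_mask[chosen] = True
    cell_mask = cell_mask.reshape(grid, grid)

    # Expand the cell-level mask back to pixel resolution.
    pixel_mask = np.repeat(np.repeat(cell_mask, ch, axis=0), cw, axis=1)

    out = image.copy()
    out[pixel_mask] = fill
    return out


# Toy usage: saliency peaks in the top-left cell of an 8x8 image.
img = np.ones((8, 8, 3))
sal = np.zeros((8, 8))
sal[0:2, 0:2] = 1.0
icd_img = saliency_guided_mask(img, sal, grid=4, drop_fraction=1 / 16, mode="icd")
aicd_img = saliency_guided_mask(img, sal, grid=4, drop_fraction=1 / 16, mode="aicd")
```

In this toy case `icd_img` has the top-left cell zeroed out, while `aicd_img` leaves it intact and removes one of the zero-saliency cells instead.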

Each run is tracked and visualised in a live dashboard, where you can see:
- baseline vs improved metrics
- per-class performance
- attention maps before/after
- candidate branches being explored in parallel and which ones were selected or discarded
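As a rough idea of what a "baseline vs improved, per class" comparison involves, the sketch below computes per-class accuracy for two sets of predictions and the difference between them. The helper names and the output layout are hypothetical and are not BNNR's report schema.

```python
import json

import numpy as np


def per_class_accuracy(y_true, y_pred, num_classes):
    """Accuracy restricted to samples of each true class (NaN if a class is absent)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        idx = y_true == c
        if idx.any():
            acc[c] = (y_pred[idx] == c).mean()
    return acc


def compare_runs(y_true, baseline_pred, improved_pred, class_names):
    """Per-class baseline accuracy, improved accuracy, and their difference."""
    n = len(class_names)
    base = per_class_accuracy(y_true, baseline_pred, n)
    imp = per_class_accuracy(y_true, improved_pred, n)
    return {
        name: {
            "baseline": float(b),
            "improved": float(i),
            "delta": float(i - b),
        }
        for name, b, i in zip(class_names, base, imp)
    }


# Both runs have the same overall accuracy (0.75) but differ per class.
report = compare_runs(
    y_true=[0, 0, 1, 1],
    baseline_pred=[0, 1, 1, 1],
    improved_pred=[0, 0, 1, 0],
    class_names=["airplane", "bird"],
)
print(json.dumps(report, indent=2))
```

The toy data is chosen so that the aggregate metric hides a per-class regression, which is the kind of thing a per-class view is meant to surface.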

There is a trade-off: evaluating multiple candidates in parallel adds compute cost. In practice, we’ve found it comparable to (or better than) manually running multiple experiments and tuning setups - but with much more structured results.

It’s still early, but currently supports:
- image classification
- multi-label classification
- object detection (just added today)

Would really appreciate feedback - especially from people experimenting with vision models or training pipelines.

You can try it here:
- GitHub: https://github.com/bnnr-team/bnnr
- Website: https://www.bnnr.dev/
- Colab: https://colab.research.google.com/github/bnnr-team/bnnr/blob...
