GitHub Repository

PostgreSQL backup manager with BTRFS block-level deduplication via dduper

2 stars · Go

by giis·Mar 24, 2026·2 points·0 comments

AI Analysis

●● Solid · Big Brain · Niche Gem

Gzip breaks dedup; this stores uncompressed snapshots on BTRFS for 85% savings.

Strengths
  • Identifies that compression destroys block-level deduplication efficiency on consecutive snapshots.
  • Single binary with no cloud dependencies keeps data entirely on your server.
  • Supports point-in-time recovery via WAL archiving alongside physical backups.
Weaknesses
  • Requires BTRFS filesystem and patched btrfs-progs, limiting adoption significantly.
  • Orchestration wrapper around pg_basebackup and dduper rather than a novel storage engine.
Target Audience

Self-hosted PostgreSQL admins running BTRFS

Similar To

pgBackRest · Wal-G · Barman

Post Description

I wrote BTRFS block-level deduplication (dduper) sometime in 2020 as a use case for a patch I sent in 2015 (https://stackoverflow.com/a/34163236), and now, six more years later, I have created a use case for dduper itself: pgdedup!

How does it work? Consecutive pg_basebackup snapshots share most of their blocks. Store them uncompressed on BTRFS and let dduper deduplicate them.

Interesting findings:
  • gzip completely breaks block-level dedup: two pg_basebackup -z runs of the same database produce < 1% matching blocks.
  • Chunk size matters hugely: dduper's default 128KB chunks found only 19% savings, while lowering to 8KB (PostgreSQL's page size) raised that to 68%.

Similar Projects

Infrastructure · ●● Solid

Rivestack – Managed PostgreSQL with pgvector, $29/mo

It spins up dedicated Postgres instances with pgvector pre-installed, uses Patroni for HA and pgBackRest for snapshots, and publishes concrete vector benchmarks (2k QPS @ <4ms for 10k vectors; 252 QPS at 1M). The stack choices (Hetzner NVMe, read replicas, HNSW) feel pragmatic for teams who don't want serverless/shared trade-offs, though I'd want clearer SLA/multi-region details and independent benchmarks at larger scales before moving critical workloads.

Niche Gem · Solve My Problem
stranger90
103mo ago
Infrastructure · ●● Solid

Clawstash – Encrypted incremental backups for OpenClaw

Nice little CLI: one-liner install and an interactive 'clawstash setup' get you an hourly daemon that auto-downloads restic and uploads AES-256 encrypted, deduplicated blocks to any S3-compatible store. It's pragmatic and tightly scoped — excellent if you run OpenClaw, but mostly a focused wrapper around restic rather than a novel backup system.

Niche Gem · Solve My Problem
a_micali
213mo ago