Infrastructure · ●●● · Banger
cuSBF – Faster GPU Bloom Filter for Sequence Data
92× faster inserts than CPU Super Bloom, using minimizer-based shard selection.
Wizardry · Niche Gem
tdortman
2022d ago
High-Performance GPU Super Bloom Filter
Running on GPU, it delivers 92× faster inserts and 234× faster queries than the CPU Super Bloom implementation.
Bioinformatics researchers working with large-scale sequence data
Super Bloom · cuCollections · GQF
350x faster GPU Bloom filter with academic paper backing the performance claims.
Bloom filter + AHash pipeline cuts exact dedup from 7:55 to 2:55, 688MB vs 21.9GB RAM.
Compile CUDA for AMD GPUs with zero code changes—breaks NVIDIA lock-in.
Production-ready CUDA profiling when NSight only works in development.
28% faster Vulkan-to-CUDA on Qwen, but llm.c and llama.cpp already own inference.