Faster than std::sort and pdqsort
Beats std::sort by 24% on M1 using sorting networks for small subsets and branchless partitioning.
Fast Branchless Quicksort for C and C++ - Single- and Multithreaded Versions
Branchless partitioning with 512-element stack buffer beats pdqsort on M1 and Ryzen.
C/C++ developers working on performance-critical applications
pdqsort · fluxsort · std::sort
Beats std::sort by 30% on M1 using sorting networks to eliminate conditional branches.
Beats std::sort and pdqsort by replacing branches with sorting networks.
Branch-avoidant stores beat std::sort on M1, but it's a micro-optimization of a solved algorithm.
Removing branches from Quicksort cuts sort time in half on Apple Silicon.
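
The two techniques named in the summaries above, sorting networks for small subsets and branchless partitioning, can be sketched roughly as follows. This is an illustrative C++ sketch, not the project's actual code: the names compare_exchange, sort4_network, and partition_branchless are hypothetical, the element type is fixed to int for brevity, and the 512-element stack buffer is not modelled. Whether a compiler really emits branch-free machine code for these functions depends on the compiler and target.

```cpp
#include <cassert>
#include <cstddef>

// Compare-exchange: after the call, a <= b.
// Written as two selects instead of an `if` around a swap, so compilers
// can typically lower it to conditional moves (a tendency, not a guarantee).
inline void compare_exchange(int& a, int& b) {
    const bool swap = b < a;
    const int lo = swap ? b : a;
    const int hi = swap ? a : b;
    a = lo;
    b = hi;
}

// Sorting network for exactly 4 elements: a fixed sequence of
// 5 compare-exchanges that does not depend on the input data.
inline void sort4_network(int* v) {
    assert(v != nullptr);
    compare_exchange(v[0], v[1]);
    compare_exchange(v[2], v[3]);
    compare_exchange(v[0], v[2]);
    compare_exchange(v[1], v[3]);
    compare_exchange(v[1], v[2]);
}

// Branchless Lomuto-style partition (assumed variant, chosen for brevity).
// Every iteration performs the same two stores; only the boundary index k
// advances by 0 or 1 depending on the comparison result.
// Returns k such that v[0..k) < pivot and v[k..n) >= pivot.
inline std::size_t partition_branchless(int* v, std::size_t n, int pivot) {
    std::size_t k = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const int x = v[i];
        v[i] = v[k];  // move the boundary element out of the way
        v[k] = x;     // place the current element at the boundary
        k += static_cast<std::size_t>(x < pivot);  // 0 or 1, no jump needed
    }
    return k;
}
```

A full sort would call partition_branchless recursively and switch to a network such as sort4_network once a subrange becomes small enough; the cutoff size and the pivot selection are left out here because the summaries do not state them.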