Merge pull request #16919 from ronawho/noinit-scans
Optimize scans by using noinit for the result array
Our scan implementation previously created a default-initialized array
that it would immediately write to. Here we just skip initialization for
POD types to avoid wasting time default-initializing something we’re
about to assign to anyways. Note that we don’t use noinit but instead
an internal equivalent because noinit doesn’t work for non-POD arrays.
I don’t imagine scans on non-POD types being common so I didn’t worry
about optimizing them. Doing so would require using move-initialize the
For scanPerf with 1/4 of memory, this shaves ~1.0s off DefaultRectangular
scan for 1 locale (3.9s -> 2.9s) and ~0.3s off BlockDist scan for 2-512
locales (2.7s-3.0s -> 2.4s-2.7s). Note that multi-locale scan is faster
than single-locale because we use hugepages for multi-locale ugni, which
improves performance of the raw memory operations.