[Chapel Merge] Optimize scans by using noinit for the result array

Branch: refs/heads/master
Revision: 83015f8
Author: ronawho
Log Message:

Merge pull request #16919 from ronawho/noinit-scans

Optimize scans by using noinit for the result array

[reviewed by @bradcray and @mppf]

Our scan implementation previously created a default-initialized result
array that it would immediately overwrite. Here we just skip initialization
for POD types to avoid wasting time default-initializing something we’re
about to assign to anyway. Note that we don’t use noinit itself but instead
an internal equivalent, because noinit doesn’t work for non-POD arrays.
I don’t imagine scans on non-POD types being common, so I didn’t worry
about optimizing them. Doing so would require move-initializing the
elements.
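
For illustration only, here is a minimal user-level sketch of the idea.
It is not the code from this PR: the helper name inclusiveSumScan is
hypothetical, it is serial, and it uses the user-facing noinit on an int
(POD) array as a stand-in for the internal equivalent that the modules
actually use.

```chapel
// Hypothetical sketch, not the module code changed by this PR.
config const n = 8;

// Serial inclusive sum scan over a 1-D int array (assumed helper name).
proc inclusiveSumScan(A: [?D] int) {
  // Skip default initialization: every element of res is assigned
  // in the loop below before it is ever read.
  var res: [D] int = noinit;
  var running = 0;
  for i in D {
    running += A[i];
    res[i] = running;
  }
  return res;
}

var A: [1..n] int = 1;

// Both lines should print the same running sums.
writeln(inclusiveSumScan(A));
writeln(+ scan A);
```

The sketch is only safe because the loop writes every index of res; a
non-POD element type would instead need its elements move-initialized,
as noted above.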

For scanPerf with 1/4 of memory, this shaves ~1.0s off DefaultRectangular
scan for 1 locale (3.9s -> 2.9s) and ~0.3s off BlockDist scan for 2-512
locales (2.7s-3.0s -> 2.4s-2.7s). Note that the multi-locale scan is faster
than the single-locale one because we use hugepages for multi-locale ugni,
which improves performance of the raw memory operations.

Modified Files:
A test/scan/testScanPOD.chpl
A test/scan/testScanPOD.good
A test/scan/testScanPOD.numlocales
M modules/dists/BlockDist.chpl
M modules/internal/DefaultRectangular.chpl

Compare: d469977a2465...83015f80f295 (chapel-lang/chapel on GitHub)