[Chapel Merge] Use narrow arrays for blockdist scan

Branch: refs/heads/main
Revision: 45c8bcd
Author: ronawho
Link: https://github.com/chapel-lang/chapel/pull/20590 (Use narrow arrays for blockdist scan by ronawho)
Log Message:

Merge pull request #20590 from ronawho/narrow-block-dist-scan

Use narrow arrays for blockdist scan

[reviewed by @benharsh and @bradcray]

Now that duplicate targetLocales are no longer allowed for Block arrays,
we know that the local array pointer myLocArr is safe to use. Using it
instead of locArr[locid] lets scan operations run on narrow pointers,
which avoids wide-pointer overhead.

On a 128-core Rome CPU running with just 2 cores, this takes a 16G Block
scan from 4.9s to 2.9s, bringing it on par with DR (default rectangular).
This has no real impact when using all 128 cores, since we're bandwidth
bound there. I expect this to improve performance on systems with fewer
cores and in cases where not all cores are used (e.g. a serial scan in a
parallel region).
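
For readers unfamiliar with the operation being timed, here is a minimal
user-level sketch of a scan over a default rectangular array and over a
Block-distributed array. It is not the code changed in BlockDist.chpl; the
names n, drArr, blockDom, and blockArr are made up for illustration, and the
`dmapped Block(...)` spelling assumes a Chapel release from around the time
of this merge (later releases spell the distribution `blockDist`).

```chapel
use BlockDist;

// Hypothetical problem size; the measurement above used a far larger array.
config const n = 8;

// Default rectangular (DR) array: stored entirely on the current locale.
var drArr: [1..n] int = 1;

// Block-distributed array over the same index set.
const blockDom = {1..n} dmapped Block(boundingBox={1..n});
var blockArr: [blockDom] int = 1;

// `+ scan` computes inclusive prefix sums, so both results
// hold the values 1, 2, ..., n.
writeln(+ scan drArr);
writeln(+ scan blockArr);
```

The user-visible result is the same for both arrays; the change described
above only affects how the Block implementation reaches each locale's local
chunk while computing it.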

Modified Files:
M modules/dists/BlockDist.chpl

Compare: https://github.com/chapel-lang/chapel/compare/4fba618fde29...45c8bcddae50