[Chapel Merge] A few communication micro-optimizations to improve

Branch: refs/heads/master
Revision: b2b7f03
Author: mppf
Log Message:

Merge pull request #17354 from mppf/fewer-comms-block-arr-init

A few communication micro-optimizations to improve Block array creation

This PR removes a bunch of unnecessary GETs in Block array creation. For
a simple benchmark from

https://github.com/Cray/chapel-private/issues/1789

I observed (with --no-cache-remote) 1068 GETs for locales other than 0
before this change; this PR brings that down to 780.

While I was investigating, I discovered that the verbose comm output with
--cache-remote enabled was not as useful as it could be. So, this PR
makes a couple of updates to fix verbose comm output with the cache
enabled.

Future Work:

  • When developing this PR, I observed a lot of inefficiency in
    privatization and reprivatization. These could be cleaned up to use
    the serialize/deserialize functions on the privatized classes.
  • Make similar improvements for other distributions.

Testing:

  • [x] full gasnet testing

Modified Files:
    A test/multilocale/engin/verboseComm.cache.good
    A test/multilocale/engin/verboseComm.compopts
    A test/multilocale/engin/verboseComm.no-cache.good
    R test/multilocale/engin/verboseComm.good
    M modules/dists/BlockDist.chpl
    M modules/internal/ChapelArray.chpl
    M modules/internal/ChapelDistribution.chpl
    M runtime/src/chpl-cache.c
    M runtime/src/comm/gasnet/comm-gasnet.c
    M test/arrays/slices/commCounts/sliceBlockWithRanges.good
    M test/arrays/slices/commCounts/sliceBlockWithRanges.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/alloc.block.good
    M test/distributions/robust/arithmetic/performance/multilocale/alloc.block.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/alloc_all.block.good
    M test/distributions/robust/arithmetic/performance/multilocale/alloc_all.block.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignReindex.block.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignReindex.block.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignReindex.cyclic.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignReindex.cyclic.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignReindex.replicated.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignReindex.replicated.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignSlice.block.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignSlice.block.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignSlice.cyclic.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignSlice.cyclic.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignSlice.replicated.good
    M test/distributions/robust/arithmetic/performance/multilocale/assignSlice.replicated.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.block.good
    M test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.block.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.cyclic.good
    M test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.cyclic.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.replicated.good
    M test/distributions/robust/arithmetic/performance/multilocale/reduceSlice.replicated.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignReindex.block.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignReindex.block.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignReindex.cyclic.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignReindex.cyclic.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignReindex.replicated.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignReindex.replicated.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignSlice.block.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignSlice.block.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignSlice.cyclic.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignSlice.cyclic.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignSlice.replicated.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/assignSlice.replicated.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/reduceSlice.block.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/reduceSlice.block.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/reduceSlice.cyclic.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/reduceSlice.cyclic.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/reduceSlice.replicated.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/reduceSlice.replicated.na-none.good
    M test/distributions/robust/arithmetic/performance/multilocale/rvfSlices/reduceSlice2.good
    M test/distributions/vass/changeBoundingBox.chpl
    M test/multilocale/engin/verboseComm.prediff

Compare: bc75efd9cddf...b2b7f03ac60d (chapel-lang/chapel on GitHub)