Branch: refs/heads/main
Revision: d8c48ef
Author: ronawho
Link: Improve our gasnet support for the ofi conduit by ronawho · Pull Request #20033 · chapel-lang/chapel · GitHub
Log Message:
Merge pull request #20033 from ronawho/improve-gasnet-ofi
Improve our gasnet support for the ofi conduit
GASNet-EX 2022.3.0 significantly improved the ofi conduit in order to
support Slingshot 10/11 networks and restored support for Omni-Path
networks. Now that ofi support is back, improve our usage of it.
For the chplenv, change the default conduit to be ofi when gasnet is
used on hpe-cray-ex systems and ensure segment fast is used in those
cases. Note that we don't always default to segment fast since that
isn't helpful on omni-path systems.
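As a rough illustration of the defaulting rule described above, here is a minimal Python sketch. The function name `sketch_gasnet_defaults` is hypothetical and this is not the actual chplenv code. It only covers the hpe-cray-ex case mentioned here, and it assumes that explicitly set values take precedence over the defaults, which the message does not state.

```python
import os


def sketch_gasnet_defaults(env=None):
    """Hypothetical sketch: return (substrate, segment) for a given environment.

    None means "not decided by this sketch" (left to whatever defaults
    already apply elsewhere).
    """
    env = os.environ if env is None else env
    comm = env.get("CHPL_COMM")
    platform = env.get("CHPL_TARGET_PLATFORM")
    substrate = env.get("CHPL_COMM_SUBSTRATE")
    segment = env.get("CHPL_GASNET_SEGMENT")

    if comm == "gasnet" and platform == "hpe-cray-ex":
        # gasnet on hpe-cray-ex: default the conduit to ofi.
        if substrate is None:
            substrate = "ofi"
        # Only pair segment fast with ofi on these systems; it is not
        # applied everywhere since it isn't helpful on omni-path.
        if substrate == "ofi" and segment is None:
            segment = "fast"
    return substrate, segment


print(sketch_gasnet_defaults({"CHPL_COMM": "gasnet",
                              "CHPL_TARGET_PLATFORM": "hpe-cray-ex"}))
```

With `CHPL_COMM=gasnet` on another platform and `CHPL_COMM_SUBSTRATE=ofi` set by hand, this sketch leaves the segment undecided rather than forcing fast.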
In the gasnet shim, serialize active-message polling for ofi (this is
similar to #14912 and #17419) and parallelize the heap fault-in for ofi
segment fast (similar to #17405). This results in significant performance
improvements for gasnet-ofi. For instance, parallelizing the heap
fault-in takes 16-node indexgather on SS-11 from 620 MB/s/node to 730
MB/s/node, and serializing polling takes it up to 3875 MB/s/node. On an
omni-path system, serializing polling takes us from 415 MB/s/node to 490
MB/s/node.
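To put those indexgather figures in relative terms, the short Python sketch below computes the speedup ratios. It assumes the SS-11 numbers are cumulative (730 MB/s/node with the parallel fault-in alone, 3875 MB/s/node once polling is also serialized); the helper name `speedup` is just for illustration.

```python
def speedup(before_mb_s, after_mb_s):
    """Ratio of the new rate to the old rate (both in MB/s/node)."""
    return after_mb_s / before_mb_s


# SS-11, 16-node indexgather
print(f"fault-in:  {speedup(620, 730):.2f}x")   # 1.18x
print(f"polling:   {speedup(730, 3875):.2f}x")  # 5.31x
print(f"combined:  {speedup(620, 3875):.2f}x")  # 6.25x

# Omni-path, serialized polling only
print(f"omni-path: {speedup(415, 490):.2f}x")   # 1.18x
```

Read this way, most of the SS-11 gain comes from serializing polling, while on omni-path the same change is worth roughly 18%.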
There are a few places where we expect gasnet-ofi might be our default.
This is definitely true for omni-path systems, where ofi with the psm2
provider is recommended. Note that our native CHPL_COMM=ofi layer does
not work with psm2 and we don't expect to put in the effort to get it
working (beyond the comm layer, we would also need spawning and
out-of-band support, which I don't think is worth adding currently). On
Slingshot-10 systems it's still up in the air whether gasnet-ofi,
gasnet-ucx, or our native ofi comm layer will be best in the long term.
Currently, our native ofi layer is not working there, but this is a bug
we need to address. Lastly, it's possible to use gasnet-ofi on Slingshot-11
systems, but we expect our native ofi comm layer to be the preferred
option since that's what we've mostly been developing it for. This is
much like using our native ugni layer on Aries systems instead of
gasnet-aries because it gives us more control on flagship HPE/Cray
systems.
In order to evaluate the current state of gasnet-ofi and what we might
recommend to users, I gathered performance figures on a few systems for 5
benchmarks that expose different patterns/idioms we care about:
- Stream (no communication, numa affinity sensitive)
- PRK-stencil (little comm, numa affinity sensitive)
- ISx (concurrent bulk comm, numa affinity sensitive)
- Indexgather (concurrent bulk/aggregated comm)
- RA (concurrent fine-grained comm -- RDMA (rmo) and AM (on) variants)
chpl --fast test/release/examples/benchmarks/hpcc/stream.chpl
chpl --fast test/studies/prk/Stencil/optimized/stencil-opt.chpl -sorder="sqrt(16e9*numLocales / 8):int"
chpl --fast test/studies/isx/isx-hand-optimized.chpl -smode=scaling.weakISO
chpl --fast test/studies/bale/indexgather/ig.chpl -sN=10000000 -sprintStats -smode=Mode.aggregated
chpl --fast test/release/examples/benchmarks/hpcc/ra.chpl -sverify=false -suseOn=false -sN_U="2**(n-12)" -o ra-rmo
chpl --fast test/release/examples/benchmarks/hpcc/ra.chpl -sverify=false -suseOn=true -sN_U="2**(n-12)" -o ra-on
./stream -nl 16
./stencil-opt -nl 16
./isx-hand-optimized -nl 16
./ig -nl 16
./ra-rmo -nl 16
./ra-on -nl 16
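For reference, the stencil compile line above sets `order` to `sqrt(16e9*numLocales / 8):int`. The Python sketch below just evaluates that expression for the 16-locale runs; the function name `stencil_order` is hypothetical, and it assumes the Chapel `:int` cast truncates the same way Python's `int()` does.

```python
import math


def stencil_order(num_locales):
    """Evaluate sqrt(16e9 * numLocales / 8), truncated to an integer."""
    return int(math.sqrt(16e9 * num_locales / 8))


print(stencil_order(16))  # 178885
```

Because the expression scales with `numLocales`, the total grid grows with the number of locales, so the per-locale problem size stays roughly constant.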
Omni-path: