Merge pull request #17811 from gbtitus/ofi-pmi
Simplify and improve comm=ofi PMI-based out-of-band support.
(Reviewed by @ronawho.)
For some time comm=ofi has had out-of-band support based on both Cray
PMI and "Slurm" PMI2, the latter being something of a misnomer because
it was really just plain old PMI2 but we were getting the libraries from
Slurm installs. These were referred to as the "pmi" and "slurm-pmi2"
versions of the out-of-band support. Here, merge these two, creating
"pmi2" out-of-band support. This retains the init and fini support
which was based on PMI2_Init(), PMI2_Initialized(), and PMI2_Finalize()
in both cases. By preference, barrier, allgather, and broadcast are
implemented in terms of PMI_Barrier(), PMI_Allgather(), and PMI_Bcast().
These are always present on Cray XC systems and, as of this moment, are
also present on HPE Cray EX systems when the user has the cray-pmi and
cray-pmi-lib modules loaded. Alternatively, these can be implemented by
falling back to PMI2_KVS_Fence(), PMI2_KVS_Get(), and PMI2_KVS_Put(),
via the PMI2 key-value store interface.
We declare these PMI_ and PMI2_KVS_ functions for barrier, allgather,
and broadcast as weak so user programs will link without complaints
about undefined externals either way, and then just call whatever
ends up having been loaded at execution time. We should have either the
Cray style or KVS style functions available to call at execution time no
matter what, since the shared object which provides PMI2_Init() should
also provide the PMI2 KVS interface.
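
As a rough illustration of the weak-symbol approach described above, here
is a minimal sketch for GCC or Clang on an ELF system. The helper names
(oob_style_t, oob_pick_style, oob_barrier) are hypothetical and are not
the names used in the runtime. Treating a bare PMI2_KVS_Fence() as a
barrier is an assumption made for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Weak declarations: if no library loaded at execution time defines
 * these, their addresses compare equal to NULL rather than producing
 * an undefined-external error at link time. */
extern int PMI_Barrier(void) __attribute__((weak));
extern int PMI2_KVS_Fence(void) __attribute__((weak));

/* Hypothetical names, for illustration only. */
typedef enum { OOB_NONE, OOB_CRAY_PMI, OOB_PMI2_KVS } oob_style_t;

/* Prefer the Cray-style function; otherwise fall back to the KVS one. */
static oob_style_t oob_pick_style(void) {
  if (PMI_Barrier != NULL) return OOB_CRAY_PMI;
  if (PMI2_KVS_Fence != NULL) return OOB_PMI2_KVS;
  return OOB_NONE;
}

/* Returns the PMI call's result, or -1 if neither style is available. */
static int oob_barrier(void) {
  switch (oob_pick_style()) {
  case OOB_CRAY_PMI: return PMI_Barrier();
  case OOB_PMI2_KVS: return PMI2_KVS_Fence(); /* assumption: fence acts as a barrier */
  default:           return -1;
  }
}
```

When this is built without any PMI library, both addresses are NULL, so
oob_pick_style() reports OOB_NONE and oob_barrier() returns -1 without
calling through a null pointer.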
In addition, we try to arrange for the number of key-value
store entries to be large enough for our needs if we end up having to
use that. We've seen many cases, especially on HPE Cray EX systems, in
which the default number of KVS entries is insufficient for what
we're doing with it. So here, set PMI_MAX_KVS_ENTRIES large enough for
our estimated needs on the launch side, unless it's already set and
larger than we need. The executable on the compute node will inherit
that. (We do this on the launch side because we're pretty sure this
environment variable gets read in .init code by the executable on the
compute node, so setting it there will be too late.)
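
The launch-side adjustment described above might look roughly like the
following sketch. The helper name ensure_kvs_entries is hypothetical, and
the required count is taken as a parameter because this message does not
state how the estimate is computed. An existing value is kept when it is
at least as large as what we need.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* for setenv() under strict -std modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: make sure PMI_MAX_KVS_ENTRIES is at least
 * `needed` in this process's environment, so that a launched child
 * inherits it. Returns the value left in place, or -1 on failure. */
static long ensure_kvs_entries(long needed) {
  assert(needed > 0);

  const char *cur = getenv("PMI_MAX_KVS_ENTRIES");
  if (cur != NULL) {
    char *end;
    long val = strtol(cur, &end, 10);
    /* Keep a well-formed existing setting that is already big enough. */
    if (end != cur && *end == '\0' && val >= needed) return val;
  }

  char buf[32];
  snprintf(buf, sizeof(buf), "%ld", needed);
  if (setenv("PMI_MAX_KVS_ENTRIES", buf, 1) != 0) return -1;
  return needed;
}
```

Calling this in the launcher before starting the executable means the
variable is already in the environment by the time any .init code on the
compute node reads it.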
As a necessary side effect, this adds a numLocales argument to the comm
layer implementation-specific chpl_comm_preLaunch() functions called by
Modified Files: A runtime/src/comm/ofi/comm-ofi-oob-pmi2.c