Merge pull request #17762 from gbtitus/ofi-default-fixed-heap
Default to a large fixed heap on systems we expect to benefit from it.
(Reviewed by @ronawho.)
On the target platforms with the most capable networks (InfiniBand, for
example) the ofi comm layer cannot use the best-performing providers
(verbs, for example) unless there is a fixed heap. But in the past the
comm layer hasn't created a fixed heap by default, meaning that users
had to force it to do so through the CHPL_RT_MAX_HEAP_SIZE environment
variable in order to get good performance.
Here, when a tentative provider search indicates that a fixed heap could
improve performance by enabling use of a better provider, create one by
default and make it large: 85% of the compute node's physical memory.
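As a rough illustration of the sizing rule above, here is a minimal
sketch in Python (not the runtime's actual C code). The function names
and the page-rounding step are assumptions made for the example; only
the 85% figure comes from this change.

```python
import os

HEAP_PERCENT = 85  # fraction of physical memory named in this change


def physical_memory_bytes():
    """Physical memory of this node, via POSIX sysconf (Linux and similar)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


def default_fixed_heap_size(phys_bytes, page_size=4096):
    """Hypothetical helper: 85% of phys_bytes, rounded down to a page.

    The rounding is an assumption of this sketch, not something the
    commit message specifies.
    """
    size = phys_bytes * HEAP_PERCENT // 100
    return size - size % page_size
```

For a node with 100 pages of 4096 bytes, default_fixed_heap_size(409600)
gives 348160 bytes, which is exactly 85 pages.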
We have to use a tentative provider search because we usually need to
make this decision early, when initializing the memory layer, which is
done long before the bulk of comm layer initialization. And we need
the decision this early because some memory layers, including the
default jemalloc-based one, initialize quite differently with and
without a fixed heap.
"Tentative" here means that the provider search uses just the base
libfabric capabilities that we require. In particular, unlike the full
provider search that comes later, we don't look at anything related to
MCM conformance.
Users can override this default through the CHPL_RT_MAX_HEAP_SIZE
environment variable, either setting it to a different size for the
fixed heap or setting it to "0" to disable creating a fixed heap
altogether.
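The precedence described above (explicit setting first, then the
tentative provider search) might be sketched as follows. This is an
illustrative Python model, not the runtime's C implementation: the
function name and its parameters are hypothetical, and the sketch only
parses plain byte counts, whereas the real variable may accept other
size formats.

```python
def choose_fixed_heap_size(env_value, provider_benefits, phys_bytes):
    """Return the fixed heap size in bytes, or None for no fixed heap.

    env_value:         value of CHPL_RT_MAX_HEAP_SIZE, or None if unset
    provider_benefits: result of the tentative provider search
                       (hypothetical boolean for this sketch)
    phys_bytes:        physical memory of the compute node
    """
    if env_value is not None:
        # The user's setting always wins; "0" disables the fixed heap.
        size = int(env_value)  # assumption: plain byte count only
        return size if size > 0 else None
    if provider_benefits:
        # New default from this change: 85% of physical memory.
        return phys_bytes * 85 // 100
    # No setting and no expected benefit: no fixed heap, as before.
    return None
```

A caller would pass os.environ.get("CHPL_RT_MAX_HEAP_SIZE") as
env_value, so that an unset variable arrives as None and is kept
distinct from an explicit "0".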
As part of this, adjust the verbose output from comm=ofi to report
whether a fixed heap exists and, if so, how big it is.
Modified Files: M doc/rst/platforms/libfabric.rst