New Issue: Should we support 'CHPL_INTERCONNECT' / 'CHPL_NETWORK'?

25616, "bradcray", "Should we support 'CHPL_INTERCONNECT' / 'CHPL_NETWORK'?", "2024-07-23T19:40:51Z"

Today, we support a CHPL_TARGET_PLATFORM variable that sometimes tells us a lot about the target platform, if it's something specific like an HPE Cray EX or Cray XC system, but sometimes tells us very little, as with a generic Linux cluster. In the latter case, the user has to set the CHPL_COMM* variables to specify how Chapel should map itself to the interconnect, using values like gasnet or ofi. In this issue, I'm wondering whether we should introduce a CHPL_INTERCONNECT or CHPL_NETWORK variable that would support values like none, slingshot, infiniband, ethernet, efa, unset, etc. This would be a higher-level way to describe the target system, and one that is likely to be more known/knowable to a user than the details of how our communication is implemented. From there, we could (typically) infer reasonable values for the lower-level CHPL_COMM* variables, while still permitting a user to set them explicitly if desired.

For example, I might imagine that setting CHPL_TARGET_PLATFORM=hpe-apollo would cause CHPL_INTERCONNECT to be inferred to be infiniband, which would then cause CHPL_COMM to be inferred to be gasnet and CHPL_COMM_SUBSTRATE to be inferred to be ibv (and so on). Yet on a Linux cluster that doesn't have a more specific platform identifier than linux64, a user could set CHPL_INTERCONNECT=infiniband and get the same lower-level settings. Or, on an Apollo system, the user could override the default and set CHPL_COMM=ofi if they wanted to try the ofi-based implementation.
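
To make the inference chain concrete, here is a minimal Python sketch of how the defaults could cascade. It is only an illustration of the idea, not the existing chplenv logic: the function name infer_comm_settings and both lookup tables are hypothetical, and the tables contain only the hpe-apollo / infiniband / gasnet / ibv mapping from the example above. It also assumes that a substrate default only applies when CHPL_COMM ends up being the comm layer that default belongs to.

```python
# Hypothetical lookup tables: only the entries from the example in this
# issue are included; a real implementation would need many more.
PLATFORM_TO_INTERCONNECT = {"hpe-apollo": "infiniband"}
INTERCONNECT_TO_COMM = {"infiniband": ("gasnet", "ibv")}


def infer_comm_settings(env):
    """Return a copy of env with inferred defaults filled in.

    Values the user set explicitly are never overwritten.
    """
    settings = dict(env)

    # Step 1: use the explicit interconnect, else infer it from the platform.
    interconnect = settings.get("CHPL_INTERCONNECT") or \
        PLATFORM_TO_INTERCONNECT.get(settings.get("CHPL_TARGET_PLATFORM"))
    if interconnect is None:
        return settings  # nothing known; leave the CHPL_COMM* settings alone
    settings["CHPL_INTERCONNECT"] = interconnect

    # Step 2: infer the comm layer and substrate from the interconnect.
    default = INTERCONNECT_TO_COMM.get(interconnect)
    if default is None:
        return settings
    comm, substrate = default
    settings.setdefault("CHPL_COMM", comm)

    # Assumption: the substrate default is only meaningful if the user did
    # not override CHPL_COMM with a different comm layer.
    if settings["CHPL_COMM"] == comm:
        settings.setdefault("CHPL_COMM_SUBSTRATE", substrate)
    return settings
```

With these tables, passing {"CHPL_TARGET_PLATFORM": "hpe-apollo"} fills in infiniband, gasnet, and ibv; passing linux64 with CHPL_INTERCONNECT=infiniband yields the same lower-level settings; and adding CHPL_COMM=ofi on an Apollo system keeps ofi and skips the gasnet substrate default.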

To me, this seems like it would prevent most users from ever having to set CHPL_COMM or its related variables, which feels like a win since that's more about how we implement things than about things a typical user would know, or should need to know.