New Issue: Support for multidimensional arrays in loop expressions as kernels

25630, "jabraham17", "Support for multidimensional arrays in loop expressions as kernels", "2024-07-24T20:05:14Z"

Using a multidimensional array in a loop expression prevents the compiler from generating a GPU kernel. This should work and result in a kernel launch.

For example, the following loop expression cannot be gpuized:

on here.gpus[0] {

  const D = {1..10, 1..10};
  var x: [D] int;
  var y: [D] int;
  var z = foreach i in x.domain do x(i) != y(i);
}

Compilation with @assertOnGpu fails, because the compiler determines that the loop expression is not gpuizable. I also confirmed with start/stopVerboseGpu that no kernel is launched. Changing D to a one-dimensional domain such as {1..10} does work and results in a kernel launch. Changing the loop expression into a loop statement also results in a kernel launch, regardless of the domain:

var z: [D] bool;
foreach i in x.domain do z(i) = x(i) != y(i);
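
For anyone trying to reproduce this, here is a minimal sketch of how the loop-statement form could be checked end to end. It assumes a build with CHPL_LOCALE_MODEL=gpu and uses startVerboseGpu/stopVerboseGpu from the GpuDiagnostics module; the placement of the attribute and the diagnostic calls is my own arrangement for illustration, not taken from the original test file.

```chapel
use GpuDiagnostics;

// Print a line for each GPU operation (including kernel launches)
// that happens between the start and stop calls.
startVerboseGpu();

on here.gpus[0] {
  const D = {1..10, 1..10};
  var x: [D] int;
  var y: [D] int;
  var z: [D] bool;

  // Loop statement instead of a loop expression. The attribute makes
  // compilation fail if the loop is not eligible for GPU execution.
  @assertOnGpu
  foreach i in x.domain do z(i) = x(i) != y(i);
}

stopVerboseGpu();
```

If the loop is gpuized, the verbose output should include a kernel launch for the foreach; swapping the loop statement for the loop expression from the first example is the case that currently fails.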

The same is true of promotion, which currently cannot be gpuized with multidimensional arrays. The following fails to compile and is not turned into a kernel, but it works once the domain is changed to {1..10}:

on here.gpus[0] {

  const D = {1..10, 1..10};
  var x: [D] int;
  var y: [D] int;
  var z = x != y;
}

In the loop expression case, the compiler only reports that 'advance' is not GPU eligible, with no further information. In the promotion case, the compiler gives a little more information:

$CHPL_HOME/modules/internal/ChapelArray.chpl:3550: In function 'chpl__initCopy_shapeHelp':
$CHPL_HOME/modules/internal/ChapelArray.chpl:3598: error: Loop is marked with @assertOnGpu but is not eligible for execution on a GPU
$CHPL_HOME/modules/internal/ChapelArray.chpl:1341: note: called function has outer var access
$CHPL_HOME/modules/internal/ChapelArray.chpl:3598: note:   reached via call to 'advance' in loop body here
  $CHPL_HOME/modules/internal/ChapelArray.chpl:3533: called as chpl__initCopy_shapeHelp(shape: domain(unmanaged DefaultRectangularDom(2,int(64),one)), ir: _ir_chpl_promo1_!=) from function 'chpl__initCopy'
  foo2.chpl:9: called as chpl__initCopy(ir: _ir_chpl_promo1_!=, definedConst: bool)
note: generic instantiations are underlined in the above callstack