20290, "ShreyasKhandekar", "Creating arrays inside forall loops makes them ineligible for the GPU", "2022-07-22T20:37:12Z"
Creating an array of any type inside a forall loop prevents that loop from launching on the GPU.
It does not matter whether the array is created inside another structure (e.g. another for loop or an if statement) within the forall loop.
I observed this while working with the Sort benchmark of the SHOC suite.
Steps to Reproduce
Source Code:
use GPUDiagnostics;

on here.gpus[0] {
  startGPUDiagnostics();
  forall i in 1..<20 {
    var a : [1..10] int;
  }
  stopGPUDiagnostics();
  writeln(getGPUDiagnostics());
}
Running the above code gives the following output:
(kernel_launch = 0)
but commenting out the line that creates the array gives:
(kernel_launch = 1)
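The nested case mentioned at the top of this report can be sketched the same way. This is only an illustrative variant of the reproducer above: the `if` condition is a made-up placeholder, and the diagnostics calls are the same ones used in the original snippet. Based on the behavior described here, this version would also be expected to report no kernel launch, but that output has not been captured separately.

```chapel
use GPUDiagnostics;

on here.gpus[0] {
  startGPUDiagnostics();
  forall i in 1..<20 {
    // Hypothetical condition, chosen only to nest the declaration
    // inside another structure within the forall loop.
    if i % 2 == 0 {
      var a : [1..10] int;
    }
  }
  stopGPUDiagnostics();
  writeln(getGPUDiagnostics());
}
```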
Configuration Information
- Output of chpl --version:
chpl version 1.28.0 pre-release (a2c7053587)
built with LLVM version 13.0.0
Copyright 2020-2022 Hewlett Packard Enterprise Development LP
Copyright 2004-2019 Cray Inc.
(See LICENSE file for more details)
- Output of $CHPL_HOME/util/printchplenv --anonymize:
CHPL_TARGET_PLATFORM: cray-xc
CHPL_TARGET_COMPILER: llvm *
CHPL_TARGET_ARCH: x86_64
CHPL_TARGET_CPU: native *
CHPL_LOCALE_MODEL: gpu *
CHPL_COMM: none *
CHPL_TASKS: qthreads
CHPL_LAUNCHER: slurm-srun *
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_ATOMICS: cstdlib
CHPL_GMP: bundled
CHPL_HWLOC: bundled
CHPL_RE2: bundled
CHPL_LLVM: system *
CHPL_AUX_FILESYS: none
- Back-end compiler and version, e.g. gcc --version or clang --version:
gcc (GCC) 11.2.0 20210728 (Cray Inc.)
- (For Cray systems only) Output of module list:
Currently Loaded Modulefiles:
1) modules/3.2.11.4
2) craype-network-aries
3) nodestat/2.3.89-7.0.4.0_34.8__g8645157.ari
4) sdb/3.3.821-7.0.4.0_28.13__g8c59c9d.ari
5) udreg/2.3.2-7.0.4.0_37.11__g5f0d670.ari
6) ugni/6.0.14.0-7.0.4.0_28.11__ge0d449e.ari
7) gni-headers/5.0.12.0-7.0.4.0_38.14__gd0d73fe.ari
8) dmapp/7.1.1-7.0.4.0_40.13__gcec52bc.ari
9) xpmem/2.2.29-7.0.4.0_50.10__g35859a4.ari
10) llm/21.4.635-7.0.4.0_46.8__g33a55bc.ari
11) nodehealth/5.6.32-7.0.4.0_81.14__g66010cb.ari
12) system-config/3.6.3214-7.0.4.0_58.2__gcc05884c.ari
13) slurm/20.11.5-1
14) Base-opts/2.4.142-7.0.4.0_43.5__g8f27585.ari
15) cray-mpich/7.7.20
16) dws/3.0.38-7.0.4.0_69.9__gd993441.ari
17) cudatoolkit/10.2.89_3.28-7.0.3.0_2.66__g52c0314
18) gcc/11.2.0
19) craype/2.7.17.1
20) cray-libsci/20.09.1
21) pmi/5.0.17
22) atp/3.14.9
23) rca/2.2.22-7.0.4.0_27.13__ged51428.ari
24) perftools-base/22.04.0
25) PrgEnv-gnu/6.0.11