24741, "e-kayrakli", "Provide quicker ways to get started with GPUs", "2024-04-01T22:55:03Z"
... and eliminate potential gotchas.
@DanilaFe brought up a particular example in the GPU meeting today:
var CpuArr: [1..10] int;
on here.gpus[0] {
  foreach i in 1..10 do CpuArr[i] = i;
}
This code fails today with the following error: "internal error: gpu-nvidia.c:370: Error calling CUDA function: an illegal memory access was encountered (Code: 700)". This is because we're trying to write to CPU memory from inside a kernel.
This is an example of a potential gotcha. The missing piece in the implementation is GPU-driven comm, which is not something we can prioritize soon. Importantly, the code above does work with CHPL_GPU_MEM_STRATEGY=unified_memory, as the data transfer is then handled by the driver through page migration.
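For comparison, here is a minimal sketch of how the same computation can be written so that the kernel only touches GPU memory, which should work without unified_memory. It assumes a GPU-enabled build (CHPL_LOCALE_MODEL=gpu); GpuArr is a name introduced here for illustration and is not part of the original example.

```chapel
var CpuArr: [1..10] int;          // allocated in CPU memory

on here.gpus[0] {
  // Declared inside the on block, so this array lives in GPU memory.
  var GpuArr: [1..10] int;

  // The kernel writes only to GPU memory.
  foreach i in 1..10 do GpuArr[i] = i;

  // Whole-array assignment runs outside the kernel and copies
  // the data from the GPU back to the CPU array.
  CpuArr = GpuArr;
}

writeln(CpuArr);
```

The difference from the failing version is that the write to CpuArr happens as a bulk copy outside the kernel rather than element by element inside it. This is exactly the extra step that a newcomer may not know is needed.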
With that context, the big-picture question is whether we can improve ergonomics, especially for newcomers to Chapel or to GPU programming in general. We could consider doing a combination of the following:
- Improve the error message above. We do have some error messages that we generate for GPU-driven get/put today. However, for simple cases like the one above, we squash get/puts. Maybe we should stop doing that, but note that it may mean doing some legal memory accesses through get/put, which in turn could reduce performance.
- Add scripts along the lines of util/gpu/setchplenv.bash or util/setchplgpuenv.bash, together with their quickstart versions.
  - We could make the quickstart versions use CHPL_GPU_MEM_STRATEGY=unified_memory, given that it should enable a quicker start, albeit a non-performant one. Note, though, that some code written in this mode will not work with the non-quickstart GPU config.
- Maybe the technote should embrace unified_memory a bit more as a quicker way of getting started with GPU programming.