New Issue: Provide quicker ways to get started with GPUs

e-kayrakli1 · April 15, 2024, 3:03pm

24741, "e-kayrakli", "Provide quicker ways to get started with GPUs", "2024-04-01T22:55:03Z"


opened 10:55PM - 01 Apr 24 UTC

Labels: type: Feature Request, area: Makefiles / Scripts, area: GPU Support


... and eliminate potential gotchas.

@DanilaFe brought up a particular example in the GPU meeting today:

var CpuArr: [1..10] int;
on here.gpus[0] {
  foreach i in 1..10 do CpuArr[i] = i;
}

This code fails today with `internal error: gpu-nvidia.c:370: Error calling CUDA function: an illegal memory access was encountered (Code: 700)`. This is because we're trying to write to CPU memory from inside a kernel.

This is an example of a potential gotcha. The missing piece in the implementation is GPU-driven comm, which is not something we can prioritize soon. Importantly, the code above does work with `CHPL_GPU_MEM_STRATEGY=unified_memory`, as the data transfer is handled by the driver through page migration.
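For comparison, here is a minimal sketch of one way to avoid the gotcha without switching memory strategies: keep the kernel's writes in GPU memory and copy the result back afterwards. It assumes a Chapel build with GPU support and at least one GPU on the locale; `GpuArr` is an illustrative name that is not part of the example above.

```chpl
// Sketch only: assumes a GPU-enabled Chapel build (CHPL_LOCALE_MODEL=gpu)
// and at least one GPU available on the current locale.
var CpuArr: [1..10] int;

on here.gpus[0] {
  // Declared inside the `on` block, so under the default memory strategy
  // this array lives in GPU memory. `GpuArr` is a hypothetical name.
  var GpuArr: [1..10] int;

  // The kernel now writes only to GPU-resident memory.
  foreach i in 1..10 do GpuArr[i] = i;

  // Whole-array assignment copies the device data back to the host array;
  // this happens outside the kernel, so no GPU-driven comm is needed.
  CpuArr = GpuArr;
}

writeln(CpuArr);
```

The trade-off is that the user has to know which arrays live where, which is exactly the kind of knowledge a newcomer may not have yet.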

With that context, the big-picture question is whether we can improve ergonomics, especially for newcomers to Chapel or to GPU programming in general. We could consider doing a combination of the following:

0. Improve the error message above. We do have some error messages we generate for GPU-driven get/put today. However, for simple cases like the ones above we squash get/puts. Maybe we should stop doing that, but note that it may mean doing some legal memory accesses through get/put, which in turn could reduce performance.
1. Add some scripts like `util/gpu/setchplenv.bash` or `util/setchplgpuenv.bash` of sorts and their quickstart versions.
2. We could make the quickstart versions use `CHPL_GPU_MEM_STRATEGY=unified_memory`, given that it should enable a quicker start, albeit a non-performant one. Though note that some code written in this mode will not work with the non-quick GPU config.
3. Maybe we embrace `unified_memory` a bit more as a quicker way of getting started with GPU programming in the technote.
