Branch: refs/heads/main
Revision: 5f3d3d24548a6273b60acaa98b30b466d61ecfe0
Author: e-kayrakli
Link: Update GPU Technote for 1.30 by e-kayrakli · Pull Request #21903 · chapel-lang/chapel · GitHub
Log Message:
Update GPU Technote for 1.30 (#21903)
This PR changes the GPU Technote in the following ways:
"Factual" Changes
- Clarifies order-independence rule
- Makes `CHPL_GPU_ARCH` discussion symmetric between NVIDIA and AMD
- Makes `CHPL_GPU_MEM_STRATEGY` discussion symmetric between `unified_memory` and `array_on_device`
- Drops the first `assertOnGpu` emphasis; the second one under "Utilities" is still there
- Drops limitation about lack of setting block size per kernel as per PR #21801 (GPU: Add procedures wrapping `syncThreads` and `allocShared` primitives, by DanilaFe)
- Adds lack of distributed array support as limitation
- Removes blanket recommendation for `CHPL_RT_NUM_THREADS_PER_LOCALE=1`, though keeps it as a limitation for CUDA 10. See https://github.com/Cray/chapel-private/issues/4554
- Adds a note about `Memory.Diagnostics` facilities
- Adds a small section about debugging/profiling for NVIDIA
Reorganization
- Splits "Requirements and Limitations" into two
  - "Requirements" are now under "Setup"
  - "Known Limitations" are at the very end of the note
- Adds a Further Information section
- Drops the initial summary paragraph, as the table of contents comes right after it and feels a bit redundant
[Reviewed by @stonea]
Diff:
M doc/rst/technotes/gpu.rst
https://github.com/chapel-lang/chapel/pull/21903.diff