Branch: refs/heads/main
Revision: 8b7a18f
Author: daviditen
Log Message:
Merge pull request #18308 from daviditen/clone-gpu-loops
Generate both GPU and CPU versions of outlined loops
[reviewed by @e-kayrakli and @gbtitus]
When outlining loops for GPU, keep two copies: the outlined one for GPU and
the original for CPU. Add a conditional that checks whether the code is
currently running on the GPU; if so, run the outlined code, otherwise run the
original loop.
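
A minimal C++ sketch of the shape of that dispatch, not the code the compiler
actually generates. The names Subloc, outlinedLoopBody and runLoop are
hypothetical and chosen only for illustration; the real check goes through the
requested-sublocale query, and the real outlined copy would run as a GPU kernel
rather than as a plain function.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the answer to "where was this task asked to run?".
enum class Subloc { Cpu, Gpu };

// Stand-in for the outlined copy of the loop. In real generated code this
// would be launched on the GPU; here it is an ordinary function.
static void outlinedLoopBody(std::vector<int>& data) {
  for (std::size_t i = 0; i < data.size(); ++i) data[i] *= 2;
}

// Runs exactly one of the two copies of the loop and reports which one ran.
std::string runLoop(Subloc requested, std::vector<int>& data) {
  if (requested == Subloc::Gpu) {
    outlinedLoopBody(data);  // outlined copy
    return "gpu";
  }
  // Original loop, kept in place for the CPU case.
  for (std::size_t i = 0; i < data.size(); ++i) data[i] *= 2;
  return "cpu";
}
```

Both branches compute the same result; only the copy that executes differs.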
Add a new primitive, PRIM_GET_REQUESTED_SUBLOC, that is generated as a call to
the runtime function chpl_task_getRequestedSubloc(). It is used to determine
whether the code is running on the GPU.
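
As a rough illustration of "generated as a call to", the toy C++ sketch below
maps a primitive tag to the text of the call it would emit. Only
PRIM_GET_REQUESTED_SUBLOC and chpl_task_getRequestedSubloc() come from this
commit; PrimitiveTag, PRIM_TOY_NOOP and emitPrimitive are hypothetical and do
not reflect the structure of the actual code generator in cg-expr.cpp.

```cpp
#include <string>

// Toy primitive tags. PRIM_TOY_NOOP is invented for this sketch.
enum PrimitiveTag { PRIM_GET_REQUESTED_SUBLOC, PRIM_TOY_NOOP };

// Returns the C expression text a toy code generator would emit for a tag.
std::string emitPrimitive(PrimitiveTag tag) {
  switch (tag) {
    case PRIM_GET_REQUESTED_SUBLOC:
      // The primitive lowers to a plain call into the tasking runtime.
      return "chpl_task_getRequestedSubloc()";
    case PRIM_TOY_NOOP:
      return "/* no-op */";
  }
  return "";
}
```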
If the locale model has GPUs, work around issue Cray/chapel-private#2413
by getting the requested sublocale instead of always using "any".
Update the gpu/native/jacobi test to run once on GPU and once on CPU.
Signed-off-by: David Iten <daviditen@users.noreply.github.com>
Modified Files:
M compiler/AST/primitive.cpp
M compiler/codegen/cg-expr.cpp
M compiler/include/primitive_list.h
M compiler/optimizations/deadCodeElimination.cpp
M compiler/optimizations/optimizeOnClauses.cpp
M compiler/util/exprAnalysis.cpp
M test/gpu/native/jacobi/jacobi.chpl
M test/gpu/native/jacobi/jacobi.good
Compare: https://github.com/chapel-lang/chapel/compare/a006cbcc3cc0...8b7a18f1e3bd