Branch: refs/heads/main
Revision: 4ab5f08
Author: e-kayrakli
Link: Unavailable
Log Message:
Merge pull request #19804 from e-kayrakli/gpu-diag
Add GPUDiagnostics module and track GPU memory allocations
Adds GPUDiagnostics module. Makes memtracking record/report GPU ids.
The GPUDiagnostics module is heavily inspired by the CommDiagnostics module,
both in interface and in implementation. The current interface looks like:
startGPUDiagnostics()
stopGPUDiagnostics()
getGPUDiagnostics()
startVerboseGPU()
stopVerboseGPU()
Support for the *Here functions is future work.
The only events this module records are kernel launches.
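
The start/stop/get pattern listed above can be sketched in C as follows. This is a hypothetical, single-threaded model, not the code in runtime/src/chpl-gpu-diags.c: every name in it (gpu_diags_t, diags_start, on_kernel_launch, and so on) is invented for illustration. It assumes that, as with CommDiagnostics, launches are counted only between start and stop, and that verbose mode prints one line per launch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of a diagnostics counter; not the Chapel runtime's
 * actual data structures. */
typedef struct {
  uint64_t kernel_launch; /* the only event recorded in this sketch */
} gpu_diags_t;

static bool diags_on = false;   /* toggled by diags_start / diags_stop */
static bool verbose_on = false; /* toggled by verbose_start / verbose_stop */
static gpu_diags_t diags = {0};

/* Counterparts (by assumption) of startGPUDiagnostics / stopGPUDiagnostics. */
void diags_start(void) { diags_on = true; }
void diags_stop(void) { diags_on = false; }

/* Counterparts (by assumption) of startVerboseGPU / stopVerboseGPU. */
void verbose_start(void) { verbose_on = true; }
void verbose_stop(void) { verbose_on = false; }

/* Counterpart (by assumption) of getGPUDiagnostics: returns a snapshot. */
gpu_diags_t diags_get(void) { return diags; }

/* Hook that a runtime would call once per kernel launch. */
void on_kernel_launch(const char *kernel_name) {
  if (verbose_on) {
    printf("kernel launch: %s\n", kernel_name);
  }
  if (diags_on) {
    diags.kernel_launch++;
  }
}
```

In this model, a launch that happens before diags_start() or after diags_stop() leaves the counter unchanged, so a test can bracket only the region it cares about and compare the kernel_launch value against an expected count.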
This PR also does a major cleanup of the GPU tests. While doing that, I
realized that kernels can't call functions that access module-scope variables,
so a related test is futurized.
[Reviewed by @ronawho, @stonea and @daviditen]
Test:
- [x] gpu/native
Modified Files:
A modules/standard/GPUDiagnostics.chpl
A runtime/include/chpl-gpu-diags.h
A runtime/src/chpl-gpu-diags.c
A test/gpu/native/diags.chpl
A test/gpu/native/diags.good
A test/gpu/native/diags.prediff
A test/gpu/native/distArray/blockOutsideOnWorkaround.good
A test/gpu/native/kernelFnCalls/callFnAccessModVar.bad
A test/gpu/native/kernelFnCalls/callFnAccessModVar.future
A test/gpu/native/kernelFnCalls/fnWithForall.prediff
A test/gpu/native/multiGPU/multiGPU.good
A test/gpu/native/multiGPU/worksharing.good
A test/gpu/native/multiGPU/worksharingBasic.good
R test/gpu/native/dataPingPong.execopts
R test/gpu/native/dataPingPong.prediff
R test/gpu/native/distArray/blockInsideOn-verbose.chpl
R test/gpu/native/distArray/blockInsideOn-verbose.execopts
R test/gpu/native/distArray/blockInsideOn-verbose.good
R test/gpu/native/distArray/blockInsideOn-verbose.prediff
R test/gpu/native/distArray/blockOutsideOnWorkaround-verbose.chpl
R test/gpu/native/distArray/blockOutsideOnWorkaround-verbose.execopts
R test/gpu/native/distArray/blockOutsideOnWorkaround-verbose.good
R test/gpu/native/distArray/blockOutsideOnWorkaround-verbose.prediff
R test/gpu/native/distArray/blockOutsideOnWorkaround.good
R test/gpu/native/innerBlock.execopts
R test/gpu/native/innerBlock.prediff
R test/gpu/native/jacobi/jacobi-verbose.chpl
R test/gpu/native/jacobi/jacobi-verbose.execopts
R test/gpu/native/jacobi/jacobi-verbose.good
R test/gpu/native/jacobi/jacobi-verbose.prediff
R test/gpu/native/multiGPU/EXECOPTS
R test/gpu/native/multiGPU/PREDIFF
R test/gpu/native/multiGPU/README
R test/gpu/native/multiGPU/multiGPU.numlaunches
R test/gpu/native/multiGPU/worksharing.numlaunches
R test/gpu/native/multiGPU/worksharingBasic.numlaunches
M compiler/codegen/cg-expr.cpp
M doc/rst/meta/modules/standard.rst
M modules/Makefile
M runtime/include/chpl-comm-internal.h
M runtime/include/chpl-gpu.h
M runtime/include/chpl-mem-desc.h
M runtime/include/chplmemtrack.h
M runtime/include/stdchpl.h
M runtime/src/Makefile.share
M runtime/src/chpl-comm.c
M runtime/src/chpl-gpu.c
M runtime/src/chplmemtrack.c
M test/gpu/native/dataPingPong.chpl
M test/gpu/native/dataPingPong.good
M test/gpu/native/distArray/blockInsideOn.chpl
M test/gpu/native/distArray/blockInsideOn.good
M test/gpu/native/distArray/blockOutsideOnWorkaround.chpl
M test/gpu/native/innerBlock.chpl
M test/gpu/native/innerBlock.good
M test/gpu/native/jacobi/flags-no-checks.good
M test/gpu/native/jacobi/flags-warn-unstable.good
M test/gpu/native/jacobi/jacobi.chpl
M test/gpu/native/jacobi/jacobi.good
M test/gpu/native/kernelFnCalls/callFnAccessModVar.chpl
M test/gpu/native/kernelFnCalls/callFnAccessModVar.good
M test/gpu/native/kernelFnCalls/callFnFromFn.chpl
M test/gpu/native/kernelFnCalls/callFnFromFn.good
M test/gpu/native/kernelFnCalls/callTrivialFn.chpl
M test/gpu/native/kernelFnCalls/callTrivialFn.good
M test/gpu/native/kernelFnCalls/fnWithForall.chpl
M test/gpu/native/kernelFnCalls/fnWithForall.good
M test/gpu/native/multiGPU/multiGPU.chpl
M test/gpu/native/multiGPU/worksharing.chpl
M test/gpu/native/multiGPU/worksharingBasic.chpl
M test/gpu/native/streamPrototype/dr.chpl
M test/gpu/native/streamPrototype/dr.good
M test/gpu/native/streamPrototype/forallOverArray.chpl
M test/gpu/native/streamPrototype/forallOverArray.good
M test/gpu/native/streamPrototype/forallOverDomain.chpl
M test/gpu/native/streamPrototype/forallOverDomain.good
M test/gpu/native/streamPrototype/forallOverZipArray.chpl
M test/gpu/native/streamPrototype/forallOverZipArray.good
M test/gpu/native/streamPrototype/stream.chpl
M test/gpu/native/streamPrototype/stream.good
Compare: 34f29449ce5a...4ab5f08137b7 (chapel-lang/chapel)