stonea
November 1, 2021, 5:02pm
1
Branch: refs/heads/main
Revision: 3c59cc1
Author: stonea
Link: Fix test that previously expected fatbin file to be output. by stonea · Pull Request #18663 · chapel-lang/chapel · GitHub
Log Message:
Merge pull request #18663 from stonea/gpu_bundle_bin_testfix
Fix test that previously expected fatbin file to be output.
Fixing the gpuAddNums test failure in Jenkins noted here:
https://chapel-ci.us.cray.com/job/cray-cs-correctness-test-cray-cs-gpu-native/246/testReport/(root)/gpu_native_gpuAddNums/gpuAddNums/
Introduced by this PR:
chapel-lang:main
← stonea:gpu_bundle_bin
opened 12:01AM - 27 Oct 21 UTC
This PR reads and store the contents of the generated `.fatbin` file into a glob… al string variable called `chpl_gpuBinary`. It also modifies Chapel's GPU runtime library to use that variable rather than reading in the `.fatbin` file at runtime.
After this change the `gpu_files` directory is still produced, but its contents are no longer used by the generated executable. In other words, we could safely remove the `gpu_files` directory after compilation.
Ultimately we may want to have Chapel do this, or better yet find a way of producing the fatbin without performing file I/O. This is complicated by the fact that we produce the GPU binary code on one thread and then insert it into a global on the other, so if we wanted to avoid file I/O we'd have to find some other way to transfer this data.
**To reviewers (or anyone else familiar with how Chapel handles strings):**
I've confirmed that these changes pass paratests.
One change I'm not totally happy with is adding `new_RawStringSymbol` and `genGlobalRawString` functions.
Looking at `symbol.cpp`, it looks like there are are string symbols, cstring literals, and bytes symbols. Since my string is for storing the fatbin code and not a normal null-terinated string it may contain \0 values in the middle. I also don't want to unescape the string during code generation (it looks like we have this postprocessing to rewrites things like '\n' if we're targetting LLVM).
I would have assumed that `new_BytesSymbol()` (in `symbol.h`) would be what I should use, but it looks like that function asserts if you call it after resolution. I tried commenting out that assertion, but I got other confusing errors about having multiple of the same global.
EDIT: Also let me know if this code I added to `embedGpuCode` to set the currentAST location is standard practice (if I don't do this I'll hit an assert complaining that currentAstLoc is null). Seems like we'd want some sort of sentinel object that sets currentAstLoc on construction and reverts it on destruction. This code as-is isn't exception safe, but I copied them pattern from somewhere else (I forget exactly where):
```C++
astlocT prevloc = currentAstLoc;
currentAstLoc = astlocT(0, astr("<internal>"));
// ...
currentAstLoc = prevloc;
```
I missed this failure in failure in paratests since we don't run the GPU tests and missed it on my manual run of those tests in Osprey as the fatbin file was left around from a previous testrun and I didn't clean up the environment between runs.
[Reviewed by @e-kayrakli ]
Modified Files:
M test/gpu/native/gpuAddNums/gpuAddNums.chpl
Compare: https://github.com/chapel-lang/chapel/compare/bfced523f53f...3c59cc1785e2