Cannot make gpu-enabled chapel

Hello. I am trying to build a GPU-enabled runtime. I am running Linux Mint 22; clang is 18.1.3 and LLVM is 18.1.3 (as far as I can tell). My setchplenv.bash has 3 additional lines at the end:
export CHPL_LLVM=system
export CHPL_LOCALE_MODEL=gpu
export CHPL_GPU=nvidia

Adding or removing
export CHPL_CUDA_PATH=/usr/lib/cuda

does not seem to make a difference. Everything works fine almost to the end, when make fails with

In file included from <built-in>:1:
In file included from /usr/lib/llvm-18/lib/clang/18/include/__clang_cuda_runtime_wrapper.h:41:
/usr/lib/llvm-18/lib/clang/18/include/cuda_wrappers/cmath:27:15: fatal error: 'cmath' file not found
27 | #include_next <cmath>
| ^~~~~~~
1 error generated when compiling for sm_60.
make[6]: *** [Makefile.share:38: ../../../../build/runtime/linux64/llvm/x86_64/cpu-native/loc-gpu/gpu-nvidia/gpu_mem-array_on_device/comm-none/tasks-qthreads/tmr-generic/unwind-none/mem-jemalloc/atomics-cstdlib/hwloc-bundled/re2-none/fs-none/lib_pic-none/san-none/src/gpu/nvidia/gpu-nvidia-cub.o] Error 1
make[5]: *** [../../make/Makefile.runtime.foot:29: nvidia.makedir] Error 2
make[4]: *** [../make/Makefile.runtime.foot:29: gpu.makedir] Error 2
make[3]: *** [make/Makefile.runtime.foot:29: src.makedir] Error 2
make[2]: *** [Makefile:49: all.helpme] Error 2
make[1]: *** [Makefile:107: runtime] Error 2
make: *** [Makefile:70: comprt] Error 2

I wonder if anyone has any suggestion to have make succeed?

Regards

Nelson

Hi Nelson,

Not an answer, but what do you get when you run printchplenv --all --internal --anonymize?

Do you know if your system LLVM has support for targeting NVIDIA GPUs? printchplenv should check for it and fail if that's not the case. And judging by the output, it does have some sort of CUDA support.
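
If you want to check the LLVM side by hand, here is a minimal sketch. It assumes the system tool is named llvm-config-18 (an assumption on my part; adjust to your install) and that NVIDIA support shows up as the NVPTX target in the output of `llvm-config --targets-built`. The helper names are made up for illustration and are not part of Chapel.

```python
import shutil
import subprocess


def has_nvptx(targets_built: str) -> bool:
    """Return True if NVPTX is listed in `llvm-config --targets-built` output."""
    return "NVPTX" in targets_built.split()


def query_targets(llvm_config: str = "llvm-config-18") -> str:
    """Run llvm-config and return its target list ('' if the tool is missing).

    The default tool name is an assumption; pass your own if it differs.
    """
    if shutil.which(llvm_config) is None:
        return ""
    result = subprocess.run(
        [llvm_config, "--targets-built"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    print("NVPTX backend present:", has_nvptx(query_targets()))
```

This only tells you whether the backend was built into LLVM; it says nothing about the CUDA toolkit or the C++ headers clang will pick up.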

Engin

Hi Engin! Without setting CHPL_CUDA_PATH=/usr/lib/cuda, right after sourcing setchplenv.bash and before running make, I get

Error: Can't find libdevice. Please make sure your CHPL_CUDA_PATH is set such that CHPL_CUDA_PATH/nvmm/libdevice/libdevice*.bc exists. To avoid this issue, you can have GPU code run on the CPU by setting 'CHPL_GPU=cpu'. To turn this error into a warning set CHPLENV_GPU_REQ_ERRS_AS_WARNINGS.
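
The file lookup that this error describes can be reproduced by hand. A minimal sketch, assuming the directory is spelled nvvm (as in the CHPL_CUDA_LIBDEVICE_PATH line of the printchplenv output in this thread) and using a made-up helper name, find_libdevice, that is not part of Chapel:

```python
import glob
import os


def find_libdevice(cuda_path: str) -> list[str]:
    """List libdevice bitcode files under <cuda_path>/nvvm/libdevice."""
    pattern = os.path.join(cuda_path, "nvvm", "libdevice", "libdevice*.bc")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # /usr/lib/cuda is the path tried in this thread; yours may differ.
    cuda_path = os.environ.get("CHPL_CUDA_PATH", "/usr/lib/cuda")
    found = find_libdevice(cuda_path)
    if found:
        print("libdevice found:", found)
    else:
        print("no libdevice*.bc under", cuda_path)
```

An empty result for a given path suggests that path is not a usable value for CHPL_CUDA_PATH.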

If I set CHPL_CUDA_PATH=/usr/lib/cuda, I get a long list:
CHPL_HOST_PLATFORM: linux64
CHPL_HOST_COMPILER: gnu
CHPL_HOST_CC: gcc
CHPL_HOST_CXX: g++
CHPL_HOST_BUNDLED_COMPILE_ARGS: -DHAVE_LLVM -I/home/nldias/Downloads/chapel-2.1.0/third-party/jemalloc/install/host/linux64-x86_64-gnu/include
CHPL_HOST_SYSTEM_COMPILE_ARGS: -fno-rtti -I/usr/lib/llvm-18/include -std=c++17 -fno-exceptions -funwind-tables -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
CHPL_HOST_BUNDLED_LINK_ARGS: -L/home/nldias/Downloads/chapel-2.1.0/third-party/jemalloc/install/host/linux64-x86_64-gnu/lib -ljemalloc
CHPL_HOST_SYSTEM_LINK_ARGS: -lclang-cpp -L/usr/lib/llvm-18/lib -Wl,-rpath,/usr/lib/llvm-18/lib -L/usr/lib/llvm-18/lib -lLLVM-18 -lm -lpthread
CHPL_HOST_ARCH: x86_64
CHPL_HOST_CPU: none
CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: llvm
CHPL_TARGET_CC: /usr/lib/llvm-18/bin/clang
CHPL_TARGET_CXX: /usr/lib/llvm-18/bin/clang++
CHPL_TARGET_COMPILER_PRGENV: none
CHPL_TARGET_BUNDLED_COMPILE_ARGS: -I/home/nldias/Downloads/chapel-2.1.0/runtime/include/localeModels/gpu -I/home/nldias/Downloads/chapel-2.1.0/runtime/include/localeModels -I/home/nldias/Downloads/chapel-2.1.0/runtime/include/comm/none -I/home/nldias/Downloads/chapel-2.1.0/runtime/include/comm -I/home/nldias/Downloads/chapel-2.1.0/runtime/include/tasks/qthreads -I/home/nldias/Downloads/chapel-2.1.0/runtime/include -I/home/nldias/Downloads/chapel-2.1.0/runtime/include/qio -I/home/nldias/Downloads/chapel-2.1.0/runtime/include/atomics/cstdlib -I/home/nldias/Downloads/chapel-2.1.0/runtime/include/mem/jemalloc -I/home/nldias/Downloads/chapel-2.1.0/third-party/utf8-decoder -DHAS_GPU_LOCALE -I/home/nldias/Downloads/chapel-2.1.0/runtime/include/gpu/nvidia -DCHPL_JEMALLOC_PREFIX=chpl_je_ -I/home/nldias/Downloads/chapel-2.1.0/third-party/gmp/install/linux64-x86_64-native-llvm-none/include -I/home/nldias/Downloads/chapel-2.1.0/third-party/hwloc/install/linux64-x86_64-native-llvm-none-gpu/include -I/home/nldias/Downloads/chapel-2.1.0/third-party/qthread/install/linux64-x86_64-native-llvm-none-gpu-jemalloc-bundled/include -I/home/nldias/Downloads/chapel-2.1.0/third-party/jemalloc/install/target/linux64-x86_64-native-llvm-none/include
CHPL_TARGET_SYSTEM_COMPILE_ARGS: -I/usr/lib/cuda/include -D__STRICT_ANSI__=1
CHPL_TARGET_LD: /usr/lib/llvm-18/bin/clang++
CHPL_TARGET_BUNDLED_LINK_ARGS: -L/home/nldias/Downloads/chapel-2.1.0/lib/linux64/llvm/x86_64/cpu-native/loc-gpu/gpu-nvidia/gpu_mem-array_on_device/comm-none/tasks-qthreads/tmr-generic/unwind-none/mem-jemalloc/atomics-cstdlib/hwloc-bundled/re2-none/fs-none/lib_pic-none/san-none -lchpl -L/home/nldias/Downloads/chapel-2.1.0/third-party/gmp/install/linux64-x86_64-native-llvm-none/lib -lgmp -Wl,-rpath,/home/nldias/Downloads/chapel-2.1.0/third-party/gmp/install/linux64-x86_64-native-llvm-none/lib -L/home/nldias/Downloads/chapel-2.1.0/third-party/qthread/install/linux64-x86_64-native-llvm-none-gpu-jemalloc-bundled/lib -Wl,-rpath,/home/nldias/Downloads/chapel-2.1.0/third-party/qthread/install/linux64-x86_64-native-llvm-none-gpu-jemalloc-bundled/lib -lqthread -L/home/nldias/Downloads/chapel-2.1.0/third-party/jemalloc/install/target/linux64-x86_64-native-llvm-none/lib -ljemalloc
CHPL_TARGET_SYSTEM_LINK_ARGS: -L/usr/lib/cuda/lib64 -lcudart -lcuda -lm -lpthread
CHPL_TARGET_ARCH: x86_64
CHPL_TARGET_CPU: native
CHPL_RUNTIME_CPU: native
CHPL_TARGET_CPU_FLAG: arch
CHPL_TARGET_BACKEND_CPU: native
CHPL_LOCALE_MODEL: gpu *
CHPL_GPU: nvidia *
CHPL_GPU_ARCH: sm_60
CHPL_GPU_MEM_STRATEGY: array_on_device
CHPL_CUDA_PATH: /usr/lib/cuda *
CHPL_CUDA_LIBDEVICE_PATH: /usr/lib/cuda/nvvm/libdevice/libdevice.10.bc
CHPL_COMM: none
CHPL_TASKS: qthreads
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_HOST_MEM: jemalloc
CHPL_HOST_JEMALLOC: bundled
CHPL_MEM: jemalloc
CHPL_TARGET_MEM: jemalloc
CHPL_TARGET_JEMALLOC: bundled
CHPL_MAKE: gmake
CHPL_ATOMICS: cstdlib
CHPL_GMP: bundled
CHPL_GMP_IS_OVERRIDDEN: False
CHPL_HWLOC: bundled
CHPL_RE2: none
CHPL_RE2_IS_OVERRIDDEN: False
CHPL_LLVM: system *
CHPL_LLVM_SUPPORT: system
CHPL_LLVM_CONFIG: llvm-config-18
CHPL_LLVM_VERSION: 18
CHPL_LLVM_CLANG_C: /usr/lib/llvm-18/bin/clang
CHPL_LLVM_CLANG_CXX: /usr/lib/llvm-18/bin/clang++
CHPL_LLVM_STATIC_DYNAMIC: dynamic
CHPL_LLVM_TARGET_CPU: native
CHPL_AUX_FILESYS: none
CHPL_LIB_PIC: none
CHPL_SANITIZE: none
CHPL_SANITIZE_EXE: none
CHPL_RUNTIME_SUBDIR: linux64/llvm/x86_64/cpu-native/loc-gpu/gpu-nvidia/gpu_mem-array_on_device/comm-none/tasks-qthreads/tmr-generic/unwind-none/mem-jemalloc/atomics-cstdlib/hwloc-bundled/re2-none/fs-none/lib_pic-none/san-none
CHPL_LAUNCHER_SUBDIR: linux64/gnu/x86_64/loc-gpu/comm-none/tasks-qthreads/launch-none/tmr-generic/unwind-none/mem-jemalloc/atomics-cstdlib/lib_pic-none/san-none
CHPL_COMPILER_SUBDIR: linux64/gnu/x86_64/hostmem-jemalloc/llvm-system/18/san-none
CHPL_HOST_BIN_SUBDIR: linux64-x86_64
CHPL_TARGET_BIN_SUBDIR: linux64-x86_64-native
CHPL_SYS_MODULES_SUBDIR: linux64-x86_64-llvm
CHPL_LLVM_UNIQ_CFG_PATH: system
CHPL_GASNET_UNIQ_CFG_PATH: linux64-x86_64-native-llvm-none/substrate-none/seg-none
CHPL_GMP_UNIQ_CFG_PATH: linux64-x86_64-native-llvm-none
CHPL_HWLOC_UNIQ_CFG_PATH: linux64-x86_64-native-llvm-none-gpu
CHPL_HOST_JEMALLOC_UNIQ_CFG_PATH: host/linux64-x86_64-gnu
CHPL_TARGET_JEMALLOC_UNIQ_CFG_PATH: target/linux64-x86_64-native-llvm-none
CHPL_LIBFABRIC_UNIQ_CFG_PATH: linux64-x86_64-native-llvm-none
CHPL_LIBUNWIND_UNIQ_CFG_PATH: linux64-x86_64-native-llvm-none
CHPL_QTHREAD_UNIQ_CFG_PATH: linux64-x86_64-native-llvm-none-gpu-jemalloc-bundled
CHPL_RE2_UNIQ_CFG_PATH: linux64-x86_64-native-llvm-none
CHPL_PE_CHPL_PKGCONFIG_LIBS:

Your setup looks quite similar to mine (except that I have a manual CUDA install, but I don't see why that matters).

I don't see anything wrong in that config by reading through it. One thing that would be good to verify is libclang-18-dev. That package is a prerequisite for Chapel, and if it were missing the scripts should have caught it. But could something have gone wrong with that package? Could you see if you can reinstall it?

I think that clang18 is selecting g++14, but the system compiler is 14.
Previously a workaround was installing libstdc++-12-dev for clang to work
with gcc 11 (!) but this no longer works. I am trying CHPL_LLVM=bundled now.

I mean the system compiler is 23

Sorry 13

Ah, that's a good observation and I think I am familiar with similar issues. Unfortunately, clang is hard-wired to pick the newest GCC installation (and its libstdc++) in a given path. Please keep us posted on how it goes with the bundled LLVM.

Engin

Here is the upshot of trying to build Chapel 2.1 with GPU support on Linux Mint 22:

There is an apparently unsolvable incompatibility between Clang and GCC. The default gcc version is 13, and Clang is 18. However, many packages, including emacs-gtk, depend on libgcc-14-dev. If the latter is removed from the system, make succeeds with GPU support. However, as soon as libgcc-14-dev is reintroduced (by reinstalling emacs, for example), the compiler stops working. The only choice seems to be to revert to Linux Mint 21.3. I tried to upgrade gcc to 14, but it didn't work. I hope this info can save others a lot of hours trying to build GPU-enabled Chapel: at this point it seems better to use slightly older distributions.
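
A simplified model of what seems to be going on, offered as a sketch rather than a description of clang's real selection logic: clang takes the highest-numbered GCC directory it finds, and libgcc-14-dev creates a 14 directory without the matching C++ headers. The default paths below and the idea that the headers live under /usr/include/c++/<version> are assumptions about a Debian/Ubuntu-style layout, and the function names are made up.

```python
import os


def newest_gcc(gcc_root):
    """Return the highest numeric subdirectory of gcc_root, or None."""
    versions = [int(d) for d in os.listdir(gcc_root) if d.isdigit()]
    return max(versions) if versions else None


def has_cxx_headers(version, include_root):
    """Check whether <include_root>/<version>/cmath exists."""
    return os.path.isfile(os.path.join(include_root, str(version), "cmath"))


if __name__ == "__main__":
    # Assumed layout; adjust both paths for your distribution.
    gcc_root = "/usr/lib/gcc/x86_64-linux-gnu"
    include_root = "/usr/include/c++"
    if os.path.isdir(gcc_root):
        version = newest_gcc(gcc_root)
        print("newest GCC directory:", version)
        if version is not None:
            print("cmath present for it:", has_cxx_headers(version, include_root))
    else:
        print("no GCC directory at", gcc_root)
```

If the newest directory has no cmath next to it, that would be consistent with the "'cmath' file not found" error from the first post.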

Cheers Nelson

Thanks for that summary Nelson. This has to do with several flags clang uses to determine which GCC installation to use to find the standard C/C++ headers.

@mppf has a PR up that adds a new environment variable for better control over that selection, and we believe it can help you: "Add CHPL_LLVM_GCC_INSTALL_DIR as an alternative to CHPL_LLVM_GCC_PREFIX" (Pull Request #25913, chapel-lang/chapel on GitHub). With that PR, you should be able to set CHPL_LLVM_GCC_INSTALL_DIR to point to your gcc 13 installation to avoid the issue you were facing. We are getting close to the 2.2 release and want to include this feature in it. Would you be interested in giving the patch a try, if you are using the upstream sources by any chance? (I see that the diff in that PR applies cleanly to 2.1.0 as well, but I can't be sure there are no logical conflicts.)

Hi Engin: sure thing. Please send over the patch with a few instructions.
I will do my best :)

Cheers

Nelson

Hi Nelson -

It looks like you are using 2.1. If it were me, I would test the patch by trying it on a clone of Chapel (because it makes it easier to keep track of whether you have patched the sources / what version you have). If you wait a bit (until my PR is merged, later today), you can make a clone with git clone https://github.com/chapel-lang/chapel and then you can bring that up to date now or in the future with git pull.

Patching

If you want to try to patch 2.1, you can try that now. Save the following to a file, say patch.txt:

diff --git a/util/chplenv/chpl_llvm.py b/util/chplenv/chpl_llvm.py
index c29da81c40..7f898b5adf 100755
--- a/util/chplenv/chpl_llvm.py
+++ b/util/chplenv/chpl_llvm.py
@@ -649,7 +649,7 @@ def llvm_enabled():
     return False
 
 @memoize
-def get_gcc_prefix():
+def get_gcc_prefix_dir():
     gcc_prefix = overrides.get('CHPL_LLVM_GCC_PREFIX', '')
 
     # allow CHPL_LLVM_GCC_PREFIX=none to disable inferring it
@@ -710,6 +710,12 @@ def get_gcc_prefix():
 
     return gcc_prefix
 
+@memoize
+def get_gcc_install_dir():
+    gcc_dir = overrides.get('CHPL_LLVM_GCC_INSTALL_DIR', '')
+
+    return gcc_dir
+
 
 # The bundled LLVM does not currently know to look in a particular Mac OS X SDK
 # so we provide a -isysroot arg to indicate which is used.
@@ -808,9 +814,13 @@ def get_system_llvm_built_sdkroot():
 def get_clang_basic_args():
     clang_args = [ ]
 
-    gcc_prefix = get_gcc_prefix()
-    if gcc_prefix:
-        clang_args.append('--gcc-toolchain=' + gcc_prefix)
+    gcc_install_dir = get_gcc_install_dir();
+    if gcc_install_dir:
+        clang_args.append('--gcc-install-dir=' + gcc_install_dir)
+    else:
+        gcc_prefix = get_gcc_prefix_dir()
+        if gcc_prefix:
+            clang_args.append('--gcc-toolchain=' + gcc_prefix)
 
     sysroot_args = get_sysroot_resource_dir_args()
     if sysroot_args:

Then, cd to your chapel-2.1.0 directory and then run patch -p1 < patch.txt.

Using the patch

To understand what to set CHPL_LLVM_GCC_INSTALL_DIR to, try a test compile:

  • echo 'int main() { return 0; }' > hello.cc
  • clang++ -v hello.cc

This will print output along these lines:

  Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
  Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
  Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
  Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/14
  Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/14

The paths printed here are suitable for use with CHPL_LLVM_GCC_INSTALL_DIR. Since GCC 13 is the system default, that is the one you should pick.

For me, it amounts to doing

export CHPL_LLVM_GCC_INSTALL_DIR=/usr/bin/../lib/gcc/x86_64-linux-gnu/13
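
If you would rather not read the clang++ -v output by eye, here is a small sketch that pulls the candidate paths out of it. The function names are made up, and SAMPLE is just the output quoted above; on a real system you would capture the text from clang++ -v yourself (it goes to stderr).

```python
import os
import re

# Output quoted earlier in this post; replace with your own capture.
SAMPLE = """\
  Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
  Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
  Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
  Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/14
  Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/14
"""

CANDIDATE = re.compile(r"^\s*Found candidate GCC installation: (.+)$", re.M)
SELECTED = re.compile(r"^\s*Selected GCC installation: (.+)$", re.M)


def parse_gcc_installations(verbose_output):
    """Return (candidate paths, selected path or None) from clang -v text."""
    candidates = [path.strip() for path in CANDIDATE.findall(verbose_output)]
    match = SELECTED.search(verbose_output)
    selected = match.group(1).strip() if match else None
    return candidates, selected


def pick_version(candidates, version):
    """Return the candidate whose last path component equals version."""
    for path in candidates:
        if os.path.basename(path.rstrip("/")) == str(version):
            return path
    return None


if __name__ == "__main__":
    candidates, selected = parse_gcc_installations(SAMPLE)
    print("clang selected:", selected)
    print("export CHPL_LLVM_GCC_INSTALL_DIR=" + str(pick_version(candidates, 13)))
```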

You can check if all of this is working by running printchplenv --all. You should see CHPL_TARGET_CC / CHPL_TARGET_CXX / CHPL_TARGET_LD lines that include --gcc-install-dir options with the path you have selected.

For me:

./util/printchplenv  --all
 machine info: Linux iris 6.8.0-44-generic #44-Ubuntu SMP PREEMPT_DYNAMIC Tue Aug 13 13:35:26 UTC 2024 x86_64
CHPL_HOME: /home/mppf/chapel-old-versions/chapel-2.1.0
script location: /home/mppf/chapel-old-versions/chapel-2.1.0/util/chplenv
CHPL_HOST_PLATFORM: linux64
CHPL_HOST_COMPILER: gnu
  CHPL_HOST_CC: gcc
  CHPL_HOST_CXX: g++
CHPL_HOST_ARCH: x86_64
CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: llvm
  CHPL_TARGET_CC: /usr/lib/llvm-18/bin/clang --gcc-install-dir=/usr/bin/../lib/gcc/x86_64-linux-gnu/13
  CHPL_TARGET_CXX: /usr/lib/llvm-18/bin/clang++ --gcc-install-dir=/usr/bin/../lib/gcc/x86_64-linux-gnu/13
  CHPL_TARGET_LD: /usr/lib/llvm-18/bin/clang++ --gcc-install-dir=/usr/bin/../lib/gcc/x86_64-linux-gnu/13
...
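
The same check can be scripted. A minimal sketch, assuming you have saved the printchplenv --all output as text; the function name is made up and this is not something Chapel ships:

```python
TARGET_KEYS = ("CHPL_TARGET_CC", "CHPL_TARGET_CXX", "CHPL_TARGET_LD")


def missing_gcc_install_dir(printchplenv_output, keys=TARGET_KEYS):
    """Return the keys whose value lacks a --gcc-install-dir= option."""
    values = {}
    for line in printchplenv_output.splitlines():
        # Each line looks like "NAME: value"; split on the first colon only.
        name, sep, value = line.strip().partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return [key for key in keys if "--gcc-install-dir=" not in values.get(key, "")]


if __name__ == "__main__":
    sample = (
        "CHPL_TARGET_COMPILER: llvm\n"
        "  CHPL_TARGET_CC: /usr/lib/llvm-18/bin/clang "
        "--gcc-install-dir=/usr/bin/../lib/gcc/x86_64-linux-gnu/13\n"
        "  CHPL_TARGET_CXX: /usr/lib/llvm-18/bin/clang++\n"
    )
    print("lines still missing the option:", missing_gcc_install_dir(sample))
```

An empty list means all three lines carry the option; anything else suggests the variable was not picked up (for example, because it was not exported in the shell that runs printchplenv).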

Hi Michael!

Many thanks for the info on patching. I will try it as soon as I can. One question: the last line of the patch is

if sysroot_args:

Is there not a body for this last if?

cheers

Nelson

Hi Nelson -

Yes there is, in the source file. The way that patch files work is that any line beginning with + is a line to be added, any line beginning with - is a line to be removed, and any other line is context for the patch program, so that it can figure out where to make the changes. So, the rest of the if sysroot_args: block should be in the source code, but it isn't important for the patch program itself.

If you inspect the file after patching it, you should see well-formed source code that includes mentions of --gcc-install-dir and CHPL_LLVM_GCC_INSTALL_DIR. (And you shouldn't get errors from Python about bad code, either).
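
To make the +, - and context rules concrete, here is a toy sketch of applying one hunk. It is not how the real patch program is implemented (that one also reads the @@ header and can search for the right position); the function name and the tiny example are made up for illustration.

```python
def apply_hunk(source, hunk, start):
    """Apply one unified-diff hunk body to a list of source lines.

    source: the file's lines, without newlines
    hunk:   diff lines prefixed with ' ' (context), '+' (add) or '-' (remove)
    start:  0-based index in source where the hunk's first old line sits
    """
    result = source[:start]
    pos = start
    for line in hunk:
        # Some tools strip the lone space from blank context lines.
        tag, text = (" ", "") if line == "" else (line[:1], line[1:])
        if tag == "+":
            result.append(text)  # new line: not present in the old file
        elif tag in ("-", " "):
            if source[pos] != text:
                raise ValueError(f"hunk does not match at line {pos + 1}")
            pos += 1
            if tag == " ":
                result.append(text)  # context: kept as it was
        else:
            raise ValueError(f"unexpected hunk line: {line!r}")
    # Everything after the hunk is untouched, even if the hunk's last
    # context line opens a block.
    result.extend(source[pos:])
    return result


if __name__ == "__main__":
    old = ["a = 1", "if a:", "    print(a)", "    print('done')"]
    hunk = [" a = 1", "-if a:", "+if a > 0:", "     print(a)"]
    for line in apply_hunk(old, hunk, 0):
        print(line)
```

In the toy example the hunk stops at the print(a) context line, yet print('done') survives in the result, which is the same situation as the if sysroot_args: line at the end of the real patch.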

I see! Thanks.

Hi Michael and Engin,

I am happy to report that now (after the patch) make finishes without problems, that I can compile a sample program (one of mine), and that everything works fine.

Many thanks for the fast turnaround!

Best regards

Nelson

Great news! Thanks for trying it. We should have this mechanism available in 2.2 (to be released later this month).

Thanks for the report and checking the patch, Nelson! Thanks Michael for the quick fix!

Engin
