Problem building Chapel on WSL Ubuntu with GPU

I'm getting this error:
Error: Could not find the clang header /usr/lib/llvm-19/include/clang/Basic/Version.h

clang-19 --version gives:
Ubuntu clang version 19.1.7 (++20250114103332+cd708029e0b2-1~exp1~20250114103446.78)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm-19/bin

I see this include directory:
\wsl.localhost\Ubuntu\usr\lib\llvm-19\lib\clang\19\include

BTW, I was able to build Chapel without GPU support and got 'Hello World' working.

Please help!

Welcome to the Chapel Discourse, @iarshavsky !

We can take a closer look, but first I wanted to make sure you check out this article: Measure the Performance of your Gaming GPU with Chapel

On the surface, this error looks like it might be about not having the Clang developer headers, e.g. libclang-dev, which is covered in that article.
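If it's useful, here's a minimal sketch for checking whether those headers are actually on disk before rebuilding; the header path is taken from your error message, and the package names are assumptions for Ubuntu's LLVM 19 packages:

```shell
# Check for the header the Chapel build is looking for (path from the
# error message above).
hdr=/usr/lib/llvm-19/include/clang/Basic/Version.h
if [ -f "$hdr" ]; then
    echo "clang developer headers found"
else
    # Assumed Ubuntu package names for the LLVM 19 developer files.
    echo "headers missing - try: sudo apt install libclang-19-dev llvm-19-dev"
fi
```

If the header appears after installing, rebuild Chapel with CHPL_LLVM=system so it picks up the system LLVM.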

Engin


This is looking great. I am following the instructions. I started with the CUDA installation, which looked good, but then tried to check CUDA and got the following:

igor@AIMobile10:~/Chapel/chapel-2.4.0$ nvcc -O3 -o cuda-stream cuda-stream.cu -allow-unsupported-compiler
/usr/include/x86_64-linux-gnu/bits/strings_fortified.h(26): error: identifier "__builtin_dynamic_object_size" is undefined
/usr/include/x86_64-linux-gnu/bits/strings_fortified.h(33): error: identifier "__builtin_dynamic_object_size" is undefined
/usr/include/x86_64-linux-gnu/bits/string_fortified.h(30): error: identifier "__builtin_dynamic_object_size" is undefined

Many more errors follow.

See the gcc --version output:
igor@AIMobile10:~/Chapel/chapel-2.4.0$ gcc --version
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I have gcc 13, which is why I'm using -allow-unsupported-compiler.

Could you please help?

Thanks a lot!

The GCC (or maybe more likely libc) version is certainly suspect here. GCC 13.3 doesn't seem to be a supported host compiler for Ubuntu 24.04 as per: 1. Introduction — Installation Guide for Linux 12.8 documentation

If you have downloaded CUDA 12.8, you may need to back down to GCC 13.2. Note that a similar issue involved a potential compatibility regression introduced with GCC 13.2.1, where backing down to 13.2.0 seemed to fix it.
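One more thing worth trying: rather than bypassing the check with -allow-unsupported-compiler, you can point nvcc at a specific host compiler with its -ccbin flag. A sketch, assuming a CUDA-supported g++ (e.g., g++-12; the package name is an assumption) installed alongside the default:

```shell
# Install an older host compiler alongside the distro default.
sudo apt install g++-12

# Tell nvcc to use it as the host compiler instead of the system gcc.
nvcc -O3 -ccbin g++-12 -o cuda-stream cuda-stream.cu
```

That keeps nvcc's compatibility guard in place while letting you keep the distro's default gcc for everything else.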

Let us know how it goes.

Engin


I did the following:

cd ~
sudo apt install build-essential
sudo apt install libmpfr-dev libgmp3-dev libmpc-dev -y
wget http://ftp.gnu.org/gnu/gcc/gcc-13.2.0/gcc-13.2.0.tar.gz
tar -xf gcc-13.2.0.tar.gz
cd gcc-13.2.0
./configure -v --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --prefix=/usr/local/gcc-13.2.0 --enable-checking=release --enable-languages=c,c++ --disable-multilib --program-suffix=-13.2.0
make -j3
sudo make install
cd /usr/bin
sudo ln -sf gcc-13.2.0 gcc

Reinstalled CUDA 11.8.0 with the upgrade option (since it was already there); gcc 13.2.0 is not supported by that release, so I used --override:
sudo sh cuda_11.8.0_520.61.05_linux.run --override

See below results of CUDA test run:

nvcc -O3 -o cuda-stream cuda-stream.cu -allow-unsupported-compiler

/usr/include/stdlib.h(141): error: identifier "_Float32" is undefined

/usr/include/stdlib.h(147): error: identifier "_Float64" is undefined

/usr/include/stdlib.h(153): error: identifier "_Float128" is undefined
[many similar errors follow]

So... I am dead in the water. Any clue?

Thanks a lot,

Igor

P.S. I am a US citizen working in the Curtiss-Wright Nuclear Simulation Division 🙂

Hi Igor —

I'm probably not the best person to respond here, but Engin's out for the evening so...

Doing some cursory web searching, it seems that CUDA historically hasn't done the best job maintaining compatibility with newer versions of gcc and that the problem here is that your gcc version is newer than CUDA supports. Specifically, I think the cause of your woes is throwing -allow-unsupported-compiler. If your compiler isn't supported by CUDA, it's probably because of issues like this that they're trying to protect you from, but by throwing that flag, you're effectively taking down the guardrails and hitting the issues firsthand.

In the release notes for CUDA 12.8, I'm seeing that they have addressed some gcc portability problems on Ubuntu and claim support for gcc 13.2. So I would suggest upgrading to CUDA 12.8, sticking with your gcc version 13.2, and seeing if that works (while avoiding the -allow-unsupported-compiler flag).

-Brad

Hi Igor,

I have been trying to reproduce your experience on my local systems.

I am not able to reproduce this failure mode on Windows 11 (kernel 5.15.167.4-microsoft-standard-WSL2) with Ubuntu 24.04.1 LTS. I used the default gcc, gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, after downloading and installing CUDA 12.8 from CUDA Toolkit 12.8 Update 1 Downloads | NVIDIA Developer.

I was able to use the same command you reported to compile the example cuda-stream.cu code and then run the program. I also noted that I do not need the -allow-unsupported-compiler flag and get the same results with or without it.

$ which nvcc
/usr/local/cuda-12.8/bin/nvcc

w/ unsupported compiler flag

$ nvcc -O3 -o cuda-stream cuda-stream.cu -allow-unsupported-compiler
nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).

w/o unsupported compiler flag

nvcc -O3 -o cuda-stream cuda-stream.cu
nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).

From the error I am guessing there is an issue with the build environment. Is this a fresh WSL setup?

Edit: I see now that you were using CUDA 11.8 instead of 12.8. With Ubuntu 24.04, LLVM 18 is available as a package, so the limitation that forced us to write the article around CUDA 11.8 is no longer a constraint, and you may have more favorable results if you use the later CUDA Toolkit version.
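For reference, a sketch of the setup I'd try on 24.04, assuming the distro's LLVM 18 packages and Chapel's standard environment variables:

```shell
# Use the distro LLVM for the Chapel build (assumed Ubuntu package names).
sudo apt install llvm-18-dev libclang-18-dev clang-18

# Standard Chapel settings: system LLVM plus the GPU locale model.
export CHPL_LLVM=system
export CHPL_LOCALE_MODEL=gpu
cd ~/Chapel/chapel-2.4.0 && make
```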

Thank you Brad,
I understand this is a CUDA issue, and you guys don't really work on/support WSL (Ubuntu). I will try to resolve the issue by looking into NVIDIA support and such. I am impressed by what you have done (Chapel); my experience is mostly nuclear-related simulations, though I have some yacc experience. If you think I can try to join your team of contributors, please give me a hand with ... llvm? and such.

Igor

Thank you Ahmad,
Will try to follow your advice.

Igor

I got it, thank you very much! So: 1. gcc 13.2.0; 2. nvcc 12.8.93


Brad,
Thanks to Ahmad Rezaii I got through the CUDA version compatibility issues. In case it is of interest to other people:

For WSL ubuntu 24.04.2
gcc 13.2.0
nvcc 12.8.93

Works !


Ahmad,
I did the Chapel build and ran the hello_gpu test, and got the following output:

Num gpu 1
Hello from LOCALE0 gpu: 0
On locale 0 diags = (kernel_launch = 1, host_to_device = 1, device_to_host
= 1, device_to_device = 0)
CORRECT = true

(I added the first print line to make sure my gpu got recognized).

So, it all looks not bad, right?

Do you have a decent test to see that Chapel is running vector operations on the GPU?

Thank you so much !!!

Igor


Hi Igor,

chapel/test/gpu/native/examples/releaseNotes/basic.chpl at main · chapel-lang/chapel · GitHub has an extremely basic test for making sure Chapel is using GPUs as expected.

If you're looking for a more involved example/benchmark with a few more complicated computations, chapel/test/gpu/native/jacobi/jacobi.chpl at main · chapel-lang/chapel · GitHub is a good one to test on your system as well.
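If you want to grab and build it quickly, something like the following should work, assuming chpl is on your PATH and GitHub's usual raw-file URL layout for the path given above:

```shell
# Download the Jacobi example straight from the Chapel repo and build it.
wget https://raw.githubusercontent.com/chapel-lang/chapel/main/test/gpu/native/jacobi/jacobi.chpl
chpl jacobi.chpl -o jacobi
./jacobi
```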

I ran jacobi and got exactly the same numbers 'On GPU' and 'On CPU'. Does that tell you anything?

Thanks !!!

Ah, sorry for the complete stupidity, I looked into the source code 🙂
Do you have timing operators/lines handy to compare execution time on the GPU vs the CPU? I would then put this code into examples/gpu.

Thanks !

Never mind, it is really easy in Chapel; see what I got:

on GPU:
1.0 1.20906 2.19525 2.9034 3.5552 4.01125 4.21599 4.10448 3.62297 2.85669
1.60226 1.0
Execution time = 0.005723
on CPU:
1.0 1.20906 2.19525 2.9034 3.5552 4.01125 4.21599 4.10448 3.62297 2.85669
1.60226 1.0
Execution time = 0.000436068

Any thoughts?

In general, you can use the stopwatch functionality in Chapel for timing. Here's a primer to get you started with the basic functionality: stopwatches — Chapel Documentation 2.4

If you want to time the execution of GPU vs CPU in the Jacobi example, you would do something like:

use Time;

var t: stopwatch;
writeln("on GPU:");
t.start();
jacobi(here.gpus[0]);
t.stop();
writeln("Time taken by GPU: ", t.elapsed());
t.clear();
writeln("on CPU:");
t.start();
jacobi(here);
t.stop();
writeln("Time taken by CPU: ", t.elapsed());

If you want a real benchmark, you can increase the size of the array by increasing n.

Hi Igor,

Your timing results are expected when using small arrays. This is because the overhead of setting up a GPU kernel, the execution, and the teardown outweigh any performance benefit we get from using GPUs at that size.

But these overheads are close to constant: they do not increase as the size of the array increases. So if you increase the array size, that's when we start to see some performance benefits:

(note that I commented out the line that prints A in jacobi.chpl so I can focus on the timings)

❯ ./jacobi 
on GPU:
Time taken by GPU: 0.001866
on CPU:
Time taken by CPU: 0.000378

❯ ./jacobi --n=1_000_000
on GPU:
Time taken by GPU: 0.004782
on CPU:
Time taken by CPU: 0.020808

❯ ./jacobi --n=500_000_000
on GPU:
Time taken by GPU: 0.789809
on CPU:
Time taken by CPU: 10.6528

The test is good! See results for n=1000000:

on GPU:
Execution time = 0.062171
on CPU:
Execution time = 0.602016


Hi Igor —

Glad you were able to get up and running. When doing any timings in Chapel, be sure to compile with the --fast flag if you aren't already.
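For example, a compile line along these lines (file name and --n flag taken from the Jacobi test above):

```shell
# --fast disables runtime checks and enables backend optimizations;
# without it, timings are not representative.
chpl --fast jacobi.chpl -o jacobi
./jacobi --n=1_000_000
```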

-Brad
