Chapel is using MPI!

Hello Chapel team,

I am trying to run the Chapel hello world example, but it seems to be using MPI even though I haven't set up any MPI environment. Would you please help?

My batch file

#! /bin/bash -l

#SBATCH --partition=standard
#SBATCH --account=accc
#SBATCH --nodes=8
#SBATCH --cpus-per-task=8
#SBATCH --time=7:00:00
#SBATCH --comment="image=registry.maze.science.gc.ca/ssc-hpcs/generic-job:ubuntu22.04"
#SBATCH -o stdout.txt
#SBATCH --verbose

export CHPL_LAUNCHER_WALLTIME=03:00:00
export CHPL_LAUNCHER_PARTITION=standard
export CHPL_LAUNCHER_ACCOUNT=accc
/home/maa004/hello2 -nl 8

What I get is:

warning: refusing to reload ordenv; ORDENV_SETUP already set

WARNING: Open MPI accepted a TCP connection from what appears to be a
another Open MPI process but cannot find a corresponding process
entry for that peer.

This attempted connection will be ignored; your MPI job may or may not
continue properly.

Local host: ib13be-087
PID: 111

[ib13be-065.science.gc.ca:00120] 3 more processes have sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
[ib13be-065.science.gc.ca:00120] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Please post the output of "$CHPL_HOME/util/printchplenv --anonymize --all".

John

@jhh67 Hi John, here is the output

CHPL_HOST_PLATFORM: linux64
CHPL_HOST_COMPILER: gnu
CHPL_HOST_CC: gcc
CHPL_HOST_CXX: g++
CHPL_HOST_ARCH: x86_64
CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: gnu
CHPL_TARGET_CC: gcc
CHPL_TARGET_CXX: g++
CHPL_TARGET_LD: mpicxx
CHPL_TARGET_ARCH: x86_64
CHPL_TARGET_CPU: unknown
CHPL_LOCALE_MODEL: flat
CHPL_COMM: gasnet +
CHPL_COMM_SUBSTRATE: ibv +
CHPL_GASNET_SEGMENT: large
CHPL_TASKS: qthreads
CHPL_LAUNCHER: slurm-srun +
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_HOST_MEM: jemalloc
CHPL_MEM: jemalloc
CHPL_ATOMICS: cstdlib
CHPL_NETWORK_ATOMICS: none
CHPL_GMP: bundled
CHPL_HWLOC: bundled
CHPL_RE2: bundled
CHPL_LLVM: none +
CHPL_LLVM_SUPPORT: bundled
CHPL_LLVM_CONFIG: /space/partner/nrcan/geobase/work/opt/chapel-1.31.0/third-party/llvm/install/support-only-linux64-x86_64/bin/llvm-config
CHPL_LLVM_VERSION: 15
CHPL_AUX_FILESYS: none
CHPL_LIB_PIC: none
CHPL_SANITIZE: none
CHPL_SANITIZE_EXE: none

I believe the problem is that CHPL_LAUNCHER should be slurm-gasnet_ibv, not slurm-srun, since you are using gasnet as your communication layer. It looks like that variable was set by a configuration file (note the '+' after the value). Try setting CHPL_LAUNCHER=slurm-gasnet_ibv in your environment. If that works, update the value in the configuration file. See "Chapel Launchers" in the Chapel 1.31 documentation for more information on Chapel launchers, and "Setting up Your Environment for Chapel" in the same documentation for more information on configuration files.
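A minimal sketch of that suggestion, assuming a bash-compatible shell. The printchplenv step is guarded so it only runs when CHPL_HOME is set and points at a Chapel tree:

```shell
# Override the launcher for the current shell session only.
export CHPL_LAUNCHER=slurm-gasnet_ibv

# Confirm the variable is actually present in the environment.
printenv CHPL_LAUNCHER

# If CHPL_HOME points at a Chapel tree, show what Chapel itself reports.
if [ -n "${CHPL_HOME:-}" ] && [ -x "$CHPL_HOME/util/printchplenv" ]; then
  "$CHPL_HOME/util/printchplenv" --all | grep CHPL_LAUNCHER
fi
```

An export like this lasts only for the shell it is run in, so it would need to be repeated in each new session or batch script until the configuration file is updated.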

Thank you so much @jhh67. I set that environment variable every time, but when I run $CHPL_HOME/util/printchplenv --anonymize --all it still shows slurm-srun. These are the steps I follow to install Chapel; I appreciate any insights:

There is a folder where you have the Chapel files extracted

Run these env commands

export CHPL_LLVM=none
export CHPL_COMM=gasnet
export CHPL_COMM_SUBSTRATE=ibv
export CHPL_LAUNCHER=slurm-gasnet_ibv

cd to the directory where you have extracted Chapel

run the following commands

./configure
make
make install

Share the address where you have installed Chapel (extracted) --> it is basically CHPL_HOME

The chpl command will be automatically added to /usr/local/bin/

set the CHPL_HOME

export CHPL_HOME=

try to compile

chpl -o hello $CHPL_HOME/examples/hello6-taskpar-dist.chpl

After that when I run it, it keeps telling me about MPI

Does it mean that something is missing on the cluster?

chpl -o hello /space/partner/nrcan/geobase/work/opt/chapel-1.31.0/examples/hello6-taskpar-dist.chpl

/usr/bin/ld: cannot find /space/partner/nrcan/geobase/work/opt/chapel-1.31.0/lib/linux64/gnu/x86_64/loc-flat/comm-gasnet/ibv/large/tasks-qthreads/launch-slurm-gasnet_ibv/tmr-generic/unwind-none/mem-jemalloc/atomics-cstdlib/lib_pic-none/san-none/main_launcher.o: No such file or directory
/usr/bin/ld: cannot find -lchpllaunch: No such file or directory
collect2: error: ld returned 1 exit status
gmake[1]: *** [/space/partner/nrcan/geobase/work/opt/chapel-1.31.0/runtime/etc/Makefile.launcher:52: all] Error 1
gmake: *** [/space/partner/nrcan/geobase/work/opt/chapel-1.31.0/runtime/etc/Makefile.exe:44: /tmp/chpl-maa004.deleteme-25NKW6/hello.tmp] Error 2
error: compiling generated source

If you set the CHPL_LAUNCHER environment variable it should override the default setting and any configuration file. The output of printchplenv will show a '*' after the value to indicate that it was taken from the environment. You can use printenv | grep CHPL_LAUNCHER to make sure it is set correctly in your environment.
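To make the marker convention concrete, here is a small sketch that classifies one line of printchplenv output by its trailing marker. The function name setting_source is hypothetical. The '*' meaning comes from the explanation above, the '+' meaning follows the earlier reading in this thread that a '+' value came from a configuration file, and treating an unmarked line as a default is an assumption:

```shell
# setting_source LINE
# Report where a printchplenv value came from, based on its trailing marker.
setting_source() {
  case "$1" in
    *' *') echo "environment" ;;   # '*' : taken from the environment
    *' +') echo "config file" ;;   # '+' : taken from a configuration file
    *)     echo "default" ;;       # no marker: assumed to be a default
  esac
}

setting_source "CHPL_LAUNCHER: slurm-srun +"
```

For the line from the first printchplenv output in this thread, this reports a configuration file as the source, which matches the diagnosis given earlier.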

Your instructions seem reasonable, although instead of setting CHPL_HOME directly I suggest sourcing util/setchplenv.<shell> which will set CHPL_HOME and set your PATH and other environment variables correctly.

I'm not sure what went wrong with the build and why main_launcher.o is not found. If you change a configuration variable such as CHPL_LAUNCHER you have to rebuild Chapel before compiling a program, although it should give you a reasonable error message in that case. I suggest running make clobber followed by make.
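One way to check whether a rebuild is needed is to look for the object file the linker complained about. The sketch below assumes the runtime lives under $CHPL_HOME/lib in a directory named launch-<launcher>, as in the path from the error message; the function name launcher_runtime_built is hypothetical:

```shell
# launcher_runtime_built ROOT LAUNCHER
# Succeeds if ROOT/lib contains a main_launcher.o inside a launch-LAUNCHER directory.
launcher_runtime_built() {
  root=$1
  launcher=$2
  [ -n "$(find "$root/lib" -path "*/launch-$launcher/*" -name main_launcher.o 2>/dev/null)" ]
}

# Only check when CHPL_HOME is set; suggest the rebuild if the file is missing.
if [ -n "${CHPL_HOME:-}" ] && ! launcher_runtime_built "$CHPL_HOME" slurm-gasnet_ibv; then
  echo "No slurm-gasnet_ibv launcher runtime under $CHPL_HOME/lib; try: make clobber && make"
fi
```

If the message is printed, the runtime was probably built with a different CHPL_LAUNCHER value than the one now in effect.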

Thank you so much John. Do you think it may be because gasnet is either not installed or cannot be found? I do change CHPL_LAUNCHER every time and it always goes back to slurm-srun, without any error.

No, printchplenv just prints the values of the configuration variables. In general, it does not catch mistakes such as gasnet not being installed. That would be caught later, during the build. We bundle gasnet with Chapel, so it should always be found. Please send the output from:

% printenv | grep CHPL
% $CHPL_HOME/util/printchplenv --anonymize --all

Thank you so much! I appreciate it!
$ printenv | grep CHPL
CHPL_HOME=/space/partner/nrcan/geobase/work/opt/chapel-1.31.0

$CHPL_HOME/util/printchplenv --anonymize --all
CHPL_HOST_PLATFORM: linux64
CHPL_HOST_COMPILER: gnu
CHPL_HOST_CC: gcc
CHPL_HOST_CXX: g++
CHPL_HOST_ARCH: x86_64
CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: gnu
CHPL_TARGET_CC: gcc
CHPL_TARGET_CXX: g++
CHPL_TARGET_LD: mpicxx
CHPL_TARGET_ARCH: x86_64
CHPL_TARGET_CPU: unknown
CHPL_LOCALE_MODEL: flat
CHPL_COMM: gasnet +
CHPL_COMM_SUBSTRATE: ibv +
CHPL_GASNET_SEGMENT: large
CHPL_TASKS: qthreads
CHPL_LAUNCHER: slurm-gasnet_ibv +
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_HOST_MEM: jemalloc
CHPL_MEM: jemalloc
CHPL_ATOMICS: cstdlib
CHPL_NETWORK_ATOMICS: none
CHPL_GMP: bundled
CHPL_HWLOC: bundled
CHPL_RE2: bundled
CHPL_LLVM: none +
CHPL_LLVM_SUPPORT: bundled
CHPL_LLVM_CONFIG: /space/partner/nrcan/geobase/work/opt/chapel-1.31.0/third-party/llvm/install/support-only-linux64-x86_64/bin/llvm-config
CHPL_LLVM_VERSION: 15
CHPL_AUX_FILESYS: none
CHPL_LIB_PIC: none
CHPL_SANITIZE: none
CHPL_SANITIZE_EXE: none

Then I run:
chpl -o hello /space/partner/nrcan/geobase/work/opt/chapel-1.31.0/examples/hello6-taskpar-dist.chpl

and I get:

/usr/bin/ld: cannot find /space/partner/nrcan/geobase/work/opt/chapel-1.31.0/lib/linux64/gnu/x86_64/loc-flat/comm-gasnet/ibv/large/tasks-qthreads/launch-slurm-gasnet_ibv/tmr-generic/unwind-none/mem-jemalloc/atomics-cstdlib/lib_pic-none/san-none/main_launcher.o: No such file or directory
/usr/bin/ld: cannot find -lchpllaunch: No such file or directory
collect2: error: ld returned 1 exit status
gmake[1]: *** [/space/partner/nrcan/geobase/work/opt/chapel-1.31.0/runtime/etc/Makefile.launcher:52: all] Error 1
gmake: *** [/space/partner/nrcan/geobase/work/opt/chapel-1.31.0/runtime/etc/Makefile.exe:44: /tmp/chpl-maa004.deleteme-ZHB7nQ/hello.tmp] Error 2
error: compiling generated source

Let's switch over to email, as this thread is getting a bit long: john.hartman@hpe.com. Please email me the output from running chpl --print-commands -o hello /space/partner/nrcan/geobase/work/opt/chapel-1.31.0/examples/hello6-taskpar-dist.chpl

Also ls /space/partner/nrcan/geobase/work/opt/chapel-1.31.0/lib/linux64/gnu/x86_64/loc-flat/comm-gasnet/ibv/large/tasks-qthreads/launch-slurm-gasnet_ibv/tmr-generic/unwind-none/mem-jemalloc/atomics-cstdlib/lib_pic-none/san-none

John

Hi Marjan and John —

Catching up after vacation, I wanted to see whether you were able to resolve this offline.

Thanks,
-Brad