17825, "PHHargrove", "ra test crashes with CHPL_LLVM=system on macOS/aarch64", "2021-05-26T08:29:18Z"
I am working from the master branch at 7e877bf8f9.
This is on an Apple M1 (aarch64) Mac Mini running macOS 11.4 (Build version 20F71) and Xcode 12.5 (Build version 12E262), from which clang reports Apple clang version 12.0.5 (clang-1205.0.22.9).
My environment has been set up with PATH=$PATH:/opt/homebrew/opt/llvm@11/bin to support CHPL_LLVM=system via Homebrew's llvm@11.
Running configure reports the following, reflecting my manual environment settings for CHPL_{COMM,COMM_SUBSTRATE,TASKS,LLVM}:
Currently selected Chapel configuration:
CHPL_TARGET_PLATFORM: darwin
CHPL_TARGET_COMPILER: clang
CHPL_TARGET_ARCH: arm64
CHPL_TARGET_CPU: unknown
CHPL_LOCALE_MODEL: flat
CHPL_COMM: gasnet *
CHPL_COMM_SUBSTRATE: smp *
CHPL_GASNET_SEGMENT: fast
CHPL_TASKS: fifo *
CHPL_LAUNCHER: smp
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_ATOMICS: cstdlib
CHPL_NETWORK_ATOMICS: none
CHPL_GMP: bundled
CHPL_HWLOC: none
CHPL_RE2: bundled
CHPL_LLVM: system *
CHPL_AUX_FILESYS: none
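A minimal sketch of the environment implied by the settings marked with `*` above, assuming a POSIX shell. The values are taken from the configuration output and the PATH setting quoted earlier. The exact commands used to set them are an assumption.

```shell
# Values taken from the configuration output above (entries marked with *).
# How they were actually set is an assumption; plain exports are shown here.
export PATH="$PATH:/opt/homebrew/opt/llvm@11/bin"  # Homebrew's llvm@11, needed for CHPL_LLVM=system
export CHPL_COMM=gasnet
export CHPL_COMM_SUBSTRATE=smp
export CHPL_TASKS=fifo
export CHPL_LLVM=system

# Print the manually set variables to confirm what a build would see.
for var in CHPL_COMM CHPL_COMM_SUBSTRATE CHPL_TASKS CHPL_LLVM; do
  eval "echo \"$var=\$$var\""
done
```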
With this build I get a SEGV when running the ra and ra-atomics tests.
I am running with CHPL_CORES_PER_LOCALE=1 at runtime (no clue if this could be related or not).
When I add CHPL_COMM_DEBUG=1 to my environment and start over, I get the following when running those tests:
Assertion failed: (raddr != 0), function make_entry, file chpl-cache.c, line 2174.
I've also tried the same system with CHPL_COMM_SUBSTRATE=udp and/or CHPL_LLVM=none (all 4 combinations).
The failures do not occur for either CHPL_LLVM=none case, but do occur for both CHPL_LLVM=system cases.
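The four combinations and their reported outcomes can be enumerated as below. This is only a summary sketch: `reported_result` is a hypothetical helper that restates the results described in this report, and it does not build or run anything.

```shell
# Hypothetical helper: restates the outcome reported in this issue for a
# given CHPL_LLVM setting. The substrate did not change the result here.
reported_result() {
  case "$1" in
    system) echo "fails" ;;
    none)   echo "passes" ;;
    *)      echo "not tested" ;;
  esac
}

# Enumerate all 4 combinations of substrate and LLVM setting.
for substrate in smp udp; do
  for llvm in system none; do
    echo "CHPL_COMM_SUBSTRATE=$substrate CHPL_LLVM=$llvm: $(reported_result "$llvm")"
  done
done
```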
I've run tests on a similarly configured x86_64 system (older macOS 11.3, but same Apple clang version and also Homebrew's llvm@11), where I am not setting CHPL_TASKS=fifo. Here I do NOT see the errors. So, it appears likely to be specific to LLVM-based code generation on aarch64. I do not currently have any Linux/aarch64 testing of Chapel (let alone with CHPL_LLVM=system).
The following are backtraces for the failing CHPL_COMM_DEBUG=1 builds of ra and ra-atomics, respectively:
[0] (lldb) bt all
[0] * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
[0] * frame #0: 0x00000001977eab94 libsystem_kernel.dylib`__ulock_wait + 8
[0] frame #1: 0x00000001978258ec libsystem_pthread.dylib`_pthread_join + 456
[0] frame #2: 0x0000000100bbe2dc ra_real`chpl_task_callMain(chpl_main=(ra_real`chpl_executable_init at chpl-init.c:300)) at tasks-fifo.c:454:8
[0] frame #3: 0x0000000100ba77b8 ra_real`main(argc=132, argv=0x000000016f3aeb78) at main.c:33:3
[0] frame #4: 0x0000000197841450 libdyld.dylib`start + 4
[0] thread #2
[0] frame #0: 0x00000001977e8edc libsystem_kernel.dylib`swtch_pri + 8
[0] frame #1: 0x000000019782068c libsystem_pthread.dylib`cthread_yield + 20
[0] frame #2: 0x0000000100bc0690 ra_real`chpl_thread_yield at threads-pthreads.c:317:3
[0] frame #3: 0x0000000100bbf23c ra_real`chpl_task_yield at tasks-fifo.c:802:3
[0] frame #4: 0x0000000100bcc0e8 ra_real`polling(x=0x0000000000000000) at comm-gasnet.c:752:5
[0] frame #5: 0x0000000100bbe5c8 ra_real`comm_task_wrapper(arg=0x0000000000000000) at tasks-fifo.c:532:3
[0] frame #6: 0x0000000197823878 libsystem_pthread.dylib`_pthread_start + 320
[0] thread #3
[0] (lldb) bt all
[0] * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
[0] * frame #0: 0x00000001977eab94 libsystem_kernel.dylib`__ulock_wait + 8
[0] frame #1: 0x00000001978258ec libsystem_pthread.dylib`_pthread_join + 456
[0] frame #2: 0x000000010020c05c ra-atomics_real`chpl_task_callMain(chpl_main=(ra-atomics_real`chpl_executable_init at chpl-init.c:300)) at tasks-fifo.c:454:8
[0] frame #3: 0x00000001001f5538 ra-atomics_real`main(argc=133, argv=0x000000016fd1eb50) at main.c:33:3
[0] frame #4: 0x0000000197841450 libdyld.dylib`start + 4
[0] thread #2
[0] frame #0: 0x00000001977f31ec libsystem_kernel.dylib`__select + 8
[0] frame #1: 0x00000001003a6ed0 ra-atomics_real`::myselect(n=4, readfds=0x0000000101603974, writefds=0x0000000000000000, exceptfds=0x0000000000000000, timeout=0x0000000101603960) at sockutil.cpp:589:16
[0] frame #2: 0x00000001003a6dd8 ra-atomics_real`inputWaiting(s=3, dothrow=false) at sockutil.cpp:435:16
[0] frame #3: 0x00000001003a2bb8 ra-atomics_real`::AMUDP_SPMDHandleControlTraffic(controlMessagesServiced=0x0000000000000000) at amudp_spmd.cpp:1251:5
[0] frame #4: 0x0000000100396058 ra-atomics_real`::AM_Poll(eb=0x000000014e8040b0) at amudp_reqrep.cpp:882:18
[0] frame #5: 0x0000000100245dfc ra-atomics_real`gasnetc_AMPoll(_gasneti_threadinfo_farg=0x000000014e904400) at gasnet_core.c:619:5
[0] frame #6: 0x0000000100212bb4 ra-atomics_real`_gasneti_AMPoll(_gasneti_threadinfo_farg=0x000000014e904400) at gasnet_help.h:1290:18
[0] frame #7: 0x00000001002121e0 ra-atomics_real`_gasnet_AMPoll(_gasneti_threadinfo_farg=0x000000014e904400) at gasnet_help.h:1423:12
[0] frame #8: 0x0000000100219e90 ra-atomics_real`am_poll_try at comm-gasnet.c:743:12
[0] frame #9: 0x0000000100219e64 ra-atomics_real`polling(x=0x0000000000000000) at comm-gasnet.c:751:5
[0] frame #10: 0x000000010020c348 ra-atomics_real`comm_task_wrapper(arg=0x0000000000000000) at tasks-fifo.c:532:3
[0] frame #11: 0x0000000197823878 libsystem_pthread.dylib`_pthread_start + 320
[0] thread #3
[0] frame #0: 0x00000001977e8edc libsystem_kernel.dylib`swtch_pri + 8
[0] frame #1: 0x000000019782068c libsystem_pthread.dylib`cthread_yield + 20
Neither looks terribly informative to me, as the asserting thread's info appears to be missing.