28838, "jabraham17", "Investigating LLVM 22 performance regression: spectralnorm", "2026-05-15T22:36:52Z"
When upgrading our perf infrastructure from LLVM 20 to LLVM 22, we noticed a pretty significant performance regression in spectralnorm. I am going to use this issue to catalog what I have found while we look for the cause and a fix.
Perf graph in question: Chapel Performance Graphs
I followed steps very similar to those in Investigating LLVM 22 performance regression: nbody (chapel-lang/chapel issue #28837). I compiled with both LLVM 20 and LLVM 22 using:
chpl --fast test/studies/shootout/spectral-norm/bradc/spectralnorm-blc.chpl --savec gen-spec --mllvm --print-after=slp-vectorizer --mllvm --print-before=slp-vectorizer --mllvm --filter-print-funcs=wrapcoforall_fn_DefaultRectangular_line_329_chpl7 --mllvm --print-module-scope
With LLVM 20, the main loop body gets converted to vector code (2 wide). With LLVM 22, it does not.
I also took the LLVM IR emitted by the LLVM 20 build and ran it through opt from LLVM 20, LLVM 21, and LLVM 22. The regression appears between LLVM 21 and LLVM 22. See Compiler Explorer. For some reason, the slp-vectorizer in LLVM 22 fails to vectorize. In LLVM 20, with -pass-remarks, I get "remark: <unknown>:0:0: SLP vectorized with cost -10 and with tree size 17". With LLVM 22, nothing is reported as being vectorized.
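For anyone who wants to repeat the opt comparison locally instead of on Compiler Explorer, here is a rough sketch of how it could be scripted. The IR file name (spectralnorm-before-slp.ll) and the binary names (opt-20, opt-21, opt-22) are assumptions for illustration, not what I actually used; point them at wherever your IR dump and LLVM installs live. It runs only the SLP vectorizer and counts the "SLP vectorized" remarks each release emits.

```shell
#!/usr/bin/env bash
# Sketch: run only the SLP vectorizer from several LLVM releases on one IR file
# and report how many "SLP vectorized" remarks each release emits.

# Hypothetical input: the pre-SLP module saved from the LLVM 20 build.
IR_FILE="${IR_FILE:-spectralnorm-before-slp.ll}"

# Hypothetical binary names; adjust to match your LLVM installs.
OPT_BINARIES="${OPT_BINARIES:-opt-20 opt-21 opt-22}"

# Count success remarks in opt's diagnostic output, read from stdin.
# grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
count_slp_remarks() {
  grep -c 'SLP vectorized' || true
}

# Run one opt binary on the IR file and print its remark count.
check_one() {
  local opt_bin="$1" ir="$2"
  local n
  # Remarks are printed on stderr, so merge stderr into the pipe.
  n=$("$opt_bin" -passes=slp-vectorizer \
        -pass-remarks=slp-vectorizer \
        -disable-output "$ir" 2>&1 | count_slp_remarks)
  echo "$opt_bin: $n 'SLP vectorized' remark(s)"
}

if [ -f "$IR_FILE" ]; then
  for opt_bin in $OPT_BINARIES; do
    if command -v "$opt_bin" >/dev/null 2>&1; then
      check_one "$opt_bin" "$IR_FILE"
    else
      echo "$opt_bin: not found, skipping"
    fi
  done
else
  echo "IR file '$IR_FILE' not found; nothing to check"
fi
```

A count of 0 from one release and a non-zero count from another on the same input would match what I describe above. Adding -pass-remarks-missed=slp-vectorizer to the opt invocation may also show why the newer release declined to vectorize.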