A '-S' flag and separate compilation

damianmoz · November 17, 2022, 4:18am

Will the Chapel compiler ever be able to support a '-S' flag, i.e. pass a Chapel file with one or more individual routines to the compiler and have it output the assembler?

vass · November 17, 2022, 9:35pm

Hi Damian,

We do not have plans for -S at the moment. Is this something you can do with a workaround?

I am thinking of passing --savec to chpl and then manually compiling the generated C code. I would run 'make' by hand to see the compilation command, then modify that command.

Vass

dmk · November 17, 2022, 10:25pm

I can endorse Vass' solution. It has worked well for me in the past.

bradcray · November 18, 2022, 12:41am

Hi Damian —

I think that it's reasonable to expect that Chapel could and should
support a -S flag, though it has not historically been a priority.
Filing a feature request issue for it could be a first step to start to
change that, particularly if it's important / blocking.

The "and separate compilation" part is trickier since Chapel doesn't
support separate compilation at all today.

Note that the approach Vass and David are suggesting only works
when using the C back-end (CHPL_TARGET_COMPILER != llvm). If you were to
add --incremental to your command-line options, which causes each Chapel
module's '.c' file to be compiled separately, you could likely home in on
a specific section of code more easily.

-Brad

damianmoz · November 19, 2022, 4:26am

Hi Vass,

I have tried to figure out a workaround. Maybe I am just slow.

I compiled a small program

chpl --savec tmp s.chpl

and then I did

make -f tmp/Makefile

and the output was

rm -f s
mv tmp/s.tmp s

so not a lot of joy there.

I tried objdump on the executable s and, well, looking through 200,000 lines of assembler is beyond my skill level.

damianmoz · November 19, 2022, 5:20am

Brad, I built the compiler (the one with AsBits() working for param) with CHPL_LLVM=bundled.

I assumed I had the LLVM compiler but anyway, I set CHPL_TARGET_COMPILER=llvm.

chpl --savec tmp s.chpl
make -f tmp/Makefile

and it says

rm -f s
mv tmp/s.tmp s

which is not very helpful.

Having -S is not a high priority but not having it is not productive.

These days my development cycle goes like

develop/debug code in Chapel code, say `t.chpl`
find crucial routines from `t.chpl`
manually recode them as `t-crucial-bits.c`
repeat
   gcc11 -O3 -S -fno-math-errno -mfma t-crucial-bits.c
   ... review assembler and tweak C code
   clang14 -O3 -S  -ffp-exception-behaviour -frounding-math t-crucial-bits.c
   ... review assembler and tweak C code
until minimalist assembler achieved
refactor t-crucial-bits.c back into the Chapel code

Not an overly satisfactory cycle.

I am a bit of a fan of McIlroy's negative coding approach, i.e. I want the smallest number of lines of assembler generated for a given amount of Chapel (or its C translation). The program which produces the smallest number of lines of assembler also generally tends to be the fastest. It is often the simplest to read and understand, if only from the perspective that, within a single routine, my eyes start to water beyond 50 lines of assembler and my brain shuts down at slightly over 100 lines of assembler.
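As a rough way to put a number on "lines of assembler", one could count only the lines in a `.s` file that look like instructions. The sketch below is a heuristic, not a feature of any compiler: the helper name `count_asm_instructions` is hypothetical, and it assumes GNU-style assembler output (as gcc and clang emit with -S on Linux), where directives and local labels start with `.` and comments start with `#`.

```shell
# count_asm_instructions FILE
# Hypothetical helper: print the number of lines in FILE that look like
# instructions. Lines that are skipped:
#   - blank lines
#   - directives and local labels (first non-blank character is '.')
#   - whole-line comments (first non-blank character is '#')
#   - plain labels such as "foo:" (optionally followed by a '#' comment)
# Assumes GNU assembler syntax; other syntaxes will be miscounted.
count_asm_instructions() {
  grep -Evc '^[[:space:]]*($|\.|#|[A-Za-z_$][A-Za-z0-9_.$]*:[[:space:]]*(#.*)?$)' "$1" || true
}
```

For example, `count_asm_instructions t-crucial-bits.s` after each `-S` run gives a single number to compare between tweaks. It counts the whole file, so it is most useful when the file holds only the routines of interest.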

It would make me lots more productive if I could just do

develop/debug code in Chapel code, say `t.chpl`
find crucial routines from `t.chpl`
cut and paste them into `t-crucial-bits.chpl` (probably only a single module)
create a driver to exercise the **proc**s in t-crucial-bits.chpl
repeat
   chpl -S -o t.s --ieee-float --fast t-crucial-bits.chpl
   view t.s
until minimalist assembler is achieved
cut and paste t-crucial-bits.chpl back into t.chpl

Looks a lot easier to me.

It would be good to have such a feature sometime.

damianmoz · November 19, 2022, 5:24am

David, thanks for your input. Once I get a reply from @vass or @bradcray which might point to the error of my ways (or thinking), I should be able to reply more intelligently to your post.

mppf · November 22, 2022, 5:42pm

I'd just like to note that we already have an issue about this: Assembler Output of individual code chunks - Desirable / Low Priority · Issue #15043 · chapel-lang/chapel · GitHub

I've known how to do this for a while and, Damian, it sounds like it has been a thorn in your side, so I went ahead and drafted something to do it. Please see Add support for displaying resulting assembly by mppf · Pull Request #21076 · chapel-lang/chapel · GitHub. I'd appreciate help in testing this PR (I prototyped it quickly but have not gotten to testing it more).

Here is an example

// bb.chpl
config const n = 10_000;
proc foo() {
  var result = 0.0;
  for i in 1..n {
    result += sqrt(i*i);
  }
  return result;
}

proc bar() {
  var total = 0.0;
  for i in 1..10 {
    total += foo();
  }
  return total;
}

writeln(foo());
writeln(bar());

$ chpl bb.chpl --fast  --llvm-print-ir foo,bar --llvm-print-ir-stage asm 
...
shows assembly for foo and bar
...

If you aren't able to try out that PR, you could run the commands manually:

$ chpl bb.chpl --fast  --llvm-print-ir foo,bar --llvm-print-ir-stage full --savec=tmp

... LLVM IR representation output you can ignore...

$ objdump --disassemble=foo_chpl tmp/chpl__module.o

... assembly output ...

Note that using --llvm-print-ir might be necessary to disable inlining for that symbol (otherwise objdump might not find it). Additionally, if the function isn't called, the Chapel compiler currently won't compile it, so you'll need the calls to it in your test program.
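The manual route could be wrapped in a small shell helper so that several procs can be disassembled in one go. This is only a sketch: the function names `chpl_symbol` and `disassemble_procs` are hypothetical, and it assumes (based solely on the `foo_chpl` example here) that a Chapel proc named `foo` shows up in the object file as `foo_chpl`. Procs that are overloaded, generic, or nested may well get different symbol names.

```shell
# Assumption taken from the example in this post: proc `foo` is emitted as
# the symbol `foo_chpl`. This is not a documented guarantee.
chpl_symbol() {
  printf '%s_chpl\n' "$1"
}

# disassemble_procs OBJ NAME...
# Hypothetical helper: disassemble each named proc from the object file OBJ
# using objdump's --disassemble=SYMBOL option.
disassemble_procs() {
  obj=$1
  shift
  for name in "$@"; do
    objdump --disassemble="$(chpl_symbol "$name")" "$obj"
  done
}

# Usage, after compiling with --savec=tmp as shown in this post:
#   disassemble_procs tmp/chpl__module.o foo bar
```

If objdump prints nothing for a symbol, running `nm` on the object file and searching for the proc name is one way to see what it was actually called.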

damianmoz · November 22, 2022, 7:33pm

Thanks for this, especially for the fast turnaround. I will see what I need to do. I assume this means rebuilding the compiler? I am a total novice when it comes to a PR.

mppf · November 22, 2022, 8:26pm

Yes, you would check out the branch from the PR (or apply the changes manually if that is easier for you). The branch is here: GitHub - mppf/chapel at resolve-15043. You would indeed need to rebuild the compiler.

bradcray · January 4, 2023, 11:56pm

Hi @damianmoz —

I'm catching up on mail and was curious whether you were ever able to give this a try. Now that 1.29.0 is out, the capability that Michael implemented is available there in its current form.

-Brad

damianmoz · January 5, 2023, 1:17am

Sorry. I got distracted into helping others in the team with Chapel and loads of paperwork, the latter being my day-job (sadly). Hopefully next week.

bradcray · January 5, 2023, 1:23am

No worries, just curious!

-Brad

damianmoz · January 5, 2023, 8:47am

Tested. It works. It looks quite useful as a comparison too.

Thanks heaps.
