Is the use of proc main() { ... } recommended?

Hello! I wonder if there are recommendations, best practices, etc., regarding the use of proc main() { ... }. I almost never use it, and it is clear that I will need it if I want my program to return an exit status, but otherwise is there any other reason to use it?

Regards

Nelson

Hi Nelson,

I am not aware of any clear consensus (or even a discussion) about the stylistic choice as to whether main should be explicitly defined or not. So, I'll just share some of my thoughts and personal choices as food for thought.

  • More factual than stylistic: having a main function helps with multi-module Chapel files. Otherwise, the compiler can't know which module is the main entry point to the application. Note that you can circumvent that by passing the --main-module XYZ flag, where XYZ is the name of a module rather than a .chpl file.
  • In the same vein, modules with explicit main functions are more reusable than modules that have free-standing code. I might want to use/import that module in another project, perhaps just for some of its helpers. If you have code outside of main, that code will execute when the module is imported from another module, which is likely unintentional. So, if the module you are writing will have a long life cycle, having an explicit main can make the code more maintainable.
  • Stylistically: my personal rule of thumb for code that is beyond a prototype is to have an explicit main as soon as I need to define another function. To me, the following is bad style:
var x = 10;

proc foo() {
  writeln("In foo");
}

x = 20;
writeln(x);
foo();

While the snippet is rather small, it captures that things start to get more difficult to follow when you have other functions defined. Clearly, you can try to put all the helper functions like foo at the end or the beginning of the file, but that feels too loose to me. And it gets especially difficult if your module gets big enough.
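For comparison, here is a minimal sketch of the same snippet restructured around an explicit main. It assumes x does not need to be visible outside main, so it becomes a local variable:

```chapel
// Helper at module scope: only a declaration, so nothing executes
// merely because this module is used or imported elsewhere.
proc foo() {
  writeln("In foo");
}

// All executable code lives in main, in one place.
proc main() {
  var x = 10;  // now a local variable rather than a module-scope one
  x = 20;
  writeln(x);
  foo();
}
```

The executable statements now read top to bottom inside main, regardless of where foo is declared in the file.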

I'd be curious to hear what others think as well.

Engin

I almost never write code at module scope, mostly for maintainability and readability.

As soon as I need to write more than 20ish lines of code that may have multiple functions, it can be hard to track what's going on. Engin's snippet is a perfect example of this. I have the same style in other languages. For example, in Python, any time my scripts get bigger than what fits in my window at once, I end up writing a main function.

Whether it's in Chapel or Python, encapsulating my code means it's much more reusable. I can either import the file and use it (without the side effects of module-scope code) or have discrete functions I can copy/paste and reuse.
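A small sketch of the side effect in question, assuming two modules kept in one file for brevity (the names Lib, App, and helper are hypothetical):

```chapel
// Hypothetical library module. The writeln below is module-scope code,
// so it runs as part of Lib's initialization whenever Lib is used,
// whether or not the user wanted that output.
module Lib {
  writeln("Lib: module-scope code running");

  proc helper() {
    writeln("Lib: helper called");
  }
}

// Hypothetical application module. It is the only module here that
// defines main(), so the compiler can treat it as the main module.
module App {
  use Lib;

  proc main() {
    helper();  // the only thing App actually wanted from Lib
  }
}
```

If Lib's module-scope statement were moved into a main() of its own, using Lib from App should no longer print the extra line, since only the main module's main() is called.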

In Chapel, there can also be a slight difference in how variables behave, since module scope variables can have different behavior than local scope variables in multi-locale programs. In theory they should work the same, but I've found in practice they don't always. Rather than worrying about such things, I just encapsulate things.

Ultimately, I think it's a personal choice. From a technical perspective, I wouldn't recommend it one way or the other. From a maintenance/readability perspective, I personally always prefer a main function.

-Jade

Thanks, Engin and Jade: I am fully convinced that proc main() { ... } is the way to go.

Regards

Nelson

Hi Nelson / all —

I generally agree with what Engin and Jade have written. I almost always start without a main() but typically switch to using one once I have multiple files or too much module-scope code (by some definition).

I don't always introduce a main() as soon as I've defined procs or iters, but agree that mixing module-scope executable code and proc/iter declarations, as in Engin's short example above, is bad style. My approach in such cases is to keep all module-scope executable code together, and to define it before any subroutines. So I might write:

var x = 10;

x = 20;
writeln(x);
foo();

proc foo() {
  writeln("In foo");
}

Beyond the question of when/whether to use main(), I wanted to make a few technical notes on the messages above:

  • Nelson, you mentioned the desire to return an exit status as a motivation for using main(). Note that this can also be done (with or without a main() procedure) using the exit(myStatus); call. For a trivial example, see ATO. Even in programs that have a main() procedure, it can often be simpler to generate error codes using exit() rather than finding your way back to main() to return from there.

  • I wanted to push back on the notion that users should expect module-scope variables to behave equivalently to local ones. In part, this is because we sometimes implement module-scope variables differently: for example, module-scope consts are broadcast across locales at module setup, whereas local consts can't be implemented that way in general. More generally, the compiler can't know all the places where a module-scope variable may be accessed without doing whole-program analysis, whereas a local-scope variable's potential for escaping its scope is much simpler to analyze. So the choice of scope can also affect how optimizable the code is (with no single approach being best in all situations).

  • I definitely agree that main() becomes more useful for multi-file programs, but wanted to note that it's often possible to avoid the need to compile with --main-module by only naming one .chpl file on your chpl command-line and having the others be found by the compiler (by having their filenames match their module names and making sure they're in the module search path). In such cases, the single module named in the chpl command will be the main module. For example, given:

    A.chpl:

    use B;
    
    writeln("Initializing A");
    
    proc main() {
      writeln("In A's main");
    }
    

    B.chpl:

    use A;
    
    writeln("Initializing B");
    
    proc main() {
      writeln("In B's main");
    }
    

    either chpl A.chpl or chpl B.chpl will work and make that module the main module. Whether having multiple modules define main() like this is "cool and powerful" or "incredibly confusing" is a separate question that I don't want to weigh in on here. :) But passing just one .chpl file to chpl is generally my preferred approach in the presence of multi-module programs, to support shorter, more readable command lines.
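To illustrate the first note above about exit statuses, here is a minimal sketch showing both routes: returning an int from main() and calling exit() from a helper. The config const fail and the procedure check are hypothetical names chosen for illustration:

```chapel
// Hypothetical switch so both paths can be tried from the command line,
// e.g. by running the compiled program with --fail=true.
config const fail = false;

// A helper deep in the program can end execution with a status directly,
// without threading an error code back up to main().
proc check() {
  if fail {
    writeln("check failed, exiting");
    exit(2);
  }
}

// Declaring main() with an int return type makes its return value
// the program's exit status.
proc main(): int {
  check();
  writeln("all good");
  return 0;
}
```

With the default setting the program should finish through main() with status 0; with fail set to true it should stop inside check() with status 2.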

Again, none of these notes are intended to suggest that using main() is a bad idea or should be avoided.

-Brad

Hi Brad: thanks for the clarification.

The approach below,
"
I definitely agree that main() becomes more useful for multi-file programs, but wanted to note that it's often possible to avoid the need to compile with --main-module by only naming one .chpl file on your chpl command-line and having the others be found by the compiler (by having their filenames match their module names and making sure they're in the module search path). In such cases, the single module named in the chpl command will be the main module.
"
is the one I have effectively been using all the time.

Cheers,

Nelson
