New Issue: begin? for optional tasks

16387, “mppf”, “begin? for optional tasks”, “2020-09-15T13:16:56Z”

Split-off from #13518.

Currently, begin, cobegin, and coforall always create qthreads-style tasks. However, for recursive algorithms, we would like it be easy to create tasks only when there aren’t already a lot of tasks going.

Some parallel languages have a begin like construct is inherently optional - the implementation doesn’t create a task if there is already enough parallel work. For example Cilk tasks work that way. This even has an impact on compiler analysis and optimization for the parallel tasks. Another term for this is to say that these tasks are serializable.

The begin constructs in Chapel however insist on creating the tasks because these tasks could do things like wait on each other with sync variables. This makes it awkward to write efficient recursive algorithms using these constructs (see for example a specific case in #13518). In contrast, using forall is pretty easy even in recursive algorithms because the built-in iterators supporting forall limit the number of tasks created based upon here.runningTasks() (typically done in _computeNumChunks and controlled by dataParTasksPerLocale / dataParIgnoreRunningTasks). But, using forall for algorithms like merge sort or quick sort that have 2 recursive problems is awkward and strange as well.

Additionally, Chapel has a serial blocks - serial { } - which serializes all tasks launched within the block. This presents a problem for libraries. Consider a library that - somewhere deep inside a call - needs to create a task that can block in order to function correctly. For example, it might need to create some kind of helper task like a progress thread. If the program calling the library makes the call within a serial block, then the library will simply not function. (It will probably deadlock). In a way, use of the serial block only makes sense if all tasks called in its body (including in called functions) are serializable. But, that property is very hard to see if the serial block contains function calls to libraries. Would documentation for all library functions need to include a note about whether or not all tasks the function creates are serializable?

This issue is proposing that users be able to opt into creating optional tasks. Similar to how forall works in practice, the implementation would automatically throttle the number of tasks. The serial block would then be adjusted to only affect these optional tasks. In other words, code creating non-optional tasks within a serial block would still create those tasks. (Note that this change to how serial works would be a breaking change).

One potential alternative would be for compiler analysis to identify begin / cobegin / coforall bodies that are serializable. However even so it would probably be useful to have a way that users can indicate optional tasks. Additionally, the interaction with the serial block suggests different behavior for optional and non-optional tasks. Having the serial block behave differently based upon analysis of contained task bodies does not seem like a good idea.

It would also be possible to migrate the task-throttling that happens currently in iterators supporting forall into using these optional tasks.

For specific syntax, this issue proposes that begin?, cobegin?, and coforall? create optional tasks; while begin, cobegin, and coforall would not. The ? is by analogy to nilability support - when it is applied to a class type it means an optional class type. begin?, cobegin?, and coforall? would be parsed as a single token.

For example (from #13518) with merge sort, the code could include a recursive call like this, and the tasks created would automatically be serialized by the implementation once there is enough parallelism.

  proc _MergeSort(...) {
    cobegin? {
      { _MergeSort(...); }
      { _MergeSort(...); }