Merge pull request #16607 from gbtitus/task-count-race
Fix a race in counting running tasks.
(Reviewed by, and discussed extensively with, @ronawho.)
In the runtime’s pre-user-code hook we force the running task counts to
the proper values after module initialization is done, because (for
reasons we need not go into here) that phase can leave the counts
incorrect but we need them to be correct when we embark on the user
program. We also set up memory tracking in that hook, because we don’t
want to track memory during module init but we do want to track it (if
doing so is requested) in the user code. But the memory tracking setup
races with the running task count reset, because non-0 nodes execute
on-stmts back to node 0 to read the associated config const values.
Here, add a barrier between the running task count reset and the
memory tracking setup to resolve this race.
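As a rough illustration of the ordering the barrier enforces, here is a
toy Python model. It is not the runtime code: nodes are simulated as
threads, and every name in it (Node, remote_read_on_node0,
pre_user_code_hook, the memTrack key) is hypothetical. It also assumes
that an on-stmt briefly bumps node 0's running task count while it
executes there, which is one plausible way the reset could be disturbed.

```python
import threading

NUM_NODES = 4  # hypothetical node count for this simulation


class Node:
    """Toy stand-in for one node's running task counter."""

    def __init__(self):
        self.running_tasks = 3  # stale value left over from module init
        self.lock = threading.Lock()


def simulate(num_nodes=NUM_NODES):
    nodes = [Node() for _ in range(num_nodes)]
    config = {"memTrack": True}  # hypothetical config const, lives on node 0
    barrier = threading.Barrier(num_nodes)
    seen = [None] * num_nodes

    def remote_read_on_node0():
        # Models an on-stmt: a task runs on node 0 for the duration of
        # the read, so node 0's count goes up and then back down.
        node0 = nodes[0]
        with node0.lock:
            node0.running_tasks += 1
        value = config["memTrack"]
        with node0.lock:
            node0.running_tasks -= 1
        return value

    def pre_user_code_hook(node_id):
        me = nodes[node_id]
        # Step 1: force the running task count to its proper value.
        with me.lock:
            me.running_tasks = 1
        # Step 2: wait until every node has finished its reset, so no
        # on-stmt can land on node 0 while node 0 is still resetting.
        barrier.wait()
        # Step 3: memory tracking setup reads the config value.
        if node_id == 0:
            seen[node_id] = config["memTrack"]
        else:
            seen[node_id] = remote_read_on_node0()

    threads = [
        threading.Thread(target=pre_user_code_hook, args=(i,))
        for i in range(num_nodes)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [n.running_tasks for n in nodes], seen


counts, seen = simulate()
```

In this model, dropping the barrier.wait() call would let node 0's reset
land between a remote task's increment and decrement, so node 0 could
end up with a count other than 1.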
While here, improve the commentary that describes why the pre-user-code
hook is the way it is.
This should fix the problem described in https://github.com/Cray/chapel-private/issues/1395.
We expect to do some follow-up work to clean up this area of the
cooperative module+runtime startup code, both with respect to counting
running tasks and with respect to memory tracking. That should let us
remove the inter-node barrier added here. But in the meantime, this
change fixes the miscounting of running tasks.