New Issue: Optimization: Start remote gets early

16670, “bradcray”, “Optimization: Start remote gets early”, “2020-11-05T22:30:38Z”

This captures an optimization idea that we discussed in the performance meeting today while looking at Greg’s recent improvements for remote puts and thinking about what it would take to get a similar performance jump for remote gets. @ronawho pointed out that gets are different because you tend to need their data before you can go on. This made me wonder whether we could do a compiler-driven optimization in which remote gets are made non-blocking and hoisted earlier in the source until they can’t be moved any further due to:

  • hitting a write of the same variable or, more likely, one of:
  • hitting a write that we have to conservatively assume is the same variable, or
  • hitting a memory consistency event that prevents moving it back further.
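
The stopping conditions above can be sketched as a backward scan over a straight-line block of statements. This is only a toy illustration, not the Chapel compiler's analysis: the `Stmt` record, the `StmtKind` enum, and `earliestIssuePoint` are all hypothetical names, and a real implementation would work on the compiler's IR with proper alias and def-use information.

```chapel
// Hypothetical toy IR: each statement is a store, a memory consistency
// event (fence), or something that does not touch memory we care about.
enum StmtKind { store, fence, other }

record Stmt {
  var kind: StmtKind = StmtKind.other;
  var target: string = "";     // variable written when kind == store
  var mayAlias: bool = false;  // store that must be assumed to hit anything
}

// Scan backward from the get's original position `getIdx` and return the
// earliest index at which a non-blocking get of `v` could be issued.
proc earliestIssuePoint(stmts: [] Stmt, getIdx: int, v: string): int {
  var pos = getIdx;
  while pos > stmts.domain.low {
    const prev = stmts[pos-1];
    if prev.kind == StmtKind.fence then
      break;                                   // consistency event
    if prev.kind == StmtKind.store &&
       (prev.target == v || prev.mayAlias) then
      break;                                   // write to v, or a possible alias
    pos -= 1;
  }
  return pos;
}

// The example from this issue: A = ...; writeln(...); <get of A[i]>
var blk = [new Stmt(StmtKind.store, "A"),
           new Stmt(StmtKind.other),
           new Stmt(StmtKind.other)];   // last entry marks where the get sits
const lo = blk.domain.low;
writeln(earliestIssuePoint(blk, lo + 2, "A") - lo);
```

In this toy block the scan moves past the `writeln` and stops at the store to `A`, so the get could be issued immediately after that store. Marking a store with `mayAlias = true` models the conservative case from the second bullet.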

As a really trivial / uninspired example, in code like the following:

```chapel
var A = newBlockArr({1..n}, real);

A = ...;

writeln("I'm about to print A[i]");

writeln(A[i]);
```

you could imagine the get corresponding to A[i] in the final line being replaced with a wait, with a non-blocking get moved back to the point right after A was assigned. That is, rather than doing the following pseudo-code:

```chapel
writeln("I'm about to print A[i]");
var tmp = get(A[i]);
writeln(tmp);
```

we would do:

```chapel
A = ...;
var (tmp, handle) = get_nb(A[i]);

writeln("I'm about to print A[i]");

wait(handle);
writeln(tmp);
```
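
For anyone who wants to experiment with the shape of the transformed code by hand, here is a minimal sketch that emulates it using features the language already has. It makes several assumptions: a `begin` task plus a `sync` variable stand in for the hypothetical `get_nb()` / `wait()` pair, and a plain local array stands in for the block-distributed one, so this shows the control flow only and not the communication behavior or cost of a real non-blocking get.

```chapel
config const n = 8;
const i = n;

// Stand-in for the distributed array; imagine A[i] living on a remote locale.
var A: [1..n] real;
A = 1.5;

var tmp: real;
var handle: sync bool;       // starts empty; filled when the "get" completes

// Emulated get_nb(): start fetching A[i] right after A's last write.
begin with (ref tmp) {
  tmp = A[i];
  handle.writeEF(true);      // signal that tmp now holds the value
}

writeln("I'm about to print A[i]");

// Emulated wait(handle): block until the fetch has landed.
handle.readFE();
writeln(tmp);
```

Spawning a task per get is far heavier than what a compiler-inserted non-blocking get would cost, so this is only useful for reasoning about where the issue and wait points would go.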

We did a similar optimization in ZPL, though it was a much simpler language in which to do the def-use analysis without being overly conservative.