New Issue: Asymmetric multi-level promotions: Design and hazards

17970, "bradcray", "Asymmetric multi-level promotions: Design and hazards", "2021-06-23T15:39:54Z"

Chapel intentionally supports asymmetric promotions so that cases like this:

config const n = 10;

var A, B: [1..n] real;
B = A + 0.1;

work. Specifically, the + operator is defined for pairs of real values, so when array A is passed to it, as above, it results in a promotion of just one argument, resulting in the equivalent to:

forall (b, a) in zip(B, A) do
  b = a + 0.1;

This can sometimes cause confusion however when there are multiple levels of promotion. For example, imagine a case like:

var C, D, E: [1..n] [1..2] real;
var pair: [1..2] real;
D = C + pair;

where the user presumably intended for pair to be added to each of C's [1..2] real elements. However, because of how Chapel is implemented, we see that two arrays are passed to + (C's [1..n] array and pair's [1..2] array). As a result, this will essentially be rewritten as:

forall (d, c, p) in zip(D, C, pair) do
  d = c + p;

at that point, the compiler will see that an array (c, with type [1..2] real) is being passed to a + resulting in an asymmetric promotion, similar to the original one above, and rewrite this to:

forall (d, c, p) in zip(D, C, pair) do
  forall (dd, cc) in zip(D, C) do
    dd = cc + p;

What will happen at execution time is that:

  • if n > 2 is that we'll get a zip size mismatch between the n elements of d and c and the 2 elements of pair
  • if n == 2, things will "work", though whether or not it was what the user intended is unclear (we'll come back to this case below)
  • if n < 2, the program will compile and run where it should generate a zip size mismatch error as in the first case (issue #11428)

For this specific case, it's probably most likely that we'd fall into the first case which is somewhat reassuring since it's the one that's most likely to point out the user's mistake, though only at execution time. Yet the instability and potential for confusion here is still unsettling to me. Note that, once a user runs into this asymmetry, they could achieve their original intent by rewriting the loop they wanted more explicitly, for example as:

D = [c in C] c + pair;

The instability in the case above may suggest that cases of multi-level promotion, like the ones that occur for D and C, shouldn't be supported, yet I don't think that's the case, as in other cases it can be an invaluable help. For example, consider:

E = C + D;

In this case, it's valuable and sensible that we would rewrite this to:

forall (e, c, d) in zip(E, C, D) do
  e = c + d;

and then, seeing that e, c, and d are still arrays, rewrite it to:

forall (e, c, d) in zip(E, C, D) do
  forall (ee, cc, dd) in zip(e, c, d) do
    ee = cc + dd;

which provides a succinct and naturally composable way to do a reasonable operation in parallel.

So that takes my mind to wondering about whether we could / should be doing something different in the case of asymmetric multi-level promotions. Returning to the original case above:

var C, D, E: [1..n] [1..2] real;
var pair: [1..2] real;
D = C + pair;

One approach would be to say something like "The compiler should notice the commonality of [1..2] real in this example and decide to try to make those dimensions align when it's implementing the promotion rather than trying to align [1..n] and [1..2] as it does today. But I'm not sure that's practical. In part, this reflects the limits of our compiler today, where we don't really reason about array sizes at compile-time. But in more general cases, it may not be evident to the compiler which dimensions of an array match in size or not (e.g., a function that takes its array arguments in and doesn't impose any size constraints on them).

Another approach would be to only support single-level asymmetric promotions (like the original example in this issue) or multi-level promotions that are symmetric in nesting depth (like the E = C + D example). My best attempt at trying to summarize this would be "A promotion must either involve scalar arguments or array arguments of the same nesting depth").

A third approach would be to say that today's approach is completely reasonable and appropriate, and that all we should do is close the bug / design issue in #11428. To evaluate this, let's consider the n==2 case again, more explicitly:

var F, G: [1..2] [1..2] real;
var pair: [1..2] real;
G = F + pair;

Here, I think the question is "Is it reasonable to interpret this as follows?":

forall (g, f, p) in zip(G, F, pair) do
  g = f + p;

and therefore:

forall (g, f, p) in zip(G, F, pair) do
  forall (gg, ff) in zip(g, f) do
    gg = ff + p;

or, equivalently:

forall i in 1..2 do
  forall j in 1..2 do
    G[i][j] = F[i][j] + pair[i];

rather than, say:

forall i in 1..2 do
  forall j in 1..2 do
    G[i][j] = F[i][j] + pair[j];

Or should we just make it an error (or warning) to do an asymmetric multi-level promotion like this?

Seeing this example makes me think that making it an error/warning would be the most prudent approach, as I don't have an intuition that the interpretation we'd choose today is the most logical or reasonable for any reason. And this would also have the benefit of making the original asymmetric multi-level example a compilation-time issue rather than an execution-time one.

So then the question for me is "How hard would it be to notice and flag such asymmetric multi-level promotions?