I love Chapel's set array operations, but the edge cases are really hard

ahysing · June 21, 2022, 9:35pm

Hi community. I love Chapel and all its functionality. I am used to writing code on my own that I now get for free.
I have grouped together three questions that I can't find answers to online.

I am very productive with arrays, so that is what I am focusing on today.

Question 1.

var myArray : [0..10] int = 42;

Do all array elements end up with the value 42? I have not yet figured out whether this is valid Chapel. To be honest, I don't trust the compiler to catch all the errors I throw at it.

Question 2.
I am computing products of arrays.

module Poc {
  proc main() {
    var myArray : [0..10] int = {0..10};
    var otherArray : [0..20] int = {0..20};
    var result =  myArray * otherArray;
    writeln(result);
  }
}

My Chapel 1.26 compiler prints eleven squares: 0 1 4 9 16 25 36 49 64 81 100.

I assumed that switching the order of the two factors would give the same result, but that is not the case.

module Poc {
  proc main() {
    var myArray : [0..10] int = {0..10};
    var otherArray : [0..20] int = {0..20};
    var result =  otherArray * myArray;
    writeln(result);
  }
}

gives
error: halt reached - size mismatch in zippered iteration (dimension 0).

Is this expected?

Question 3.
This time I am having a hard time with dmapped domains.

module Poc {
  use BlockDist;
  proc main() {
    writeln("started");
    var A : domain(1) dmapped Block({0..10}) = {0..10};
    var myArray : [A] int = {0..10};
    var B : domain(1) = {0..10};
    var otherArray : [B] int = {0..10};
    var result =  myArray * otherArray;
    writeln(result);
    writeln("finished");
  }
}

This gives the output:

Build Successful

started
0 1 4 9 16 25 36 49 64 81 100
finished

Which locales is result distributed over? If result is stuck on the main locale, what would be the simplest way to give it the same domain and Block distribution as myArray on the left-hand side?

damianmoz · June 21, 2022, 10:04pm

When it comes to the multiplication of two 1D arrays, the only truly valid simple arithmetic vector operations are those where the dimensionality is identical. My 2c.

damianmoz · June 22, 2022, 1:27am

The original operation you tried multiplied (elementwise) an 11x1 1D array by a 21x1 1D array. Chapel looked at the first operand and decided to produce a result that was 11x1. If Chapel were more type-strict, it would reject that. It makes no sense to me. If I really wanted to achieve what you tried, I would slice otherArray:

var result = myArray * otherArray[0..10];

This has the advantage that writing it with the operands swapped will also work.

I am not a dmapped expert.

bradcray · June 22, 2022, 2:06am

Hi Andreas —

I'm glad to hear you're enjoying Chapel. If you have questions that you think would be valuable to the broad public, feel free to ask them on StackOverflow with the chapel tag, and we'll answer them there to build up our corpus of information. When things are working right, we get notified when such questions are asked.

On Question 1: yes, this works as you're expecting and is as intended. Assigning a value of type t to an array whose elements have type t results in a conceptual forall loop, like:

forall elem in myArray do
  elem = 42;

On Question 2: this is a current bug, unfortunately (so you're right not to trust us "to catch all the errors you throw" at us), and it's not a particularly easy one for us to fix. Zippered iteration in Chapel generally requires the iterands to have compatible sizes/shapes. However, as you're noting here, if the first iterand in the zip is smaller than the second, it sneaks by without complaint. This is tracked as issue #11428 in chapel-lang/chapel on GitHub, "zippered forall loops with size mismatches can silently drop iterations on the floor".

In order to make such zipperings legal, you'll either need to make sure the arrays are the same size, or else slice the larger array by the smaller (e.g., otherArray[myArray.domain] in this case), as I see Damian's also suggested since I started typing this.
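
A minimal sketch of the slicing fix, using the array sizes from Question 2. The loop-expression initializers and the names r1 and r2 are just stand-ins for illustration, not taken from the original post:

```chapel
var myArray    = [i in 0..10] i;   // 11 elements, indices 0..10
var otherArray = [i in 0..20] i;   // 21 elements, indices 0..20

// Slice the larger array by the smaller array's domain so that both
// operands of the promoted * have 11 elements, whichever one leads.
var r1 = myArray * otherArray[myArray.domain];
var r2 = otherArray[myArray.domain] * myArray;

writeln(r1);
writeln(r2);
```

With the slice in place, both operand orders zip 11 elements against 11 elements, so swapping the factors should no longer change the outcome.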

ahysing:

Question 3.
var result =  myArray * otherArray;
Which locales is result distributed over? If result is stuck on the main locale, what would be the simplest way to give it the same domain and Block distribution as myArray on the left-hand side?

As you've written the code, result will have the same domain as myArray, so will be distributed. If you were to swap the orders of the arguments, it would have the same domain as otherArray, so would be local. You can verify this using a loop like this:

    forall i in result.domain do
      writeln(here.id, " owns ", i);

If you wanted the result to have the other distribution, you could force it by using an explicit type declaration that linked the domain to the array you wanted, like:

var result: [B] int = myArray * otherArray;

or:

var result: [otherArray.domain] int = myArray * otherArray;

However, this still wouldn't change where the computation is done (for better or worse). To understand why, read on.

The reason for this behavior is that * on two arrays essentially calls the scalar * operator on the array's elements in a zippered fashion. Thus,

...myArray * otherArray...

is equivalent to:

forall (m,o) in zip(myArray, otherArray) do
  ...m * o...

and in these zippered contexts, the result expression takes the shape/size/domain of the "leader" iterand, which is to say the first one in the (explicit or implicit) zip expression.

This choice of leader will also have a profound effect on how the operation is implemented. Specifically, if a distributed array like myArray is the leader, all cores on all locales that it targets will be involved in the computation; whereas if otherArray is the leader, only the local cores will be involved in the computation. Thus, when using arrays with different domains/distributions, it's important to pay attention to which one leads the computation. Of course, you can always make it more explicit (and verbose) in your code by using a parallel loop, like:

forall i in myArray.domain do
  ...myArray[i] * otherArray[i]...

Hope this helps explain some of the "algebra" of which domain governs an expression in Chapel. Obviously, feel free to ask follow-up questions about anything that's unclear.

-Brad

ahysing · June 22, 2022, 5:24am

@damianmoz I agree with you. These operations make no sense.

I get it now. It looks like a binary operation on arrays, like *, is essentially a forall loop with a zip. The domain of the left-hand side is also used for the result, and the same domain also decides on which locales the computation is performed.

bradcray · June 22, 2022, 5:33am

Precisely. One other corollary that's important, though, which I forgot to mention. If you have a chained series of promoted operators within a single statement like:

A = B * C + D * E;

don't think of it as:

var Temp = B * C;
var Temp2 = D * E;
A = Temp + Temp2;

with each statement turning into its own zippered forall loop. Instead, it gets transformed into something equivalent to:

forall (a, b, c, d, e) in zip(A, B, C, D, E) do
  a = b*c + d*e;

Therefore, for such a statement, A's distribution will govern how the whole computation is performed.
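
To make that equivalence concrete, here is a small sketch that computes the same statement both ways and prints the two results. The array sizes, the fill values, and the second output array A2 are assumptions chosen only for illustration:

```chapel
var A, A2: [1..5] int;
var B = [i in 1..5] i;
var C: [1..5] int = 2;
var D: [1..5] int = 10;
var E = [i in 1..5] i;

// Promoted form: a single statement over whole arrays
A = B * C + D * E;

// Equivalent explicit form: one zippered forall led by the output array
forall (a, b, c, d, e) in zip(A2, B, C, D, E) do
  a = b*c + d*e;

writeln(A);
writeln(A2);
```

Both lines of output should match, since each element is computed as b*c + d*e in a single pass over the zipped arrays.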

-Brad

ahysing · June 22, 2022, 5:58am

That is a really good explanation.

Another related topic is catching this kind of undefined behaviour.
In some cases you are not in total control of your input arrays. Let's say you are writing a library that takes arrays A and B as input. What would be the most common pattern for catching this as a Chapel programmer?

I imagine one could do


proc multiply(A, B) throws {
  // reject inputs whose domains differ
  if A.domain != B.domain then
    throw new IllegalArgumentError("both inputs must have the same domain");
  return A * B;
}

Asserts, exceptions, or clever use of types and generics: there are so many different options to choose from.

bradcray · June 22, 2022, 6:31am

The precise check to use depends on the library of course, but for built-in operators like +, note that the domains need not match (as in your original question 3), just the shapes/sizes. For such a case, A.shape != B.shape would be the more permissive check.
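
A minimal sketch of such a shape-based check, assuming both inputs have the same rank. The helper name multiplyChecked, the error text, and the sample arrays are hypothetical and only serve the illustration:

```chapel
// Hypothetical helper: accepts arrays whose shapes match,
// even when their index sets differ.
proc multiplyChecked(A: [], B: []) throws {
  // compare per-dimension sizes rather than the domains themselves
  if A.shape != B.shape then
    throw new IllegalArgumentError("inputs must have the same shape");
  var result = A * B;   // takes its domain from A, the leader
  return result;
}

var X = [i in 0..4] i;     // indices 0..4
var Y = [i in 10..14] i;   // indices 10..14: same shape, different indices
var Z = [i in 0..9] i;     // ten elements: different shape

proc main() {
  try {
    writeln(multiplyChecked(X, Y));   // shapes match, so this goes through
    writeln(multiplyChecked(X, Z));   // shapes differ, so this throws
  } catch e {
    writeln("rejected: ", e.message());
  }
}
```

Swapping the shape comparison for A.domain != B.domain would give the stricter check, which would also reject the X and Y pair here.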

If the shapes match but you want to allow the specific indices to differ, many libraries can be written in an index-neutral way, either by iterating over the arrays directly:

forall (a, b) in zip(A, B) do ...

or by iterating over their domains:

forall (i, j) in zip(A.domain, B.domain) do ...

or by breaking the domains down into their component dimensions:

forall (Aj, Bj) in zip(A.domain.dim(1), B.domain.dim(0)) do ...

I'm also often overwhelmed by choices between halting/asserting/throwing when errors occur. Different approaches can be more or less suitable for different user profiles or situations. One other tool that can be valuable for checks that can be done statically is compilerError() which will generate a compile-time error when the call is resolved by the compiler. For example, a simple case would be:

if (A.rank != B.rank) then compilerError("rank mismatch between A and B");

As I think you allude to, another approach would be to put a constraint on the library routine to begin with such that it simply never resolved, as in:

proc multiply(A, B) where A.rank == B.rank {
  ...
}

-Brad
