LinearAlgebra module in 1.24

This is a post continuing the discussion from Announcing Chapel 1.24.0! - #5 by damianmoz in the Announcements category.

Here is the discussion from before:

@damianmoz:

I have yet to try to build LinearAlgebra with 1.24.0.

Is the 'lu' routine within LinearAlgebra able to accept a matrix from
existing user code like

```chapel
var A : [1..n, 1..n] real(64);
```

What allows it to use array subscripts with a domain of

(0..<n, 0..<n)

without doing a reindex? How does that work?

What is the overhead of a reindex()?

If 'kron' is given 1-based indices as input, is the result 1-based?

Regards - Damian

@bradcray:

@damianmoz : Good index-neutral routines won't try to impose their bounds on their input arguments, but will use the input arguments as they stand. For example, if I were writing an "increment" routine for an array, I'd tend to write:

```chapel
proc increment(A: [?D] ?t) {
  forall i in D do
    A[i] += 1;
}
```

or:

```chapel
proc increment(A: [?D] ?t) {
  forall a in A do
    a += 1;
}
```

or ...(imagine other variants here) rather than:

```chapel
proc increment(A: [?D] ?t) {
  ref A0 = A.reindex({0..<A.dim(0).size, 0..<A.dim(1).size});
  forall i in 0..<A.dim(0).size do
    for j in 0..<A.dim(1).size do
      A0[i,j] += 1;
}
```

(say).

If I did need to operate on A's indices more directly, I'd rely on queries like D.dim(i).low and D.dim(i).high to compute on A using its natural indices rather than reindexing the array to fit my preferred indexing mode.
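A sketch of that style: a hypothetical `trace` routine (not from any module) that sums a square matrix's main diagonal by computing offsets from the array's own bounds, so it works unchanged for 0-based, 1-based, or any other indexing.

```chapel
// Sum the main diagonal of a square matrix without assuming 0- or 1-based
// indexing: the offsets come from the array's own bounds.
proc trace(A: [?D] ?t): t {
  const rows = D.dim(0),
        cols = D.dim(1);
  var sum: t;
  for k in 0..<rows.size do
    sum += A[rows.low + k, cols.low + k];
  return sum;
}

var A: [1..3, 1..3] real;           // 1-based; [0..<3, 0..<3] works the same
forall (i, j) in A.domain do A[i, j] = i + j;
writeln(trace(A));                  // sums A[1,1] + A[2,2] + A[3,3]
```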

I assume that the LinearAlgebra module takes a similar approach, but would need someone else closer to the package to validate that assumption (where @ben-albrecht or @mstrout might be good candidates).

-Brad

@ben-albrecht:

Good index-neutral routines won't try to impose their bounds on their input arguments, but will use the input arguments as they stand.

This is a design goal of the LinearAlgebra module. However, not all linear algebra functions meet this criterion yet. In Chapel 1.23, some functions always returned a 1-based array or only accepted 1-based arrays as input arguments. In Chapel 1.24, some of those functions were updated to be index-neutral, but many were simply updated to expect or always return 0-indexed arrays instead. Taking a peek at the source, it looks like lu is one of the functions that assumes 0-based indexing in 1.24. kron is index-neutral w.r.t. arguments, but always returns a 0-indexed array in 1.24. It remains an ongoing effort to update the remaining functions in LinearAlgebra to be index-neutral w.r.t. arguments and returns.
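Given the 1.24 behavior described above (kron returning a 0-indexed array), a caller who prefers 1-based indices can copy the result into an array with their own bounds; a sketch, assuming a 1-based caller:

```chapel
use LinearAlgebra;

var A, B: [1..2, 1..2] real;
// ... fill A and B ...

const K0 = kron(A, B);   // 0-indexed result in 1.24
// copy into an array with the 1-based bounds the caller prefers;
// initialization across same-shape domains is allowed in Chapel
const K1: [1..K0.dim(0).size, 1..K0.dim(1).size] real = K0;
```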

@damianmoz:

Quoting from the Discourse post by @bradcray immediately before the most recent:

Good index-neutral routines will not try to impose their bounds on their input arguments, but will use the input arguments as they stand

I agree. I cannot see this approach in LinearAlgebra, but maybe I need to look harder.

I apologise for the slight mis-quote there. I changed "won't" to "will not" because Discourse goes anti-social with a quote inside a block: it thinks it is a string and starts using red text!

Quoting from the most recent Discourse post of @Ben

This is a design goal of the LinearAlgebra module.


Actually, that is a mandate as per the original concepts of the email from @bradcray of 5th Nov 2019 with the subject _Should Chapel use 0-based indexing for tuples, strings, etc_. Part of the email talks about **Indexing cases that won't need to change**:

The "in cases where the language chooses..." qualification above is crucial
because the most common places in Chapel that deal with indexing require the
programmer to specify low and high bounds:

  • ranges (e.g., 0..n-1 vs. 1..n vs. -3..3)

  • domain values (e.g., {1..n, 0..n-1, -3..3})

  • array types (e.g., var A: [1..n, 0..n-1, -3..3] real;)

Such cases would not require any changes in user code if we were to make the proposed change.

In that same Nov 2019 email (and its correction), there was mention that

array types inferred from literals (e.g., var A = ["hi", "there"]; results in A having type [1..2] string today; in the proposal, it would have type [0..1] string)

There was no mention that Chapel was zero-based across the board, only that a base of 0 would be the default where array bounds were to be inferred from a declaration. In fact, I can still do
```chapel
var A : [1..2] string = [ "hi", "there" ];
```

and things should still work.

Quoting from the most recent Discourse post,

In Chapel 1.24, some of those functions were updated to be index-neutral, but many were simply updated to expect / always return 0-indexed arrays instead.

Again it is counter to the mandate. It breaks existing code.

Taking a peek at the source, it looks like lu is one of the functions that assumes 0-based indexing in 1.24. kron is index-neutral w.r.t. arguments, but always returns a 0-indexed array in 1.24.

That is the worst of both worlds. It breaks everybody's code. See also my comments below about a reindex such as exists in kron.

It remains an ongoing effort to update the remaining functions in LinearAlgebra to be index-neutral w.r.t. arguments and returns.

Can you point me at any index-neutral approach in the existing LinearAlgebra routines? I am struggling to find them. Things like a reindex are not an index neutral approach in my humble opinion but others may have differing views.

Hi @damianmoz ,

For some additional context, I think the LinearAlgebra module should have been
updated with these changes when Chapel initially switched to 0-based indexing.
Not doing so was an oversight, in part because the LinearAlgebra module is
still a package module and does not receive the same level of focus as the
core language and standard library.

Again it is counter to the mandate. It breaks existing code.

Some breaking changes were inevitable when making this switch because some functions must choose a default indexing by design, e.g., Matrix(3,3). I agree that it is unfortunate that some functions still require a specific indexing, which made the breaking changes more impactful. Updating all of these is a larger effort that will need to be tackled in the upcoming release.

That is the worst of both worlds. It breaks everybody's code. See also my
comments below about a reindex such as exists in kron.

As I mentioned earlier, a breaking change was inevitable (and should have
occurred when Chapel first switched to 0-based indexing). Before this change,
the LinearAlgebra module was in an odd state where many functions asserted
1-based indexing or always returned arrays with 1-based indexing, despite the
rest of the language/libraries defaulting to 0-based indexing.

Things like a
reindex are not an index neutral approach in my humble opinion but others
may have differing views.

I am not sure I understand the rationale behind this view. I think of
index-neutrality having 2 components: (1) Does the function accept arrays with
any stride or offset? (2) Does the function return an array that inherits the
stride and offset from the argument(s)? Whether or not the function utilizes
reindex is an implementation detail.
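Under that definition, a function like the following illustrative sketch satisfies both components: it accepts an array with any offset or stride, and its result inherits them because it is declared over the argument's own domain.

```chapel
// Index-neutral by both criteria: accepts any offset/stride, and the
// result is declared over the argument's own domain.
proc scaled(A: [?D] ?t, factor: t) {
  var result: [D] t;
  forall i in D do
    result[i] = A[i] * factor;
  return result;
}

var A: [1..9 by 2] real = 1.0;   // offset and non-unit stride
const B = scaled(A, 2.0);        // B.domain is also {1..9 by 2}
```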

Can you point me at any index-neutral approach in the existing
LinearAlgebra routines? I am struggling to find them.

For a few examples, see eye, transpose, cross, and matrix-matrix
multiplication (dense or sparse).

For some additional context, I think the LinearAlgebra module should
have been updated with these changes when Chapel initially switched to
0-based indexing.

As I mentioned, I did not know this had happened. All that has happened
is that when a declaration occurs without explicit bounds being given,
those bounds are memory-offset limits (or 0-based indexes in IT jargon).

I (and several others) started new work in Chapel, moving from C++, solely
because it supported mathematical indexing (or 1-based indexes in IT jargon).

  Can you point me at any index-neutral approach in the existing
  LinearAlgebra routines? I am struggling to find them.

For a few examples, see eye, transpose, cross, and matrix-matrix
multiplication (dense or sparse).

I agree with the first two.

I will try and redo 'cross' as truly index-neutral. That said, one can
assume that library routines only support unit-stride for a cross-product.
I have never seen a cross product called with non unit-stride 1D-arrays in
40 years but then I normally only work in geometry, graphics and finite
element applications so my experience is by definition limited.

I will try to do an index-neutral 'lu' by the end of May, but I need the
(re-)naming of the 'stridable' param and the new index-neutrality support
methods (#17454) to reach a sufficient level of agreement before I start.
I will also seek to ensure that the performance issues caused by the
cache-hit problem of the current 'lu' disappear.

Thanks - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer

Isn't this more an index neutral approach?

```chapel
// cross product : the dimensions of the result are those of the first argument
proc cross(x : [?xD] ?R, y : [?yD] R)
{
    if xD.rank != 1 || yD.rank != 1 then
        compilerError("Ranks are not 1");

    // the 3-element result literal below requires exactly 3 elements
    assert(xD.size == 3);
    assert(yD.size == 3);

    const i = xD.dim(0).low;
    const j = yD.dim(0).low;
    const (x0, x1, x2) = (x[i], x[i + 1], x[i + 2]);
    const (y0, y1, y2) = (y[j], y[j + 1], y[j + 2]);
    const z : [xD] R = [ x1*y2 - x2*y1, x2*y0 - x0*y2, x0*y1 - x1*y0 ];

    return z;
}
```

Others may have a better solution.

I have no idea of the overhead of the tuplized extraction of the elements of the one-dimensional arrays. But that is the compiler writer's problem.

The above naive implementation has severe numerical flaws when given what amount to nearly coincident vectors. But that is a whole new ball game.

I avoided the use of the halt() procedure because I figure it cripples the optimizer which would make this routine totally impractical to use in the real-world.

What is the difference between the low method and first?

There were bad typos in the original post if you are reading this from email.

I have edited the post on Discourse to address these.

Regards - Damian


What is the difference in overhead between

```chapel
const (x0, x1, x2) = (x[i], x[i + 1], x[i + 2]);
const (y0, y1, y2) = (y[j], y[j + 1], y[j + 2]);
```

and

```chapel
const x0 = x[i + 0], y0 = y[j + 0];
const x1 = x[i + 1], y1 = y[j + 1];
const x2 = x[i + 2], y2 = y[j + 2];
```

I have liked the former ever since I saw the concept in the '70s in CPL (although it called them lists). But not if it has an overhead in Chapel.

Within the Module, when is halt used and when is an error thrown?

Not a big fan of either but the way this module is written, it needs try/throw/catch exceptions.

What is the overhead of try/throw/catch logic? Does it work well in parallel?

Has the debate about such an approach for a parallel user library been debated at length?

The module uses a conventional pivot vector, which allows very simple permutation logic to be used if a new copy of the array in question is made.

Both LINPACK and LAPACK use a different approach which allows permutation in-place to be programmed quite easily.

Is there any documentation of why one approach was preferred over the other?
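For concreteness, the pivot-vector, out-of-place style described above reads something like the following sketch (`permuteRows` is a made-up name, not the module's actual code):

```chapel
// Apply a permutation vector out of place: row piv[i] of A becomes
// row i of the result, leaving A untouched.
proc permuteRows(A: [?D] ?t, piv: [D.dim(0)] int) {
  var P: [D] t;
  forall i in D.dim(0) do
    P[i, ..] = A[piv[i], ..];
  return P;
}
```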

Hey Damian,

Finally getting a chance to catch up on this thread. I've taken a survey of index-neutrality in the LinearAlgebra module as of 1.24 (technically the 1.25 pre-release) and posted it in this issue. (I know you've spotted it and commented, but I am posting it for the benefit of others.)

Isn't this more an index neutral approach?

The implementation is more index-neutral, but the interface is just as
index-neutral as the current implementation, as far as I can tell.

What is the difference between the low method and first?

Ranges can have negative strides, which means their lowest value may not be
their first value, e.g. the range 1..3 by -1 has a low value of 1, but a
first value of 3.
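The distinction shows up as soon as a range's stride is negative:

```chapel
const r = 1..3 by -1;   // iterates 3, 2, 1
writeln(r.low);         // 1 -- smallest value in the range
writeln(r.first);       // 3 -- first value when iterating
writeln(r.high);        // 3
writeln(r.last);        // 1
```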

What is the difference in overhead between...

I would expect no overhead between the two implementations.

Within the Module, when is halt used and when is an error thrown?

There is an overhead to throwing an error, so many performance-critical library functions use halts instead. However, throwing errors is more user-friendly, so some functions still do this. There is not yet an established policy for this in the LinearAlgebra module; so far, the decision to halt vs. throw has been made on a case-by-case basis.

Does it work well in parallel?

Yes, there are mechanisms for handling cases where multiple errors are thrown
in a parallel context. Check out the Error Handling chapter
from the spec for details.
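For instance, errors thrown by the tasks of a forall are gathered into a TaskErrors group that the caller can catch as a unit; a small sketch:

```chapel
proc work(i: int) throws {
  if i % 3 == 0 then
    throw new Error("failed on " + i:string);
}

proc main() {
  try {
    forall i in 1..8 do
      work(i);
  } catch errs: TaskErrors {
    // several tasks may have thrown; TaskErrors collects them all
    writeln("caught: ", errs.message());
  } catch e {
    writeln("caught: ", e.message());
  }
}
```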

Is there any documentation of why one approach was preferred over the other?

I don't think there's documentation on this (there's no design document on the
linear algebra module today). The current approach has generally been to
support the out-of-place implementation first, with an intention to eventually
support in-place implementations with an inplace=true argument, where
applicable. The rationale is that out-of-place interfaces are higher-level
and easier to work with, at the expense of performance.