LinearAlgebra module in 1.24

This is a post continuing the discussion from Announcing Chapel 1.24.0! - #5 by damianmoz off the announcements category.

Hi @damianmoz ,

For some additional context, I think the LinearAlgebra module should have been
updated with these changes when Chapel initially switched to 0-based indexing.
Not doing so was an oversight, in part because the LinearAlgebra module is
still a package module and does not receive the same level of focus as the
core language and standard library.

Again it is counter to the mandate. It breaks existing code.

Some breaking changes were inevitable when making this switch because some
functions must choose a default indexing by design, e.g. Matrix(3,3). I agree
that it is unfortunate that some functions still require a specific indexing,
which made the breaking changes more impactful. Updating all of these is a
larger effort that will need to be tackled in the upcoming release.
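As a concrete illustration of the point above (the exact default domain is my assumption, not taken from the module source):

```chapel
use LinearAlgebra;

// Matrix(3,3) has no array argument whose indices it could inherit, so it
// must pick a default indexing by design -- presumably {0..#3, 0..#3}
// after this change, where it used to be the 1-based {1..3, 1..3}.
var A = Matrix(3, 3);
```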

That is the worst of both worlds. It breaks everybody's code. See also my
comments below about a reindex such as exists in kron.

As I mentioned earlier, a breaking change was inevitable (and should have
occurred when Chapel first switched to 0-based indexing). Before this change,
the LinearAlgebra module was in an odd state where many functions asserted
1-based indexing or always returned arrays with 1-based indexing, despite the
rest of the language/libraries defaulting to 0-based indexing.

Things like a reindex are not an index-neutral approach, in my humble opinion,
but others may have differing views.

I am not sure I understand the rationale behind this view. I think of
index-neutrality as having two components: (1) does the function accept arrays
with any stride or offset? (2) does the function return an array that inherits
the stride and offset from the argument(s)? Whether or not the function uses
reindex is an implementation detail.
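A minimal sketch of what those two properties look like in practice (the function name scale is mine, purely for illustration):

```chapel
// (1) accepts a 1-D array with any stride or offset;
// (2) returns a result whose domain is inherited from the argument.
proc scale(x : [?D] ?R, alpha : R) {
  var y : [D] R;                    // y inherits D's offset and stride
  forall i in D do y[i] = alpha * x[i];
  return y;
}

var a : [1..5] real = 1.0;          // 1-based
var b : [0..#5] real = 1.0;         // 0-based
var c : [2..10 by 2] real = 1.0;    // strided
// scale(a, 2.0), scale(b, 2.0) and scale(c, 2.0) all work, and each
// result is indexed exactly like its argument.
```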

Can you point me at any index-neutral approach in the existing
LinearAlgebra routines? I am struggling to find them.

For a few examples, see eye, transpose, cross, and matrix-matrix
multiplication (dense or sparse).
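For instance, transpose (if I read the behaviour described above correctly) inherits the argument's indexing rather than forcing a default:

```chapel
use LinearAlgebra;

var A : [1..2, 1..3] real;   // deliberately 1-based
var B = transpose(A);        // B's domain should be {1..3, 1..2},
                             // inherited from A, not reset to 0-based
```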

For some additional context, I think the LinearAlgebra module should
have been updated with these changes when Chapel initially switched to
0-based indexing.

As I mentioned, I did not know this had happened. All that has happened
is that when a declaration occurs without explicit bounds being given,
those bounds are memory offset limits (or 0-based indexes in IT jargon).

I (and several others) started new work in Chapel from C++ solely because
it supported mathematical indexing (or 1-based indexes in IT jargon).

  Can you point me at any index-neutral approach in the existing
  LinearAlgebra routines? I am struggling to find them.

For a few examples, see eye, transpose, cross, and matrix-matrix
multiplication (dense or sparse).

I agree with the first two.

I will try and redo 'cross' as truly index-neutral. That said, one can
assume that library routines only support unit stride for a cross product.
I have never seen a cross product called with non-unit-stride 1-D arrays in
40 years, but then I normally only work in geometry, graphics and finite
element applications, so my experience is by definition limited.

I will try and do an index-neutral 'lu' by the end of May, but I need to
get the (re-)naming of the 'stridable' param and the new index-neutrality
support methods (#17454) to a sufficient level of agreement before I start
it. I will also seek to ensure that the performance issues caused by the
cache-hit problem of the current 'lu' disappear.

Thanks - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer

Isn't this more of an index-neutral approach?

// cross product : the dimensions of the result are those of the first argument

proc cross(x : [?xD] ?R, y : [?yD] R)
{

    if xD.rank != 1 || yD.rank != 1 then
        compilerError("Ranks are not 1");

    // cross products are only defined for 3-element vectors, and the
    // offset arithmetic below assumes unit stride
    assert(xD.size == 3);
    assert(yD.size == 3);

    const i = xD.dim(0).low;
    const j = yD.dim(0).low;
    const (x0, x1, x2) = (x[i], x[i + 1], x[i + 2]);
    const (y0, y1, y2) = (y[j], y[j + 1], y[j + 2]);
    const z : [xD] R = [ x1*y2 - x2*y1, x2*y0 - x0*y2, x0*y1 - x1*y0 ];

    return z;
}

Others may have a better solution.
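Assuming the cross above compiles as written, a quick usage check of the index-neutral behaviour might look like:

```chapel
const u : [1..3] real = [1.0, 0.0, 0.0];   // 1-based
const v : [0..2] real = [0.0, 1.0, 0.0];   // 0-based
const w = cross(u, v);                      // w is indexed {1..3}, from u
// u x v is the unit z-vector, so w[1..3] holds (0.0, 0.0, 1.0)
```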

I have no idea of the overhead of the tuplized extraction of the elements of the one-dimensional arrays. But that is the compiler writer's problem.

The above naive implementation has severe numerical flaws when given what amount to near-coincident vectors. But that is a whole new ball game.

I avoided the use of the halt() procedure because I figure it cripples the optimizer which would make this routine totally impractical to use in the real-world.

What is the difference between the low method and first?

There were bad typos in the original post if you are reading this from
email.

I have edited the post on Discourse to address these.

Regards - Damian


What is the difference in overhead between

    const (x0, x1, x2) = (x[i], x[i + 1], x[i + 2]);
    const (y0, y1, y2) = (y[j], y[j + 1], y[j + 2]);

and

    const x0 = x[i + 0], y0 = y[j + 0];
    const x1 = x[i + 1], y1 = y[j + 1];
    const x2 = x[i + 2], y2 = y[j + 2];

I have liked the former ever since I saw the concept in the 70s in CPL (although it called them lists), but not if it has an overhead in Chapel.

Within the module, when is halt used and when is an error thrown?

Not a big fan of either, but the way this module is written, it needs try/throw/catch exceptions.

What is the overhead of try/throw/catch logic? Does it work well in parallel?

Has such an approach for a parallel user library been debated at length?

The module uses a conventional pivot vector, which allows very simple permutation logic to be used if a new copy of the array in question is made.

Both LINPACK and LAPACK use a different approach which allows permutation in-place to be programmed quite easily.
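To make the contrast concrete, here is a sketch of the two styles (the names applyPermutation and applySwaps are mine, not from either library):

```chapel
// Style used by the module: an explicit permutation vector 'piv',
// applied by copying the rows into a fresh array -- very simple logic,
// but it needs a new copy of the array in question.
proc applyPermutation(A : [?D] ?R, piv : [D.dim(0)] int) {
  var B : [D] R;
  forall (i, j) in D do B[i, j] = A[piv[i], j];
  return B;
}

// LINPACK/LAPACK style: ipiv[k] records the row that was swapped with
// row k at step k; replaying those swaps permutes the rows in place,
// with no second copy of the data.
proc applySwaps(ref A : [?D] ?R, ipiv : [D.dim(0)] int) {
  for k in D.dim(0) do
    if ipiv[k] != k then
      A[k, ..] <=> A[ipiv[k], ..];
}
```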

Is there any documentation of why one approach was preferred over the other?