Parallel Identification of NaN within a matrix

damianmoz · December 23, 2023, 1:05am

How does one rapidly identify if there are NaNs within a matrix.
The best I can deduce is that given an array A

const haveSeenNaN = | reduce [t in A] (t != t);

or

const countOfNaN = + reduce [t in A] (t != t):int(32)

Is there a smarter way especially if say A was 2000x2000.

I am worried about memory access which is the crucial task here. Sorry for the initial typo that needed editing

bradcray · January 2, 2024, 6:30pm

Hi Damian —

I think a reduction is a completely reasonable way to write this given Chapel's current feature set. A few things I might suggest:

for potential performance benefits: use a || reduction rather than a | reduction in hopes of getting an advantage from the potential for short-circuiting
for clarity benefits: use the isNan() call in the Math module (unless there's a reason the != approach is preferred?)
for arguable clarity benefits: use promotion in the reduction expression rather than an explicit loop

Here's my proposed rewrite:

const haveSeenNaN = || reduce isNan(A);

Or, if you have a good reason for avoiding isNan(), maybe:

const haveSeenNaN = || reduce (A != A);

-Brad

damianmoz · January 4, 2024, 1:11am

Thanks for that insight. My comments follow and I apologies for my lack of skill at quoting.

for arguable clarity benefits: use promotion in the reduction expression,
rather than an explicit loop

Your clarity benefits are beyond arguable, they are obvious. Too bad I did not think of that.

for potential performance benefits: use a || reduction rather than a | reduction
in hopes of getting an advantage from the potential for short-circuiting

I would like to understand that a bit better.

for clarity benefits: use the isNan() call in the Math module (unless there is a reason
the != approach is preferred?

The original isinf() and isnan() functionality was introduced in IEEE 754 largely as a workaround. They were used to handle anti-social (or as Kahan put it rude) compilers optimizing away

x != x

or even x == x to be always false and true respectively. They also provided a way to cleanly handle hardware that did not (fully or at all) support infinities or NaN such as the IBM Cell (or some GPUs).

For clarity, instead of the isinf() and isnan(), I prefer

abs(x) != INFINITY
x != x

And if programmers cannot read the latter as saying x being unordered with respect to itself, educators are failing their students. I would have thought compilers would have optimized those expressions better in most contexts. The sooner we get rid of workarounds, the better, not least because it simplifies things.

bradcray · January 4, 2024, 1:41am

First of all, to make sure anyone reading is on the same page, | can be viewed as a bitwise or non-short-circuiting boolean "or" operation, whereas || is a short-circuiting "or" operation (following C's lead), meaning that if an earlier argument is true, subsequent arguments need not / should not be evaluated. So when applied to a collection of values, a || reduce expression can arguably skip the evaluation of subsequent values once it finds a true whereas a | reduce can't by default (to compute side effects of subsequent expressions, e.g.).

Given that, there are a few reasons I said it could have a potential performance advantage:

If there are no "true" elements in the expression being reduced, then all elements will need to be evaluated anyway, and no advantage will be gained
For any reduction computed in parallel, it may be that one task finds a true and stops searching, but in our current implementation, it has no way of notifying other tasks that it found such a value (sometimes called a "eureka" in parallel literature), so they will continue searching through their local data. If there is only one true in the entire collection, it's unlikely that any significant benefit will be gained since most tasks still have to evaluate their complete subset of values.
For an | reduce, it's plausible that the compiler could optimize it into a short-circuiting implementation if it determined that the evaluation of the operand expressions is side-effect-free. To my knowledge, our compiler does not currently do this, but in the future it could.

Given the potential upsides (and no downsides that I'm aware of), I tend to use || and && reductions over | and & reductions on collections of bools.

Got it, thanks for explaining.

-Brad

damianmoz · January 4, 2024, 3:18am

Thanks. Great explanation.

Topic		Replies	Views
Werid performance slow-down Users	3	177	January 12, 2023
Minimum of Two Real numbers Users	9	192	August 9, 2022
Array assignment Users	3	90	January 16, 2024
Partial reductions (operate on 2D-array and produce 1-D array) Users	19	324	December 31, 2022
Cache Unfriendly Reduction Users	9	105	January 21, 2024

Parallel Identification of NaN within a matrix

Related Topics