Generic Floating Point Routine Definition Style

What is the better way to define generic routines?

A simple routine which returns the smallest normal number could be defined two ways.

This:

inline proc leastNormal(type T) param where T == real(32) do return 0x1p-126:int(32);
inline proc leastNormal(type T) param where T == real(64) do return 0x1p-1022:int(64);
inline proc leastNormal(param x : ?Real) param do return leastNormal(Real);
inline proc leastNormal(const x : ?Real) param do return leastNormal(Real);

Or this:

inline proc leastNormal(type T) param where T == real(32) do return 0x1p-126:int(32);
inline proc leastNormal(type T) param where T == real(64) do return 0x1p-1022:int(64);
inline proc leastNormal(param x : real(?w)) param do return leastNormal(real(w));
inline proc leastNormal(const x : real(?w)) param do return leastNormal(real(w));

Which is better, or more optimal?

Hi Damian —

Since you use the phrase "more optimal", let me start by saying that you shouldn't see any performance difference between these approaches. I'd call the difference more a matter of style and generality (so maybe "more optimal in terms of clarity / precision").

For a routine that only needs to support these real(w) signatures, I prefer the second of these two choices since it puts a tighter constraint on the x argument, requiring it to be a real. In particular, note that your first approach would permit a call like leastNormal(42), which would cause it to try to call leastNormal(int), which in turn leads to a resolution error since there is no such overload.
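To make that difference concrete, here is a minimal sketch that puts the two dispatching styles side by side. The names leastNormalOf, leastNormalLoose, and leastNormalTight are hypothetical, introduced only so both styles can coexist in one file, and the constants are cast to real(w) on the assumption that a real result is what is wanted.

```chapel
// Type-based overloads shared by both styles (hypothetical name).
proc leastNormalOf(type T) param where T == real(32) do return 0x1p-126:real(32);
proc leastNormalOf(type T) param where T == real(64) do return 0x1p-1022:real(64);

// First style: x may be of any type, so a bad actual is only caught in the body.
proc leastNormalLoose(x: ?Real) param do return leastNormalOf(Real);

// Second style: x must be some real(w), so a bad actual is rejected at the call.
proc leastNormalTight(x: real(?w)) param do return leastNormalOf(real(w));

var a = 1.0:real(32);
var b = 2.0;           // real(64)

// For real arguments the two styles agree.
writeln(leastNormalLoose(a) == leastNormalTight(a));
writeln(leastNormalLoose(b) == leastNormalTight(b));

// leastNormalLoose(42);  // accepted by the signature, then fails on leastNormalOf(int)
// leastNormalTight(42);  // rejected up front: 42 is not a real(w)
```

Uncommenting either of the last two lines should produce a compile-time error; the practical difference is where the error is reported.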

Of course, if you wanted to have the routine support additional types, then the first form might be preferable, where you might want to add an additional where clause to permit only the types of x that you want to support to make it through. But since you named that type query Real, I'm guessing that's not the case here.
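As a rough sketch of that where-clause idea, the fully generic form can be restricted to an explicit list of types. The names leastNormalOf and leastNormalAny are hypothetical, and the list shown here only admits the two real widths; supporting more types would mean extending both the where-clause and the set of type-based overloads.

```chapel
// Type-based overloads (hypothetical name), cast to real(w) by assumption.
proc leastNormalOf(type T: real(32)) param do return 0x1p-126:real(32);
proc leastNormalOf(type T: real(64)) param do return 0x1p-1022:real(64);

// Generic in x, but the where-clause only lets the listed types through.
proc leastNormalAny(x: ?T) param where T == real(32) || T == real(64) do
  return leastNormalOf(T);

var x = 1.0:real(32);
var y = 1.0;           // real(64)
writeln(leastNormalAny(x) == leastNormalOf(real(32)));
writeln(leastNormalAny(y) == leastNormalOf(real(64)));

// leastNormalAny(42);  // no candidate: int fails the where-clause
```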

As a few other style notes, consider:

  • To avoid the verbosity of a where-clause, you could change the first two overloads in either approach into a constraint on the type argument:

    inline proc leastNormal(type T: real(32)) param do return 0x1p-126:int(32);
    inline proc leastNormal(type T: real(64)) param do return 0x1p-1022:int(64);
    
  • Since the behavior of the param and const overloads is identical (in that both return a param and do so using the same logic), you could do away with the param overload, which doesn't add any significant value over the const version. Specifically, param actuals can be passed to non-param formals. The typical reason for supporting routines that accept param and non-param actuals is when the former returns a param and the latter does not.

  • For that matter, since param routines are computed at compile-time, the argument intent of const isn't really necessary or meaningful. So for brevity, you could just drop that intent.

  • Similarly, since param procedures are computed at compile-time, the inline keyword shouldn't really have any benefit here.

As a result of all those comments, this is probably how I would write this family of overloads (ATO):

proc leastNormal(type T: real(32)) param do return 0x1p-126:int(32);
proc leastNormal(type T: real(64)) param do return 0x1p-1022:int(64);
proc leastNormal(x: real(?w)) param do return leastNormal(real(w));

Let us know if this response raises any additional questions for you,
-Brad

Thanks. You have answered all my original questions and more.

I notice that in some Chapel code, routines are often written like

proc fred(x : real(32)) ...
proc fred(x : real(64))

rather than

proc fred(x : real(?w)) // note my edit to this

I like to write a routine generically, debug it in 32-bit mode and then check accuracy issues using the 64-bit version. Also, my approach means half the lines of code.

Am I strange, or is the generic approach sub-optimal? Thanks. I might have asked a similar question before but I cannot find it. There are some thoughts on this topic in several issues on GitHub, but more as an aside in the context of other things, so I did not want to post something off-topic in those issues.

Hi Damian —

Performance-wise, the generic version and non-generic overloads should perform the same. Offhand, I think the main difference between the two approaches is that Chapel supports implicit conversions to non-generic arguments and does not support them for generic arguments. So if you either do or do not want to support implicit conversions to your routine's arguments, that would be a reason to choose one or the other.
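Here is a small sketch of that distinction. The routines halfConcrete and halfGeneric are hypothetical names made up for this illustration.

```chapel
// Non-generic formal: the type is concrete, so implicit conversions apply.
proc halfConcrete(x: real(64)) do return x / 2;

// Generic formal: the actual must already be some real(w); nothing is converted.
proc halfGeneric(x: real(?w)) do return x / 2;

writeln(halfConcrete(3));      // the int actual is implicitly converted to real(64)
writeln(halfGeneric(3.0));     // fine: 3.0 is already a real(64)
// writeln(halfGeneric(3));    // expected to be an unresolved call: int is not a real(w)
```

So the generic form doubles as a way of insisting that callers pass a real, while the concrete overloads are more forgiving about integer actuals.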

-Brad

Got it. Thanks for the concise reply. So using generic arguments is a way to enforce stricter typing. Good to know. Now I just need to embed that little gem of knowledge into my brain cells forever.

I just noticed a typo in your earlier response (carried over from my original post). The routines should return a real(w), not an int(w):

proc leastNormal(type T: real(32)) param do return 0x1p-126:real(32);
proc leastNormal(type T: real(64)) param do return 0x1p-1022:real(64);
proc leastNormal(x: real(?w)) param do return leastNormal(real(w));

What is the difference between the first two routines below and the third? Both forms return the same result for the same argument.

proc leastFiniteA(type T: real(32)) param do return 0x1p-149:real(32);
proc leastFiniteA(type T: real(64)) param do return 0x1p-1074:real(64);

proc leastFiniteB(type T: real(?w)) param do return (1:uint(w)).transmute(real(w));

assert(leastFiniteA(real(32)) == leastFiniteB(real(32)));
assert(leastFiniteA(real(64)) == leastFiniteB(real(64)));

Thanks.

I could also write these two more succinctly as a single generic routine:

proc leastFiniteA(type T : real(?w)) param do return (1:uint(w)).transmute(T);

Which one is better?