External Issue: Floating Point Constants

18599, "damianmoz", "Floating Point Constants", "2021-10-20T19:32:43Z"

Various generic programming issues (which are largely unobtrusive currently) will arise as Chapel has to adapt to handling the real(16) or real(128) types found in some of the latest CPUs. Indeed, one only has to look at what you have to do in C/C++ when trying to program generically with floating point constants for float, double and long double.
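To make the C/C++ pain concrete, here is a minimal sketch of one common workaround. The names sqrt2, sqrt2_via_double and unsuffixed_literal_loses_bits are hypothetical, made up for this illustration, and the sketch assumes C++14 or later. The constant is written once with an L suffix and narrowed to each type; the second definition shows what goes wrong when the suffix is forgotten.

```cpp
#include <cassert>
#include <cfloat>

// Hypothetical helper: one constant, usable as float, double or long double.
// The L suffix makes the literal a long double, so it is rounded once to the
// widest standard type and then narrowed by the conversion to T.
template <typename T>
constexpr T sqrt2 = static_cast<T>(
    1.41421356237309504880168872420969807856967187537694807317667973799L);

// The trap: an unsuffixed literal has type double, so widening it afterwards
// cannot restore the digits that were already rounded away.
constexpr long double sqrt2_via_double =
    1.41421356237309504880168872420969807856967187537694807317667973799;

// Returns true when the unsuffixed literal lost bits relative to the
// L-suffixed one. This can only happen where long double is wider than double.
inline bool unsuffixed_literal_loses_bits() {
    // For this particular constant, narrowing the long double value gives
    // the same double as the plain double literal.
    assert(sqrt2<double> == static_cast<double>(sqrt2_via_double));
    return sqrt2_via_double != sqrt2<long double>;
}
```

On a platform where long double is wider than double (x86-64 Linux, for instance), unsuffixed_literal_loses_bits() should return true; where the two types share a format it returns false. Either way the programmer has to think about suffixes, which is exactly the burden an "infinite precision until used" rule would remove.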

The only way around this is to assume that Chapel effectively defines a floating point constant as having infinite precision until it is used, or at least the maximum precision handled by the hardware, or some compiler-flag-restricted maximum precision. The latter still allows us to mandate that a floating point constant is 64 bits wide so existing programs are not broken. The programmer can still specify the size occupied by that constant from the start of execution by a cast, as in

const SQRT_2_0 = 1.41421356237309504880168872420969807856967187537694807317667973799:real(w);

The question then arises of how you specify the precision of a symbolic constant. These are things like INFINITY, e and pi (both of which drive me crazy because they conflict with names my programs use), and others.

This will also mean that the compiler needs a flag which allows a programmer to specify the rounding mode used at compile time to evaluate an expression like the above or the one below:

param t = 1 / 3.0:real(32);

Similarly, is the expression on the right-hand side of this,

const t = 1 / 3.0:real(32);

evaluated at compile time and determined by that previously mentioned flag, or by the rounding flag in effect at run-time, or is it determined by a separate compiler flag?
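The same compile-time versus run-time split already exists in C++, and a small sketch shows why the answer matters. The names third_folded and third_rounded_down are hypothetical. The sketch assumes an IEEE 754 platform where FE_DOWNWARD is supported and where the compiler folds constant expressions using round-to-nearest, which is the usual behaviour but is not something I would call guaranteed. The volatile qualifiers are there to keep the division away from the constant folder and pinned between the two rounding-mode changes.

```cpp
#include <cassert>
#include <cfenv>

// Folded by the compiler at translation time (assumed: round-to-nearest).
constexpr float third_folded = 1.0f / 3.0f;

// Evaluates 1/3 at run time with the dynamic rounding mode set to downward.
inline float third_rounded_down() {
    volatile float one = 1.0f;    // volatile: operands are loaded at run time
    volatile float three = 3.0f;
    const int saved = std::fegetround();
    const int rc = std::fesetround(FE_DOWNWARD);
    assert(rc == 0);              // non-zero means the mode is unsupported
    (void)rc;
    volatile float q = one / three;  // division performed under FE_DOWNWARD
    std::fesetround(saved);          // restore the caller's rounding mode
    return q;
}
```

The nearest float to 1/3 lies slightly above the true value, so the folded constant and the run-time result rounded downward are adjacent but different floats. The same source expression therefore yields two different values depending on when it is evaluated, which is the ambiguity the const example above would inherit.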

Let us see where this discussion goes. I am sure this discussion has implications for the optimiser.

This discussion may lead to the need to define a parameterised type with, say,

type zuse(w) = uint(w);

in Chapel. But it may not.