21043, "damianmoz", "Compiler Built-in Mathematical routines - Long Term - Not Urgent", "2022-11-16T02:18:46Z"
This issue is raised not to create more work in the short term, but so that the questions involved are considered in decisions made going forward.
Modern C compilers try to treat several fundamental mathematical routines as 'built-in'. These routines are listed below by functionality, each followed by the name of its 64-bit (double) version in the draft C23 standard:
FMA - fused multiply and add (fma)
ABS - absolute value of a floating point number (fabs)
square root (sqrt)
truncate towards zero (trunc)
round to nearest with ties away from zero (round)
round to nearest with ties to even (roundeven)
round towards positive infinity (ceil)
round towards negative infinity (floor)
round according to the current rounding direction (rint)
minimum of two floating point numbers of one or more flavours (fmin)
maximum of two floating point numbers of one or more flavours (fmax)
transfer the sign of one floating point number to another (copysign)
get the sign bit of a floating point number (signbit)
These compilers implement such routines without a subroutine call, using either a special primitive, as would likely be the case with ABS and FMA, or the far simpler expedient of a header file containing an inline C routine with (hopefully minimalist) embedded assembler. The latter is really feasible only with more recent versions of the C language standard.
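As a rough illustration of the header-file approach, the sketch below shows what such inline routines might look like for ABS, copysign and signbit. It is not how any particular compiler or C library does it: it uses portable bit manipulation rather than embedded assembler, it assumes that `double` is IEEE 754 binary64, and every `sketch_` name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: double is IEEE 754 binary64, with the sign in bit 63. */
#define SKETCH_SIGN_MASK UINT64_C(0x8000000000000000)

/* Reinterpret the bits of a double as an integer (well-defined via memcpy). */
static inline uint64_t sketch_to_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Reinterpret an integer bit pattern as a double. */
static inline double sketch_from_bits(uint64_t u)
{
    double x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* ABS: clear the sign bit. */
static inline double sketch_fabs(double x)
{
    return sketch_from_bits(sketch_to_bits(x) & ~SKETCH_SIGN_MASK);
}

/* copysign: magnitude of x combined with the sign bit of y. */
static inline double sketch_copysign(double x, double y)
{
    return sketch_from_bits((sketch_to_bits(x) & ~SKETCH_SIGN_MASK) |
                            (sketch_to_bits(y) & SKETCH_SIGN_MASK));
}

/* signbit: 1 if the sign bit is set, else 0. */
static inline int sketch_signbit(double x)
{
    return (int)(sketch_to_bits(x) >> 63);
}
```

Because the routines are `static inline` and would live in a header, an optimizing compiler can expand them at the call site, and the `memcpy` calls typically reduce to register moves.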
There are other routines that arguably could also be in that list:
split a floating point number into an exponent and a signed factor
ramp function (fdim) or some other sort of Heaviside function
scale a floating point number by the radix raised to an integral power
round to nearest with ties to odd
inverse square root (rsqrt)
other flavours of the minimum of two floating point numbers
other flavours of the maximum of two floating point numbers
One flavour of the first of these is the C routine frexp, a routine that in the opinion of some does not fit modern needs, not least because it reflects the floating point numbers of the 1970s! The functionality of the last three is recommended by the latest IEEE 754 standard and appears in drafts of the next C standard.
That supplementary list is not exhaustive and deliberately does not include the routines that work with floating point exceptions and other aspects of the floating point environment. They are a whole new ball game, especially in the context of LLVM.
Long term, should Chapel simply leverage what the C standard library provides, which is dictated by what is standardized in C17 or C23, or should it exploit its own more powerful (and arguably simpler) features and handle built-ins itself? The quality of the routine you get from a C library is sometimes sub-optimal, and it would be good to avoid that. For example, the glibc version of the scaling routine noted above is arguably nearly 3 times slower than it needs to be.
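To make the scaling case concrete, the sketch below shows one way an inlinable routine could scale by a power of two when the exponent is known to be in the normal range. It is an illustration only, not the glibc method and not a measured performance claim. It assumes IEEE 754 binary64 doubles and -1022 <= n <= 1023, and the name `sketch_scale_pow2` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: compute x * 2^n for -1022 <= n <= 1023.
 * Assumes IEEE 754 binary64. 2^n is built directly from its bit pattern
 * (biased exponent n + 1023, zero mantissa), so the only rounding is
 * the final multiplication. Values of n outside that range are NOT handled. */
static inline double sketch_scale_pow2(double x, int n)
{
    uint64_t bits = (uint64_t)(n + 1023) << 52;
    double pow2;
    memcpy(&pow2, &bits, sizeof pow2);
    return x * pow2;
}
```

A complete routine would also need to handle exponents outside the normal range, which is where most of the complexity of a general scaling routine lies.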
There is at least one bigger issue here. Chapel is yet to address fused multiply-add operations, something that, in my humble opinion, only the Rust language has done rigorously, consistently and thoroughly. So that needs to be considered here. Some ideas on this are discussed in #11335.
Food for thought, and for discussion. I am not sure whether this needs multiple issues. Its content will overlap (to some extent) with other issues, but the focus here is how to provide the aforementioned functionality such that any subroutine call overhead is avoided and optimal performance is achieved (at the expense of code).