Integer Promotion Weirdness

Consider the following:

The compile-time routine bias() returns an unsigned integer. It is passed a real(?w) type.

        const z = 25:uint(32);
        const y = z + 1;
        const b = bias(real(32));
        const c = 2 * b + 1;
        const d = 2 * bias(real(32)) + 1;

        writeln(z.type:string);
        writeln(y.type:string);
        writeln(b.type:string);
        writeln(c.type:string);
        writeln(d.type:string);

Look at z and y. Adding 1 to an unsigned integer results in an unsigned integer. Expected.

Saving the result of bias() in b and then doubling it and adding 1 yields an unsigned integer. Expected.

But doing that computation in a single expression yields a signed integer of double the size of the original. How does this happen?

I thought maybe somebody was trying to be smarter than they should be so I changed those const identifiers to param identifiers. But no, it is worse. It is a mess. I am totally lost.

Hi Damian —

I suspect we'll need you to share a definition of bias() in order to help with this. Taking a guess at how you may have defined it with Chapel 2.0, I'm seeing only consistent/expected results (see link below), so suspect I've guessed at what you're doing incorrectly.

Run – Attempt This Online

-Brad

Sorry. I figured it did not matter because I am doing exactly the same computations.

proc bias(type T : real(32)) param do return 127:uint(32);
proc bias(type T : real(64)) param do return 1023:uint(64);
proc bias(x : real(?w)) param do return bias(real(w));

This came out of the earlier post "Generic Floating Point Definition Style".

Yes. It is Chapel 2.0.

Hi Damian -

I agree that this is an odd case. I can explain why it is happening.

I found that passing both of the flags --warn-param-implicit-numeric-conversions --warn-implicit-numeric-conversions does give a warning in this case (although I'm not sure I can justify today why both of these flags are required to warn here).

Anyway, at the root of the issue is that Chapel's implicit conversion rules are intended to make literals more flexible. Since literals are param (and params should behave like literals), that flexibility also applies here to the expression bias(real(32)), since that returns a param.

In particular, if you have someNonParamValue + someParamValue then it will try to prefer the type of someNonParamValue. E.g., for var myInt32: int(32); myInt32 + 1, the type of 1 is int(64) but the result of that expression will be int(32) because the param value 1 can be represented as int(32) and the type of the param is considered less important here than the type of the non-param.

In contrast, when we have someParamValue + someOtherParamValue the two expressions being added are considered equally for the result type. E.g. with param myParamInt32: int(32) = 0; param myParamInt64: int(64) = 0; myParamInt32 + myParamInt64; the last expression will have type int(64). That's the same as a non-param case like var myInt32: int(32); var myInt64: int(64); myInt32 + myInt64;. In both cases, we choose the larger type. (We have to do that in the non-param case, and we do it in the param case to be consistent with the non-param case & keep the rules sensible).
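
To make the two cases concrete, here is a minimal sketch that puts them next to the original example. It reuses the bias() overloads posted earlier in this thread; the type names in the comments are the results described in this thread for Chapel 2.0, not output I am guaranteeing for other versions.

```chapel
// bias() as defined earlier in this thread
proc bias(type T : real(32)) param do return 127:uint(32);
proc bias(type T : real(64)) param do return 1023:uint(64);

var myInt32: int(32);
param myParamInt32: int(32) = 0;
param myParamInt64: int(64) = 0;

// non-param + param literal: the non-param's type is preferred
writeln((myInt32 + 1).type:string);                  // int(32), as described above

// param + param: both types count equally, so the larger one is chosen
writeln((myParamInt32 + myParamInt64).type:string);  // int(64), as described above

// Original case 1: b is a const (not a param), so the literals adapt to it
const b = bias(real(32));
writeln((2 * b + 1).type:string);                    // uint(32), as reported

// Original case 2: bias(real(32)) is itself a param, so the int(64)
// literals 2 and 1 are weighed equally against it
writeln((2 * bias(real(32)) + 1).type:string);       // int(64), as reported
```

The only difference between the last two lines is whether the uint(32) operand is a param, which is what decides whether the literals adapt to it or compete with it.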

What to do about it? All I can think of to improve the situation at this point is to potentially bump up the priority of opting out of implicit numeric conversions and/or improving the warning for implicit numeric conversions.

Thanks. I still do not really understand the rules. And it breaks lots of my old code.

Damian —

Quick question: Do your uses of bias() require it to be a param? If not, you could just remove the param return intent and get the behavior you want, I think.

-Brad

The bias() is known at compile time, so it should reflect reality and be a param.

Besides, lots of things (for which I want no run-time overhead) depend on bias(), so it has to be a param.

What are the rules that yield an int(64) from an expression containing only identifiers and literals that I would consider int(32)? Or is it that an integral literal is given the type int(64) irrespective of context? Is it also the case that a non-imaginary floating-point literal is given the type real(64)?

Yes, it's that the literal 1 is an int(64) regardless of context; it can implicitly convert into int(32), but that doesn't help with something like 1:int(32) + 2 (which results in int(64)). Of course, you can cast the other value or the result of the whole expression. And, this is very different from C, where integer literals are typically 32-bit ints.

Similarly, 1.25 has type real(64) regardless of context.

It's just that these literals are param, and params have more flexibility to convert into smaller numbers (real(32), int(8), etc) than regular variables.

It's important that 1 have a type so that we (and the compiler) know what var x = 1 means (i.e. that x will be an int(64)).
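
A small sketch of the points above, assuming Chapel 2.0 behavior as described in this thread (the variable names are made up for illustration):

```chapel
// Bare literals have a fixed type regardless of context
var x = 1;        // x is int(64)
var r = 1.25;     // r is real(64)
writeln(x.type:string, " ", r.type:string);

// A param literal can still narrow when the declaration supplies the type
var small: int(8) = 1;
writeln(small.type:string);

// Mixing an explicitly sized param with a bare literal widens the result
writeln((1:int(32) + 2).type:string);              // int(64), as noted above

// Casting the other operand, or the whole expression, keeps it at 32 bits
writeln((1:int(32) + 2:int(32)).type:string);
writeln(((1:int(32) + 2):int(32)).type:string);
```

The same two casts apply to the original expression: either cast the literals in 2 * bias(real(32)) + 1 to uint(32), or cast the whole expression.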

Some ideas dragged out of my past rewritten/translated in a Chapel context. My memory may be failing or I could have mistranslated or even have lost my marbles.

If the programmer has not thought enough about the structure of a mixed mode
expression, the last thing a compiler should do is provide a (likely) broken
way out of their laziness.

Assumptions of mixed mode arithmetic:

  • Rules/Definitions/Mandates are to be orthogonal and easy to remember
    -- they should be kept to a minimum
  • The compiler can question the programmer's intelligence at any time
    -- with lots of warnings (but that is all)
  • It is not the compiler's role to be a numerical analyst
    -- there are tools to help with that, e.g. Herbie
  • The compiler will have the following compiler switches:
    ---- width of integral type of last-resort, e.g. --itolr-width=32
    ---- width of floating-point type of last-resort, e.g. --ftolr-width=64
    -- these can be mandated within code by a pragma(t)
    -- the active type is that at the compiler command line or within user code
    ---- when the active type is within user code, a command line value is an error
    -- if the active last-resort type != that within an include file => ERROR

There are the concepts of:

  • explicit and implicit types
    -- the width of an implicit type is NEVER known
    ---- this will handle larger and larger reals and integral types
  • a raw expression is one which
    • is made up of a mix of identifiers and literals
    • has no parentheses (i.e. precedence is determined by operators only)
  • a label, name and identifier are the same (in this context)

Definitions/Mandates

  • a param, const or var has an explicit type
  • a proc which returns a value has an explicit type
    ---- even if that type is given to it (implicitly) at the RETURN statement
  • a literal has an implicit type
    ---- real(w) if it contains a binary or decimal point or exponent
    ---- int(w) if it contains neither binary nor decimal point nor exponent
    ---- w is an unknown quantity
    ---- a literal NEVER has an explicit type
  • an anonymous param is a literal which has been coerced to an explicit type
    ---- e.g. 1.2345678987654321:real(32)
    ---- it is treated as a param (which has an explicit type)
  • an explicit real(p) type dominates an explicit real(q) if p > q
  • an explicit int(p) type dominates an explicit int(q) if p > q
  • an explicit uint(p) type dominates an explicit uint(q) if p > q
  • an explicit real(f) type dominates an explicit int(g) for any f and g
  • an explicit int(f) type dominates an explicit uint(g) for any f and g
  • an explicit type dominates an implicit type irrespective of bit-width
  • a thing is a literal|param|const|var|proc (the last four have a label)

There is only one rule:

The type of the result of an expression of things of
different numeric types is the dominant numeric type

This has a simplification:

The type of the result of an expression of things of
the same numeric type T is the numeric type T
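
As a rough sketch only, the dominance mandates for explicit types could be written down in Chapel itself along these lines. The names kindRank and dominant are hypothetical, this models the proposal above rather than Chapel's actual conversion rules, and it ignores literals (implicit types) entirely.

```chapel
// Hypothetical helper: order the numeric kinds so that real dominates int
// and int dominates uint, irrespective of width (as in the mandates above).
proc kindRank(type T) param {
  if isRealType(T) then return 3;
  else if isIntType(T) then return 2;
  else if isUintType(T) then return 1;
  else compilerError("kindRank: type not covered by this sketch");
}

// Hypothetical helper: the dominant of two explicit numeric types.
// Different kinds: the higher-ranked kind wins. Same kind: the wider wins.
proc dominant(type A, type B) type {
  if kindRank(A) > kindRank(B) then return A;
  else if kindRank(A) < kindRank(B) then return B;
  else if numBits(A) >= numBits(B) then return A;
  else return B;
}

writeln(dominant(uint(32), uint(64)):string);  // same kind: wider type
writeln(dominant(int(32), uint(64)):string);   // int dominates uint: int(32)
writeln(dominant(real(32), int(64)):string);   // real dominates int: real(32)
```

The last two lines show the deliberately extreme consequence of the proposal: a narrower type can dominate a wider one when the kinds differ.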

Note:
-- the compiler is free to complain loudly if it objects to the above
-- the compiler cannot produce code that over-rides any of the above

Note that in the event that a raw expression is assigned to a pre-typed identifier, the type of that identifier is NOT the dominant type of the expression.

In evaluating a raw expression (no parentheses), the dominant type may change left to right throughout an expression:

  • an identifier is coerced to the explicit type dominant at the point
    in the expression where it occurs
  • literals take (or assume or are coerced to) the dominant explicit type
    at the point of their appearance in the expression evaluation. It is a
    compile time error if the dominant explicit type (at some point in the
    expression) is integral where there appears a floating point literal,
    i.e. something which has an implicit type.

Should an (un-typed/un-coerced) expression be made up of

a) integral literals only, it is evaluated as if
- t.b.a.
b) floating point literals only, it is evaluated as if
- t.b.a.
c) a mix of floating point and integral literals, it is evaluated as if
- t.b.a.

t.b.a. = o.t.i. (open to interpretation) = o.f.d. (open for discussion)

There is a mandate that in the extreme says that a real(16) dominates an int(128). Anybody who exploits that, or appears to do so, is not very smart. In this event, the compiler should be complaining bitterly. Then again, anybody using a real(16) will by definition be paying a lot of attention to accuracy so it is impossible that such a problem will arise in practice. If any confusion exists, then attention is drawn to the second sentence of this paragraph.

My 2c.

The definitions/mandates have implications for handling infinity and the various NaNs. But nothing dramatic.

There are deliberate contentious issues in some of my words above but that is more to stimulate discussion rather than to be controversial.

Sorry Damian, I've been meaning to catch back up on this discussion but
drew the short straw on being in charge of testing and we had a lot of
noise over the weekend. Still a bit underwater myself but hoping to get
back on top of things by the end of the week.

Thanks,
Lydia

No rush. I noticed the problem in the performance testing I was doing for our ChapelCon 24 and I simply put some words to paper (in a digital/virtual sense) while they were fresh in my mind.