27983, "DanilaFe", "How should de-tupling fields work, if at all?", "2025-10-29T22:21:23Z"
Consider the following program:
proc foo() do return (2,3);
record R {
var (x, (y, z)) = (1, foo());
}
As I write this, the program causes an internal error on main, and I will be making the error user-facing.
Should this compile, and if so, what should the semantics be? Of particular interest to me personally is, should we allow non-tuple literals on the RHS, and if so, how should they work in the presence of users overriding the fields?
Some options:
Option 1: Force tuple literals on the right
In this option, we disallow the program above because foo() is not a tuple literal. However, we do accept the following program:
record R {
var (x, (y, z)) = (1, (2, 3));
}
We treat the above program the same as the following:
record R {
var x = 1, y = 2, z = 3;
}
But then, why do we support this syntax if the multi-decl syntax already exists, and makes it easier to associate variables with their initial values?
Option 2: Desugar verbatim
The verbatim desugaring of the above code, as it's done in Chapel today, is as follows:
var tmp = (1, foo());
var x = tmp(0);
var y = tmp(1)(0);
var z = tmp(1)(1);
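As a quick sanity check of this desugaring for local variables, here is a minimal sketch that declares the same values both ways and compares them. The names tmp, x2, y2, and z2 are hypothetical stand-ins (the real compiler temporary has no user-visible name), and this only illustrates the intended equivalence, not the compiler's actual output.

```chapel
proc foo() do return (2, 3);

// De-tupling declaration, as accepted for ordinary variables.
var (x, (y, z)) = (1, foo());

// Hand-written equivalent; 'tmp' stands in for the compiler temporary.
var tmp = (1, foo());
var x2 = tmp(0);
var y2 = tmp(1)(0);
var z2 = tmp(1)(1);

// Both forms should bind the same three values.
assert(x == x2 && y == y2 && z == z2);
writeln((x, y, z));
```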
Technically, the same desugaring works for fields as well. We could create:
record R {
var xyz = (1, foo());
var x = xyz(0);
var y = xyz(1)(0);
var z = xyz(1)(1);
}
This has some advantages: with the xyz field given a concrete name, we can precisely control whether the initialization expression (1, foo()) runs. If the user overrides xyz, the default will not be executed. Since subsequent fields' default expressions refer to this explicit preceding field, they will correctly receive "downstream" values if the user updates the whole tuple. Despite this, the user can still explicitly override each individual field, though if they write new R(x=1, y=2, z=3), the foo() call will still run, since xyz is initialized with its default.
This does have some surprising semantics (where does the name for xyz come from? is the user okay with doubling the size of the record, since the tuple now exists alongside the variables?). Notably, we can't use ref as in ref x, since records don't support ref fields.
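To make the evaluation behavior of this option concrete, here is a rough sketch that counts how often foo() runs. The fooCalls counter and the specific argument values are my additions for illustration, and I'm assuming the compiler-generated initializer accepts each field (including xyz) as a named argument with its declared default.

```chapel
// Hypothetical counter, used only to observe when the default expression runs.
var fooCalls = 0;
proc foo() { fooCalls += 1; return (2, 3); }

record R {
  var xyz = (1, foo());
  var x = xyz(0);
  var y = xyz(1)(0);
  var z = xyz(1)(1);
}

// All defaults: xyz is computed, so foo() should run here.
var a = new R();
writeln("after new R(): fooCalls = ", fooCalls);

// x, y, z overridden individually: xyz still takes its default,
// so per the description above foo() should run again.
var b = new R(x=10, y=20, z=30);
writeln("after overriding x/y/z: fooCalls = ", fooCalls);

// xyz overridden: the default (1, foo()) should be skipped, and
// x, y, z pick up the "downstream" values from the new tuple.
var c = new R(xyz=(7, (8, 9)));
writeln("after overriding xyz: fooCalls = ", fooCalls);
writeln((c.x, c.y, c.z));
```

The second case is the surprising one: nothing the user wrote mentions foo(), yet it still executes.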
Option 3: No field for xyz, compute it as part of the initializers
We could try to desugar the above program by having an initializer that lazily computes (1, foo()) when any of the fields x/y/z use their default value, and otherwise does not compute it. However, this is a nontrivial transformation (so much so that I haven't written a pure Chapel desugaring). It also complicates the mental model of the fields' semantics (when should the user expect (1, foo()) to be evaluated?).
One way to restrict this is to require that either all of the fields are initialized, or none of the fields are. The effective desugaring is:
record R {
var x, y, z: int;
proc init() { /* compiler generated */
(this.x, (this.y, this.z)) = (1, foo());
}
// note: no defaults
proc init(x: int, y: int, z: int) { /* compiler generated */
this.x = x;
this.y = y;
this.z = z;
}
}
However, this all-or-nothing approach seems a bit odd. Plus, there are implementation difficulties. For one, normally, we generate only one constructor. However, we can't write a single constructor that somehow enforces that the arguments in each group are either all provided or all defaulted. For another, even if we generate explicit constructors as I did in my desugaring, we quickly hit the exponential cliff. For two field groups, the number of required constructors doubles to 4:
record S {
var (x, y) = (1, 2);
var (z, w) = (3, 4);
proc init() { /* .. */ }
proc init(x: int, y: int) { /* .. */ }
proc init(z: int, w: int) { /* .. */ }
proc init(x: int, y: int, z: int, w: int) { /* .. */ }
}
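To illustrate the growth, here is a small sketch that enumerates which groups each generated initializer would accept. Each de-tupled group is either fully supplied or fully defaulted, so k groups give 2^k combinations. The numGroups config constant and the group0/group1 labels are hypothetical and only serve the illustration.

```chapel
// Number of de-tupled field groups; 2 matches record S above.
config const numGroups = 2;

// One initializer per subset of groups that is passed explicitly.
const numInits = 1 << numGroups;

for mask in 0..<numInits {
  var supplied: string;
  // Bit g of mask is set when group g is provided by the caller.
  for g in 0..<numGroups do
    if ((mask >> g) & 1) == 1 then
      supplied += "group" + g:string + " ";
  writeln("init #", mask, " takes: ",
          if supplied.isEmpty() then "(nothing)" else supplied);
}

writeln(numInits, " initializers for ", numGroups, " groups");
```

Running with --numGroups=3 lists 8 combinations, and each additional group doubles the count again.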