19167, "mppf", "definition of shadowing is inconsistent between variables and functions", "2022-02-03T12:27:56Z"
I have been looking at this code in function resolution as part of porting it over to the new resolver. I have been scratching my head a bit. I am pretty sure that the current behavior is not reasonable. But that might be a bug in function resolution, or maybe it is a sign that we need to adjust the language design.
One thing to note about the current implementation is that it's pretty complicated and it does a traversal of all scopes visible from the call (including use/import), twice, in order to decide if one candidate is more specific than another candidate. I am worried that this contributes to performance problems. It would be much less worrisome if the consideration only needed to go up through the parent scopes (not counting use/import).
However a larger issue is that the behavior seems unnecessarily inconsistent between variables and functions. More details about that are in the next section.
Program to Explore the Current Behavior
Here is a program to explore the current behavior.
To summarize the current situation:
- with UseA_UseUseB in the program below:
  - variables are resolved according to an idea of distance in terms of use/import statements - e.g. A.x is available with fewer hops through use statements than B.x, so A.x is shadowing B.x
  - for function disambiguation, the fact that A.f is available through fewer hops does not seem to matter, and it is ambiguous
- with CUseA_ImportA:
  - for variables, import A does not impact what could be defining x, so it has no effect on what x could refer to
  - for function disambiguation, import A is considered the same as use A and so is considered creating a path to A.f at the same number of hops as the path to C.f.
  - I am pretty sure that private uses, uses with renaming, or use only with an unrelated name will have the same problem for functions as the import here
module A {
  proc f() { writeln("A.f"); }
  var x = "A";
}
module B {
  proc f() { writeln("B.f"); }
  var x = "B";
}
module CUseA {
  public use A;
  proc f() { writeln("C.f"); }
  var x = "C";
}
module UseA_UseB {
  public use A;
  public use B;
}
module UseB {
  public use B;
}
module UseA_UseUseB {
  public use A;
  public use UseB;
}
module CUseA_UseA {
  public use CUseA;
  public use A;
}
module CUseA_ImportA {
  public use CUseA;
  import A;
}
module Program {
  //use UseA_UseB;      // -> ambiguity between A.f and B.f
                        // -> x is multiply defined
  //use UseA_UseUseB;   // -> ambiguity between A.f and B.f
                        // -> x refers to A.x
  //use CUseA;          // -> f refers to C.f
                        // -> x refers to C.x
  //use CUseA_UseA;     // -> ambiguity between A.f and C.f
                        // -> x is multiply defined
  //use CUseA_ImportA;  // -> ambiguity between A.f and C.f
                        // -> x refers to C.x
  proc main() {
    f();
    writeln(x);
  }
}
History and Related Issues
- bde150a476e5c05cd0a8e8038221fd1ac8d6d2aa has some history of the isMoreSpecific code for functions and it definitely predates import and probably predates use only and private uses.
- #14014 brings up a related question about module scoping
- #19160 is tangentially related because it asks if symbols brought in by import should be subject to shadowing at all
What does the spec say on the matter?
https://chapel-lang.org/docs/language/spec/procedures.html#determining-more-specific-functions
For functions X and Y, we have this rule about which is more specific (which is considered after things like formal argument types):
- If X shadows Y, then X is more specific.
- If Y shadows X, then Y is more specific.
However this section does not define shadows in any way.
https://chapel-lang.org/docs/language/spec/modules.html#conflicts
This section describes shadowing in terms of a distance idea:
Because symbols brought into scope by a use or import statement are placed at a scope enclosing where the statement appears, such symbols will be shadowed by other symbols with the same name defined in the scope with the statement. The symbols that are shadowed will only be accessible via Qualified Naming of Module Symbols.
Symbols defined by public use or import statements can impact the scope they are inserted into in different ways (see Public and Private Use Statements and Re-exporting for more information on the public keyword). Symbols that are brought in by a public use for unqualified access are treated as at successive distances relative to how many public use statements were necessary to obtain them. For instance,
...
Symbols brought in directly by a public import are treated as though defined at the scope with the public import for the purpose of determining conflicts (see Re-exporting). This means that if the public use in module B of the previous example was instead replaced with a public import A.x, A’s x would conflict with C.x when resolving the main function’s body.
What should we do about it?
I think that at the very least, the language should have one definition of shadowing that is used for both resolving variables and for resolving functions.
Here are some ideas:
A. Consider the behavior with variables today to be correct and formalize it by describing a distance in number of hops. A symbol shadows another symbol if, in a given scope, it has a smaller distance. Have the new compiler code literally compute this distance and compare distances.
- the module brought in by use and any symbol brought in by import adds 1 to the number of hops
- the contents of a module brought in by a use adds 2 to the number of hops
- an enclosing scope (e.g. going outside of { }) adds 1 to the number of hops
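To make idea A concrete, here is a minimal sketch of the distance comparison. It is not compiler code: the names hopsUsedContents, distanceThroughUses, and shadows are hypothetical, and it assumes that hop counts simply add up along a chain of public use statements, which the bullets above do not spell out.

```chapel
// Hypothetical hop costs, copied from the three bullets of idea A.
param hopsUsedModuleName = 1;  // the module brought in by use
param hopsImportedSymbol = 1;  // any symbol brought in by import
param hopsUsedContents   = 2;  // the contents of a module brought in by use
param hopsEnclosingScope = 1;  // going outside of an enclosing { }

// Assumption: each use statement on the path to a module adds the
// "contents of a used module" cost, and the costs accumulate.
proc distanceThroughUses(numUses: int): int {
  return numUses * hopsUsedContents;
}

// Under idea A, X shadows Y if X has a strictly smaller distance.
proc shadows(distX: int, distY: int): bool {
  return distX < distY;
}

proc main() {
  // use UseA_UseUseB: Program -> UseA_UseUseB -> A is 2 uses,
  // Program -> UseA_UseUseB -> UseB -> B is 3 uses.
  const distA = distanceThroughUses(2);
  const distB = distanceThroughUses(3);
  writeln("A shadows B via UseA_UseUseB: ", shadows(distA, distB));

  // use UseA_UseB: both A and B are reached through 2 uses,
  // so neither shadows the other and the name stays ambiguous.
  const same = distanceThroughUses(2);
  writeln("A shadows B via UseA_UseB: ", shadows(same, same));
}
```

With a single rule like this applied to both variables and functions, the UseA_UseUseB case would resolve f to A.f in the same way that x resolves to A.x today, while the UseA_UseB case would remain an error for both.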
B. Simplify the rules about number of hops:
- If a symbol is defined at an inner scope, or if it is defined in a module but that module use/imports something defining that symbol, then we have shadowing within that module. This can continue to consider 3 scopes: things defined directly in the module; modules used; and contents of modules brought in by use. This is the situation within CUseA in the example program.
- However, something use/importing that module views the symbols defined in it as completely flat. So if you wanted use CUseA to find C.x and C.f (and not find them ambiguous with A.x and A.f) you would have to adjust CUseA to not publicly export A.x and A.f.
- This design has the advantage that when the compiler is deciding if one candidate function is more specific than another due to scoping, it only needs to visit the parent scopes and does not need to consider use/import statements at all. This property should also make it easier for users to predict what will happen with their programs.
- It would make the example at the top of #14014 be an ambiguity error which was the original request in that issue
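As an illustration of the adjustment idea B would ask for, here is a sketch of CUseA rewritten so that it no longer re-exports the contents of A. It relies on use statements being private unless marked public. The rest of the example program is trimmed to what this case needs.

```chapel
module A {
  proc f() { writeln("A.f"); }
  var x = "A";
}
module CUseA {
  // A non-public use: A.f and A.x stay visible inside CUseA
  // but are not re-exported to modules that use CUseA.
  use A;
  proc f() { writeln("C.f"); }
  var x = "C";
}
module Program {
  // Only C.f and C.x are reachable through CUseA, so the flat view
  // of CUseA's public symbols contains no competing f or x.
  use CUseA;
  proc main() {
    f();
    writeln(x);
  }
}
```

Another way to get a similar effect would be to keep the public use but limit it with an except or only clause, so that the conflicting names are not among the re-exported symbols.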