Hi Damian —
No need to reply to these responses quickly for our sake, but just to address your most immediate question in case it helps (now or next week):
> To me, a `use` on a file Include.chpl should behave the same way as direct insertion of that same file. Instead, it behaves differently.
This "behaves differently" is correct and intentional. Chapel's `use` is definitely not an equivalent to `#include` in C/C++ or `\input` in LaTeX (not that you're necessarily suggesting that, but others sometimes think so, and others may read this thread). Nor does it make the public declarations within Include.chpl act as though they were defined within the current scope. Instead, it is as though they are introduced into a scope just outside of the current scope. Let me try to explain why.
I'm fairly certain that at Chapel's outset (and maybe for some number of years thereafter? I can't keep track), `use` behaved more like what you are describing and expecting. But we found that it led to more confusion, errors, and instability than benefits. As an example, imagine I write some code like the following:
```chapel
use Math;  // I want to use sqrt(), so am use-ing Math to get it

// here are some scalar variables:
var a, b, c, d, e, f: real;
var x0, y0: int;
var i1, j1, k1: int;

// here are some array variables:
var x: [1..y0] real = [i in 1..y0] sqrt(i: real);
var A: [1..x0, 1..y0] real;
var Cube: [1..i1, 1..j1, 1..k1] real;

// compute a conjugate gradient...
var B = conjg(A, x);

// ...by defining a conjugate gradient routine:
proc conjg(M: [?D], v: [?vD]) {
  ...
}
```
Now, being very familiar with the Math module, you probably see the pitfalls lurking here and want to scream at my naivete in choosing these identifier names. But imagine someone not terribly familiar with all the symbols introduced by the Math module, who just wanted to use sqrt() and correctly guessed it was there, yet was too lazy or unaware to filter down to just that symbol using `use Math only sqrt;`. They've made the unfortunate choice of naming some of their variables and procedures the same as several symbols in the Math module:
- y0 and j1 conflict with the Bessel functions
- e conflicts with the mathematical constant
- conjg() doesn't conflict with Math's conjg() outright, but instead just adds a new overload, potentially resulting in confusion
In your "use inserts here" model (and Chapel's historical model), this code would result in errors due to having duplicate definitions of 'e', 'y0', and 'j1' within the same scope. The diverging definitions of 'conjg()' wouldn't cause an error outright, but would result in an overload that I probably wasn't intending or aware of.
[Note that this example isn't entirely fictitious. For example, we definitely had (multiple) users and developers who tried to declare symbols named 'e' and ran into surprises and errors due to conflicts with the Math module's definition of 'e'.]
Maybe the above isn't so bad, though? After all, the compiler will yell at me, I'll learn that the Math module defines those symbols, swear that someone took the names `e`, `y0`, and `j1` away from me, rename them grumpily, and move on?
But then, more generally, we started to get concerned about function hijacking and code instability across releases. Adding new procedures or variables to a module like 'Math' might change a program's behavior if the author of the program wasn't aware of those changes and the new symbols started conflicting with their own, or became better matches than their overloads.
As a simple example, maybe I rename `e` above to avoid a conflict, but a later release of Chapel introduces a variable named `f`. Suddenly my code starts breaking and I have to rename another variable? More swearing...
As another example, imagine that a future version of the Math module defined a `conjg()` overload with a similar signature to the one in my code above, yet with a more precise element type, like this:
```chapel
proc conjg(M: [] real, v: [] real) { ... }
```
Moreover, imagine it does something very different from computing the conjugate gradient as mine did. Suddenly, my program would see a better match for the given call, and its behavior would change completely through no fault of my own (well, other than the fact that I relied on `use`, which is inherently subject to surprises since it opens the gate so wide by default).
As a result of both of these concerns, quite some time ago we made `use` statements start inserting their symbols into a "shadow scope" just outside of the use statement's scope. Schematically, if my Chapel code looked like this:
```chapel
{
  var a, b, c;
  {
    use Math;
    var x, y, z;
  }
}
```
the resulting scoping ends up being something like this:
```
{
  a, b, c are defined here
  {
    Math's sqrt, y0, j1, e, conjg, and everything else it defines are here
    {
      x, y, z are defined here
    }
  }
}
```
This avoids the conflicts in my original code: my y0, j1, and e are now defined at a different scope from Math's, so the fact that I was blissfully unaware of Math's versions no longer matters; and by preferring "more local" routines, Chapel prevents the possibility of a new overload of Math.conjg() accidentally (or maliciously) hijacking mine.
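To see the shadow scope in action, here's a small sketch of my own (not from the code above) that would have been a duplicate-definition error under the old "insert here" model, but compiles fine today:

```chapel
use Math;

// Under the shadow-scope rule, this 'e' lives in a more local scope than
// Math's 'e', so it shadows Math's definition rather than conflicting with it.
var e = 42;

writeln(e);       // refers to my 'e', the more local definition
writeln(Math.e);  // Math's constant is still reachable by qualified name
```

Note that the qualified name `Math.e` still works, so shadowing doesn't cut you off from the module's symbol.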
Now, as Michael said, if you really want that "inject symbol at this scope" behavior — and/or you want a safer alternative to 'use' to begin with — then 'import' is your friend. Specifically:
- `import` does not automatically bring in any symbols; it only brings in the ones you name
- because of this, it is also considered to bring the symbols into the current scope rather than using a shadow scope (because it's self-evident in the code precisely which symbols are being made available)
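As a quick aside of my own: `import` has a couple of forms, and only the one naming specific symbols enables unqualified access. A sketch:

```chapel
{
  import Math;               // brings in only the module name
  writeln(Math.sqrt(4.0));   // qualified access works
  // writeln(sqrt(4.0));     // would not find Math's sqrt unqualified
}
{
  import Math.sqrt;          // names sqrt specifically (all its overloads)
  writeln(sqrt(4.0));        // unqualified access now works
}
```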
Reconsidering my scoping example with import, if I wrote:
```chapel
{
  var a, b, c;
  {
    import Math.sqrt;
    var x, y, z;
  }
}
```
I would get the following scoping:
```
{
  a, b, c are defined here
  {
    Math's sqrt is made available here
    x, y, and z are also defined here
  }
}
```
I think all of us would say today that if your goal is just to sketch out code quickly and sloppily, `use` is just fine and very convenient. But for anything considered to be production code, you really want to be using `import` for all of its precision benefits and lack of surprises.
Briefly, `import` made me think that the way you'd want to write your example would be:
```chapel
import Math.sqrt;
proc cmplx(...) ...
proc sqrt(...) ...
proc main ...
```
but as Michael suggested, that doesn't work because:
- Math already defines sqrt() overloads taking complex arguments
- so now I have two routines with conflicting signatures at the same scope
- so now the call doesn't know which one to dispatch to
Perhaps there should be a way to say "only import the version of sqrt() that accepts real values," but we don't have that sort of control today: importing a symbol brings in all overloads of that symbol.
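One workaround I can imagine (a sketch of my own; `sqrtReal` is a hypothetical name, not an existing API) is to skip importing `sqrt` unqualified altogether and instead define a thin wrapper that only accepts the type you want:

```chapel
import Math;

// hypothetical wrapper: exposes only the real-valued behavior of
// Math.sqrt, under a name that can't clash with my own sqrt overloads
proc sqrtReal(x: real): real {
  return Math.sqrt(x);
}

writeln(sqrtReal(4.0));  // calls Math's sqrt via the wrapper
```

This sidesteps the overload-set collision because no `sqrt` symbol is ever brought into my scope by name.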
> I rip code out of a big file and put it into another file for subsequent inclusion
In many cases, this should probably "just work". Cases where it doesn't—as illustrated by your example here—are ones where a set of overloads of the same routine are split between multiple modules. And again, this is by intention and relates to the "overload set" concept that Michael mentioned.
I don't want to go much into that concept at this point since this response has gone a bit long already, but the concept is related to the two definitions of conjg() above as "complex conjugate" vs. "conjugate gradient". We don't want users to accidentally end up with overloads of a single name unless those overloads were meant to be aware of one another. Most often, this would be done by:
- defining the overloads in the same module
- or in modules that are all similarly used/imported (so they're "equidistant")
- or one module defines some overloads while publicly importing (re-exporting) others, causing them all to virtually be defined at the same scope
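As a sketch of that last bullet (module names A and B are my own placeholders), one module can re-export another's overloads so they all appear to live at the same scope:

```chapel
module A {
  proc greet(x: int) { writeln("int: ", x); }
}

module B {
  public import A.greet;  // re-export A's overload at B's scope
  proc greet(x: real) { writeln("real: ", x); }
}

module Main {
  import B.greet;  // sees both overloads as one overload set
  proc main() {
    greet(1);    // resolves to A's overload, re-exported by B
    greet(1.0);  // resolves to B's own overload
  }
}
```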
Again, if you find cases that don't "just work" as Michael and I are hoping, we'd like to be aware of that and to understand what the motivation for those use cases is to see how we/Chapel can help.
One other tool that may be useful here (though I'm skeptical): Chapel has an `include` statement that can be used to bring in a file as a sub-module. This still isn't the same as a C/C++-style `#include`, but it can be a useful way to refactor code into distinct files for various reasons while still making it accessible, albeit through a sub-module.
Also, I have proposed extending the `include` statement to support C/C++-style `#include`, and while that proposal hasn't generated enough support to implement yet, if real users like yourself were interested in it, that would increase the chances of it happening.
> Also, you keep using the word *symbol* to mean the name of the overloaded routine or proc. To me, the symbol name (for purposes of resolution) should be the procedure name and its signature.
Fair enough; for my part, I'm admittedly sloppier with terminology than I should be much of the time.
-Brad