Hi Gabriel —
Thanks for the interest, kudos, and questions. I should prefix my comments by saying that I'm not an expert on the Sort module, but am decent on records, consts, and implicit data passing, so will see where I can get you, and maybe we can call in others to help with things I'm shaky on.
I'm going to take your questions out of order, to address what feels to me (and to you, I think) like the elephant in the room. Specifically, I think there is something very wrong with both the first and second overwrites of comparator.key()
and that Chapel isn't treating these as it should, for a few reasons:
So as you can see, even though you got the code to seemingly work as you'd expected/hoped, it's making me nervous that it does, as written. The best case scenario would be that it is reasonable for sort()
to call your .key()
for reasons related to my guesses above, that Chapel should be generating an ambiguity error when two such overloads were defined, that we'd similarly generate an error for trying to define methods on variables rather than types, but that you'd define it on rec
and all would be well. But I'm not optimistic that that's right.
Continuing to delay getting to your actual questions, I thought I'd try to sketch out how I think this pattern could (almost) be written more legally, and came up with this:
use Sort;
proc argsort(Data : [?Dom] ?eltType) {
var n : int = Data.size;
var d : domain(1) = {0..<n}; // this array of indices will be sorted
var idx : [d] int;
forall j in d {
idx[j] = j;
}
record MyComparator {
proc key(a) {
writeln('third version'); return Data[a];
}
}
sort(idx,comparator=new MyComparator());
return idx;
}
var a = [1,1,20,4,-5,2,3,111,12,5,100];
var sort_indices = argsort(a);
writeln(a[sort_indices]);
Specifically, what I'm doing here is defining my own comparator type with its own .key()
method defined as you were, and passing that to Sort.sort()
. I think this is all on the up-and-up.
The reason I say it "almost" implements the pattern that you were shooting for is that I've dropped the comparator argument because I'm not certain what I would do with it if it one were passed in. Would I want it to override this local comparator I've created and get passed in instead? Would I want its logic to be combined into this local comparator record in some way? I decided these were questions that would be best put off until daytime or kicked back to you for consideration and just removed the comparator argument instead.
And then as one last delay, I wanted to note that Arkouda implements argsort and coargsort functionality (where I've spent very little time with their code, and am just recognizing the names), in case that might be useful to you. There has been discussion about spinning their sort routines out into a mason package to make them usable to other users, and some baby steps in that direction, but it hasn't happened yet.
OK, on to your questions:
Q1a: I would say that you are not overwriting defaultComparator
in any way. Specifically, you're accepting it as a default value into your argsort()
routine, and effectively having the argument named comparator
store a const ref
that refers to it. The const
-ness of defaultComparator
and const ref
-ness of comparator
means that you wouldn't be allowed to modify its fields, if it had any. But it doesn't, it's effectively just a way of passing a bundle of methods around.
What you're doing here is effectively extending the methods that its type (DefaultComparator
) supports (illegally, in my opinion, though switching to rec
would make it legal). This is less a question of const
-ness as it is what methods a given type supports. Methods like these are called tertiary methods in Chapel because they are defined in a different module/scope than the type itself. Sometimes it makes people uneasy to think that they can define tertiary methods at all, but if you think of them as being like defining a standalone procedure that accepts a rec
/DefaultComparator
as its first argument, and it just gets some special syntax that lets that first argument be the receiver of the method call, then hopefully it's not as unnerving.
Q1b: In general, for nested procedures like this, any references to outer variables within the procedure itself are passed into the procedure as additional hidden/automatic arguments by the compiler, so the fact that they refer to those outer variables isn't problematic. And since the procedure, by definition, only logically exists while its outer scope does (where I'm ignoring asynchrony for the time being), it makes sense that it can refer to such variables. As an example, if you were to look at the generated code for a case like:
proc foo() {
var x = 42;
proc bar() {
writeln(x);
}
bar();
}
what you'd see is that x
is getting passed by the compiler-generated code into bar()
through the call.
In your case, it's a bit more mysterious, however, since the Sort module is expecting to call a key() method that just takes the one argument. I believe what's going on here is, again, a powerful case of Chapel's point-of-instantiation semantics where we're stamping out the generic sort()
routine as though it were conceptually local to this context, and therefore its calls to .key()
are getting specialized to pass Data
through. And inspecting the generated code seems to suggest this, as I'm seeing the definition of key
to take three arguments, where the third is the array Data
:
/* testit-orig.chpl:13 */
static int64_t key_chpl(DefaultComparator_chpl * this_chpl10,
int64_t a_chpl2,
_array_DefaultRectangularArr_1_int64_t_F_int64_t_int64_\
and the calls within the generated sort routine to be passing these additional arguments through:
call_tmp_chpl44 = key_chpl(comparator_chpl, ret_chpl2, Data_chpl2);
All that said, given my previous misgivings about whether .key()
should be winning at all or allowed to co-exist with a .keyPart()
method, I'd be hard-pressed to say whether this behavior is a bug or a feature in your specific example. This would be another good place to have @mppf, @vass, and/or @lydia check my work.
And then I think I've effectlvely already answered Q2a and Q2b by saying "you're right, that seems really weird and inconsistent. To summarize, I definitely think the ability to call one but not to get an ambiguity error when defining both is a bug, though I'm less certain whether the bug is simply the lack of an ambiguity error or the fact that it's letting you call the method when one is defined to begin with. And I definitely think it's an unintentional bug that we let you define methods on specific variables rather than their types.
OK, let me stop there, and see if I've helped clarify some of your questions, where we may need to rely on others to help with some of the areas where I feel non-confident.
I'll also be curious whether my attempt to rewrite your pattern feels like a step in the right direction and more comprehensible / reasonable, and what your thoughts are as to what should happen if a comparator argument was to be re-added. One possibility might be to create two overloads of this routine, one that takes a comparator and passes it through, the second of which doesn't take one and passes through a local one like the one I created (?).
-Brad