I was wondering if Chapel supports self-registering types like C++. Something like: Factory with Self-Registering Types. I see applications for this in programs where different numerical libraries or functionalities could be swapped out by specifying a string or ID to the main solver class.
I've attempted a derivative of this in Chapel, but I get stuck trying to declare the static variables used to register classes at compile time. Without access to this for passing the class type to the factory, I tried passing the name of the derived class directly from that derived class, but that resulted in the error "recursive type construction is not yet handled".
I do see that Chapel supports Reflection, but it seems to be targeted at accessing fields rather than class names themselves.
Thanks for your question. This is going to be only a very cursory answer since it's the weekend—if it's not helpful, we can have someone on the team take a closer look next week. Specifically, I'm not familiar with the 'factory with self-registering types' work and don't have the time to read about it today, so am focusing in on the bit about names of classes in case that's helpful in the meantime.
The first thing I wanted to check on is to see whether you're aware that the type of any expression can be queried in Chapel using .type and that a type can be converted to a string by casting (e.g., expr.type: string). That said, for classes, the .type query will only know about the class's static type, not its dynamic type, which it sounds like you may need here. So, this example prints owned C (the parent class) for the third declaration even though its dynamic type is owned D (the child class).
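A minimal sketch of the kind of example described above, for readers who don't have the original snippet at hand. The class names C and D follow the text; the variable names are assumptions made for illustration.

```chapel
// Parent and child classes, named as in the discussion above.
class C { }
class D : C { }

var c = new C();             // static type: owned C
var d = new D();             // static type: owned D
var cd: owned C = new D();   // static type: owned C, dynamic type: owned D

writeln(c.type: string);     // owned C
writeln(d.type: string);     // owned D
writeln(cd.type: string);    // owned C (the static type, not the dynamic one)
```

The third declaration is the interesting case: the query is resolved at compile time from the declared type, so the child class never shows up.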
Issue #16916 requests an execution-time routine to get a class's dynamic type as a string, and it also provides a workaround that offers such a capability for the time being. I'm finding this morning that the workaround generates warnings in the current compiler, seemingly because the primitive it uses relies on a deprecated feature (c_string) and needs updating (to use c_ptr(c_char) instead), which suggests we don't use the primitive much ourselves. But ignoring the warnings, it produces the owned D result for the case above (example on ATO).
Hopefully one of these might be helpful. If it's not, or if having to rely on the strings at all feels like too much of a workaround, I wanted to note that I think we consider the Reflection module to be fairly minimal at present, providing what we, and users, have needed to date; and that we're very open to expanding it (or the language proper) to cover a greater variety of queries. So please feel encouraged to open issues proposing routines/patterns/queries that would be helpful for you to have the language or libraries support.
It's still considered an "unstable" feature, but Chapel has support for first-class procedures. You could make a global map of strings to your factory functions and populate this map in module initialization code (module initialization is performed at program start-up time). See: First-class Procedures in Chapel — Chapel Documentation 2.2
use Map;
class Archive {
  proc type extension() do return "illegal";
  proc extension() do return this.type.extension();
}
var archiveFactoryFuncsTable : map(string, proc() : shared Archive);
// ----------------------------------------------------------------------------
class BzipArchive : Archive {
  override proc type extension() do return "bzip";
  override proc extension() do return this.type.extension();
}
proc makeBzipArchive() : shared Archive do return new shared BzipArchive();
archiveFactoryFuncsTable.add((BzipArchive).extension(), makeBzipArchive);
// ----------------------------------------------------------------------------
class GzipArchive : Archive {
  override proc type extension() do return "gzip";
  override proc extension() do return this.type.extension();
}
proc makeGzipArchive() : shared Archive do return new shared GzipArchive();
archiveFactoryFuncsTable.add((GzipArchive).extension(), makeGzipArchive);
// ----------------------------------------------------------------------------
class ZipArchive : Archive {
  override proc type extension() do return "zip";
  override proc extension() do return this.type.extension();
}
proc makeZipArchive() : shared Archive do return new shared ZipArchive();
archiveFactoryFuncsTable.add((ZipArchive).extension(), makeZipArchive);
// ----------------------------------------------------------------------------
proc makeArchive(extension : string) : shared Archive {
  return try! archiveFactoryFuncsTable[extension]();
}
writeln(makeArchive("bzip").extension());
writeln(makeArchive("gzip").extension());
writeln(makeArchive("zip").extension());
@jmag722 Even if the Reflection module supported looking up a class by its name, the name would have to be a param. If your application fits with this constraint, we can discuss this further.
Otherwise the factory registry needs to contain values. Storing first-class functions, as Andy suggests, is a good direction. Another route is to store instances of factory classes. Here is a simple sketch:
module FactoryRegister {
  public class FactoryPrototype {
    proc create(arg: int): owned RootClass { halt("pure virtual"); }
  }
  private use Map;
  private var registry: map(string, owned FactoryPrototype);
  public proc register(name: string, in factory: owned FactoryPrototype) {
    assert(registry.add(name, factory));
  }
  /* invokes the factory function for 'name' */
  proc create(name: string, arg: int) {
    return try! registry[name].create(arg);
  }
}
module Lib1 {
  private import FactoryRegister;
  public class LibClass1 {
    proc type create(arg: int) {
      writeln("LibClass1.create ", arg);
      return new LibClass1();
    }
  }
  /*private*/ class F1: FactoryRegister.FactoryPrototype {
    override proc create(arg: int) {
      return LibClass1.create(arg): owned RootClass;
    }
  }
  // executed at module initialization time
  FactoryRegister.register("LibClass1", new F1());
}
// a clone of Lib1, just to show that multiple libs are OK
module Lib2 {
  private import FactoryRegister;
  public class LibClass2 {
    proc type create(arg: int) {
      writeln("LibClass2.create ", arg);
      return new LibClass2();
    }
  }
  /*private*/ class F2: FactoryRegister.FactoryPrototype {
    override proc create(arg: int) {
      return LibClass2.create(arg): owned RootClass;
    }
  }
  // executed at module initialization time
  FactoryRegister.register("LibClass2", new F2());
}
module App {
  import FactoryRegister;
  proc main {
    writeln(FactoryRegister.create("LibClass1", 10));
    writeln(FactoryRegister.create("LibClass1", 20));
    writeln(FactoryRegister.create("LibClass2", 30));
    writeln(FactoryRegister.create("LibClass2", 40));
  }
}
It's good to know that .type will only return the static type of the object (for now). Ideally the string ID used to register a class would not have to be directly tied to the class name itself, but could be specified as a class data member. Also, the actual string name argument would be passed at runtime, so I don't know if it could be a param.
And thank you for these very detailed examples. It's also good to know about this proc type functionality. I assume this is like a static member function in C++ (or a class method in Python)? The method of storing functions in the map instead with the generic makeArchiveFactoryFunc does look clean. But I like the route of storing instances of factory classes because it looks like an easier way to pass additional arguments to the class. Speaking of that, is there a way to use the create function to pass arguments to the constructor? Something like
module Lib1 {
  private import FactoryRegister;
  public class LibClass1 {
    var foo: real;
    proc init(foo: real) {
      this.foo = foo;
    }
    proc type create(arg: int, foo: real) {
      writeln("LibClass1.create ", arg);
      return new LibClass1(foo);
    }
  }
  class F1: FactoryRegister.FactoryPrototype {
    override proc create(arg: int, foo: real = 0.5) {
      return LibClass1.create(arg, foo): owned RootClass;
    }
  }
  // executed at module initialization time
  FactoryRegister.register("LibClass1", new F1());
}
// Lib2 here ...
module App {
  import FactoryRegister;
  proc main {
    var lib1 = FactoryRegister.create("LibClass1", 10);
    writeln(lib1.foo); // expecting 0.5
    var lib1_2 = FactoryRegister.create("LibClass1", 10, 200.0);
    writeln(lib1_2.foo); // expecting 200.0
    writeln(FactoryRegister.create("LibClass1", 20));
    writeln(FactoryRegister.create("LibClass2", 30));
    writeln(FactoryRegister.create("LibClass2", 40));
  }
}
I get a compile-time error because it's looking for foo in RootClass rather than LibClass1.
error: unresolved call 'owned RootClass.foo'
note: because no functions named foo found in scope
You are correct, proc type is invocable on a type, not an instance. BTW you can define such procs on any Chapel type, including classes, records, numeric, ...
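As a small illustration of the point above, here is a sketch of proc type methods on a record and on a class. The names Point, Kernel, origin, and kind are made up for this example and are not from the thread.

```chapel
// A record with a type method: it is invoked on the type, not on an instance.
record Point {
  var x, y: real;
  proc type origin() do return new Point(0.0, 0.0);
}

// A class with a type method, comparable to a C++ static member function.
class Kernel {
  proc type kind() do return "kernel";
}

const p = Point.origin();   // no Point instance is needed to make the call
writeln(p);
writeln(Kernel.kind());     // kernel
```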
My example was overly simplistic by having proc FactoryPrototype.create() return RootClass instances. As you see, it needs to return something more specific. I would expect that the factory framework includes the definition of an interface for all things that can be "created" through the factory. This is analogous to how FactoryPrototype defines what kind of create methods all factory objects support. Since Chapel's interfaces are not yet up for the task, I defined a class to serve as an interface.
To implement your intention, we can define a class Registrant with the method foo and require that the create method of each factory object returns an instance of Registrant.
See my modified example below. I used other tweaks to make it compile. Paren-less methods like foo are not dynamically dispatched, so I changed it to paren-ful. I propagated the optional foo formal argument into FactoryRegister.create. Most importantly, FactoryPrototype.create now returns Registrant.
module FactoryRegister {
  public class Registrant {
    proc foo(): real { halt("pure virtual"); }
  }
  public class FactoryPrototype {
    proc create(arg: int, foo: real): owned Registrant { halt("pure virtual"); }
  }
  private use Map;
  private var registry: map(string, owned FactoryPrototype);
  public proc register(name: string, in factory: owned FactoryPrototype) {
    assert(registry.add(name, factory));
  }
  /* invokes the factory function for 'name' */
  proc create(name: string, arg: int, foo: real = 0.5) {
    return try! registry[name].create(arg, foo);
  }
}
module Lib1 {
  private import FactoryRegister;
  public class LibClass1: FactoryRegister.Registrant {
    override proc foo(): real { return foo_; }
    var foo_: real;
    proc init(foo: real) {
      this.foo_ = foo;
    }
    proc type create(arg: int, foo: real) {
      writeln("LibClass1.create ", arg);
      return new LibClass1(foo);
    }
  }
  class F1: FactoryRegister.FactoryPrototype {
    override proc create(arg: int, foo: real = 0.5) {
      return LibClass1.create(arg, foo): owned FactoryRegister.Registrant;
    }
  }
  // executed at module initialization time
  FactoryRegister.register("LibClass1", new F1());
}
module App {
  import FactoryRegister;
  proc main {
    var lib1 = FactoryRegister.create("LibClass1", 10);
    writeln(lib1.foo()); // expecting 0.5
    var lib1_2 = FactoryRegister.create("LibClass1", 10, 200.0);
    writeln(lib1_2.foo()); // expecting 200.0
  }
}
Thank you all. I took another stab at it with a different example, piecing together what you all have suggested. It's just a math kernel with various implementations (compute sum, mean, or some "other" operation on a list of numbers). The correct kernel is specified via a config string.
The idea is to be able to swap in different functionalities without an ever-growing switch statement, keep concrete implementation details out of the factory, and require future modules only to override the appropriate methods and register. An issue I've seen is that if different derived classes take different sets of optional parameters at initialization, getting elegant access to the constructor is impossible because it's buried in the anonymous function.
It doesn't appear possible to have the registry map store a type rather than an instance (the option I went with was the function which returns an instance like you shared). But, for example, if I wanted to be able to pass input values from a TOML/YAML file (name and optionally, epsilon if name==other), that could be tricky.
The alternative is to use a config at the module level, which works beautifully, but alters encapsulation of that variable from the class to the module. But maybe this is just a design issue I haven't worked out.
module MathKernel {
  class MathKernel {
    proc type name() {
      halt("Must override `name`.");
    }
    proc name() do return this.type.name();
    proc evaluate(arr: [] real): real {
      halt("Must override procedure `evaluate`.");
    }
  }
}
module MathKernelFactory {
  use Map;
  use MathKernel;
  private var registry: map(string, proc(): shared MathKernel);
  public proc register(name: string, type t) {
    const func = proc(): shared MathKernel { return new shared t(); };
    assert(registry.add(name, func));
  }
  proc getKernel(name: string) : shared MathKernel {
    return try! registry[name]();
  }
}
module MeanKernel {
  use MathKernel;
  use MathKernelFactory;
  class Mean : MathKernel {
    override proc type name() do return "mean";
    override proc name() do return this.type.name();
    override proc evaluate(arr: [] real) : real {
      var avg = 0.0;
      for a in arr {
        avg += a;
      }
      return avg / arr.size;
    }
  }
  MathKernelFactory.register(Mean.name(), Mean);
}
module SumKernel {
  use MathKernel;
  use MathKernelFactory;
  class Sum : MathKernel {
    override proc type name() do return "sum";
    override proc name() do return this.type.name();
    override proc evaluate(arr: [] real) : real {
      var sum = 0.0;
      for a in arr {
        sum += a;
      }
      return sum;
    }
  }
  MathKernelFactory.register(Sum.name(), Sum);
}
module OtherKernel {
  use MathKernel;
  use MathKernelFactory;
  class Other : MathKernel {
    var epsilon: real = 0.51;
    override proc type name() do return "other";
    override proc name() do return this.type.name();
    override proc evaluate(arr: [] real) : real {
      var other : real = 0.0;
      for a in arr {
        other += a**this.epsilon;
      }
      return other;
    }
  }
  MathKernelFactory.register(Other.name(), Other);
}
module Main {
  use MathKernel;
  use MathKernelFactory;
  config var name: string;
  proc main() {
    var myKernel = MathKernelFactory.getKernel(name);
    var a = [12.0, 15.2, 0.0, 3.0];
    writeln(myKernel.evaluate(a));
  }
}
I like your clean design! I am even thinking that the kernel factory could be provided by the MathKernel module itself so the user does not have to deal with the extra "factory" module.
I am not seeing your concern about name being a config variable. In my view where the name comes from (config, yaml, ...) and how the factory reacts to it are orthogonal. For example, name could be read from the yaml file and still work the same way as in your code. Regardless, I am open to further discussion here.
What I do see is that you want to pass each kernel a different set of arguments. For example, "other+1" might accept an array and an additional arg, "other+2" two additional args, ... One way to provide this feature is to pass a single additional argument to evaluate() that is a container for the additional arguments. As a base solution, it can be a list of 0 or more reals. We can elaborate by putting named args into a dictionary, wrapping int vs real in different child classes, etc.
Of course when a user is writing Chapel code that invokes a specific kernel, there is no need to go through the factory. An instance of the kernel class can be created directly using new SumKernel.Sum() for example. All the special features of the Sum class can be used directly with compile-time checking.
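The "container of additional arguments" idea above could be sketched roughly as follows. This is only an illustration under assumptions: the extra formal, the convention that extra[0] overrides epsilon, and the variable names are all hypothetical, and list.pushBack is the method name in recent Chapel releases. The factory and registration are left out to keep the sketch short.

```chapel
use List;

class MathKernel {
  // 'extra' carries zero or more kernel-specific arguments (hypothetical formal)
  proc evaluate(arr: [] real, extra: list(real)): real {
    halt("Must override procedure `evaluate`.");
  }
}

class Sum : MathKernel {
  override proc evaluate(arr: [] real, extra: list(real)): real {
    var sum = 0.0;
    for a in arr do sum += a;
    return sum;   // this kernel ignores 'extra'
  }
}

class Other : MathKernel {
  override proc evaluate(arr: [] real, extra: list(real)): real {
    // assumed convention: extra[0], when present, overrides the default epsilon
    const epsilon = if extra.size > 0 then extra[0] else 0.51;
    var other = 0.0;
    for a in arr do other += a ** epsilon;
    return other;
  }
}

var a = [12.0, 15.2, 0.0, 3.0];

var noExtras: list(real);      // empty container: kernels fall back to defaults
var withEpsilon: list(real);
withEpsilon.pushBack(2.0);     // one extra argument, read by Other as epsilon

var k1: owned MathKernel = new Sum();
var k2: owned MathKernel = new Other();
writeln(k1.evaluate(a, noExtras));
writeln(k2.evaluate(a, withEpsilon));
```

Every kernel keeps the same evaluate signature, so the factory and the dynamic dispatch stay unchanged; the cost is that the meaning of each list position is a convention rather than something the compiler checks.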
Thanks! I drew on what you and @stonea had suggested. Merging the modules is a good idea too. After looking some more, I like the idea of passing a more generic data type to each kernel. The registration scheme ID could also be passed to our kernel factory as well as to a kernel data factory. The data factory constructs some container of additional arguments and fills it with defaults if the user doesn't override them with other configs. Then that data container is passed to the kernel.