Hi Anjiang / all,
Coming to this thread late, I wanted to add a few more details:
[quote="Anjiang, post:1, topic:24032"] Chapel supports multi-node. Each
node is called a locale, and the number of locales can be accessed by
ChapelLocale.numLocales [/quote]
Yes. ...
Though this is typically the case and a good mental model, we can be
slightly more precise or vague: On the vague side, a locale is just a
unit of the target architecture with processors and memory; on the more
precise side, it's a process in practice and the resources that that
process is bound to. Recent work has been adding a mode in which a locale
can be created per NIC or socket on a compute node, so slightly
finer-grained than the traditional locale-per-node model.
Also good to know about is locale.maxTaskPar, which will give the number
of tasks that the locale is capable of running concurrently. Typically,
this is the number of cores (particularly when running a locale per node),
but if the OS or another user setting has limited (or oversubscribed) the
number of threads, you'll get a different answer. When deciding how many
tasks to create, using maxTaskPar is generally preferable to using
numCores().
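As a minimal sketch of the above (the array name `partial` and the per-task work are made up purely for illustration), here is how one might query the locales and size a task team using maxTaskPar:

```chapel
// Print the locale count, then each locale's name and task parallelism.
writeln("numLocales = ", numLocales);
for loc in Locales do
  on loc do
    writeln("locale ", here.id, " (", here.name, "): maxTaskPar = ",
            here.maxTaskPar);

// Create one task per unit of available parallelism on the current locale.
const nTasks = here.maxTaskPar;
var partial: [0..#nTasks] int;   // hypothetical per-task result slots
coforall tid in 0..#nTasks with (ref partial) do
  partial[tid] = tid;            // stand-in for real per-task work

writeln("sum of task ids = ", + reduce partial);
```

The reported maxTaskPar depends on the machine and on any runtime settings that limit threads, so the output will vary from system to system.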
Technically, DSI is a more advanced interface that a typical user
doesn't interact with directly. It describes how one can create a custom
distribution. Chapel releases contain several distributions including
but not limited to Block, Cyclic and BlockCyclic. These distributions
implement DSI so that they can be used as distributions for Chapel
arrays.
Engin's answer is correct. Putting it a different way, I'd say that
Chapel lets users control how data is distributed using domain maps (or
"distributions" for short). The domain map standard interface is more
about authoring your own domain map than how a typical user would specify
how an array is distributed.
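To make the typical-user side concrete, here is a small sketch of declaring a block-distributed array. It assumes a recent Chapel release in which the Block distribution is spelled `blockDist` and offers a `createDomain` factory; older releases spell the type `Block`, so adjust to match your version.

```chapel
use BlockDist;

config const n = 8;

// Block-distribute the indices 1..n across all locales.
// Assumes the `blockDist` spelling used by recent releases.
const D = blockDist.createDomain({1..n});
var A: [D] int;

// Each iteration runs on the locale that owns index i.
forall i in D with (ref A) do
  A[i] = here.id;

writeln(A);
```

Nothing here touches DSI directly; the distribution's implementation of that interface is what lets `A` be declared and iterated over like any other array.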
[quote="Anjiang, post:1, topic:24032"] Users in Chapel do not have a way
to express memory layouts for arrays (row-major, column-major for axis
ordering mapped onto physical memory), or layouts for structs (Array Of
Structs, or Structs of Array) [/quote]
There is an (undocumented?) defaultStorageOrder compilation flag which
you can set as -sdefaultStorageOrder=ArrayStorageOrder.CMO to make
all Chapel arrays column-major. As of yet, there's no fine-grained
control over this.
Adding to this, nothing in the language prevents users from creating array
layouts that are CMO, tiled, use space-filling curves, etc. That said, it
requires writing your own domain map ("layout") which is not a very
well-documented task. Providing finer-grained control over the CMO layout
that Engin mentions above would not be a particularly difficult task, but
hasn't been one that any users have requested (that I know of), so it has
also not received any attention.
On the AoS vs. SoA question, at times we've discussed whether Chapel's
support for adding direct access methods and default iterators to records
was sufficient to make these kinds of choices without changes to the
"science" operating on the logical data structure, but that was years ago,
and I don't remember where it fell on the "rock solid" vs. "stunt" scale.
If you're aware of a language that has good support for changing from AoS
to SoA effortlessly, I'd be very interested in hearing about that, to
learn from it or see whether we could do the same thing.
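For what it's worth, here is a rough sketch of the accessor-plus-iterator idea. All of the type and method names (`Point`, `PointsAoS`, `PointsSoA`, `setPoint`, `sumOfSquares`) are hypothetical and chosen only for illustration; this shows the shape of the approach rather than making any claim about how well it holds up in real codes.

```chapel
// Hypothetical element type, used only for illustration.
record Point {
  var x, y: real;
}

// Array-of-structs: a single array of Point records.
record PointsAoS {
  const n: int;
  var pts: [0..#n] Point;

  proc x(i: int) { return pts[i].x; }
  proc y(i: int) { return pts[i].y; }
  proc ref setPoint(i: int, xv: real, yv: real) { pts[i] = new Point(xv, yv); }
  iter these() { for i in 0..#n do yield (pts[i].x, pts[i].y); }
}

// Struct-of-arrays: one array per field.
record PointsSoA {
  const n: int;
  var xs, ys: [0..#n] real;

  proc x(i: int) { return xs[i]; }
  proc y(i: int) { return ys[i]; }
  proc ref setPoint(i: int, xv: real, yv: real) { xs[i] = xv; ys[i] = yv; }
  iter these() { for i in 0..#n do yield (xs[i], ys[i]); }
}

// The "science": written once against the default iterator,
// so it does not care which layout it is handed.
proc sumOfSquares(const ref ps) {
  var total = 0.0;
  for (px, py) in ps do
    total += px*px + py*py;
  return total;
}

var a = new PointsAoS(3);
var s = new PointsSoA(3);
for i in 0..#3 {
  a.setPoint(i, i:real, 1.0);
  s.setPoint(i, i:real, 1.0);
}
writeln(sumOfSquares(a), " ", sumOfSquares(s));
```

Switching layouts then means changing the declared type of the collection, while `sumOfSquares` stays untouched.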
Probably one of the ways, yes. Note also that it is a Package module,
which typically receives less attention design- and implementation-wise
compared to Standard modules.
Some of our users have created their own, improved DistributedBag
recently, which I hope will make it into the packages directory at some
point: "`DistBag_DFS`: our revisited version of `DistBag` for depth-first
tree-search" (Issue #21958, chapel-lang/chapel on GitHub). This was also
covered in their CHIUW talk. See "Towards a Scalable Load Balancing for
Productivity-Aware Tree-Search" at Chapel: CHIUW 2023: 10th Annual Chapel
Implementers and Users Workshop.
As Engin says, though, nothing about this is inherently part of Chapel;
it is simply an abstraction for load-balancing built on top of Chapel's
language features.
Definitely can be improved but take a gander at
DynamicIters - Chapel Documentation 1.32.
I would say "Chapel does not implement general load-balancing in the
language or its runtime directly (as Charm++ would, for example), but
supports the ability for users to create abstractions (collections,
iterators, etc.) that provide load-balancing capabilities. The
DistributedBag and DynamicIters cases are examples of such abstractions."
Basically, our philosophy has been that we'd prefer an imperative language
that gives you a reasonably firm foundation in terms of how your program
will execute, and to build more complex policies (like load balancing) in
terms of that, rather than have the language and runtime try to be smart
but leave you without any recourse when you want to control something more
precisely.
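As a small illustration of load balancing as a library abstraction, here is a sketch that uses the `dynamic` iterator from the DynamicIters package module. The array name `work` and the per-iteration computation are placeholders, and the chunk size of 16 is an arbitrary choice.

```chapel
use DynamicIters;

config const n = 1000;
var work: [1..n] int;

// Hand out chunks of 16 iterations to tasks as they become free.
forall i in dynamic(1..n, chunkSize=16) with (ref work) do
  work[i] = i % 7;   // stand-in for irregular per-iteration work

writeln(+ reduce work);
```

The scheduling policy lives entirely in the iterator, so swapping in a different policy means changing the iterator call rather than the loop body.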
That's a compiler internal that may be outdated. See
Classes - Chapel Documentation 1.31 for class memory management. Array
allocations are typically freed at the end of the lexical scope.
My summary here would be: The lifetimes of all Chapel types (scalars,
records, arrays, etc.) are based on scoping, except for classes, where a
class object may outlive its scope and is either automatically freed (if
it is 'owned' or 'shared') or manually freed (if it is 'unmanaged').
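A short sketch of those cases, using a made-up `Node` class and a hypothetical `makeNode` helper:

```chapel
// Hypothetical class, used only for illustration.
class Node {
  var id: int;
}

// The returned object outlives this procedure's scope;
// it is freed when the last 'shared' reference goes away.
proc makeNode(id: int) {
  return new shared Node(id);
}

{
  var A: [1..4] real;              // array: freed at the end of this scope
  var o = new owned Node(1);       // freed when its owner leaves scope
  var u = new unmanaged Node(2);   // must be freed explicitly
  writeln(A.size, " ", o.id, " ", u.id);
  delete u;
}

var kept = makeNode(3);
writeln(kept.id);
```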
Finally, I'll mention that my preferred Chapel citation is:
B. L. Chamberlain, “Chapel,” in Programming Models for Parallel Computing,
P. Balaji, Ed. MIT Press, November 2015, ch. 6, pp. 129–159.
which is somewhat old at this point, but still the best published
reference for Chapel overall (where the website, current version of the
spec, release notes, etc. would be other more open-source artifacts to
cite).
Thanks for your interest in Chapel,
-Brad