DimensionalDist2D: no support for arbitrary dimensions?

DimensionalDist2D (Chapel Documentation 1.28) seems to only support 2D arrays. I am interested in the design of this domain map interface, and I am curious why the most general nD mechanism is not implemented. What is the difficulty in implementing it? If it could be implemented, would it be a series of mechanisms like DimensionalDist3D, DimensionalDist4D, and DimensionalDist5D, or could a general DimensionalDistND be supported in Chapel? If not, what would be the major difficulty?

Another question: DimensionalDist2D can accept BlockCycDim, BlockDim, and ReplicatedDim. Why is CyclicDim not supported?

Lastly, I am curious about the design philosophy behind the standard distributions for domain maps. Is it true that these distributions are implemented incrementally (i.e., new distributions are added when cases arise that are not supported by the current design)?

I am also curious about the internal implementation of domain maps for arrays: when is the array materialized in physical memory? Before the domain mapping is applied (in which case one processor would have to materialize the whole array and then distribute elements to the other processors), or after the domain mapping has finished (i.e., each processor materializes its own portion once it knows what that portion is)? I assume it is the latter.

How does the runtime know whether a given index access to the array has its data on the current processor or needs to request data from a remote processor?

Hi Anjiang —

Thanks for your questions about Chapel's distributions, which I'll try to
answer here. I'm going to answer them out of order to streamline things a
bit:

> Lastly, I am curious about the design philosophy behind the standard
> distributions for domain maps. Is it true that these distributions are
> implemented incrementally (i.e., new distributions are added when cases
> arise that are not supported by the current design)?

That's a reasonable assessment of when we add new standard domain maps: we
currently add a new standard domain map (block, cyclic, dimensional,
replicated, etc.) when we have a use case that warrants it. Those use
cases may be motivated either by an end-user or by something we're
interested in internally.

In recent years, Block and Cyclic have been sufficient for most of our
needs and those of our current users, and we've had other priorities to
focus on, so our set of standard domain maps hasn't grown much lately.
One effort that has started to get some attention recently is determining
what would be required to get standard distributions like 'block'
targeting GPUs.

> DimensionalDist2D (Chapel Documentation 1.28) seems to only support 2D
> arrays. I am interested in the design of this domain map interface, and
> I am curious why the most general nD mechanism is not implemented. What
> is the difficulty in implementing it? If it could be implemented, would
> it be a series of mechanisms like DimensionalDist3D, DimensionalDist4D,
> and DimensionalDist5D, or could a general DimensionalDistND be supported
> in Chapel? If not, what would be the major difficulty?

I'm not aware, offhand, of any intrinsic barrier to supporting a general
nD dimensional distribution, though doing so could add some overhead.
Generally, when implementing things like this, I find it instructive to
start with a specific instance (like 2D dimensional distributions) and
generalize over time rather than beginning with the most general solution.
For example, though I don't recall for sure, I wouldn't be surprised if
our first 'block' distribution was similarly limited to 1D domains/arrays.
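
To make the generalization concrete, here's a hypothetical sketch of how a
rank-generic dimensional distribution might carry a rank-tuple of
per-dimension specifiers rather than DimensionalDist2D's fixed pair of
fields. To be clear, this is invented for illustration: 'DimensionalDistND',
'DummyDim', and 'indexToLocaleCoord' are not part of the actual module.

```chapel
// Hypothetical sketch only; not the real DimensionalDist2D interface.

// placeholder per-dimension specifier, standing in for the likes of
// BlockDim / BlockCycDim / ReplicatedDim
record DummyDim {
  var numLocales: int;  // number of locales assigned to this dimension
}

record DimensionalDistND {
  param rank: int;  // dimensionality, fixed at compile time
  var dims;         // a rank-tuple of per-dimension specifiers

  // map a global index to a coordinate in the locale grid,
  // one dimension at a time
  proc indexToLocaleCoord(idx: rank*int): rank*int {
    var coord: rank*int;
    for param d in 0..<rank do
      coord(d) = idx(d) % dims(d).numLocales;  // stand-in for each dim's mapping
    return coord;
  }
}

// usage: a 3D distribution built from three per-dimension specifiers
var dist3 = new DimensionalDistND(rank=3,
              dims=(new DummyDim(2), new DummyDim(2), new DummyDim(1)));
writeln(dist3.indexToLocaleCoord((5, 3, 7)));  // prints (1, 1, 0)
```

The main cost of this generality is that every per-index operation becomes
a (compile-time) loop over the rank rather than two hard-wired field
accesses, which is part of the overhead I alluded to above.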

The current 2D dimensional distribution was developed as a
proof of concept and never really received the performance tuning and
attention it deserves to become a first-class citizen in the standard
library. This is a big part of why it is currently marked as unstable.
Our original motivation for developing it was to apply it to the HPL
benchmark used by the HPC Challenge competition and the Top500, but we
moved on to other priorities before getting that code into competitive
shape. It would be interesting to return to.

DimensionalDist2D was also an important philosophical proof-of-concept for
us because when we started Chapel, the general expectation was that all
distributions should be specified on a dimensional basis (as was done in
HPF). But we wanted to take more of a multidimensional approach in order
to support distributions like recursive bisection without precluding the
ability to still specify things on a per-dimension basis when desired.
So DimensionalDist2D was our (modest) proof of concept that our philosophy
was achievable.

> Another question: DimensionalDist2D can accept BlockCycDim, BlockDim,
> and ReplicatedDim. Why is CyclicDim not supported?

I don't think there's any deep reason other than that we didn't need it
for our HPL port. Technically, I don't think we needed 'BlockDim' either,
but it was an easy/obvious starting point given the relative simplicity of
'Block' (though arguably 'Cyclic' is even simpler... it just doesn't offer
much spatial locality when that's desired).
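
For what it's worth, the core per-dimension mapping a hypothetical
'CyclicDim' would need is tiny. Here's a sketch of the idea (the name
'cyclicIdxToLocale' is invented for illustration, not part of the module):

```chapel
// sketch of the mapping a hypothetical 'CyclicDim' would perform:
// deal the indices of one dimension out round-robin across locales
proc cyclicIdxToLocale(i: int, low: int, numLocs: int): int {
  return (i - low) % numLocs;  // pure mod math, hence little spatial locality
}

writeln(cyclicIdxToLocale(7, 0, 4));  // index 7 across 4 locales -> locale 3
```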

> I am also curious about the internal implementation of domain maps for
> arrays: when is the array materialized in physical memory? Before the
> domain mapping is applied (in which case one processor would have to
> materialize the whole array and then distribute elements to the other
> processors), or after the domain mapping has finished (i.e., each
> processor materializes its own portion once it knows what that portion
> is)? I assume it is the latter.

Yes, it's the latter. Essentially, when we see a declaration like:

```chapel
var A: [D] real;
```

we translate this into a call like:

```chapel
D.buildArray(real);
```

And if 'D' is distributed, it will typically allocate each locale's
sub-array at this point, on the respective locale. I say "typically"
because nothing in the domain map API requires this; but it's arguably the
only reasonable thing to do, and I'm fairly certain it's what all of our
standard distributions do.
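
As a rough illustration of that "allocate locally, per locale" behavior,
here's a standalone sketch of the idea (not the actual dsi* interface
code): each locale computes and materializes only its own piece in its own
memory.

```chapel
config const n = 1_000_000;  // global number of array elements

coforall loc in Locales do
  on loc {
    // each locale computes and allocates only its own block
    const myShare = n / numLocales
                    + (if here.id < n % numLocales then 1 else 0);
    var myChunk: [0..<myShare] real;  // lives in this locale's memory
    writeln("locale ", here.id, " allocated ", myChunk.size, " elements");
  }
```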

> How does the runtime know whether a given index access to the array has
> its data on the current processor or needs to request data from a remote
> processor?

The interface on distributions supports a call that permits code to query
which locale owns a particular index (dsiIndexToLocale). This can be used
by the compiler or the module code implementing the distribution (and its
domains and arrays) to determine whether a given index is local or remote.
Such code can then special-case the local case if desired, or just use
general "may be remote" code relying on Chapel's global namespace to
resolve any remote references.
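
Here's a simplified sketch of that pattern. The 'idxToLocale' helper below
is invented to stand in for a distribution's dsiIndexToLocale logic (here,
a block distribution's owner computation); it is not the real interface:

```chapel
// stand-in for dsiIndexToLocale: which locale owns index 'i'?
proc idxToLocale(i: int, low: int, blockSize: int): locale {
  return Locales[min((i - low) / blockSize, numLocales - 1)];
}

proc demoAccess(i: int) {
  const owner = idxToLocale(i, 0, 100);
  if owner == here then
    // local fast path: a plain load, no communication needed
    writeln("index ", i, " is local to locale ", here.id);
  else
    // general path: Chapel's global namespace resolves the remote reference
    on owner do writeln("index ", i, " lives on locale ", here.id);
}

demoAccess(42);
```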

Note that, strictly speaking, the Chapel runtime itself is completely
unaware of distributions: all distributions, domains, and arrays are
implemented as module code that relies on standard Chapel concepts
(objects, on-clauses, parallelism, locales, etc.). By the time the runtime
sees the code, it's all in the form of lower-level memory, tasking, and
communication calls.

Let us know if you have any follow-up questions, and whether the status of
any of these questions is holding up work that you would like to do in
Chapel.

-Brad
