Hello! This is a bit long, but I need to provide the full background to my question. I am trying to simulate a multi-locale computer on my desktop, which has only one CPU. To do this, I first ran:
source util/setchplenv.bash
==> source mlchplenv.sh
make
where mlchplenv.sh is
#!/bin/bash
# --------------------------------------------------------------------
# Simulation of Multi-locale Chapel
# --------------------------------------------------------------------
export CHPL_HOME=~/Dropbox/software/mlchapel-1.24.1
CHPL_BIN_SUBDIR=`"$CHPL_HOME"/util/chplenv/chpl_bin_subdir.py`
# --------------------------------------------------------------------
# reset path
# --------------------------------------------------------------------
export PATH="$PATH":"$CHPL_HOME/bin/$CHPL_BIN_SUBDIR"
export MANPATH="$MANPATH":"$CHPL_HOME"/man
# --------------------------------------------------------------------
# for the time being, I am putting all modules here
# --------------------------------------------------------------------
export CHPL_MODULE_PATH=/home/nldias/Dropbox/chapel/modules
# --------------------------------------------------------------------
# use all cores!
# --------------------------------------------------------------------
export CHPL_RT_NUM_THREADS_PER_LOCALE=MAX_LOGICAL
# --------------------------------------------------------------------
# This seems enough to generate a "multi-locale" compiler
# --------------------------------------------------------------------
export CHPL_COMM=gasnet
export CHPL_TARGET_CPU=none
export CHPL_LAUNCHER=smp
export CHPL_COMM_SUBSTRATE=smp
export GASNET_ROUTE_OUTPUT=0
With this setup, the Chapel compiler built without errors, and simple example programs from Chapel's home page ran correctly.
Then I went on to implement a solution of d^2u/dx^2 = 0, u(0) = 0, u(1) = 1 with a relaxation method. I did it serially, then with a Block distribution (using -nl 4), then with a Stencil distribution (using -nl 4). The first two ran OK and produced the same output for 16 internal points plus two boundary points, namely
0.0 0.0588235 0.117647 0.17647 0.235294 0.294117 0.352941 0.411764 0.470588 0.529411 0.588235 0.647058 0.705882 0.764706 0.823529 0.882353 0.941176 1.0
The Stencil version, however, ran to completion but produced a different (wrong) result:
0.0 0.1 0.2 0.3 0.4 0.5 0.541667 0.583333 0.625 0.666667 0.708333 0.75 0.791667 0.833333 0.875 0.916667 0.958333 1.0
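A quick check of the two outputs against the discrete problem may help narrow things down. This is plain arithmetic on the numbers quoted above, not something reported by Chapel. At convergence every interior point equals the average of its two neighbours, so the solution is linear between whatever values are held fixed:

```latex
% Discrete problem at convergence, boundary values fixed at both ends
\[
u_i = \tfrac{1}{2}\,(u_{i-1} + u_{i+1}), \quad i = 1,\dots,16,
\qquad u_0 = 0,\; u_{17} = 1
\;\Longrightarrow\;
u_i = \frac{i}{17}, \qquad \tfrac{1}{17} \approx 0.0588235 .
\]

% The Stencil output matches the same equation with u_5 held at 0.5
\[
u_i =
\begin{cases}
  \dfrac{i}{10}, & 0 \le i \le 5, \\[6pt]
  \dfrac{1}{2} + \dfrac{i-5}{24}, & 5 \le i \le 17,
\end{cases}
\qquad \tfrac{1}{24} \approx 0.0416667 .
\]
```

The first formula reproduces the serial/Block output, and the second reproduces the Stencil output digit for digit. In other words, the wrong result is what one would get if index 5 never moved from its initial guess of 0.5 and acted like an extra fixed boundary value. If the four locales split the bounding box 1..16 evenly (an assumption; the localSubdomain() output is not reproduced here), index 5 would be the first point owned by the second locale and therefore a ghost point of the first, but whether that is the actual cause is not confirmed.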
Here is the Stencil version (the serial version and the Block dist version are pretty much the same code, with the obvious changes/omissions):
use StencilDist;
const Di = {1..16}; // the internal subdomain
const D = {0..17} // the total domain and its bounding box
  dmapped Stencil(boundingBox=Di,fluff=(1,));
var u: [D] real; // the solution
var deltau: [D] real; // the residues
const epsilon = 1.0e-10; // the overall accuracy
var deltam = epsilon; // the "norm" over the residues
u[0] = 0.0; // a ghost point (BC at 0)
u[17] = 1.0; // a ghost point (BC at 1)
u[Di] = 0.5; // the initial guess
var nc = 0; // iteration counter
for loc in Locales do {
  on loc {
    writeln(u.localSubdomain()); // subdomains, per locale
  }
}
while deltam >= epsilon do { // check convergence
  deltam = 0.0;
  forall i in 1..15 by 2 do {
    var uavg = (u[i+1]+u[i-1])/2.0;
    deltau[i] = (uavg - u[i]);
    u[i] += deltau[i];
    deltau[i] = abs(deltau[i]);
  }
  u.updateFluff();
  forall i in 2..16 by 2 do {
    var uavg = (u[i+1]+u[i-1])/2.0;
    deltau[i] = (uavg - u[i]);
    u[i] += deltau[i];
    deltau[i] = abs(deltau[i]);
  }
  u.updateFluff();
  deltam = (+ reduce deltau)/16;
  nc += 1;
}
writeln(nc);
writeln(u);
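For comparison, the serial baseline can be sketched as follows. This is a reconstruction from the Stencil listing above, not a verbatim copy of the file that was actually run. It assumes that the "obvious changes/omissions" amount to dropping `use StencilDist`, the `dmapped` clause and the `updateFluff()` calls, and replacing `forall` with `for`.

```chapel
// Serial baseline (sketch): the same odd/even relaxation, with no distribution.
// Assumption: this mirrors the Stencil listing with only the
// distribution-specific parts removed.
const Di = {1..16};             // the internal subdomain
const D = {0..17};              // internal points plus the two boundary points
var u: [D] real;                // the solution
var deltau: [D] real;           // the residues
const epsilon = 1.0e-10;        // the overall accuracy
var deltam = epsilon;           // the "norm" over the residues
u[0] = 0.0;                     // BC at 0
u[17] = 1.0;                    // BC at 1
u[Di] = 0.5;                    // the initial guess
var nc = 0;                     // iteration counter

while deltam >= epsilon {       // check convergence
  for i in 1..15 by 2 {         // sweep over the odd points
    const uavg = (u[i+1] + u[i-1])/2.0;
    deltau[i] = uavg - u[i];
    u[i] += deltau[i];
    deltau[i] = abs(deltau[i]);
  }
  for i in 2..16 by 2 {         // sweep over the even points
    const uavg = (u[i+1] + u[i-1])/2.0;
    deltau[i] = uavg - u[i];
    u[i] += deltau[i];
    deltau[i] = abs(deltau[i]);
  }
  deltam = (+ reduce deltau)/16; // mean absolute residue
  nc += 1;
}
writeln(nc);
writeln(u);
```

Since every access here is to ordinary local memory, each sweep always sees the values written by the previous one. That is the behaviour the two `updateFluff()` calls in the Stencil listing are meant to reproduce, and this version is the one expected to give the first output quoted above (0.0 0.0588235 ... 1.0).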
Any insight into what I am doing wrong, or why Stencil does not work in this simulated multi-locale environment, would be greatly appreciated.
Cheers
Nelson