Building a distributed chapel array with multiple pointers

Hello,

I recently reached out to the Chapel team over email about the function makeArrayFromPtr(), with regard to creating a Chapel array at the memory location returned from mmap for C interoperability. Thank you for the replies.

Now I'm curious whether it is possible to create a single, distributed Chapel array from multiple pointers. For example, I have N locales on N separate nodes, each with an mmap region. Is it possible to take the region pointer from each locale and make them all accessible through one 1D Chapel array, as with a Block distribution, or to achieve similar functionality some other way?

Best,
Trevor

Hi Trevor -

It's definitely possible to do that but I don't see a way to do it unless you create a domain map.

Domain maps are really implementations of different array types. We talk about two kinds of domain maps: layouts and distributions. A layout describes how an array is implemented within a locale (aka compute node). A distribution describes how an array is implemented across many locales. So Block is a distribution, but if you are talking about having an array that wraps some mmap'd region, then I'd say you're talking about a layout.

Now, the Block distribution is implemented as a bunch of local arrays. Each of these arrays has a layout - it's normally the default layout but Block has some support for specifying the layout for sparse arrays specifically (see the sparseLayoutType initializer argument). So, you could create a layout and then seek to make adjustments to Block to allow it to use your layout.

Alternatively, you could create a domain map that handles the distribution as well as your mmap issue. That would more or less amount to making something similar to Block. (I wouldn't favor this approach for that reason).

Anyway, maybe others will have a better idea, but my advice would be to try to create a domain map that is a layout for your mmap'd arrays and then to try to get Block to use it. Here I think one of the main challenges is that the domain map standard interface ("dsi") is neither stable nor well documented. We do have this dsi documentation and you can learn a lot by looking at the source code for other layouts and distributions.

-michael

Hi Trevor —

Following up on Michael's response with a few additional comments:

  • I haven't tried this yet, but have recently been postulating that it ought to be possible for us to create an "expert interface" to create a block-distributed array in which the user would supply a C pointer per locale to the data buffer to use on that locale. That might help with a case where Chapel was "adopting" memory from external code, though in order to work correctly, the per-locale sizes of the chunks would need to conform to how Block distributes data in the event that the number of locales doesn't evenly divide the problem size. I have no experience with mmap, so am not sure whether it would bring additional challenges or pain points beyond doing this for a normal C pointer (but Michael does, so I'm feeling cautious due to his mention of needing a new layout).

  • As mentioned in response to your previous question, this theme resonates strongly with a project that one of our interns is working on this summer, to see what it would take to create a Chapel interface to Apache Arrow / Parquet. We're still in the early stages there, but my general sense is that the theme of wanting to "adopt" memory from another source is similar between the efforts. So I expect we'd be wrestling with and making progress with your style of question this summer even if you hadn't asked it.

  • Michael's correct that our current documentation on writing domain maps isn't spectacular. I have a longstanding intent to write a friendly, step-by-step "So you want to write a domain map?" document and should really try to make that happen this summer. If you were to decide to write your own domain map, we can use that as pressure on me to buckle down and get it done.
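
To illustrate the first point above about per-locale chunk sizes, here is a minimal sketch in C of how one might compute the number of elements each locale's external buffer would need to hold. It assumes a ceiling-based split of n elements over nlocales locales; that rule, and the names chunk_t, ceil_muldiv, chunk_for_locale and chunk_size, are assumptions made for illustration, and the rule Block actually uses should be checked against BlockDist.chpl.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [lo, hi) of global 0-based offsets owned by one locale. */
typedef struct { size_t lo; size_t hi; } chunk_t;

/* Ceiling of (a * b) / c in integer arithmetic, for c > 0. */
size_t ceil_muldiv(size_t a, size_t b, size_t c) {
  return (a * b + c - 1) / c;
}

/* ASSUMPTION (not taken from the thread): locale `loc` owns the offsets
   [ceil(loc * n / nlocales), ceil((loc + 1) * n / nlocales)). */
chunk_t chunk_for_locale(size_t n, size_t nlocales, size_t loc) {
  assert(nlocales > 0 && loc < nlocales);
  chunk_t c;
  c.lo = ceil_muldiv(loc, n, nlocales);
  c.hi = ceil_muldiv(loc + 1, n, nlocales);
  return c;
}

/* Number of elements the external buffer on locale `loc` must hold. */
size_t chunk_size(size_t n, size_t nlocales, size_t loc) {
  chunk_t c = chunk_for_locale(n, nlocales, loc);
  return c.hi - c.lo;
}
```

Under this assumed rule, 10 elements over 4 locales give chunks of 3, 2, 3 and 2 elements, so the buffers supplied on each locale cannot all be the same size.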

-Brad

Thanks for the replies,

I'm still working with this idea and wanted to ask about my current approach.

I'm trying to modify BlockDist.chpl to use my mmap'd C array, and I am modifying the class LocBlockArr so that its field myElems references my C array.

In the constructor, normally myElems = ...buildArray(...)
and I modified it to myElems = makeArrayFromPtr(...)

both of these functions eventually return _newArray(...)

However, I am having issues getting myElems created at the pointer.
Outside of a class, if I use:
var Arr = makeArrayFromPtr(CPointer, num_elts);

Then "c_ptrTo(Arr)" and "CPointer" would be the same memory location.

but if I use:
var Arr: [1..num_elts] int;
Arr = makeArrayFromPtr(CPointer, num_elts);

Then c_ptrTo(Arr) and CPointer are different

Inside of the class, I am getting the result where the CPointer and myElems addresses are different, which makes sense because it is structurally similar to the second example. I have looked around the documentation for ways to deinitialize or deallocate myElems so that I could reinitialize it with an assignment from makeArrayFromPtr(), but I've been unsuccessful with that.
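
The difference between the two cases above can be modelled with a small C sketch. This is only an analogy, not how Chapel arrays are implemented: the int_array struct and the functions array_from_ptr, array_alloc, array_assign and array_free are hypothetical names invented for illustration. Wrapping a pointer keeps the external address, while assigning into an array that already has its own storage copies the elements and leaves the destination address unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an array: a buffer, a length, an ownership flag. */
typedef struct {
  int *data;
  size_t n;
  int owns_data;  /* nonzero if data was allocated by array_alloc */
} int_array;

/* Wrap an existing buffer: no allocation and no copy
   (analogous to `var Arr = makeArrayFromPtr(...)`). */
int_array array_from_ptr(int *ptr, size_t n) {
  int_array a = { ptr, n, 0 };
  return a;
}

/* Create an array with its own storage
   (analogous to `var Arr: [1..n] int;`). */
int_array array_alloc(size_t n) {
  int_array a = { calloc(n, sizeof(int)), n, 1 };
  assert(a.data != NULL || n == 0);
  return a;
}

/* Assign into an existing array: elements are copied and the
   destination keeps its own address. */
void array_assign(int_array *dst, const int_array *src) {
  assert(dst->n == src->n);
  memcpy(dst->data, src->data, src->n * sizeof(int));
}

/* Free only storage this array owns; wrapped buffers are left alone. */
void array_free(int_array *a) {
  if (a->owns_data) free(a->data);
  a->data = NULL;
  a->n = 0;
}
```

In this model, a write through the wrapped array is visible in the external buffer, whereas a write to the copy is not, which matches the address mismatch described above.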

Alternatively, if I use:
var Arr;
Arr = makeArrayFromPtr(CPointer, num_elts);

Arr is generic, and this produces my desired result, but when I do this inside the class, I get compilation errors. I get similar errors when I go to the original BlockDist.chpl file and only change myElems to a generic variable. I'm not quite sure about all of the error messages I get, but one of them is: "note: array element type cannot currently be generic". So I don't think I'll be able to go this route.

Is there a way I can force myElems to use the address from makeArrayFromPtr?

Trevor

Hi Trevor -

I think that the initialization of myElems will be moving the result of the RHS if it can. However, the issue here is that arrays have runtime types (namely the domain) and a copy occurs if the runtime types don't match. I believe a fix would be to allow makeArrayFromPtr to accept the domain to use when creating the array instead of always creating a new domain. (Then you could pass locDom.myBlock as the domain and the copy should go away).

@lydia - do you know if there is a reason that makeArrayFromPtr always creates a new domain, or is providing a domain argument just something we didn't think about at the time?

-michael

I think the main thing to be concerned about is that any domain that would be provided in that case must be 0-based, and not strided (I think we'd probably need to do some work to allow either of those). I would also be concerned about passing distributed domains in. All of these would have difficulty matching the ptr that has been passed to them. I don't think that difficulty is insurmountable, but it's not something we've put effort into and I would be concerned about extending the cases covered without serious consideration about edge and use cases.

So basically, it always creates a new domain today because that's simpler than excluding all the variants of domains we don't support just yet.

Lydia

@lydia - right, but in this case we are interested only in creating default rectangular arrays (so the domain can't be distributed). While I can see handling multidimensional cases being potentially tricky, I don't see a problem with arrays with a different base index, as long as we can communicate that in the domain. I.e. suppose we allowed one to pass a domain in the call to makeArrayFromPtr. We would simply assume that the number of elements in the domain matched the size of the array storage. If you are providing a domain from say 1..10, you have 10 elements, and these are at offsets 0..#10.

Anyway I agree we would want to rule out certain cases if we update makeArrayFromPtr to allow this.

"All of these would have difficulty matching the ptr that has been passed to them"

I'm not really following what this concern is here? What do you mean by matching the pointer? Is it the mapping from elements in the pointed-to data to/from array indices? This seems easy enough (with the reasoning in the example above and more generally with ideas like index-to-offset and offset-to-index).
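
As a minimal sketch of the index-to-offset idea for the 1D, unstrided case discussed above (a domain 1..10 whose 10 elements sit at offsets 0..9 of the pointed-to storage), the mapping could look like the following in C. The type range1d and the function names are hypothetical and chosen only for illustration; strided and multidimensional domains are deliberately not covered.

```c
#include <assert.h>
#include <stddef.h>

/* A dense, unstrided 1D index range low..high (inclusive), e.g. 1..10. */
typedef struct { long low; long high; } range1d;

/* Number of indices in the range; an empty range such as 1..0 has size 0. */
size_t range_size(range1d r) {
  return r.high < r.low ? 0 : (size_t)(r.high - r.low) + 1;
}

/* Map an index in low..high to a 0-based offset into the buffer. */
size_t index_to_offset(range1d r, long idx) {
  assert(idx >= r.low && idx <= r.high);
  return (size_t)(idx - r.low);
}

/* Map a 0-based buffer offset back to an index in low..high. */
long offset_to_index(range1d r, size_t off) {
  assert(off < range_size(r));
  return r.low + (long)off;
}

/* Address of element `idx` in a buffer that is assumed to hold
   exactly range_size(r) elements. */
int *elem_at(int *buf, range1d r, long idx) {
  return buf + index_to_offset(r, idx);
}
```

The only contract here is the one stated in the post: the number of indices in the domain is assumed to match the number of elements in the storage.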

In some parallel conversations, @ronawho has proposed routines similar to our current "make a bytes or string from this buffer [and now you own it | but I still own it] functions", and in those conversations, I'd been imagining that the user would pass a domain into the routine to indicate the array's domain as Michael suggests in the previous post. I've asked Elliot if he wouldn't mind making a public version of the private issue where some of this discussion has been occurring since it's relevant to this conversation (and some others I've been having lately).

-Brad

All, thank you very much for helping Trevor with this issue.
What we want is a BlockDomain in which we provide the memory for the actual data. Everything else can work just like BlockDomain. We are happy to restrict to 1D arrays and one locale per node (though we will run on multiple nodes).
Is it correct that, somewhere deep down in Chapel's BlockDomain, perhaps in buildArray, an equivalent of malloc is called with the appropriate size for the node's part of the array? We'd like to replace that malloc with a pointer to memory that we promise will be big enough (and if not, it's our fault). We'd be happy to pass the pointer down to the right place in our custom version of the BlockDomain.

Are there any other runtime types that could be causing a copy? I made a modified version of makeArrayFromPtr that takes in a domain as an argument and passes it to a modified makeArrayFromExternArray. Inside the modified makeArrayFromExternArray I just changed the line with defaultDist.dsiNewRectangularDom to reflect the domain from locDom.myBlock.

To compare the arrays' domains, I made a temp variable to equal my modified makeArrayFromPtr function and then set myElems = temp. Temp and myElems had the same domain, rank, size, and eltType. Those were the Array variables I found at the bottom of Arrays — Chapel Documentation 1.32.

Should I investigate other runtime types? I skimmed https://github.com/chapel-lang/chapel/blob/main/modules/internal/ChapelArray.chpl, but I didn't find any leads to further my investigation.

Trevor

Hi Trevor,

Can you share your code and, if you have one, a test case to exercise it?

A modified version of makeArrayFromPtr that passes the domain along seems like it should work, but it's hard to guess what's going wrong without seeing the code. I will note that having equivalent domains isn't enough: right now the domains have to have the same identity in order to avoid the copy.

Thanks,
Elliot

I created a modified BlockDist.chpl file that gets a C pointer and tries to give it to myElems in LocBlockArr. The parts I edited are denoted in the comment block. The original line here was this.myElems = this.locDom.myBlock.buildArray(eltType, initElts=initElts);
When creating a distributed array, it seems that a new LocBlockArr is created with domain 1..0 in addition to the local array for each locale, so I left the original line in for this case in the else statement.

// These modules are necessary for the changes I added to LocBlockArr
use SysCTypes; 
use CPtr;

class LocBlockArr {
  type eltType;
  param rank: int;
  type idxType;
  param stridable: bool;
  const locDom: unmanaged LocBlockDom(rank, idxType, stridable);
  var locRAD: unmanaged LocRADCache(eltType, rank, idxType, stridable)?; // non-nil if doRADOpt=true
  pragma "local field" pragma "unsafe"
  // may be initialized separately
  var myPtr:c_ptr(eltType);
  var myElems: [locDom.myBlock] eltType;
  var locRADLock: chpl_LocalSpinlock;

  proc init(type eltType,
            param rank: int,
            type idxType,
            param stridable: bool,
            const locDom: unmanaged LocBlockDom(rank, idxType, stridable),
            param initElts: bool) {
    this.eltType = eltType;
    this.rank = rank;
    this.idxType = idxType;
    this.stridable = stridable;
    this.locDom = locDom;
    ////////////////////////////////////////////////////////Start Changes/////////////////////////////////////////////////////////
    if(this.locDom.myBlock.size)
    {
      var myPtr = createMmap();
      var temp = trevorMakeArrayFromPtr(myPtr, this.locDom.myBlock.size:uint, this.locDom.myBlock);
      writeln("Temp Domain: ", temp.domain);
      this.myElems = temp;
      writeln("myElems domain: ", this.myElems.domain);
      writeln("Array pointer at: ", c_ptrTo(this.myElems), " on ", here.id);
      if(myPtr != c_ptrTo(this.myElems)) // won't be able to close the mmap later if the addresses differ here
      {
        closemmap(myPtr);
      }
    }
    else
    {
      this.myElems = this.locDom.myBlock.buildArray(eltType, initElts=initElts);
    } 
    ////////////////////////////////////////////////////////End Changes/////////////////////////////////////////////////////
  }

  // guard against dynamic dispatch resolution trying to resolve
  // write()ing out an array of sync vars and hitting the sync var
  // type's compilerError()
  override proc writeThis(f) throws {
    halt("LocBlockArr.writeThis() is not implemented / should not be needed");
  }

  proc deinit() {
    // Elements in myElems are deinited in dsiDestroyArr if necessary.
    // Here we need to clean up the rest of the array.
    if locRAD != nil then
      delete locRAD;
  }
}

CreateMmap.chpl contains functions that are used in the modified block. I've listed it below.
The main takeaway is the function createMmap() returns a c_ptr(c_int) that I want myElems to use, and makeArrayFromPtr and makeArrayFromExternArray have been modified to take a domain as an argument.

use CPtr;
use SysCTypes;
use Sys;
use SysBasic;
use Time;

require "mmap.h";
require "sys/mman.h";
require "sys/stat.h";
require "fcntl.h";

extern proc shm_open(name:c_string, oflag:c_int, mode:c_int):c_int;
extern proc shm_unlink(name:c_string):c_int;
extern proc mmap(addr:c_void_ptr, length:size_t, prot:c_int, flags:c_int, fd:c_int, offset:off_t):c_void_ptr;
extern proc munmap(addr:c_void_ptr, length:size_t):c_int;
extern proc getpagesize():size_t;
extern proc ftruncate(fd:c_int, length:off_t):c_int;

extern const O_RDWR: c_int;
extern const O_CREAT: c_int;
extern const PROT_READ:c_int;
extern const PROT_WRITE:c_int;
extern const MAP_SHARED:c_int;
extern const MAP_FIXED:c_int;
//extern const nBytes:c_int;

const backingFile = "/test.bak".c_str();
const accessPerms = 0o644:c_int; // octal rw-r--r--, matching 0644 in the C reader
const nInts = 8:c_int;
const nBytes = 32:c_int; // limits the mmap region to 32 bytes (8 ints of 4 bytes each)

proc trevorMakeArrayFromExternArray(value:chpl_external_array, type eltType, dom:domain)
{
    var mydom = defaultDist.dsiNewRectangularDom(rank=1,
                                                 idxType=int,
                                                 stridable=false,
                                                 inds=(dom.low..dom.high,));
    mydom._free_when_no_arrs = true; 
    var arr = new unmanaged DefaultRectangularArr(eltType = eltType,
                                                rank = 1,
                                                idxType=mydom.idxType,
                                                stridable=mydom.stridable,
                                                dom=mydom,
                                                data=value.elts:_ddata(eltType),
                                                externFreeFunc=value.freer,
                                                externArr=true,
                                                _borrowed=true);
    mydom.add_arr(arr, locking=false);
    return _newArray(arr);
}

proc trevorMakeArrayFromPtr(value:c_ptr, num_elts:uint, dom:domain)
{
    var data = chpl_make_external_array_ptr(value:c_void_ptr, num_elts);
    return trevorMakeArrayFromExternArray(data, value.eltType, dom);
}

proc createMmap():c_ptr(c_int)
{
    var fd = shm_open(backingFile, O_RDWR | O_CREAT, accessPerms):c_int; 
    ftruncate(fd, nBytes);
    var region;
    region = mmap(nil, nBytes:size_t, PROT_READ | PROT_WRITE, MAP_SHARED, fd:fd_t, 0:off_t);
    
    var area = region:c_ptr(c_int);
    writeln("mmap region: ", region, " on ", here.id);
    return area; 
}

proc closemmap(region:c_void_ptr)
{
    munmap(region, nBytes:size_t);
    shm_unlink(backingFile);
}

I've been using the following for a test case. The code above writes the memory address for myElems as well as the pointer returned from mmap(). The desired output would be for these addresses to be the same.

Compile with: chpl testMyBlock.chpl -lrt

use myBlockDist;
use CreateMmap;
use CPtr;
use SysCTypes;
use Time;

const Space = {1..8};
const D: domain(1) dmapped Block(boundingBox=Space) = Space;
var A: [D] c_int;

for i in Space
{ 
  A[i] = (i):c_int;
}

writeln(A);

sleep(10); //time to read mmap data from another process

for loc in Locales
{
  closemmap(c_ptrTo(A[A.localSubdomain().low]));
}

If the memory address for myElems matches the mmap pointer, then another process should be able to read the data written by Chapel through mmap. I've been using this C program to test for that, but so far I haven't been able to get the memory addresses to match.

/** Compilation: gcc -o memreader memreader.c -lrt  **/
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <semaphore.h>
#include <string.h>
#include "mmap.h"

#define backingFile "/test.bak"
#define nInts 8
#define nBytes (nInts * sizeof(int))
#define accessPerms 0644


void report_and_exit(const char* msg) {
  perror(msg);
  exit(-1);
}

int main() {
  int fd = shm_open(backingFile, O_RDWR, accessPerms);  /* empty to begin */
  if (fd < 0) report_and_exit("Can't get file descriptor...");

  /* get a pointer to memory */
  caddr_t memptr = mmap(NULL,       /* let system pick where to put segment */
                        nBytes,   /* how many bytes */
                        PROT_READ | PROT_WRITE, /* access protections */
                        MAP_SHARED, /* mapping visible to other processes */
                        fd,         /* file descriptor */
                        0);         /* offset: start at 1st byte */
  if ((caddr_t) -1 == memptr) report_and_exit("Can't access segment...");

  
  int *imemptr = (int *) memptr;
  for (int i = 0; i < nInts; i++)
    printf("%d %d\n", i, imemptr[i]);
  
  printf("%p\n", (void *) imemptr);

  /* cleanup */
  munmap(memptr, nBytes);
  close(fd);
  shm_unlink(backingFile);  /* remove the shared-memory object opened with shm_open */
  return 0;
}

It looks like trevorMakeArrayFromExternArray makes a copy of the domain instead of just passing the domain through. This domain copy may be equivalent but won't have the same identity, so the array assignment will still copy. Instead of creating mydom and passing that along to the DR array with dom=mydom I would instead just pass dom=dom to ensure your array is using the same domain instance.
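
The distinction between an equivalent domain and the same domain instance can be pictured with a toy model in C. This is not Chapel's implementation; domain_t, array_t and init_from are hypothetical names used only to illustrate the behavior described in this thread, where initialization can reuse the source storage when both sides refer to the same domain instance and otherwise falls back to a copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins: a 1D domain and an array that points at its domain. */
typedef struct { long low, high; } domain_t;
typedef struct { const domain_t *dom; int *data; } array_t;

/* Same bounds, possibly different instances. */
bool domains_equivalent(const domain_t *a, const domain_t *b) {
  return a->low == b->low && a->high == b->high;
}

/* The very same instance. */
bool domains_identical(const domain_t *a, const domain_t *b) {
  return a == b;
}

/* Initialize an array declared over field_dom from src.
   Identical domain instance: reuse src's storage (a "move").
   Merely equivalent domain: allocate new storage and copy. */
array_t init_from(const domain_t *field_dom, array_t src) {
  array_t dst;
  dst.dom = field_dom;
  if (domains_identical(field_dom, src.dom)) {
    dst.data = src.data;
  } else {
    assert(domains_equivalent(field_dom, src.dom));
    assert(field_dom->high >= field_dom->low);  /* sketch assumes non-empty */
    size_t n = (size_t)(field_dom->high - field_dom->low + 1);
    dst.data = malloc(n * sizeof(int));
    assert(dst.data != NULL);
    memcpy(dst.data, src.data, n * sizeof(int));
  }
  return dst;
}
```

In this model, an array built over a copy of the domain ends up at a new address even though its bounds match, which is the symptom reported earlier in the thread.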

If that doesn't work we'll probably have to do some experimenting on our side. That will probably be easiest if you could put your code on a branch. We could apply your changes manually but directly using your branch would simplify things and reduce risk for applying changes incorrectly.

Elliot

I initially tried doing dom=dom in that function, but I got the following output:

./CreateMmap.chpl:34: In function 'trevorMakeArrayFromExternArray':
./CreateMmap.chpl:41: error: unresolved call 'DefaultRectangularArr.init(eltType=type int(32), rank=1, idxType=type int(64), stridable=0, dom=domain(1,int(64),false), data=_ddata(int(32)), externFreeFunc=c_void_ptr, externArr=1, _borrowed=1)'
$CHPL_HOME/modules/internal/DefaultRectangular.chpl:1020: note: this candidate did not match: DefaultRectangularArr.init(type eltType, param rank, type idxType, param stridable, dom: unmanaged DefaultRectangularDom(rank = rank, idxType = idxType, stridable = stridable), param initElts = true, data: _ddata(eltType) = nil, externArr = false, _borrowed = false, externFreeFunc: c_void_ptr = nil)
./CreateMmap.chpl:41: note: because call actual argument #5 with type domain(1,int(64),false)
$CHPL_HOME/modules/internal/DefaultRectangular.chpl:1022: note: is passed to formal 'dom: unmanaged domain(1,int(64),false)'
  ./CreateMmap.chpl:57: called as trevorMakeArrayFromExternArray(value: chpl_external_array, type eltType = int(32), dom: domain(1,int(64),false)) from function 'trevorMakeArrayFromPtr'
  ./myBlock2.chpl:443: called as trevorMakeArrayFromPtr(value: c_ptr(int(32)), num_elts: uint(64), dom: domain(1,int(64),false)) from initializer
  ./myBlock2.chpl:905: called as borrowed LocBlockArr.init(type eltType = int(32), param rank = 1, type idxType = int(64), param stridable = 0, locDom: _unknown, param initElts = 0) from method 'dsiBuildArray'
  $CHPL_HOME/modules/internal/ChapelArray.chpl:1589: called as borrowed BlockDom(1,int(64),false,unmanaged DefaultDist).dsiBuildArray(type eltType = int(32), param initElts = 1)
  within internal functions (use --print-callstack-on-error to see)
note: generic instantiations are underlined in the above callstack

I tried to use
dom=_to_unmanaged(dom), but this produced the same error output.

I can try to set up a branch this afternoon.

Ah, so it should instead be dom=dom._value (this unwraps to provide the underlying instance that DefaultRectangularArr expects.)

Elliot

That change seemed to make everything work the way I wanted. I am able to make a distributed array that builds the local array at my C pointer. It is important to note that if I made a temp variable = trevorMakeArrayFromPtr then set this.myElems = temp, a copy still occurred. I'm not sure why that is the case, but by removing the temp variable and setting myElems directly to trevorMakeArrayFromPtr there isn't a copy, and the C Pointer and myElems array address are the same. I haven't done very extensive testing. I have tested using only 1 locale on chapel-1.23.0-smp and with 1-2 locales on chapel-1.24.0. Both of these builds produced the desired result in those cases.

If anyone is interested in the changes, I'll leave my current version of the code below. When I first reached out to the Chapel team about the function makeArrayFromPtr, I was told that this function was not ready for stabilization and might change in the future, and that it makes certain assumptions about how to clean up arrays that are wrapped this way. This could cause problems such as leaking or double freeing memory if used incorrectly. There is probably a more stable way to achieve this functionality.

Replaced LocBlockArr Class in BlockDist.chpl

class LocBlockArr {
  type eltType;
  param rank: int;
  type idxType;
  param stridable: bool;
  const locDom: unmanaged LocBlockDom(rank, idxType, stridable);
  var locRAD: unmanaged LocRADCache(eltType, rank, idxType, stridable)?; // non-nil if doRADOpt=true
  pragma "local field" pragma "unsafe"
  // may be initialized separately
  var myPtr:c_ptr(eltType);
  var myElems: [locDom.myBlock] eltType;
  var locRADLock: chpl_LocalSpinlock;

  proc init(type eltType,
            param rank: int,
            type idxType,
            param stridable: bool,
            const locDom: unmanaged LocBlockDom(rank, idxType, stridable),
            param initElts: bool) {
    this.eltType = eltType;
    this.rank = rank;
    this.idxType = idxType;
    this.stridable = stridable;
    this.locDom = locDom;
    ////////////////////////////////////////////////////////Start Changes/////////////////////////////////////////////////////////
    if(this.locDom.myBlock.size)
    {
      var myPtr = createMmap();
      this.myElems = trevorMakeArrayFromPtr(myPtr, this.locDom.myBlock.size:uint, this.locDom.myBlock);
      //writeln("Temp Domain: ", temp.domain);
      //this.myElems = temp;
      writeln("myElems domain: ", this.myElems.domain);
      writeln("Array pointer at: ", c_ptrTo(this.myElems), " on ", here.id);
      if(myPtr != c_ptrTo(this.myElems))
      {
        writeln("here");
        closemmap(myPtr);
      }
    }
    else
    {
      this.myElems = this.locDom.myBlock.buildArray(eltType, initElts=initElts);
    } 
    ////////////////////////////////////////////////////////End Changes/////////////////////////////////////////////////////
  }

  // guard against dynamic dispatch resolution trying to resolve
  // write()ing out an array of sync vars and hitting the sync var
  // type's compilerError()
  override proc writeThis(f) throws {
    halt("LocBlockArr.writeThis() is not implemented / should not be needed");
  }

  proc deinit() {
    // Elements in myElems are deinited in dsiDestroyArr if necessary.
    // Here we need to clean up the rest of the array.
    if locRAD != nil then
      delete locRAD;
  }
}

Modified functions to avoid making a copy to myElems by using the same domain

proc trevorMakeArrayFromExternArray(value:chpl_external_array, type eltType, dom:domain)
{
    var arr = new unmanaged DefaultRectangularArr(eltType = eltType,
                                                rank = 1,
                                                idxType=int,
                                                stridable=false,
                                                dom=dom._value,
                                                data=value.elts:_ddata(eltType),
                                                externFreeFunc=value.freer,
                                                externArr=true,
                                                _borrowed=true);
    dom.add_arr(arr, locking=false);
    return _newArray(arr);
}

proc trevorMakeArrayFromPtr(value:c_ptr, num_elts:uint, dom:domain)
{
    var data = chpl_make_external_array_ptr(value:c_void_ptr, num_elts);
    return trevorMakeArrayFromExternArray(data, value.eltType, dom);
}

Thanks again to the chapel team for helping me work through this idea!

Trevor
