Memory Mapped File Operations or Raw I/O

Does Chapel have the ability to memory map a file?

As an alternative, does Chapel support binary unbuffered I/O? I cannot see how to guarantee that file transfers go from (big) in-memory data structures straight to the file.

Thanks - D

Does Chapel have the ability to memory map a file?

Yes. It used to be the default for large enough files, but we moved away from that for two reasons: it was causing performance problems on Lustre filesystems, and, generally speaking, read/write calls are what applications more commonly use.

You can still request memory mapping with hints=IOHINT_PARALLEL or hints=QIO_METHOD_MMAP when you open the file. Both should activate the mmap logic today. We need to work on better names for these, so expect the names to change at some point. You can also bitwise-OR (|) in QIO_HINT_SEQUENTIAL to double the OS readahead.

Also, note that, in my previous experience, I've seen performance problems with using mmap to write large files while extending the size of the file.

Anyway, here is an example that works today with mmap (it mmaps 8MiB regions; without the hint it will pread 64KiB at a time).

use IO;

config const path="pidigits";

proc main() {
  // can also use hints=IOHINT_PARALLEL to get mmap, today
  var f = open(path, iomode.r, hints=QIO_METHOD_MMAP);

  var r = f.reader();

  var nZeroBytes = 0;
  var byte: uint(8);
  while r.readBinary(byte) {
    if byte == 0 then
      nZeroBytes += 1;
  }

  writeln(nZeroBytes, " zero bytes in ", path);
}

As an alternative, does Chapel support binary unbuffered I/O? I cannot see how to guarantee that file transfers go from (big) in-memory data structures straight to the file.

The I/O system used to, but it doesn't really today. However, you are welcome to use extern proc calls to invoke the read or write system calls yourself. To do that, you'd also have to open the file yourself with an extern proc call, though it would be a reasonable request to have a Chapel file method that gives back the OS file descriptor number if one exists.
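To make the extern proc route concrete, here is a rough sketch rather than a tested recipe. It assumes a recent Chapel where CTypes provides c_ptrConst and string.c_str() returns c_ptrConst(c_char); older releases use c_string instead. The names sysOpen, sysWrite, sysClose and writeRaw are made up for this example, and it only works for a local, contiguous array.

```chapel
use CTypes;

// Headers that declare the POSIX symbols bound below.
require "fcntl.h", "unistd.h";

extern const O_WRONLY: c_int;
extern const O_CREAT: c_int;
extern const O_TRUNC: c_int;

// Hypothetical Chapel-side names, chosen so they do not clash with
// Chapel's own open/write routines.
extern "open"  proc sysOpen(path: c_ptrConst(c_char), flags: c_int, mode: c_int): c_int;
extern "write" proc sysWrite(fd: c_int, buf: c_ptr(uint(8)), count: c_size_t): c_ssize_t;
extern "close" proc sysClose(fd: c_int): c_int;

// Write all of `data` to `path` with write(2), skipping Chapel's
// user-space buffering. Returns true on success.
proc writeRaw(path: string, ref data: [] uint(8)): bool {
  const fd = sysOpen(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0o644: c_int);
  if fd < 0 then return false;

  const base = c_ptrTo(data);   // pointer to the first element
  var done = 0;
  var ok = true;

  // write(2) may accept fewer bytes than requested, so loop until finished.
  while done < data.size {
    const n = sysWrite(fd, base + done, (data.size - done): c_size_t);
    if n <= 0 { ok = false; break; }
    done += n;
  }

  return sysClose(fd) == 0 && ok;
}
```

Something like writeRaw("out.bin", myBytes) would then hand the array's memory directly to the kernel. This sketch does not retry on EINTR or report errno, both of which real code would want.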

Note that the OS generally buffers your I/O as well, unless you use O_DIRECT when opening your file with the open system call, but that flag has a raft of caveats and is the subject of this rant by Linus Torvalds.

Anyway, besides all of the above, generally speaking if you are reading/writing big arrays, it's reasonable to expect that the array-at-a-time I/O calls you make will have reasonable performance. But that might not be the case today and we encourage you to open an issue with a particular example reproducer if you are able.
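For reference, array-at-a-time I/O could look roughly like the following. This is a sketch that assumes a newer IO module than the mmap example above (ioMode rather than iomode, and writeBinary/readBinary overloads that accept whole arrays). The helper name roundTrip and the file name are made up.

```chapel
use IO;

// Write A to `path` in a single call, read it back into a fresh array,
// and report whether the two match.
proc roundTrip(path: string, const ref A: [] real): bool {
  {
    var f = open(path, ioMode.cw);
    var w = f.writer(locking=false);
    w.writeBinary(A);            // one call for the whole array
    w.close();
    f.close();
  }

  var B: [A.domain] real;
  {
    var f = open(path, ioMode.r);
    var r = f.reader(locking=false);
    r.readBinary(B);             // one call for the whole array
    r.close();
    f.close();
  }

  return && reduce (A == B);
}

config const n = 1_000_000;
var A: [0..#n] real = 1.5;
writeln(roundTrip("array.bin", A));
```

Handing the I/O layer the whole array in one call avoids per-element call overhead. If a pattern like this turns out slow for you, it would make a good reproducer for an issue.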

A related issue that we are looking at is chapel-lang/chapel issue #18913 on GitHub, "Chapel I/O Significantly Slower than GLIBC fread/fwrite (10x)".

Best,

-michael