17327, "mstrout", "Propose not supporting first and last methods on sparse domains", "2021-03-03T21:26:44Z"
Currently for domains, D.first and D.last, will return the first and last index in a domain respectively based on the order the indices/coordinates are stored in. This means for a CSR or CSC sparse ordering in LayoutCS, it will return the index/coordinate for the first non-zero in the index array.
The main question is whether we should continue to support this for sparse domains. If we decide not to then we would need to do the following:
- [ ] deprecate those methods for the sparse matrix implementations
- [ ] update any relevant documentation
- [ ] remove their use from test cases
@bradcray thoughts about this:
I’d be inclined to not support .first and .last on sparse domains, at least until we have more discussion about them. Starting with some equivalent questions: “Should a for loop on a dense 2D domain/array always iterate over the indices/elements in the same order (“to make the science of a code independent of the data structure order”), or should a domain map be able to define its own order (“for efficiency”) such that a RMO array would return in RMO, a CMO array in CMO, and an array using tiled memory or a space-filling curve would reveal that order. And similarly, if the 2D array is printed out, should that printout be independent of its memory layout, or reflect it? In practice, we’ve usually said “the same order” for productivity and postulated that there could be a .naturalOrder() iterator method if the user didn’t care about the order and wanted whatever serial ordering was faster.
The .first/.last / serial iteration order question also applies to sparse domains where, to me, it’s not clear how an algorithm could rely on these queries if they reflected the layout in memory (i.e., this may be the first index, but that doesn’t mean there aren’t other indices that aren’t lower/higher). That said, it’s not that clear to me how an algorithm would benefit from these queries even if they were well-defined. Where for a dense domain/array, they represent extreme indices regardless of serial iteration order, for a sparse domain/array, they don’t imply that there isn’t another index with a lower/higher row/column number, so how would I make use of them effectively?
@mstrout thoughts:
I would also be inclined to not support .first and .last on sparse domains. The number of different orders is mind boggling especially when one considers the multi-dimensional generalization, sparse tensors.