20059, "benharsh", "Should file.lines be an iterator, or return an iterable object?", "2022-06-22T17:38:11Z"
Our current file.lines signature:
proc file.lines(param locking:bool = true,
start:int(64) = 0,
end:int(64) = max(int(64)),
hints:iohints = IOHINT_NONE) throws
This method returns an iterable object (an ItemReader) that will yield lines from the file. Is that desirable, or should this simply be an iterator?
By returning an object, the caller can hold onto the result and evaluate it lazily, though one could argue that the same functionality should be available with normal iterators as well.
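To make the two options concrete, here is a minimal sketch that uses an in-memory array in place of a file. The names `data`, `linesIter`, `linesObj`, and `LineReader` are hypothetical and chosen only for illustration; this is not the actual ItemReader or file.lines implementation.

```chapel
// Hypothetical stand-in for the lines of a file.
const data = ["alpha", "beta", "gamma"];

// Option 1: a plain iterator that yields each line directly.
iter linesIter() {
  for s in data do yield s;
}

// Option 2: a procedure returning an iterable object.
// The record is iterable because it defines a these() iterator.
record LineReader {
  iter these() {
    for s in data do yield s;
  }
}

proc linesObj() {
  return new LineReader();
}

// The iterator is consumed directly in the loop.
for l in linesIter() do writeln(l);

// The object can be stored first and iterated later.
const reader = linesObj();
for l in reader do writeln(l);
```

At the call site both forms look the same in a for loop; the difference is that the object can be stored in a variable, passed around, and iterated at a later point.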
In weighing this design issue, keep in mind that the original purpose of file.lines is to support distributed and/or parallel iteration over a file. Is there an advantage in returning an object in those cases? Does robust support for a distributed/parallel file.lines suggest what the signature should be, and does that indicate a preference between an iterable and an iterator? See #4959 for relevant discussion on parallel file.lines.