New Issue: I/O module: readUntil

19769, "mppf", "I/O module: readUntil", "2022-05-06T15:14:33Z"

This issue is a spin-off from issue #19496.

This issue proposes adding a readUntil function that reads a string/bytes up to some kind of separator. Reading a line is such a common operation that it should get its own function (e.g. readLine as discussed in #19495). However, it is also useful to have a more general function that reads until an arbitrary separator.

Here is a sketch of what this might look like:

// maxSize arguments indicate that the function should throw
// if it finds an input longer than that (leaving the input unconsumed)


// keepSeparator means that a separator found in the input will be included in
// the returned string. Note that if the input reaches EOF without a separator,
// the returned string won't contain a separator, even if keepSeparator=true.

// The first set uses a separator that is a string/bytes, so it could be e.g. "end".

// For this one, t can be bytes or string
proc reader.readUntil(type t=string, separator: t, maxSize=-1, keepSeparator=true): t throws

// these two functions:
//   return `false` if EOF is reached and no data is read
//   resize the passed string/bytes (but may reuse the existing buffer)
proc reader.readUntil(ref s: string, separator: string, maxSize=-1, keepSeparator=true): bool throws
proc reader.readUntil(ref b: bytes, separator: string, maxSize=-1, keepSeparator=true): bool throws

// The second set reads until a regular expression
proc reader.readUntil(type t=string, separator: regex(t), maxSize=-1, keepSeparator=true): t throws

// these two functions:
//   return `false` if EOF is reached and no data is read
//   resize the passed string/bytes (but may reuse the existing buffer)
proc reader.readUntil(ref s: string, separator: regex(string), maxSize=-1, keepSeparator=true): bool throws
proc reader.readUntil(ref b: bytes, separator: regex(bytes), maxSize=-1, keepSeparator=true): bool throws
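
To make the intended semantics concrete, here is a minimal sketch written in Python rather than Chapel, since the functions above are only proposed. It models a reader as an in-memory string plus a cursor. The names Reader, InputTooLong, and read_until are hypothetical and exist only for this illustration. It also makes three assumptions the sketch above does not settle: maxSize counts the separator, an empty separator match is rejected, and the separator is consumed from the input even when keepSeparator is false.

```python
import re
from dataclasses import dataclass


@dataclass
class Reader:
    """Hypothetical stand-in for a reader: the full input plus a cursor."""
    text: str
    pos: int = 0


class InputTooLong(Exception):
    """Hypothetical error type; the proposal only says the function throws."""


def read_until(reader, separator, max_size=-1, keep_separator=True):
    """Model of the proposed readUntil.

    Returns the next chunk, or None if EOF is reached and no data is read
    (this plays the role of the `false` return in the ref overloads).
    """
    if reader.pos >= len(reader.text):
        return None

    # The separator may be a plain string or a compiled regular expression.
    if isinstance(separator, re.Pattern):
        m = separator.search(reader.text, reader.pos)
        span = m.span() if m else None
    else:
        i = reader.text.find(separator, reader.pos)
        span = (i, i + len(separator)) if i >= 0 else None

    if span is not None and span[0] == span[1]:
        # Assumption: an empty match would never make progress, so reject it.
        raise ValueError("separator must match at least one character")

    # With no separator before EOF, the chunk runs to the end of the input.
    sep_start, sep_end = span if span is not None else (len(reader.text),) * 2

    # Assumption: max_size counts the separator. On failure the cursor is
    # not moved, so the input is left in place.
    if max_size >= 0 and sep_end - reader.pos > max_size:
        raise InputTooLong(f"input longer than {max_size}")

    chunk = reader.text[reader.pos:(sep_end if keep_separator else sep_start)]
    reader.pos = sep_end  # the separator is always consumed
    return chunk


r = Reader("a,b;;c")
first = read_until(r, ",")                                     # "a,"
second = read_until(r, re.compile(r";+"), keep_separator=False)  # "b"
third = read_until(r, ",")   # "c": EOF reached, so no separator is included
fourth = read_until(r, ",")  # None: EOF and no data read
```

Note that the third call returns a chunk without a separator even though keep_separator is true, matching the EOF rule described in the comments above.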

These functions have some similarity to:

  • #19610.

Should it be called readUntil? Or does that imply that it leaves the separator in the input? I don't think this function should leave the separator in the input. I'm not sure I can think of a better name. readPast doesn't sound great to me (and "past" could be misinterpreted as "read from history").