New Issue: Rename regexMatch fields to emphasize that they are byte-based

19076, "e-kayrakli", "Rename regexMatch fields to emphasize that they are byte-based", "2022-01-23T21:41:36Z"

The regexMatch record has two fields: offset and size.

They represent the byte offset of a match from the beginning of the string buffer and the size of the match, also in bytes. We should rename them to make it clear that they are byte-based; otherwise, they can be confused with other string fields/procs such as string.size, which counts codepoints:

use Regex;

var r = compile("rkç");
var s = "Türkçe";

var match = r.search(s);

writeln(match.size);     // 4: regexMatch.size is byte-based ("ç" takes 2 bytes in UTF-8)
writeln(s[match].size);  // 3: string.size is codepoint-based

I propose we rename

  • size to numBytes: the string and bytes types already have numBytes, so the name is consistent with them.
  • offset to byteOffset: seems clear enough, but arguably this one is more open for discussion.
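
For comparison, string itself already keeps the two units apart under different names, which is the distinction the rename would carry over to regexMatch. A minimal sketch, assuming the current string API where size counts codepoints and numBytes counts bytes:

```chapel
var s = "Türkçe";

// string.size counts codepoints
writeln(s.size);      // 6

// string.numBytes counts bytes of the UTF-8 encoding;
// "ü" and "ç" each take two bytes
writeln(s.numBytes);  // 8
```

With the proposed names, match.numBytes would line up with s.numBytes rather than with s.size.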