New Issue: Should we always unroll loops over tuples? (or those up to a given size?)

19544, "bradcray", "Should we always unroll loops over tuples? (or those up to a given size?)", "2022-03-25T01:43:49Z"

While reviewing PR #19537 ("New version of nbody" by bradcray, chapel-lang/chapel) in the performance meeting today, it was noted that if loops over homogeneous tuples were unrolled, we could write a loop like:

  for param i in 0..<numBodies do
    bodies[i].pos += dt * bodies[i].vel;

as simply:

  for b in bodies do
    b.pos += dt * b.vel;

The incentive for doing this is that, in cases where we've unrolled such loops, exposing all of those flat operations to the back-end compiler has improved the performance of the generated code.
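For reference, here is a minimal, self-contained sketch of the two loop forms side by side. The `body` record, the value of `numBodies`, and the initial velocities are stand-ins invented for illustration (they are not taken from the nbody PR); only the two loop bodies mirror the snippets above.

```chapel
// Hypothetical stand-in for the nbody record; only pos/vel are modeled.
record body {
  var pos: 3*real;
  var vel: 3*real;
}

param numBodies = 3;   // assumed size, chosen for illustration
const dt = 0.5;        // assumed time step

var bodies: numBodies*body;   // homogeneous tuple of records

// Give every body the same (made-up) velocity.
for param i in 0..<numBodies do
  bodies[i].vel = (1.0, 2.0, 4.0);

// Form 1: param loop, unrolled at compile time (one copy of the body per i).
for param i in 0..<numBodies do
  bodies[i].pos += dt * bodies[i].vel;

// Form 2: direct iteration over the tuple; today this is an ordinary loop.
for b in bodies do
  b.pos += dt * b.vel;

writeln(bodies[0].pos);
```

Both loops apply the same update, so each position is advanced twice here; the proposal is about letting the second form get the unrolling that the first form gets by construction.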

The main downside would be that while tuples are typically modest in size, they aren't always, so unrolling could cause a surprising code explosion. To guard against that, we could limit unrolling to tuples up to a certain size, or unroll only by a specific factor rather than by the tuple's full size.
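As a rough illustration of the size-limited policy, here is how it can be emulated in user code today. This is only a sketch of the idea, not a proposal for how the compiler would implement it; `unrollLimit`, `advance`, and the `body` record are hypothetical names introduced for this example.

```chapel
// Hypothetical record; the default velocity is made up for illustration.
record body {
  var pos: 3*real;
  var vel: 3*real = (1.0, 2.0, 4.0);
}

param unrollLimit = 8;   // hypothetical threshold for full unrolling

// 'bodies' is expected to be a homogeneous tuple of 'body' records.
proc advance(ref bodies, dt: real) {
  if bodies.size <= unrollLimit {
    // Tuple size is a param, so this branch is selected at compile time
    // and the loop is fully unrolled.
    for param i in 0..<bodies.size do
      bodies[i].pos += dt * bodies[i].vel;
  } else {
    // Large tuples fall back to an ordinary loop to avoid code explosion.
    for i in 0..<bodies.size do
      bodies[i].pos += dt * bodies[i].vel;
  }
}

var small: 2*body;    // at or below the limit: unrolled path
var large: 16*body;   // above the limit: ordinary loop

advance(small, 0.5);
advance(large, 0.5);
```

Because the condition depends only on the tuple's size, each instantiation of `advance` keeps just one of the two loops. Unrolling by a fixed factor instead would need a strided outer loop plus a remainder loop, which is not shown here.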

Somewhat related issue: #14929 ("Support direct iteration over heterogeneous tuples", chapel-lang/chapel)