19544, "bradcray", "Should we always unroll loops over tuples? (or those up to a given size?)", "2022-03-25T01:43:49Z"
While reviewing PR #19537 ("New version of nbody", chapel-lang/chapel) in the performance meeting today, it was noted that if loops over homogeneous tuples were unrolled, we could write a loop like:
for param i in 0..<numBodies do
  bodies[i].pos += dt * bodies[i].vel;
as simply:
for b in bodies do
  b.pos += dt * b.vel;
The incentive for doing this is that, in cases where we've unrolled such loops, exposing all of those flat operations to the back-end compiler has improved the performance of the generated code.
The main downside is that, while tuples are typically modest in size, they aren't always, so unrolling could cause a surprising code explosion. To avoid that, we could limit unrolling to tuples up to a certain size, or just unroll by a specific factor of the tuple's size.
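As a rough illustration of the size-limited option, here is a minimal sketch of what a user would write by hand today to get the same effect. The `body` record, the `advance` routine, and the `unrollLimit` cutoff of 8 are all hypothetical, chosen only for this example; the issue does not propose a specific threshold or implementation.

```chapel
// Hypothetical stand-in for the body record in the nbody benchmark.
record body {
  var pos, vel: 3*real;
}

// Assumed cutoff for illustration; not a value proposed in this issue.
param unrollLimit = 8;

// Advance every body in a homogeneous tuple by one time step.
proc advance(ref bodies, dt: real) {
  if bodies.size <= unrollLimit {
    // The tuple size is a param, so this condition is resolved at
    // compile time, and the param loop below is fully unrolled.
    for param i in 0..<bodies.size do
      bodies[i].pos += dt * bodies[i].vel;
  } else {
    // Larger tuples keep an ordinary loop to avoid code explosion.
    for i in 0..<bodies.size do
      bodies[i].pos += dt * bodies[i].vel;
  }
}

var bodies: 5*body;
for param i in 0..<bodies.size do
  bodies[i].vel = (1.0, 2.0, 3.0);

advance(bodies, 0.5);
writeln(bodies[0].pos);
```

Under the proposal, the compiler would make this choice itself for `for b in bodies`, so user code would not need the explicit branch.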
Somewhat related issue: #14929 ("Support direct iteration over heterogeneous tuples", chapel-lang/chapel)