[Chapel Merge] Implement the `foreach` loop

Branch: refs/heads/main
Revision: a6a8ed2
Author: e-kayrakli
Log Message:

Merge pull request #18046 from e-kayrakli/add-foreach3

Implement the foreach loop

This PR implements the foreach loop as proposed in issue #16404 ("how to mark
loops as order independent without adding qthreads-tasks").

Co-developed with @mppf
Supersedes his pull request #17014 ("Replace order independent pragma with
foreach").

Background

foreach loops are the user-facing way of writing sequential yet vectorizable
loops. Before this PR, vectorization support relied on non-user-facing pragmas
attached to iterator symbols. With this PR, those pragmas are no longer
necessary if the iterator uses foreach loops.
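
As a rough illustration of the user-facing form, here is a minimal sketch of
an iterator written with foreach instead of a pragma. The names `squares` and
`total` are hypothetical and are not taken from the modules touched by this
PR; whether the loop is actually vectorized depends on the back end and the
compiler flags used.

```chapel
// Hypothetical iterator, for illustration only.
// The foreach marks the loop as order independent: it still runs
// sequentially (no tasks are created), but the compiler may vectorize it.
iter squares(n: int) {
  foreach i in 1..n do
    yield i * i;
}

// Consume the iterator with an ordinary serial for loop.
var total = 0;
for x in squares(4) do
  total += x;

writeln(total);  // 1 + 4 + 9 + 16 = 30
```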

Implementation

The implementation is rather straightforward. foreach loops are just
ForLoops that are made order independent from the time they are created.
foreach-ness of the loop is not stored in the ForLoop's interface.

Similar to what we have today, a ForLoop can be made order independent during
resolution if its body consists of only a yield and it iterates over something
that is also order independent. If that happens to a for loop, it is
indistinguishable from a foreach for the rest of the compilation.
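
The rule above can be pictured with a small sketch. The iterators `inner` and
`wrapper` are hypothetical names chosen for illustration. The for loop in
`wrapper` has the shape described above (a body that is only a yield,
iterating over something order independent), so it is a candidate for being
made order independent during resolution. That marking happens inside the
compiler and is not visible in the program's output.

```chapel
// Hypothetical iterators, for illustration only.

// Order independent from creation, because it is written with foreach.
iter inner(n: int) {
  foreach i in 1..n do
    yield i;
}

// A plain for loop whose body is only a yield and whose iterand (inner)
// is order independent: the shape the resolution-time rule looks for.
iter wrapper(n: int) {
  for i in inner(n) do
    yield i;
}

// The observable behavior is the same as any serial loop.
var sum = 0;
for v in wrapper(5) do
  sum += v;

writeln(sum);  // 1 + 2 + 3 + 4 + 5 = 15
```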

Note that we still need the pragma order independent yielding loops (and its
inverse, just for completeness (?)) to mark iterators that use non-for loops
as order independent. However, with this PR, the flags corresponding to those
pragmas are no longer added by the compiler. Instead, the compiler chooses
between for/foreach when writing iterators for forall expressions, for
example.

Future work

  • We need to implement foreach intents. (It appears we'll also add for
    intents; see issue #17857, "Should we support task intents on `for`
    loops?")

  • @bradcray has made a good point that intent implementations for
    for/foreach and forall may end up being sufficiently different.
    Depending on what we see then, we could think of creating a ForeachLoop AST
    node.

  • Do we want foreach expressions? Currently they fail with a generic syntax
    error.

  • We should probably beef up testing, especially for the standard/internal
    module iterators that now use foreach to make sure that they are vectorized
    as appropriate. (I can add some more testing in this PR, but maybe not
    exhaustively)

[Reviewed by @mppf]

Testing

  • [x] release/examples with --fast

  • [x] standard

  • [x] gasnet

    Modified Files:
    A test/performance/vectorization/notAForeachReport.chpl
    A test/performance/vectorization/notAForeachReport.compopts
    A test/performance/vectorization/notAForeachReport.good
    A test/performance/vectorization/vectorPragmas/basicItersForeach.chpl
    A test/performance/vectorization/vectorPragmas/basicItersForeach.compgood
    A test/performance/vectorization/vectorPragmas/cForLoopInParIterForeach.chpl
    A test/performance/vectorization/vectorPragmas/cForLoopInParIterForeach.compgood
    A test/performance/vectorization/vectorPragmas/doWhileInParIterForeach.chpl
    A test/performance/vectorization/vectorPragmas/doWhileInParIterForeach.compgood
    A test/performance/vectorization/vectorPragmas/forallInStandaloneForeach.chpl
    A test/performance/vectorization/vectorPragmas/forallInStandaloneForeach.compgood
    A test/performance/vectorization/vectorPragmas/loopWithoutYieldForeach.chpl
    A test/performance/vectorization/vectorPragmas/loopWithoutYieldForeach.compgood
    A test/performance/vectorization/vectorPragmas/nestedLoopsInFollowerForeach.chpl
    A test/performance/vectorization/vectorPragmas/nestedLoopsInFollowerForeach.compgood
    A test/performance/vectorization/vectorPragmas/nonInlinableFollowerForeach.chpl
    A test/performance/vectorization/vectorPragmas/nonInlinableFollowerForeach.compgood
    A test/performance/vectorization/vectorPragmas/whileDoInParIterForeach.chpl
    A test/performance/vectorization/vectorPragmas/whileDoInParIterForeach.compgood
    A test/performance/vectorization/vectorPragmas/zipIterInFollowerForeach.chpl
    A test/performance/vectorization/vectorPragmas/zipIterInFollowerForeach.compgood
    A test/performance/vectorization/vectorPragmas/zipIterInFollowerNoVectorForeach.chpl
    A test/performance/vectorization/vectorPragmas/zipIterInFollowerNoVectorForeach.compgood
    R test/performance/vectorization/foreach/parse-foreach.chpl
    R test/performance/vectorization/foreach/parse-foreach.good
    M compiler/AST/ForLoop.cpp
    M compiler/AST/LoopExpr.cpp
    M compiler/include/ForLoop.h
    M compiler/include/bison-chapel.h
    M compiler/include/flags_list.h
    M compiler/parser/bison-chapel.cpp
    M compiler/parser/chapel.ypp
    M compiler/resolution/resolveFunction.cpp
    M modules/dists/BlockCycDist.chpl
    M modules/dists/BlockDist.chpl
    M modules/dists/CyclicDist.chpl
    M modules/dists/DimensionalDist2D.chpl
    M modules/dists/HashedDist.chpl
    M modules/dists/PrivateDist.chpl
    M modules/dists/SparseBlockDist.chpl
    M modules/dists/StencilDist.chpl
    M modules/dists/dims/BlockCycDim.chpl
    M modules/dists/dims/BlockDim.chpl
    M modules/dists/dims/ReplicatedDim.chpl
    M modules/internal/ArrayViewRankChange.chpl
    M modules/internal/ArrayViewReindex.chpl
    M modules/internal/ArrayViewSlice.chpl
    M modules/internal/Bytes.chpl
    M modules/internal/BytesStringCommon.chpl
    M modules/internal/ChapelArray.chpl
    M modules/internal/ChapelError.chpl
    M modules/internal/ChapelHashtable.chpl
    M modules/internal/ChapelLocale.chpl
    M modules/internal/ChapelRange.chpl
    M modules/internal/ChapelTuple.chpl
    M modules/internal/DefaultAssociative.chpl
    M modules/internal/DefaultRectangular.chpl
    M modules/internal/DefaultSparse.chpl
    M modules/internal/String.chpl
    M modules/layouts/LayoutCS.chpl
    M modules/packages/DistributedBag.chpl
    M modules/packages/DistributedDeque.chpl
    M modules/packages/EpochManager.chpl
    M modules/packages/FunctionalOperations.chpl
    M modules/packages/LinkedLists.chpl
    M modules/packages/LockFreeQueue.chpl
    M modules/packages/LockFreeStack.chpl
    M modules/packages/OrderedSet/Treap.chpl
    M modules/packages/RangeChunk.chpl
    M modules/packages/RecordParser.chpl
    M modules/packages/Sort.chpl
    M modules/packages/VisualDebug.chpl
    M modules/standard/DateTime.chpl
    M modules/standard/FileSystem.chpl
    M modules/standard/Heap.chpl
    M modules/standard/IO.chpl
    M modules/standard/List.chpl
    M modules/standard/Map.chpl
    M modules/standard/Random.chpl
    M modules/standard/Set.chpl
    M modules/standard/Types.chpl
    M test/performance/vectorization/vectorPragmas/GOOD_EXEC_OUTPUT
    M test/performance/vectorization/vectorPragmas/InvokeIters.chpl
    M test/performance/vectorization/vectorPragmas/basicIters.compgood
    M test/performance/vectorization/vectorPragmas/cForLoopInParIter.compgood
    M test/performance/vectorization/vectorPragmas/doWhileInParIter.compgood
    M test/performance/vectorization/vectorPragmas/forallInStandalone.compgood
    M test/performance/vectorization/vectorPragmas/loopWithoutYield.compgood
    M test/performance/vectorization/vectorPragmas/loopsInForallNoVector.chpl
    M test/performance/vectorization/vectorPragmas/loopsInForallNoVector.compgood
    M test/performance/vectorization/vectorPragmas/nestedLoopsInFollower.compgood
    M test/performance/vectorization/vectorPragmas/nonInlinableFollower.compgood
    M test/performance/vectorization/vectorPragmas/whileDoInParIter.compgood
    M test/performance/vectorization/vectorPragmas/zipIterInFollower.compgood
    M test/performance/vectorization/vectorPragmas/zipIterInFollowerNoVector.compgood
    M test/performance/vectorization/vectorPragmas/zipIterNoVector.chpl
    M test/performance/vectorization/vectorizeOnly/iterator-loop-vectorization.chpl
    M test/performance/vectorization/vectorizeOnly/iterator-loop-vectorization.good

    Compare: ea96e13593a5...a6a8ed23e80a (chapel-lang/chapel)