[Chapel Merge] responsive compiler prototype: query framework

Branch: refs/heads/master
Revision: ae295b0
Author: mppf
Log Message:

Merge pull request #17586 from mppf/new-ast-query-framework

responsive compiler prototype: query framework

This PR contains a query framework and supporting classes as part of the
prototype compiler library effort.

Split out from PR #17583, which is a start at the compiler revamp effort
described in the
1.24 release notes ongoing efforts.

The new effort implements the early portions of the compiler as "queries"
that have no side-effects other than computing some output from some
input. These queries are memoized in a manner that allows recomputing
them when a dependency has changed. The idea is that this structure can
better support IDEs but at the very least it is highly incremental which
is one of the goals here. This strategy is based upon the approach used
in the Rust compiler (described in
this talk).

  • Adds ErrorMessage and Location classes. ErrorMessage contains a
    Location. The query framework saves ErrorMessages along with query
    results so needs to have these types available. That way, the
    ErrorMessage can be emitted again in an incremental setting when the
    query itself is not recomputed.
  • Adds UniqueString, which is a type that represents a string unique'd in
    a table. This is similar to astr in the old compiler, with four
    differences:
    1. The UniqueString table is stored in the Context rather than as a
      global variable
    2. The UniqueString can store short strings inline
    3. Uses a different type for unique'd strings - namely UniqueString -
      where the astr mechanism just returned const char* which made it
      unclear whether or not the string was already unique'd.
    4. The UniqueString table can be garbage collected.
  • The type PODUniqueString is used in the parser, where the results
    parser actions are stored in the union, so the types in that union
    can't have default constructors. It should not be used outside of such
    a restricted setting.
  • Adds the Context type, which serves a sort of program database. See the
    comment at the start of Context for details. The saved state in the
    Context type can be garbage collected by 1) incrementing the revision
    counter and requesting garbage collection (so "mark" step can be done
    where appropriate); 2) running a toplevel query; and 3) calling
    collectGarbage after the toplevel query (and all other queries) are
    done running.

Some notes about the query implementation:

  • The implementation uses C++11 variadic templates. I found these
    references useful in figuring out how to use them:
    RIP index_sequence, 2014–2017. It was good while it lasted… | by Matt Aubury | Medium
    Variadic templates in C++ - Eli Bendersky's website
  • The query implementation uses a lot of std::unordered_map and in
    implementing hash functions we sometimes need something to combine two
    hashes. I added a hash_combine function based on the Boost one which
    is documented here:
    Function template hash_combine - 1.35.0
  • The query implementation uses a combine operation on the results. The
    justification for this is that, during parsing, the AST tree being
    parsed might be substantially the same, with say one change in a block
    inside a block inside a function. In that event, we'd like to keep the
    old AST nodes where possible. That should allow:
    1. reuse of queries that referred to these old AST nodes
    2. queries that have pointers to AST nodes in their results,
      provided that they do pointer comparision (because if the pointer
      has changed, we have a new (and different) AST element; but if
      the pointer has not changed, we have reused the old AST element
      pointer).
  • Related to the above, the query implementation is careful to store both
    the new and old results when doing a combine operation. The intention
    of this is to avoid ABA-like problems.
    • I have not observed such a problem with the combining but I think it
      is possible. The map could contain a value that contains a pointer to
      something old. If the map entry it depends upon it had no changes, it
      could also have no changes. That could be a problem if the pointer
      value was freed and then allocated in the time between checking the
      depended-upon entry and the map entry itself.
    • Anyway keeping the old values around fits in with the
      'context->collectGarbage()' mechanism. It should be possible to
      access memory inside of an old value, but the old value will be
      changed by combine calls, so if it's garbage the value won't match up
      anyway. As a result, after the parseFile call, AST elements in values
      should only have their pointers compared in combine calls.

Future Work:

  • [x] full local testing

Reviewed by @leekillough @e-kayrakli and @dlongnecke-cray - thanks!

Modified Files:
A compiler/next/include/chpl/AST/ErrorMessage.h

A compiler/next/include/chpl/AST/Location.h
A compiler/next/include/chpl/AST/UniqueString.h
A compiler/next/include/chpl/AST/UniqueStringDetail.h
A compiler/next/include/chpl/Queries/Context.h
A compiler/next/include/chpl/Queries/ContextDetail.h
A compiler/next/include/chpl/Queries/QueryImpl.h
A compiler/next/include/chpl/Util/memory.h
A compiler/next/lib/AST/ErrorMessage.cpp
A compiler/next/lib/AST/UniqueString.cpp
A compiler/next/lib/Queries/Context.cpp
A compiler/next/lib/Util/bswap.h
A compiler/next/lib/Util/files.cpp
A compiler/next/lib/Util/files.h
A compiler/next/lib/Util/mystrerror.cpp
A compiler/next/lib/Util/mystrerror.h
A compiler/next/lib/Util/sys_basic.h

Compare: https://github.com/chapel-lang/chapel/compare/f561755541de...ae295b0ef565