New Issue: What should the syntax for a context manager statement look like?

17965, "dlongnecke-cray", "What should the syntax for a context manager statement look like?", "2021-06-22T23:09:33Z"

In #17809 there was more or less unanimous support for introducing
new syntax to support a context management statement.

This issue explores some possibilities for what the syntax of this
statement could look like.

To start things off, here's an example of a context manager taken
straight out of Python (which is what inspired this effort):

with createAndUseSomeManager() as myResource:
  myResource.doSomething()

This code calls a function createAndUseSomeManager(), which
returns an instance of a type that can be used as a context manager.
Under the covers, the with statement calls __enter__ on that object
to get a handle to a resource, and calls __exit__ when the block
ends. Within the scope of the managed block, the resource can be
referred to as myResource.
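
As a rough sketch of what the Python statement above does under the
covers, here is a hand-written expansion next to the statement form.
The Manager and Resource classes are hypothetical stand-ins, and the
expansion is simplified: the real with statement also passes exception
details to __exit__ and lets it suppress the exception.

```python
class Resource:
    """Hypothetical resource handed out by the manager."""
    def __init__(self):
        self.log = []

    def doSomething(self):
        self.log.append("doSomething")


class Manager:
    """Hypothetical context manager; names are for illustration only."""
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return Resource()

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not suppress exceptions


def createAndUseSomeManager():
    return Manager()


# The statement form.
m1 = createAndUseSomeManager()
with m1 as myResource:
    myResource.doSomething()

# A simplified hand-written expansion of the same statement.
# (The real statement forwards exception info to __exit__.)
m2 = createAndUseSomeManager()
myResource2 = m2.__enter__()
try:
    myResource2.doSomething()
finally:
    m2.__exit__(None, None, None)
```

Both forms enter the manager, run the body against the value returned
by __enter__, and leave the manager afterwards.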

So for the context manager construct, we're going to need syntax that
represents at least three pieces:

  • An expression to get the manager object, which produces a resource
  • An identifier that can be used as a "resource handle"
  • A statement (which could be a block)

1) Immediate translation to Chapel

Let's say we just translated the Python syntax directly into Chapel.
We might end up with something like:

with createAndUseSomeManager() as myResource do
  myResource.doSomething();

This may be acceptable. However, some open questions remain that
may require additional syntax to answer.

2) Handling multiple return intents

One open question is what the storage kind (e.g. ref/const ref/var)
of myResource should be when the context manager's __enter__ method
has overloads with different return intents.

E.g., imagine if we used record foo as a context manager:

record foo {
  var x = 0;

  // This would be called '__enter__' in Python...
  proc enterThis() ref { return x; }
  proc enterThis() { return x; }
  proc leaveThis() {}
}

Which overload of enterThis() is selected for myResource? What is
the storage kind of myResource? If we wanted to omit an explicit
storage kind, we would need to decide how to disambiguate based on
return intent.

But if we do not want to omit it, or if we would like to explicitly
select which overload is called, we can introduce a fourth piece of
syntax, a storage kind placed before the resource handle:

// Create a copy of 'myResource' as a 'var'.
with createAndUseSomeManager() as var myResource do
  myResource.doSomething();

// Refer to 'myResource' with 'ref'.
with createAndUseSomeManager() as ref myResource do
  myResource.modify();

3) Potential confusion between 'with' and task intents

The with keyword is already used to introduce task/forall intent
lists (also known as "with clauses").

Because of this, we might want to use a different keyword besides
with to signal the start of a context manager statement. One
option is to use the keyword manage:

manage createAndUseSomeManager() as myResource do
  myResource.doSomething();

This has the downside of adding a new keyword to the language.

4) Using a syntax that looks more like assignment

One idea that has been proposed is to use a syntax that looks more
like assignment instead of using the as keyword to bind the
resource to a handle:

with myResource = createAndUseSomeManager() do
  myResource.doSomething();

This idea is interesting, but I think it has a downside, which is
that what is occurring under the covers is not actually assignment
of createAndUseSomeManager() to myResource. Instead, what happens
looks more like:

ref tempManager = createAndUseSomeManager();
var myResource = tempManager.enterThis();

Because it is too easy to take this form of syntax as assignment at
face value, I think it may cause more harm than good.
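
The same distinction is visible in Python, where the name after as is
bound to whatever __enter__ returns rather than to the manager itself.
A minimal sketch, using a hypothetical Manager class:

```python
class Manager:
    """Hypothetical context manager used only for illustration."""
    def __enter__(self):
        # The value returned here is what the 'as' name is bound to.
        return {"kind": "resource"}

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not suppress exceptions


manager = Manager()
with manager as myResource:
    bound = myResource

# 'bound' is the object returned by __enter__, not the manager.
is_manager = bound is manager
```

An assignment-like spelling would suggest that myResource and the
manager are the same object, which is not the case here.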

5) Unifying context managers and task intents (with clauses)

Because the original translated form uses the with keyword, this
gave rise to a novel idea: what if we could use context managers
within with clauses?

Perhaps something like:

begin with (myResource = createAndUseSomeManager()) {
  myResource.doSomething();
}

This is a cool idea. However, this first translation suffers from the
same problem as (4): the syntax cannot be distinguished from
assignment.

To fix this problem, we could introduce the as keyword instead:

begin with (createAndUseSomeManager() as myResource) {
  myResource.doSomething();
}

In this fashion, we could integrate context managers into with clauses.
We could offer this form as a secondary way of invoking a context
manager. I do not think it could be the only way of invoking a
context manager, because with clauses can only appear on constructs
that may introduce task intents.

6) Flipping the order of context manager and resource handle

An alternative syntax that flips the order of createAndUseSomeManager()
and myResource has been proposed:

with myResource in createAndUseSomeManager() do
  myResource.doSomething();

Here we list the resource handle before the manager expression that
is used to create it. Because the as keyword doesn't make as much
sense with this ordering, we can use the in keyword instead. The
in keyword also makes conceptual sense because it implies that we
are working with a resource within a context.

7) Support for nested context managers

As a last piece of functionality, we might want the syntax to support
nested context managers. Imagine a situation where you have two
managers: a timer and a file. With our original translation, this
could be achieved by nesting one statement inside another:

var t = new timer();

// Open a file, then time how long it takes to write some stuff.
with open('file.txt', 'w') as myFile do
  with t do
    myFile.write('foo');

writeln(t.elapsed());

Nesting context managers works, but the downside is that every
manager adds another level of indentation (potentially a lot,
depending on how many context managers we use).

As an extension to our syntax, we could support nested context
managers on a single line:

var t = new timer();

with open('file.txt', 'w') as myFile, t do
  myFile.write('foo');

writeln(t.elapsed());
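
For comparison, Python already accepts a comma-separated list of
managers and treats it as equivalent to nesting them left to right.
A small sketch, where tracked() is a hypothetical helper that only
records enter and leave events:

```python
from contextlib import contextmanager

events = []


@contextmanager
def tracked(name):
    """Hypothetical manager that records when it is entered and left."""
    events.append("enter " + name)
    try:
        yield name
    finally:
        events.append("leave " + name)


# Nested form: one level of indentation per manager.
with tracked("file") as myFile:
    with tracked("timer"):
        events.append("body")

nested = list(events)
events.clear()

# Single-statement form: same managers, one level of indentation.
with tracked("file") as myFile, tracked("timer"):
    events.append("body")

flat = list(events)
```

In both forms the managers are entered left to right and left in the
reverse order, so a Chapel extension along these lines would likely
want to make the same guarantee.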

8) Summary

Here are some questions for discussion:

  • What keyword do we want to use to signal the start of a context
    manager statement? (e.g. with or manage)
  • How do we want to handle multiple return intents for enterThis()?
    Are we OK with requiring the storage kind be expressed before the
    resource handle (e.g. ref myResource)?
  • Do we want to permit omitting the storage kind of the resource
    handle? We will have to decide what this means in the presence
    of multiple return intents for enterThis().
  • Do we want to explore the assignment-like syntax, or are we in
    agreement that this might be too confusing?
  • Do we want to explore offering a secondary form of syntax that
    lets context managers be used in with clauses?
  • Do we want to consider inverting the order of the manager
    expression and the resource handle?
    (e.g. myResource in createAndUseSomeManager())
  • Do we want to extend the syntax to support nested managers?