On Wed, Mar 5, 2025, at 3:37 AM, Edmond Dantes wrote:
> Good day, Larry.
>
>> First off, as others have said, thank you for a thorough and detailed proposal.
> Thanks!
>
>> * A series of free-standing functions.
>> * That only work if the scheduler is active.
>> * The scheduler being active is a run-once global flag.
>> * So code that uses those functions is only useful based on a global state not present in
>> that function.
>> * And a host of other seemingly low-level objects that have a myriad of methods on them
>> that do, um, stuff.
>> * Oh, and a lot of static methods, too, instead of free-standing functions.
>
> Suppose these shortcomings don’t exist, and we have implemented the
> boldest scenario imaginable. We introduce Structured Concurrency,
> remove low-level elements, and possibly even get rid of Future. Of
> course, there are no functions like startScheduler or anything like
> that.
>
> 1. In this case, how should PHP handle Fiber and all the behavior
> associated with it? Should Fiber be declared deprecated and removed
> from the language? What should the flow be?
I'm not sure yet. I was quite hesitant about Fibers when they went in because they were so
low-level, but the authors were confident that it was enough for a user-space toolchain to be
iterated on quickly that everyone could use. That clearly didn't pan out as intended (Revolt
exists, but usage of it is still rare), so here we are with a half-finished API.
Thinking aloud, perhaps we could cause new Fiber to create an automatic async block?
Or we do deprecate it and discourage its use. Something to think through, certainly.
> 2. What should be done with I/O functions? Should they remain
> blocking, with a separate API provided as an extension?
The fact that IO functions become transparently async when appropriate is the best part of the
current RFC. Please keep that. :-)
> 3. Would it be possible to convince the maintainers of XDEBUG and
> other extensions to rewrite their code to support the new model? ( *If
> you're reading this question now, please share your opinion.* )
I cannot speak for Derick.
> 4. If transparent concurrency is introduced for I/O in point 2, what
> should be done with Revolt + AMPHP? This would break their code.
> Should an additional function or option be introduced to switch PHP
> into "legacy mode"?
Also an excellent question, to which I do not yet have an answer. (See previous point about Fibers
being half-complete.) I would want to involve Aaron, Christian, and Cees-Jan before trying to make
any suggestions here.
> Structured concurrency is a great thing. However, I’d like to avoid
> changing the language syntax and make something closer to Go’s
> semantics. I’ll think about it and add this idea to my TODO.
Well, as noted in the article, structured concurrency done right means *not* having unstructured
concurrency. Having Go-style async and then building a structured nursery system on top of it means
you cannot have any of the guarantees of the structured approach, because the other one is still
poking out the side and leaking. We're already stuck with mutable-by-default, global
variables, and other things that prevent us from making helpful assumptions. Please, let's try
to avoid that for async. We don't need more gotos.
>> async $context {
>> // $context is an object of AsyncContext, and can be passed around as such.
>> // It is the *only* way to spawn anything async, or interact with the async controls.
>> // If a function doesn't take an AsyncContext param, it cannot control async. This is good.
>
> This is a very elegant solution. Theoretically.
>
> However, in practice, if you require explicitly passing the context to
> all functions, it leads to the following consequences:
>
> 1. The semantics of all functions increase by one additional parameter
> (*Signature bloat*).
No, just those functions/objects that necessarily involve running async control commands. Most
wouldn't. They would just silently context switch when they hit an IO operation (which as
noted above is transparently supported, which is what makes this work) and otherwise behave the same.
But if something does actively need to do async stuff, it should have a context to work within.
It's the same discussion as:
A: "Pass/inject a DB connection to a class that needs it, don't just call a global db()
function."
B: "But then I have to pass it to all these places explicitly!"
A: "That's a sign your SQL is too scattered around the code base. Fix that first and your
problem goes away."
Explicit flow control is how you avoid bugs. It's also self-documenting, as it's patently
obvious what code expects to run in an async context and which doesn't care.
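
A minimal sketch of that distinction, assuming a hypothetical AsyncContext interface with a single run() method. Neither the interface nor the InlineContext stand-in comes from the RFC; the stand-in just calls the task inline so the example runs on stock PHP 8. The function names are made up for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical interface: nothing in the RFC or this thread defines it.
interface AsyncContext
{
    public function run(callable $task): mixed;
}

// Synchronous stand-in so the sketch runs on stock PHP 8. A real
// implementation would schedule the task instead of calling it inline.
final class InlineContext implements AsyncContext
{
    public function run(callable $task): mixed
    {
        return $task();
    }
}

// No AsyncContext parameter: this function cannot spawn or control tasks.
function countWords(string $text): int
{
    return str_word_count($text);
}

// Takes an AsyncContext: the signature itself says it starts subtasks.
function countAll(AsyncContext $ctx, array $texts): array
{
    $counts = [];
    foreach ($texts as $text) {
        $counts[] = $ctx->run(fn(): int => countWords($text));
    }
    return $counts;
}

$counts = countAll(new InlineContext(), ['one two', 'three four five']);
```

Only countAll() needed the extra parameter. countWords() is untouched, which is the point about signature bloat being limited to code that actually drives async.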
> 2. If an asynchronous call needs to be added to a function, and other
> functions depend on it, then the semantics of all dependent functions
> must be changed as well.
This is no different than DI of any other service. I have restructured code to handle temporary
contexts before. (My AttributeUtils and Serde libraries.) The result was... much better code than
I had before. I'm glad I made those refactors.
> In this example, there is another aspect: the fact that async execution
> is explicitly limited to a specific scope. This is essentially the same
> as startScheduler, and it is one of the options I was considering.
>
> Of course, startScheduler can be replaced with a construction like
> async(function() { ... }).
> This means that async execution is only active within the closure, and
> coroutines can only be created inside that closure.
>
> This is one of the semantic solutions that allows removing
> startScheduler, but at the implementation level, it is exactly the
> same.
>
> What do you think about this?
That looks mostly like the async block syntax I proposed, spelled differently. The main difference
is that the body of the wrapped function would need to explicitly use any variables
from scope that it wanted, rather than getting them implicitly. Whether that's good or bad is
probably subjective.
But it would allow for a syntax like this for the context, which is quite similar to how database
transactions are often done:
$val = async(function(AsyncContext $ctx) use ($stuff, $fn) {
    $result = [];
    foreach ($stuff as $item) {
        $result[] = $ctx->run($fn);
    }
    // We block/wait here until all subtasks are complete, then the async() call returns this value.
    return $result;
});
And of course in both cases you could use a pre-defined callable instead of inlining one. At this
point I think it's mostly a stylistic difference, function vs block.
>> I'm not convinced that sticking arbitrary key/value pairs into the Context object is
>> wise;
>
> Why not?
>
>> that's global state by another name
>
> Static variables inside a function are also global state. Are you
> against static variables?
Vocally, in fact. :-)
>> But if we must, the above would handle all the inheritance and override stuff quite
>> naturally. Possibly with:
>
> How will a context with open string keys help preserve service data
> that the service doesn't want to expose to anyone? The Key() solution
> is essentially the same as Symbol in JS, which is used for the same
> purpose. Of course, we could add a coroutine static $var construct to
> the language syntax. But it's all the same just syntactic sugar that
> would require more code to support.
I cannot speak to JS Symbols as I haven't used them. I am just vehemently opposed to globals,
no matter how many layers they're wrapped in. :-) Most uses could be replaced by proper DI or
partial application.
>> [$in, $out] = Channel::create($buffer_size);
>
> These semantics require the programmer to remember that two variables
> actually point to the same object. If a function has multiple channels,
> this makes the code quite verbose. Additionally, such channels are
> inconvenient to store in lists because their structure becomes more
> complex.
>
> I would suggest a slightly different solution:
>
> <code php>
> $in = new Channel()->getProducer();
> async myFunction($in->getConsumer());
> </code>
>
> These semantics do not restrict the programmer in usage patterns while
> still allowing interaction with the channel through a well-defined
> contract.
I'd go slightly differently if you wanted to go that route:
$ch = new Channel($buffer_size);
$in = $ch->producer();
$out = $ch->consumer();
// You do most interaction with $in and $out.
I could probably work with that as well.
(Or even just $ch->inPipe and $ch->outPipe, now that we have nice property support.)
But the overall point, I think, is avoiding implicit modal logic. If my code doesn't need to
care if it's in an async world, it doesn't care. If it does, then I need an explicit
async world to work within, rather than hoping that one implicitly exists. And I
shouldn't have to think about "who owns this end of this channel". I just have an in
and out hose I stick stuff into and pull out from, kthxbye.
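
To make the shape concrete, here is a rough sketch of the producer()/consumer() split. All three classes are hypothetical and synchronous: they throw on a full or empty buffer, where a real channel would presumably suspend the caller instead. The send() and receive() method names are my own placeholders.

```php
<?php
declare(strict_types=1);

// Hypothetical classes for illustration only. A real channel would suspend
// the caller when the buffer is full or empty instead of throwing.
final class Producer
{
    public function __construct(private SplQueue $queue, private int $capacity) {}

    public function send(mixed $value): void
    {
        if (count($this->queue) >= $this->capacity) {
            throw new OverflowException('Channel buffer is full');
        }
        $this->queue->enqueue($value);
    }
}

final class Consumer
{
    public function __construct(private SplQueue $queue) {}

    public function receive(): mixed
    {
        if ($this->queue->isEmpty()) {
            throw new UnderflowException('Channel buffer is empty');
        }
        return $this->queue->dequeue();
    }
}

final class Channel
{
    private SplQueue $queue;

    public function __construct(private int $bufferSize)
    {
        $this->queue = new SplQueue();
    }

    // Write-only view of the shared buffer.
    public function producer(): Producer
    {
        return new Producer($this->queue, $this->bufferSize);
    }

    // Read-only view of the same buffer.
    public function consumer(): Consumer
    {
        return new Consumer($this->queue);
    }
}

$ch = new Channel(2);
$in = $ch->producer();
$out = $ch->consumer();

$in->send('a');
$in->send('b');
$first = $out->receive();
$second = $out->receive();
```

Code that only writes gets a Producer and code that only reads gets a Consumer, so neither end has to know about the Channel object itself.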
> Thanks for the great examples, and a special thanks for the article.
> I also like the definition of context.
>
> Ed
--Larry Garfield