Merging a few replies together here, since they overlap. Also reordering a few of Tim's
comments...
On Fri, Feb 7, 2025, at 7:32 AM, Tim Düsterhus wrote:
> Hi
>
> Am 2025-02-07 05:57, schrieb Larry Garfield:
>> It is now back with a better implementation (many thanks to Ilija for
>> his help and guidance in that), and it's nowhere close to freeze, so
>> here we go again:
>>
>> https://wiki.php.net/rfc/pipe-operator-v3
>
> There's some editorial issues:
>
> 1. Status: Draft needs to be updated.
> 2. The RFC needs to be added to the overview page.
> 3. List formatting issues in “Future Scope” and “Patches and Tests”.
>
> Would also help having a closed voting widget in the “Proposed Voting
> Choices” section to be crystal clear on what is being voted on (see
> below the next quote).
I split pipes off from the Composition RFC late last night right before posting; I guess I missed a
few things while doing so. :-/ Most notably, the Compose section is now removed from pipes, as it
is not in scope for this RFC. (As noted, it's going to be more work so has its own RFC.)
Sorry for the confusion. I think it should all be handled now.
> 5. The “References” (as in reference variables) section would do well
> with an example of what doesn't work.
Example block added.
> 9. In the “Why in the engine?” section: The RFC makes a claim about
> performance.
>
> Do you have any numbers?
Not currently. The statements here are based on simply counting the number of function calls
necessary, and PHP function calls are sadly non-cheap. In previous benchmarks of my own libraries
using my Crell/fp library, I did find that the number of function calls involved in some tight pipe
operations was both a performance and debugging concern, but I don't have any hard numbers
lying around at present to share.
If you think that's critical, please advise on how to best get meaningful numbers here.
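As a starting point for discussion, here is a rough micro-benchmark sketch that isolates the cost of one extra closure invocation per step. It does not use the pipe operator, so it runs on a stock PHP 8.1+. The helper timeIt() and the variable names are hypothetical, invented for this example. Results will vary with opcache and JIT settings, and micro-benchmarks like this are noisy, so treat it as a way to frame the question rather than as evidence.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: call $fn $iterations times and return elapsed nanoseconds.
function timeIt(callable $fn, int $iterations): int|float
{
    $start = hrtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return hrtime(true) - $start;
}

$input = range(1, 100);
$double = fn(int $x): int => $x * 2;

// Direct form: array_map() is called straight away.
$direct = fn(): array => array_map($double, $input);

// Wrapped form: the same work, plus one extra closure call,
// which is what a closure-wrapped pipe step would add.
$step = fn(array $x): array => array_map($double, $x);
$wrapped = fn(): array => $step($input);

$iterations = 100_000;
printf("direct:  %d ns\n", timeIt($direct, $iterations));
printf("wrapped: %d ns\n", timeIt($wrapped, $iterations));
```

Both forms produce the same array; only the number of function calls differs.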
Regarding the equivalency of pipes:
Tim Düsterhus wrote:
> 4. “That is, the following two code fragments are also exactly
> equivalent:”.
>
> I do not believe this is true (specifically referring to the “exactly”
> word in there), since the second code fragment does not have the short
> closures, which likely results in an observable behavioral difference
> when throwing Exceptions (in the stack trace) and also for debuggers. Or
> is the implementation able to elide the extra closure? (Of course
> there's also the difference between the temporary variable existing,
> which would be observable for get_defined_vars() and possibly
> destructors / object lifetimes).
Thomas Hruska wrote:
> The repeated assignment to $temp in your second example is _not_
> actually equal to the earlier example as you claim. The second example
> with all of the $temp variables should, IMO, just be:
>
> $temp = "Hello World";
> $result = array_filter(array_map('strtoupper',
> str_split(htmlentities($temp))), fn($v) { return $v != 'O'; });
Juris Evertovskis wrote:
> 3. Does the implementation actually turn 1 |> f(...) |> g(...) into
> $π = f(1); g($π)? Is g(f(1)) not performanter? Or is the engine
> clever enough with the var reuse anyways?
There's some subtlety here on these points. The v2 RFC used the lexer to mutate $a |> $b
|> $c into the same AST as $c($b($a)), which would then compile as though that had been written
in the first place. However, that made addressing references much harder, and there's an
important caveat around order of operations. (See below.) The v3 RFC instead uses a compile
function to take the AST of $a |> $b |> $c and produce opcodes that are effectively equivalent
to $t = $b($a); $t = $c($t). I have not checked whether they are precisely the same opcodes, but
the net effect is the same. So "effectively equivalent" may be a more accurate
statement.
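To make the two shapes concrete, here is a small sketch in plain PHP (8.1+, no pipe operator needed). It only illustrates the source-level equivalents of the two strategies; it is not a claim about the opcodes the patch actually emits, and the variable names are just for illustration.

```php
<?php
declare(strict_types=1);

$a = 'hello world';
$b = strtoupper(...);
$c = str_split(...);

// v2-style rewrite: $a |> $b |> $c is treated as the nested call $c($b($a)).
$nested = $c($b($a));

// v3-style result: each step's output is fed to the next through a temporary.
$t = $b($a);
$t = $c($t);

var_dump($nested === $t); // bool(true)
```

With plain callables like these, the two forms give identical results.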
In particular, Tim is correct that, technically, the short lambdas would be used as-is, so
you'd end up with the equivalent of:
$temp = (fn($x) => array_map(strtoupper(...), $x))($temp);
I'm not sure if there's a good way to automatically unwrap the closure there. (If someone
knows of one, please share; I'm fine with including it.) However, the intent is that it would
be largely unnecessary in the future with a revised PFA implementation, which would obviate the need
for the explicit wrapping closure. You would instead write
$a |> array_map(strtoupper(...), ?);
Alternatively, one can use higher order user-space functions already. In trivial cases:
function amap(Closure $fn): Closure {
    return fn(array $x) => array_map($fn, $x);
}
$a |> amap(strtoupper(...));
I am already using this approach in Crell/fp and several libraries that leverage it, and it's
quite ergonomic.
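For anyone who wants to try the pattern today, here is a self-contained sketch that runs on PHP 8.1+ without the pipe operator. The afilter() and pipe() functions below are hypothetical stand-ins written for this example; they are not copied from Crell/fp and may not match that library's signatures.

```php
<?php
declare(strict_types=1);

function amap(Closure $fn): Closure
{
    return fn(array $x): array => array_map($fn, $x);
}

// Hypothetical companion to amap(): filters, then reindexes the array.
function afilter(Closure $fn): Closure
{
    return fn(array $x): array => array_values(array_filter($x, $fn));
}

// Hypothetical user-space stand-in for the pipe operator:
// feeds $value through each step in order, left to right.
function pipe(mixed $value, Closure ...$steps): mixed
{
    foreach ($steps as $step) {
        $value = $step($value);
    }
    return $value;
}

$result = pipe(
    'Hello World',
    str_split(...),
    amap(strtoupper(...)),
    afilter(fn(string $v): bool => $v !== 'O'),
);

echo implode('', $result), PHP_EOL; // HELL WRLD
```

Each step is a single-argument Closure, which is exactly the shape the right-hand side of |> expects.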
There's a whole bunch of such simple higher order functions here:
https://github.com/Crell/fp/blob/master/src/array.php
https://github.com/Crell/fp/blob/master/src/string.php
That leads to the subtle difference between the v3 and v2 implementations, and why Thomas'
statement is incorrect. If the expression on the right side that produces a Closure has side
effects (output, DB interaction, etc.), then the order in which those side effects happen may
differ between the nested form and the temp-var form. With all pure functions, that won't make a
practical difference, and normally one should be using pure functions, but that's not something
PHP can enforce.
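Here is a sketch of that ordering difference in plain PHP (8.1+, no pipe operator). The step() factory is hypothetical; it logs when a closure is created and when it runs, standing in for any right-hand expression with side effects.

```php
<?php
declare(strict_types=1);

$log = [];

// Hypothetical factory: records when the closure is made and when it runs.
function step(string $name, array &$log): Closure
{
    $log[] = "make $name";
    return function (int $x) use ($name, &$log): int {
        $log[] = "run $name";
        return $x + 1;
    };
}

// Nested form, as the v2 rewrite would have produced.
$nested = step('C', $log)(step('B', $log)(1));
$nestedLog = $log;

$log = [];

// Temp-var form, as described for v3.
$t = step('B', $log)(1);
$t = step('C', $log)($t);
$sequentialLog = $log;

print_r($nestedLog);     // make C, make B, run B, run C
print_r($sequentialLog); // make B, run B, make C, run C
```

The final value is the same in both cases; only the order of the "make" side effects changes, because in the nested form the outer callee expression is evaluated before its arguments.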
I don't think there would be an appreciable performance difference between the two compiled
versions, either way, but using the temp-var approach makes dealing with references easier, so
it's what we're doing.
Juris Evertovskis wrote:
> 1. Do you think it would be hard to add some shorthand for `|>
> $condition ? $callable : fn($😐) => $😐`?
I'm not sure I follow here. Assuming you're talking about "branch in the next
step", the standard way of doing that is with a higher order user-space function. Something
like:
function cond(bool $cond, Closure $t, Closure $f): Closure {
    return $cond ? $t : $f;
}
$a |> cond($config > 10, bigval(...), smallval(...)) |> otherstuff(...);
I think it's premature to try and bake that logic into the language, especially when I
don't know of any other function-composition-having language that does so at the language level
rather than the standard library level. (There are a number of fun operations people build into
pipelines, but they are all generally done in user space.)
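For completeness, here is a runnable sketch of the cond() approach on PHP 8.1+, written with a temporary variable in place of the pipe operator. The bodies of bigval(), smallval(), and otherstuff() are hypothetical placeholders, as are the values of $config and $a.

```php
<?php
declare(strict_types=1);

function cond(bool $cond, Closure $t, Closure $f): Closure
{
    return $cond ? $t : $f;
}

// Hypothetical placeholder steps, just to make the example executable.
function bigval(int $x): int { return $x * 100; }
function smallval(int $x): int { return $x + 1; }
function otherstuff(int $x): string { return "value: $x"; }

$config = 42;
$a = 5;

// Stands in for:
// $a |> cond($config > 10, bigval(...), smallval(...)) |> otherstuff(...);
$t = cond($config > 10, bigval(...), smallval(...))($a);
$t = otherstuff($t);
echo $t, PHP_EOL; // value: 500

// The "apply or pass through" case from the question: use an identity closure.
$maybe = cond(false, bigval(...), fn(int $x): int => $x)($a);
echo $maybe, PHP_EOL; // 5
```

The branch is chosen when cond() is called, so the pipeline itself stays a plain chain of single-argument callables.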
--Larry Garfield