> On 25 Aug 2024, at 00:01, Ilija Tovilo <[email protected]> wrote:
>
> Hi Stephen
>
> On Sat, Aug 24, 2024 at 1:54 PM Stephen Reay <[email protected]> wrote:
>>
>> Thanks for clarifying. Out of curiosity, how much optimisation do you imagine would be
>> possible if the lookups were done the same was as classes (ie no fallback, names must be local,
>> qualified or imported with use
)?
>
> I haven't measured this case specifically, but if unqualified calls to
> local functions are indeed rare (which the last analysis seems to
> indicate), then it should make barely any difference. Of course, if
> your code makes lots of use of them, then the story might be
> different. That said, the penalty of an ambiguous internal call is
> much higher than that of a user, local call, given that internal calls
> sometimes have special optimizations or can even be entirely executed
> at compile time. For local calls, it will simply lead to a double
> lookup on first execution.
>
>> I am aware this is a BC break. But if it's kosher to discuss introducing a never
>> ending BC break I don't see why this isn't a valid discussion either. It would give
>> *everyone* that elusive 2-4% performance boost, would resolve any ambiguity about which function a
>> person intended to call (the claimed security issue) and would bring consistency with the way
>> classes/etc are referenced.
>
>> From my analysis, there were 2 967 unqualified calls to local
> functions in the top 1 000 repositories. (Disclaimer: There might be a
> "use function" at the top for some of these, the analysis isn't that
> sophisticated.)
>
> I also ran the script to check for unqualified calls to global
> functions (or at least functions that weren't statically visible in
> that scope in any of the repositories files), and there were ~139 000
> of them. It seems like this is quite a different beast. To summarize:
>
> 1. Flipping lookup order: ~a few dozens of changes
> 2. Global only: ~3 000 changes
> 3. Local only: ~139 000 changes
>
> While much of this can be automated, huge diffs still require
> reviewing time, and can lead to many merge conflicts which also take
> time to resolve. I would definitely prefer to go with 1. or
> potentially 2.
>
> Ilija
>
Hi Ilija,
I understand that a change like (3) is a huge BC break, and as I said earlier, I wasn't
actually suggesting that is the action to take, because I don't think there is sufficient
reason to take *any* action. But given that some people in this thread seem convinced that *a*
change to functionality is apparently required, I do think every potential change, and it's
pros and cons, should be discussed.
As I've said numerous times, and been either outright dismissed or ignored: there has been a
consistent push from a non-trivial number of internals members that userland developers should make
better use of regular functions, rather than using classes as fancy namespaces. There was a recent
RFC vote that implicitly endorsed this opinion.
Right now, the lookup rules make namespaced regular functions a consistent experience for
developers, but the lack of autoload makes it unpopular, and the lack of visibility for such symbols
can be problematic.
With the change you're proposing, there will be *another* hurdle that makes the use of regular
namespaced functions harder/less intuitive, or potentially (with option 1) unpredictable over PHP
versions, due to the constant threat of BC breaks due to new builtin functions - right when we have
not one but two RFCs for function autoloading (arguably the biggest barrier to their increased usage
in userland).
So the whole reason I asked about (3) is because it would
- (a) bring consistency with class/interface/trait symbols;
- (b) inherently bring the much desired 2% performance boost for function calls, because people
would be forced to qualify the names;
- (c) have zero risk of of future WTF BC break when a new global function interrupting local
function lookups;
- (d) have no need for a new "simpler" qualifying syntax (you can't get shorter than
1 character);
- (e) presumably simplify function autoloading, because there's no longer any
"fallback" step to worry about before triggering an autoloader;
- (e) even solve the "security" concerns John raised, because the developer would be
forced to qualify their usage if they wanted to use the builtin function - their intent is always
explicit, never guessed.
Yes, it is a huge BC break in terms of the amount of code that's affected. But it's almost
certainly one of the simplest BC break to "fix" in the history of PHP BC breaks.
How much code was affected when register globals was removed? Or when magic quotes was removed? Or
when short codes were removed?
Surely any change being proposed here would mean a deprecation notice in the next release after 8.4,
and then whatever actual change is proposed, in the next major version after that. So possibly 8.5
and then 9.0, but potentially 9.0 and then 10.0.
If either of (1) or (2) is chosen, and the "acceptability" of such a choice depends on
something less verbose than "namespace\" to qualify a local function, projects literally
can't future proof (or no-deprecation-notice-proof, if you prefer) their code against the
eventual flip until that change is implemented - in a scenario where a deprecation (and new local
qualifier) goes out in 2025 as part of 8.5, and a flip happens in 2026 as part of 9.0, that would
cuts the time projects have to effectively adapt, in half, and it means any code that's updated
for it, can't make use of the new "less verbose" local qualifier if they also need to
support versions prior to it being available.
If it happened to be 9.0 and 10.0 being the deprecation and "change" versions, obviously
people have longer to make the required change - but that argument cuts both ways. If you have 5
years to change every strlen
to \strlen
it's hardly going to cause a
huge and sudden swath of noise in revision history. I would imagine most projects would just adopt a
new code style, and prefixing with \
would occur automatically whenever a file is
otherwise modified by a developer.
There's also an impact on internals development/RFC with either (1) or (2): *any* proposed new
global function in the standard library now has a BC barrier to pass if it *might* conflict with one
defined by anyone in userland, in any namespace. JS is a living embodiment of of this problem: see
String#includes, Array#includes, and Array#flat - and that's with people doing the "wrong
thing" (extending builtin JS prototypes is arguably the same as using the \php
namespace)
Multiple people have lamented the way function fallbacks were originally implemented. If you're
going to insist on making a change, let's at least aim for a change that brings *MORE*
consistency, and fixes a previous mistake, rather than adding a brand new inconsistency and who
knows how many years of unexpected BC breaks for unsuspecting userland developers - who apparently
*already* stuggle to understand the way symbol lookup happens - into the future, and adding yet
*another* reason for people to not use namespaced functions.
Cheers
Stephen