Re: [Concept] Flip relative function lookup order (global, then local)

From: Date: Sat, 24 Aug 2024 09:09:27 +0000
Subject: Re: [Concept] Flip relative function lookup order (global, then local)
References: 1 2 3 4 5 6 7 8 9 10 11  Groups: php.internals 
Request: Send a blank email to [email protected] to get a copy of this message


On Sat, Aug 24, 2024, at 11:00, Rob Landers wrote:
> On Fri, Aug 23, 2024, at 23:57, Ilija Tovilo wrote:
>> On Fri, Aug 23, 2024 at 9:41 PM Rowan Tommins [IMSoP]
>> <[email protected]> wrote:
>> >
>> > On 23 August 2024 18:32:41 BST, Ilija Tovilo <[email protected]> wrote:
>> > >IMO, 1. is too drastic. As people have mentioned, there are tools to
>> > >automate disambiguation. But unless we gain some other benefit from
>> > >dropping the lookup entirely, why do it?
>> >
>> > I can think of a few disadvantages of "global first":
>> >
>> > - Fewer code bases will be affected, but working out which ones is harder. The easiest
>> > migration will probably be to make sure all calls to namespaced functions are fully qualified, as
>> > though it was "global only".
>> 
>> To talk about more concrete numbers, I now also analyzed how many
>> relative calls to local functions there are in the top 1000 composer
>> packages.
>> 
>> https://gist.github.com/iluuu1994/9d4bbbcd5f378d221851efa4e82b1f63
>> 
>> There were 4229 calls to local functions that were statically visible.
>> Of those, 1534 came from thecodingmachine/safe, which I'm excluding
>> again for a fair comparison. The remaining 2695 calls were split
>> across 210 files and 27 repositories, which is less than I expected.
>> 
>> The calls that need to be fixed by swapping the lookup order are a
>> subset of these calls, namely only the ones also clashing with some
>> global function. Hence, the process of identifying them doesn't seem
>> fundamentally different. Whether the above are "few enough" to justify
>> the BC break, I don't know.
>> 
>> > - The engine won't be able to optimise calls where the name exists locally but
>> > not globally, because a userland global function could be defined at any time.
>> 
>> When relying on the lookup, the lookup will be slower. But if the
>> hypothesis is that there are few people relying on this in the first
>> place, it shouldn't be an issue. It's also worth noting that many of
>> the optimizations don't apply anyway, because the global function is
>> also unknown and hence a user function, with an unknown signature.
>> 
>> > - Unlike with the current way around, there's unlikely to be a use case for
>> > shadowing a namespaced name with a global one; it will just be a gotcha that trips people up
>> > occasionally.
>> 
>> Indeed. But this is a downside of both these approaches.
>> 
>> > None of these seem like showstoppers to me, but since we can so easily go one step
>> > further to "global only", and avoid them, why wouldn't we?
>> >
>> > Your answer to that seems to be that you think "global only" is a bigger BC
>> > break, but I wonder how much difference it really makes. As in, how many codebases are using
>> > unqualified calls to reference a namespaced function, but *not* shadowing a global name?
>> 
>> I hope this provides some additional insight. Looking at the analysis,
>> I'm not completely opposed to your approach. There are some open
>> questions. For example, how do we handle functions declared and called
>> in the same file?
>> 
>> namespace Foo;
>> function bar() {}
>> bar();
>> 
>> Without a local fallback, it seems odd for this call to fail. An
>> option might be to auto-use Foo\bar when it is declared, although that
>> will require a separate pass over the top functions so that functions
>> don't become order-dependent.
>> 
>> Ilija
>> 
> 
> Hey Ilija,
> 
> I'm actually coming around to global first, then local second. I haven't gotten
> statistically significant results yet though, but preliminary results show that global first gives
> symfony/laravel their speed boost and function autoloading gives things like wordpress their speed
> boost. Everyone wins.
> 
> For function autoloading, it is only called on the local check. So, it looks kinda like this:
> 
>  1. does it exist in global namespace?
>    1. yes: load the function; done.
>    2. no: continue
>  2. does it exist in local namespace?
>    1. yes: load the function; done.
>    2. no: continue
>  3. call the autoloader for local namespace.
>  4. does it exist in local namespace?
>    1. yes: load the function; done.
>    2. no: continue
>  5. does it exist in the global namespace?
>    1. yes: load the function; done.
>    2. no: continue
> 
> It checks the scopes in reverse order after autoloading because it is more likely that the
> autoloader loaded a local scope function than a global one. This adds a small inconsistency (if the
> autoloader were to load both a global and non-global function of the same name), but keeps
> autoloading fast for unqualified function calls. By checking global first, for OOP-centric codebases
> like Symfony and Laravel that call unqualified global functions, they never hit the autoloader. For
> things that do call qualified local-namespace functions, they hit the autoloader and immediately
> start loading them. The worst performance then becomes autoloading global functions that are called
> unqualified. Not only do you have to strip out the current namespace in the autoloader, but you have
> to deal with being the absolute last check in the function table. However, (and I'm still
> trying to figure out how to quantify this), I'm reasonably certain projects do not use global
> functions that often.
> 
> — Rob

Amendment:

Actually, I may skip allowing the second check in the global space for autoloaders. In other words,
if you want to autoload a global function, you need to call it fully qualified. It's not 100%
ideal, but better than pinning and better performance for everyone.

— Rob


Thread (112 messages)

« previous php.internals (#125177) next »