Re: [Concept] Flip relative function lookup order (global, then local)

From: Date: Sat, 24 Aug 2024 09:00:13 +0000
Subject: Re: [Concept] Flip relative function lookup order (global, then local)
References: 1 2 3 4 5 6 7 8 9 10  Groups: php.internals 
Request: Send a blank email to [email protected] to get a copy of this message
On Fri, Aug 23, 2024, at 23:57, Ilija Tovilo wrote:
> On Fri, Aug 23, 2024 at 9:41 PM Rowan Tommins [IMSoP]
> <[email protected]> wrote:
> >
> > On 23 August 2024 18:32:41 BST, Ilija Tovilo <[email protected]> wrote:
> > >IMO, 1. is too drastic. As people have mentioned, there are tools to
> > >automate disambiguation. But unless we gain some other benefit from
> > >dropping the lookup entirely, why do it?
> >
> > I can think of a few disadvantages of "global first":
> >
> > - Fewer code bases will be affected, but working out which ones is harder. The easiest
> > migration will probably be to make sure all calls to namespaced functions are fully qualified, as
> > though it was "global only".
> 
> To talk about more concrete numbers, I now also analyzed how many
> relative calls to local functions there are in the top 1000 composer
> packages.
> 
> https://gist.github.com/iluuu1994/9d4bbbcd5f378d221851efa4e82b1f63
> 
> There were 4229 calls to local functions that were statically visible.
> Of those, 1534 came from thecodingmachine/safe, which I'm excluding
> again for a fair comparison. The remaining 2695 calls were split
> across 210 files and 27 repositories, which is less than I expected.
> 
> The calls that need to be fixed by swapping the lookup order are a
> subset of these calls, namely only the ones also clashing with some
> global function. Hence, the process of identifying them doesn't seem
> fundamentally different. Whether the above are "few enough" to justify
> the BC break, I don't know.
> 
> > - The engine won't be able to optimise calls where the name exists locally but not
> > globally, because a userland global function could be defined at any time.
> 
> When relying on the lookup, the lookup will be slower. But if the
> hypothesis is that there are few people relying on this in the first
> place, it shouldn't be an issue. It's also worth noting that many of
> the optimizations don't apply anyway, because the global function is
> also unknown and hence a user function, with an unknown signature.
> 
> > - Unlike with the current way around, there's unlikely to be a use case for shadowing
> > a namespaced name with a global one; it will just be a gotcha that trips people up occasionally.
> 
> Indeed. But this is a downside of both these approaches.
> 
> > None of these seem like showstoppers to me, but since we can so easily go one step further
> > to "global only", and avoid them, why wouldn't we?
> >
> > Your answer to that seems to be that you think "global only" is a bigger BC
> > break, but I wonder how much difference it really makes. As in, how many codebases are using
> > unqualified calls to reference a namespaced function, but *not* shadowing a global name?
> 
> I hope this provides some additional insight. Looking at the analysis,
> I'm not completely opposed to your approach. There are some open
> questions. For example, how do we handle functions declared and called
> in the same file?
> 
> namespace Foo;
> function bar() {}
> bar();
> 
> Without a local fallback, it seems odd for this call to fail. An
> option might be to auto-use Foo\bar when it is declared, although that
> will require a separate pass over the top functions so that functions
> don't become order-dependent.
> 
> Ilija
> 

Hey Ilija,

I'm actually coming around to global first, then local second. I haven't gotten
statistically significant results yet though, but preliminary results show that global first gives
symfony/laravel their speed boost and function autoloading gives things like wordpress their speed
boost. Everyone wins.

For function autoloading, it is only called on the local check. So, it looks kinda like this:

 1. does it exist in global namespace?
   1. yes: load the function; done.
   2. no: continue
 2. does it exist in local namespace?
   1. yes: load the function; done.
   2. no: continue
 3. call the autoloader for local namespace.
 4. does it exist in local namespace?
   1. yes: load the function; done.
   2. no: continue
 5. does it exist in the global namespace?
   1. yes: load the function; done.
   2. no: continue

It checks the scopes in reverse order after autoloading because it is more likely that the
autoloader loaded a local scope function than a global one. This adds a small inconsistency (if the
autoloader were to load both a global and non-global function of the same name), but keeps
autoloading fast for unqualified function calls. By checking global first, for OOP-centric codebases
like Symfony and Laravel that call unqualified global functions, they never hit the autoloader. For
things that do call qualified local-namespace functions, they hit the autoloader and immediately
start loading them. The worst performance then becomes autoloading global functions that are called
unqualified. Not only do you have to strip out the current namespace in the autoloader, but you have
to deal with being the absolute last check in the function table. However, (and I'm still
trying to figure out how to quantify this), I'm reasonably certain projects do not use global
functions that often.

— Rob


Thread (112 messages)

« previous php.internals (#125176) next »