Re: Module or Class Visibility, Season 2

Date: Wed, 14 May 2025 21:27:32 +0000
Subject: Re: Module or Class Visibility, Season 2
Groups: php.internals


On Wed, May 14, 2025, at 16:57, Rowan Tommins [IMSoP] wrote:
> 
> 
> On 14 May 2025 14:24:57 BST, Michael Morris <[email protected]> wrote:
> >Well, it's what Go calls "modules". It's confusing because I was being
> >truthful, not snarky, when I said "Ask 10 programmers for the definition of
> >module and expect 12 answers."  I'm self trained, so I expect to get my
> >terms wrong from time to time. But I know enough to identify problems and
> >needs and I've tried to be clear on that.
> 
> I don't know much about Go, but at a glance it uses a similar model to JavaScript and
> Python where *classes don't have a universal name*, the names are always local. That's not
> a different kind of module, it's a fundamentally different *language design*.

Go has some weird scoping, for sure. Visibility is done by naming convention instead of keywords.
In other words, if you want to export a symbol, you capitalize it; otherwise, it is lower-cased and
thus private to its package. Each directory is a package (a Go module is a collection of packages),
and even in the same project, you cannot access a lower-cased symbol from another directory -- er,
package.

It is strange, and I don't think it translates to PHP. PHP generally prefers explicit syntax
over convention.

> 
> If you want to use two different versions of Guzzle in the same application, the first problem
> you need to solve has nothing to do with require, or autoloading, or Phar files. The first problem
> you need to solve is that you now have two classes called \GuzzleHttp\Client, and that breaks a
> bunch of really fundamental assumptions. 

As written, that simply isn't possible in PHP because only one class is allowed with a
given name; class names are global. I don't think this has to be the case, though.
Different languages take different approaches to this. For example, JavaScript (Node.js with npm,
at least) allows each module to "close over" its dependencies, so each module can import its own
version of them. Originally, there wasn't even any deduplication, so you'd have 500 copies of
left-pad or whatever. Then there is Go, which doesn't allow you to have multiple versions of the
same module in one build (a new major version from v2 on counts as a different module, since the
major version is part of the module path). You get exactly one version, which is similar to how
PHP currently works with Composer by default. In PHP, however, with some massaging, you can
"prefix" your imports so you get only your own version. I believe many WordPress plugins do this,
so each plugin can use its own version of things.
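
To make the prefixing idea concrete, here is a minimal sketch of what vendored code looks like after a prefixing tool has rewritten its namespaces. The `PluginOne\Vendor` and `PluginTwo\Vendor` prefixes and the stub `Client` classes are made up for this example; they are not real Guzzle code, and a real tool would rewrite whole dependency trees rather than two classes.

```php
<?php
// Two plugins each ship their own prefixed copy of the same library.
// The prefixes and the stub classes below are hypothetical.

namespace PluginOne\Vendor\GuzzleHttp {
    class Client
    {
        public function version(): string
        {
            return '5';
        }
    }
}

namespace PluginTwo\Vendor\GuzzleHttp {
    class Client
    {
        public function version(): string
        {
            return '4';
        }
    }
}

namespace {
    $client1 = new \PluginOne\Vendor\GuzzleHttp\Client();
    $client2 = new \PluginTwo\Vendor\GuzzleHttp\Client();

    // The fully qualified names differ, so both classes can be loaded at once.
    echo get_class($client1), "\n"; // PluginOne\Vendor\GuzzleHttp\Client
    echo get_class($client2), "\n"; // PluginTwo\Vendor\GuzzleHttp\Client

    // As far as the engine is concerned, these are unrelated types.
    var_dump($client1 instanceof \PluginTwo\Vendor\GuzzleHttp\Client); // bool(false)
}
```

The cost of this approach is visible in the last line: an object created by one plugin's copy will not pass a type check against the other plugin's copy, even though the source code of the two classes may be nearly identical.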

I'm fairly certain we can do a similar thing so that each module gets its own unique
'namespace' in the class table such that two modules can define the same classes. So
ModuleA and ModuleB can have Foo\Bar without conflicting with one another. From the user's
perspective, we can probably hide that technical detail from them but allow aliasing:

// import ModuleA's namespace into our current namespace for this file
use module ModuleA;
// import ModuleB's namespace into our current namespace for this file, but with a prefix
use module ModuleB as Baz;

Foo\Bar; // ModuleA\Foo\Bar
Baz\Foo\Bar; // ModuleB\Foo\Bar

I'm just spitballing syntax here, and I'm not suggesting it actually work like this, but I
just want to illustrate that I think there are reasonable ways to allow modules to have conflicting
names.
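
As a rough userland model of that lookup (not engine code, and not a proposed API), the sketch below resolves a name written in a file against the modules that file has imported. The `ModuleScope` class, its methods, and the bracketed output format are all assumptions made for illustration; it needs PHP 8.0 or later.

```php
<?php
// Userland model of a hypothetical per-module class table.
// Nothing here is an existing or proposed PHP API.

final class ModuleScope
{
    /** @var list<array{string, string}> pairs of [alias prefix ('' for none), module name] */
    private array $imports = [];

    /** @param array<string, list<string>> $modules module name => classes it defines */
    public function __construct(private array $modules)
    {
    }

    // Models `use module X;` and `use module X as Alias;`.
    public function useModule(string $module, string $alias = ''): void
    {
        $this->imports[] = [$alias, $module];
    }

    // Maps a name as written in the current file to a module-qualified name.
    public function resolve(string $name): ?string
    {
        foreach ($this->imports as [$alias, $module]) {
            if ($alias === '') {
                $local = $name;
            } elseif (str_starts_with($name, $alias . '\\')) {
                $local = substr($name, strlen($alias) + 1);
            } else {
                continue;
            }
            if (in_array($local, $this->modules[$module] ?? [], true)) {
                return '[' . $module . ']\\' . $local;
            }
        }
        return null;
    }
}

$scope = new ModuleScope([
    'ModuleA' => ['Foo\\Bar'],
    'ModuleB' => ['Foo\\Bar'],
]);
$scope->useModule('ModuleA');
$scope->useModule('ModuleB', 'Baz');

echo $scope->resolve('Foo\\Bar'), "\n";      // [ModuleA]\Foo\Bar
echo $scope->resolve('Baz\\Foo\\Bar'), "\n"; // [ModuleB]\Foo\Bar
```

Imports are checked in declaration order here, so an unprefixed import wins over a later one that defines the same name. A real design would have to decide whether that case is an error instead.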

> For example: 
> - plugin1 uses Guzzle v5, runs "$client1 = new \GuzzleHttp\Client", and returns it to
> the main application
> - The main application passes $client1 to plugin2
> - plugin2 uses Guzzle v4
> - plugin2 runs "$client2 = new \GuzzleHttp\Client"
> 
> $client1 and $client2 are instances of different classes, with the same name! How does
> "instanceof" behave? What about "get_class"? What if you serialize and
> unserialize?

I'm of the opinion that the "names" of the module classes should be distinct, so that humans
(and deserializers) know the class comes from a module. Something like [ModuleA]\Foo\Bar.

> 
> I think if you changed the language enough that those questions didn't matter, it would be
> a language fork on the scale of Python 2 to 3, or even Perl 5 to Raku (originally called "Perl
> 6"). Every single application and library would have to be rewritten to use the new concept of
> what a class is. And most of them would get absolutely no benefit, because they *want* to reference
> the same version of a class everywhere in the application.

I suspect the hard part will be defining the module in the first place, i.e., the
"package.json" or "go.mod" or whatever it gets called. As Composer isn't a
part of the PHP project, I don't want to take it for granted, but I also don't want to
rely on it. That means each module may have to define its own "loader" or somehow define
which PHP files make up the module. As I mentioned earlier, PHP doesn't usually operate by
convention, though the community tends to impose conventions anyway (PSR-4 autoloading comes to
mind immediately); so we'd need something that is explicit but automatable so the community can
implement conventions on top of it.
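
One way to picture "explicit but automatable" is an explicit class-to-file map that a tool could generate from whatever convention a project follows. The sketch below is only that: the manifest array shape, the `module-a` directory, and the `moduleFileFor()` and `registerModule()` helpers are hypothetical names invented for this example, built on the existing `spl_autoload_register()` function.

```php
<?php
// Hypothetical module definition: an explicit class => file map.
// The array shape and the function names are assumptions, not a real format.

/**
 * Returns the file that defines $class according to the manifest, or null.
 */
function moduleFileFor(array $manifest, string $class): ?string
{
    $relative = $manifest['classes'][$class] ?? null;
    return $relative === null ? null : $manifest['root'] . '/' . $relative;
}

/**
 * Registers an autoloader that only knows about files listed in the manifest.
 */
function registerModule(array $manifest): void
{
    spl_autoload_register(static function (string $class) use ($manifest): void {
        $file = moduleFileFor($manifest, $class);
        if ($file !== null && is_file($file)) {
            require_once $file;
        }
    });
}

// The paths below are placeholders; no such files need to exist for the lookup to work.
$manifest = [
    'name'    => 'ModuleA',
    'root'    => __DIR__ . '/module-a',
    'classes' => [
        'Foo\\Bar' => 'src/Bar.php',
        'Foo\\Baz' => 'src/Baz.php',
    ],
];

registerModule($manifest);

echo moduleFileFor($manifest, 'Foo\\Bar'), "\n";    // ends with /module-a/src/Bar.php
var_dump(moduleFileFor($manifest, 'Foo\\Missing')); // NULL
```

The module itself never relies on a naming convention, yet a generator that does follow one (PSR-4, for instance) could write the `classes` map, which is similar in spirit to a generated class map.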

That's going to be the hard part.

> 
> That's why I think "containers" are the more useful comparison - you need some
> way to put not just plugin1 itself, but all the third-party code it calls, into some kind of
> sandbox, as though it was running in a separate process. If you can control what classes can go into
> and out of that sandbox, then in any piece of code, you don't end up with conflicting meanings
> for the same name - just as a Linux container can't open a network port directly on the host.

Exactly.

> 
> Regards,
> Rowan Tommins
> [IMSoP]
> 

— Rob

