Re: [Initial Feedback] PHP User Modules - An Adaptation of ES6 from JavaScript

Date: Fri, 28 Jun 2024 07:07:55 +0000
Subject: Re: [Initial Feedback] PHP User Modules - An Adaptation of ES6 from JavaScript
Groups: php.internals
This is a very long reply to several emails.

On Thu, Jun 27, 2024 at 5:45 PM Jim Winstead <[email protected]> wrote:

> The angle I am coming at this from is improving the developer experience
> around "packages" or "modules" or whatever you want to call them, and so
> much of this proposal doesn't seem to be about that.
>
>
Ok, first problem - not a proposal really, but a ramble trying to get to a
proposal. Before I made the first post the idea was knocking around in my
head and wouldn't go away, so I just stream of consciousness listed what's
going through my head. That leads to the second point you made.


> I could have made that point in other ways, and I'm sorry that my first
> attempt came off as insulting. It really concerned me when I already saw
> discussion about taking this off-list and going into the weeds on technical
> details when the problem that is being addressed by this proposal is
> extremely unclear to me.
>

It is unclear even to me. Perhaps I shouldn't have posted something this
half-baked. That said, pruning off large sections of language
functionality is a distraction. For now let's just note that improving the
language this way is a possibility, afforded by the fact that import would
be a new way of bringing scripts in.  Could isn't should.  Let's focus on
how code is imported.

First though, a history review, partially to get this straight in my own
head but hopefully of use for those following along. Why? Knowing how we
got where we are is important, to some degree, for charting a way forward.

PHP started as a template engine. By modern standards, and compared to the
likes of Twig, it's a very bad template engine, but that doesn't really
matter because it's evolved into a programming language in its own right
over the last nearly 20 years.

Include, include_once, require, and require_once have been around since the
beginning as the way to splice code files together. The behavior of these
statements calls back to PHP's origin as a template engine: they do
things that similar mechanisms, such as JavaScript's import (or, for that
matter, the equivalents in C# and Java), do not. Their scope behavior is
very different from import mechanisms in other languages, as they see the
variables in the scope of the function they were invoked from, or the global
scope when called from there.  Their execution can be cut short with a
return.  They can return a value, which is quite unusual to be honest. None
of this is bad per se, but it is different, and the question arises whether
it is necessary.
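
A minimal sketch of the three behaviors just described (caller scope, early
return, and a return value). The temp file, the demo() function, and the
variable names are all made up for illustration:

```php
<?php
// Write a tiny script to a temp file so the example is self-contained.
$path = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($path, <<<'CODE'
<?php
// $greeting is not defined in this file; it comes from the caller's scope.
$seen = $greeting . ', world';
return strtoupper($seen);   // include hands this value back to the caller
echo 'never reached';       // execution stops at the return above
CODE);

function demo(string $path): array
{
    $greeting = 'hello';
    $result = include $path;   // the file runs inside demo()'s local scope
    return [$result, $seen];   // $seen was created by the included file
}

[$result, $seen] = demo($path);
unlink($path);
```

Here $result holds the returned string and $seen leaks out of the included
file into demo(), which is exactly the kind of coupling an import statement
in other languages does not allow.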

One artifact of their behavior that is bad in my opinion is that they start
from the standpoint of being text or HTML files.  If the included file has
no PHP tags then the contents get echoed out. If there are no output
buffers running this can cause headers to be sent and fun errors to be had.
So they can't be used to create files that can only echo explicitly (that
is, through a call to the echo statement or the like).
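
To illustrate, here is a small sketch of what happens when a file without
PHP tags is included. The file contents and the secret() name are
hypothetical; the output buffer is only there to capture what would
otherwise be sent to the client:

```php
<?php
// A "code" file that forgot its opening PHP tag.
$path = tempnam(sys_get_temp_dir(), 'tpl');
file_put_contents($path, "function secret() {}\n");

ob_start();
include $path;              // treated as inline text, not as code
$output = ob_get_clean();   // capture what include echoed
unlink($path);

$defined = function_exists('secret');
```

The source text ends up in $output and secret() is never defined. Without
the buffer, that text would have gone straight to the response.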

Fast forward a bit to PHP 5.3, where namespaces were introduced to deal
with the overloaded symbol tables. They are a bit of a hotwire as (if I'm
not mistaken, it's been a couple of years since I read the discussion on it)
they just quietly prepend the namespace string to the name of every new
symbol declared in the namespace for use elsewhere. As a result, PHP
namespaces don't do some of the things we see in the namespaces of other
languages (looking at Java and C# here). For example, privacy modifiers
within a namespace aren't a thing.
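
A rough sketch of the "prepended string" point, using a made-up Acme\Util
namespace. The declared names behave like ordinary strings with the
namespace glued on the front:

```php
<?php
namespace Acme\Util;

function helper(): string { return 'ok'; }
class Widget {}

// The full names are plain strings: namespace prefix plus the short name.
$functionName = __NAMESPACE__ . '\\helper';       // "Acme\Util\helper"
$exists       = \function_exists($functionName);  // looked up by that string
$className    = Widget::class;                     // "Acme\Util\Widget"
$viaString    = $functionName();                   // call through the string
```

Nothing here stops code in any other namespace from calling
Acme\Util\helper(); there is no namespace-private visibility to declare.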

Very quickly after PHP 5.3 was released autoloaders showed up. At some point
support for multiple autoloaders was added.  Several schemes were tried,
PSR-4 won out, and Composer showed up to leverage this. Composer is based
on npm, even to the point where JSON is used to configure it, and the
composer.json file is fairly close to npm's package.json file even now.
It's a userland solution, but to my knowledge WordPress is the only widely
used PHP application out there that doesn't use it directly (there is a
Composer WordPress project).

Before Composer, and before namespaces, there was PECL.  Composer has
eclipsed it because PECL has the limitation of being server-wide. It never
really caught on in the age of virtual hosting with multiple PHP sites
running on one box. Today we have Docker, but that didn't help PECL make a
comeback because by the time Docker deployment of PHP sites became the norm
Composer had won out.  Also, Composer library publishing is more permissive
than PECL. I'll stop here lest this digress into a Composer v PECL
discussion - suffice to say stabs at bringing code packages into PHP aren't
a new idea, and a survey of what's been done before, what was right about
those attempts and what was wrong, needs to be considered before adding yet
another PHP package system into the mix.

The main influence of Composer and autoloaders on how packages are prepared
is that PHP has become far more object oriented than it was before.  Prior
to PHP 5.3 object oriented programming was a great option, but since
autoloaders cannot bring in functions (at least not directly, they can be
cheated in by bundling them in static classes which are all but namespaces)
the whole ecosystem has become heavily object oriented.
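
The asymmetry can be sketched as follows. The loader below only records
what it is asked for (a real one would map the name to a file and require
it), and the Str class and all names are invented for the example:

```php
<?php
$requested = [];
spl_autoload_register(function (string $name) use (&$requested): void {
    $requested[] = $name;   // stand-in for "find the file and require it"
});

class_exists('App\\Missing\\Thing');        // unknown class: loader is called
function_exists('App\\missing_function');   // unknown function: loader is not

// The usual workaround: static methods on a class, which can be autoloaded.
final class Str
{
    public static function shout(string $s): string
    {
        return strtoupper($s) . '!';
    }
}

$shouted = Str::shout('hi');
```

Only the class name shows up in $requested. That is why helper functions
tend to get wrapped in static classes or loaded eagerly.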

That isn't a bad thing.  But it does need to be acknowledged.  Before I go
further I'll now respond to some other points made by others in this
thread.


On Thu, Jun 27, 2024 at 6:01 PM Jordan LeDoux <[email protected]>
wrote:

>
> Ah, yes, THAT'S a fair point. While the idea of optimizing the
> engine/parser for modules has merit as part of a user modules proposal, I
> agree that many of the specifics proposed here feel pretty scatter-shot and
> unclear.
>
> The scoping operator change I simply ignored, as that feels to me like
> just asking "I would like to program in Node" and there's no clear benefit
> to changing the scoping operator outlined, while there is a clear detriment
> to eliminating the concatenation operator entirely.
>
> Mostly I ignored that aspect of it, because I assumed that all the people
> capable of implementing this proposal would just refuse stuff like that
> outright, and that the inclusion of it would guarantee the RFC fails, so no
> point in worrying.
>
> But the broader question you are presenting about the focus and goals of
> the proposal, and how the specifics relate to that, is actually a question
> that I share.
>

I hope the above begins to address that.  Package management I think should
be the main topic, and from here forward I'll leave aside any unnecessary
parser changes that might occur when code is imported, as they are
distractions. For those I continue to bring up I'll state why, and those who
are more familiar with how the engine works can speak to whether such
changes truly are useful or unnecessary. If I'm wrong, then dropping such
suggestions entirely is the way to go.



On Thu, Jun 27, 2024 at 6:07 PM Rob Landers <[email protected]> wrote:

>
> Internals has made it pretty clear: no more declare or ini entries (unless
> it is absolutely needed).
>

Noted.


>
> I personally don’t like it because it uses arrays, which are opaque, easy
> to typo, and hard to document/check.
>
> Instead, maybe consider a new Reflection API?
>
> (new ReflectionModule)->import('MyModule')->run()
>

That doesn't solve the problem of how the parser figures out where the code
is.  That's got to happen somewhere.  I'll come back to this in a moment.


> Keep in mind that extensions typically expose functions automatically, and
> under the original proposal those functions have to be imported to be used:
> import mysql_query
>
>
> they also do now, unless you either prefix them with \ or rely on the
> fallback resolution system. I’m honestly not sure we need a new syntax for
> this, but maybe just disable the global fallback system in modules?
>
>
I'm not sure that's a good idea, neither was this.
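
For anyone following along, the fallback resolution being discussed looks
roughly like this today. The App\Text namespace and the deliberately silly
strlen() override are only for illustration:

```php
<?php
namespace App\Text;

// A namespaced function that shadows the built-in for unqualified calls.
function strlen(string $s): int { return -1; }

$local    = strlen('abc');      // App\Text\strlen is tried first
$global   = \strlen('abc');     // leading backslash: always the built-in
$fallback = strtoupper('abc');  // no App\Text\strtoupper, so global is used
```

Disabling the fallback inside modules would turn the last line into an
error unless strtoupper were imported or written as \strtoupper.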


>
> Perhaps PHP imports, unlike their JavaScript or even Java C# counterparts,
> could be placed in try/catch blocks, with the catch resolving what to do if
> the import misses.
>
Which is something I wrote, yet a day later - yuck. I do not like. But I'm
in brainstorm mode, playing with ideas with everyone.


> I really don't like the extension games seen in node with js, cjs and mjs,
> but there's a precedent for doing it that way.  In their setup if you've
> set modules as the default parse method then cjs can be used to identify
> files that still need to use CommonJS.  And mjs can force the ES6 even in
> default mode.  But it is a bit of a pain and feels like it should be
> avoided.
>
>
> I would argue that it be something seriously considered. Scanning a
> directory in the terminal, in production systems, while diagnosing ongoing
> production issues, it can be very handy to distinguish between the “old
> way” and “new way”, at a glance.
>
>
Fair point.

>
>
>
>
> the only thing I don’t like about this import/export thing is that it
> reminds me of the days when we had to carefully order our require_once
> directives to make sure files were loaded before they were used. So, I
> think it is worth thinking about how loading will work and whether loading
> can be dynamic, hoisted out of function calls (like js), how order matters,
> whether packages can enrich other packages (like doctrine packages) and if
> so, how much they can gain access to internal state, etc. This is very much
> not “a solved problem.”
>
>
> In JavaScript import must be top of the file - you'll get an error if you
> try an import following any other statement unless it's a dynamic import(),
> which is a whole other Promise/Async/Kettle of fish that thankfully PHP
> does not have to take into account as, until you get used to it (and even
> after), async code is a headache.
>
>
> Are you sure? I don’t remember them removing import hoisting, but it’s
> probably more of a typical linting rule because it is hard to reason about.
>

Likely correct - I do use linters heavily.  Hoisting is evil (necessary,
but still evil).


On Thu, Jun 27, 2024 at 6:13 PM Rowan Tommins [IMSoP] <[email protected]>
wrote:

>
> Thank you for sharing. I think it's valuable to explore radical ideas
> sometimes.
>
> I do think PHP badly needs a native concept of "module" or "package" -
> in fact, I'm increasingly convinced it's the inevitable path we'll end
> up on at some point. BUT I think any such concept needs to be built on
> top of what we have right now. That means:
>
> - It should build on or work in harmony with namespaces, not ignore or
> replace them
> - It should be compatible with Composer, but not dependent on it
> - It should be easy to take existing code, and convert it to a
> module/package
> - It should be easy to carry on using that module/package after it's
> been converted
>

On all these points, agreed.


>
> If we can learn from other languages while we do that, I'm all for it;
> but we have to remember that those languages had a completely different
> set of constraints to work with.
>
> For instance, JS has no concept of "namespaces", but does treat function
> names as dynamically scoped alongside variables. So the module system
> needed to give a way of managing how you imported names from one scope
> to another. That's not something PHP needs, because it treats all names
> as global, and namespaces have proved an extremely successful way of
> sharing code without those names colliding.
>
>
Very good point.


>
> Other parts of your e-mail are essentially an unrelated idea, to have
> some new "PHP++" dialect, where a bunch of "bad" things are removed.
>

Let's set that aside then.  Better package management is a big enough
dragon to slay.



On Thu, Jun 27, 2024 at 8:16 PM Mike Schinkel <[email protected]> wrote:

> This is a long reply rather than send a bunch of shorter emails.
>
> > On Jun 27, 2024, at 2:10 PM, Deleu <[email protected]> wrote:
> >
> > Overall, I think PHP has already reached the limit of surviving with
> only PSR-4 and Composer. Single class files were a great solution to get us
> out of the nightmare of require and import on top of PHP files. But
> more than once I have had the desire to declare a couple of interfaces in a
> single file, or a handful of Enums, etc.
>
> This.
>
> I cannot overemphasize how nice it is to work in Go where I can put almost
> any code I want in any file I want without having to think about
> autoloading.
>

Go is cool. I need to use it more. These days JavaScript gets most of my
time, but PHP will always be the language that got me into
programming professionally and for that I'll be eternally grateful.


>
> As I understand the proposal, this would have no BC issues for code not in
> modules. PHP could then set rules for code in modules that would not need to be
> directly compatible with code outside modules.
>

That is the goal. Module code should be allowed to be different if the
optimization makes for faster running and easier to understand code (for
the programmer, the IDE, and the parser itself). Changing things for the
sake of changing them, no.


>
> At least to me this does not feel as big as trying to implement unicode.
>

I would hope not, because that turned out to be well-nigh impossible.


>
> >  2. No need for autoloaders with modules; I assume this would be
> obvious, right?
> >
> > Depends largely on whether modules can include and require to get access
> to old code. I also didn't discuss how they behave - do they share their
> variables with includes and requires?
>
> I was presuming that all old code would use autoloaders but modules would
> be free to do it a better way.


> If you need to call code from a namespace from inside a module, sure, the
> autoloader would be needed.
>

This is correct and what I had in mind.


>
> > 6. Modules should be directories, not .php files. Having each file be a
> module makes code org really hard.
> >
> > Yes, but that is how JavaScript currently handles things. It is
> currently necessary when making large packages to have an index.js that
> exports out the public members of the module. This entry point is
> configurable through the package.json of the module.
>
> I am envisioning that there could be a module metadata file that would
> have everything that PHP needs to handle the module.  It could even be
> binary, using protobufs:
>

An interesting idea. I need to research this some.


> node_modules IMO is one of the worst things about the JavaScript
> ecosystem. Who has not seen the meme about node_modules being worse than a
> black hole?
>

Fair enough. Or maybe import maps would be a better way forward.


>
> But ensuring that it is possible to disallow loading needs to be
> contemplated in the design. PHP has to be able to know what is a module and
> what isn't without expensive processes.
>

One possible solution is that if modules do not have <?php ?> tags, ever,
and someone directly tries to load a module through http(s) the file won't
execute. Only files with <?php ?> tags are executable by the web sapi.


>
> > 10. Having exports separate from functions and classes seems like it
> would be problematic.
> >
> > Again, this is how they work in JavaScript. Not saying that's the best
> approach, but even if problematic it's a solved problem.
>
> I have evidently not written enough JavaScript to realize that.
>

JavaScript is an odd prototypical duck.  Everything ultimately is an
object. Tha


>
> > I'm also interested in learning on how other module systems out there do
> work.
>
> I am very familiar with modules (packages) in GoLang and think PHP could
> benefit from considering how they work, too.
>
>
I've only touched the surface on how GoLang does things. Some of it was
confusing to me at first. It's also been awhile so I'd need to refresh my
memory to speak to it.


> > On Jun 27, 2024, at 3:22 PM, Michael Morris <[email protected]> wrote:
> > Composer would need a massive rewrite to be a part of this since it
> currently requires the file once it determines it should do so. If we do a
> system where import causes the parser to act differently then that alone
> means imports can't be dealt with in the same manner as other autoloads.
>
> That is why I am strongly recommending a modern symbol resolution system
> within modules vs. autoloading.
>
>
Ok.


> >> I'm not fond of this either.
> >
> > There will need to be a way to define the entrypoint php.  I think
> index.php is reasonable, and if another entry point is desired it can be
> called out -> "mypackage/myentry.php"
>
> Why is an entry point needed?  If there is a module metadata file as I am
> proposing PHP can get all the information it needs from that file. Maybe
> that is the .phm file?
>
>
Maybe. Again, I need to look over this meta data format. Also, how does it
get created?



> > On Jun 27, 2024, at 4:54 PM, Rob Landers <[email protected]> wrote:
> >> Thanks. The sticking point is what degree of change should be
> occurring. PHP isn't as behind an 8-ball as JavaScript is since the dev can
> choose their PHP version and hence deprecation works most of the time for
> getting rid of old stuff. But not always. Changes that are incompatible
> with what came before need a way to do things the old way during
> transition. Again, see PHP 6 and unicode, which snowballed until it was
> clear that even if PHP 6 had been completed it wouldn't be able to run most
> PHP 5 code.
> >
> > It’s not just up to the dev, but the libraries we use and whether or not
> we can easily upgrade (or remove) them to upgrade the php version.
>
> By "upgrade" then, do you mean convert them into modules, or just be able
> to use them as-is.
>
> As I read it and am envisioning it, there would be no changes needed to be
> able to use them as-is.
>
>
Any system that blocks existing code from being used would be a non-starter
for inclusion.


> > I think it would be a mistake to exclude old code and/or prevent
> templating. Not only are there now decades old code in some orgs, but how
> would you write an email sender that sent templated emails, provide html,
> generate code, etc? There has to be an output from the code to be useful.
>
> Excluding old code or templates from modules would not exclude them from
> working as they currently do outside modules.  As I see it, modules would
> be more about exporting classes and functions, not generating output per se.
>
> So all that decades of old code could continue to exist outside modules,
> as it currently does today.
>
>
Exactly this.



> > I think it’s fine to use js as an inspiration, but it isn’t the only one
> out there. There is some precedent to consider directories as modules (go
> calls them “packages”) and especially in PHP where namespaces (due to PSR-4
> autoloading) typically match directory structures.
>
> Totally agree about inspiration for modules outside JS, but not sure that
> PHP namespaces are the best place to look for inspiration.
>
> Namespaces by their very nature were designed to enable autoloading with a
> one-to-one file to class or interface, and by nature add conceptual scope
> and complexity to a project that would not be required if a modern
> module/package system were added to PHP.
>
> Modules could and IMO should be a rethink that learns the lessons other
> languages have learned over the past decade+.
>
>
Agreed.


> >> Node.js uses package.json and the attendant npm to do this sort of prep
> work.  And it's a critical part of this since modules can be versioned, and
> different modules may need to run different specific versions of other
> modules.
> >
> > Please, please, please do not make a json file a configuration language..
> You can’t comment in them, you can’t handle “if php version <9, load this,
> or if this extension is installed, use this.”
> >
> > Maybe that is desirable, but doing things slightly different based on
> extensions loaded is def a thing.
>
> I don't think commenting is important in this file, or even desired.
>
> As I proposed above, these could be protobuf or phar.  These should be
> build artifacts that can be generated on the fly during development or for
> newbies even during deployment, not hand-managed.
>

Hand management has value in learning the underlying concepts though.


>
> I could see the generation of two files; one in binary form and one that
> is readonly so a developer can double-check what is in the current protobuf
> or phar file.
>
> >> Those are implementation details a little further down the road than
> we're ready for, I think.
> >
> > Personally, if these are going to have any special syntax, we probably
> shouldn’t call them .php files. Maybe .phm?
>
> I was going to suggest that, and then remembered earlier PHP when there
> were multiple file extensions and that was a nightmare.
>
> This does remind me to mention that I think there should be a required
> "module" declaration at the top of each file just like Go requires a
> "package" declaration at the top of each file. That would make it trivial
> for tooling to differentiate, even with grep


Fun idea, if the @ operator is ditched as an error suppression operator it
could be used as the package operator.  (If I manage to talk everyone into
getting rid of one thing, it's @).


> .
>
> > the only thing I don’t like about this import/export thing is that it
> reminds me of the days when we had to carefully order our require_once
> directives to make sure files were loaded before they were used. So, I
> think it is worth thinking about how loading will work and whether loading
> can be dynamic, hoisted out of function calls (like js), how order matters,
> whether packages can enrich other packages (like doctrine packages) and if
> so, how much they can gain access to internal state, etc. This is very much
> not “a solved problem.”
>
> That is why I proposed having a "compiled" module symbol table to
> eliminate most (all?) of those issues.
>

The more you bring it up, the more I am reminded of the import-map
directive added to client-side JavaScript.
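
To make the comparison concrete, here is a rough sketch of import-map style
resolution written in today's PHP. Nothing like this exists in the engine;
the map contents, the paths, and resolveSpecifier() are all hypothetical:

```php
<?php
// Hypothetical import map: specifier prefix => base directory.
$importMap = [
    'acme/'      => '/srv/app/modules/acme/',
    'acme/http/' => '/srv/app/modules/acme-http/src/',
];

function resolveSpecifier(array $map, string $specifier): ?string
{
    // Longest matching prefix wins, as with browser import maps.
    uksort($map, fn (string $a, string $b): int => strlen($b) <=> strlen($a));
    foreach ($map as $prefix => $base) {
        if (str_starts_with($specifier, $prefix)) {
            return $base . substr($specifier, strlen($prefix)) . '.php';
        }
    }
    return null; // unmapped specifier
}

$path = resolveSpecifier($importMap, 'acme/http/client');
```

A precompiled symbol table would presumably do this lookup once at build
time rather than on every import, but the idea of a single authoritative
name-to-location map is the same.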



>
> > On Jun 27, 2024, at 6:00 PM, Rowan Tommins [IMSoP] <[email protected]>
> wrote:
> > I do think PHP badly needs a native concept of "module" or "package" -
> in fact, I'm increasingly convinced it's the inevitable path we'll end up
> on at some point. BUT I think any such concept needs to be built on top of
> what we have right now. That means:
> >
> > - It should build on or work in harmony with namespaces, not ignore or
> replace them
>
> It may be an unpopular opinion, but I would argue that namespaces were
> optimized for autoloading and the one class/interface per file paradigm,
> not to mention the regrettable choice of using the escape operator to
> separate namespaces and the fact that PHP throws away a lot of information
> about namespaces at runtime.
>

I remember when the choice to use \ was made.  I've rarely been so angry
about a language design choice before or since.  I've gotten used to it,
but seeing \\ all over the place in strings is still.. yuck.


>
> IMO allowing modules to eventually deprecate namespaces — at least in a
> defacto form of deprecation — would allow modules to be much better than if
> they try to cling to a less desirable past.
>
> > - It should be easy to take existing code, and convert it to a
> module/package
>
> Maybe, but not if that means modules retain baggage that should really be
> jettisoned.
>
> > and namespaces have proved an extremely successful way of sharing code
> without those names colliding.
>
> At the expense of a lot more complexity than necessary, yes.
>
> Managing symbols in a module need not be a hard problem if PHP recognizes
> modules internally rather than trying to munge everything into a global
> namespace like with namespaces.
>

I'm inclined to agree on these points, but I also don't know the engine
internals that well.  Intuitively it would seem keeping the symbol table
small would make the code go faster.


> > On Jun 27, 2024, at 6:41 PM, Larry Garfield <[email protected]>
> wrote:
> > What problem would packages/modules/whatever be solving that isn't
> already adequately solved?
>
> Not speaking for Michael, obviously, but speaking for what I envision:
>
> 1. Adding a module/package system to PHP with modern module features
>         - including module private, module function, and module properties
> 2. Providing an alternative to auto-loader-optimized namespaces.
>         - better code management and better page load performance
>
>
Couldn't have said it better myself.


> > Do we want:
> >
> > 1. Packages and namespaces are synonymous?  (This is roughly how JVM
> languages work, I believe.)
> > 2. Packages and files are synonymous?  (This is how Python and
> Javascript work.)
> > 3. All packages correspond to a namespace, but not all namespaces are a
> package?
>
> I would argue packages (modules) should be orthogonal to namespaces to
> allow modules to be optimized for what other languages have learned about
> packages/modules over the past decade+.
>
> The fact that namespaces use the escape character as a separator, that PHP
> does not keep track of namespaces after parsing, and that they were
> optimized for one-to-one symbol to file autoloading are enough reasons IMO
> to envision a way to move on from them.
>
> > And given the near-universality of PSR-4 file structure, what impact
> would each of those have in practice?
>
> Orthogonal.  Old way vs new way.  But still completely usable, just not as
> modules without conversion.
>
> The fact PSR-4 exists is an artifact of autoloading single-symbol files
> and thus a sunk cost; it does not mean that PHP should cling to it for
> modules just because it currently exists.
>
>
I have nothing to add to the above.

