Re: Zephir, and other tangents

From: Date: Wed, 11 Sep 2024 19:12:53 +0000
Subject: Re: Zephir, and other tangents
References: 1 2  Groups: php.internals 
Hi Rowan,

> On Sep 11, 2024, at 2:55 AM, Rowan Tommins [IMSoP] <[email protected]> wrote:
> Perhaps you're unaware that classes in core already can, and do, provide operator
> overloading. GMP is the "poster child" for it, overloading a bunch of mathematical
> operators, but the mechanism it uses to do so is reasonably straightforward and available to any
> extension.

I was making an (evidently) uninformed assumption that it was non-trivial to add operator overloading
at the C level. If it is easy, then my comments were moot.

That said, writing extensions in C and deploying them is non-trivial compared to writing code in
PHP, so there is that. ¯\_(ツ)_/¯

> I've never liked that approach, because it means users can't write polyfills, or even
> stub objects, that have these special behaviours. It feels weird for the language to define
> behaviour that isn't expressible in the language. 

Understood. In _general_ I don't like it either, but I will use as an analogy a prior
discussion regarding __toArray, and I quote[1]:

"For the "convertible to array" case, I think __toArray, or an interface specifying
just that one method, would make more sense than combining it with the existing interfaces. I'm
sceptical of that concept, though, because most objects could be converted to many different arrays
in different circumstances, each of which should be given a different and descriptive name."

I am of course quoting you.   

Similarly, operators could mean different things, e.g. it is possible to have different meanings of
equal, and even different meanings of plus. Or worse, they could be applied in ways that are
nonsensical to anybody but the developer who implements them (that would be the same kind of
developer who names their variables after Game of Thrones characters).
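
As a rough illustration of the "different meanings of plus" point, here is a minimal sketch. It is
in Python only because PHP has no userland operator overloading to demonstrate with; the Series
class and its method names are hypothetical, made up purely for this example.

```python
class Series:
    """Hypothetical container used only to illustrate the ambiguity."""

    def __init__(self, *values):
        self.values = list(values)

    def concat(self, other):
        # One plausible meaning of "plus": append the other object's values.
        return Series(*self.values, *other.values)

    def add_elementwise(self, other):
        # Another plausible meaning of "plus": pairwise sums.
        return Series(*(a + b for a, b in zip(self.values, other.values)))

    def __add__(self, other):
        # The class author has to pick ONE meaning for "+". Here it happens
        # to be element-wise, but a caller cannot tell that from "+" alone.
        return self.add_elementwise(other)


a = Series(1, 2)
b = Series(10, 20)

print((a + b).values)       # [11, 22]
print(a.concat(b).values)   # [1, 2, 10, 20]
```

With the named methods a reader can see which meaning is intended; with "+" they have to go and
read the class to find out.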

That is why I am not a fan of operator overloading, just as you were not a fan of __toArray, which to
me is less problematic than overloaded operators because it has a much smaller scope and is actually
quite useful for a common set of use-cases regardless of the potential for confusion. But I digress.

> It also risks conflicting with a future language feature that overlaps, as happened with all
> native functions marked as accepting string automatically coercing nulls, but all userland ones
> rejecting it. Deprecating that difference has caused a lot of friction.

That is a little different in that it was a behavior that occurred in both core and userland, whereas
only allowing operator overloading in core would mean there would be no userland differences that
could conflict.

Whatever the case, if there are only two options, 1.) no operator overloading and 2.) userland
operator overloading, I would far prefer the former.

> This is the tricky part for me: some of the things people want to do in extensions are
> explicitly the kinds of thing a shared host would not want them to, such as interface to system
> libraries, perform manual memory management, interact with other processes on the host.
> 
> If WASM can provide some kind of sandbox, while still allowing a good portion of the features
> people actually want to write in extensions, I can imagine that being useful. But how exactly that
> would work I have no idea, so can't really comment further.

WebAssembly has a deny-by-default design, so it could be something to seriously consider for
extensibility in PHP. Implementations start with a full sandbox[2] and only add what they need,
which avoids those kinds of concerns.

Also, all memory manipulation is sandboxed, though there are still potential vulnerabilities within
the sandbox, so a project that incorporates WASM needs to be careful.  WASM written in C/C++ can
have memory issues just like in regular C/C++, for example.  One option would be to allow only
AssemblyScript source for WASM. Another would be a config option that a web-host could set to only
allow signed modules, but that admittedly would open another can of worms.  But the memory issues
cannot leak out of the module or affect other modules or the system, if implemented with total
memory constraints.

That said, web hosts can't stop PHP developers from creating infinite loops, so the memory
issues with WASM don't feel like a much bigger concern given their sandboxed nature.
I've copied several other links for reference: [4][5][6]


>>> The overall trend is to have only what's absolutely necessary in an extension.
>> 
>> Not sure what you mean here.
> 
> I mean, like Phalcon plans to, ship both a binary extension and a PHP library, putting only
> certain essential functionality in the extension. It's how MongoDB ships their PHP bindings,
> for instance - the extension provides low-level protocol support which is not intended for every day
> use; the library is then free to evolve the user-facing parts more freely.

Gotcha.  

I think that actually supports what I was saying; people would gravitate to only doing in an
extension what they cannot do in PHP itself, and over time if PHP itself improves there is reason to
migrate more code to PHP.  

But there can still be reasons to not allow some things in userland. Some things like __toArray.

-Mike

[1] https://www.mail-archive.com/[email protected]/msg100001.html
[2] https://thenewstack.io/how-webassembly-offers-secure-development-through-sandboxing/
[3] https://radu-matei.com/blog/practical-guide-to-wasm-memory/
[4] https://www.cs.cmu.edu/~csd-phd-blog/2023/provably-safe-sandboxing-wasm/
[5] https://chatgpt.com/share/b890aede-1c82-412a-89a9-deae99da506e
[6] https://www.assemblyscript.org/

