On Sat, Jul 27, 2024, at 15:26, Christoph M. Becker wrote:
> On 15.02.2023 at 06:18, Rowan Tommins wrote:
>
> > On 15 February 2023 02:35:42 GMT, Thomas Hruska <[email protected]> wrote:
> >
> >> On 2/14/2023 2:02 PM, Rowan Tommins wrote:
> >>
> >> I thought about that but didn't know how well it would be received nor, perhaps
> >> more importantly, the direction it should take (i.e. a formal Zend type in the engine, extending the
> >> existing zend_string type, a class, some combination, or something else entirely). All of the more
> >> advanced options I came up with would have required some code changes to the PHP source itself with
> >> a new data type being the most involved and probably the most controversial.
> >
> > My instinct was that it could just be a built-in class, with an internal pointer to a
> > zend_string that's completely invisible to userland. Something like how the SimpleXML and DOM
> > objects just point into a libxml parse result.
> >
> > Then to add to existing functions requires changing an argument type from string to
> > string|Buffer, rather than adding new arguments.
> >
> > No change to the type system needed, internally or externally, just some code to unwrap
> > the pointer. But perhaps I'm being naive and oversimplifying, as I don't have a deep
> > understanding of the engine.
> >
> >> I'm not entirely sure what the next step here should be. Should I go research
> >> the above, or go back and develop/test and then propose something concrete in an OO direction and
> >> gather feedback at that point, or should we hash it out a bit more here on the list to get a more
> >> specific direction to go in?
> >
> > Well, those were just my thoughts; maybe someone else will come along shortly with a very
> > different take.
>
> I'm very late on this discussion, but I think it is an interesting
> topic, and maybe <https://github.com/cmb69/php-stringbuilder>, which I
> had written long ago just to check some assumptions, can serve as POC.
> It is certainly possible to have such a string buffer class without
> having to patch the engine; it could even be made available as PECL
> extension (first).
>
> Note that this StringBuilder uses smart_strs[1], which might be a good
> idea or not. But certainly you could use some other internal handling;
> interoperability with zend_strings[2] requires copying the char arrays
> in most cases anyway, since these have a fixed length, and if these
> copies are reduced to a minimum (i.e. the new class has enough
> flexibility to work without casting to and from string), that should be
> bearable.
>
> Not sure if that would work for the "gd imageexportpixels() and
> imageimportpixels()" RFC[3], but it might be worth investigating.
>
> [1]
> <https://www.phpinternalsbook.com/php7/internal_types/strings/smart_str.html>
> [2]
> <https://www.phpinternalsbook.com/php7/internal_types/strings/zend_strings.html>
> [3] <https://wiki.php.net/rfc/gd_image_export_import_pixels>
>
> Cheers,
> Christoph
>
Huh, I am also very late, and somewhat poignantly, last weekend I managed to refactor all
zend_strings to contain a char* instead of char[1], with the char* pointing to the memory just after
the pointer. It increased zend_string by a few bytes on a 64-bit machine, but would allow for some
nice optimizations, such as zend_strings sharing memory (effectively removing the need for the
current interned strings implementation). I ended up ditching it because it would break literally
every extension that does its own allocations instead of calling zend_string_alloc|init(), and it
was also hard to manage when copying strings, which some core extensions also do themselves instead
of calling the core zend_string_* functions. Needless to say, "vanilla php" worked fine and all
tests passed.
I did submit a small part of my refactoring here: https://github.com/php/php-src/pull/15054 but
even something that simple didn't seem well received. So, I won't continue this approach.
But, fwiw, I wouldn't advise changing zend_strings too much; many extensions appear to do one
or more of three things: their own allocations, their own copying, and/or their own freeing.
— Rob