Re: [RFC] Deprecations for PHP 8.4

From: Date: Thu, 27 Jun 2024 07:44:55 +0000
Subject: Re: [RFC] Deprecations for PHP 8.4
Groups: php.internals
Hi,

On 26.06.24 07:18, Mike Schinkel wrote:
On Jun 25, 2024, at 4:51 PM, Gina P. Banyard wrote:
On Tuesday, 25 June 2024 at 19:06, Mike Schinkel wrote:
strtok()
=====
strtok() is found 35k times in GitHub:
    https://github.com/search?q=strtok%28+language%3APHP+&type=code
It is commonly used as a "left part of string up to a character" function in addition to its intended use for tokenizing. I would prefer that it not be deprecated because of BC breakage, but IF it is deprecated I would suggest adding a one-for-one replacement function for the "left part of string up to a character" use-case; maybe str_left("abc.txt",".") returning "abc".
For this exact case of extracting a file name without an extension, you should really just use:
    pathinfo($filepath, PATHINFO_FILENAME);
But for something more generic, you can just do:
    explode($delimiter, $str)[0];
So I really don't see why we would need a "str_left()" function.
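To make the generic suggestion concrete, here is a minimal sketch of what a str_left() helper could look like in userland. The name str_left() is only the hypothetical one floated in this thread, not an existing PHP function, and the limit argument of 2 is my own addition so that explode() stops after the first delimiter. The last two calls show one way this differs from strtok(), which skips leading delimiters.

```php
<?php
// Hypothetical helper: str_left() is the name suggested in this thread,
// it is not part of PHP.
function str_left(string $str, string $delimiter): string
{
    // A limit of 2 makes explode() stop after the first delimiter, so the
    // remainder of the string is not split into further pieces.
    return explode($delimiter, $str, 2)[0];
}

var_dump(str_left("abc.txt", "."));      // "abc"
var_dump(str_left("no-delimiter", ".")); // no delimiter: the whole string
var_dump(str_left(".abc.txt", "."));     // "" (empty left part)
var_dump(strtok(".abc.txt", "."));       // "abc" (leading delimiter skipped)
```

So even this simple replacement is not a drop-in for every strtok() call; whether the difference matters depends on the input.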
Ah, *the danger of providing a specific example of a broader use-case* is that someone will invariably discredit the specific example instead of focusing on its applicability to the broader use-case. 🤦‍♂️ To wit, here are seven (7) use-cases for which pathinfo() is not a viable alternative:
    https://3v4l.org/RDYFs#v8.3.8
Note those seven use-cases are found in around the first 25 results when searching GitHub for "strtok(".  I could probably find more if I kept looking:
    https://github.com/search?q=strtok%28+language%3APHP+&type=code
Regarding explode($delimiter, $str)[0]: unless it is special-cased during compilation, it is a really inefficient way to find the substring up to the first delimiter character, especially for large strings and/or in a tight loop where the explode() is contained in a called function.

Here is a benchmark (https://onlinephp.io/c/87341) that fully processes a 3972-byte file with 359 commas. Averaged over the runs I performed, explode($delimiter, $str)[0] took right at *90 times* longer than strtok($str, $delimiter). Imagine if the file were 39,720 bytes, or larger, instead.
    Size of file:                3972
    Number of commas:            359
    Time taken for strtok:       0.0034 seconds
    Time taken for explode:      0.3036 seconds
    *Times strtok() faster:     89.1*
Yes, the above processes the entire file using explode()[0] each time rather than first using explode(",") once, because of the equivalent of the N+1 problem[1] where the explode() is buried in a function.

This illustrates why strtok() is so good for its primary use-case of parsing text files: it is fast and does not use heaps of memory on every token. This leads me to think strtok() *should not* be deprecated given how inefficient string handling in PHP can otherwise be, at least not without a much more efficient object for string parsing.
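For readers who want to try the comparison themselves, here is a rough sketch of the pattern being described. It is not the code behind the linked benchmark: the synthetic input, the field contents, and the helper names (tokens_strtok, left_part, tokens_explode) are all assumptions made for illustration, and the timings will vary by machine and PHP version.

```php
<?php
// Synthetic input: 360 non-empty fields joined by 359 commas
// (sizes and contents are illustrative assumptions).
$data = implode(',', array_fill(0, 360, 'field-data'));

// Approach 1: strtok() remembers its position in the string between calls.
function tokens_strtok(string $data): array
{
    $out = [];
    for ($tok = strtok($data, ','); $tok !== false; $tok = strtok(',')) {
        $out[] = $tok;
    }
    return $out;
}

// Approach 2: a helper that only returns the left part, so the whole
// remaining string is split again on every call (the "N+1" pattern).
function left_part(string $str, string $delimiter): string
{
    return explode($delimiter, $str)[0];
}

function tokens_explode(string $data): array
{
    $out = [];
    while ($data !== '') {
        $tok = left_part($data, ',');
        $out[] = $tok;
        // Drop the consumed token and the delimiter that follows it.
        $data = substr($data, strlen($tok) + 1);
    }
    return $out;
}

$start = hrtime(true);
$a = tokens_strtok($data);
$strtokNs = hrtime(true) - $start;

$start = hrtime(true);
$b = tokens_explode($data);
$explodeNs = hrtime(true) - $start;

printf("strtok:  %d ns\nexplode: %d ns\n", $strtokNs, $explodeNs);
```

Both functions return the same tokens for this input; they would differ on empty fields, which strtok() skips.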
I'm with Mike on strtok() and don't understand why it would be on a deprecation list. I see nothing inherently "wrong" or "dangerous" with it: it's one of those functions that "works as intended", and if you know how to use it, it works perfectly the way it is designed.

The variety of suggestions in other replies on how to handle certain use cases of strtok() already shows there's no clear migration path and that it depends on the situation, which is the worst.

Compare this with suggestions like sha1() or similar, where the deprecation is about "the function, but not the functionality", because SHA1 is available by other means. But there's no clear alternative to strtok(), as it is its own kind.

👎 on deprecating it; if a gotcha with it is not clear (e.g. using it in different scopes, as this was brought up), I see this rather as a "documentation problem".

cheers,
- Markus
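For reference, the "different scopes" gotcha mentioned above can be sketched as follows. strtok() tracks a single string per request, so a nested call in another function replaces the string the outer code was tokenizing. The function name first_word() is hypothetical and only serves the illustration.

```php
<?php
// Hypothetical helper that happens to use strtok() internally.
function first_word(string $s): string
{
    // Passing a new string here replaces the one strtok() was tracking.
    return strtok($s, ' ');
}

$first  = strtok('a,b,c', ',');  // "a"
$inner  = first_word('x y');     // "x", and strtok() now tracks 'x y'
$second = strtok(',');           // continues on 'x y', so this is not "b"

var_dump($first, $inner, $second);
```

This is the kind of behaviour that a note in the manual could spell out.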
