> On Jun 25, 2024, at 4:51 PM, Gina P. Banyard <[email protected]> wrote:
>
>
> On Tuesday, 25 June 2024 at 19:06, Mike Schinkel <[email protected]> wrote:
>>
>> strtok()
>> =====
>> strtok() is found 35k times in GitHub:
>>
>> https://github.com/search?q=strtok%28+language%3APHP+&type=code
>>
>> It is commonly used as a "left part of string up to a character" in addition to
>> its intended use for tokenizing.
>>
>> I would prefer it not be deprecated because of BC breakage, but IF it is deprecated I would
>> suggest adding a one-for-one replacement function for the "left part of string up to a
>> character" use-case; maybe str_left("abc.txt",".") returning "abc".
>
>
> For this exact case of extracting a file name without an extension, you should really just use:
> pathinfo($filepath, PATHINFO_FILENAME);
> But for something more generic, you can just do:
> explode($delimiter, $str)[0];
>
> So I really don't see why we would need an "str_left()" function.

Ah, the danger of providing a specific example of a broader use-case is that someone will
invariably discredit the specific example instead of focusing on its applicability to the broader
use-case. 🤦♂️

To wit, here are seven (7) use-cases for which pathinfo() is not a viable alternative:

https://3v4l.org/RDYFs#v8.3.8

Note that those seven use-cases came from roughly the first 25 results when searching GitHub for
"strtok(". I could probably find more if I kept looking:

https://github.com/search?q=strtok%28+language%3APHP+&type=code

Regarding explode($delimiter, $str)[0]: unless it is special-cased during compilation, it
is a really inefficient way to find the substring up to the first occurrence of a character,
especially for large strings and/or in a tight loop where the explode() call is buried in a
called function.
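
As a rough illustration of the "left part of string up to a character" use-case (a sketch only; str_left_sketch() is a hypothetical name, not an existing or proposed PHP function), here are a few ways to get that result with functions PHP already has:

```php
<?php
// Hypothetical helper, NOT part of PHP: returns the part of $str before the
// first occurrence of $delimiter, or all of $str if the delimiter is absent.
function str_left_sketch(string $str, string $delimiter): string
{
    $pos = strpos($str, $delimiter);
    return $pos === false ? $str : substr($str, 0, $pos);
}

$a = str_left_sketch("abc.txt", ".");   // "abc"
$b = explode(".", "abc.txt", 2)[0];     // "abc"; the limit of 2 stops after the first split
$c = strstr("abc.txt", ".", true);      // "abc"; but false if "." is not found
$d = strtok("abc.txt", ".");            // "abc"; but skips leading delimiters
```

These are not exact equivalents: strtok() skips leading delimiters and treats its second argument as a set of characters, and strstr() returns false when the needle is absent, so which one fits depends on the inputs expected.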

Here is a benchmark (https://onlinephp.io/c/87341) showing that, averaged over the runs I
performed, fully processing a 3972-byte file with 359 commas took right at 90 times longer
using explode($delimiter, $str)[0] than using strtok($str, $delimiter). Imagine if the file
were 39,720 bytes, or larger, instead.

Size of file:            3972
Number of commas:        359
Time taken for strtok:   0.0034 seconds
Time taken for explode:  0.3036 seconds
Times strtok() faster:   89.1

Yes, the above processes the entire file using explode()[0] each time rather than first using
explode(",") once. That is deliberate: it models the equivalent of the N+1 problem[1], where the
explode() is buried in a function. This illustrates why strtok() is so good for its primary
use-case of parsing text files. strtok() is fast and does not use heaps of memory on every token.
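
The pattern being compared can be sketched roughly as follows. This is not the code behind the benchmark link; the function names are hypothetical and the shape of the loop is an assumption based on the description above:

```php
<?php
// Hypothetical stand-in for a helper that hides explode() inside a function.
function first_field(string $str, string $delimiter): string
{
    return explode($delimiter, $str)[0];   // splits ALL of $str to get one field
}

// N+1-style: every iteration re-splits whatever input remains.
function tokens_via_explode(string $data, string $delimiter): array
{
    $tokens = [];
    while ($data !== '') {
        $field = first_field($data, $delimiter);
        $tokens[] = $field;
        // Drop the field and the delimiter after it; the cast covers PHP
        // versions where substr() returns false past the end of the string.
        $data = (string) substr($data, strlen($field) + strlen($delimiter));
    }
    return $tokens;
}

// strtok() keeps its position internally and advances one token per call.
function tokens_via_strtok(string $data, string $delimiter): array
{
    $tokens = [];
    for ($tok = strtok($data, $delimiter); $tok !== false; $tok = strtok($delimiter)) {
        $tokens[] = $tok;
    }
    return $tokens;
}
```

For input without empty fields, both functions return the same tokens; the difference is that the explode() version builds a full array of the remaining fields on every iteration, while strtok() does not.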

This leads me to think strtok() should not be deprecated given how inefficient string
handling in PHP can otherwise be, at least not without a much more efficient object for string
parsing.

-Mike

[1] https://www.baeldung.com/cs/orm-n-plus-one-select-problem