> On Jul 8, 2024, at 12:03 PM, Alexandru Pătrănescu <[email protected]> wrote:
> Managed to simplify it like this and I find it reasonable enough:
> function strtok2(string $string, ?string $token = null): string|false {
>     static $tokenGenerator = null;
>     if ($token) {
>         $tokenGenerator = (function(string $characters) use ($string): \Generator {
>             $pos = 0;
>             while (true) {
>                 $pos += \strspn($string, $characters, $pos);
>                 $len = \strcspn($string, $characters, $pos);
>                 if ($len === 0)
>                     return;
>                 $token = \substr($string, $pos, $len);
>                 $characters = yield $token;
>                 $pos += $len;
>             }
>         })($token);
>         return $tokenGenerator->current() ?? false;
>     }
>     return $tokenGenerator?->send($string) ?? false;
> }
Hi Alexandru,
Great attempt.
Unfortunately, however, it seems to be around 4.5 times slower than strtok():
https://3v4l.org/7lXlM#v8.3.9
> On Jul 8, 2024, at 2:23 PM, Claude Pache <[email protected]> wrote:
>> On Jul 6, 2024, at 3:22 AM, Mike Schinkel <[email protected]> wrote:
>>> On Jul 5, 2024, at 1:11 PM, Claude Pache <[email protected]> wrote:
>>> * About strtok(): An exact replacement of strtok() that is reasonably
>>> performant may be constructed with a sequence of strspn(...) and strcspn(...) calls; here is an
>>> implementation using a generator in order to keep the state: https://3v4l.org/926tC
>> Well your modern_strtok() function is not an _exact_ replacement as it requires using a
>> generator and thus forces the restructure of the code that calls strtok().
>
> Yes, of course, I meant: it has the exact same semantics. You cannot have the same API without
> keeping global state somewhere. If you use strtok() for what it was meant for, you must restructure
> your code if you want to eliminate hidden global state.
Hi Claude,
Agreed that semantics would have to change. Somewhat.
The reason I made the comment is that when I saw you call it an "exact replacement", I
was concerned that people not paying close attention to the thread might see it and think: "Oh,
okay, there is an exact, drop-in replacement, so I will vote to deprecate," when that same person
might not vote to deprecate if they did not think there was an exact drop-in replacement. I did
my best to soften my words so they would come off as matter-of-fact rather than accusatory.
If I failed at that, I apologize.
Anyway, your comments about needing to change the semantics got me thinking that remediating
code that uses strtok() could be much closer to a drop-in replacement than a generator is,
assuming there is a will to actually tackle this. And this is small enough in scope that I
might even be able to learn enough C-for-PHP to create a pull request, if the idea were blessed.
Consider this simple code for using strtok():
$token = strtok($content, ',');
while ($token !== false) {
    $token = strtok(',');
}
Now compare to this potential enhancement:
$handle = strtok($content, ',', STRTOK_INIT);
do {
    $token = strtok($handle);
} while ($token !== false);
strtok($handle, STRTOK_RELEASE);
This would be much closer to a drop-in replacement, would allow PHP to keep the fast strtok()
function, AND would address the reason for deprecation.
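
To make the idea of per-handle state concrete, here is a rough userland sketch. It is only an
illustration: the StrtokHandle class and its next() method are names I made up, nothing like
them (or STRTOK_INIT / STRTOK_RELEASE) exists in PHP today, and a native version would look
different. It uses the same strspn()/strcspn() approach discussed earlier in the thread and
assumes PHP 8.0 or later.

```php
<?php
// Hypothetical userland stand-in for the proposed handle: each instance
// keeps its own string and position, so no state is shared between callers.
final class StrtokHandle
{
    private int $pos = 0;

    public function __construct(
        private string $string,
        private string $delimiters
    ) {}

    // Returns the next token, or false when none are left (like strtok()).
    // Passing $delimiters switches the delimiter set from this call on.
    public function next(?string $delimiters = null): string|false
    {
        if ($delimiters !== null) {
            $this->delimiters = $delimiters;
        }
        // Skip leading delimiters, then measure the run of non-delimiters.
        $this->pos += strspn($this->string, $this->delimiters, $this->pos);
        $len = strcspn($this->string, $this->delimiters, $this->pos);
        if ($len === 0) {
            return false;
        }
        $token = substr($this->string, $this->pos, $len);
        // Step past the token and the one delimiter that ended it,
        // without running past the end of the string.
        $this->pos = min($this->pos + $len + 1, strlen($this->string));
        return $token;
    }
}

// Two tokenizers interleaved: the inner loop does not disturb the outer one.
$seen = [];
$rows = new StrtokHandle('a,b;c,d', ';');
while (($row = $rows->next()) !== false) {
    $cols = new StrtokHandle($row, ',');
    while (($col = $cols->next()) !== false) {
        $seen[] = $col;
    }
}
// $seen is now ['a', 'b', 'c', 'd']
```

Nesting like this is exactly what breaks with today's strtok(), because the inner call replaces
the single hidden string the outer loop depends on. With a handle, each loop owns its state.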
See any reason this approach would not be viable?
-Mike