str_starts_with slower than userland #18474
Comments
Possibly related to #18204 |
Interesting that on PHP 8.2 the results for Fedora 42 x86_64 (64-bit). |
Alright, let's have a look...
I don't think so, the implementation only relies on engine API and doesn't call into userland, it should be easy to beat a PHP implementation normally. |
First of all, I can't fully reproduce the results you get.
If I then remove the pcre one, I get this
Possibly caching/throttling related. Second, the benchmarking code is likely flawed. Rewriting the code like this: https://gist.github.com/nielsdos/c5525659ecad22074b41b1acd2bccac1
This suggests that the overhead of the call is still drowning out the execution time of the function. |
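To make the call-overhead point concrete, here is a minimal sketch of a harness that times an empty closure as a baseline next to the real candidates. It is not the code from either gist: `bench()`, `$cases`, and the closure names are hypothetical, and the test data is invented for illustration.

```php
<?php
// Hypothetical harness (not taken from either gist in this thread).

// xdebug instruments every function call, so warn if it is loaded.
if (extension_loaded('xdebug')) {
    fwrite(STDERR, "Warning: xdebug is loaded; timings will be inflated.\n");
}

/**
 * Run $fn over every [haystack, needle] pair in $cases and return the
 * best-of-$rounds wall time in nanoseconds.
 */
function bench(callable $fn, array $cases, int $rounds = 5): int|float
{
    $best = PHP_INT_MAX;
    for ($r = 0; $r < $rounds; $r++) {
        $start = hrtime(true);
        foreach ($cases as [$haystack, $needle]) {
            $fn($haystack, $needle);
        }
        $best = min($best, hrtime(true) - $start);
    }
    return $best;
}

// Invented data: half the needles match, half differ only in the last byte.
$cases = [];
for ($i = 0; $i < 100000; $i++) {
    $cases[] = ["prefix-$i-suffix", ($i % 2 === 0) ? 'prefix-' : 'prefix_'];
}

// The baseline does no comparison at all; it costs only the loop and the call.
$noop       = static fn(string $h, string $n): bool => true;
$native     = static fn(string $h, string $n): bool => str_starts_with($h, $n);
$viaStrncmp = static fn(string $h, string $n): bool => strncmp($h, $n, strlen($n)) === 0;

$baseline = bench($noop, $cases);
foreach (['str_starts_with' => $native, 'strncmp' => $viaStrncmp] as $label => $fn) {
    $t = bench($fn, $cases);
    printf("%-16s %8.2f ms (baseline %.2f ms)\n", $label, $t / 1e6, $baseline / 1e6);
}
```

If the baseline accounts for most of each measured time, the script is mostly measuring the engine's call and loop overhead rather than the string comparison itself.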
The original report appears to have xdebug enabled. With it enabled, I get similar results to the original report using the updated bench version on MacOS. When testing with xdebug not loaded, I do see something interesting with timing.
On Linux (compiled with gcc) I don't get much of a difference between the calls, the same as the previous comment.
On MacOS (compiled with clang) str_starts_with drops down the ladder for execution time and is always slower.
I did notice that strncmp is defined with ZEND_FASTCALL vs str_starts_with with zend_always_inline; I don't know if clang has different results than GCC for those two. Of course this is only a single call of each function, however the timing spread was the same on multiple calls of the bench script. |
|
Oh I misread the code then. I thought strncmp was defined ZEND_FUNCTION(strncmp) which calls zend_binary_strncmp which relies on memcmp too. I only noticed that this function is defined ZEND_FASTCALL zend_binary_strncmp vs zend_always_inline bool zend_string_starts_with_cstr |
No actually I got it mixed up ;) Which makes the difference even weirder |
If I expand the number of test cases from 100,000 to 1,000,000, the differences almost disappear; changing the size of the test cases also makes a lot of the differences disappear. strncmp_startswith2 is consistently faster with its first-byte fast fail, however.
|
Probably makes sense, as you're measuring the overhead of the engine less and starting to measure the actual comparison more.
Sure, but that check is only worth it if you know likely that the first byte is different (which is true for this benchmark, which is a flaw of the data used), whereas Anyway, what these experiments seem to point out is mostly that the benchmark is flawed. |
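For readers following along, a first-byte fast fail of the kind discussed above could look like the sketch below. The function name `starts_with_fast_fail` and its body are hypothetical and are not the `strncmp_startswith2` code from the benchmark gist; the two sample inputs are invented to contrast an early mismatch with a late one.

```php
<?php
// Hypothetical userland helper illustrating a first-byte fast fail.
function starts_with_fast_fail(string $haystack, string $needle): bool
{
    if ($needle === '') {
        return true; // same result as str_starts_with() for an empty needle
    }
    // Fast path: an empty haystack or a differing first byte cannot match.
    if ($haystack === '' || $haystack[0] !== $needle[0]) {
        return false;
    }
    // Slow path: compare the full prefix.
    return strncmp($haystack, $needle, strlen($needle)) === 0;
}

// Invented inputs: the fast path only helps for the first pair.
$samples = [
    'first byte differs' => ['xprefix-value', 'prefix-'],
    'late mismatch'      => ['prefix_value', 'prefix-'],
    'match'              => ['prefix-value', 'prefix-'],
];

foreach ($samples as $label => [$haystack, $needle]) {
    $same = starts_with_fast_fail($haystack, $needle) === str_starts_with($haystack, $needle);
    printf("%-20s agrees with str_starts_with: %s\n", $label, $same ? 'yes' : 'no');
}
```

On data where most haystacks share the needle's first byte, the extra check is one more comparison on every call, so a benchmark made up mostly of first-byte mismatches will flatter it.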
@nielsdos Ubuntu 24.04 (packages via //ppa.launchpadcontent.net/ondrej/php/ubuntu/) on a "AMD EPYC 7642 48-Core Processor"
MacOS 12.7 (package via homebrew) on an Intel mac.
|
I get similar results as Niels. With the original script:
With Niels' adjustments:
|
@jdarwood007 Which of the benchmarks are you using here? Can you try the one Niels posted? I don't think this issue is actionable. Unfortunately, both compilers and CPUs are unpredictable, and synthetic benchmarks are very good at amplifying this unpredictability. Making changes that help for all machines, all code bases and all input is likely impossible. |
@iluuu1994 I tried the one niels mentioned and still see the slower results. |
I tried both with the packaged 8.4.6 version (from nix) and compiling As mentioned, both |
Also: try without xdebug 🙂 (i.e. extension unloaded) |
I think they did:
|
Description
Firstly, apologies if this is the wrong place.
I was reviewing some code going into a project, and it used str_starts_with. Being new to using it myself, I wondered about its performance compared to other options and suspected that it should outperform any userland functions. If it performed better, that would make a case for a future micro-optimization PR in the project. I shamelessly borrowed some Stack Overflow code, added str_starts_with into the benchmarking script, and ran it:
https://gist.github.com/jdarwood007/1a949424f8cf85ca1b5f66ce38527eb6
PHP 8.4 results
PHP 8.3 had similar results.
8.2 had improved results across the board on all tests except preg_match, but I only have a limited set of systems to test with. Results were in the same order.
Is this something that could be looked into, to see if performance improvements can be made? Seeing userland functions outperform the native function seems to indicate to me that there could be some improvements.
PHP Version
PHP 8.4.5 (cli) (built: Mar 13 2025 15:36:20) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.4.5, Copyright (c) Zend Technologies
with Zend OPcache v8.4.5, Copyright (c), by Zend Technologies
with Xdebug v3.4.2, Copyright (c) 2002-2025, by Derick Rethans
Operating System
Ubuntu 24.04, MacOS 12.7