ext/bcmath: Performance improvement bcsqrt()
#18771
Sorry, it was still incomplete.
The code is ready for review.
It looks like
done
No longer using
… are no longer used.
done
First two commits are fine, but simultaneously refactored and optimized code is too hard to follow. The commits need to be split between refactoring and actual optimization, and a high level picture must be explained in the commit descriptions.
  /* Initial checks. */
- if (bc_is_neg(local_num)) {
+ if (bc_is_neg(*num)) {
One thing I wonder is whether this was the right move.
On the one hand, keeping *num in a local variable may save a pointer load; on the other hand, it may get spilled in between calls anyway.
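To make the trade-off in this comment concrete, here is a minimal sketch of the two forms. The type `demo_num` and the helpers `demo_is_neg` and `demo_is_zero` are hypothetical stand-ins, not the real bcmath definitions, and the actual effect depends on the compiler and on whether the callees get inlined.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a bcmath number; NOT the real bc_struct. */
typedef struct {
    bool negative;
    size_t len;
} demo_num;

/* Hypothetical helpers, assumed here only for illustration. */
static bool demo_is_neg(const demo_num *n) { return n->negative; }
static bool demo_is_zero(const demo_num *n) { return n->len == 0; }

/* Variant A: dereference the double pointer at each use.
 * If a callee is opaque to the compiler, *num generally has to be
 * reloaded from memory after the call. */
int classify_deref(demo_num **num)
{
    if (demo_is_neg(*num)) {
        return -1;
    }
    if (demo_is_zero(*num)) {
        return 0;
    }
    return 1;
}

/* Variant B: load *num once into a local.
 * This can save the reload, but the local may still be spilled to the
 * stack across calls if registers run out. */
int classify_local(demo_num **num)
{
    demo_num *local_num = *num;

    if (demo_is_neg(local_num)) {
        return -1;
    }
    if (demo_is_zero(local_num)) {
        return 0;
    }
    return 1;
}
```

Both variants return the same result; which one generates better code would have to be checked in the compiler output for the real function.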
Got it. I’ll split the commits starting from the third one.
Sure
Test
Approximately 80 million calculations were performed using values of various magnitudes and scales, and all results matched those from the previous implementation.
Benchmark
Performance improvements are particularly noticeable in the following cases:
For large values that do not fall into the above categories, most of the execution time is spent on the iterative process of the Newton-Raphson method, especially on division operations. As a result, while there may be some minor gains from reducing memory allocations and lowering the cost of converting BCD to BC_VECTOR, there is no significant improvement in overall performance.
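As a rough illustration of why division dominates, the sketch below runs a Newton-Raphson integer square root on a fixed-width `uint64_t` and counts the divisions it performs. This is not the bcmath code: `isqrt_newton` and its `divisions` out-parameter are hypothetical names, and bcmath works on arbitrary-precision numbers, where each of these divisions is far more expensive than it is here.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns floor(sqrt(n)) using Newton-Raphson iteration.
 * If divisions is non-NULL, it receives the number of n / x
 * divisions that were performed. Hypothetical helper for illustration. */
uint64_t isqrt_newton(uint64_t n, unsigned *divisions)
{
    unsigned count = 0;
    uint64_t x = n;

    if (n >= 2) {
        /* Initial guess: a power of two that is >= sqrt(n). */
        unsigned bits = 0;
        for (uint64_t t = n; t != 0; t >>= 1) {
            bits++;
        }
        x = (uint64_t)1 << ((bits + 1) / 2);

        for (;;) {
            /* One Newton step: the n / x division is the costly part. */
            uint64_t y = (x + n / x) / 2;
            count++;
            if (y >= x) {
                break; /* no further decrease: x is floor(sqrt(n)) */
            }
            x = y;
        }
    }

    if (divisions != NULL) {
        *divisions = count;
    }
    return x;
}
```

Every iteration needs one full-width division by the current estimate, so for multi-limb operands the iteration count multiplied by the division cost tends to outweigh savings made elsewhere, such as fewer allocations.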
Small size value (fast path)
Code:
Result:
Middle size value (< 1) (standard path)
Code:
Result:
Middle size value 1 (fast path)
The new logic ignores unnecessary scales in the calculation, so this is a fast path.
Code:
Result:
Middle size value 2 (standard path)
Code:
Result:
Big size value (standard path)
Code:
Result: