Skip to content

Conversation

StephanTLavavej
Copy link
Member

@StephanTLavavej StephanTLavavej commented Apr 25, 2025

I noticed that we had a single remaining usage of separately compiled multiprecision arithmetic machinery. I believe that we can replace this with the modern _Unsigned128 type, which is far better tested and more widely exercised. As @AlexGuteniev observed, _Unsigned128 uses intrinsics, so it should also be faster than multprec.cpp. And as a final bonus, it makes this codepath strongly resemble the 32-bit and 64-bit codepaths above.

👨‍🔬 Proof that 128 bits are sufficient

The function comment says:

STL/stl/inc/random

Lines 474 to 482 in b5df16a

// Choose intermediate type:
// To use type T for the intermediate calculation, we must show
// _Ax * (_Mx - 1) + _Cx <= numeric_limits<T>::max()
// Split _Cx:
// _Cx <= numeric_limits<T>::max()
// && _Ax * (_Mx - 1) <= numeric_limits<T>::max() - _Cx
// Divide by _Ax:
// _Cx <= numeric_limits<T>::max()
// && (_Mx - 1) <= (numeric_limits<T>::max() - _Cx) / _Ax

The first part, _Cx <= (2^128 - 1), is obviously true.

The next part is (_Mx - 1) <= ((2^128 - 1) - _Cx) / _Ax. This is the hardest to achieve for the largest LHS (caused by the largest _Mx) and the smallest RHS (caused by the largest _Ax and the largest _Cx). So ((2^64 - 1) - 1) <= ((2^128 - 1) - (2^64 - 1)) / (2^64 - 1) is the worst case.

C:\Temp>py
Python 3.13.3 (tags/v3.13.3:6280bb5, Apr  8 2025, 14:47:33) [MSC v.1943 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> ((2**64 - 1) - 1)
18446744073709551614
>>> ((2**128 - 1) - (2**64 - 1)) // (2**64 - 1)
18446744073709551616
>>> ((2**64 - 1) - 1) <= ((2**128 - 1) - (2**64 - 1)) // (2**64 - 1)
True

⚙️ Commits

  • _Uint _Cx always satisfies _Cx <= ULLONG_MAX.
  • Use _Unsigned128, preserve multprec.cpp for bincompat.
    • We can drop _INLINE_VAR from _MP_len, giving it internal linkage within multprec.cpp.
    • We don't need extern "C++" on the declarations - that was for modules.
    • We need to add _CRTIMP2_PURE to the definitions. I thought we had harmonized all declarations and definitions, but we missed these.
  • Reduce multprec.cpp dependencies to slightly improve throughput.
    • We can hardcode 64 instead of including <limits> for numeric_limits<unsigned long long>::digits.
    • uint64_t and unsigned long long are synonyms. In fact, this file was already using unsigned long long for the definitions, even though the declarations originally said uint64_t.
    • We moved the declarations out of <random> so we don't need it anymore. All we need is <yvals.h>.

✅ Notes

  • I verified (with static_assert(false)) that this codepath is exercised by our test suites.
  • I verified (with dumpbin /exports) that this doesn't affect the dllexport surface.

We can drop `_INLINE_VAR` from `_MP_len`, giving it internal linkage within multprec.cpp.

We don't need `extern "C++"` on the declarations - that was for modules.

We need to add `_CRTIMP2_PURE` to the definitions. I thought we had harmonized all declarations and definitions, but we missed these.
We can hardcode `64` instead of including `<limits>` for `numeric_limits<unsigned long long>::digits`.

`uint64_t` and `unsigned long long` are synonyms. In fact, this file was already using `unsigned long long` for the definitions, even though the declarations originally said `uint64_t`.

We moved the declarations out of `<random>` so we don't need it anymore. All we need is `<yvals.h>`.
@StephanTLavavej StephanTLavavej added performance Must go faster and removed enhancement Something can be improved labels Apr 25, 2025
@StephanTLavavej StephanTLavavej moved this from Initial Review to Final Review in STL Code Reviews Apr 25, 2025
@StephanTLavavej
Copy link
Member Author

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

Copy link
Member

@zacklj89 zacklj89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appreciate the python in the description 💯

@StephanTLavavej StephanTLavavej moved this from Final Review to Merging in STL Code Reviews Apr 29, 2025
@StephanTLavavej StephanTLavavej merged commit 8e9f4ee into microsoft:main Apr 29, 2025
39 checks passed
@StephanTLavavej StephanTLavavej deleted the mp-arr-matey branch April 29, 2025 12:23
@github-project-automation github-project-automation bot moved this from Merging to Done in STL Code Reviews Apr 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Must go faster

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

3 participants