GH-111429: Speed up pathlib.PurePath.[is_]relative_to() #111431

Merged
barneygale merged 4 commits into python:main on Nov 12, 2023

Conversation

barneygale
Contributor

@barneygale barneygale commented Oct 28, 2023

Avoid unnecessary calls to with_segments(). This makes both is_relative_to() and relative_to() faster when passed a PurePath object, and makes relative_to() faster when passed another kind of path-like object (like a str).

Also, use _from_parsed_parts() in relative_to() to return a pre-parsed path. Operations like str(p.relative_to(q)) are faster as a result.
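For readers unfamiliar with the hook mentioned above, here is a minimal sketch of the call patterns this change affects. `CountingPath` is a hypothetical subclass written only for illustration; it is not part of pathlib or of this PR. It assumes Python 3.12 or later, where `with_segments()` is the hook pathlib uses to create derived paths (on older versions the override is simply never called). The number of calls it reports depends on the interpreter version, so no specific count is claimed.

```python
from pathlib import PurePosixPath


class CountingPath(PurePosixPath):
    """Hypothetical subclass that counts with_segments() calls."""

    calls = 0

    def with_segments(self, *pathsegments):
        # On Python 3.12+, pathlib calls this hook to derive new path objects.
        type(self).calls += 1
        return type(self)(*pathsegments)


p0 = CountingPath("foo/bar")
p1 = CountingPath("foo")

# PurePath argument: the case the description says is faster for both methods.
is_rel = p0.is_relative_to(p1)
rel_from_path = p0.relative_to(p1)

# str argument: still has to be converted to a path object first.
rel_from_str = p0.relative_to("foo")

print(is_rel, str(rel_from_path), str(rel_from_str))
print("with_segments() calls:", CountingPath.calls)  # version-dependent
```

Running the same script on an interpreter built before and after this change is one way to see whether fewer `with_segments()` calls are made for the same operations.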

@Jason-Y-Z
Contributor

Thanks for the change! Overall LGTM. One small suggestion: would you mind doing a quick profile of these two functions before and after the change, so that we can better understand its effect?
Something like timeit might be helpful for that.

@barneygale
Contributor Author

barneygale commented Oct 29, 2023

The improvement depends on the type of the argument, the number of segments in each path, and, in the case of relative_to(), how the result is used. So take these numbers with a pinch of salt; the important bit is that some things are faster and nothing is slower:

$ ./python -m timeit \
    -s 'from pathlib import Path; p0 = Path("foo/bar"); p1 = Path("foo")' \
    'str(p0.relative_to(p1))'
10000 loops, best of 5: 20.3 usec per loop  # before
50000 loops, best of 5: 9.13 usec per loop  # after

$ ./python -m timeit \
    -s 'from pathlib import Path; p0 = Path("foo/bar"); p1 = "foo"' \
    'str(p0.relative_to(p1))'
10000 loops, best of 5: 20.4 usec per loop  # before
20000 loops, best of 5: 14.5 usec per loop  # after

$ ./python -m timeit \
    -s 'from pathlib import Path; p0 = Path("foo/bar"); p1 = Path("foo")' \
    'p0.is_relative_to(p1)'
50000 loops, best of 5: 9.01 usec per loop  # before
50000 loops, best of 5: 4.15 usec per loop  # after

$ ./python -m timeit \
    -s 'from pathlib import Path; p0 = Path("foo/bar"); p1 = "foo"' \
    'p0.is_relative_to(p1)'
50000 loops, best of 5: 9.04 usec per loop  # before
50000 loops, best of 5: 8.91 usec per loop  # after
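
The four measurements above can also be gathered from a single script. This is only a sketch, with some assumptions that differ from the commands above: it uses `PurePosixPath` rather than `Path` so it behaves the same on any platform, it wraps each statement in a lambda (which adds a small constant call overhead), and the loop count of 2,000 is arbitrary. The absolute numbers will not match the ones quoted here; only a before/after comparison on the same machine is meaningful.

```python
import timeit
from pathlib import PurePosixPath

p0 = PurePosixPath("foo/bar")
path_arg = PurePosixPath("foo")  # PurePath argument
str_arg = "foo"                  # plain str argument

# Same four statements as the command-line runs above.
statements = {
    "str(relative_to(PurePath))": lambda: str(p0.relative_to(path_arg)),
    "str(relative_to(str))": lambda: str(p0.relative_to(str_arg)),
    "is_relative_to(PurePath)": lambda: p0.is_relative_to(path_arg),
    "is_relative_to(str)": lambda: p0.is_relative_to(str_arg),
}

LOOPS = 2_000  # arbitrary; raise it for steadier numbers
results = {}
for label, stmt in statements.items():
    # Take the best of 5 repeats, mirroring "best of 5" in the timeit CLI.
    best = min(timeit.repeat(stmt, number=LOOPS, repeat=5))
    results[label] = best / LOOPS * 1e6  # microseconds per call
    print(f"{label}: {results[label]:.2f} usec per loop")
```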

@Jason-Y-Z
Contributor

@pitrou Sorry for tagging, but based on contributor history, would you mind giving a quick review?

@barneygale barneygale merged commit d7cef7b into python:main Nov 12, 2023
@barneygale
Contributor Author

Thanks for reviewing @Jason-Y-Z!

Labels
performance Performance or resource usage topic-pathlib