Speed up pathlib.Path.iterdir() by using os.scandir() #117727

Closed
barneygale opened this issue Apr 10, 2024 · 0 comments
Labels
performance Performance or resource usage topic-pathlib

Comments

@barneygale
Contributor

barneygale commented Apr 10, 2024

We should be able to call os.scandir() from pathlib.Path.iterdir() and construct results based on the os.DirEntry.path string.

Currently we call os.listdir() and _make_child_relpath(), which returns a fully parsed/normalized path object; in particular, it sets _str, _drv, _root and _tail_cached.
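For illustration, the listdir-based approach can be approximated with public APIs only. This is a rough sketch, not pathlib's actual code: `iterdir_via_listdir` is a hypothetical name, and `directory / name` stands in for the private `_make_child_relpath()` helper, which is not reproduced here.

```python
import os
import tempfile
from pathlib import Path


def iterdir_via_listdir(directory):
    """Yield child paths by joining each name from os.listdir() onto the parent.

    Hypothetical helper for illustration; not pathlib's implementation.
    """
    directory = Path(directory)
    for name in os.listdir(directory):
        # Public-API stand-in for the private helper: the join produces a
        # new path object from the parent and the bare entry name.
        yield directory / name


with tempfile.TemporaryDirectory() as tmp:
    for filename in ("a.txt", "b.txt"):
        Path(tmp, filename).touch()
    # os.listdir() returns names in arbitrary order, so sort for a stable result.
    listdir_names = sorted(p.name for p in iterdir_via_listdir(tmp))
```

Note that `os.listdir()` only yields bare names, so each child path has to be assembled from the parent on the Python side.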

It's probably not worth the expense of setting _drv, _root and _tail_cached - they're only useful when paths are subsequently deconstructed with PurePath methods, which isn't particularly common. It is worth setting _str, and happily os.DirEntry.path provides a string that's very nearly normalized to pathlib's standards.
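A minimal sketch of the idea, using only public APIs: iterate with `os.scandir()` and build each result from the `os.DirEntry.path` string. The name `iterdir_via_scandir` is hypothetical, and the private attributes mentioned above (`_str` and friends) are deliberately not touched here, so this is an approximation of the proposal rather than the eventual change. The second half shows one case where `os.DirEntry.path` is not quite normalized to pathlib's standards: scanning `"."` yields strings with a leading `./` (or `.\` on Windows), which pathlib drops.

```python
import os
import tempfile
from pathlib import Path


def iterdir_via_scandir(directory):
    """Yield child paths built from os.DirEntry.path strings.

    Hypothetical helper for illustration; not pathlib's implementation.
    """
    with os.scandir(directory) as entries:
        # Collect the strings first so the scandir handle is closed promptly,
        # even if the caller never exhausts this generator.
        path_strings = [entry.path for entry in entries]
    for path_string in path_strings:
        yield Path(path_string)


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").touch()
    scandir_names = [p.name for p in iterdir_via_scandir(tmp)]

    # Scan the current directory to see how DirEntry.path differs from
    # pathlib's own string form.
    old_cwd = os.getcwd()
    os.chdir(tmp)
    try:
        with os.scandir(".") as entries:
            raw = [entry.path for entry in entries]
        normalized = [str(Path(p)) for p in raw]
    finally:
        # Leave the temporary directory before it is cleaned up.
        os.chdir(old_cwd)
```

Here `raw` holds the strings exactly as `os.scandir()` reports them, while `normalized` holds what pathlib produces after parsing the same strings, so any fast path that reuses `os.DirEntry.path` directly would need to account for the `"."` case.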

Also discussed here: https://discuss.python.org/t/is-there-a-pathlib-equivalent-of-os-scandir/46626/21

Linked PRs

@barneygale barneygale added performance Performance or resource usage topic-pathlib labels Apr 10, 2024
barneygale added a commit to barneygale/cpython that referenced this issue Apr 10, 2024
…ir()`

Replace use of `os.listdir()` with `os.scandir()`. Forgo setting `_drv`,
`_root` and `_tail_cached`, as these usually aren't needed. Use
`os.DirEntry.path` to set `_str`.
barneygale added a commit that referenced this issue Apr 12, 2024
…117728)

diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
…ir()` (python#117728)
