Should UPath.__new__ return pathlib.Path instances for local paths #90

Closed
@ap--

Summarizing from #84

When creating a UPath instance, universal_pathlib returns a pathlib.Path instance for local filesystems. This can cause problems with type checkers, because the statically inferred type (UPath) does not match the type of the object at runtime:

from upath import UPath     # UPath is a subclass of pathlib.Path

pth = UPath("/local/path")  # returns e.g. a pathlib.PosixPath instance on non-Windows
reveal_type(pth)            # Revealed type is "upath.core.UPath"
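
The mismatch can be reproduced without installing universal_pathlib. The sketch below uses a hypothetical stand-in class, StandInUPath (not part of any library), whose __new__ hands back a plain pathlib.Path, roughly what the snippet above describes for local paths. It only illustrates the runtime side; what a given type checker reveals for the call depends on the checker.

```python
import pathlib


class StandInUPath(pathlib.Path):
    """Hypothetical stand-in for UPath; not the real implementation."""

    def __new__(cls, *args):
        # Assumption: every input is treated as local, so a plain
        # pathlib.Path (PosixPath or WindowsPath) is returned instead of cls.
        return pathlib.Path(*args)


pth = StandInUPath("/local/path")

# The object is not an instance of the class that was called,
# although it is still a pathlib.Path.
print(isinstance(pth, StandInUPath))
print(isinstance(pth, pathlib.Path))
```

Because __new__ returns an object that is not an instance of cls, Python also skips __init__ for it, so the result behaves exactly like a path created with pathlib.Path directly.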

Possible solutions:

(1) user explicitly type annotates

No changes. We could just document it very explicitly.

pth: pathlib.Path = UPath(...)

(2) always return a UPath instance

I think the reason for returning a pathlib.Path instance for local filesystems is to guarantee that there are no changes in behavior when using UPath as a replacement.

(2A) create an intermediate subclass

We could still guarantee pathlib behavior by making UPath an intermediate class and moving the fsspec-backed behavior into a subclass:

class UPath(pathlib.Path):
    def __new__(cls, *args, **kwargs):
        if ...:                         # is local path
            return super().__new__(cls, *args, **kwargs)  # cls(...) here would re-enter __new__
        else:                           # promote to a fsspec backed class
            return FSSpecUPath(*args, **kwargs)
    
    # > we could define additional properties here too `.fs`, etc...

class FSSpecUPath(UPath):
    ...  # provides all the methods current UPath provides
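
A runnable sketch of the dispatch in (2A) might look like the following. Everything here is an assumption made for illustration: the class names SketchUPath and SketchFSSpecUPath, the helper is_local, and the rule that a "://" in the first argument means "not local". It subclasses pathlib.PurePosixPath rather than pathlib.Path so that it runs the same way on every platform and Python version and never touches the filesystem.

```python
import pathlib


def is_local(arg: str) -> bool:
    """Hypothetical rule: anything without a URL scheme counts as local."""
    return "://" not in arg


class SketchUPath(pathlib.PurePosixPath):
    """Intermediate class: plain pathlib behavior for local paths."""

    def __new__(cls, *args):
        target = cls
        # Only promote when the base class itself is called, so that
        # calling the subclass directly does not dispatch again.
        if cls is SketchUPath and args and not is_local(str(args[0])):
            target = SketchFSSpecUPath
        return super().__new__(target, *args)


class SketchFSSpecUPath(SketchUPath):
    """Would hold the fsspec-backed methods; left empty in this sketch."""


local = SketchUPath("/local/path")
remote = SketchUPath("s3://bucket/key")

print(type(local).__name__)
print(type(remote).__name__)
```

With this layout every result is an instance of the class that was called (or of a subclass of it), which is what lets a type checker and the runtime agree.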

(2B) we always return the fsspec LocalFileSystem backed UPath instance for local paths

This would be a simple change, but I think we should then port the CPython pathlib test suite to universal_pathlib, to guarantee as well as we can that the behavior is identical. I'm worried that symlink support, Windows UNC paths, and other unusual edge cases will cause quite a few issues.

(3) apply some magic

We could provide UPath as an alias for the UPath class but cast it to type[pathlib.Path] to trick the static type checkers. We could also fall back to some metaclass magic to make runtime type checking work, but this would basically (partially?) revert the cleanup done here: #56
