Type checking only typed input

Hi everyone,

In scikit-image, we have the problem that users sometimes confuse images and image masks: two common function inputs, both of which are NumPy arrays.

We would like to provide a typing mechanism that would help users identify these problems. However, we do not want to force them to adopt types.

To make this concrete, consider the following example:

import numpy as np
import numpy.typing as npt

from typing import overload, NewType


# Example annotation only — not correct for our purposes
Image = NewType("Image", np.ndarray)
Mask = NewType("Mask", np.ndarray)


@overload
def process(arr: Image) -> None: ...
@overload
def process(arr: np.ndarray) -> None: ...

def process(arr):
    print(arr[0, 0])


x = np.array([[1,2,3]])
process(x)  # should work

y = Image(np.array([[1,2,3]]))
process(y)  # should work

z = Mask(np.array([[1,2,3]]))
process(z)  # should fail, mask provided instead of image

Here, you can see that we want:

  • Users who do not annotate their code to continue working as normal, with no type-checking failures.
  • Users who do annotate their code to see when they mix up Images and Masks.

Thus far, we have not been able to find a solution to the problem. One suggestion was to use PEP 702 (@deprecated), and I’ve also been looking at PEP 746, although I do not think it covers our use case.

I would like to please ask for your “definitive” guidance on the topic. Thank you in advance!

I think NewType is only really useful if you want to disallow the supertype. If they don’t run a type checker they won’t see any errors though.

How bad would it be to just say “use Image if you want to run a type checker”?

Note that:

y = Image(np.array([[1,2,3]]))
t = y[:]
reveal_type(y) # Image
reveal_type(t) # ndarray[tuple[Any, ...], dtype[Any]]

We do not want valid code (using NumPy arrays as input, here) to fail type checking, so unfortunately that is not a solution to our problem.

I realise that, but the purpose of NewType is to make it a type-checker error to pass the “supertype”, even though doing so would be valid at runtime. If you don’t want that behaviour then you don’t really want NewType.
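
For concreteness, a minimal sketch of that behaviour, reusing the Image NewType from the example above (the function name show is just for illustration):

import numpy as np
from typing import NewType

Image = NewType("Image", np.ndarray)

def show(arr: Image) -> None:
    print(arr[0, 0])

show(Image(np.array([[1, 2, 3]])))  # OK: explicitly wrapped in the NewType
show(np.array([[1, 2, 3]]))         # type-checker error: ndarray is not Image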

Sorry, that annotation was just in aid of the example. Do you have any suggestions of how to achieve our intended outcome?

Would Image and Mask have different dtypes?

Something like:

Image = np.ndarray[tuple[int, int], np.dtype[np.uint8]]
Mask = np.ndarray[tuple[int, int], np.dtype[np.bool_]]

Do image and mask arrays have distinguishable dtypes?


edit: I see that @oscarbenjamin just beat me to it haha
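
As a rough sketch of what such dtype-based aliases would buy (exact behaviour depends on the NumPy stubs in use, and check is a made-up function name):

import numpy as np

Image = np.ndarray[tuple[int, int], np.dtype[np.uint8]]
Mask = np.ndarray[tuple[int, int], np.dtype[np.bool_]]

def check(arr: Image) -> None:
    print(arr[0, 0])

img: Image = np.zeros((2, 2), dtype=np.uint8)
mask: Mask = np.zeros((2, 2), dtype=np.bool_)

check(img)   # OK
check(mask)  # rejected: the dtype parameters do not match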

That’d be a cute hack, but we have a similar situation arising for Images and LabeledImages, which can both have integer dtype. So, we really do have to address the fundamental concern here.

(For context, since I see shape information in the above annotations, we do not know the dimensionality upfront.)

For your snippet, I think PEP 702 (@typing.deprecated) does work (mypy playground)?

...

@overload
@deprecated("Masks are not supported in `process`")
def process(arr: Mask, /) -> None: ...

...

x = NDArray([[1,2,3]])
process(x)  # OK

y = Image(NDArray([[1,2,3]]))
process(y)  # OK

z = Mask(NDArray([[1,2,3]]))
process(z)  # E: overload def (__main__.Mask) of function __main__.process is deprecated: Masks are not supported in `process`  [deprecated]

The only inconvenient thing here is forcing end users to use the Mask(...) call for an array they know to be a mask. This is less of a problem if scikit-image has public functions annotated to return instances of Mask and the majority of users of scikit-image do actually use these public functions.
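
For example, if a public function were annotated as returning Mask (threshold is a made-up name), users downstream would carry the NewType around without ever writing Mask(...) themselves:

import numpy as np
from typing import NewType

Mask = NewType("Mask", np.ndarray)

def threshold(arr: np.ndarray) -> Mask:
    # Hypothetical public function whose result is already typed as a Mask.
    return Mask(arr > 0)

mask = threshold(np.array([[1, 2, 3]]))
# Passing `mask` to the overloaded `process` above would now trip the
# @deprecated overload with no manual Mask(...) call.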

However, for this particular example, I think it’s easier to just make mask a keyword-only argument (if only for typing purposes) rather than forcing users to use extra calls like Image(…) and Mask(…). It should be fairly obvious what’s intended here, even if the arrays are not named very intuitively:

def process(arr: NDArray, /, *, mask: NDArray = ...) -> None: ...

...

process(arr1, arr2)  # Type-checker error, too many positional arguments
process(arr2, arr1)  # Type-checker error, too many positional arguments
process(arr1, mask=arr2)  # OK

I don’t think there is a solution to the fundamental concern. If the functions should accept ndarray then having a type checker reject subtypes of ndarray goes against the general grain of how type checkers work.

I think NewType basically does what you would want, but it only works if you enforce its use, and you don’t want to enforce that. Similar considerations apply to all other ways of doing this.

This is the same problem as having typed IDs and accepting bare ints at the same time.
One way you could sorta make it work is by having the 2nd overload be a “catch-all” that resolves to Never.

import numpy as np
from typing import Never, NewType, overload

Image = NewType("Image", np.ndarray)
Mask = NewType("Mask", np.ndarray)
type AnyArray = Image | Mask

@overload
def process(arr: Image) -> None: ...
@overload
def process(arr: AnyArray) -> Never: ...
@overload
def process(arr: np.ndarray) -> None: ...

def process(arr: np.ndarray) -> None:
    print(arr[0, 0])

x = np.array([[1,2,3]])
process(x)  # resolves to 3rd overload

y = Image(np.array([[1,2,3]]))
process(y)  # resolves to 1st overload

z = Mask(np.array([[1,2,3]]))
process(z)  # resolves to 2nd overload

The last call causes reachability analysis to report any subsequent code as unreachable. Not really an error, though.

The first two overloads overlap but have incompatible return types: AnyArray is broader than Image, so the second return type should also be broader than the first. But Never is not broader than None, making this an invalid definition.
It’s invalid because if the first overload matches, then so does the second. The input is therefore assignable to Image & AnyArray (an intersection, because the input is assignable to both). The call must then satisfy both (Image -> None) and (AnyArray -> Never), so its return type resolves to None & Never. But that is not an inhabitable type, which means the overloaded definition is invalid.
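
The same problem can be distilled to a minimal pair of overloads, independent of NumPy; mypy reports something along the lines of “Overloaded function signatures 1 and 2 overlap with incompatible return types”:

from typing import Never, overload

@overload
def f(x: int) -> None: ...
@overload
def f(x: int | str) -> Never: ...  # any int matches both overloads
def f(x):
    print(x)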

In general, the solution here would be to have non-transitive type compatibility. That is, you want ndarray to be compatible with both Mask and Image, but they shouldn’t be compatible with each other. Since transitivity is a pretty fundamental assumption behind the current type system, I don’t think there’s any combination of workarounds that actually gets you there.

Though I wonder: this feels like it would be a reasonably common occurrence in libraries that move from untyped to typed code. So it might be worth thinking about a solution for this. For example, you could have some kind of “version” or “strictness” decorator that lets type checkers select between different incompatible type annotations. Users can then tell the type checkers which version they would like, starting out with the more untyped version and moving to the stricter ones when they are ready.

E.g. something like this:

@version("strict")
def process(arr: Image) -> None: ...
@version("default")
def process(arr: np.ndarray) -> None: ...

And depending on user settings type checkers would only process one of the variants.
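
At runtime such a decorator could be a pure no-op marker; all the real work would happen in type checkers (and, much like @overload, redefining process at runtime would need extra handling). A purely hypothetical sketch, since no version decorator exists in typing today:

from typing import Callable, TypeVar

F = TypeVar("F", bound=Callable[..., object])

def version(name: str) -> Callable[[F], F]:
    # No runtime effect; only a marker a type checker could use to select
    # between alternative annotations for the same function.
    def decorator(func: F) -> F:
        return func
    return decorator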

This currently works with pyright. I’m aware of the newly proposed overload resolution mechanism, which will likely stop it from working.

Afaik the only other approach would have you create a union for each such function that needs overloading, combining all NewTypes except the one that “works”. That is cumbersome, as it results in combinatorial growth of the unions (N functions * M newtypes).

Ideally this would be spelled as AnyArray - Image (or AnyArray & ~Image? I think those are the same?)

One way you can make this work is by using TYPE_CHECKING to tell the type checker that Image and Mask are opaque unrelated types:

import numpy as np
import numpy.typing as npt

from typing import overload, NewType, TYPE_CHECKING

if TYPE_CHECKING:
    class Image:
        def __init__(self, arr: npt.NDArray[np.uint8]): ...
    class Mask:
        def __init__(self, arr: npt.NDArray[np.bool_]): ...
else:
    Image = lambda x: x
    Mask = lambda x: x

@overload
def process(arr: Image) -> None: ...
@overload
def process(arr: np.ndarray) -> None: ...

def process(arr):
    print(arr[0, 0])


x = np.array([[1,2,3]])
process(x)  # fine

y = Image(np.array([[1,2,3]]))
process(y)  # fine

z = Mask(np.array([[1,2,3]]))
process(z)  # error

I have used this technique in other situations where a type checker is otherwise unable to understand the types because they are conditional on optional dependencies. There are significant downsides: you would need to spell out every single ndarray method that a type checker should understand, and it is not possible to tell the type checker that those types can be passed to any function that is annotated as taking only ndarray. I expect that this solution would turn out not to be a good fit for scikit-image.
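
To make the first downside concrete: any user code that indexes or inspects an Image only type-checks if the corresponding ndarray operations are re-declared by hand on the stand-in class. A sketch of where that leads:

from typing import TYPE_CHECKING, Any

import numpy as np
import numpy.typing as npt

if TYPE_CHECKING:
    class Image:
        def __init__(self, arr: npt.NDArray[np.uint8]) -> None: ...
        # Every ndarray operation used through an Image must be spelled out:
        def __getitem__(self, key: Any) -> Any: ...   # so arr[0, 0] type-checks
        @property
        def shape(self) -> tuple[int, ...]: ...
        # ...and so on, for every attribute and method users rely on.
else:
    Image = lambda x: x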

This solution seems to work perfectly?
It’s also conceptually equivalent to the solution I came to. (Allow arrays, then disallow Mask.)

You need to include Mask(...) calls to get extra type-checker protection, but if you don’t include those Mask(...) calls you still have the baseline type-checker protection. That also seems to be what the OP describes as the desired state.

It does look unintuitive to me to use @deprecated like this. -> NotImplemented, -> Error, -> Never, or something in that direction would make more sense to me as a reader. But I tried variations of those in the mypy playground and couldn’t get anything to work.

@deprecated probably does do what you want here, in terms of producing type-checker diagnostics for the code you want flagged. It’s conceptually not a great fit, though, because nothing is actually deprecated here. This might confuse users, and means that type-checker configurations would work in unintuitive ways.

There’s a longstanding idea to add something like a @typing_error decorator that would work similar to @deprecated but not carry any implications about deprecation. I included that in an early version of PEP 702 but removed it for simplicity. However, new use cases keep coming up, so now may be a good time for a new PEP proposing this idea.
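
Usage of such a decorator might mirror the @deprecated example earlier in the thread. An entirely hypothetical sketch, with a runtime placeholder since @typing_error does not exist today:

import numpy as np
from typing import NewType, overload

def typing_error(message: str):
    # Placeholder for the hypothetical decorator; a real one would live in
    # typing and be understood by type checkers.
    def decorator(func):
        return func
    return decorator

Mask = NewType("Mask", np.ndarray)

@overload
@typing_error("Masks are not supported in `process`")
def process(arr: Mask, /) -> None: ...
@overload
def process(arr: np.ndarray, /) -> None: ...
def process(arr):
    print(arr[0, 0])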

The reason why @deprecated(...) is (ab)used for this use case, and why -> Never isn’t (although it semantically makes more sense than @deprecated(...) here), is that:

  • @deprecated(...) causes an error to be emitted at the call expression site;
  • -> Never causes an error at the statement after the call site (at least in mypy and pyright), and also requires special options to be turned on (--warn-unreachable and reportUnreachable, respectively).

Emitting an error at the call site is more useful in this case, and IMO in most (if not all) other cases. I don’t recall reading anything about why -> Never causes a diagnostic at the following statement rather than at the call site; I would otherwise opt for -> Never over @deprecated(...) here.

Never doesn’t emit errors anywhere. It’s not an error to call a function that never returns, but following statements are logically never going to be reached, thus type checkers emit reachability diagnostics.
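
A small sketch of where the diagnostics land (reject_mask is a made-up name; mypy additionally needs --warn-unreachable for the second comment to apply):

import numpy as np
from typing import Never, NewType

Mask = NewType("Mask", np.ndarray)

def reject_mask(arr: Mask) -> Never:
    raise TypeError("masks are not supported")

def caller(m: Mask) -> None:
    reject_mask(m)  # no diagnostic on the call itself
    print("done")   # flagged as unreachable, not as an error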

I’m not sure what criterion of “error” you mean here, unless you mean that at runtime such a function is successfully entered (regardless of whether it raises). If that is the criterion, type checkers shouldn’t emit diagnostics at the call site for things like mismatched argument types compared with the function’s parameter annotations either, because such functions are also successfully entered.

The only time calling a function actually results in an error at runtime without entering the function body is when the number of arguments is mismatched, which raises TypeError before the body runs.