Conversation

@dcreager dcreager commented Dec 1, 2025

This fixes a source of non-determinism that we were seeing in the constraint set tests in #21715.

In this test, we create the following constraint set, and then try to create a specialization from it:

(T@constrained_by_gradual_list = list[Base])
  ∨
(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])

That is, T is either specifically list[Base], or it's any list. Our current heuristics say that, absent other restrictions, we should specialize T to the more specific type (list[Base]).

In the correct test output, we end up creating a BDD that looks like this:

(T@constrained_by_gradual_list = list[Base])
┡━₁ always
└─₀ (Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
    ┡━₁ always
    └─₀ never

In the incorrect output, the BDD looks like this:

(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ never

The difference is the ordering of the two individual constraints. Both constraints appear in the first BDD, but the second BDD contains only the "T is any list" constraint. If we were to force the second BDD to contain both constraints, it would look like this:

(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ (T@constrained_by_gradual_list = list[Base])
    ┡━₁ always
    └─₀ never

This is the standard shape for an OR of two constraints. However! Those two constraints are not independent of each other! If T is specifically list[Base], then it's definitely also "any list". From that, we can infer the contrapositive: that if T is not any list, then it cannot be list[Base] specifically. When we encounter impossible situations like that, we prune that path in the BDD, and treat it as false. That rewrites the second BDD to the following:

(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ (T@constrained_by_gradual_list = list[Base])
    ┡━₁ never   <-- IMPOSSIBLE, rewritten to never
    └─₀ never

We would then see that this BDD node is redundant, since both of its outgoing edges point at the never node. Our BDDs are reduced, which means we have to remove that redundant node, resulting in the BDD we saw above:

(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ never       <-- redundant node removed

The end result is that we were "forgetting" about the T = list[Base] constraint, but only for some BDD variable orderings.

To fix this, I'm leaning in to the fact that our BDDs really do need to "remember" all of the constraints that they were created with. Some combinations might not be possible, but we now have the sequent map, which is quite good at detecting and pruning those.

So now our BDDs are quasi-reduced, which just means that redundant nodes are allowed. (At first I was worried that allowing redundant nodes would be an unsound "fix the glitch". But it turns out they're real! This is the paper that introduces them, though it's very difficult to read. Knuth mentions them in §7.1.4 of TAOCP, and this paper has a nice short summary of them in §2.)
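Concretely, the change comes down to what node construction does with a node whose two outgoing edges agree. Here is a hypothetical Python sketch of the two constructor variants (ty's real implementation is in Rust; the names and tuple representation here are invented for illustration):

```python
# Terminals are the strings "always" / "never"; an interior node is a
# (constraint, if_true, if_false) tuple. Structurally identical nodes are
# stored once (modeling salsa interning with a plain dict).
_interned = {}

def make_node_reduced(constraint, if_true, if_false):
    if if_true == if_false:
        # Fully reduced: a node whose edges agree is redundant, so drop it.
        # This is the rule that "forgot" a constraint once pruning made
        # both of its branches collapse to the same node.
        return if_true
    key = (constraint, if_true, if_false)
    return _interned.setdefault(key, key)

def make_node_quasi_reduced(constraint, if_true, if_false):
    # Quasi-reduced: keep redundant nodes so the BDD still "remembers"
    # every constraint it was built from; identical nodes are still
    # deduplicated via interning.
    key = (constraint, if_true, if_false)
    return _interned.setdefault(key, key)

# After sequent-map pruning rewrote the true edge to "never", the reduced
# constructor erases the constraint entirely:
node = make_node_reduced("T = list[Base]", "never", "never")
# node == "never": the constraint is gone

kept = make_node_quasi_reduced("T = list[Base]", "never", "never")
# kept still records the constraint: ("T = list[Base]", "never", "never")
```

The interning dict is what keeps this from blowing up memory: allowing redundant nodes adds entries, but never duplicate copies of structurally identical subtrees.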

While we're here, I've added a bunch of debug- and trace-level log messages to the constraint set implementation. I was getting tired of having to add these by hand over and over. To enable them, just set TY_LOG in your environment, e.g.

env TY_LOG=ty_python_semantic::types::constraints::SequentMap=trace ty check ...

[Note: this has an internal label because we are still not using specialize_constrained in anything user-facing yet.]

@dcreager dcreager added the internal (An internal refactor or improvement) and ty (Multi-file analysis & type inference) labels Dec 1, 2025
astral-sh-bot bot commented Dec 1, 2025

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

astral-sh-bot bot commented Dec 1, 2025

mypy_primer results

Changes were detected when running on open source projects
beartype (https://github.com/beartype/beartype)
- beartype/claw/_package/clawpkgtrie.py:66:29: warning[unsupported-base] Unsupported class base with type `<class 'dict[str, PackagesTrieBlacklist]'> | <class 'dict[str, Divergent]'>`
- Found 494 diagnostics
+ Found 493 diagnostics

scikit-build-core (https://github.com/scikit-build/scikit-build-core)
- src/scikit_build_core/_logging.py:153:13: warning[unsupported-base] Unsupported class base with type `<class 'Mapping[str, Style]'> | <class 'Mapping[str, Divergent]'>`
- src/scikit_build_core/build/_pathutil.py:25:38: error[invalid-argument-type] Argument to function `__new__` is incorrect: Expected `str | PathLike[str]`, found `DirEntry[Path]`
- src/scikit_build_core/build/_pathutil.py:27:24: error[invalid-argument-type] Argument to function `__new__` is incorrect: Expected `str | PathLike[str]`, found `DirEntry[Path]`
- src/scikit_build_core/build/wheel.py:98:20: error[no-matching-overload] No overload of bound method `__init__` matches arguments
- Found 45 diagnostics
+ Found 41 diagnostics

rotki (https://github.com/rotki/rotki)
- rotkehlchen/accounting/structures/processed_event.py:85:75: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- rotkehlchen/api/rest.py:1041:73: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- Found 2052 diagnostics
+ Found 2050 diagnostics

No memory usage changes detected ✅

Base automatically changed from dcreager/fire-display to main December 2, 2025 08:17
@dcreager dcreager force-pushed the dcreager/nondeterminism branch from 017fe62 to 9e5ea4b December 2, 2025 13:52
reveal_type(generic_context(unbounded).specialize_constrained(ConstraintSet.range(bool, T, bool) & ConstraintSet.range(Never, T, str)))

- # revealed: ty_extensions.Specialization[T@unbounded = int]
+ # revealed: ty_extensions.Specialization[T@unbounded = bool]
Member Author

I think this test (and the other changed one below) were other possible points of nondeterminism, since the result could depend on whether T ≤ bool was kept in the BDD or simplified away as described in the PR body.

Contributor

I'm not sure I understand why we solve this to bool. The constraints are T <= int | T <= bool, which seems equivalent to T <= int. Why do we prefer solving to bool when there's the additional (seemingly redundant) constraint present?

Member Author

This is because of our current heuristic for handling multiple paths in a BDD. T ≤ int ∨ T ≤ bool ends up looking like this now:

(T ≤ int)
┡━₁ (T ≤ bool)
│   ┡━₁ true
│   └─₀ true
└─₀ (T ≤ bool)
    ┡━₁ impossible
    └─₀ false

There are two paths to the true terminal, one representing T ≤ bool and one representing bool < T ≤ int. T = bool and T = int are the largest types that satisfy each respective path. Our current heuristic says that if there's a type that satisfies all paths, we choose that. (That is, if the intersection of the specializations is non-empty, use it.)

I'm very much open to changing that heuristic, but think that should be follow-on work. I'll mark this as a TODO that we'd rather produce T = int here.
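To make the heuristic concrete, here is a toy model (not ty's actual solver; the two-type subtype lattice and the function names are invented): each BDD path to the true terminal yields a largest candidate type, and we prefer a type that satisfies every path when one exists.

```python
# Toy subtype lattice: bool <= int (plus reflexivity).
SUBTYPES = {("bool", "int"), ("bool", "bool"), ("int", "int")}

def is_subtype(a, b):
    return (a, b) in SUBTYPES

def choose_specialization(path_candidates):
    # path_candidates: the largest type satisfying each true-path of the
    # BDD, e.g. ["bool", "int"] for the paths (T <= bool) and
    # (bool < T <= int). If one candidate is a subtype of all the others,
    # it satisfies every path (the "intersection" is non-empty), so use it.
    for c in path_candidates:
        if all(is_subtype(c, other) for other in path_candidates):
            return c
    return None  # no common type; some other heuristic would apply

# The two paths in the BDD above: bool satisfies both, so we pick bool
# rather than the larger int.
print(choose_specialization(["bool", "int"]))  # bool
```

This is why the seemingly redundant `T ≤ bool` constraint changes the answer: it adds a path whose candidate narrows the common choice.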

Comment on lines +254 to +256
# Same tests as above, but with the typevar constraints in a different order, to make sure the
# results do not depend on our BDD variable ordering.
def constrained_by_gradual_list_reverse[T: (list[Any], list[Base])]():
Member Author

This test failed before these fixes, reproducing the test failure in the macOS CI job.

Comment on lines -783 to -785
if if_true == if_false {
return if_true;
}
Member Author

This is the clause that used to remove redundant BDD nodes. Removing it changes our BDDs from reduced to quasi-reduced.

Comment on lines +950 to +956
(Node::AlwaysTrue, Node::Interior(interior))
| (Node::Interior(interior), Node::AlwaysTrue) => Node::new(
db,
interior.constraint(db),
Node::AlwaysTrue,
Node::AlwaysTrue,
),
Member Author

This method skips calling Node::new for some inputs, so we have to change the base cases to not throw away redundant nodes. (ditto below)

Comment on lines -3018 to +3172
- return String::from("always");
+ return String::from("never");
Member Author

This was a bug in our display rendering that we couldn't trigger before, since never BDDs were always collapsed into the AlwaysNever terminal. Now we might have BDDs where all paths lead to the never terminal, which would hit this clause. (The intuition here is that we're displaying a DNF formula — a union of intersections — and so if the union contains no elements, the overall formula is false/never. And the new snippet just above handles where there's a single-element union of an empty intersection, which represents true/always.)

│ └─₀ (U = bool)
│ ┡━₁ always
│ └─₀ never
┡━₁ (T = bool)
Member Author

This looks like a big blow-up in the size of the BDD, but we still have structural sharing from the salsa interning. So for example, all of the copies of

(U = bool)
┡━₁ always
└─₀ never

are stored once as a single InternalNode. So the rendered size here is not a good proxy for how much memory is used internally.

Contributor

Primer doesn't report any noticeable memory increases, that's good enough for me!

codspeed-hq bot commented Dec 2, 2025

CodSpeed Performance Report

Merging #21744 will not alter performance

Comparing dcreager/nondeterminism (948a9fb) with main (cd079bd)

Summary

✅ 22 untouched
⏩ 30 skipped¹

Footnotes

  1. 30 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived to remove them from the performance reports.

Contributor

@carljm carljm left a comment

It seems a bit sad that we have to do this. I guess ultimately the reason for it boils down to the fact that constraints can involve gradual types, and we don't staticify those via materialization, and this can lead to these path-dependent redundancy determinations. And we can't reduce constraints to fully static types without breaking the gradual guarantee / causing false positives.

But I guess the positive is that this fix isn't complex at all, fixes the issue, and doesn't seem to cause memory issues!

Thank you.


* origin/main:
  [ty] Improve `@override`, `@final` and Liskov checks in cases where there are multiple reachable definitions (#21767)
  [ty] Extend `invalid-explicit-override` to also cover properties decorated with `@override` that do not override anything (#21756)
  [ty] Enable LRU collection for parsed module (#21749)
  [ty] Support typevar-specialized dynamic types in generic type aliases (#21730)
  Add token based `parenthesized_ranges` implementation (#21738)
  [ty] Default-specialization of generic type aliases (#21765)
  [ty] Suppress false positives when `dataclasses.dataclass(...)(cls)` is called imperatively (#21729)
  [syntax-error] Default type parameter followed by non-default type parameter (#21657)
  new module for parsing ranged suppressions (#21441)
  [ty] `type[T]` is assignable to an inferable typevar (#21766)
  Fix syntax error false positives for `await` outside functions (#21763)
  [ty] Improve diagnostics for unsupported comparison operations (#21737)
  Move `Token`, `TokenKind` and `Tokens` to `ruff-python-ast` (#21760)
  [ty] Don't confuse multiple occurrences of `typing.Self` when binding bound methods (#21754)
  Use our org-wide Renovate preset (#21759)
  Delete `my-script.py` (#21751)
  [ty] Move `all_members`, and related types/routines, out of `ide_support.rs` (#21695)
@dcreager dcreager merged commit 45842cc into main Dec 3, 2025
41 checks passed
@dcreager dcreager deleted the dcreager/nondeterminism branch December 3, 2025 15:19
dcreager added a commit that referenced this pull request Dec 3, 2025
* origin/main:
  [ty] Reachability constraints: minor documentation fixes (#21774)
  [ty] Fix non-determinism in `ConstraintSet.specialize_constrained` (#21744)
  [ty] Improve `@override`, `@final` and Liskov checks in cases where there are multiple reachable definitions (#21767)
  [ty] Extend `invalid-explicit-override` to also cover properties decorated with `@override` that do not override anything (#21756)
  [ty] Enable LRU collection for parsed module (#21749)
  [ty] Support typevar-specialized dynamic types in generic type aliases (#21730)
  Add token based `parenthesized_ranges` implementation (#21738)
  [ty] Default-specialization of generic type aliases (#21765)
  [ty] Suppress false positives when `dataclasses.dataclass(...)(cls)` is called imperatively (#21729)
  [syntax-error] Default type parameter followed by non-default type parameter (#21657)
dcreager added a commit that referenced this pull request Dec 9, 2025
#21744 fixed some non-determinism in our constraint set implementation
by switching our BDD representation from being "fully reduced" to being
"quasi-reduced". We still deduplicate identical nodes (via salsa
interning), but we removed the logic to prune redundant nodes (those with
identical outgoing true and false edges). This ensures that the BDD
"remembers" all of the individual constraints that it was created with.

However, that comes at the cost of creating larger BDDs, and on #21551
that was causing performance issues. `scikit-learn` was producing a
function signature with dozens of overloads, and we were trying to
create a constraint set that would map a return type typevar to any of
those overloads' return types. This created a combinatorial explosion in
the BDD, with by far most of the BDD paths leading to the `never`
terminal.

This change updates the quasi-reduction logic to prune nodes that are
redundant _because both edges lead to the `never` terminal_. In this
case, we don't need to "remember" that constraint, since no assignment
to it can lead to a valid specialization. So we keep the "memory" of our
quasi-reduced structure, while still pruning large unneeded portions of
the BDD structure.
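A hypothetical sketch of the refined rule (invented names and representation; the real logic lives in ty's Rust node constructor): redundant nodes are kept in general, but a node whose edges both reach the never terminal is collapsed.

```python
_interned = {}

def make_node(constraint, if_true, if_false):
    if if_true == "never" and if_false == "never":
        # Both outcomes are impossible. Remembering this constraint can
        # never help produce a valid specialization, so prune the node.
        return "never"
    # Otherwise keep the node even when if_true == if_false, so the BDD
    # still remembers the constraint (quasi-reduced).
    key = (constraint, if_true, if_false)
    return _interned.setdefault(key, key)

assert make_node("T = A", "never", "never") == "never"           # pruned
assert make_node("T = A", "always", "always") == \
    ("T = A", "always", "always")                                # redundant but kept
```

Under this rule, the many dead overload paths collapse away while the constraints on live paths are preserved.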

Pulling this out of #21551 for
separate review.