Skip to content

gh-70870: Clarify dual usage of 'free variable' #122545

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Oct 8, 2024

Conversation

ncoghlan
Copy link
Contributor

@ncoghlan ncoghlan commented Aug 1, 2024

  • Define closure variable and free variable in glossary
  • Use free (closure) variable to link to the closure variable term when describing
    API elements that use "free" in the narrower "closure variable" sense rather than
    the broader formally defined sense of any reference to names other than those of
    local variables
  • Switch entirely to the closure variable term when describing API elements that
    don't otherwise make any reference to the term "free"

Closes #70870
Closes #98117


📚 Documentation preview 📚: https://cpython-previews--122545.org.readthedocs.build/

@ncoghlan ncoghlan added docs Documentation in the Doc dir triaged The issue has been accepted as valid by a triager. 3.13 bugs and security fixes 3.14 bugs and security fixes needs backport to 3.13 bugs and security fixes labels Aug 1, 2024
@encukou
Copy link
Member

encukou commented Aug 2, 2024

Now, whenever you read free variable, you're left wondering about which of the two meanings is meant. Can we avoid that?

Would it make sense to accept the “pragmatic” meaning of free variable?

[edit: Oh, I understood this wrong: the distinction is in globals/builtins, not in variables used in an inner scope, right?]
..., and in the few cases that exclude variables only accessed in inner nested scopes, say that explicitly?

After all, “used” in executionmodel.html's definition is itself not too specific. Does a nonlocal without “actual” use count? Does a def statement that binds a cell from an outer scope count?

@ncoghlan
Copy link
Contributor Author

ncoghlan commented Aug 3, 2024

Now, whenever you read free variable, you're left wondering about which of the two meanings is meant. Can we avoid that?

Would it make sense to accept the “pragmatic” meaning of free variable?

We can't, as the formal meaning isn't specific to Python.

The heart of the problem is that the use of the closure/free name pairing when cell variables were introduced was formally incorrect, since the named free variables were potentially only a subset of all free variables if a nested function also referenced global or builtin names. However, the docs haven't historically made that clear, they just called them free variables, so folks inferring the meaning of the phrase from the way those parts of the CPython docs used it would understandably conclude that it referred specifically to references to closure variables (e.g. I had to go read the code to determine whether the symtable docs were using "free" in its formal sense or its pragmatic sense, and it turned out to be the latter).

That naming decision was made decades ago though, so the best we can reasonably do is:

  1. acknowledge that the phrase "free variable" is now unavoidably ambiguous at least as far as CPython is concerned, and to some degree for Python in general
  2. indicate that some contexts (such as the formal language spec) always use one of the two meanings so it doesn't need to be spelled out on each use (this mostly applies to cases where "free variable" is being used correctly in the formal sense, since I didn't want to modify the related wording in the execution model docs as part of this PR)
  3. where practical, update the wording of affected documentation to remove the ambiguity (this mostly involved either adding a "closure" qualifier or explicitly mentioning that globals and builtins were excluded when the term is being used in its pragmatic sense)

And yeah, references to variables in outer scopes are included in both the formal and pragmatic uses of the phrase. The difference is just that the pragmatic use omits any references to global and builtin names, since the compiler doesn't actually keep track of those outside the individual lookup opcodes, so the only way to get that list is to scan the full AST or opcode array for the code block.

(While the ambiguity is unfortunate, the situation isn't bad enough to justify making implementation changes solely due to this)

@encukou
Copy link
Member

encukou commented Aug 3, 2024

One more idea that might be off the mark: how about using nonlocal rather than closure variable?
The keyword already can't be used for global/built-in variables ( nonlocal scopes are the local scopes of the enclosing functions, so it's not a synonym for “non-local”). Some variables are nonlocal implicitly.

@ncoghlan
Copy link
Contributor Author

ncoghlan commented Aug 8, 2024

We use nonlocal in the symtable API to specifically mean "marked with the nonlocal keyword", rather than as a general term for all closure variables.

The function object attribute is also called __closure__ rather than __nonlocals__, as is the parameter to exec and eval, so there's no getting away from closure variables being a term worth defining.

@ncoghlan ncoghlan marked this pull request as draft September 13, 2024 09:04
@ncoghlan
Copy link
Contributor Author

(Converted to draft, since I don't know when I'll get back to this and incorporate Petr's feedback)

@JelleZijlstra
Copy link
Member

I read the PR and all makes sense to me. Happy to take another look after you incorporate Petr's feedback.

@ncoghlan ncoghlan marked this pull request as ready for review October 7, 2024 13:08
@ncoghlan ncoghlan requested a review from willingc as a code owner October 7, 2024 13:08
Copy link
Contributor

@willingc willingc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs look good to me @ncoghlan. Thanks for the many clarifications.

@ncoghlan
Copy link
Contributor Author

ncoghlan commented Oct 8, 2024

Thanks for the reviews, folks!

@ncoghlan ncoghlan enabled auto-merge (squash) October 8, 2024 07:34
@ncoghlan ncoghlan merged commit 2739099 into python:main Oct 8, 2024
23 checks passed
@miss-islington-app
Copy link

Thanks @ncoghlan for the PR 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 8, 2024
The term "free variable" has unfortunately become genuinely
ambiguous over the years (presumably due to the names of
some relevant code object instance attributes).

While we can't eliminate that ambiguity at this late date, we can
at least alert people to the potential ambiguity by describing
both the formal meaning of the term and the common
alternative use as a direct synonym for "closure variable".

---------

(cherry picked from commit 2739099)

Co-authored-by: Alyssa Coghlan <[email protected]>
Co-authored-by: Carol Willing <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Oct 8, 2024

GH-125088 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Oct 8, 2024
ncoghlan added a commit that referenced this pull request Oct 8, 2024
…125088)

The term "free variable" has unfortunately become genuinely
ambiguous over the years (presumably due to the names of
some relevant code object instance attributes).

While we can't eliminate that ambiguity at this late date, we can
at least alert people to the potential ambiguity by describing
both the formal meaning of the term and the common
alternative use as a direct synonym for "closure variable".

---------

(cherry picked from commit 2739099)

Co-authored-by: Alyssa Coghlan <[email protected]>
Co-authored-by: Carol Willing <[email protected]>
efimov-mikhail pushed a commit to efimov-mikhail/cpython that referenced this pull request Oct 9, 2024
The term "free variable" has unfortunately become genuinely
ambiguous over the years (presumably due to the names of
some relevant code object instance attributes).

While we can't eliminate that ambiguity at this late date, we can
at least alert people to the potential ambiguity by describing
both the formal meaning of the term and the common
alternative use as a direct synonym for "closure variable".

---------

Co-authored-by: Carol Willing <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes 3.14 bugs and security fixes docs Documentation in the Doc dir triaged The issue has been accepted as valid by a triager.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improved documentation for exec() closure parameter [doc] Questionable terminology ('free variables') for describing what locals() does
4 participants