Skip to content

bpo-42272: fix misleading warning filter message/module docs #23172

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 30, 2022

Conversation

kevinoid
Copy link
Contributor

@kevinoid kevinoid commented Nov 5, 2020

"The Warnings Filter" section of the warnings module documentation describes the message and module filters as "a string containing a regular expression". While that is true when they are arguments to the filterwarnings function, it is not true when they appear in -W orPYTHONWARNINGS where they are matched literally (after stripping any starting/ending whitespace). This PR updates the documentation to note when they are matched literally and clarifies that module matches the "fully-qualified module name", rather than "module name" which is ambiguous.

Additionally, the error:::mymodule[.*] example in the "Describing Warning Filters" section of the warnings module documentation does not behave as the comment describes. Since the module portion of the filter string is interpreted literally, it would match a module with a fully-qualified name that is literally mymodule[.*].

Unfortunately, there is not a way to match '"module" and any subpackages of "mymodule"' as documented, since the module part of a filter string is matched literally. Instead, this PR updates the filter and comment to match only "mymodule".

Thanks for considering,
Kevin

https://bugs.python.org/issue42272

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Dec 16, 2020
"The Warnings Filter" section of the warnings module documentation
describes the message and module filters as "a string containing a
regular expression".  While that is true when they are arguments to the
filterwarnings function, it is not true when they appear in -W or
$PYTHONWARNINGS where they are matched literally (after stripping any
starting/ending whitespace).  Update the documentation to note when they
are matched literally.  Also clarify that module matches the
"fully-qualified module name", rather than "module name" which is
ambiguous.

skip news (since this is a doc fix)

Signed-off-by: Kevin Locke <[email protected]>
The `error:::mymodule[.*]` example in the "Describing Warning Filters"
section of the warnings module documentation does not behave as the
comment describes.  Since the module portion of the filter string is
interpreted literally, it would match a module with a fully-qualified
name that is literally `mymodule[.*]`.

Unfortunately, there is not a way to match '"module" and any subpackages
of "mymodule"' as documented, since the module part of a filter string
is matched literally.  Instead, update the filter and comment to match
only "mymodule".

skip news (since this is a doc fix)

Signed-off-by: Kevin Locke <[email protected]>
@kevinoid kevinoid force-pushed the bpo-42272-warnings-doc branch from 7a87f02 to 1767ef9 Compare December 16, 2020 00:32
@kevinoid
Copy link
Contributor Author

This PR is still relevant. I've rebased on current master and added "skip news" to commit messages. No changes to patch content.

@github-actions github-actions bot removed the stale Stale PR or inactive for long period of time. label Dec 17, 2020
@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Jan 16, 2021
Copy link

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this change to the documentation reflects the current behavior of -W and PYTHONWARNINGS. I'm not sure if this behavior of using a literal string instead of a regular expression is intended -- @kevinoid do you know if this behavior is a regression from a previous version? Did -W ever treat the input message/module as a regular expression as the documentation claims?

@bdice
Copy link

bdice commented Mar 12, 2022

do you know if this behavior is a regression from a previous version? Did -W ever treat the input message/module as a regular expression as the documentation claims?

I checked Python 2.7 and Python 3.5, just to see if this is a regression from some older version of Python. It appears that the behavior of -W and PYTHONWARNINGS has always used literal strings to match and not regular expressions like warnings.filterwarnings. You can tell because of the backslashes used to escape the pattern:

$ python2.7 -W "ignore:::mymodule.*" -c "import warnings; print warnings.filters[0][3].pattern"
mymodule\.\*$
$ PYTHONWARNINGS="ignore:::mymodule.*" python2.7 -c "import warnings; print warnings.filters[0][3].pattern"
mymodule\.\*$
$ python2.7 -c 'import warnings; warnings.filterwarnings("ignore", module="mymodule.*"); print warnings.filters[0][3].pattern'
mymodule.*
$ python3.5 -W "ignore:::mymodule.*" -c "import warnings; print(warnings.filters[0][3].pattern)"
mymodule\.\*$
$ PYTHONWARNINGS="ignore:::mymodule.*" python3.5 -c "import warnings; print(warnings.filters[0][3].pattern)"
mymodule\.\*$
$ python3.5 -c 'import warnings; warnings.filterwarnings("ignore", module="mymodule.*"); print(warnings.filters[0][3].pattern)'
mymodule.*

Copy link

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinoid
Copy link
Contributor Author

Thanks @bdice! I really appreciate your review.

To add to your findings, it appears that when -W was added in 2a862c6 and 47f5fdc the message pattern was also escaped:

message = re.escape(message)

@arhadthedev
Copy link
Member

arhadthedev commented Mar 12, 2022

Did -W ever treat the input message/module as a regular expression as the documentation claims?

No, but there was an attempt on 2018-2021 in bpo-34624 -W option and PYTHONWARNINGS env variable does not accept module regexes. (Note: probably a PEP, Python Enhancement Proposal is needed to boost things up because there was a concern about ReDoS attack)

In that issue Nick Coghlan (an author of Describing Warning Filters) writes:

I think the only reason I didn't mention this discrepancy in my doc updates is because I wasn't aware there was a discrepancy.

@arhadthedev
Copy link
Member

Also, Thomas Gläßle reported in the same BPO issue on December 2021 that:

Ok, it seems at least the incorrect documentation has been fixed in the mean time.

To check if this PR is still relevant, I got familiarized myself with a chain of hoops PYTHONWARNINGS is thrown through on the C side and what processing happens on the way, then compared it to the first occurence of PYTHONWARNINGS in the module documentation. I can verify that:

  • Warning filters are put into PyConfig.warnoptions with no processing, in a unified way from PYTHONWARNINGS, command line, and Py_xxx global configuration variables. This field is seen from Python code as sys.warnoptions.

    So The interpreter saves the arguments for all supplied entries without interpretation in sys.warnoptions is true.

  • Python warnings module calls _processoptions on its initialization that calls _setoption for each sys.warnoptions entry. _setoption, it its turn, does this:

    cpython/Lib/warnings.py

    Lines 223 to 228 in 7517437

    if message or module:
    import re
    if message:
    message = re.escape(message)
    if module:
    module = re.escape(module) + r'\Z'

    So the warnings module parses these when it is first imported is incorrect. The module escapes all regex special characters so they are still matched literally.

So it looks like that module parses these when it is first imported part needs to be fixed as well.

@kevinoid
Copy link
Contributor Author

Thanks for reviewing/investigating @arhadthedev! Good finds.

So it looks like that module parses these when it is first imported part needs to be fixed as well.

Is that something you'd like me to fix in this PR, or is that something for future work? If here, could you clarify what you'd like it to say?

@bluenote10
Copy link

It would be great if #9358 could be given another shot before removing the possibility to filter hierarchically is given up upon entirely. The lack of proper hierarchical filtering makes strict filterwarnings hard to use, and it could lead many projects to the bad solution of not reporting warnings as errors (over-broad warning silencing), because expected warnings in third party modules can neither be fixed nor filtered out appropriately. For use case see this SO question.

@ezio-melotti
Copy link
Member

ezio-melotti commented May 25, 2022

Additionally, the error:::mymodule[.*] example in the "Describing Warning Filters" section of the warnings module documentation does not behave as the comment describes. Since the module portion of the filter string is interpreted literally, it would match a module with a fully-qualified name that is literally mymodule[.*].

I just ran into this, and I suspect it might have been a typo for mymodule[.]* i.e. mymodule followed by 0 or more literal . ([.] escapes the . like \.). mymodule[.]* matches both mymodule and mymodule.subpkg as the comment claims, even though I would have written that as mymodule\.? or possibly mymodule(\.\w+)*.

This was added in GH-4458 -- @ncoghlan, can you confirm whether this was your intention?

Also note that since the code is using re.match, it might be wise to include a $ in the example, otherwise something like imp[.]* will also match e.g importlib. For reference this is the relevant piece of code:

cpython/Lib/warnings.py

Lines 353 to 357 in 1971014

if ((msg is None or msg.match(text)) and
issubclass(category, cat) and
(mod is None or mod.match(module)) and
(ln == 0 or lineno == ln)):
break

FWIW: I found this PR starting from pytest-dev/pytest#8759 and I also wrote pytest-dev/pytest#8759 (comment) with some examples and the result of my experiments.

@ezio-melotti
Copy link
Member

The PR looks good to me (I closed it and reopened it to re-trigger the bots).

@kevinoid, can you add a NEWS entry to say that the docs now acknowledge that -W and PYTHONWARNINGS only accept literal strings and not regular expressions?

@arhadthedev, if you think something else should be changed, we can create a new PR/issue. I think more examples (like the ones I added in pytest-dev/pytest#8759 (comment)) would be helpful.

@ezio-melotti ezio-melotti self-assigned this May 25, 2022
@ezio-melotti ezio-melotti removed the stale Stale PR or inactive for long period of time. label May 25, 2022
@kevinoid
Copy link
Contributor Author

Thanks @ezio-melotti! I just pushed an additional commit with a NEWS entry. Let me know if you'd prefer I rebase the PR and merge the NEWS entry into a previous commit. And, as always, let me know if you notice any other issues.

@ezio-melotti ezio-melotti merged commit 8136606 into python:main May 30, 2022
@ezio-melotti ezio-melotti added needs backport to 3.10 only security fixes needs backport to 3.11 only security fixes labels May 30, 2022
@miss-islington
Copy link
Contributor

Thanks @kevinoid for the PR, and @ezio-melotti for merging it 🌮🎉.. I'm working now to backport this PR to: 3.10.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Thanks @kevinoid for the PR, and @ezio-melotti for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 30, 2022
…H-23172)

* bpo-42272: improve message/module warning filter docs

"The Warnings Filter" section of the warnings module documentation
describes the message and module filters as "a string containing a
regular expression".  While that is true when they are arguments to the
filterwarnings function, it is not true when they appear in -W or
$PYTHONWARNINGS where they are matched literally (after stripping any
starting/ending whitespace).  Update the documentation to note when they
are matched literally.  Also clarify that module matches the
"fully-qualified module name", rather than "module name" which is
ambiguous.

skip news (since this is a doc fix)

Signed-off-by: Kevin Locke <[email protected]>

* bpo-42272: remove bad submodule warning filter doc

The `error:::mymodule[.*]` example in the "Describing Warning Filters"
section of the warnings module documentation does not behave as the
comment describes.  Since the module portion of the filter string is
interpreted literally, it would match a module with a fully-qualified
name that is literally `mymodule[.*]`.

Unfortunately, there is not a way to match '"module" and any subpackages
of "mymodule"' as documented, since the module part of a filter string
is matched literally.  Instead, update the filter and comment to match
only "mymodule".

skip news (since this is a doc fix)

Signed-off-by: Kevin Locke <[email protected]>

* bpo-42272: add warning filter doc changes to NEWS

Signed-off-by: Kevin Locke <[email protected]>
(cherry picked from commit 8136606)

Co-authored-by: Kevin Locke <[email protected]>
@bedevere-bot
Copy link

GH-93341 is a backport of this pull request to the 3.10 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.10 only security fixes label May 30, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 30, 2022
…H-23172)

* bpo-42272: improve message/module warning filter docs

"The Warnings Filter" section of the warnings module documentation
describes the message and module filters as "a string containing a
regular expression".  While that is true when they are arguments to the
filterwarnings function, it is not true when they appear in -W or
$PYTHONWARNINGS where they are matched literally (after stripping any
starting/ending whitespace).  Update the documentation to note when they
are matched literally.  Also clarify that module matches the
"fully-qualified module name", rather than "module name" which is
ambiguous.

skip news (since this is a doc fix)

Signed-off-by: Kevin Locke <[email protected]>

* bpo-42272: remove bad submodule warning filter doc

The `error:::mymodule[.*]` example in the "Describing Warning Filters"
section of the warnings module documentation does not behave as the
comment describes.  Since the module portion of the filter string is
interpreted literally, it would match a module with a fully-qualified
name that is literally `mymodule[.*]`.

Unfortunately, there is not a way to match '"module" and any subpackages
of "mymodule"' as documented, since the module part of a filter string
is matched literally.  Instead, update the filter and comment to match
only "mymodule".

skip news (since this is a doc fix)

Signed-off-by: Kevin Locke <[email protected]>

* bpo-42272: add warning filter doc changes to NEWS

Signed-off-by: Kevin Locke <[email protected]>
(cherry picked from commit 8136606)

Co-authored-by: Kevin Locke <[email protected]>
@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label May 30, 2022
@bedevere-bot
Copy link

GH-93342 is a backport of this pull request to the 3.11 branch.

miss-islington added a commit that referenced this pull request May 30, 2022
* bpo-42272: improve message/module warning filter docs

"The Warnings Filter" section of the warnings module documentation
describes the message and module filters as "a string containing a
regular expression".  While that is true when they are arguments to the
filterwarnings function, it is not true when they appear in -W or
$PYTHONWARNINGS where they are matched literally (after stripping any
starting/ending whitespace).  Update the documentation to note when they
are matched literally.  Also clarify that module matches the
"fully-qualified module name", rather than "module name" which is
ambiguous.

skip news (since this is a doc fix)

Signed-off-by: Kevin Locke <[email protected]>

* bpo-42272: remove bad submodule warning filter doc

The `error:::mymodule[.*]` example in the "Describing Warning Filters"
section of the warnings module documentation does not behave as the
comment describes.  Since the module portion of the filter string is
interpreted literally, it would match a module with a fully-qualified
name that is literally `mymodule[.*]`.

Unfortunately, there is not a way to match '"module" and any subpackages
of "mymodule"' as documented, since the module part of a filter string
is matched literally.  Instead, update the filter and comment to match
only "mymodule".

skip news (since this is a doc fix)

Signed-off-by: Kevin Locke <[email protected]>

* bpo-42272: add warning filter doc changes to NEWS

Signed-off-by: Kevin Locke <[email protected]>
(cherry picked from commit 8136606)

Co-authored-by: Kevin Locke <[email protected]>
miss-islington added a commit that referenced this pull request May 30, 2022
* bpo-42272: improve message/module warning filter docs

"The Warnings Filter" section of the warnings module documentation
describes the message and module filters as "a string containing a
regular expression".  While that is true when they are arguments to the
filterwarnings function, it is not true when they appear in -W or
$PYTHONWARNINGS where they are matched literally (after stripping any
starting/ending whitespace).  Update the documentation to note when they
are matched literally.  Also clarify that module matches the
"fully-qualified module name", rather than "module name" which is
ambiguous.

skip news (since this is a doc fix)

Signed-off-by: Kevin Locke <[email protected]>

* bpo-42272: remove bad submodule warning filter doc

The `error:::mymodule[.*]` example in the "Describing Warning Filters"
section of the warnings module documentation does not behave as the
comment describes.  Since the module portion of the filter string is
interpreted literally, it would match a module with a fully-qualified
name that is literally `mymodule[.*]`.

Unfortunately, there is not a way to match '"module" and any subpackages
of "mymodule"' as documented, since the module part of a filter string
is matched literally.  Instead, update the filter and comment to match
only "mymodule".

skip news (since this is a doc fix)

Signed-off-by: Kevin Locke <[email protected]>

* bpo-42272: add warning filter doc changes to NEWS

Signed-off-by: Kevin Locke <[email protected]>
(cherry picked from commit 8136606)

Co-authored-by: Kevin Locke <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants