The platform module can cause crashes in Windows due to slow WMI calls

Crash report

What happened?

When running on a virtual machine where WMI calls seems to have very variable performance the WMI C module can cause Python to crash.

If you have (or can simulate slow WMI calls) then simple python code should (non-deterministically) reproduce the problem. The reason it's non-deterministic is because It's a thread race with a shared resource on the stack. WMI thread in CPython can end up with a pointer to a now invalid stack frame.

I can cause the problem by repeatedly calling platform.machine() and platform.win32_ver() in a loop of about 100 iterations on a machine with slow WMI calls.

import platform

for i in range(100):
    platform.win32_ver()
    platform.machine()

On the affected machines this will sometimes cause the whole process to die with error that relates to the stack being trashed, such as 0xC0000409 where the stack canary has been overwritten.

From a crash dump (that i cannot share) I debugged this issue by taking the WMI module and running on its own. I notice in the code that there is a timeout that seems to have been created because the WMI calls themselves can be quite slow, especially in the case of permission problems where WMIs own timeout is quite long.

https://github.com/python/cpython/blob/main/PC/_wmimodule.cpp#L282

The problem is that this timeout can cause the function that the platform modules uses to finish before the thread running the WMI code does. This is a bit of a problem because the thread is using a pointer to a struct that is allocated on the stack that is about to go away.

https://github.com/python/cpython/blob/main/PC/_wmimodule.cpp#L241

That struct has handles to a bunch of things that the WMI thread wants to use or clean up, including references to the ends of a pipe for which WriteFile calls are used.

In some situations python hangs, sometimes windows terminates it because it detected a stack overflow, sometimes it works, sometimes the timeout is fine, but it all depends on where the thread doing the WMI work was at the time the calling function terminates.

I can stop this problem by monkey patching the WMI calls in the platform module (it has alternative code paths that work ok). I can also stop it by removing the simulated timeout in the WMI module.

The problem is that lots of tools use the platform module - i first discovered this whilst using poetry when poetry install would just terminate, but it can affect anything that uses the platform module to make WMI calls on a machine with slow WMI.

There is no reasonable workaround on the virtual machines I use because they are managed by an organisation (as is the python install on those machines).

CPython versions tested on:

3.12

Operating systems tested on:

Windows

Output from running 'python -VV' on the command line:

No response

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

The platform module can cause crashes in Windows due to slow WMI calls #125315

Crash report

What happened?

CPython versions tested on:

Operating systems tested on:

Output from running 'python -VV' on the command line:

Linked PRs

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

The platform module can cause crashes in Windows due to slow WMI calls #125315

Description

Crash report

What happened?

CPython versions tested on:

Operating systems tested on:

Output from running 'python -VV' on the command line:

Linked PRs

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions