Skip to content

release GIL while doing I/O operations in the mmap module #44098

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
luks mannequin opened this issue Oct 8, 2006 · 11 comments
Closed

release GIL while doing I/O operations in the mmap module #44098

luks mannequin opened this issue Oct 8, 2006 · 11 comments
Labels
3.12 only security fixes extension-modules C modules in the Modules dir performance Performance or resource usage

Comments

@luks
Copy link
Mannequin

luks mannequin commented Oct 8, 2006

BPO 1572968
Nosy @pitrou, @briancurtin, @jnwatson, @ZackerySpytz, @isidentical
PRs
  • gh-44098: Release the GIL while calling mmap() in the mmap module on Windows #14114
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2006-10-08.00:26:31.000>
    labels = ['extension-modules', '3.9', 'performance']
    title = 'release GIL while doing I/O operations in the mmap module'
    updated_at = <Date 2019-06-15.15:48:30.691>
    user = 'https://bugs.python.org/luks'

    bugs.python.org fields:

    activity = <Date 2019-06-15.15:48:30.691>
    actor = 'ZackerySpytz'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Extension Modules']
    creation = <Date 2006-10-08.00:26:31.000>
    creator = 'luks'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 1572968
    keywords = ['patch']
    message_count = 8.0
    messages = ['61261', '61262', '84542', '102453', '102465', '102466', '342561', '343989']
    nosy_count = 8.0
    nosy_names = ['pitrou', 'luks', 'brian.curtin', 'neologix', 'adiroiban', 'jnwatson', 'ZackerySpytz', 'BTaskaya']
    pr_nums = ['14114']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue1572968'
    versions = ['Python 3.9']

    @luks
    Copy link
    Mannequin Author

    luks mannequin commented Oct 8, 2006

    There is a few system I/O calls in the mmap module that
    can take some time. Would be be possible to put
    Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS around
    these to release the GIL and allow other Python threads
    to do their work?

    @luks luks mannequin added type-feature A feature request or enhancement stdlib Python modules in the Lib dir labels Oct 8, 2006
    @josiahcarlson
    Copy link
    Mannequin

    josiahcarlson mannequin commented Oct 16, 2006

    Logged In: YES
    user_id=341410

    Can you come up with the listing of code blocks where it
    would make sense to allow threads, or even provide a patch?

    @pitrou
    Copy link
    Member

    pitrou commented Mar 30, 2009

    I don't know mmap myself but patches are welcome.

    @pitrou pitrou added performance Performance or resource usage and removed type-feature A feature request or enhancement labels Mar 30, 2009
    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Apr 6, 2010

    As soon as you're dealing with files (not anonymous mapping), you can get the same type of latency than when using open/read/write...
    While it's probably not worth the trouble to release the GIL for every operation involving mmaped-files, there are a couple places where it should be considered, for example:

    Modules/mmapmodule.c:new_mmap_object
    #ifdef HAVE_FSTAT
    #  ifdef __VMS
    	/* on OpenVMS we must ensure that all bytes are written to the file */
    	if (fd != -1) {
    	        fsync(fd);
    	}
    #  endif
    

    fsync() can take up to a couple seconds, so we should probably allow threads here (this is done only on VMS).
    Another place worth looking at is when using msync(), like in mmap_object_dealloc():

    	if (m_obj->data!=NULL) {
    		msync(m_obj->data, m_obj->size, MS_SYNC);
    		munmap(m_obj->data, m_obj->size);
    	}
    

    or in mmap_flush_method():

    	if (-1 == msync(self->data + offset, size, MS_SYNC)) {
    		PyErr_SetFromErrno(mmap_module_error);
    		return NULL;
    	}
    

    msycn too can take quite a long time to complete.

    I can write a patch if someone's willing to review it.

    @pitrou
    Copy link
    Member

    pitrou commented Apr 6, 2010

    This shouldn't really care about VMS here. As for the rest, feel free to propose a patch :)
    Please note that most non-trivial system calls, such as ftruncate(), mremap(), even mmap() itself, deserve to be enclosed in Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS pairs.

    However, you'll have to be careful that the internal state of the mmap object remains consistent, such that using it from several threads doesn't crash the interpreter.
    (you can even add multi-threaded unit tests if you want to be sure of this)

    @pitrou
    Copy link
    Member

    pitrou commented Apr 6, 2010

    This shouldn't really care about VMS here.

    Correction: /we/ shouldn't really care about VMS here.
    And the reason being, of course, that we have neither developers nor
    known testers under VMS.
    (we don't even know if the current trunk builds there or not)

    @isidentical
    Copy link
    Member

    Any news? If a patch is not ready, i can work on a patch too.

    @jnwatson
    Copy link
    Mannequin

    jnwatson mannequin commented May 30, 2019

    I'll add one more system I/O call that's not GIL-wrapped in the mmap module that can take some time: mmap itself.

    mmap on Linux with MAP_POPULATE (0x8000) as the flags can take quite a bit of time. That's the flag that prefaults the memory range. MAP_POPULATE has been around since Linux 2.5.46.

    I know that MAP_POPULATE isn't explicitly supported in the module, but mmap.mmap does take arbitrary flags, so it isn't exactly unsupported either.

    @ZackerySpytz ZackerySpytz mannequin added extension-modules C modules in the Modules dir 3.9 only security fixes and removed stdlib Python modules in the Lib dir labels Jun 15, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @gvanrossum
    Copy link
    Member

    This seems stuck, and it’s really old (2006!).

    @isidentical Could I interest you in providing that patch you mentioned?

    @gvanrossum gvanrossum added the pending The issue will be closed if no feedback is provided label Sep 3, 2022
    @iritkatriel iritkatriel added 3.12 only security fixes and removed 3.9 only security fixes labels Sep 6, 2022
    hauntsaninja added a commit to hauntsaninja/cpython that referenced this issue Oct 10, 2022
    This seems pretty straightforward. The issue mentions other calls in
    mmapmodule that we could release the GIL on, but those are in methods
    where we'd need to be careful to ensure that something sensible happens
    if those are called concurrently. In prior art, note that python#12073
    released the GIL for munmap.  In a toy benchmark, I see the speed up
    you'd expect from doing this.
    hauntsaninja added a commit to hauntsaninja/cpython that referenced this issue Oct 10, 2022
    @AlexWaygood AlexWaygood removed the pending The issue will be closed if no feedback is provided label Oct 10, 2022
    @gvanrossum
    Copy link
    Member

    Once the Unix PR lands, should we keep this open hoping someone will do it for Windows, or close it?

    miss-islington pushed a commit that referenced this issue Oct 10, 2022
    This seems pretty straightforward. The issue mentions other calls in mmapmodule that we could release the GIL on, but those are in methods where we'd need to be careful to ensure that something sensible happens if those are called concurrently. In prior art, note that #12073 released the GIL for munmap.  In a toy benchmark, I see the speedup you'd expect from doing this.
    
    Automerge-Triggered-By: GH:gvanrossum
    @hauntsaninja
    Copy link
    Contributor

    The Unix PR has landed and I don't intend to make one for Windows (don't have access to a machine or a use case). This has been open since 2006, so I vote we close it.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.12 only security fixes extension-modules C modules in the Modules dir performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants