RF: Futures-based MultiProc #2598
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master    #2598      +/-   ##
==========================================
- Coverage   67.58%   67.54%    -0.04%
==========================================
  Files         339      340        +1
  Lines       42845    43004      +159
  Branches     5295     5320       +25
==========================================
+ Hits        28956    29047       +91
- Misses      13205    13258       +53
- Partials      684      699       +15
Huh, apparently futures gets bundled with Python 2.7 in conda. We should probably still add it to the dependencies, but this looks to be passing so far.
Looks good to me. We should check it in a basic system Python 2.7 environment and see whether that requires any backports or a specific version of Python 2.7.
Is there a reason we're using Miniconda in Travis? It supports vanilla Python installations, and I think all dependencies can be installed with pip.
Perhaps we should switch, since we can pip install all of nipype's requirements at this point.
Opened a PR to test independently of these changes. Will merge into this branch if everything goes well.
Great move.
I would suggest adding this reimplementation as a new plugin, and replacing MultiProc only once we are completely sure it works as we expect.
    daemon = property(_get_daemon, _set_daemon)


class NonDaemonPool(pool.Pool):
I think we still want non-daemon processes; we will need to investigate how to do that with concurrent.futures.
It would require using an undocumented API, if it is possible at all.
What purpose does it serve, by the way?
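For context, here is a minimal sketch of the non-daemon pool pattern the quoted code follows; the class names are illustrative rather than nipype's exact implementation. Daemonic workers are not allowed to spawn children, so forcing the daemon flag to False lets node code running inside a worker start its own subprocesses.

```python
# Illustrative sketch (assumed names, not a verbatim copy of nipype's code):
# pool workers whose daemon flag always reads False, so they may spawn children.
import multiprocessing
import multiprocessing.pool


class NonDaemonProcess(multiprocessing.Process):
    def _get_daemon(self):
        return False

    def _set_daemon(self, value):
        # ignore attempts to mark the worker as daemonic
        pass

    daemon = property(_get_daemon, _set_daemon)


class NonDaemonPool(multiprocessing.pool.Pool):
    # use the non-daemonic process class for pool workers
    Process = NonDaemonProcess
```

Note that this class-attribute override relies on multiprocessing.Pool internals that have changed in later CPython releases, so it may need adjustment outside the Python versions targeted here.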
try:
    self.pool = NipypePool(
        processes=self.processors,
        maxtasksperchild=maxtasks,
Apparently, there is no way to set maxtasksperchild with concurrent.futures (https://stackoverflow.com/a/42229574). We should maybe consider loky.
Max tasks per child is important for FMRIPREP, as memory gets freed every time a worker is shut down and a new one spun up.
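For reference, a minimal sketch of the multiprocessing feature being discussed: with maxtasksperchild, each worker process is retired after completing the given number of tasks and a fresh one is started, which releases whatever memory the old worker had accumulated. The run_node function below is a hypothetical stand-in, not nipype code.

```python
# Illustrative sketch of maxtasksperchild with multiprocessing.Pool
# (run_node is a placeholder, not a nipype function).
from multiprocessing import Pool


def run_node(node_id):
    # pretend this executes one workflow node
    return node_id * node_id


if __name__ == "__main__":
    # each worker is recycled after 10 tasks, freeing accumulated memory
    with Pool(processes=4, maxtasksperchild=10) as pool:
        print(sum(pool.map(run_node, range(100))))
```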
So if concurrent.futures doesn't have the features we need, perhaps it makes more sense to try to push them to fix the bug in multiprocessing.Pool?
@oesteban Just a bump on this. If we want to have a futures-based plugin, we should decide what we're doing here.
@oesteban, I've made a copy of the old implementation available as LegacyMultiProc. The point of this change is to seamlessly fix the crashing problem for people for whom it can be fixed, so using a new plugin name will not really get us where we want (or allow us to test widely). So instead I'm going to risk breaking things for some people, and we can tell them they can restore the old behavior by using the LegacyMultiProc plugin. Does this seem reasonable to everybody? cc @chrisfilo since #2548 was yours.
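For users, opting back into the old implementation would look roughly like this (a hedged sketch; the workflow and plugin arguments are illustrative):

```python
# Hypothetical usage sketch: select the legacy plugin by name at run time.
from nipype import Workflow

wf = Workflow(name="example_wf")
# ... add nodes and connections ...

# fall back to the previous pool-based MultiProc behavior
wf.run(plugin="LegacyMultiProc", plugin_args={"n_procs": 4})
```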
If we're okay waiting a little while before merging this, I pinned nipreps/fmriprep#1180 to this branch to make sure that it doesn't screw up our pipelines either. I think the nipype pipelines are probably a sufficient test anyway, so I'm okay with merging before that finishes.
This is a great addition :). Have you checked whether it has any influence on overall throughput? (Maybe on fmriprep proper: is it faster?)
I don't think it will significantly improve throughput. I don't believe job submission has generally been a major bottleneck, except with extremely short jobs, and then it's been the 2s sleep. But we can compare the fmriprep test runtimes in that pinned branch with the latest master test runtimes.
Maybe I spoke too soon? ds054 and ds210 both have a multi-minute speed-up on the workflow-running step (comparing https://circleci.com/workflow-run/035e3177-0d20-4714-ae9f-b19f1f0a1664 with https://circleci.com/workflow-run/cdcfd49e-e523-4021-a358-900955e679ab). Unclear if that will be a consistent improvement, though. Edit: Same with ds005. That's encouraging.
Indeed. I had the suspicion that this PR would improve performance. My understanding is that the control thread is much more efficient with the concurrent.futures implementation. My only fear is memory, as workers are not automatically reset. Still, we should go ahead, since this fixes #2548 in theory.
Sounds good. Taking the plunge.
@effigies: in case it is helpful, we were seeing the infinite loop issue after upgrading from 0.13 to 1.0.4. We merged these changes and the issue has not recurred.
@wtriplett That is great to hear!
This is a first pass at basing MultiProc off of the concurrent.futures module. I don't expect this to pass the 2.7 tests, but it will be interesting to see if we can get Python 3 working with about half an hour of effort.
Fixes #2548
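As a rough illustration of the mechanism (not the plugin's actual code), a minimal concurrent.futures sketch assuming a hypothetical run_node callable: work is submitted to a ProcessPoolExecutor and completion is handled through add_done_callback instead of a polling loop over a multiprocessing pool.

```python
# Illustrative sketch of the concurrent.futures submission pattern
# (run_node and on_done are hypothetical stand-ins, not nipype internals).
from concurrent.futures import ProcessPoolExecutor


def run_node(node_id):
    # placeholder for executing a single workflow node
    return node_id, "done"


def on_done(future):
    # called in the submitting process once the corresponding node finishes
    node_id, status = future.result()
    print("node %d finished: %s" % (node_id, status))


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as executor:
        for i in range(8):
            executor.submit(run_node, i).add_done_callback(on_done)
```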