
RF: Futures-based MultiProc #2598


Merged
effigies merged 8 commits into nipy:master from rf/multiproc_futures on Jul 2, 2018

Conversation

effigies
Member

This is a first pass at basing MultiProc off of the concurrent.futures module.

I don't expect this to pass the 2.7 tests, but it will be interesting to see if we can get Python 3 working with about half an hour of effort.

Fixes #2548
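
For context, a minimal sketch of the futures-based submission pattern this refers to (illustrative only; run_node is a hypothetical stand-in, not the plugin's actual code): jobs are submitted to a ProcessPoolExecutor and completion is handled through futures rather than multiprocessing.Pool's async callbacks.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def run_node(node_id):
    """Stand-in for executing a single workflow node."""
    return node_id


if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=4) as executor:
        futures = {executor.submit(run_node, i): i for i in range(10)}
        for fut in as_completed(futures):
            # .result() re-raises any exception from the worker process, which
            # lets the scheduler notice crashed jobs instead of hanging.
            print('node %d finished' % fut.result())
```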

@effigies added this to the 1.1.0 milestone on May 26, 2018
@codecov-io

codecov-io commented May 26, 2018

Codecov Report

Merging #2598 into master will decrease coverage by 0.03%.
The diff coverage is 58.65%.


@@            Coverage Diff             @@
##           master    #2598      +/-   ##
==========================================
- Coverage   67.58%   67.54%   -0.04%     
==========================================
  Files         339      340       +1     
  Lines       42845    43004     +159     
  Branches     5295     5320      +25     
==========================================
+ Hits        28956    29047      +91     
- Misses      13205    13258      +53     
- Partials      684      699      +15
Flag          Coverage Δ
#smoketests   50.52% <26.81%> (-0.21%) ⬇️
#unittests    64.99% <58.65%> (-0.06%) ⬇️

Impacted Files                               Coverage Δ
nipype/info.py                               94.02% <ø> (ø) ⬆️
nipype/pipeline/plugins/multiproc.py         84.81% <100%> (+2.66%) ⬆️
nipype/pipeline/plugins/__init__.py          100% <100%> (ø) ⬆️
nipype/pipeline/plugins/legacymultiproc.py   55.95% <55.95%> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@effigies
Member Author

Huh, apparently futures gets bundled with Python 2.7 in conda.

Should probably still add it to the dependencies, but this looks to be passing, so far.
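
If the backport does get added to the dependencies, one option would be a conditional requirement with a PEP 508 environment marker (a sketch only; the actual entry in nipype/info.py may look different):

```python
# Hypothetical requirements entry: pull in the futures backport only on Python 2.
REQUIRES = [
    'futures; python_version == "2.7"',
]
```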

@satra
Member

satra commented May 26, 2018

looks good to me. we should check it in a basic system python 2.7 environment and see if that requires any backports or a specific version of py2.7.

@effigies
Member Author

Is there a reason we're using Miniconda in Travis? It supports vanilla Python installations, and I think all dependencies can be installed with pip. Nibabel has URLs for wheels for anything that would otherwise need compilation.

@satra
Member

satra commented May 27, 2018

perhaps we should switch since we can pip install all of nipype's requirements at this point.

@effigies
Member Author

Opened a PR to test independently of these changes. Will merge into this branch if everything goes well.

@oesteban
Contributor

oesteban left a comment


Great move.

I would suggest adding this reimplementation as a new plugin, and replacing MultiProc only when we are completely sure it works as we expect.

    daemon = property(_get_daemon, _set_daemon)


class NonDaemonPool(pool.Pool):

Contributor

I think we still want nondaemon processes, we will need to investigate how to do that with concurrent.futures
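
For context, a rough sketch of the non-daemon Pool pattern the quoted lines come from (adapted, not a verbatim copy of the plugin; it works on the Python versions current at the time, while newer Pythons need a context-based variant). The point of it is that daemonic processes are not allowed to create child processes of their own, so workers are forced to report daemon=False.

```python
import multiprocessing
import multiprocessing.pool


class NonDaemonProcess(multiprocessing.Process):
    def _get_daemon(self):
        return False

    def _set_daemon(self, value):
        pass  # ignore attempts to mark the worker daemonic

    daemon = property(_get_daemon, _set_daemon)


class NonDaemonPool(multiprocessing.pool.Pool):
    # Tell the Pool machinery to build workers from the non-daemon class.
    Process = NonDaemonProcess
```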

Member Author

It will require using an undocumented API, if it is possible at all.

What purpose does it serve, by the way?

        try:
            self.pool = NipypePool(
                processes=self.processors,
                maxtasksperchild=maxtasks,

Contributor

Apparently, there is no way to set maxtasksperchild with concurrent.futures (https://stackoverflow.com/a/42229574). We should maybe consider loky.

Max tasks per child is important for FMRIPREP, as memory gets cleared up every time a worker is shut down and spun up.
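
For reference, a minimal illustration (not plugin code) of what maxtasksperchild provides with multiprocessing.Pool: each worker process is retired and replaced after a fixed number of tasks, so memory it accumulated is returned to the OS. As noted above, ProcessPoolExecutor had no equivalent option at the time, which is part of the motivation for looking at loky.

```python
import os
from multiprocessing import Pool


def task(i):
    return os.getpid()  # report which worker PID handled this task


if __name__ == '__main__':
    # Each worker exits after 10 tasks and is replaced by a fresh process.
    with Pool(processes=2, maxtasksperchild=10) as pool:
        pids = set(pool.map(task, range(100), chunksize=1))
    print('100 tasks were handled by %d distinct worker processes' % len(pids))
```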

Member Author

So if concurrent.futures doesn't have the features we need, perhaps it makes more sense to try to push them to fix the bug in multiprocessing.Pool?

@effigies
Member Author

@oesteban Just a bump on this. If we want to have a futures-based plugin, we should decide what we're doing here.

@effigies force-pushed the rf/multiproc_futures branch from 7b92109 to 885e375 on June 19, 2018 02:14
@effigies
Member Author

effigies commented Jul 2, 2018

@oesteban, I've made a copy of the old MultiProc as LegacyMultiProc.

The point of this change is to seamlessly fix the crashing problem for people for whom it can be fixed, so using a new plugin name will not really get us where we want (or allow us to test widely). So instead I'm going to risk breaking things for some people; we can tell them they can restore the old behavior with LegacyMultiProc, and in the meantime ask them to help us debug the new MultiProc.

Does this seem reasonable to everybody?

cc @chrisfilo since #2548 was yours.
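
For anyone following along, switching between the two should just be a matter of the plugin name passed when running a workflow (a usage sketch with a placeholder workflow, not taken from the PR itself):

```python
import nipype.pipeline.engine as pe

wf = pe.Workflow(name='demo')  # placeholder; build your own workflow here

# Futures-based plugin introduced by this PR:
wf.run(plugin='MultiProc', plugin_args={'n_procs': 4})

# Previous pool-based behavior, kept under a new name:
wf.run(plugin='LegacyMultiProc', plugin_args={'n_procs': 4})
```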

@effigies
Member Author

effigies commented Jul 2, 2018

If we're okay waiting a little while before merging this, I've pinned nipreps/fmriprep#1180 to this branch to make sure it doesn't screw up our pipelines either. I think the nipype pipelines are probably a sufficient test anyway, so I'm okay with merging before that finishes.

@oesteban
Contributor

oesteban commented Jul 2, 2018

This is a great addition :). Have you checked whether it has any influence on the overall throughput? (maybe on fmriprep proper, is it faster?)

@effigies
Member Author

effigies commented Jul 2, 2018

I don't think it will significantly improve throughput. I don't believe job submission has generally been a major bottleneck, except with extremely short jobs, and then it's been the 2s sleep.

But we can compare the fmriprep test runtimes in that pinned branch with the latest master test runtimes.

@effigies
Member Author

effigies commented Jul 2, 2018

Maybe I spoke too soon? ds054 and ds210 both have a multi-minute speed-up on the workflow-running step (comparing https://circleci.com/workflow-run/035e3177-0d20-4714-ae9f-b19f1f0a1664 with https://circleci.com/workflow-run/cdcfd49e-e523-4021-a358-900955e679ab). Unclear if that will be a consistent improvement, though.

Edit: Same with ds005. That's encouraging.

@oesteban
Contributor

oesteban commented Jul 2, 2018

That's encouraging.

Indeed. I had a suspicion that this PR would improve performance. My understanding is that the control thread is much more efficient with the concurrent.futures implementation. My only fear is memory, as workers are not automatically reset. Even so, we should go ahead, since this fixes #2548 in theory.

@effigies
Member Author

effigies commented Jul 2, 2018

Sounds good. Taking the plunge.

@effigies merged commit 7fda6ae into nipy:master on Jul 2, 2018
@effigies deleted the rf/multiproc_futures branch on July 2, 2018 18:15
effigies added a commit that referenced this pull request Jul 2, 2018
#2598 did not incorporate #2611's move to hierarchical loggers.

This is likely to be a breaking change for a lot of downstream users, so we should make a particular note of it in the release notes.
@wtriplett
Contributor

@effigies: in case it is helpful, we were seeing the infinite loop issue after upgrading from 0.13 to 1.0.4. We merged effigies:rf/multiproc_futures with the 1.0.4 tag (I guess it was about a week ago) and it has been working great.

@effigies
Member Author

effigies commented Jul 5, 2018

@wtriplett That is great to hear!
