multiprocessing.Pool raises a pickling error

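This article looks at the PicklingError raised when using Python's multiprocessing module and presents several effective fixes: replacing processes with threads, registering a method pickler with the copy_reg module, and swapping the standard library for dill or pathos.multiprocessing.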

Symptom

When multiprocessing.Pool is given an ordinary function (one not defined inside a class), it works as expected.

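from multiprocessing import Pool


def f(x):
    return x * x


if __name__ == '__main__':
    # Create the pool only after f is defined, so the workers can find it.
    p = Pool(3)
    print(p.map(f, [1, 2, 3]))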

A method defined inside a class, however, triggers a pickling error when passed to multiprocessing.Pool.

Failing code

# coding: utf8
import multiprocessing


class MyTask(object):
    def task(self, x):
        return x*x

    def run(self):
        pool = multiprocessing.Pool(processes=3)

        a = [1, 2, 3]
        pool.map(self.task, a)


if __name__ == '__main__':
    t = MyTask()
    t.run()

This raises the following exception:

cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

Cause:

The explanation given on Stack Overflow:
Pool methods all use a queue.Queue to pass tasks to the worker processes. Everything that goes through the queue.Queue must be picklable. So, multiprocessing can only transfer Python objects to worker processes which can be pickled. Functions are only picklable if they are defined at the top level of a module; bound methods are not picklable.

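In other words, Pool hands tasks to its worker processes through a queue, so each task (the callable and its arguments) has to be serialized with pickle before it can cross the process boundary. A function is pickled by reference to its module-level name, so only top-level functions qualify. In Python 2 a bound method such as self.task has no pickle support, which produces the exception above.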
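The rule can be checked directly with pickle, without starting any worker processes. The following is a minimal sketch written for Python 3; `is_picklable` is a hypothetical helper introduced here for illustration, not part of the standard library.

```python
import pickle


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == '__main__':
    # A function reachable by name in an importable module is pickled by reference.
    print(is_picklable(len))
    # A lambda has no importable name, so pickle cannot look it up again.
    print(is_picklable(lambda x: x * x))
```

Running the same check on whatever callable is about to be handed to Pool.map is a quick way to tell whether it can be sent to a worker process.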

Solutions:

  1. Replace processes with threads.
  2. Use copy_reg (Python 2; renamed copyreg in Python 3) to register a pickling function for methods, which avoids the exception above.
  3. dill / pathos.multiprocessing: use pathos.multiprocessing instead of multiprocessing. pathos.multiprocessing is a fork of multiprocessing that uses dill. dill can serialize almost anything in Python, so many more kinds of objects can be sent to worker processes.

Working code 1

# coding: utf8
from multiprocessing.pool import ThreadPool as Pool


class MyTask(object):
    def task(self, x):
        return x*x

    def run(self):
        pool = Pool(3)

        a = [1, 2, 3]
        ret = pool.map(self.task, a)
        print(ret)


if __name__ == '__main__':
    t = MyTask()
    t.run()
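Threads avoid the error because a thread pool runs inside the same process: `self.task` is called directly on the original instance and nothing is pickled. The sketch below, written for Python 3, illustrates that point; the `SharedTask` class and its `seen` attribute are made up for this example. Bear in mind that CPU-bound work in threads is still limited by the GIL, so this substitution mainly suits I/O-bound tasks.

```python
from multiprocessing.pool import ThreadPool
import threading


class SharedTask(object):
    def __init__(self):
        self.seen = []                  # state shared by all worker threads
        self._lock = threading.Lock()   # a lock would not survive pickling

    def task(self, x):
        # Runs in a worker thread, but on the very same instance.
        with self._lock:
            self.seen.append(x)
        return x * x

    def run(self, values):
        pool = ThreadPool(3)
        try:
            return pool.map(self.task, values)
        finally:
            pool.close()
            pool.join()


if __name__ == '__main__':
    t = SharedTask()
    print(t.run([1, 2, 3]))
    print(sorted(t.seen))
```

The instance holds a lock, which cannot be pickled, yet the pool still works. With a process pool each worker would instead get its own copy of the instance, and changes to `seen` would not reach the parent.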

Working code 2 (Python 2, using copy_reg):

# coding: utf8
import multiprocessing
import types
import copy_reg


def _pickle_method(m):
    if m.im_self is None:
        return getattr, (m.im_class, m.im_func.func_name)
    else:
        return getattr, (m.im_self, m.im_func.func_name)


copy_reg.pickle(types.MethodType, _pickle_method)


class MyTask(object):
    def __init__(self):
        self.__result = []

    def task(self, x):
        return x * x

    def result_collector(self, result):
        self.__result.append(result)

    def run(self):
        pool = multiprocessing.Pool(processes=3)

        a = [1, 2, 3]
        ret = pool.map(self.task, a)
        print(ret)


if __name__ == '__main__':
    t = MyTask()
    t.run()
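What the registered `_pickle_method` does is reduce a bound method to the pair (`getattr`, (instance, method name)), which pickle can store and later call to rebuild the method in the worker. The sketch below replays that reduction by hand in Python 3, where the attributes are spelled `__self__` and `__func__.__name__`. The names `reduce_method` and `rebuild` are hypothetical and exist only for this illustration. Python 3 generally pickles bound methods of picklable instances on its own, so the copy_reg registration should only be needed on Python 2.

```python
class MyTask(object):
    def task(self, x):
        return x * x


def reduce_method(m):
    """Python 3 spelling of _pickle_method for a bound method.

    Python 3 has no unbound methods, so the im_self-is-None branch is gone.
    """
    return getattr, (m.__self__, m.__func__.__name__)


def rebuild(reduced):
    """Do what unpickling does: call the stored callable with its arguments."""
    func, args = reduced
    return func(*args)


if __name__ == '__main__':
    t = MyTask()
    reduced = reduce_method(t.task)   # (getattr, (t, 'task'))
    method = rebuild(reduced)         # same as getattr(t, 'task')
    print(method(3))
```

Because the reduced form contains the instance itself, the instance must also be picklable. An object that holds a Pool, a lock, or an open file as an attribute would still fail.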
