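One thing to note: arguments passed to a worker process must be serializable (picklable). Common built-in data types are picklable. Instances of custom classes are usually picklable too, as long as the class is defined at module top level and its attributes are themselves picklable; objects that hold open files, locks, or lambdas are not. (Java has an explicit way to mark a custom class as serializable; the Python counterpart is pickle, which a class can customize through __getstate__ and __setstate__.) If you need a custom-class object inside the worker processes and don't want to initialize it more than once, you can make the object a global variable. (I tried this in the [solution] link under Method 1 below, and it works.)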
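To make the pickling requirement concrete, here is a small sketch using the standard pickle module, which multiprocessing relies on to ship arguments to worker processes. The Point class is a hypothetical example and does not come from the original post.

```python
import pickle


class Point:
    """Hypothetical user-defined class, defined at module top level."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# An instance of a top-level class round-trips through pickle without extra work.
restored = pickle.loads(pickle.dumps(Point(1, 2)))
print(restored.x, restored.y)

# A lambda cannot be pickled, so it cannot be passed to a worker process.
try:
    pickle.dumps(lambda v: v + 1)
    lambda_picklable = True
except (pickle.PicklingError, AttributeError):
    lambda_picklable = False
print(lambda_picklable)
```

If an argument fails this kind of round trip, passing it to pool.map will fail in the same way.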
Method 1: each call gets a different x and the same y
from functools import partial
import multiprocessing as mp

def work(x, y):
    return x + y

if __name__ == "__main__":
    pool = mp.Pool(3)
    x = [1, 2, 3, 4, 5]
    partial_work = partial(work, y=1)  # fix y=1 so that x is the only remaining argument
    results = pool.map(partial_work, x)
In this case you can also make y a global variable; see the solution described in another article of mine.
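The linked article is not reproduced here, so the following is only a sketch of one common way to get a per-process global: the initializer and initargs parameters of multiprocessing.Pool. The names init_worker and _shared are hypothetical, and this may differ from what the author actually did.

```python
import multiprocessing as mp

_shared = None  # per-process global, filled in by init_worker


def init_worker(value):
    """Runs once in each worker process when the pool starts."""
    global _shared
    _shared = value


def work(x):
    # Reads the global instead of receiving y as an argument.
    return x + _shared


if __name__ == "__main__":
    with mp.Pool(3, initializer=init_worker, initargs=(1,)) as pool:
        results = pool.map(work, [1, 2, 3, 4, 5])
    print(results)  # [2, 3, 4, 5, 6]
```

The value is sent to each worker once, at startup, rather than once per task, which is the point of this approach when the shared object is expensive to build or to transfer.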
Method 2: each call gets a different x and y
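import multiprocessing as mp

def work(x_y):
    x, y = x_y  # unpack the (x, y) tuple
    return x + y

if __name__ == "__main__":
    pool = mp.Pool(3)
    x = [1, 2, 3, 4, 5, 6]
    y = [1, 1, 1, 1, 1, 1]
    x_y = zip(x, y)  # works for any number of arguments, as long as each is an iterable of the same length
    results = pool.map(work, x_y)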
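As a variation on Method 2, the standard library's Pool.starmap unpacks each tuple into positional arguments, so work can keep its natural two-argument signature. This is a sketch of an alternative, not the approach used above.

```python
import multiprocessing as mp


def work(x, y):
    return x + y


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6]
    y = [1, 1, 1, 1, 1, 1]
    with mp.Pool(3) as pool:
        # starmap calls work(*pair) for every (x, y) pair produced by zip.
        results = pool.starmap(work, zip(x, y))
    print(results)  # [2, 3, 4, 5, 6, 7]
```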
Method 3: each call gets a different x and y (using pathos)
from pathos.multiprocessing import ProcessingPool as Pool

def work(x, y):
    return x + y

if __name__ == "__main__":
    pool = Pool(4)
    x = [1, 2, 3, 4, 5, 6]
    y = [1, 3, 1, 2, 1, 1]
    results = pool.map(work, x, y)  # pathos accepts one iterable per argument
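If installing pathos is not an option, the same call shape is available in the standard library: ProcessPoolExecutor.map from concurrent.futures also takes one iterable per argument. A minimal sketch, offered as an alternative to Method 3 rather than a replacement for it:

```python
from concurrent.futures import ProcessPoolExecutor


def work(x, y):
    return x + y


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6]
    y = [1, 3, 1, 2, 1, 1]
    with ProcessPoolExecutor(max_workers=4) as executor:
        # executor.map returns a lazy iterator, so wrap it in list().
        results = list(executor.map(work, x, y))
    print(results)  # [2, 5, 4, 6, 6, 7]
```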
bug
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
_check_not_importing_main()
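Solution:
Create the pool and start the processes inside the main guard, if __name__ == "__main__":. Under the spawn start method (the default on Windows and macOS), every child process re-imports the main module, so process-starting code left at module top level would run again in each child and trigger this error.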
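A minimal sketch of the resulting layout, reusing work from Method 1: definitions stay at module top level so child processes can import them, and everything that starts processes lives behind the guard. The main() wrapper is a hypothetical helper added for illustration.

```python
import multiprocessing as mp
from functools import partial


def work(x, y):
    return x + y


def main():
    # Process-starting code is only reached from the guarded block below.
    with mp.Pool(3) as pool:
        return pool.map(partial(work, y=1), [1, 2, 3, 4, 5])


if __name__ == "__main__":
    mp.freeze_support()  # only matters for frozen Windows executables; harmless otherwise
    print(main())  # [2, 3, 4, 5, 6]
```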