joblib.Memory and joblib.Parallel raise _pickle.PicklingError: Can't pickle <class '__main__.Foo'>: it's not found as __main__.Foo

I am developing some machine learning code and ran into an error. After some investigation, I traced it to joblib.

How can I fix this error?

I am using Windows with the setup below, but the error also reproduces on Google Colab.

  • Platform version: Windows-10-10.0.18363-SP0
  • Python version: 3.9.5 (tags/v3.9.5:0a7dcbd, May 3 2021, 17:27:52) [MSC v.1928 64 bit (AMD64)]
  • Joblib version: 1.0.1

A minimum working example:

from joblib import Parallel, delayed, Memory
import joblib
import time
import random

class Foo(object):
    def compute(self, x, y):
        time.sleep(1)
        return x + y

def compute_foo(foo: Foo, x, y):
    return foo.compute(x, y)


if __name__ == '__main__':
    memory = Memory(location="./.cache/foo", verbose=0)
    compute_foo_cached = memory.cache(compute_foo)

    foo = Foo()

    Parallel(n_jobs=2)(delayed(compute_foo_cached)(foo, i ** 2, i) for i in [random.randint(0, 10) for i in range(20)])

Error

joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\externals\loky\process_executor.py", line 431, in _process_worker
    r = call_item()
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\externals\loky\process_executor.py", line 285, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\memory.py", line 591, in __call__
    return self._cached_call(args, kwargs)[0]
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\memory.py", line 483, in _cached_call
    func_id, args_id = self._get_output_identifiers(*args, **kwargs)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\memory.py", line 620, in _get_output_identifiers
    argument_hash = self._get_argument_hash(*args, **kwargs)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\memory.py", line 614, in _get_argument_hash
    return hashing.hash(filter_args(self.func, self.ignore, args, kwargs),
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\hashing.py", line 266, in hash
    return hasher.hash(obj)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\hashing.py", line 63, in hash
    self.dump(obj)
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python39\lib\pickle.py", line 487, in dump
    self.save(obj)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\hashing.py", line 241, in save
    Hasher.save(self, obj)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\hashing.py", line 89, in save
    Pickler.save(self, obj)
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python39\lib\pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python39\lib\pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\hashing.py", line 140, in _batch_setitems
    Pickler._batch_setitems(self, iter(sorted(items)))
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python39\lib\pickle.py", line 997, in _batch_setitems
    save(v)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\hashing.py", line 241, in save
    Hasher.save(self, obj)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\hashing.py", line 89, in save
    Pickler.save(self, obj)
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python39\lib\pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python39\lib\pickle.py", line 687, in save_reduce
    save(cls)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\hashing.py", line 241, in save
    Hasher.save(self, obj)
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\hashing.py", line 89, in save
    Pickler.save(self, obj)
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python39\lib\pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\hashing.py", line 111, in save_global
    Pickler.save_global(self, obj, **kwargs)
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python39\lib\pickle.py", line 1070, in save_global
    raise PicklingError(
_pickle.PicklingError: ("Can't pickle <class '__main__.Foo'>: it's not found as __main__.Foo", 'PicklingError while hashing {\'foo\': <__main__.Foo object at 0x00000214D1DDE100>, \'x\': 16, \'y\': 4}: PicklingError("Can\'t pickle <class \'__main__.Foo\'>: it\'s not found as __main__.Foo")')
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "G:\path\to\working\dir\_joblib_pickle.py", line 29, in <module>
    Parallel(n_jobs=2)(delayed(compute_foo_cached)(foo, i ** 2, i) for i in [random.randint(0, 10) for i in range(20)])
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\parallel.py", line 1054, in __call__
    self.retrieve()
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\parallel.py", line 933, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "C:\Users\xxxx\AppData\Local\pypoetry\Cache\virtualenvs\project-X-FnEG7BaP-py3.9\lib\site-packages\joblib\_parallel_backends.py", line 542, in wrap_future_result
    return future.result(timeout=timeout)
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\_base.py", line 445, in result
    return self.__get_result()
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\_base.py", line 390, in __get_result
    raise self._exception
_pickle.PicklingError: ("Can't pickle <class '__main__.Foo'>: it's not found as __main__.Foo", 'PicklingError while hashing {\'foo\': <__main__.Foo object at 0x00000214D1DDE100>, \'x\': 16, \'y\': 4}: PicklingError("Can\'t pickle <class \'__main__.Foo\'>: it\'s not found as __main__.Foo")')

Asked by Pengin on 20.07.2021
comment
The simplest workaround would be to move your library code (classes and functions) into a module separate from the one you launch the parallel jobs from. Then do from my_stuff import Foo.   - Iguananaut, 20.07.2021
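Why this workaround helps can be sketched with the standard library alone. Pickle stores a class by reference (module name plus class name), so the class must be reachable under that name in whichever process does the pickling. The sketch below is an illustration under assumptions, not joblib's actual code path: my_stuff is the hypothetical module name from the comment above, a temporary directory stands in for the project directory, and the helper names (import_foo_from_module, make_unreachable_class, can_pickle) are invented for the example. The "unreachable" class only imitates what a worker process presumably sees when Foo is defined in the launching script.

```python
import importlib
import pickle
import sys
import tempfile
import textwrap
from pathlib import Path

# Source of the hypothetical my_stuff.py module that holds the library code.
MODULE_SOURCE = textwrap.dedent(
    """
    class Foo(object):
        def compute(self, x, y):
            return x + y
    """
)


def import_foo_from_module(directory):
    """Write my_stuff.py into `directory`, make it importable, and return Foo."""
    Path(directory, "my_stuff.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, str(directory))
    importlib.invalidate_caches()
    return importlib.import_module("my_stuff").Foo


def make_unreachable_class():
    """Build a class that claims to live in my_stuff but is not defined there.

    Assumption: this mimics a class that exists in the current process but
    cannot be looked up under its declared module name.
    """
    cls = type("Bar", (object,), {})
    cls.__module__ = "my_stuff"
    return cls


def can_pickle(obj):
    """Return True if the standard pickler accepts `obj`."""
    try:
        pickle.dumps(obj)
        return True
    except pickle.PicklingError:
        return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        Foo = import_foo_from_module(tmp)
        print(Foo.__module__)                        # my_stuff
        print(can_pickle(Foo()))                     # True
        print(can_pickle(make_unreachable_class()()))  # False
```

In the script from the question, the equivalent change would be to delete the Foo definition from the launching script, put it in my_stuff.py next to that script, and add from my_stuff import Foo at the top.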
comment
Also related: stackoverflow.com/questions/45106274/   - Iguananaut, 20.07.2021
comment
Thank you! This is the solution I was looking for!   - Pengin, 29.07.2021