Worker state¶
If you want to let each worker have its own state you can use the use_worker_state flag:
from mpire import WorkerPool

def task(worker_state, x):
    if "local_sum" not in worker_state:
        worker_state["local_sum"] = 0
    worker_state["local_sum"] += x

with WorkerPool(n_jobs=4, use_worker_state=True) as pool:
    results = pool.map(task, range(100))
Important
The worker state is passed on as the third argument, after the worker ID and shared objects (when enabled), to the provided function.
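For example, when the worker ID and shared objects are also enabled (via the pass_worker_id and shared_objects constructor parameters), the task signature grows accordingly. A minimal sketch of the resulting argument order, with illustrative names:

def task(worker_id, shared_objects, worker_state, x):
    # Argument order: worker ID, shared objects, worker state, then the task arguments
    config = shared_objects
    worker_state.setdefault("n_processed", 0)
    worker_state["n_processed"] += 1
    return x * config["multiplier"]

with WorkerPool(n_jobs=4, pass_worker_id=True, shared_objects={"multiplier": 2},
                use_worker_state=True) as pool:
    results = pool.map(task, range(100))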
Instead of passing the flag to the mpire.WorkerPool constructor you can also make use of
mpire.WorkerPool.set_use_worker_state():
with WorkerPool(n_jobs=4) as pool:
    pool.set_use_worker_state()
    pool.map(task, range(100))
Combining worker state with worker_init and worker_exit¶
The worker state can be combined with the worker_init and worker_exit parameters of each map function,
which enables some really useful patterns, such as loading a memory-hungry model only once per worker:
import numpy as np
import pickle

def load_big_model(worker_state):
    # Load a model which takes up a lot of memory
    with open('./a_really_big_model.p3', 'rb') as f:
        worker_state['model'] = pickle.load(f)

def model_predict(worker_state, x):
    # Predict
    return worker_state['model'].predict(x)

with WorkerPool(n_jobs=4, use_worker_state=True) as pool:
    # Let the model predict
    data = np.array([[...]])
    results = pool.map(model_predict, data, worker_init=load_big_model)
More information about the worker_init and worker_exit parameters can be found at Worker init and exit.
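In the same way, a worker_exit function can hand back each worker's state when the worker shuts down; its return values can then be retrieved with mpire.WorkerPool.get_exit_results(). A minimal sketch, building on the local_sum example above (collect_sum is an illustrative name):

def collect_sum(worker_state):
    # Return the per-worker sum; worker_exit return values are stored by MPIRE
    return worker_state.get("local_sum", 0)

with WorkerPool(n_jobs=4, use_worker_state=True) as pool:
    pool.map(task, range(100), worker_exit=collect_sum)
    # One exit result per worker; together they add up to sum(range(100))
    total = sum(pool.get_exit_results())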
Combining worker state with keep_alive¶
By default, workers are restarted each time a map function is executed. As described in Keep alive, this can
be circumvented by using keep_alive=True. This also ensures the worker state is kept across consecutive map calls:
with WorkerPool(n_jobs=4, use_worker_state=True, keep_alive=True) as pool:
    # Let the model predict
    data = np.array([[...]])
    results = pool.map(model_predict, data, worker_init=load_big_model)

    # Predict some more
    more_data = np.array([[...]])
    more_results = pool.map(model_predict, more_data)
In this example we don’t need to supply the worker_init function to the second map call, as the workers will be
reused. When worker_lifespan is set, however, this rule doesn’t apply: workers that reach their lifespan are
restarted and lose their state, so worker_init has to be supplied to every map call.
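A minimal sketch of that caveat, reusing load_big_model and model_predict from the example above (the lifespan of 100 tasks is an arbitrary illustrative value):

with WorkerPool(n_jobs=4, use_worker_state=True, keep_alive=True) as pool:
    data = np.array([[...]])
    results = pool.map(model_predict, data, worker_init=load_big_model,
                       worker_lifespan=100)

    # Workers restarted after 100 tasks have lost the loaded model,
    # so worker_init is needed again here
    more_data = np.array([[...]])
    more_results = pool.map(model_predict, more_data, worker_init=load_big_model,
                            worker_lifespan=100)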