Keep alive

By default, workers are restarted on each map call. This is done to clean up resources as quickly as possible when the work is done.

Workers can be kept alive in between consecutive map calls using the keep_alive flag. This is useful when your workers have a long startup time and you need to call one of the map functions multiple times.

from mpire import WorkerPool

def task(x):
    pass

with WorkerPool(n_jobs=4, keep_alive=True) as pool:
    pool.map(task, range(100))
    pool.map(task, range(100))  # Workers are reused here

Instead of passing the flag to the mpire.WorkerPool constructor, you can also call mpire.WorkerPool.set_keep_alive():

with WorkerPool(n_jobs=4) as pool:
    pool.map(task, range(100))
    pool.map(task, range(100))  # Workers are restarted
    pool.set_keep_alive()
    pool.map(task, range(100))  # Workers are reused here

Caveats

Keeping workers alive works even when the function to be called or any other parameter passed on to the map function changes.

However, changing the worker_init and/or worker_exit function while keep_alive is enabled can have undesired side effects. worker_init functions are executed only when a worker is started, and worker_exit functions only when a worker is terminated. When keep_alive is enabled, workers aren’t restarted in between consecutive map calls, so neither function is called between those calls.

def init_func_1(): pass
def exit_func_1(): pass

def init_func_2(): pass
def exit_func_2(): pass

with WorkerPool(n_jobs=4, keep_alive=True) as pool:
    pool.map(task, range(100), worker_init=init_func_1, worker_exit=exit_func_1)
    pool.map(task, range(100), worker_init=init_func_2, worker_exit=exit_func_2)

In the above example, init_func_1 is called for each worker when the workers are started. After the first map call, exit_func_1 is not called because the workers are kept alive. During the second map call, init_func_2 isn’t called either, because the workers are still alive. When exiting the context manager, the workers are shut down and exit_func_2 is called.
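The call order described above can be traced with a small toy model. This is only a sketch in plain Python, not mpire's implementation: ToyPool and its shutdown() method are hypothetical names made up for illustration, and the model only records which init and exit functions would run, and for which worker.

```python
class ToyPool:
    """Toy stand-in for a worker pool (hypothetical, not mpire's API).

    It runs no tasks; it only records which hooks would be called.
    """

    def __init__(self, n_jobs, keep_alive=False):
        self.n_jobs = n_jobs
        self.keep_alive = keep_alive
        self.alive = False     # whether the workers are currently running
        self.exit_hook = None  # exit hook passed to the most recent map call
        self.log = []          # (hook name, worker id) in call order

    def map(self, worker_init, worker_exit):
        self.exit_hook = worker_exit
        if not self.alive:
            # Workers start here, so the init hook runs once per worker
            self.log += [(worker_init, w) for w in range(self.n_jobs)]
            self.alive = True
        # ... tasks would be processed here ...
        if not self.keep_alive:
            self.shutdown()

    def shutdown(self):
        # Workers terminate here and run the most recently supplied exit hook
        if self.alive:
            self.log += [(self.exit_hook, w) for w in range(self.n_jobs)]
            self.alive = False


pool = ToyPool(n_jobs=2, keep_alive=True)
pool.map("init_func_1", "exit_func_1")
pool.map("init_func_2", "exit_func_2")
pool.shutdown()  # stands in for leaving the `with` block
print(pool.log)
# [('init_func_1', 0), ('init_func_1', 1), ('exit_func_2', 0), ('exit_func_2', 1)]
```

Neither exit_func_1 nor init_func_2 appears in the log, which matches the behavior described above. With keep_alive=False, the same two calls would log every hook once per worker.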

It gets even trickier when you also enable worker_lifespan. In this scenario, one worker may reach its maximum lifespan during the first map call and be forced to restart, while the others haven’t. The exit function of the worker being restarted is called (i.e., exit_func_1). If the exit function is changed in the second map call, the other workers execute the new exit function (i.e., exit_func_2) when they need to be restarted.
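To make the mixed outcome concrete, here is a toy simulation in plain Python. It is a sketch under stated assumptions rather than mpire code: simulate_exit_calls is a hypothetical helper, the number of tasks each worker handles is fixed by hand (in a real pool it depends on scheduling), and the per-worker task counter is assumed to carry over between map calls while workers are kept alive.

```python
def simulate_exit_calls(map_calls, lifespan):
    """Return (exit hook, worker id) pairs in the order the hooks would run.

    map_calls: one (exit_hook_name, tasks_per_worker) tuple per map call,
               where tasks_per_worker[i] is the number of tasks worker i handles.
    lifespan:  number of tasks after which a worker is restarted.
    """
    done = {}  # tasks completed by each worker since its last (re)start
    log = []
    for exit_hook, tasks_per_worker in map_calls:
        for worker_id, n_tasks in enumerate(tasks_per_worker):
            for _ in range(n_tasks):
                done[worker_id] = done.get(worker_id, 0) + 1
                if done[worker_id] == lifespan:
                    # Lifespan reached: the worker restarts and runs the exit
                    # hook of the map call that is active at this moment
                    log.append((exit_hook, worker_id))
                    done[worker_id] = 0
    return log


calls = [
    ("exit_func_1", [3, 1]),  # first map: worker 0 reaches its lifespan
    ("exit_func_2", [1, 3]),  # second map: worker 1 reaches its lifespan
]
print(simulate_exit_calls(calls, lifespan=3))
# [('exit_func_1', 0), ('exit_func_2', 1)]
```

Worker 0 exits with exit_func_1 and worker 1 with exit_func_2, so the two exit functions end up applied to different workers within the same pool.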