Process start method

The multiprocessing package allows you to start processes using a few different methods: 'fork', 'spawn' or 'forkserver'. Threading is also available by using 'threading'. For detailed information on the multiprocessing contexts, please refer to the multiprocessing documentation and caveats section. In short:

fork

Copies the parent process such that the child process is effectively identical. This includes copying everything currently in memory, which is useful when the child needs that data, but useless or even a serious bottleneck when it does not. fork enables the use of copy-on-write shared objects (see Shared objects).

spawn

Starts a fresh Python interpreter that inherits only the resources it needs.

forkserver

First starts a server process (using 'spawn'). Whenever a new process is needed, the parent process asks the server to fork one.

threading

Starts child threads instead of processes. Threads are limited by the Global Interpreter Lock (GIL) for CPU-intensive work, but work fine for I/O-intensive tasks.
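The point about threads and I/O-intensive tasks can be sketched with the standard library alone. This is an illustration under stated assumptions, not how this package implements its 'threading' start method: fake_download and run_in_threads are hypothetical names, and time.sleep stands in for a real network or disk wait.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(item):
    """Hypothetical stand-in for an I/O-bound call.

    time.sleep releases the GIL while waiting, as blocking I/O does.
    """
    time.sleep(0.1)
    return item * 2


def run_in_threads(items, n_threads=4):
    """Apply fake_download to every item with a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=n_threads) as executor:
        return list(executor.map(fake_download, items))


if __name__ == "__main__":
    start = time.perf_counter()
    results = run_in_threads(range(8))
    elapsed = time.perf_counter() - start
    # The waits overlap across threads, so the elapsed time should be well
    # below the sum of the individual sleeps.
    print(results, f"{elapsed:.2f}s")
```

Because the threads spend their time waiting rather than executing Python bytecode, the GIL is not the limiting factor here. A CPU-bound function in place of fake_download would not see the same benefit.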

For an overview of start method availability and defaults, please refer to the following table:

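=============  =================  ====================
Start method   Available on Unix  Available on Windows
=============  =================  ====================
fork           Yes (default)      No
spawn          Yes                Yes (default)
forkserver     Yes                No
threading      Yes                Yes
=============  =================  ====================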
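Availability can also be checked at runtime. The sketch below uses only the standard multiprocessing module; available_start_methods and pick_start_method are hypothetical helper names, not part of this package. Note that 'threading' is not a standard-library start method, so it does not appear in this list, and the default reported by the standard library may differ from the defaults in the table above.

```python
import multiprocessing as mp


def available_start_methods():
    """Return the process start methods the standard library supports here."""
    return mp.get_all_start_methods()


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first entry of `preferred` that is available on this platform."""
    available = available_start_methods()
    for method in preferred:
        if method in available:
            return method
    raise ValueError(f"None of {preferred} is available; choose from {available}")


if __name__ == "__main__":
    print("available:", available_start_methods())
    print("standard library default:", mp.get_start_method())
    print("picked:", pick_start_method())
```

A helper like this is one way to keep a script portable: it can prefer 'forkserver' where it exists and fall back to 'spawn', which is available on both Unix and Windows.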

Spawn and forkserver

When using spawn or forkserver as the start method, be aware that global variables (constants are fine) might have a different value than you expect, because the child process does not start as a copy of the parent's current state. You also have to import packages within the called function:

import os

from mpire import WorkerPool

def failing_job(folder, filename):
    return os.path.join(folder, filename)

# This will fail because 'os' is not copied to the child processes
with WorkerPool(n_jobs=2, start_method='spawn') as pool:
    pool.map(failing_job, [('folder', '0.p3'), ('folder', '1.p3')])

def working_job(folder, filename):
    # Import inside the function so the module is available in the child process
    import os
    return os.path.join(folder, filename)

# This will work
with WorkerPool(n_jobs=2, start_method='spawn') as pool:
    pool.map(working_job, [('folder', '0.p3'), ('folder', '1.p3')])
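The caveat about global variables can be illustrated with a minimal sketch. It uses the standard library rather than WorkerPool, so it shows the general behaviour of the 'spawn' start method and not this package's internals. VERBOSE, read_verbose, and read_verbose_in_child are hypothetical names, and the script is assumed to be saved to a file and run directly.

```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor

# Module-level global. Its import-time value is what a freshly started
# interpreter sees when it re-imports this module.
VERBOSE = False


def read_verbose(_):
    """Return VERBOSE as seen by whichever process runs this call."""
    return VERBOSE


def read_verbose_in_child(start_method):
    """Run read_verbose once in a child process created with `start_method`."""
    ctx = mp.get_context(start_method)
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as executor:
        return list(executor.map(read_verbose, [None]))[0]


if __name__ == "__main__":
    # Changed at runtime, in the parent process only.
    VERBOSE = True
    print("parent sees:", read_verbose(None))
    # With 'spawn' the child re-imports this module and skips this guarded
    # block, so it is expected to report the import-time value instead of
    # the parent's runtime change.
    print("spawn child sees:", read_verbose_in_child("spawn"))
```

Passing such values to the job explicitly as arguments, instead of reading them from a global, avoids the discrepancy regardless of the start method.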

A lot of effort has been put into making the progress bar, dashboard, and nested pools (with multiple progress bars) work well with spawn and forkserver, so these features should behave the same as they do with fork.