When I first started tackling heavy computational tasks in Python—like processing large datasets or running complex simulations—I quickly ran into a wall. My scripts were using only one of my computer’s CPU cores, leaving the rest sitting idle. The reason for this is Python’s Global Interpreter Lock (GIL), which prevents multiple threads from executing Python bytecode at the same time. While multithreading is great for I/O-bound tasks where the program is waiting for a network or disk, it doesn’t help with CPU-bound work.
Table of Contents
- 1.1 Understanding the Difference: Processes vs. Threads
- 1.2 Getting Started: How to Create and Run a Process
- 1.2.1 The if __name__ == "__main__": Guard
- 1.3 Sharing Data Between Processes (Inter-Process Communication)
- 1.3.1 Queues: The Most Flexible Method
- 1.3.2 Managers: For Sharing Complex Objects
- 1.4 Process Pools: The Best Way to Distribute Work
- 1.4.1 The Modern Approach: ProcessPoolExecutor
- 1.5 A Note on Start Methods: spawn vs. fork
- 1.6 Conclusion
- 1.7 More Topics
The solution is Python’s multiprocessing module. It allows you to bypass the GIL by creating separate processes, each with its own Python interpreter and memory space. This lets you achieve true parallelism and fully utilize all the cores on your machine. In this guide, I’ll walk you through how to use multiprocessing, share data between processes, and efficiently distribute work.
Understanding the Difference: Processes vs. Threads
Before diving in, it’s crucial to understand how processes differ from threads. I think of it this way: a process is like a separate house with its own address space and resources, while threads are like multiple people living in the same house, sharing everything.
- Memory Space: Processes have their own isolated memory. A variable in one process is not accessible to another. Threads, however, share the same memory, which makes data sharing easier but also riskier due to potential race conditions.
- Overhead: Creating a new process is more resource-intensive (heavyweight) than creating a thread (lightweight).
- Isolation: The memory separation in processes makes them more robust. If one process crashes, it generally won’t affect the others. A misbehaving thread can bring down the entire application.
For CPU-intensive work, the overhead of processes is a small price to pay for the ability to run on multiple cores in parallel.
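To make the memory isolation point concrete, here’s a minimal sketch (it already uses the if __name__ == "__main__": guard explained in the next section): a child process increments a module-level variable, but the parent’s copy never changes. The counter variable and the increment amount are just for illustration.
Python
import multiprocessing

counter = 0  # a module-level variable

def increment():
    """Runs in the child process and modifies only its own copy of counter."""
    global counter
    counter += 100
    print(f"Child sees counter = {counter}")   # 100 inside the child

if __name__ == "__main__":
    p = multiprocessing.Process(target=increment)
    p.start()
    p.join()
    print(f"Parent still sees counter = {counter}")  # still 0 in the parent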
Getting Started: How to Create and Run a Process
The multiprocessing module provides a Process class that feels very similar to the threading module’s Thread class.
The if __name__ == "__main__": Guard
This is the most important and often misunderstood part of using multiprocessing. You must put your process-spawning code inside this block. Because child processes inherit or re-import the parent script, this guard prevents them from entering an infinite loop of creating more processes.
Here’s how to create, start, and wait for a process to finish:
Python
import multiprocessing
import time

def worker(name):
    """A simple function for our process to run."""
    print(f"Worker '{name}' starting...")
    time.sleep(2)
    print(f"Worker '{name}' finished.")

if __name__ == "__main__":
    # Create a Process object
    process = multiprocessing.Process(target=worker, args=("Process-1",))

    print("Main: Starting the process.")
    # Start the process
    process.start()

    print("Main: Waiting for the process to finish.")
    # Wait for the process to complete
    process.join()

    print("Main: All done.")
In this example, process.start() kicks off the new process, and the main script continues. The process.join() call makes the main script pause and wait until the worker process has completed its execution.
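If you want to convince yourself that this really runs in parallel, a rough timing sketch like the one below helps: several CPU-bound workers finish in roughly the time of one when they land on separate cores. The busy_work function and its iteration count are placeholders chosen for illustration.
Python
import multiprocessing
import time

def busy_work(n):
    """A purely CPU-bound task: sum of squares up to n."""
    total = 0
    for i in range(n):
        total += i * i

if __name__ == "__main__":
    start = time.perf_counter()
    processes = [
        multiprocessing.Process(target=busy_work, args=(5_000_000,))
        for _ in range(4)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print(f"4 CPU-bound workers took {time.perf_counter() - start:.2f}s in parallel")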
Sharing Data Between Processes (Inter-Process Communication)
Since processes don’t share memory, you can’t just use global variables to share data. Instead, you have to use explicit Inter-Process Communication (IPC) mechanisms. My advice is to favor message passing over shared memory whenever possible, as it’s generally safer and easier to reason about.
Queues: The Most Flexible Method
I find that multiprocessing.Queue is the most versatile tool for IPC. It allows multiple processes to safely exchange objects. One or more “producer” processes can put items into the queue, and “consumer” processes can get them out.
Python
from multiprocessing import Process, Queue

def worker(number, output_queue):
    """Calculates a result and puts it in the queue."""
    result = number * number
    output_queue.put(result)

if __name__ == "__main__":
    queue = Queue()
    processes = []

    for i in range(5):
        # Create and start a process for each number
        p = Process(target=worker, args=(i, queue))
        processes.append(p)
        p.start()

    # Retrieve the results from the queue
    for _ in range(5):
        result = queue.get()
        print(f"Got result: {result}")

    # Clean up processes
    for p in processes:
        p.join()
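Building on the producer/consumer idea, here’s a hedged sketch of a longer-running consumer process: the parent acts as the producer and finishes by putting a None sentinel on the queue so the consumer knows when to stop. The sentinel convention is my own choice, not something the Queue API requires.
Python
from multiprocessing import Process, Queue

def consumer(q):
    """Pulls items from the queue until it sees the None sentinel."""
    while True:
        item = q.get()
        if item is None:  # sentinel: producer is done
            break
        print(f"Consumed {item}")

if __name__ == "__main__":
    q = Queue()
    c = Process(target=consumer, args=(q,))
    c.start()

    for item in ["a", "b", "c"]:  # the parent plays the producer role here
        q.put(item)
    q.put(None)  # tell the consumer to stop

    c.join()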
Managers: For Sharing Complex Objects
If you need to share more complex Python objects like dictionaries or lists, a Manager is the most convenient option. It runs a server process that holds the shared objects, and other processes communicate with it through proxies. It feels like you’re modifying a normal dictionary, but it’s all happening safely across processes.
Python
from multiprocessing import Manager, Process

def update_dict(shared_dict, key, value):
    shared_dict[key] = value

if __name__ == "__main__":
    with Manager() as manager:
        # Create a shared dictionary using the manager
        shared_dict = manager.dict()

        processes = [
            Process(target=update_dict, args=(shared_dict, "proc1", 10)),
            Process(target=update_dict, args=(shared_dict, "proc2", 20)),
        ]

        for p in processes:
            p.start()
        for p in processes:
            p.join()

        print("Final shared dict:", dict(shared_dict))  # prints {'proc1': 10, 'proc2': 20}
While managers are very easy to use, they are slower than other IPC methods because of the overhead of the manager process and serialization.
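When the shared state is as simple as a single number, one of those faster alternatives is a low-level shared-memory value such as multiprocessing.Value; a minimal sketch, with the counter name and the increment logic chosen purely for illustration:
Python
from multiprocessing import Process, Value

def add_one(counter):
    # The built-in lock guards the read-modify-write against race conditions
    with counter.get_lock():
        counter.value += 1

if __name__ == "__main__":
    counter = Value("i", 0)  # 'i' = signed int, starting at 0
    processes = [Process(target=add_one, args=(counter,)) for _ in range(10)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print("Counter:", counter.value)  # 10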
Process Pools: The Best Way to Distribute Work
Manually creating and managing a process for every single task is inefficient. For tasks like processing items in a large list, a process pool is the way to go. A pool maintains a fixed number of worker processes and distributes tasks to them automatically.
The Modern Approach: ProcessPoolExecutor
My preferred method is using concurrent.futures.ProcessPoolExecutor. Its API is clean, it uses a context manager for easy cleanup, and it’s consistent with ThreadPoolExecutor, so you can switch between threads and processes easily. This is especially powerful for tasks in data science and machine learning where you might be doing heavy computations.
Python
from concurrent.futures import ProcessPoolExecutor

def square(x):
    return x * x

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8]

    with ProcessPoolExecutor(max_workers=4) as executor:
        # map() applies the function to each item in the data list
        results = executor.map(square, data)
        print(list(results))  # Output: [1, 4, 9, 16, 25, 36, 49, 64]
The executor.map() function is great because it applies the function to all items and returns the results in the same order as the input. The with statement ensures the pool is properly shut down when you’re done.
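If you need results as soon as each task finishes rather than in input order, the same executor also supports submit() together with as_completed(); a short sketch that reuses the square function from above:
Python
from concurrent.futures import ProcessPoolExecutor, as_completed

def square(x):
    return x * x

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    with ProcessPoolExecutor(max_workers=4) as executor:
        # submit() returns a Future for each task
        futures = {executor.submit(square, x): x for x in data}
        for future in as_completed(futures):
            x = futures[future]
            print(f"square({x}) = {future.result()}")  # order depends on completion time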
A Note on Start Methods: spawn vs. fork
Python can start new processes in different ways, and the default depends on your operating system.
- spawn: Starts a fresh, new Python interpreter process. It’s slower but safer and works on all platforms (Windows, macOS, Linux). This is the default on Windows and modern macOS.
- fork: Creates an exact copy of the parent process. It’s much faster, but it’s only available on Unix-like systems (it’s the default on Linux) and can be unsafe if your program uses threads.
For cross-platform compatibility and to avoid tricky bugs, I generally recommend sticking with or explicitly setting the start method to spawn. You can do this by adding multiprocessing.set_start_method('spawn') at the beginning of your if __name__ == "__main__": block.
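Putting that advice into code, a minimal sketch of explicitly selecting spawn (reusing a worker function like the one from the first example):
Python
import multiprocessing

def worker(name):
    print(f"Worker '{name}' running.")

if __name__ == "__main__":
    # Must be called once, before any processes are created
    multiprocessing.set_start_method("spawn")
    p = multiprocessing.Process(target=worker, args=("Process-1",))
    p.start()
    p.join()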
Conclusion
Multiprocessing is Python’s answer to true parallelism for CPU-bound tasks. By sidestepping the GIL, it lets you harness the full power of modern multi-core processors. While it introduces the complexity of inter-process communication, tools like Queue and ProcessPoolExecutor make it manageable and efficient.
Remember the golden rule: use multiprocessing for CPU-bound work and multithreading or asyncio for I/O-bound work. Choosing the right tool for the job is the key to writing high-performance concurrent applications in Python.
More Topics
- Python’s Itertools Module – How to Loop More Efficiently
- Python Multithreading – How to Handle Concurrent Tasks
- Python Asyncio – How to Write Concurrent Code
- Python Data Serialization – How to Store and Transmit Your Data
- Python Context Managers – How to Handle Resources Like a Pro
- Python Project Guide: NumPy & SciPy: For Science!
- Python Project Guide: Times, Dates & Numbers