When I first started tackling heavy computational tasks in Python—like processing large datasets or running complex simulations—I quickly ran into a wall. My scripts were using only one of my computer’s CPU cores, leaving the rest sitting idle. The reason for this is Python’s Global Interpreter Lock (GIL), which prevents multiple threads from executing Python bytecode at the same time. While multithreading is great for I/O-bound tasks where the program is waiting for a network or disk, it doesn’t help with CPU-bound work.
Table of Contents
- 1.1 Understanding the Difference: Processes vs. Threads
- 1.2 Getting Started: How to Create and Run a Process
- 1.2.1 The if __name__ == "__main__": Guard
- 1.3 Sharing Data Between Processes (Inter-Process Communication)
- 1.3.1 Queues: The Most Flexible Method
- 1.3.2 Managers: For Sharing Complex Objects
- 1.4 Process Pools: The Best Way to Distribute Work
- 1.4.1 The Modern Approach: ProcessPoolExecutor
- 1.5 A Note on Start Methods: spawn vs. fork
- 1.6 Conclusion
- 1.7 More Topics
The solution is Python’s multiprocessing module. It allows you to bypass the GIL by creating separate processes, each with its own Python interpreter and memory space. This lets you achieve true parallelism and fully utilize all the cores on your machine. In this guide, I’ll walk you through how to use multiprocessing, share data between processes, and efficiently distribute work.
Understanding the Difference: Processes vs. Threads
Before diving in, it’s crucial to understand how processes differ from threads. I think of it this way: a process is like a separate house with its own address space and resources, while threads are like multiple people living in the same house, sharing everything.
- Memory Space: Processes have their own isolated memory. A variable in one process is not accessible to another. Threads, however, share the same memory, which makes data sharing easier but also riskier due to potential race conditions.
- Overhead: Creating a new process is more resource-intensive (heavyweight) than creating a thread (lightweight).
- Isolation: The memory separation in processes makes them more robust. If one process crashes, it generally won’t affect the others. A misbehaving thread can bring down the entire application.
For CPU-intensive work, the overhead of processes is a small price to pay for the ability to run on multiple cores in parallel.
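To make the memory isolation point concrete, here’s a minimal sketch (it already uses the if __name__ == "__main__": guard explained in the next section): a child process increments a module-level variable, but the parent’s copy never changes. The counter variable and the increment amount are just for illustration.
Python
import multiprocessing

counter = 0  # a module-level variable

def increment():
    """Runs in the child process and modifies only its own copy of counter."""
    global counter
    counter += 100
    print(f"Child sees counter = {counter}")   # 100 inside the child

if __name__ == "__main__":
    p = multiprocessing.Process(target=increment)
    p.start()
    p.join()
    print(f"Parent still sees counter = {counter}")  # still 0 in the parent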
Getting Started: How to Create and Run a Process
The multiprocessing module provides a Process class that feels very similar to the threading module’s Thread class.
The if __name__ == "__main__": Guard
This is the most important and often misunderstood part of using multiprocessing. You must put your process-spawning code inside this block. Because child processes inherit or re-import the parent script, this guard prevents them from entering an infinite loop of creating more processes.
Here’s how to create, start, and wait for a process to finish:
Python
import multiprocessing
import time

def worker(name):
    """A simple function for our process to run."""
    print(f"Worker '{name}' starting...")
    time.sleep(2)
    print(f"Worker '{name}' finished.")

if __name__ == "__main__":
    # Create a Process object
    process = multiprocessing.Process(target=worker, args=("Process-1",))

    print("Main: Starting the process.")
    # Start the process
    process.start()

    print("Main: Waiting for the process to finish.")
    # Wait for the process to complete
    process.join()

    print("Main: All done.")
In this example, process.start() kicks off the new process, and the main script continues. The process.join() call makes the main script pause and wait until the worker process has completed its execution.
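If you want to convince yourself that this really runs in parallel, a rough timing sketch like the one below helps: several CPU-bound workers finish in roughly the time of one when they land on separate cores. The busy_work function and its iteration count are placeholders chosen for illustration.
Python
import multiprocessing
import time

def busy_work(n):
    """A purely CPU-bound task: sum of squares up to n."""
    total = 0
    for i in range(n):
        total += i * i

if __name__ == "__main__":
    start = time.perf_counter()
    processes = [
        multiprocessing.Process(target=busy_work, args=(5_000_000,))
        for _ in range(4)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print(f"4 CPU-bound workers took {time.perf_counter() - start:.2f}s in parallel")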
Sharing Data Between Processes (Inter-Process Communication)
Since processes don’t share memory, you can’t just use global variables to share data. Instead, you have to use explicit Inter-Process Communication (IPC) mechanisms. My advice is to favor message passing over shared memory whenever possible, as it’s generally safer and easier to reason about.
Queues: The Most Flexible Method
I find that multiprocessing.Queue is the most versatile tool for IPC. It allows multiple processes to safely exchange objects. One or more “producer” processes can put items into the queue, and “consumer” processes can get them out.
Python
from multiprocessing import Process, Queue

def worker(number, output_queue):
    """Calculates a result and puts it in the queue."""
    result = number * number
    output_queue.put(result)

if __name__ == "__main__":
    queue = Queue()
    processes = []

    for i in range(5):
        # Create and start a process for each number
        p = Process(target=worker, args=(i, queue))
        processes.append(p)
        p.start()

    # Retrieve the results from the queue
    for _ in range(5):
        result = queue.get()
        print(f"Got result: {result}")

    # Clean up processes
    for p in processes:
        p.join()
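Building on the producer/consumer idea, here’s a hedged sketch of a longer-running consumer process: the parent acts as the producer and finishes by putting a None sentinel on the queue so the consumer knows when to stop. The sentinel convention is my own choice, not something the Queue API requires.
Python
from multiprocessing import Process, Queue

def consumer(q):
    """Pulls items from the queue until it sees the None sentinel."""
    while True:
        item = q.get()
        if item is None:  # sentinel: producer is done
            break
        print(f"Consumed {item}")

if __name__ == "__main__":
    q = Queue()
    c = Process(target=consumer, args=(q,))
    c.start()

    for item in ["a", "b", "c"]:  # the parent plays the producer role here
        q.put(item)
    q.put(None)  # tell the consumer to stop

    c.join()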
Managers: For Sharing Complex Objects
If you need to share more complex Python objects like dictionaries or lists, a Manager is the most convenient option. It runs a server process that holds the shared objects, and other processes communicate with it through proxies. It feels like you’re modifying a normal dictionary, but it’s all happening safely across processes.
Python
from multiprocessing import Manager, Process

def update_dict(shared_dict, key, value):
    shared_dict[key] = value

if __name__ == "__main__":
    with Manager() as manager:
        # Create a shared dictionary using the manager
        shared_dict = manager.dict()

        processes = [
            Process(target=update_dict, args=(shared_dict, "proc1", 10)),
            Process(target=update_dict, args=(shared_dict, "proc2", 20)),
        ]

        for p in processes:
            p.start()
        for p in processes:
            p.join()

        print("Final shared dict:", dict(shared_dict))  # prints {'proc1': 10, 'proc2': 20}
While managers are very easy to use, they are slower than other IPC methods because of the overhead of the manager process and serialization.
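When the shared state is as simple as a single number, one of those faster alternatives is a low-level shared-memory value such as multiprocessing.Value; a minimal sketch, with the counter name and the increment logic chosen purely for illustration:
Python
from multiprocessing import Process, Value

def add_one(counter):
    # The built-in lock guards the read-modify-write against race conditions
    with counter.get_lock():
        counter.value += 1

if __name__ == "__main__":
    counter = Value("i", 0)  # 'i' = signed int, starting at 0
    processes = [Process(target=add_one, args=(counter,)) for _ in range(10)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print("Counter:", counter.value)  # 10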
Process Pools: The Best Way to Distribute Work
Manually creating and managing a process for every single task is inefficient. For tasks like processing items in a large list, a process pool is the way to go. A pool maintains a fixed number of worker processes and distributes tasks to them automatically.
The Modern Approach: ProcessPoolExecutor
My preferred method is using concurrent.futures.ProcessPoolExecutor. Its API is clean, it uses a context manager for easy cleanup, and it’s consistent with ThreadPoolExecutor, so you can switch between threads and processes easily. This is especially powerful for tasks in data science and machine learning where you might be doing heavy computations.
Python
from concurrent.futures import ProcessPoolExecutor

def square(x):
    return x * x

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8]

    with ProcessPoolExecutor(max_workers=4) as executor:
        # map() applies the function to each item in the data list
        results = executor.map(square, data)
        print(list(results))  # Output: [1, 4, 9, 16, 25, 36, 49, 64]
The executor.map() function is great because it applies the function to all items and returns the results in the same order as the input. The with statement ensures the pool is properly shut down when you’re done.
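If you need results as soon as each task finishes rather than in input order, the same executor also supports submit() together with as_completed(); a short sketch that reuses the square function from above:
Python
from concurrent.futures import ProcessPoolExecutor, as_completed

def square(x):
    return x * x

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    with ProcessPoolExecutor(max_workers=4) as executor:
        # submit() returns a Future for each task
        futures = {executor.submit(square, x): x for x in data}
        for future in as_completed(futures):
            x = futures[future]
            print(f"square({x}) = {future.result()}")  # order depends on completion time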
A Note on Start Methods: spawn vs. fork
Python can start new processes in different ways, and the default depends on your operating system.
- spawn: Starts a fresh, new Python interpreter process. It’s slower but safer and works on all platforms (Windows, macOS, Linux). This is the default on Windows and modern macOS.
- fork: Creates an exact copy of the parent process. It’s much faster, but it’s only available on Unix-like systems (it’s the default on Linux) and can be unsafe if your program uses threads.
For cross-platform compatibility and to avoid tricky bugs, I generally recommend sticking with or explicitly setting the start method to spawn. You can do this by adding multiprocessing.set_start_method('spawn') at the beginning of your if __name__ == "__main__": block.
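Putting that advice into code, a minimal sketch of explicitly selecting spawn (reusing a worker function like the one from the first example):
Python
import multiprocessing

def worker(name):
    print(f"Worker '{name}' running.")

if __name__ == "__main__":
    # Must be called once, before any processes are created
    multiprocessing.set_start_method("spawn")
    p = multiprocessing.Process(target=worker, args=("Process-1",))
    p.start()
    p.join()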
Conclusion
Multiprocessing is Python’s answer to true parallelism for CPU-bound tasks. By sidestepping the GIL, it lets you harness the full power of modern multi-core processors. While it introduces the complexity of inter-process communication, tools like Queue and ProcessPoolExecutor make it manageable and efficient.
Remember the golden rule: use multiprocessing for CPU-bound work and multithreading or asyncio for I/O-bound work. Choosing the right tool for the job is the key to writing high-performance concurrent applications in Python.
More Topics
- Python’s Itertools Module – How to Loop More Efficiently
- Python Multithreading – How to Handle Concurrent Tasks
- Python Asyncio – How to Write Concurrent Code
- Python Data Serialization – How to Store and Transmit Your Data
- Python Context Managers – How to Handle Resources Like a Pro
- Python Project Guide: NumPy & SciPy: For Science!
- Python Project Guide: Times, Dates & Numbers