Multithreading a Python for loop
First, in Python, if your code is CPU-bound, multithreading won't help, because only one thread can hold the Global Interpreter Lock (GIL), and therefore run Python code, at a time. So, you need to use processes, not threads.
This is not true if your operation "takes forever to return" because it's IO-bound—that is, waiting on the network or disk copies or the like. I'll come back to that later.
Next, the way to process 5 or 10 or 100 items at once is to create a pool of 5 or 10 or 100 workers, and put the items into a queue that the workers service. Fortunately, the stdlib provides two implementations of this pattern: multiprocessing.Pool and concurrent.futures.ProcessPoolExecutor.
The former is more powerful and flexible for traditional programming; the latter is simpler if you need to compose future-waiting; for trivial cases, it really doesn't matter which you choose. (In this case, the most obvious implementation takes about three lines with either.)
If you're using 2.6-2.7 or 3.0-3.1, concurrent.futures isn't in the stdlib, but you can install the backport from PyPI (pip install futures).
Finally, it's usually a lot simpler to parallelize things if you can turn the entire loop iteration into a function call (something you could, e.g., pass to map).
Putting it all together:
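The original answer's code isn't preserved in this text, but a minimal sketch of the pattern, assuming a placeholder process_item function and items list, might look like this:

```python
import concurrent.futures

def process_item(item):
    # Placeholder for the CPU-bound body of the original loop.
    return item * item

items = list(range(10))

if __name__ == "__main__":
    # A pool of 5 workers servicing a queue of items;
    # executor.map hands items to workers as they become free.
    with concurrent.futures.ProcessPoolExecutor(max_workers=5) as executor:
        results = list(executor.map(process_item, items))
    print(results)
```

The `if __name__ == "__main__":` guard matters with processes: on platforms that spawn workers by re-importing the module, omitting it can cause an infinite loop of process creation.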
If you have lots of relatively small jobs, the overhead of multiprocessing might swamp the gains. The way to solve that is to batch up the work into larger jobs: for example, split the list into chunks and hand each worker a whole chunk instead of a single item.
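A sketch of the batching idea, with a hypothetical chunked helper and placeholder work functions:

```python
import concurrent.futures
from itertools import islice

def process_item(item):
    # Stand-in for a small unit of work.
    return item + 1

def process_batch(batch):
    # Each task now handles a whole batch, amortizing the
    # per-task overhead of shipping work to another process.
    return [process_item(item) for item in batch]

def chunked(iterable, size):
    # Yield successive lists of `size` items.
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

items = range(1000)

if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor() as executor:
        batched = executor.map(process_batch, chunked(items, 100))
        # Flatten the per-batch results back into one list.
        results = [r for batch in batched for r in batch]
```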
Finally, what if your code is IO bound? Then threads are just as good as processes, and with less overhead (and fewer limitations, but those limitations usually won't affect you in cases like this). Sometimes that "less overhead" is enough to mean you don't need batching with threads, but you do with processes, which is a nice win.
So, how do you use threads instead of processes? Just change ProcessPoolExecutor to ThreadPoolExecutor.
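Because the two pool classes share the same interface, the swap is a one-line change. A sketch, with a stand-in fetch function where a real program would do network or disk I/O:

```python
import concurrent.futures

def fetch(url):
    # Stand-in for an I/O-bound call (network, disk, etc.).
    return len(url)

urls = ["https://example.com/a", "https://example.com/b"]

# Threads instead of processes: same interface, less overhead,
# fine for I/O-bound work where waiting releases the GIL.
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(fetch, urls))
```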
If you're not sure whether your code is CPU-bound or IO-bound, just try it both ways.
Can you use executors from more than one place in your program? Yes. In fact, there are two different ways to do it.
First, you can share the same (thread or process) executor and use it from multiple places with no problem. The whole point of tasks and futures is that they're self-contained; you don't care where they run, just that you queue them up and eventually get the answer back.
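A sketch of that first option, with two unrelated parts of a program (hypothetical part_one and part_two) submitting to one shared pool:

```python
import concurrent.futures

shared_executor = concurrent.futures.ThreadPoolExecutor(max_workers=8)

def part_one():
    # One part of the program queues up a task...
    return shared_executor.submit(pow, 2, 10)

def part_two():
    # ...and another, unrelated part uses the same pool.
    return shared_executor.submit(sum, [1, 2, 3])

f1, f2 = part_one(), part_two()
# Each caller just waits on its own future; neither cares
# which worker thread actually ran its task.
print(f1.result(), f2.result())
shared_executor.shutdown()
```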
Alternatively, you can have two executors in the same program with no problem. This has a performance cost—if you're using both executors at the same time, you'll end up trying to run (for example) 16 busy threads on 8 cores, which means there's going to be some context switching. But sometimes it's worth doing because, say, the two executors are rarely busy at the same time, and it makes your code a lot simpler. Or maybe one executor is running very large tasks that can take a while to complete, and the other is running very small tasks that need to complete as quickly as possible, because responsiveness is more important than throughput for part of your program.
If you don't know which is appropriate for your program, usually it's the first.
How often do we have to run a compute-heavy operation on a list of objects? Or read a list of files from a storage space like S3?
What is common between the two problems stated above? Long wait times!
What is different? One is a CPU-bound process and another is an I/O (input-output) bound process.
Yeah, I know! That is heavy. Let’s explore this in pieces.
What is a CPU?
CPU is the Central Processing Unit. That’s it!
Just kidding, let me explain a bit more. A CPU's work is measured in cycles (though performance depends on many other factors too, including the architecture): a cycle is roughly the time required to execute one simple processor operation. If you check the task manager on your system, you'll see multiple processes sharing the same CPU, for example Google Chrome, Slack, VSCode, etc. They make this work by taking turns: one program might run for 10 cycles, another for the next 5 cycles, a third for the next 8, and then the pattern repeats. All of this might seem like a lot of waiting around for the programs that aren't currently running, but guess what? The switching is far too fast to be visible to the human eye.
What is multi-processing?
Utilization of multiple processors to run a task. It is generally a perfect fit for CPU-bound operations. In our case, we would create processes and give each one an object from the list that we want to process. This leads to parallel processing and a faster response, but the same total amount of CPU time. Simply put, if we had 2 items on the list and it was taking 20 seconds before, with multiprocessing it will take approximately 10 seconds, while consuming the same number of CPU cycles, now divided between the two processes.
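As an illustration (not the article's code), here is a sketch of that two-item case, using time.sleep as a stand-in for the slow work and shortened so it runs quickly:

```python
import time
import concurrent.futures

def slow_square(n):
    # Simulates the per-item work (sleep stands in for real
    # CPU-bound computation; shortened to 0.1s for the demo).
    time.sleep(0.1)
    return n * n

items = [3, 4]

if __name__ == "__main__":
    start = time.perf_counter()
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
        results = list(executor.map(slow_square, items))
    elapsed = time.perf_counter() - start
    # With 2 workers the two items run in parallel, so wall-clock
    # time is roughly one item's duration, not the sum of both.
    print(results, f"{elapsed:.2f}s")
```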
Caveat: processes do not share memory, so it is harder to share a global variable between two processes, but there are solutions to this problem which are beyond the scope of this article.
What is multi-threading?
Multi-threading in the context of Python is similar to distributing work to different workers within the same process, but they cannot all run Python code at the same time. So, what is the advantage? If we are multithreading an I/O-bound task, then while the first worker waits to get its file from storage, the second worker can start its work, handing control back when the first worker is done waiting.
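A sketch of that hand-off, again using time.sleep as a stand-in for waiting on storage:

```python
import time
import threading

results = []
lock = threading.Lock()

def fetch_file(name):
    # While this worker sleeps (waits on "storage"), the GIL is
    # released and another worker can run.
    time.sleep(0.1)
    with lock:
        results.append(name)

names = ["a.png", "b.png", "c.png"]
threads = [threading.Thread(target=fetch_file, args=(n,)) for n in names]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# All three waits overlap, so the total is ~0.1s, not ~0.3s.
print(sorted(results), f"{elapsed:.2f}s")
```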
Let’s test out some of these concepts!
We are going to use the Python Module — concurrent.futures
It has ThreadPoolExecutor and ProcessPoolExecutor classes, which have the same interface and are subclasses of the Executor class.
The code samples were run on my local system having the following configuration.
System config: the system has 6 cores that can handle two threads each, hence 12 logical processors. Put simply, each core (working unit) can run two threads, but they do not run simultaneously; the threads are scheduled to share the cycles of the core.
Given below are two code samples with the same compute-intensive function, consisting of multiple transformations applied to an image. Yup, I couldn't think of any other use case, so I decided to do a bunch of stuff to an image for no apparent reason. Anyway, the code sample with normal execution runs in 456.551 sec, while with multiprocessing it completes in 236.995 sec. Here, ProcessPoolExecutor is used for multiprocessing with 61 parallel workers. There are a number of other parameters that can be used to customize the process; documentation can be found here.
[Code sample: the compute-intensive function run on a list of images with normal execution]
[Code sample: the compute-intensive function run on a list of images using multiprocessing]
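The article's actual image-transform code isn't included in this text. As a stand-in, a minimal sketch of the same pattern (a compute-heavy function mapped over a list with ProcessPoolExecutor) might look like this, with simple arithmetic replacing the real image transformations:

```python
import concurrent.futures

def transform(pixels):
    # Hypothetical stand-in for the image transformations:
    # some arithmetic applied to every "pixel" value.
    return [(p * 3) % 256 for p in pixels]

# Eight fake "images", each a flat list of pixel values.
images = [list(range(256)) for _ in range(8)]

if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
        transformed = list(executor.map(transform, images))
```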
Multithreading is implemented in a similar manner, but is needed for a different reason altogether: an I/O-bound process, as mentioned initially. Given below is an example of a multithreading implementation in which we are trying to read images from a folder on the local system. The total time taken for the task with multithreading is 0.3212 sec, while without multithreading it is 0.3732 sec, which is a noticeable difference even for a very small list. Here, we are also using partial functions, which are useful when we want to pass in a function with some of its arguments prefilled.
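The article's code isn't included here; below is a sketch of the pattern, with a hypothetical read_image helper and throwaway temp files standing in for the image folder. functools.partial pre-fills the folder argument, leaving a one-argument function that executor.map can apply to each filename:

```python
import concurrent.futures
import functools
import os
import tempfile

def read_image(folder, filename):
    # Stand-in for reading image bytes from storage.
    with open(os.path.join(folder, filename), "rb") as f:
        return f.read()

def read_all(folder, filenames):
    # partial pre-fills `folder`; executor.map then supplies
    # each filename as the remaining argument.
    reader = functools.partial(read_image, folder)
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
        return list(executor.map(reader, filenames))

# Demo with two throwaway files standing in for images.
demo_dir = tempfile.mkdtemp()
for name, data in [("a.png", b"aaa"), ("b.png", b"bbbb")]:
    with open(os.path.join(demo_dir, name), "wb") as f:
        f.write(data)
contents = read_all(demo_dir, ["a.png", "b.png"])
```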
This was just a very brief overview of what is happening under the hood and also how easily we can make our code run faster. While there is a lot more to this topic, I hope this helps you to get started. Here is a very good article covering the ins and outs of ThreadPoolExecutor.
Hi there, thanks a lot for taking the time to read the article. I am a Machine Learning Engineer currently exploring the MLOps world but also other industries. I use Medium to jot down my thoughts about topics that piqued my interest recently. You can connect with me on LinkedIn and I am always up for a quick chat :)
Can Python run multiple threads?
To recap, threading in Python allows multiple threads to be created within a single process, but due to GIL, none of them will ever run at the exact same time. Threading is still a very good option when it comes to running multiple I/O bound tasks concurrently.
How do you write multiple threads in Python?
To use multithreading, we need to import the threading module:

import threading

def print_hello(n):
    print("Hello, how old are you?", n)

t1 = threading.Thread(target=print_hello, args=(18,))
t1.start()
Can Python threads run on multiple cores?
Python is NOT a single-threaded language. Python processes typically use a single thread because of the GIL. Despite the GIL, libraries that perform computationally heavy tasks like numpy, scipy and pytorch utilise C-based implementations under the hood, allowing the use of multiple cores.
How do you run a for loop concurrently in Python?
Parallel for Loop in Python:
- Use the multiprocessing module to parallelize the for loop (CPU-bound work).
- Use the joblib module to parallelize the for loop.
- Use the asyncio module to run the loop iterations concurrently (I/O-bound work; concurrency rather than true parallelism).
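For instance, the multiprocessing option might look like this sketch, with a placeholder work function as the loop body:

```python
from multiprocessing import Pool

def work(n):
    # The body of the original for loop, turned into a function.
    return n * 2

numbers = list(range(10))

if __name__ == "__main__":
    # Pool.map distributes the loop iterations across processes
    # and collects the results in order.
    with Pool(processes=4) as pool:
        doubled = pool.map(work, numbers)
    print(doubled)
```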