Multithreaded Programming In Python

What is Threading?

A thread of execution represents the most diminutive sequence of programmed instructions, capable of autonomous management by a scheduler, often integrated within the operating system. In the previous chapter, you can study in detail about... Threading in Python

What is Multithreading ?

Multithreading constitutes a fundamental concept within software programming, widely embraced by various high-level programming languages. Multithreaded programs bear resemblance to their single-threaded counterparts, differing primarily in their capacity to accommodate multiple concurrent threads of execution. These threads share process resources while retaining the ability to operate autonomously.

This architecture enables a single process to concurrently manage diverse "functions," optimizing hardware utilization, especially across multiple cores or processors. For instance, within a multithreaded operating system, simultaneous tasks like logging file modifications, data indexing, and window management can be efficiently executed.

Basic Multithreading Example

import threading import time def print_numbers(): for i in range(5): print("Number:", i) time.sleep(1) # Introduce a small delay def print_letters(): for letter in 'abcde': print("Letter:", letter) time.sleep(1) if __name__ == "__main__": t1 = threading.Thread(target=print_numbers) t2 = threading.Thread(target=print_letters) t1.start() t2.start() t1.join() t2.join() print("Both threads have finished")

In this example, two threads are created using the threading.Thread class. Each thread runs a separate function, print_numbers() and print_letters(), and a delay is introduced using time.sleep(1) to visualize concurrency.

Thread Synchronization with Locks

Threads often share resources like variables or data. To avoid conflicts, locks are employed. Here's a simple counter example:

import threading counter = 0 counter_lock = threading.Lock() def increment_counter(): global counter for _ in range(100000): with counter_lock: counter += 1 if __name__ == "__main__": t1 = threading.Thread(target=increment_counter) t2 = threading.Thread(target=increment_counter) t1.start() t2.start() t1.join() t2.join() print("Final counter value:", counter)

The counter_lock ensures that only one thread can increment the counter at a time, avoiding race conditions.

ThreadPoolExecutor for Concurrent Execution

Python's concurrent.futures module simplifies thread management. The ThreadPoolExecutor class allows executing functions concurrently.

import concurrent.futures def process_data(data): return data * 2 if __name__ == "__main__": data_list = [1, 2, 3, 4, 5] with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor: results = executor.map(process_data, data_list) for result in results: print("Processed data:", result)

Here, the ThreadPoolExecutor manages concurrent execution of the process_data() function over the provided data list.

Downloads images from the web | Multithreading in Python

Here is an example of a multithreaded program that downloads images from the web. The program uses three threads to download the images in parallel.

import threading import requests import time def download_image(url): image = requests.get(url) with open(url.split("/")[-1], "wb") as f: f.write(image.content) def main(): threads = [] urls = ["https://upload.wikimedia.org/wikipedia/commons/thumb/a/a4/Ubuntu_logo_2010.svg/1200px-Ubuntu_logo_2010.svg.png", "https://upload.wikimedia.org/wikipedia/commons/thumb/f/f8/Python_logo_and_wordmark.svg/1200px-Python_logo_and_wordmark.svg.png", "https://upload.wikimedia.org/wikipedia/commons/thumb/0/0a/TensorFlow_logo.svg/1200px-TensorFlow_logo.svg.png"] for url in urls: thread = threading.Thread(target=download_image, args=[url]) threads.append(thread) thread.start() for thread in threads: thread.join() if __name__ == "__main__": main()

This program will download the three images in parallel. The download_image() function is executed in a separate thread for each image. The main() function starts the three threads and then waits for them to finish.

This program will be faster than a single-threaded program that downloads the images one at a time. The amount of speedup will depend on the number of cores on your CPU.

Conclusion

Multithreading in Python is a powerful technique to achieve parallelism and improve performance, particularly for tasks involving I/O-bound operations. However, it's essential to manage shared resources carefully to avoid issues like race conditions.