I"m trying to copy files in python , it'll take long time in sequential approach so I wanted to do it in multiple threads. Below is my code to copy files in sequential
for file in files:
shutil.copy(file,destination_path)
this took around 1.2 sec to complete I tried to implement multithreading as below
import multiprocessing
copy_instance(src,dest):
shutil.copy(src,dest)
for file in files:
p1 = multiprocessing.process(target=copy_instance,args=(file,destination_path)
p1.start()
this took 168.8 sec to complete
Doing it in multiprocessing taking way more time then sequential approach , what am I doing wrong? how can I correctly implement multithreading to speed up my copying process? any help or suggestion on this will be very helpful, thanks
CodePudding user response:
Spawning new processes is slow, particularly in windows. Instead of multiprocessing you should use multithreading, which in CPython can be good for I/O (and is bad for CPU bound processes):
import shutil
import itertools
from concurrent.futures import ThreadPoolExecutor
def copy_instance(args):
shutil.copy(*args)
# ... snipped definition of files and destination
with ThreadPoolExecutor() as pool:
pool.map(copy_instance, zip(files, itertools.repeat(destination)))
Nevertheless, as noted in the comments, depending on you storage you probably won't see any difference.
