fast copying of large files in python

Dave Angel d at davea.name
Wed Nov 2 17:26:16 EDT 2011


On 11/02/2011 03:31 PM, Catherine Moroney wrote:
> Hello,
>
> I'm working on an application that, as part of its processing, needs to 
> copy 50 Meg binary files from one NFS mounted disk to another.
>
> The simple-minded approach of shutil.copyfile is very slow, and I'm 
> guessing that this is due to the default 16K buffer size.  Using
> shutil.copyfileobj is faster when I set a larger buffer size, but it's
> still slow compared to a "os.system("cp %s %s" % (f1,f2))" call.
> Is this because of the overhead entailed in having to open the files
> in copyfileobj?
>
> I'm not worried about portability, so should I just do the
> os.system call as described above and be done with it?  Is that the
> fastest method in this case?  Are there any "pure python" ways of
> getting the same speed as os.system?
>
> Thanks,
>
> Catherine
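
For reference, the two approaches described above look roughly like
this (just a sketch: the 1 MB buffer size is only an illustrative
value, and subprocess is shown in place of os.system for the external
cp call):

import shutil
import subprocess

def copy_with_big_buffer(src, dst, buf_size=1024 * 1024):
    # Same idea as shutil.copyfile, but with an explicit, larger buffer.
    with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
        shutil.copyfileobj(fsrc, fdst, buf_size)

def copy_with_cp(src, dst):
    # Shell out to cp; subprocess passes the filenames as separate
    # arguments instead of interpolating them into a shell string.
    subprocess.check_call(['cp', src, dst])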

I suspect the fastest way would be to use scp, and not go through your 
local NFS mounts for either file.  When you give scp two remote paths it 
can connect the two machines directly to each other and transfer, so the 
data never even comes to your local machine.

I do that sort of thing when I'm connecting over a slow internet link 
(and VPN) to two machines that are on the same local subnet.
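
Roughly, from Python, with hypothetical hostnames and paths (a minimal
sketch; you'd substitute your own machines and need key-based ssh
authentication set up between them):

import subprocess

# user@host1 and user@host2 are placeholders for your two file servers.
src = 'user@host1:/export/data/file.bin'
dst = 'user@host2:/export/data/file.bin'

# With two remote arguments, scp has the source host send the data
# straight to the destination host, so it never passes through the
# machine running this script.
subprocess.check_call(['scp', src, dst])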

-- 

DaveA



