[Tutor] Efficient way to join files

Jeff Shannon jeff at ccvcorp.com
Wed Oct 1 13:25:10 EDT 2003


Bob Gailer wrote:
> At 07:39 PM 9/30/2003, Héctor Villafuerte D. wrote:
> 
>> Hi all,
>> I need to join multiple files.
>> Is there a more efficient way to do it than using
>> fileinput.input(file) and looping through those files?
>> Thanks in advance,
> 
> 
> On Windows? os.popen("copy sourcefile1 + sourcefile2 + ... 
> destinationfile").read()

I think that it would be a bit of a stretch to call this "more 
efficient", at least if one thinks at all about the implementation of 
this code.  (Maybe this just depends on one's definition of 
"efficiency"...)

You are proposing, here, to get the OS to read all the files and 
append them to another (new) file, and then read in the new file. 
That's, at a minimum, an extra hard drive read and hard drive write 
for each of those files.  Using fileinput, or otherwise looping 
through a list of files, doesn't do that.  This solution may take 
fewer lines of code, but line count isn't necessarily related to real 
efficiency in any meaningful way.
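
For comparison, here is a minimal sketch of the looping approach.  The 
function name count_lines and the per-line work (just counting) are 
placeholders chosen for illustration, not anything from the original 
question.  The point is only that each file is read once and no 
combined file is ever written to disk.

```python
import fileinput


def count_lines(paths):
    """Read every file in `paths` once, in order, and count the lines.

    `paths` is assumed to be a non-empty list of file names; counting
    is a stand-in for whatever per-line work is really needed.
    """
    total = 0
    # fileinput.input() chains the files into one stream of lines;
    # nothing is copied into an intermediate file along the way.
    with fileinput.input(files=paths) as stream:
        for _ in stream:
            total += 1
    return total
```

Something like count_lines(["part1.txt", "part2.txt"]) would then walk 
both (hypothetical) files in order.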

I would also say that it's clearer to state, in code, "take this 
list of files and process them one at a time, with this procedure..." 
than to state "get someone else to collect all of these files and copy 
them into a new one, and then take that new file and do this..."

It is simplest to write a procedure that pretends to be dealing with 
only a single file, to be sure.  However, I think that the correct 
abstraction is to have a function that is passed a single file, and 
that deals with multiple files by getting them passed to it in 
sequence, rather than to write something that handles multiple files 
by squeezing them all together into an undifferentiated mass.
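
As a rough sketch of that abstraction, assuming the per-file work is 
simply copying lines to an output file (which also happens to join 
the files): process_file and join_files are hypothetical names, and 
the files are assumed to be text.

```python
def process_file(infile, outfile):
    """Deal with exactly one open file: copy its lines to outfile.

    The copy is a placeholder; any per-file procedure could go here.
    """
    for line in infile:
        outfile.write(line)


def join_files(paths, dest_path):
    """Handle multiple files by passing each one to process_file in turn."""
    with open(dest_path, "w") as outfile:
        for path in paths:
            with open(path) as infile:
                process_file(infile, outfile)
```

process_file never needs to know how many files there are; only the 
loop in join_files does.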

Jeff Shannon
Technician/Programmer
Credit International



