building a large file

Alex Martelli aleaxit at yahoo.com
Sat Apr 14 06:37:51 EDT 2001


"Thomas Duterme" <thomas at madeforchina.com> wrote in message
news:mailman.987235759.19334.python-list at python.org...
> Hi everyone,
>
> So I need to build a very large file.  Essentially, I need
> to go through a directory and recursively append the
> contents of each file from that directory to a file.
>
> Here's what I'm doing right now:
>
> for x in os.listdir('.'):
>     os.system('cat '+x+' >> mylargefile')
>
> Is there any smarter way to do this in python?

Your approach is very simple, which is a plus, but
it can have problems if "mylargefile" is also in the
current directory, and spawning a shell process for
each file makes performance uncertain.

An alternative might be:

bufsize = 1024*1024
fileob = open(mylargefile, 'ab')
for x in os.listdir('.'):
    if x != mylargefile:
        x = open(x, 'rb')
        while 1:
            data = x.read(bufsize)
            if not data: break
            fileob.write(data)
        x.close()
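
As a hedged aside, the standard library's shutil.copyfileobj
does the same buffered read/write loop for you; a minimal
sketch along those lines (the function name and the skipping
of subdirectories are my own additions, not part of the
original recipe):

```python
import os
import shutil

def concat_files(dirname, outname, bufsize=1024 * 1024):
    """Append the contents of every regular file in dirname to outname."""
    with open(outname, 'ab') as out:
        for name in os.listdir(dirname):
            path = os.path.join(dirname, name)
            # skip the output file itself and anything that isn't a plain file
            if name == outname or not os.path.isfile(path):
                continue
            with open(path, 'rb') as src:
                # copyfileobj performs the chunked read/write loop internally
                shutil.copyfileobj(src, out, bufsize)
```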

The amount of 'sophistication' (and complexity)
can be tweaked, of course.  If each of the files
you're reading will fit comfortably in memory, for
example, there is no need to read each in 1-MB
slices -- and if you're willing to rely on automatic
closure on object destruction, this simplifies to:

fileob = open(mylargefile, 'ab')
for x in os.listdir('.'):
    if x != mylargefile:
        fileob.write(open(x, 'rb').read())

On the other hand, the "x != mylargefile" check may
be redundant (if you know mylargefile is not in
the current directory) or not careful enough (if
the mylargefile string names a file in the current
directory but spells it with a path prefix,
e.g. "./foo.dat", the plain string comparison fails).
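
One way to make that comparison robust to spelling
differences like "./foo.dat" is to compare canonical
paths via os.path.abspath (the helper name below is
hypothetical, just for illustration):

```python
import os

def is_output_file(entry, outpath):
    """True if directory entry refers to the same file as outpath,
    regardless of how either path happens to be spelled."""
    # abspath normalizes './', '../', and resolves against the cwd
    return os.path.abspath(entry) == os.path.abspath(outpath)
```

On platforms that support it, os.path.samefile goes
further and compares the actual files (following
symlinks), not just the path strings.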


Alex
More information about the Python-list mailing list