slow joinings of strings

Fredrik Lundh fredrik at effbot.org
Tue Jan 30 07:58:00 EST 2001


Karol Bryd wrote:
> I want to read a file (0.6MB, 10000 lines) into memory, and want to do it as
> fast as possible, this code does it, but is terribly slow
>
> fp = open(file, 'r')
> s = ''
> while 1:
>         line = fp.readline()
>         if line == '': break
>         s = s + line
>
> (executing time 25 sec)
>
> At first I thought that this is caused by readline() and lack of buffering
> but after removing "s = s + line" executing time decreased to 0.7 seconds!
> The question is how to join two strings in a more efficient way?
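(The slowdown comes from the fact that each `s = s + line` copies the entire accumulated string, so the total work grows quadratically with the number of lines. A rough illustration of the copying cost, with arbitrary sample sizes:)

```python
# Each s = s + line copies everything accumulated so far, so the
# bytes copied grow quadratically with the number of lines.
lines = ["x" * 10] * 1000   # arbitrary sample data: 1000 lines of 10 chars

copied = 0
s = ""
for line in lines:
    s = s + line
    copied += len(s)        # characters touched by this concatenation

# s ends up only 10000 characters long, but about 5 million
# characters were copied along the way.
```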

if you know your files won't be larger than a couple
of megabytes, you can use readlines instead:

    for line in fp.readlines():
        ...

(in 2.1, "for line in fp.xreadlines()" is nearly as efficient,
and won't run out of memory no matter how large your
file is)
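(in today's Python you don't even need xreadlines: iterating over the file object directly yields lines lazily, with the same constant-memory behavior. A self-contained sketch, using a throwaway temp file as placeholder data:)

```python
import os
import tempfile

# Write a small sample file so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "w") as fp:
    fp.write("alpha\nbeta\ngamma\n")

# Iterating the file object yields one line at a time, like
# xreadlines: memory use stays constant however large the file is.
count = 0
with open(path) as fp:
    for line in fp:
        count += 1
```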

to speed up the string concatenation, use "join":

    import string
    L = []
    for line in fp.readlines():
        ...
        L.append(line)
    s = string.join(L, "")
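(applied to the original loop, the append-then-join pattern looks like this; note that in current Python the string module is no longer needed, and `"".join(L)` is the equivalent spelling. The file name below is just placeholder data for the sketch:)

```python
import os
import tempfile

# Build a small sample file standing in for the original 10000-line file.
path = os.path.join(tempfile.mkdtemp(), "input.txt")
with open(path, "w") as fp:
    fp.write("line one\nline two\nline three\n")

# Collect the lines in a list and join once at the end: the total
# cost is linear in the file size, instead of the quadratic cost
# of repeated s = s + line.
parts = []
with open(path) as fp:
    for line in fp:
        parts.append(line)
s = "".join(parts)   # modern spelling of string.join(parts, "")
```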

(there's a FAQ entry with more performance tips and
tricks. www.python.org => FAQ)

Cheers /F

More information about the Python-list mailing list