urllib slow on FreeBSD 4.7? sockets too
Mike Brown
mike at skew.org
Sat Nov 23 05:49:03 EST 2002
> I'm not sure it's a good experiment to eliminate too many things at once.
> I.e., how do you know how much you gained by going to os.read and how much
> you gained by buffering via bytea.append instead of having urllib do it
> internally?
If I use a string or buffer object with "+=" instead of an array,
performance drops significantly:
import urllib, time, os

starttime = time.time()
u = urllib.urlopen('http://localhost/4MBfile')
fn = u.fp.fileno()
bytes = 1
allbytes = ''
while bytes:
    bytes = os.read(fn, 16 * 1024)
    allbytes += bytes
u.close()
endtime = time.time()
elapsed = endtime - starttime
length = len(allbytes)
print "bytes: %.1fK; time: %0.3fs (%0d KB/s)" % (length / 1024.0, elapsed,
    length / 1024.0 / elapsed)
bytes: 4241.5K; time: 5.809s (730 KB/s)
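(For context, not from the thread above: each "allbytes += bytes" on an
immutable string re-copies everything accumulated so far, so the loop as a
whole does quadratic work. A minimal, self-contained illustration of the two
patterns, using in-memory data in place of the HTTP fetch; written in modern
Python, sizes chosen just for the demo:)

```python
CHUNK = b'x' * (16 * 1024)       # stands in for one 16 KB os.read()
N = 64                           # 1 MB total, kept small for the demo

# Quadratic pattern: each += copies the whole accumulated string again.
acc = b''
for _ in range(N):
    acc += CHUNK

# Linear pattern: collect chunks, copy everything exactly once at the end.
parts = []
for _ in range(N):
    parts.append(CHUNK)          # O(1); earlier chunks are never recopied
joined = b''.join(parts)

assert acc == joined
assert len(joined) == N * len(CHUNK)
```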
If I use cStringIO, it's much better, but still about 2 to 5 MB/s slower
than Jarkko's byte array:
import urllib, time, os, cStringIO

starttime = time.time()
u = urllib.urlopen('http://localhost/4MBfile')
fn = u.fp.fileno()
bytes = 1
allbytesf = cStringIO.StringIO()
while bytes:
    bytes = os.read(fn, 16 * 1024)
    allbytesf.write(bytes)
u.close()
allbytes = allbytesf.getvalue()
allbytesf.close()
endtime = time.time()
elapsed = endtime - starttime
length = len(allbytes)
print "bytes: %.1fK; time: %0.3fs (%0d KB/s)" % (length / 1024.0, elapsed,
    length / 1024.0 / elapsed)
bytes: 4241.5K; time: 0.419s (10127 KB/s)
So the approach of appending chunks to an array and joining them with
''.join() afterward (which is surprisingly fast) seems to be the clear
winner. On to the sockets...
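(Jarkko's code isn't quoted in this message; the following is just a sketch
of the append-then-join read loop, simulating the socket with an in-memory
buffer where the real version would call os.read(u.fp.fileno(), 16 * 1024).
Shown in modern Python:)

```python
import io

# Stand-in for the HTTP response body; the real loop reads the socket.
source = io.BytesIO(b'x' * (4 * 1024 * 1024))

chunks = []
while True:
    chunk = source.read(16 * 1024)
    if not chunk:
        break
    chunks.append(chunk)          # O(1) per chunk; nothing is recopied

allbytes = b''.join(chunks)       # one final allocation and copy
assert len(allbytes) == 4 * 1024 * 1024
```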
More information about the Python-list mailing list