bz2 & cpu usage
Brad Tilley
bradtilley at gmail.com
Wed Oct 20 11:45:53 EDT 2004
Kirk Job-Sluder wrote:
> Sorry for the late post, the original scrolled off the server.
>
> > I'd like to keep at least 50% of the cpu free while doing bz2 file
> > compression. Currently, bz2 compression takes between 80 & 100 percent
> > of the cpu and the Windows GUI becomes almost useless. How can I lower
> > the strain on the cpu and still do compression? I'm willing for the
> > compression process to take longer.
> >
> > Thanks,
> >
> > Brad
> >
> > def compress_file(filename):
> >     path = r"C:\repository_backup"
> >     print path
> >     for root, dirs, files in os.walk(path):
> >         for f in files:
> >             if f == filename:
> >                 print "Compressing", f
> >                 x = file(os.path.join(root, f), 'rb')
> >                 os.chdir(path)
> >                 y = bz2.BZ2File(f + ".bz2", 'w')
> >                 while True:
> >                     data = x.read(1024000)
> >                     time.sleep(0.1)
> >                     if not data:
> >                         break
> >                     y.write(data)
> >                     time.sleep(0.1)
> >                 y.close()
> >                 x.close()
> >             else:
> >                 return
>
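For readers on a current interpreter, here is a minimal Python 3 sketch of the quoted function. It is not the original poster's code. The `path`, `chunk_size` and `pause` parameters and the returned list are assumptions added for illustration. It also differs from the quoted code in two ways: it skips non-matching files instead of returning at the first one, and it writes each archive next to its source file rather than changing into the top-level directory.

```python
import bz2
import os
import time


def compress_file(filename, path, chunk_size=1024000, pause=0.1):
    """Compress every file called `filename` found under `path`.

    Returns the list of .bz2 paths written. `pause` is the number of
    seconds to sleep after each chunk, which trades run time for a
    lower average CPU load.
    """
    written = []
    for root, dirs, files in os.walk(path):
        for f in files:
            if f != filename:
                continue  # keep scanning instead of returning early
            src = os.path.join(root, f)
            dst = src + ".bz2"
            print("Compressing", src)
            # Context managers close both files even if a write fails.
            with open(src, "rb") as x, bz2.BZ2File(dst, "w") as y:
                while True:
                    data = x.read(chunk_size)
                    if not data:
                        break
                    y.write(data)
                    time.sleep(pause)
            written.append(dst)
    return written
```

A call such as `compress_file("backup.dat", r"C:\repository_backup")` would mirror the original usage; the file name there is only a placeholder.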
> One of the issues you may be running into is memory. Under Windows,
> using up 90% of the CPU shouldn't affect GUI performance (much) but
> swapping does. According to the bzip2 man page, the maximum block size
> is 900KB so you might be running into problems reading your file 1024KB
> at a time. Use the system monitor control panel to check for excessive
> swapping. Bzip2 uses 8x<blocksize> memory. So with the default setting
> of a 900KB block size, you are looking at 7.2MB + some bookkeeping memory.
>
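The memory arithmetic above can be sketched in a few lines. The bzip2 manual gives compression memory as roughly 400 kB plus eight times the block size, and the block size is the compression level (1 to 9) times 100 kB. The function name below is hypothetical, and the result is an estimate for the compressor alone, not for the whole Python process.

```python
import bz2


def bzip2_compress_memory_bytes(level):
    """Approximate compressor memory for a bzip2 level from 1 to 9."""
    if not 1 <= level <= 9:
        raise ValueError("level must be between 1 and 9")
    block_size = level * 100_000      # bytes per block at this level
    return 400_000 + 8 * block_size   # fixed overhead + 8 x block size


# A lower level means smaller blocks and a smaller working set,
# usually at the cost of a somewhat larger archive.
sample = b"some repetitive payload " * 4096
small_blocks = bz2.compress(sample, compresslevel=1)
large_blocks = bz2.compress(sample, compresslevel=9)
```

If swapping really is the bottleneck, passing a lower `compresslevel` to `bz2.BZ2File` is one cheap thing to try before changing anything else.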
> Another issue is that you might be better off downloading bzip2 for
> Windows and letting the standalone bzip2 program handle file input and
> output. Using a shell command here might be more efficient in spite of
> spawning a new process.
>
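A rough sketch of the shell-command route, assuming a `bzip2` executable is installed and on the PATH. The helper names are hypothetical. Starting the child at below-normal priority on Windows is an extra that nobody in this thread suggested; it uses `subprocess.BELOW_NORMAL_PRIORITY_CLASS`, which exists only on Windows with Python 3.7 or later.

```python
import shutil
import subprocess
import sys


def build_bzip2_command(path, level=9, keep=True):
    """Return the argument list for compressing `path` with bzip2."""
    cmd = ["bzip2", "-%d" % level]
    if keep:
        cmd.append("-k")  # -k keeps the input file instead of deleting it
    cmd.append(path)
    return cmd


def run_bzip2(path, level=9):
    """Run the external bzip2 program on `path` and wait for it."""
    if shutil.which("bzip2") is None:
        raise FileNotFoundError("bzip2 executable not found on PATH")
    kwargs = {}
    if sys.platform == "win32":
        # Let the GUI and other foreground work win the scheduler.
        kwargs["creationflags"] = subprocess.BELOW_NORMAL_PRIORITY_CLASS
    subprocess.run(build_bzip2_command(path, level), check=True, **kwargs)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.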
> A third issue is that bzip2 achieves high compression efficiency at the
> expense of CPU time and memory. It might be worth considering whether
> gzip might occupy the sweet spot compromise between minimal archive size
> and minimal cpu usage.
>
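Whether gzip hits the sweet spot depends on the data, so it may be worth measuring on a sample of the real files. A small sketch follows; `compare_codecs` is a hypothetical helper, and the timings it reports are wall-clock figures that will vary from machine to machine.

```python
import bz2
import gzip
import time


def compare_codecs(data):
    """Return {codec name: (compressed size in bytes, seconds taken)}."""
    results = {}
    for name, compress in (("gzip", gzip.compress), ("bz2", bz2.compress)):
        start = time.perf_counter()
        out = compress(data)
        results[name] = (len(out), time.perf_counter() - start)
    return results
```

Reading a representative backup file into `data` and printing the result gives a quick size-versus-time comparison for that kind of content.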
> Fourth, how many of those files are incompressible? I've noticed that
> bzip2 tries really hard to eke out some form of savings from
> incompressible files. A filename filter for files that should not be
> compressed (png, jpg, gif, sx*) might be worth doing here.
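One possible shape for such a filter, using `fnmatch` from the standard library. The pattern list just transcribes the extensions named above; the names `SKIP_PATTERNS` and `should_compress` are hypothetical.

```python
import fnmatch

# Formats that are already compressed, so bzip2 gains little on them.
SKIP_PATTERNS = ("*.png", "*.jpg", "*.gif", "*.sx*")


def should_compress(filename):
    """Return False for names matching an already-compressed format."""
    name = filename.lower()  # compare case-insensitively on every platform
    return not any(fnmatch.fnmatchcase(name, p) for p in SKIP_PATTERNS)
```

Checking `should_compress(f)` before opening each file lets the loop skip those files, or copy them unchanged.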
Thanks for the tips. I installed 512MB of ECC RAM and the problem went away.