looking for speed-up ideas

Ram Bhamidipaty ramb at sonic.net
Tue Feb 4 01:21:49 EST 2003


John La Rooy <nospampls.jlr at doctor.com> writes:
> On 4 Feb 2003 01:22:32 GMT
> William Park <opengeometry at yahoo.ca> wrote:
> 
> > 
> > Behold:
> >     egrep '^F' dumpfile | sort -t '/' -n -k 2,2 | tail -200
> > 
> > How fast does it run?
> 
> The 'sort' is going to kill performance here. Probably ok on 300k lines
> but 2.2M is probably pushing it.


And that 2.2M is with the "compressed" file names. If I just created a
file with the full path names, the file would be a lot larger. By
putting a directory identifier on each F line I don't have to create
an enormous file.

The other part is that I want one script that can output the data in a
variety of formats, such as a report of the directories whose files are
using up most of the disk space.
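For that per-directory report, here is a rough sketch of how the
aggregation could look. The exact dumpfile layout isn't shown in this
thread, so the "F/<size>/<path>" parsing below is only an assumption
taken from the sort -t '/' -n -k 2,2 field usage quoted above:

    from collections import defaultdict

    usage = defaultdict(int)               # bytes used per directory
    with open('dumpfile') as dump:
        for line in dump:
            if not line.startswith('F'):
                continue
            # assumed layout: F/<size>/<path> -- adjust to the real format
            _, size, path = line.rstrip('\n').split('/', 2)
            directory = path.rpartition('/')[0] or '.'
            usage[directory] += int(size)

    # the 20 directories holding the most data
    for directory, total in sorted(usage.items(), key=lambda kv: kv[1],
                                   reverse=True)[:20]:
        print(total, directory)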

OK, I tried the egrep pipe. For 300k lines it ran in about 8.5 seconds.

Hmm. That's nice -- and maybe the solution to this problem is to use
some kind of Python wrapper around those tools.
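If a wrapper turns out to be the way to go, something like this would
do it with today's subprocess module (a sketch only; 'dumpfile' is just
the file name from the pipeline quoted above):

    import subprocess

    # Let the shell tools do the heavy lifting; Python just collects the result.
    pipeline = "egrep '^F' dumpfile | sort -t '/' -n -k 2,2 | tail -200"
    result = subprocess.run(pipeline, shell=True, capture_output=True,
                            text=True, check=True)
    top_lines = result.stdout.splitlines()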

BUT - it would be nice and cool if Python could approach that kind of
speed. Any ideas?
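One pure-Python idea, sketched under the same assumption about the line
layout: since only the top 200 lines are wanted, there is no need to
sort all 2.2M lines. heapq.nlargest streams the file once and keeps
only a 200-element heap:

    import heapq
    import sys

    def numeric_key(line):
        # numeric second '/'-separated field, mirroring sort -t '/' -n -k 2,2
        return int(line.split('/')[1])

    with open('dumpfile') as dump:
        f_lines = (line for line in dump if line.startswith('F'))
        top200 = heapq.nlargest(200, f_lines, key=numeric_key)

    # nlargest returns the biggest entries first, unlike the ascending
    # output of sort | tail
    for line in top200:
        sys.stdout.write(line)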

-Ram
