looking for speed-up ideas
William Park
opengeometry at yahoo.ca
Mon Feb 3 22:32:08 EST 2003
Andrew Dalke <adalke at mindspring.com> wrote:
> William Park wrote:
>> Behold:
>> egrep '^F' dumpfile | sort -t '/' -n -k 2,2 | tail -200
>>
>> How fast does it run?
>
> That was my first thought too. The problem is that it doesn't
> keep track of the directory names, which is needed to display
> the full path name, which I believe he dumps in
>
>     for t in all_file_list:
>         print t[2], t[1], get_dir_name(t[3])
>
> It's too bad he didn't include example output.
In that case, generate another file with full pathnames. Given a dumpfile like
T /remote 0
S/name/0/1
S/joe/1/2
S/bob/1/3
F/3150900/big_file.tar.gz
S/testing/3/4
F/414/.envrc
F/276/BUILD_FLAGS
F/36505/make.incl
F/3861/build_envrc
Let's see, using '@' as the pathname separator (since '/' is already the field separator)...
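awk 'BEGIN { dir[0] = "remote"; OFS = "/"; FS = "/" }
    # S/name/parent/id: record the full path of directory id and make it current
    $1 ~ /^S$/ { $2 = dir[$4] = pwd = dir[$3] "@" $2 }
    # F/size/name: prefix the file name with the current directory path
    $1 ~ /^F$/ { $3 = pwd "@" $3; print }
' dumpfile | sort -t '/' -n -k 2,2 | tail -200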
Python translation is left as homework for the OP.
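For reference, here is one possible Python 3 sketch of the same idea. It assumes the dump format shown above (S/name/parent-id/own-id and F/size/name), and the function name largest_files and its parameters are made up for illustration. It uses heapq.nlargest instead of a full sort, which is a substitution for the sort | tail step, not a literal translation.

```python
import heapq


def largest_files(lines, n=200, root="remote", sep="@"):
    """Return the n largest (size, path) pairs, smallest first.

    Hypothetical helper: assumes S/name/parent-id/own-id and F/size/name
    records, as in the sample dump.
    """
    dirs = {"0": root}  # directory id -> full path, seeded like dir[0] in awk
    # Files seen before any S line go under the root here
    # (the awk script leaves pwd empty in that case).
    pwd = root
    files = []
    for line in lines:
        fields = line.rstrip("\n").split("/")
        if fields[0] == "S" and len(fields) >= 4:
            name, parent, ident = fields[1:4]
            # Like awk, an unknown parent id yields an empty prefix.
            pwd = dirs[ident] = dirs.get(parent, "") + sep + name
        elif fields[0] == "F" and len(fields) >= 3:
            # Assumes the size field is a plain integer.
            size = int(fields[1])
            files.append((size, pwd + sep + "/".join(fields[2:])))
    # Keep only the n largest, then order ascending like sort | tail.
    return sorted(heapq.nlargest(n, files))
```

To get output in the same shape as the pipeline, open dumpfile, pass the file object to largest_files, and print each pair as "F/%d/%s" % (size, path). Ties in size may come out in a different order than sort gives.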
--
William Park, Open Geometry Consulting, <opengeometry at yahoo.ca>
Linux solution for data management and processing.