How to remove subset from a file efficiently?

bonono at gmail.com bonono at gmail.com
Sat Jan 14 02:17:05 EST 2006


fynali wrote:
> $ cat cleanup_ray.py
>     #!/usr/bin/python
>     import itertools
>
>     b = set(file('/home/sajid/python/wip/stc/2/CBR0000333'))
>
>     file('PSP-CBR.dat,ray','w').writelines(itertools.ifilterfalse(b.__contains__,file('/home/sajid/python/wip/stc/2/PSP0000333')))
>
>     --
>     $ time ./cleanup_ray.py
>
>     real    0m5.451s
>     user    0m4.496s
>     sys     0m0.428s
>
> (-: Damn!  That saves a bit more time!  Bravo!
>
> Thanks to you Raymond.
Have you tried the explicit loop variant with psyco? In my experience,
psyco is pretty good at optimizing for loops, which usually results in
faster code than even the built-in map/filter variants.

Though it would only be a 1 or 2 second difference (given what you
already have), so it may not be important, but it could be fun.
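
For reference, here is a minimal sketch of the explicit loop variant I
mean. The function name remove_subset and its three path parameters are
made up for illustration. psyco.bind() is only attempted if psyco happens
to be importable (it only ever ran on older 32-bit Python 2 builds), so
anywhere else this is just the plain loop. I have not timed it, so no
claim about how it compares with the ifilterfalse version.

```python
def remove_subset(subset_path, full_path, out_path):
    """Copy lines of full_path that are not in subset_path to out_path.

    Hypothetical helper for illustration; returns the number of lines kept.
    """
    # Load the lines to drop into a set for fast membership tests.
    with open(subset_path) as subset_file:
        unwanted = set(subset_file)

    kept = 0
    with open(full_path) as src, open(out_path, "w") as dst:
        # The explicit loop: this is the part psyco would specialize.
        for line in src:
            if line not in unwanted:
                dst.write(line)
                kept += 1
    return kept


try:
    # Assumption: psyco is installed. It is not available on modern Pythons.
    import psyco
    psyco.bind(remove_subset)
except ImportError:
    # Without psyco, remove_subset runs as ordinary interpreted code.
    pass
```

You would call it with the CBR file as subset_path and the PSP file as
full_path. Lines are compared with their trailing newlines, the same as
in the set(file(...)) version.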



