How to remove subset from a file efficiently?

bonono at gmail.com
Sat Jan 14 04:01:52 EST 2006


fynali wrote:
> $ cat cleanup_use_psyco_and_list_compr.py
>     #!/usr/bin/python
>
>     import psyco
>     psyco.full()
>
>     postpaid_file = open('/home/sajid/python/wip/stc/2/PSP0000333')
>     outfile = open('/home/sajid/python/wip/stc/2/PSP-CBR.dat.psyco', 'w')
>
>     barred = {}
>
>     for number in open('/home/sajid/python/wip/stc/2/CBR0000333'):
>         barred[number] = None # just add it as a key
>
>     outfile.writelines([number for number in postpaid_file if number not in barred])
>
>     postpaid_file.close(); outfile.close()
>
>     --
>     $ time ./cleanup_use_psyco_and_list_compr.py
>
>     real    0m39.638s
>     user    0m5.532s
>     sys     0m0.868s
>
> This was run on my machine (w/ Python 2.4.1), can't install psyco on
> the actual server at the moment.
>
> I guess using generators & newer Python is indeed faster|better.
>
> --
> fynali
Hmm, strange. So psyco is slower than not using it?

You could try expanding the list comprehension into a plain loop, so each
line is written as it is read instead of first being collected into a list:

for number in postpaid_file:
    if number not in barred:
        outfile.write(number)
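
For what it's worth, here is a minimal sketch of the same loop as a
self-contained function. It assumes a newer Python than the 2.4.1 mentioned
above (it uses the with statement and the built-in set), and the function
name and its three path parameters are made up for illustration. Like the
original script, it compares raw lines, newline included.

```python
def remove_barred(postpaid_path, barred_path, out_path):
    """Copy postpaid_path to out_path, skipping every line found in barred_path.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns the number of lines written.
    """
    # A set gives the same fast membership test as the dict-with-None-values
    # trick, without needing dummy values.
    with open(barred_path) as barred_file:
        barred = set(barred_file)

    kept = 0
    with open(postpaid_path) as postpaid_file, open(out_path, "w") as outfile:
        # Stream line by line; no intermediate list is built.
        # Lines are compared verbatim (trailing newline included), so a final
        # line without a newline will not match its newline-terminated twin.
        for number in postpaid_file:
            if number not in barred:
                outfile.write(number)
                kept += 1
    return kept
```

Going by the timings quoted above (real 0m39s against roughly 6s of user
plus sys), most of the wall-clock time looks like waiting on I/O rather than
CPU, so I would not expect psyco to change much either way.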



