Efficient grep using Python?

Thu Dec 16 09:28:21 EST 2004

Christos TZOTZIOY Georgiou wrote:
> On Wed, 15 Dec 2004 16:10:08 +0000, rumours say that P at draigBrady.com
> might have written:
> 
> 
>>>Essentially, I want to do an efficient grep, i.e. from A remove those lines
>>>which are also present in file B.
>>
>>You could implement this elegantly using the new sets feature.
>>For reference, here is the Unix way to do it:
>>
>>sort a b b | uniq -u
> 
> 
> No, like I just wrote in another post, he wants
> 
> $ grep -vf B A
> 
> I think that
> 
> $ sort A B B | uniq -u
> 
> can be abbreviated to
> 
> $ sort -u A B B
> 
> which is the union rather than the intersection of the files

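Wrong. Notice the -u option to uniq: it prints only the lines that are not
repeated, so every line that occurs in B (which is listed twice) is dropped.
That is not the same as sort -u, which keeps one copy of every distinct line.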
http://marc.free.net.ph/message/20021101.043138.1bc24964.html
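
To make the difference concrete, here is a minimal Python sketch that emulates
the two pipelines on in-memory lists. The function names and the sample data
are made up for illustration, and sorted() orders by code point rather than by
locale as sort(1) may, so treat it as a sketch of the semantics only.

```python
from collections import Counter

def sort_uniq_u(*files):
    """Emulate `sort files... | uniq -u`: keep lines that occur exactly once."""
    counts = Counter(line for lines in files for line in lines)
    return sorted(line for line, n in counts.items() if n == 1)

def sort_u(*files):
    """Emulate `sort -u files...`: keep one copy of every distinct line."""
    return sorted({line for lines in files for line in lines})

# Hypothetical sample contents of files A and B.
a = ["apple", "banana", "cherry"]
b = ["banana", "date"]

difference = sort_uniq_u(a, b, b)  # lines of A that are not in B
union = sort_u(a, b, b)            # every distinct line of A and B

print(difference)
print(union)
```

Note that a line duplicated within A itself would also be dropped by the
uniq -u variant, so this only matches "A minus B" when A has no repeated lines.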

> wastes some time by considering B twice

I challenge you to a benchmark :-)

> and finally destroys original line
> order (should it be important).

true
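
If the original order does matter, the sets approach mentioned earlier in the
thread handles it. A minimal sketch, assuming whole-line exact matching (which
corresponds to grep -vxFf B A rather than plain grep -vf B A, since the latter
treats each line of B as a pattern); remove_lines is a hypothetical helper
name:

```python
def remove_lines(a_lines, b_lines):
    """Return the lines of A that are not in B, keeping A's original order."""
    exclude = set(b_lines)  # set membership tests are O(1) on average
    return [line for line in a_lines if line not in exclude]

# Hypothetical sample data; with real files one would pass open file objects
# instead, e.g. remove_lines(open("A"), open("B")).
a = ["cherry", "apple", "banana", "apple"]
b = ["banana"]

kept = remove_lines(a, b)
print(kept)
```

Unlike the sort-based pipeline, this reads B only once, keeps duplicates in A,
and needs memory proportional to the size of B.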

-- 
Pádraig Brady - http://www.pixelbeat.org