Match beginning of two strings

Scott David Daniels Scott.Daniels at Acm.Org
Sun Aug 3 00:18:32 EDT 2003


Ravi wrote:
> Hi,
> 
> I have about 200GB of data that I need to go through and extract the 
> common first part of a line. Something like this.
> 
>  >>>a = "abcdefghijklmnopqrstuvwxyz"
>  >>>b = "abcdefghijklmnopBHLHT"
>  >>>c = extract(a,b)
>  >>>print c
> "abcdefghijklmnop"
> 
> Here I want to extract the common string "abcdefghijklmnop". Basically I 
> need a fast way to do that for any two given strings. For my situation, 
> the common string will always be at the beginning of both strings. I can 
> use regular expressions to do this, but from what I understand there is 
> a lot of overhead. New data is being generated at the rate of about 1GB 
> per hour, so this needs to be reasonably fast while leaving CPU time for 
> other processes.
> 
> Thanks
> Ravi
> 
While you can be forgiven for not having guessed, os.path is the place to
look. Despite living in a path module, commonprefix compares plain strings
character by character:
	import os.path
	a = "abcdefghijklmnopqrstuvwxyz"
	b = "abcdefghijklmnopBHLHT"
	# print(...) with a single argument works on both old and new Pythons
	print(os.path.commonprefix([a, b]))   # abcdefghijklmnop
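
If you want the extract(a, b) call from your example, a thin wrapper is
probably all you need. This is only a sketch: the name extract comes from
your post, and I have not timed it against your data, so measure before
relying on it at 1GB per hour.

```python
import os.path

def extract(a, b):
    """Return the longest common leading substring of a and b."""
    # commonprefix takes a list of strings and compares them character
    # by character; it does not care whether they are really paths.
    return os.path.commonprefix([a, b])

a = "abcdefghijklmnopqrstuvwxyz"
b = "abcdefghijklmnopBHLHT"
c = extract(a, b)
print(c)  # abcdefghijklmnop
```

When the two strings share nothing at the start, the result is simply the
empty string, so there is no special case to handle.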

-Scott David Daniels
Scott.Daniels at Acm.Org




