[TriZPUG] [triangle-zpug] how to match strings in python

Chris Calloway cbc at unc.edu
Fri Apr 3 00:55:12 CEST 2009


On 4/2/2009 4:19 PM, Chris Rossi wrote:
 > brad.crittenden at gmail.com> wrote:
 >> Not enough testing!
 >> Dang, my pronouncement was premature.  I think Jay's would behave
 > similarly, too.

Oh, crap. I assumed Jay's worked for all cases, too. Doh!

OK, so I think I have it. Back in the middle ages of TriZPUG, Adam Hupp 
taught me this hack from ActiveState:

http://starship.python.net/pipermail/triangle-zpug/2005-June/000660.html

(This is why we still need those archives :)

And I'm going to cheat and use that un-Pythonic hack now along with the 
shortened form of the classic and/or short circuit hack:

 >>> ''.join([(x==y) and x or ''
...          for (x,y) in zip('foobard','foobazd')
...          if '' not in locals()['_[1]']])
'fooba'
 >>>

or if sys.version > '2.4':

 >>> ''.join([x if (x==y) else ''
...          for (x,y) in zip('foobard','foobazd')
...          if '' not in locals()['_[1]']])
'fooba'
 >>>

I folded the one statement across three lines so you could read it in an 
email. But it is a genuine one liner.

Do I win the prize now? :)

We just became PerlMongers, ya know.

I think I was also wrong about O(nm) for the lower limit on time, which 
the wiki article claims is the limit.

I stuck this optimization near the end of my previous example:

         if len(shorter) <= len(result):
             break

No point in looking for a another result longer than the search argument.

It doesn't make my solution better than O(nm) on time because the find 
method is O(n/2) on average. I think my solution is more like 
O((n**2)m/2). I think my solution using string methods would be pretty 
abysmally slow for practical use cases where n is large like, say, 
looking for plagiarism. But it does make my solution twice as efficient 
on average as without the optimization.

But maybe a similar hook could be put into the algorithm on the wiki. 
And then the efficiency would be O(nm/2) on average.

-- 
Sincerely,

Chris Calloway
http://www.secoora.org
office: 332 Chapman Hall   phone: (919) 599-3530
mail: Campus Box #3300, UNC-CH, Chapel Hill, NC 27599



More information about the TriZPUG mailing list