How to compare text?

Andrew Dalke dalke at dalkescientific.com
Mon Dec 3 12:53:25 EST 2001


A wrote:
  =====
For example I have the first paragraph

I want to be very good at Python programming. Better than in Perl.

The second paragraph might look like this:

She works all day long to master Perl.

All I need is to find out whether any word from the second paragraph is
in the first. For the example above I should find the word

Perl.

  =====

You should also find that 'to' is in common.  Here's a hint
toward solving your problem.

text1 = "I want to be very good at Python programming. " + \
        "Better than in Perl."
text2 = "She works all day long to master Perl."

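# Record every word of the first paragraph as a dictionary key.
d = {}
for word in text1.split():
    d[word] = 1
# Report each word of the second paragraph that was also in the first.
for word in text2.split():
    if word in d:
        print("The word", word, "is in both")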
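
The dictionary above is only being used to remember which words were
seen.  In a Python that has the built-in set type, the same idea can be
sketched as a set intersection.  This is an alternative phrasing, not
the code from the post, and the name "common" is just for illustration.
Words are still split on whitespace only, so "Perl." keeps its period.

```python
text1 = ("I want to be very good at Python programming. "
         "Better than in Perl.")
text2 = "She works all day long to master Perl."

# Split on whitespace only; punctuation stays attached to the words.
common = set(text1.split()) & set(text2.split())

# Sets are unordered, so sort to get a stable listing.
for word in sorted(common):
    print("The word", word, "is in both")
```

Unlike the loop, this reports each shared word once even if it occurs
several times in the second paragraph.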

Things to worry about:
  - is capitalization important?  "May" is a month or someone's
      name, but unless it's at the beginning of the sentence it
      does not mean the same as "may".  At the very least you
      should lowercase everything.
  - you'll need to remove punctuation, unless you don't want
      "Perl." to match "Perl"
  - but you will need to keep apostrophes, like "don't"
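
One way to handle the points above is sketched here.  The helper name
words() and the regular expression are assumptions made for this
example, not part of the original hint.  It takes the simplest route on
capitalization (lowercase everything), drops punctuation, and keeps
apostrophes that sit inside a word.  It only knows about the ASCII
letters a-z, so digits and accented characters are ignored.

```python
import re

# Hypothetical pattern: a run of letters, optionally followed by
# apostrophe-plus-letters groups, so "don't" stays one word.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")

def words(text):
    """Return the set of lowercased words found in text."""
    return set(WORD_RE.findall(text.lower()))

text1 = ("I want to be very good at Python programming. "
         "Better than in Perl.")
text2 = "She works all day long to master Perl."

common = words(text1) & words(text2)
for word in sorted(common):
    print("The word", word, "is in both")
```

Note that lowercasing means "May" and "may" now compare equal, which is
exactly the trade-off mentioned in the first item.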

                    Andrew
                    dalke at dalkescientific.com





