[Tutor] Re: How to compare text?
Guenter Kruschina
G.Kruschina@gmx.de
Mon, 3 Dec 2001 21:27:02 +0100
From: "A" <printers@sendme.cz>
To: tutor@python.org, activepython@listserv.ActiveState.com,
python-list@python.org
Subject: How to compare text?
Send reply to: printers@sendme.cz
Priority: normal
Date sent: Mon, 3 Dec 2001 11:06:04 +0100
>
> Hello,
> How can I compare one paragraph of text with another
> paragraph? Each paragraph can have about 100 words.
> For example I have the first paragraph
>
> I want to be very good at Python programming. Better than in Perl.
>
> The second paragraph might look like this:
>
> She works all day long to master Perl.
>
> All that I need is to find out if any word from the second paragraph is
> in the first. For the example above I should find the word
>
> Perl
>
>
> What is the best and quickest way?
> Thank you for help.
> Ladislav
>
Hello Ladislav, I have written a small program which will work as you
expect, I hope. I think this is a fast way to compare two paragraphs.
wbg
Günter
Text from file 'diff.py':
def CreateDict(par):
    # Remove some punctuation characters
    for char in ('.', ';', ','):
        par = par.replace(char, "")
    # Split on any whitespace; split(' ') would yield empty strings
    # for consecutive spaces
    words = par.split()
    # Use the words as dictionary keys so that lookups are fast
    dPar = {}
    for word in words:
        dPar[word] = 1
    return dPar

def Diff(par1, par2):
    # Return the words of par2 that also occur in par1
    dPar1 = CreateDict(par1)
    dPar2 = CreateDict(par2)
    lCommon = []
    for word in dPar2.keys():
        if word in dPar1:  # has_key() no longer exists in Python 3
            lCommon.append(word)
    return lCommon

def main():
    lCommonWords = Diff("I want to be very good at Python programming. Better than in Perl.",
                        "She works all day long to master Perl.")
    print("Common Words: ", lCommonWords)

main()
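
The dictionary lookup in diff.py amounts to a set intersection, so on a
current Python the same idea can probably be written more compactly with
the built-in set type. The following is only a sketch, not part of the
original attachment: the names words_of and common_words are made up for
this illustration, and stripping string.punctuation from each word is an
assumption about what should count as punctuation.

```python
import string

def words_of(paragraph):
    """Return the set of words in a paragraph (hypothetical helper)."""
    # split() with no argument splits on any run of whitespace
    stripped = (word.strip(string.punctuation) for word in paragraph.split())
    # Drop tokens that consisted of punctuation only
    return {word for word in stripped if word}

def common_words(par1, par2):
    """Return the words that occur in both paragraphs, sorted (hypothetical helper)."""
    return sorted(words_of(par1) & words_of(par2))

print("Common words:", common_words(
    "I want to be very good at Python programming. Better than in Perl.",
    "She works all day long to master Perl."))
# prints: Common words: ['Perl', 'to']
```

Note that "to" is reported as well as "Perl", since it occurs in both
example paragraphs. The comparison is case-sensitive, as in diff.py;
lowercasing both paragraphs first would make it case-insensitive.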