python tool: finding duplicate code
Roman Suzi
rnd at onego.ru
Thu May 30 14:23:58 EDT 2002
On Thu, 30 May 2002, Michal Wallace wrote:
>On Wed, 29 May 2002, Tim Peters wrote:
>
>> > (hmm... Come to think of it, someone could probably find
>> > *some* duplicate logic by running source files through the
>> > tokenizer first. I wonder if that would work...)
>>
>> Brenda Baker has done some interesting work on this
>> problem (not with Python in mind, but million-line C
>> systems):
>>
>> http://cm.bell-labs.com/who/bsb/
>>
>> Her "On Finding Duplication and Near-Duplication in Large
>> Software Systems" is a good entry into the literature.
>>
>> I have a self-serving reason for mentioning this: if
>> somebody whips up a fast suffix tree for Python, I could
>> put it to good use in ameliorating difflib.py's worst-case
>> time sinks <wink>.
>
>Hey Tim,
>
>Thanks for the link! I found a javascript version of a
>suffix tree algorithm online. I ported it to python and it
There are probably also some scientific papers to be found on Google
by searching for "suffix tree algorithm" ...
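
The tokenizer idea quoted above can also be tried without a suffix tree.
What follows is only a minimal sketch, and it is neither Michal's port nor
Baker's algorithm: it runs the source through the standard tokenize module,
abstracts identifiers and literals, and reports fixed-length token runs that
occur more than once. The names normalized_tokens and duplicate_windows, and
the default window size, are hypothetical and chosen only for illustration.

```python
import io
import keyword
import tokenize
from collections import defaultdict

# Token types that carry layout only and are ignored when comparing code.
SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
        tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def normalized_tokens(source):
    """Yield (kind, line_number) pairs with names and literals abstracted."""
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            kind = "ID"      # any identifier looks the same
        elif tok.type == tokenize.NUMBER:
            kind = "NUM"     # any numeric literal looks the same
        elif tok.type == tokenize.STRING:
            kind = "STR"     # any string literal looks the same
        else:
            kind = tok.string  # keywords and operators are kept verbatim
        yield kind, tok.start[0]


def duplicate_windows(source, window=8):
    """Map each repeated run of `window` tokens to its starting line numbers.

    Hypothetical helper: a hashed sliding window, not a suffix tree.
    """
    toks = list(normalized_tokens(source))
    seen = defaultdict(list)
    for i in range(len(toks) - window + 1):
        key = tuple(kind for kind, _ in toks[i:i + window])
        seen[key].append(toks[i][1])
    return {key: lines for key, lines in seen.items() if len(lines) > 1}
```

Calling duplicate_windows(open("some_module.py").read()) on a file of your
own (the file name here is made up) returns the repeated token runs together
with the lines where they start. A fixed window reports overlapping fragments
of one long duplicate separately, which is the kind of thing a suffix tree
would handle more gracefully.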
Sincerely yours, Roman Suzi
--
\_ Russia \_ Karelia \_ Petrozavodsk \_ rnd at onego.ru \_
\_ Thursday, May 30, 2002 \_ Powered by Linux RedHat 7.2 \_
\_ "How do I love thee? My accumulator overflows." \_