Jaro-Winkler algorithm: Measuring similarity between strings

bearophileHUGS at lycos.com
Sat Dec 20 07:55:42 EST 2008


John Machin:
> This paper by Heikki Hyyrö is well worth
> reading, and refers to a whole lot of previous work, including
> Ukkonen's:
> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.6.2242

This is the site of the author:
http://www.cs.uta.fi/~helmu/pubs/pubs.html
There you can find updates too:
http://www.cs.uta.fi/~helmu/pubs/cpm02.pdf

But such researchers ought to offer C/Pascal/Java code for their
papers too. Implementing things from scratch every time is a waste of
time.

Using SSE instructions like ANDPS, ANDNPS, ORPS, XORPS probably allows
bitwise operations on 128 bits at a time, so that longer strings can
be searched in the same time.
And the GPU of graphic cards offers other possibilities:
http://www.cbcb.umd.edu/software/cmatch/Cmatch.pdf
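
To show why the word width matters, here is a rough Python sketch of a
Myers-style bit-vector edit distance, the family of algorithms those
papers deal with. It is not code from any of the papers, and the name
bitparallel_levenshtein is made up for this example. Each character of
the text costs a fixed handful of bitwise operations on m-bit vectors
(m = pattern length); Python ints are unbounded, so here they only
stand in for the 32/64/128 bit registers that real C or SSE code would
use.

```python
def bitparallel_levenshtein(pattern, text):
    """Levenshtein distance via bit-vectors (illustrative sketch)."""
    m = len(pattern)
    if m == 0:
        return len(text)
    mask = (1 << m) - 1      # keeps every vector m bits wide
    high = 1 << (m - 1)      # bit of the last pattern position

    # peq[c] has bit i set where pattern[i] == c.
    peq = {}
    for i, c in enumerate(pattern):
        peq[c] = peq.get(c, 0) | (1 << i)

    pv = mask                # vertical deltas equal to +1
    mv = 0                   # vertical deltas equal to -1
    score = m                # distance between pattern and empty text

    for c in text:
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = (mv | ~(xh | pv)) & mask   # horizontal deltas +1
        mh = pv & xh                    # horizontal deltas -1

        # The last row tells how the distance changes in this column.
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1

        # Shift in +1 for row 0: whole-string distance, not searching.
        ph = (ph << 1) | 1
        mh = mh << 1
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv & mask

    return score


if __name__ == "__main__":
    print(bitparallel_levenshtein("kitten", "sitting"))
```

With a fixed-width machine word the pattern has to fit in the word (or
be split into blocks), which is where wider registers would help.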

Bye,
bearophile


