[Mailman-Developers] Almost OT: Re: Opening up a few can o' worms here...

Les Niles les@2pi.org
Tue, 30 Jul 2002 12:12:47 -0700


On Tue, 30 Jul 2002 07:27:27 -0700 Chuq Von Rospach <chuqui@plaidworks.com> wrote:
>On 7/30/02 3:41 AM, "Ka-Ping Yee" <ping@zesty.ca> wrote:
>> I think they'd hardly be able to get any.  Have you really thought about
>> how hard this would be?  Why would they bother to invest the enormous
>> development effort to make this work for the one or two addresses they
>> *might* get, along with a large number of misread addresses?
>
>Yes, I have. Because I've seen how the spammers have moved up the technology
>curve when it suited their purposes.
>
>You're depending on not being "the low hanging fruit", so to speak. That's
>the philosophy behind "the club" for preventing car thefts. That philosophy
>works only as long as your data isn't valuable enough to be worth the extra
>effort. Once it does, you suddenly have a protection system that isn't
>working, but you've created a false sense of security because you think it
>works. That's worse than having no system, then, because you've stopped
>being worried about it.
>
>> In the image case, there is no secret.  Nobody knows how to program a
>> computer to read as well as a person can
>
>Have you seen what the off the shelf OCR systems like OmniPage do these
>days? 

What's more, Gary Kopec and others at Xerox PARC developed OCR
algorithms that, in many situations, can read much better than a
person can.  There are practical issues that prevent using these
algorithms in the typical shrink-wrap OCR applications, but I think
they'd work pretty well for converting email address images.  There
are probably one or two dozen people who could implement this in a
few months, and lots who could do so after reading the papers that
have been published.  (There is an issue of patent infringement
that might discourage selling such software, but it would be really
hard to know that a big harvester was using the algorithms
internally.)  IOW, Chuq, I think you're right on target: once
getting around image-encoding of email addresses becomes valuable
enough, it will be done.

Anyone for audio-encoded email addresses?  When it comes to speech
recognition, computers are definitely much worse than people.

  -les