A little amusing Python program
Andrew Dalke
dalke at dalkescientific.com
Fri Oct 5 19:24:48 EDT 2001
Jeff Sandys:
>> Another program shown at an AI conference was a
>> document classifier. To determine which folder to add
>> the document to, it simply compared the size of the
>> tarred folders before and after adding the document.
Tom Good:
>I don't get that last part. How does comparing the size of the
>folders before and after do anything useful? Wouldn't all of the
>folders increase by the size of the file?
I'm assuming the folders were also compressed. Depending on the
compression scheme, if a lot of text (words, phrases) is shared
between the documents, then closely related text should compress
better than more distantly related text. The document belongs in
whichever folder's compressed size grows the least.
But that makes a lot of assumptions about the compression, such
as that it doesn't reset itself after a given amount of data.
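
Here is a minimal sketch of that idea, not the program from the
conference. It assumes each folder is just one string of concatenated
text and uses zlib in place of a compressed tar file. The names
compressed_size, classify and folders are made up for illustration.
Note that zlib's DEFLATE only looks back 32 KB, which is exactly the
kind of limit mentioned above: for large folders only the most recent
text would influence the result.

```python
import zlib


def compressed_size(text):
    """Number of bytes in the zlib-compressed form of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def classify(document, folders):
    """Return the name of the folder whose compressed size grows least.

    folders maps a (hypothetical) folder name to the concatenated text
    of the documents already in it.
    """
    def growth(corpus):
        # Extra compressed bytes needed once the document is appended.
        before = compressed_size(corpus)
        after = compressed_size(corpus + "\n" + document)
        return after - before

    return min(folders, key=lambda name: growth(folders[name]))


if __name__ == "__main__":
    # Toy data, made up for this example.
    folders = {
        "cooking": "Preheat the oven to 350 degrees. Whisk the flour, "
                   "sugar and butter together until smooth. Bake the "
                   "cake for thirty minutes.",
        "python": "Import the module and define a function. The "
                  "interpreter raises an exception when the list "
                  "index is out of range.",
    }
    doc = ("Whisk the flour, sugar and butter together, "
           "then bake the cake in the oven.")
    print(classify(doc, folders))
```

The growth is measured per folder rather than comparing absolute
sizes, so a folder that is already large is not penalized for it.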
Andrew
dalke at dalkescientific.com