compressing short strings?
Thomas Troeger
thomas.troeger.ext at siemens.com
Tue May 20 05:50:22 EDT 2008
Paul Rubin wrote:
> I have a lot of short English strings I'd like to compress in order to
> reduce the size of a database. That is, I'd like a compression
> function that takes a string like (for example) "George Washington"
[...]
>
> Thanks.
I think your idea is good. You could build an LZ78 encoder in Python
(LZ78 is fairly easy), feed it a long English text, and then pickle the
resulting object. You could then unpickle it at program start and encode
your short strings with it. I bet a working implementation already
exists somewhere, but if you can't find one, LZ78 can be implemented in
an hour or two. There used to be a rather good explanation of the
algorithm in German; unfortunately it has recently vanished from the net
(I have a backup if you're interested).
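
To make the idea concrete, here is a rough sketch of what I mean, not a
finished implementation. It assumes the phrase dictionary is built once
from a training text and then kept frozen while encoding the short
strings (plain LZ78 would keep growing it). The names train, encode and
decode are made up for illustration, and the output is a list of
(index, char) pairs rather than a packed byte string.

```python
import pickle


def train(corpus):
    """Build an LZ78-style phrase dictionary from a training text."""
    phrases = {"": 0}  # phrase -> index; index 0 is the empty phrase
    current = ""
    for ch in corpus:
        if current + ch in phrases:
            current += ch  # keep extending the known phrase
        else:
            phrases[current + ch] = len(phrases)  # register a new phrase
            current = ""
    return phrases


def encode(text, phrases):
    """Encode text as (phrase_index, next_char) pairs.

    Assumption: the dictionary is frozen, so nothing is added here.
    """
    out = []
    i = 0
    while i < len(text):
        current = ""
        # Every prefix of a phrase is itself a phrase, so extending one
        # character at a time finds the longest match.
        while i < len(text) and current + text[i] in phrases:
            current += text[i]
            i += 1
        if i < len(text):
            out.append((phrases[current], text[i]))
            i += 1
        else:
            out.append((phrases[current], ""))  # text ended inside a phrase
    return out


def decode(pairs, phrases):
    """Invert encode() using the same frozen dictionary."""
    by_index = [None] * len(phrases)
    for phrase, idx in phrases.items():
        by_index[idx] = phrase
    return "".join(by_index[idx] + ch for idx, ch in pairs)


if __name__ == "__main__":
    # Stand-in training text; a real one would be a long English corpus.
    phrases = train("the quick brown fox jumps over the lazy dog " * 20)
    blob = pickle.dumps(phrases)     # store this once
    restored = pickle.loads(blob)    # load it at program start
    pairs = encode("George Washington", restored)
    print(pairs)
    print(decode(pairs, restored))
```

To actually save space you would still have to pack the pairs into
bytes, for example with fixed-width indices sized to the dictionary;
that part is left out here.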
Cheers,
Thomas.