python implementation of a new integer encoding algorithm.

Chris Angelico rosuav at gmail.com
Tue Feb 17 09:58:15 EST 2015


On Wed, Feb 18, 2015 at 1:50 AM, Dave Angel <davea at davea.name> wrote:
> But the first thing I'd expect to see would be a target estimate of the
> anticipated distribution of number values/magnitudes.  For example, if a
> typical integer is 1500 bits, plus/minus 200 bits, I'd probably try encoding
> in base 65535, and have a terminator character of 0xffff.

Sure. Not sure how you'd cope with an interior FFFF in the stream
without drastically losing efficiency, though.

> An interesting point of history.  At the time of Huffman's paper, it was
> provable that it was the best possible lossless compression.  But there were
> some unwritten assumptions, such as that each character would be encoded
> with a whole number of bits.  Changing that assumption makes arithmetic
> coding possible.  Next assumption was that probabilities of each character
> didn't depend on context.  Changing that assumption enables LZ.  And so on.

Didn't LZ predate arithmetic coding? But yes, removing those
assumptions has enabled some pretty amazing alternatives. Just
describing arithmetic coding to them is enough to blow most people's
minds. (Especially my sister. She hates infinity, mainly - I think -
because it doesn't fit inside her brain. I troll her occasionally with
Hilbert's Hotel or anything involving the notion of different sizes of
infinity.)

ChrisA



More information about the Python-list mailing list