python implementation of a new integer encoding algorithm.

Chris Angelico rosuav at gmail.com
Wed Feb 18 09:16:00 EST 2015


On Thu, Feb 19, 2015 at 12:54 AM, Dave Angel <davea at davea.name> wrote:
>>> I've tried to read through the original algorithm description, but I'm
>>> not entirely sure: How many payload bits per transmitted byte does it
>>> actually achieve?
>>
>>
>> I don't think that payload bits per byte makes sense in this concept.
>>
>
> Correct.  Presumably one means average payload bits per byte.
>
> First one would have to define what the "standard" unencoded variable length
> integer format was.  Then one could call that size the payload size.  Then,
> in order to compute an average, one would have to specify an expected, or
> target distribution of values.  One then compares and averages the payload
> size for each typical value with the encoded size.

Average, or separately for different ranges. Alternatively, identify
which ranges of numbers can be encoded with how many bytes. For
instance, if you have a varlen length (in bytes) followed by that many
packed bytes, you would have:

2: up to 1<<8-1
3: up to 1<<16-1
4: up to 1<<24-1
...
128: up to 1<<1016-1
130: up to 1<<1024-1
131: up to 1<<1032-1
etc

Using varlen directly gives:

1: up to 1<<7-1
2: up to 1<<14-1
3: up to 1<<21-1
etc

So if your number is around about 1<<20, you know it's going to take 3
bytes to encode as varlen, or 4 to encode with the meta-length. Easy
comparisons.

I'm not sure how this algorithm compares, because it's extremely clever.

ChrisA



More information about the Python-list mailing list