python implementation of a new integer encoding algorithm.

Ian Kelly ian.g.kelly at gmail.com
Thu Feb 19 13:32:17 EST 2015


On Thu, Feb 19, 2015 at 11:24 AM, Dave Angel <davea at davea.name> wrote:
> Here's a couple of ranges of output, showing that the 7bit scheme does
> better for values between 384 and 16379.
>
> 382 2 80fe --- 2 7e82
> 383 2 80ff --- 2 7f82
> 384 3 810000 --- 2 0083
> 384  jan grew 3 810000
> 385 3 810001 --- 2 0183
> 386 3 810002 --- 2 0283
> 387 3 810003 --- 2 0383
> 388 3 810004 --- 2 0483
> 389 3 810005 --- 2 0583
>
> 16380 3 813e7c --- 2 7cff
> 16380  jan grew 3 813e7c
> 16380 7bit grew 2 7cff
> 16381 3 813e7d --- 2 7dff
> 16382 3 813e7e --- 2 7eff
> 16383 3 813e7f --- 2 7fff
> 16384 3 813e80 --- 3 000081
> 16384 7bit grew 3 000081
> 16385 3 813e81 --- 3 010081
> 16386 3 813e82 --- 3 020081
> 16387 3 813e83 --- 3 030081
> 16388 3 813e84 --- 3 040081
> 16389 3 813e85 --- 3 050081
>
> In all my experimenting, I haven't found any values where the 7bit scheme
> does worse.  It seems likely that for extremely large integers, it will, but
> if those are to be the intended distribution, the 7bit scheme could be
> replaced by something else, like just encoding a length at the beginning,
> and using raw bytes after that.

It looks like you're counting whole bytes, not bits. That distinction
matters, since the "difficult" encoding uses fractional bytes.
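
For reference, here is a minimal sketch of what the "7bit" column in the
quoted output appears to be doing: little-endian groups of 7 bits, with
the high bit set on the final byte. This is inferred from the sample
values only, not taken from Dave's actual code, and the names
encode_7bit and decode_7bit are made up for illustration.

```python
def encode_7bit(n):
    """Encode a non-negative int as little-endian 7-bit groups.

    Assumption (inferred from the quoted output): the high bit 0x80
    marks the last byte of the encoding.
    """
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while n >= 0x80:
        out.append(n & 0x7F)   # low 7 bits, high bit clear: more bytes follow
        n >>= 7
    out.append(n | 0x80)       # final group, high bit set as terminator
    return bytes(out)


def decode_7bit(data):
    """Inverse of encode_7bit; stops at the first byte with the high bit set."""
    n = 0
    for position, byte in enumerate(data):
        n |= (byte & 0x7F) << (7 * position)
        if byte & 0x80:
            return n
    raise ValueError("missing terminator byte")


# Report the size in both bytes and bits for values near the boundaries.
for value in (383, 384, 16383, 16384):
    encoded = encode_7bit(value)
    print(value, len(encoded), encoded.hex(), 8 * len(encoded), "bits")
```

Under this sketch every encoded length is a whole number of bytes, so a
fair comparison against an encoding that uses fractional bytes would
have to be made in bits rather than bytes.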


