python implementation of a new integer encoding algorithm.

Chris Angelico rosuav at gmail.com
Thu Feb 19 13:34:58 EST 2015


On Fri, Feb 20, 2015 at 5:24 AM, Dave Angel <davea at davea.name> wrote:
> In all my experimenting, I haven't found any values where the 7bit scheme
> does worse.  It seems likely that for extremely large integers, it will, but
> if those are to be the intended distribution, the 7bit scheme could be
> replaced by something else, like just encoding a length at the beginning,
> and using raw bytes after that.

Encoding a length (as varlen) and then using eight bits to the byte
thereafter is worse for small numbers, breaks even around 2**56, and
then is better. So unless your numbers are mainly going to be above
2**56, it's better to just use varlen for the entire number. On the
other hand, if you have to stream this without over-reading (imagine
streaming from a TCP/IP socket; you want to block until you have the
whole number, but not block after that), it may be more efficient to
take the length, and then do a blocking read for the main data,
instead of a large number of single-byte reads. But on the gripping
hand, you can probably just do those one-byte reads and rely on (or
implement) lower-level buffering.
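
To make the size trade-off concrete, here is a minimal sketch of both schemes. It rests on assumptions that are not from this thread: the varlen form is taken to be little-endian 7-bit groups with the high bit as a continuation flag, the length-prefixed form stores the payload as big-endian raw bytes, and all four function names are hypothetical.

```python
import io


def encode_varlen(n):
    """Assumed scheme: 7 payload bits per byte, low-order group first,
    high bit set on every byte except the last."""
    if n < 0:
        raise ValueError("non-negative integers only")
    out = bytearray()
    while True:
        n, low = n >> 7, n & 0x7F
        if n:
            out.append(low | 0x80)   # more groups follow
        else:
            out.append(low)          # final group, high bit clear
            return bytes(out)


def encode_length_prefixed(n):
    """Varlen-encoded byte count, then the value as raw big-endian bytes."""
    if n < 0:
        raise ValueError("non-negative integers only")
    raw = n.to_bytes((n.bit_length() + 7) // 8 or 1, "big")
    return encode_varlen(len(raw)) + raw


def read_varlen(stream):
    """One-byte reads until the continuation bit is clear; this leans on
    the stream being buffered underneath."""
    result = shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a varlen integer")
        result |= (chunk[0] & 0x7F) << shift
        if not chunk[0] & 0x80:
            return result
        shift += 7


def read_length_prefixed(stream):
    """Read the length, then fetch the payload in a single read call.
    A raw socket may return fewer bytes than asked for, so real code
    would loop or wrap the socket in a buffered file object."""
    length = read_varlen(stream)
    raw = stream.read(length)
    if len(raw) != length:
        raise EOFError("stream ended inside the payload")
    return int.from_bytes(raw, "big")


if __name__ == "__main__":
    for exp in (0, 7, 48, 56, 63, 70):
        n = 2 ** exp
        print(exp, len(encode_varlen(n)), len(encode_length_prefixed(n)))
    buf = io.BytesIO(encode_varlen(300) + encode_length_prefixed(2 ** 63))
    print(read_varlen(buf), read_length_prefixed(buf))
```

Under these assumptions, 2**56 takes nine bytes either way, while 2**63 takes ten bytes as varlen and nine with the length prefix. Anything below 128 costs one byte as varlen and two with the prefix.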

Ask not the python-list for advice, because they will say both "yes"
and "no" and "maybe"... because they will say all three of "yes",
"no", "maybe", and "you don't need to do that"... erm, AMONG our
responses will be such diverse elements as...

ChrisA


