python implementation of a new integer encoding algorithm.

janhein.vanderburg at gmail.com
Wed Feb 18 03:55:26 EST 2015


On Tuesday, February 17, 2015 at 2:17:02 PM UTC+1, Chris Angelico wrote:
> This is a fine forum to ask in. However, you may find that the advice
> you get isn't quite what you were asking for. In my case, the advice
> I'm offering is: Don't do this.
Thanks Chris; let me explain why I want this.

As you rightly point out, human-readable encoding enables debugging without the need for specialized tools that assemble and disassemble character streams.
I also agree that saving space within one partition of a character stream, such as a communication packet or a persistent memory sector, does little good unless it reduces the number of partitions needed.

The discussion of human-centered versus machine-oriented encoding goes back decades, and I am personally still not convinced that human-readable text is all that sensible in most applications.
The main problem I see is that we are trying to make computers behave like human beings, and that introduces interpretation risks and resource requirements that should be avoided.
I'll be happy to discuss this subject again, but let's do that in a different thread.

The difference between two encoded stream lengths within a single partition may not be relevant, but the difference between needing 1 or 2 partitions is quite significant, especially in applications that depend on communication channels with very limited bandwidth.
Since all messages in my applications are primarily composed of integer values, finding the algorithm that fundamentally minimizes the number of bits needed to encode such a value will improve my chances of saving partitions when encoding a message, even without adding compression techniques at the message level.
Note that I don't assume any properties of the integer values to be encoded.
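
To illustrate the partition argument, here is a minimal sketch that compares comma-separated decimal text with a common variable-length binary format (LEB128-style, 7 payload bits per byte). It is not the new algorithm under discussion; the 32-byte partition size, the sample values, and all function names are assumptions made up for the example, and it only handles non-negative integers.

```python
import math

PARTITION_BYTES = 32  # hypothetical partition (packet or sector) size


def partitions_needed(payload_bytes, partition_bytes=PARTITION_BYTES):
    """Number of fixed-size partitions required to hold the payload."""
    return math.ceil(payload_bytes / partition_bytes)


def text_size(values):
    """Bytes used by comma-separated decimal ASCII text."""
    return len(",".join(str(v) for v in values).encode("ascii"))


def varint_size(value):
    """Bytes used by a LEB128-style encoding of one non-negative integer."""
    return max(1, (value.bit_length() + 6) // 7)


# Hypothetical message made up of integer values only.
message = [3, 1500, 70000, 12, 255, 1000000, 42, 8, 65535, 7]

text_bytes = text_size(message)
binary_bytes = sum(varint_size(v) for v in message)

print(text_bytes, partitions_needed(text_bytes))
print(binary_bytes, partitions_needed(binary_bytes))
```

With these made-up numbers the text form spills into a second partition while the binary form fits in one; with other values or partition sizes the two may well need the same number of partitions.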

> Take the
> easy option; you can always make things more complicated later.
That makes sense alright.
No offense, but I still believe that human-readable text encoding complicates things right now, and it shouldn't be tried until "my way" has proven impractical in its first application.
Consider it a theoretical challenge: how do I find the general encoding of an arbitrary integer value that minimizes the number of bits needed? And, given that algorithm, what is the Python code that minimizes the processor load imposed by the codec implementation?
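
As a baseline for that challenge, a codec for an existing variable-length format might look like the sketch below. It uses the well-known LEB128-style scheme (7 payload bits per byte, high bit set on every byte except the last), not the new algorithm; the function names are hypothetical, and negative integers are deliberately left out.

```python
def encode_uvarint(n):
    """Encode a non-negative integer, least significant 7-bit group first."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        group = n & 0x7F
        n >>= 7
        if n:
            out.append(group | 0x80)  # more groups follow
        else:
            out.append(group)         # last group, high bit clear
            return bytes(out)


def decode_uvarint(data, pos=0):
    """Decode one integer starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# Round trip over a few sample values packed back to back.
samples = [0, 1, 127, 128, 300, 2**64]
stream = b"".join(encode_uvarint(v) for v in samples)

decoded = []
pos = 0
while pos < len(stream):
    value, pos = decode_uvarint(stream, pos)
    decoded.append(value)

print(decoded == samples)
```

Any candidate encoding can then be compared against this baseline on both encoded size and codec run time, for example with the standard `timeit` module.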



