[issue17618] base85 encoding

Wed Apr 17 17:28:33 CEST 2013

Serhiy Storchaka added the comment:

After surveying many other implementations of this encoding, I conclude that there are at least three different variants.

1. The original btoa/atob encoding. Four zero bytes are packed as 'z', a final incomplete 4-byte group is padded with zeros, the output is wrapped into several lines, and the decoder ignores '\n'. There are many implementations of this algorithm in different languages.

2. The Adobe version. This is an extended version of (1). A final incomplete 4-byte group produces fewer than 5 output characters, and the output is enclosed in <~ and ~>. The decoder ignores all ASCII whitespace, not only '\n'. There are many implementations of this algorithm in different languages.

3. The Git and Mercurial version. This is a much simplified version of (1) with an alternative character set. Zeros are not packed, the output is not broken into several lines, and the decoder does not ignore any whitespace. I don't know whether this variant is used anywhere besides Git and Mercurial.
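To make the difference in character sets and zero packing concrete, here is a minimal sketch of how one complete 4-byte group could be encoded. The helper name encode_group and its git flag are hypothetical and not taken from any of the implementations surveyed. The sketch assumes the usual '!'..'u' range for variants (1) and (2) and the Git alphabet for variant (3).

```python
import struct

# Variants (1) and (2): 85 consecutive ASCII characters starting at '!'.
ASCII85_OFFSET = 33

# Variant (3): the alternative 85-character set used by Git.
GIT_ALPHABET = (
    b"0123456789"
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"!#$%&()*+-;<=>?@^_`{|}~"
)


def encode_group(group, git=False):
    """Encode exactly 4 bytes as base85 (hypothetical helper, sketch only)."""
    if len(group) != 4:
        raise ValueError("expected a complete 4-byte group")
    # Interpret the group as one big-endian 32-bit unsigned integer.
    (word,) = struct.unpack(">I", group)
    # Variants (1) and (2) pack an all-zero group into a single 'z'.
    if word == 0 and not git:
        return b"z"
    # Extract five base-85 digits, least significant first.
    digits = []
    for _ in range(5):
        word, digit = divmod(word, 85)
        digits.append(digit)
    digits.reverse()
    if git:
        return bytes(GIT_ALPHABET[d] for d in digits)
    return bytes(d + ASCII85_OFFSET for d in digits)
```

With this sketch, encode_group(b"\xff\xff\xff\xff") gives b"s8W-!", four zero bytes give b"z", and the same zero bytes with git=True give b"00000". Handling of a final incomplete group, line wrapping and the <~ ~> framing are deliberately left out.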

Some implementations combine (1) and (2): they optionally enclose the output in <~ and ~>, optionally wrap the output into several lines, and optionally pad the final incomplete 4-byte group.
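As an illustration of such a combined interface, the sketch below assumes a Python version (3.4 or later) whose base64 module provides a85encode, a85decode, b85encode and b85decode. The sample data and the helper git_rejects_whitespace are made up for this example.

```python
import base64

data = b"\x00\x00\x00\x00ab"  # one all-zero group plus an incomplete group

# Ascii85 family: the zero group becomes 'z', the 2-byte tail gives 3 characters.
plain = base64.a85encode(data)

# Optional Adobe framing: the same payload enclosed in <~ and ~>.
framed = base64.a85encode(data, adobe=True)

# Optional padding: the incomplete group is zero-padded and yields 5 characters.
padded = base64.a85encode(data, pad=True)

# Optional wrapping: newlines are inserted so no line exceeds 20 characters.
wrapped = base64.a85encode(b"x" * 100, wrapcol=20)

# Git-style variant: different alphabet, zeros are not packed into 'z'.
git = base64.b85encode(data)


def git_rejects_whitespace(encoded):
    """Return True if the Git-style decoder refuses embedded whitespace."""
    try:
        base64.b85decode(encoded[:2] + b" " + encoded[2:])
    except ValueError:
        return True
    return False
```

The Ascii85 decoder skips whitespace by default, so base64.a85decode(b" z\n") returns four zero bytes, while the Git-style decoder treats a space as an invalid character.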

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue17618>
_______________________________________