Working with bytes.
Anton Vredegoor
anton at vredegoor.doge.nl
Thu Apr 8 14:29:26 EDT 2004
Jason Harper <JasonHarper at pobox.com> wrote:
>Anton Vredegoor wrote:
>> I wonder whether it would be possible to use more than six bits per
>> byte but less than seven? There seem to be some character codes left
>> and these could be used too?
>
>Look up Base85 coding (a standard part of PostScript) for an example of
>how this can be done - 4 bytes encoded per 5 characters of printable ASCII.
Thanks to you and Piet for mentioning this. I found another
interesting application of Base85 encoding: a scheme for encoding
IPv6 addresses (which are 128 bits long). Since an MD5 digest is 16
bytes (== 128 bits), the same scheme can be used for digests. See
http://www.faqs.org/rfcs/rfc1924.html
for the details.
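To make the "more than six bits but less than seven" point concrete, here is a small sanity check of the arithmetic, written as a Python 3 sketch. The names bits_per_char and digits_needed are just illustrative; the only assumption is an alphabet of 85 symbols.

```python
import math

# Bits of information carried by one character from an 85-symbol alphabet.
bits_per_char = math.log2(85)

# Smallest number of base-85 digits whose range covers every 128-bit value.
digits_needed = 1
while 85 ** digits_needed < 2 ** 128:
    digits_needed += 1

print(round(bits_per_char, 2))  # 6.41
print(digits_needed)            # 20
print(85 ** 5 >= 2 ** 32)       # True: 5 characters cover 4 bytes
```

So 16 bytes fit in 20 characters, against 22 significant characters for base64 without padding.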
Anton
from string import digits, ascii_letters

_rfc1924_chars = digits+ascii_letters+'!#$%&()*+-;<=>?@^_`{|}~'
_rfc1924_table = dict([(c,i) for i,c in enumerate(_rfc1924_chars)])
_rfc1924_bases = [85L**i for i in range(20)]

def bytes_to_rfc1924(sixteen):
    res = []
    i = 0L
    for byte in sixteen:
        i <<= 8
        i |= ord(byte)
    for j in range(20):
        i,k = divmod(i,85)
        res.append(_rfc1924_chars[k])
    return "".join(res)

def rfc1924_to_bytes(twenty):
    res = []
    i = 0L
    for b,byte in zip(_rfc1924_bases,twenty):
        i += b*_rfc1924_table[byte]
    for j in range(16):
        k = i & 255
        res.append(chr(k))
        i >>= 8
    res.reverse()
    return "".join(res)

def test():
    import md5
    #md5.digest returns 16 bytes == 128 bits, an ipv6 address
    #also uses 128 bits (I don't know which format so I'm using md5
    #as a dummy placeholder to get 16 bytes of 'random' data)
    bytes = md5.new('9034572345asdf').digest()
    r = bytes_to_rfc1924(bytes)
    print r
    check = rfc1924_to_bytes(r)
    assert bytes == check

if __name__=='__main__':
    test()

output:
k#llNFNo4sYFxKn*J<lB
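
For comparison, here is a rough Python 3 sketch of the same idea. It is not the code above: the names encode_rfc1924 and decode_rfc1924 are made up for illustration, and it makes two choices that differ from the posted version, based on my reading of RFC 1924: the alphabet puts uppercase letters before lowercase, and the most significant digit comes first. The posted code uses ascii_letters (lowercase first) and emits the least significant digit first, so the two versions will not produce the same strings.

```python
import hashlib
import string

# Alphabet order assumed from RFC 1924: digits, uppercase, lowercase, punctuation.
RFC1924_CHARS = (string.digits + string.ascii_uppercase + string.ascii_lowercase
                 + "!#$%&()*+-;<=>?@^_`{|}~")
RFC1924_TABLE = {c: i for i, c in enumerate(RFC1924_CHARS)}


def encode_rfc1924(sixteen: bytes) -> str:
    """Encode exactly 16 bytes as 20 base-85 characters, big end first."""
    if len(sixteen) != 16:
        raise ValueError("expected exactly 16 bytes")
    n = int.from_bytes(sixteen, "big")
    out = []
    for _ in range(20):
        n, k = divmod(n, 85)
        out.append(RFC1924_CHARS[k])
    # divmod yields the least significant digit first, so reverse at the end.
    return "".join(reversed(out))


def decode_rfc1924(twenty: str) -> bytes:
    """Decode 20 base-85 characters back to 16 bytes."""
    if len(twenty) != 20:
        raise ValueError("expected exactly 20 characters")
    n = 0
    for ch in twenty:
        n = n * 85 + RFC1924_TABLE[ch]
    # Raises OverflowError if the value does not fit in 128 bits.
    return n.to_bytes(16, "big")


if __name__ == "__main__":
    digest = hashlib.md5(b"9034572345asdf").digest()
    text = encode_rfc1924(digest)
    assert decode_rfc1924(text) == digest
    print(text)
```

Characters outside the alphabet raise KeyError in decode_rfc1924, which seemed good enough for a sketch.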