Working with bytes.

Paul Prescod paul at prescod.net
Sat Apr 3 17:51:48 EST 2004


Adam T. Gautier wrote:

> I have been unable to solve a problem.  I am working with MD5 signatures 
> trying to put these in a database.  The MD5 signatures are not generated 
> using the python md5 module but an external application that is 
> producing the valid 16 byte signature formats.  Anyway, these 16 byte 
> signatures are not necessarily valid strings.  How do I manipulate the 
> bytes?  I need to concatenate the bytes with a SQL statement which is a 
> string.  This works fine for most of the md5 signatures but some blow up 
> with a TypeError.  Because there is a NULL byte or something else.  So I 
> guess my ultimate question is how do I get a prepared SQL statement to 
> accept a series of bytes?  How do I convert the bytes to a valid string 
> like:
> 
> 'x%L9d\340\316\262\363\037\311\345<\262\357\215'
> 
> that can be concatenated?

Python strings are just a sequence of bytes. They will happily contain 
binary data, NUL bytes included.

 >>> a = open("/bin/ls", "rb").read()
 >>> type(a)
<type 'str'>

Concatenating a byte string with a Unicode string is a different matter. 
Python first tries to decode the bytes as ASCII, and that fails on any 
byte above 0x7f:

 >>> a = open("/bin/ls", "rb").read()
 >>> type(a)
<type 'str'>
 >>> j = u"abc"  # any Unicode string will do
 >>> type(j)
<type 'unicode'>
 >>> x = a + j
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfe in position 0: 
ordinal not in range(128)
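
One way to sidestep the implicit ASCII decode is to do the conversion 
yourself, in whichever direction you need. The following is only a sketch: 
the sample bytes are made up, and latin-1 is just one possible codec, 
chosen here because it maps every byte value to exactly one code point and 
so never fails. Whether the resulting text means anything is another 
question.

```python
# Sample bytes above 0x7f, so they are not valid ASCII (made-up data).
raw = b"\xfe\xed\xfa\xce"
text = u"abc"

# Bytes -> text, explicitly. latin-1 maps bytes 0-255 one-to-one.
combined_text = raw.decode("latin-1") + text

# Text -> bytes, explicitly, if a byte string is what you want instead.
combined_bytes = raw + text.encode("utf-8")

# The latin-1 route is reversible: the original bytes come back unchanged.
recovered = combined_text[:len(raw)].encode("latin-1")
```

Either way, the point is that the codec is named in the code rather than 
left to the default.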

It is also possible that your SQL library just does not like the byte 
strings, but that wouldn't really be a Python problem. If you want to 
encode as pure ASCII then you need to choose an encoding, e.g. base64:

a = a.encode("base64")

Of course, at some point later you'll have to decode it again, e.g. with 
a.decode("base64").
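
For what it's worth, here is a rough sketch of both routes in present-day 
Python 3 syntax: encoding the signature as ASCII text and getting it back, 
and passing the raw bytes to the database as a bound parameter rather than 
concatenating them into the SQL text. Several things here are assumptions 
on my part. The 16-byte value is made up, the standard-library sqlite3 
module merely stands in for whatever database is really in use (the 
question does not say), and the table and column names are hypothetical. 
Other drivers use different placeholder styles and may need their own 
binary wrapper type.

```python
import base64
import binascii
import sqlite3

# Made-up 16-byte sample containing a NUL byte; a real signature would come
# from the external application mentioned in the question.
digest = b"\x00x%L9d\xe0\xce\xb2\xf3\x1f\xc9\xe5<\xb2\xef"

# Route 1: turn the bytes into plain ASCII text, and back again later.
as_base64 = base64.b64encode(digest).decode("ascii")
as_hex = binascii.hexlify(digest).decode("ascii")
from_base64 = base64.b64decode(as_base64)
from_hex = binascii.unhexlify(as_hex)

# Route 2: let the driver carry the raw bytes as a bound parameter, so they
# never become part of the SQL string. Table and columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signatures (name TEXT, md5 BLOB)")
conn.execute(
    "INSERT INTO signatures (name, md5) VALUES (?, ?)",
    ("example", digest),
)
row = conn.execute(
    "SELECT md5 FROM signatures WHERE name = ?", ("example",)
).fetchone()
stored = row[0]
conn.close()
```

The hex form is twice the size of the raw bytes but is the conventional way 
to print an MD5 signature; base64 is more compact. With the bound 
parameter, no encoding step is needed at all.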

  Paul Prescod





