md5.hexdigest() converting unicode string to ascii

Krzysztof Stachlewski stach at fr.pl
Fri Apr 16 08:00:08 EDT 2004


"uebertester" <mgibson at tripwire.com> wrote in message
news:77b925de.0404151450.5c1f720a at posting.google.com...
> I'm trying to get a MD5 hash of a registry key value on Windows 2000
> Server.  The registry value is a string which is stored as a unicode
> string.  I can get the registry key value and pass it to md5 via
> update(), however, hexdigest() returns a hash value for the ascii
> equivalent of the unicode string.  Not the hash of the unicode string
> itself.  I've verified this by creating a text file containing an
> ascii form of the string and one containing a unicode form of the
> string.  The hash returned by md5.hexdigest() matches that when I run
> md5sum on the ascii file.

The md5() function is defined on byte strings,
not on unicode strings. A unicode string is a sequence
of code points (integers). Such a sequence can be converted
to a byte string, but there are many different ways of doing that.
If you pass a unicode string directly to update(), Python first
converts it with the default encoding (normally ASCII), which is
why you get the hash of the ASCII form.
In Python you convert a unicode string to a byte
string by encoding it with an encoding of your choice.
For instance:
u"abcd".encode("utf-16")
utf-16 is just an example - you have to decide which
encoding to choose.
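
To make the point concrete, here is a minimal sketch of how the digest
depends on the encoding. It assumes a current Python where hashlib has
replaced the old md5 module; the helper name md5_hex and the sample
value "abcd" are made up for illustration and stand in for the real
registry value.

```python
import hashlib


def md5_hex(text, encoding):
    """Encode text to bytes with the given encoding, then hash the bytes."""
    return hashlib.md5(text.encode(encoding)).hexdigest()


# Hypothetical stand-in for the string read from the registry.
value = u"abcd"

# The same text gives a different digest for each distinct byte form.
for enc in ("ascii", "utf-8", "utf-16", "utf-16-le"):
    print(enc, md5_hex(value, enc))
```

For plain ASCII text the "ascii" and "utf-8" digests are identical,
because both encodings produce the same bytes. Note that "utf-16"
prepends a byte order mark while "utf-16-le" does not, so those two
digests differ. To match md5sum on a file, you need the encoding that
reproduces the file's bytes exactly, including any byte order mark.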

Stach



