md5.hexdigest() converting unicode string to ascii
Fredrik Lundh
fredrik at pythonware.com
Sat Apr 17 02:04:06 EDT 2004
"uebertester" wrote:
> None of the suggestions seem to address the issue. sValue =
> _winreg.QueryValueEx(y,"") returns a tuple containing the following
> (u'http://', 1). The string u'http://' is added to the md5 object via
> the update() and then hashed via hexdigest(). How do I keep the
> unicode string from being converted to ascii with the md5 functions?
krzysztof already explained this:
- MD5 is calculated on bytes, not characters.
- Unicode strings contain characters, not bytes.
- if you pass in a Unicode string where Python expects a byte string,
Python converts the Unicode string to an 8-bit string using the default
encoding (ASCII, which simply creates 8-bit bytes with the same values as
the corresponding Unicode characters). this only works as long as the
Unicode string only contains characters for which ord(ch) < 128;
otherwise you get a UnicodeError.
- if you're not happy with that rule, you have to convert the Unicode
string to a byte string yourself, using the "encode" method.
    m.update(u.encode(encoding))
- if you don't know what encoding you're supposed to use, you have
to guess. if the choice doesn't matter as long as you remember what
you used, I'd suggest "utf-8" or perhaps "utf-16-le".
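
to make the points above concrete, here is a minimal sketch. it assumes a
current Python 3 and the hashlib module rather than the old md5 module used
in this thread (in Python 3, update() refuses str objects outright with a
TypeError instead of converting them silently). the helper name md5_hex is
made up for illustration, and the value is the one from the question.

```python
import hashlib

def md5_hex(text, encoding="utf-8"):
    """Encode text explicitly, then hash the resulting bytes.

    md5_hex is a hypothetical helper, not part of any library.
    """
    m = hashlib.md5()
    m.update(text.encode(encoding))
    return m.hexdigest()

value = u"http://"  # the registry value from the question

utf8_digest = md5_hex(value, "utf-8")
utf16_digest = md5_hex(value, "utf-16-le")

# same characters, different bytes, so the two digests differ
print(utf8_digest)
print(utf16_digest)
print(utf8_digest != utf16_digest)

# for pure-ASCII text, ASCII and UTF-8 produce identical bytes
print(value.encode("ascii") == value.encode("utf-8"))

# characters with ord(ch) >= 128 cannot be encoded as ASCII
try:
    u"caf\xe9".encode("ascii")
except UnicodeEncodeError as exc:
    print("ascii failed:", exc.reason)
```

whichever encoding you pick, the digest is only reproducible if the same
encoding is used every time the value is hashed.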
> Or can I?
given how things work, the question "how do I keep the string from
being converted" doesn't really make sense.
</F>