md5.hexdigest() converting unicode string to ascii

Fredrik Lundh fredrik at pythonware.com
Sat Apr 17 02:04:06 EDT 2004


"uebertester" wrote:

> None of the suggestions seem to address the issue.  sValue =
> _winreg.QueryValueEx(y,"") returns a tuple containing the following
> (u'http://', 1).  The string u'http://' is added to the md5 object via
> the update() and then hashed via hexdigest().  How do I keep the
> unicode string from being converted to ascii with the md5 functions?

krzysztof already explained this:

- MD5 is calculated on bytes, not characters.
- Unicode strings contain characters, not bytes.
- if you pass in a Unicode string where Python expects a byte string,
  Python converts the Unicode string to an 8-bit string using the default
  encoding (normally ASCII, which simply creates 8-bit bytes with the same
  values as the corresponding Unicode characters, and only works if the
  Unicode string contains nothing but characters for which ord(ch) < 128;
  anything else raises a UnicodeError).
- if you're not happy with that rule, you have to convert the Unicode
  string to a byte string yourself, using the "encode" method.

        m.update(u.encode(encoding))

- if you don't know what encoding you're supposed to use, you have
  to guess.  if the choice doesn't matter, as long as you remember what
  you used (different encodings give different digests), I'd suggest
  "utf-8" or perhaps "utf-16-le".
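
to make the points above concrete, here's a minimal sketch.  it assumes a
current Python, where the md5 module of this thread has been replaced by
hashlib; the helper name md5_hex is made up for this example, and the
string is the registry value from the question.

```python
import hashlib

def md5_hex(text, encoding="utf-8"):
    """Return the MD5 hex digest of `text` after encoding it to bytes.

    md5_hex is a hypothetical helper, not part of any library.
    """
    m = hashlib.md5()
    m.update(text.encode(encoding))  # explicit characters -> bytes step
    return m.hexdigest()

value = u"http://"  # the value returned by QueryValueEx in the question

# same characters, different byte sequences, so different digests
print(md5_hex(value, "utf-8"))
print(md5_hex(value, "utf-16-le"))
```

for a pure-ASCII string like this one, "utf-8" and "ascii" produce the same
bytes and therefore the same digest; "utf-16-le" does not.  that's why you
have to remember which encoding you picked.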

> Or can I?

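given how things work, the question "how do I keep the string from being
converted" doesn't really make sense: md5 only ever sees bytes, so some
conversion from characters to bytes always has to happen.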
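
as an aside, and assuming a Python 3 interpreter with hashlib (neither
existed when this was written), the implicit conversion is gone entirely:
passing text to update() is rejected, and you have to encode yourself.  a
small sketch of that behaviour, with the flag name "rejected" being just
an illustration:

```python
import hashlib

m = hashlib.md5()

try:
    m.update(u"http://")  # text, not bytes: Python 3 refuses this
except TypeError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False

# the explicit conversion is the only way in
m.update(u"http://".encode("utf-8"))
print(m.hexdigest())
```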

</F>






