MD4 and DES for unicode strings (samba passwd file manipulation)

Martin v. Loewis martin at v.loewis.de
Wed Sep 4 03:17:58 EDT 2002


Arcady Genkin <agenkin at cdf.toronto.edu> writes:

> Perhaps the problem lies with Unicode encoding?  Samba's man page (see
> the link below) says that the string is obtained by taking MD4 hash on
> the password, represented as a string of 16-bit little-endian unicode
> characters.

There is a contradiction in this statement: You can't really MD4-hash
"Unicode", as Unicode characters are abstract code points (integers
from 0 to 0x10FFFF), whereas MD4 operates on a byte string. What the
man page means is that they use the UTF-16 encoding in its
little-endian variant, which, in Python, is implemented in the
utf-16le codec:

>>> from Crypto.Hash import MD4   # assumption: MD4 comes from PyCrypto
>>> m = MD4.new()
>>> m.update( u'111222333'.encode("utf-16le") )
>>> print m.hexdigest().upper()
B66B458E27BD714F99A05EE3C479FBF1
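
For readers without a separate MD4 module at hand, here is a minimal
sketch of the same computation that swaps in the standard library's
hashlib for the MD4 module used above. The helper name nt_hash is made
up for illustration, and the sketch assumes the OpenSSL build behind
hashlib still provides MD4; where it does not, hashlib.new raises
ValueError.

```python
import hashlib

def nt_hash(password):
    """Return the upper-case hex MD4 digest of the password's UTF-16LE bytes.

    nt_hash is a hypothetical helper name. hashlib.new("md4") only works
    when the linked OpenSSL provides MD4; otherwise it raises ValueError.
    """
    # Encode explicitly: two bytes per BMP character, low byte first, no BOM.
    data = password.encode("utf-16le")
    return hashlib.new("md4", data).hexdigest().upper()
```

Calling nt_hash(u'111222333') should then correspond to the interactive
session above, provided MD4 is available.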

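m.update expects a byte string. If you pass a Unicode string instead,
it is implicitly converted with the system default encoding, which is
"ascii". That yields one byte per character rather than the two bytes
per character that Samba expects (and fails outright for non-ASCII
characters), so it is not appropriate in this case.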
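
To make the difference concrete, the following sketch compares the
bytes the two codecs produce for the same password. It only
illustrates the encoding step, not the hashing, and the variable names
are chosen for illustration.

```python
password = u'111222333'

ascii_bytes = password.encode("ascii")        # what the default encoding yields
utf16le_bytes = password.encode("utf-16le")   # what Samba's description calls for

# ASCII uses one byte per character; UTF-16LE uses two for these
# characters (low byte first) and does not prepend a byte order mark.
print(len(ascii_bytes))
print(len(utf16le_bytes))
print(repr(utf16le_bytes[:4]))
```

The nine-character password becomes 9 bytes in ASCII but 18 bytes in
UTF-16LE, so MD4 sees different input and produces a different digest.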

HTH,
Martin


