hashing strings to integers for sqlite3 keys

Peter Otten __peter__ at web.de
Thu May 22 11:34:02 EDT 2014


Adam Funk wrote:

> On 2014-05-22, Chris Angelico wrote:
> 
>> On Thu, May 22, 2014 at 11:54 PM, Adam Funk <a24061 at ducksburg.com> wrote:
> 
>>> That ties in with a related question I've been wondering about lately
>>> (using MD5s & SHAs for other things) --- getting a hash value (which
>>> is internally numeric, rather than string, right?) out as a hex string
>>> & then converting that to an int looks inefficient to me --- is there
>>> any better way to get an int?  (I haven't seen any other way in the
>>> API.)
>>
>> I don't know that there is, at least not with hashlib. You might be
>> able to use digest() followed by the struct module, but it's no less
>> convoluted. It's the same in several other languages' hashing
>> functions; the result is a string, not an integer.
> 
> Well, J*v* returns a byte array, so I used to do this:
> 
>     digester = MessageDigest.getInstance("MD5");
>     ...
>     digester.reset();
>     byte[] digest = digester.digest(bytes);
>     return new BigInteger(+1, digest);

In Python 3 there's int.from_bytes():

>>> import hashlib
>>> h = hashlib.sha1(b"Hello world")
>>> int.from_bytes(h.digest(), "little")
538059071683667711846616050503420899184350089339
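
The byte order matters. With "big" you should get the same number as the
hex-string detour, so it can stand in for int(h.hexdigest(), 16). A quick
check, as a sketch that assumes nothing beyond hashlib and the built-in int
(the variable names are only for illustration):

```python
import hashlib

h = hashlib.sha1(b"Hello world")

# Interpret the raw digest bytes as one big-endian unsigned integer
from_digest = int.from_bytes(h.digest(), "big")

# The hex-string route from the question, for comparison
from_hex = int(h.hexdigest(), 16)

print(from_digest == from_hex)
```

The "little" variant is just as usable as a key, it simply produces a
different integer for the same digest.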

> I dunno why language designers don't make it easy to get a single big
> number directly out of these things.
 
You hardly ever need to manipulate the numerical value of the digest. And on 
its way into the database it will be re-serialized anyway.
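
One caveat if the integer is meant to be the key itself: an SQLite INTEGER
is a signed 64-bit value, while a SHA-1 digest has 160 bits, so the full
number will not fit. A minimal sketch, assuming you are willing to truncate
the digest to its first 8 bytes (key_for() and the table layout are made up
for this example, and truncation makes collisions more likely than with the
full digest):

```python
import hashlib
import sqlite3


def key_for(text):
    """Hypothetical helper: first 8 digest bytes as a signed 64-bit int."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    # signed=True keeps the result inside SQLite's INTEGER range
    return int.from_bytes(digest[:8], "big", signed=True)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT)")
for word in ["spam", "ham", "eggs"]:
    db.execute("INSERT INTO words VALUES (?, ?)", (key_for(word), word))

# Look a row up again by recomputing its key
row = db.execute(
    "SELECT word FROM words WHERE id = ?", (key_for("ham"),)
).fetchone()
print(row[0])
```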
 
> I just had a look at the struct module's fearsome documentation &
> think it would present a good shoot(self, foot) opportunity.
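
For a fixed-size piece of the digest struct is less scary than its
documentation suggests. A small sketch, again assuming that only the first
8 bytes are wanted (the names are illustrative), showing that it agrees
with int.from_bytes():

```python
import hashlib
import struct

digest = hashlib.sha1(b"Hello world").digest()

# ">q" means: big-endian byte order, one signed 64-bit integer (8 bytes)
(via_struct,) = struct.unpack(">q", digest[:8])

# The same 8 bytes converted without struct
via_int = int.from_bytes(digest[:8], "big", signed=True)

print(via_struct == via_int)
```

struct.unpack() always returns a tuple, hence the one-element unpacking on
the left-hand side.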





