Does hashlib support a file mode?

Phlip phlip2005 at gmail.com
Wed Jul 6 01:54:50 EDT 2011


Pythonistas:

Consider this hashing code:

  import hashlib
  file = open(path, 'rb')  # binary mode, so the raw bytes get hashed
  m = hashlib.md5()
  m.update(file.read())    # slurps the whole file into memory at once
  digest = m.hexdigest()
  file.close()

If the file were huge, the file.read() would allocate a big string and
thrash memory. (Yes, in 2011 that's still a problem, because these
files could be movies and whatnot.)

So if I do the stream trick - read one byte, update one byte, in a
loop - then I'm essentially dragging that movie through 8 bits of a
64-bit CPU. That trades the memory problem for a speed problem; it
would still be slow.
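
As an aside, the loop need not go byte by byte: update() can be called
repeatedly with blocks of any size, and the digest comes out the same
as hashing everything in one go. A minimal sketch, assuming a 64 KiB
block (an arbitrary pick, not a tuned value) and a made-up helper name,
md5_of_file:

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # assumed block size; not a tuned value


def md5_of_file(path, chunk_size=CHUNK_SIZE):
    """Return the hex MD5 digest of a file, read in fixed-size blocks.

    md5_of_file is a hypothetical helper name, not part of hashlib.
    """
    m = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            m.update(chunk)  # feeding blocks one by one gives the same digest
    return m.hexdigest()
```

Memory use stays at roughly one block no matter how big the movie is,
and the per-call overhead is spread over 64 KiB instead of one byte.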

So now I try this:

  import os
  sum = os.popen('sha256sum %r' % path).read()

Those of you who like to lie awake at night thinking of new ways to
flame abusers of 'eval()' may have a good vent, there.
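
For the flamers, a sketch of the same shell-out with no shell in the
way, assuming GNU sha256sum is on the PATH. Both helper names are made
up for illustration. Handing subprocess an argument list means the path
is never parsed by a shell, so nothing needs quoting:

```python
import subprocess


def sha256sum_argv(path):
    """Build the argument list for sha256sum (hypothetical helper).

    The '--' marks the end of options, so a path that starts with '-'
    is still treated as a file name.
    """
    return ["sha256sum", "--", path]


def external_sha256(path):
    """Run sha256sum on path and return just the hex digest.

    Assumes GNU sha256sum is installed; raises CalledProcessError
    if the command exits with a non-zero status.
    """
    out = subprocess.check_output(sha256sum_argv(path))
    # Output looks like "<digest>  <name>"; GNU sha256sum prefixes the
    # line with a backslash when it has to escape the file name.
    return out.split()[0].lstrip(b"\\").decode("ascii")
```

Unlike the one-liner above, this returns only the digest, not the
"digest, two spaces, file name" line that sha256sum prints.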

Does hashlib have a file-ready mode, to hide the streaming inside some
clever DMA operations?

Prematurely optimizingly y'rs

--
  Phlip
  http://bit.ly/ZeekLand



More information about the Python-list mailing list