[issue17436] hashlib: add a method to hash the content of a file

STINNER Victor report at bugs.python.org
Fri Apr 1 09:23:23 EDT 2016


STINNER Victor added the comment:

> I added a new method to the hash object named fromfile().

Usually, fromxxx() is used to create a new object. In your case, the method updates an existing hash object instead, so I would prefer the name "readfile".

IMHO you need two methods:

* hashobj.readfile(filename: str)
* hashobj.readfileobj(file) where file is an object with a read() method that returns byte strings

The implementations of the two methods can be very different. In readfile(), you know that it's a regular file which exists on the file system, so you can directly use _Py_fstat() to get st_blksize and then loop on _Py_read().
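
To illustrate the readfile() idea, here is a rough pure-Python sketch. It is only an approximation of what a C implementation would do: os.fstat() and os.read() stand in for _Py_fstat() and _Py_read(), and the function name update_from_path and the fallback_blksize parameter are made up for this example (st_blksize is not available on every platform, hence the fallback).

```python
import hashlib
import os


def update_from_path(hashobj, filename, fallback_blksize=64 * 1024):
    """Feed the content of a regular file into an existing hash object.

    Hypothetical helper, not an existing hashlib API.
    """
    # O_BINARY only exists on Windows; elsewhere the flag is simply 0.
    fd = os.open(filename, os.O_RDONLY | getattr(os, "O_BINARY", 0))
    try:
        # Use the block size reported by the file system when available.
        blksize = getattr(os.fstat(fd), "st_blksize", 0) or fallback_blksize
        while True:
            chunk = os.read(fd, blksize)
            if not chunk:  # an empty read means end of file
                break
            hashobj.update(chunk)
    finally:
        os.close(fd)
    return hashobj
```

The hash object is returned only for convenience, so that a call such as update_from_path(hashlib.sha256(), path).hexdigest() can be written on one line.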

For readfileobj(), the file object doesn't need to exist on disk: fileno() may raise an exception or not exist at all.

I suggest looking at the copyfile() and copyfileobj() functions of the shutil module. For example, copyfileobj() has an optional parameter for the buffer size. You should probably use the same approach to avoid a complex heuristic for guessing the optimal buffer size.
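
For the readfileobj() variant, a minimal sketch modelled on shutil.copyfileobj() could look like the following. The name update_from_fileobj and the default of 64 KiB are assumptions for the example, not part of any patch; the only requirement on the file object is a read() method returning bytes.

```python
import hashlib
import io


def update_from_fileobj(hashobj, fileobj, length=64 * 1024):
    """Feed everything readable from fileobj into an existing hash object.

    Hypothetical helper; 'length' plays the same role as the optional
    buffer size argument of shutil.copyfileobj().
    """
    read = fileobj.read
    while True:
        chunk = read(length)
        if not chunk:  # b"" signals end of stream
            break
        hashobj.update(chunk)
    return hashobj


# Works with in-memory objects that have no usable fileno().
digest = update_from_fileobj(hashlib.sha256(), io.BytesIO(b"hello")).hexdigest()
```

Leaving the buffer size to the caller keeps the function free of any guessing about the underlying storage.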

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue17436>
_______________________________________

