Does hashlib support a file mode?

Chris Torek nospam at torek.net
Wed Jul 6 13:54:07 EDT 2011


>> - Do the usual dance for default arguments:
>>     def file_to_hash(path, m=None):
>>         if m is None:
>>             m = hashlib.md5()

[instead of

    def file_to_hash(path, m = hashlib.md5()):

]

In article <b317226a-8008-4177-aaa6-3fdc30125eea at e20g2000prf.googlegroups.com>
Phlip  <phlip2005 at gmail.com> wrote:
>Not sure why if that's what the defaulter does?

For the same reason that:

    def spam(somelist, so_far = []):
        for i in somelist:
            if has_eggs(i):
                so_far.append(i)
        return munch(so_far)

is probably wrong.  Most beginners appear to expect this to take
a list of "things that pass my has_eggs test", add more things
to that list, and return whatever munch(adjusted_list) returns ...
which it does.  But then they *also* expect:

    result1_on_clean_list = spam(list1)
    result2_on_clean_list = spam(list2)
    result3_on_partly_filled_list = spam(list3, prefilled3)

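to run with a "clean" so_far list for *each* of the first two
calls ... but it does not.  The default value is evaluated only
once, when the "def" statement is executed, so every call that
omits so_far shares the same list object: the first call starts
with a clean list, and the second one starts with "so_far"
containing all the results accumulated from list1.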

(The third call, of course, starts with the prefilled3 list and
adjusts that list.)
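
To see the shared list in action, here is a small runnable sketch.
The has_eggs and munch functions are hypothetical stand-ins (the real
ones are not shown in this thread): has_eggs keeps even numbers and
munch just returns a copy of the list.  spam_fixed is likewise only an
illustrative name for the same function written with the "None dance".

```python
def has_eggs(item):
    # Hypothetical stand-in for the has_eggs test: keep even numbers.
    return item % 2 == 0

def munch(items):
    # Hypothetical stand-in for munch: return a snapshot copy.
    return list(items)

def spam(somelist, so_far=[]):          # one list, created when "def" runs
    for i in somelist:
        if has_eggs(i):
            so_far.append(i)
    return munch(so_far)

def spam_fixed(somelist, so_far=None):
    if so_far is None:
        so_far = []                     # fresh list on every call
    for i in somelist:
        if has_eggs(i):
            so_far.append(i)
    return munch(so_far)

result1 = spam([1, 2, 3, 4])
result2 = spam([6, 7, 8])
fixed1 = spam_fixed([1, 2, 3, 4])
fixed2 = spam_fixed([6, 7, 8])

print(result1)            # [2, 4]
print(result2)            # [2, 4, 6, 8] -- leftovers from the first call
print(spam.__defaults__)  # ([2, 4, 6, 8],) -- the default itself has grown
print(fixed2)             # [6, 8]
```

The spam.__defaults__ tuple is where Python keeps the default values
for the function, which is why the list survives between calls.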

>I did indeed get an MD5-style string of what casually appeared
>to be the right length, so that implies the defaulter is not to
>blame...

In this case, if you do:

    print('big1:', file_to_hash('big1'))
    print('big2:', file_to_hash('big2'))

you will get two md5sum values for your two files, but the
md5sum value for big2 will not be the equivalent of "md5sum big2"
but rather that of "cat big1 big2 | md5sum".  The reason is
that you are re-using the md5-sum-so-far on the second call
(for file 'big2'), so you have the accumulated sum from file
'big1', which you then update via the contents of 'big2'.
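
The effect can be checked with a short sketch.  The body of
file_to_hash was not quoted in this thread, so the chunked read loop
below is an assumption, as are the names file_to_hash_shared and
_feed and the tiny temporary files standing in for big1 and big2.

```python
import hashlib
import os
import tempfile

def _feed(path, m):
    # Assumed body: read the file in chunks and update the hash object.
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(65536), b''):
            m.update(chunk)
    return m.hexdigest()

def file_to_hash_shared(path, m=hashlib.md5()):
    # Problematic: the md5 object is created once, when "def" runs.
    return _feed(path, m)

def file_to_hash(path, m=None):
    # The usual dance: a fresh md5 object for each call.
    if m is None:
        m = hashlib.md5()
    return _feed(path, m)

data1 = b'contents of big1\n'
data2 = b'contents of big2\n'

with tempfile.TemporaryDirectory() as d:
    big1 = os.path.join(d, 'big1')
    big2 = os.path.join(d, 'big2')
    with open(big1, 'wb') as f:
        f.write(data1)
    with open(big2, 'wb') as f:
        f.write(data2)

    shared1 = file_to_hash_shared(big1)
    shared2 = file_to_hash_shared(big2)
    fixed1 = file_to_hash(big1)
    fixed2 = file_to_hash(big2)

# The shared version's second result matches "cat big1 big2 | md5sum".
print(shared2 == hashlib.md5(data1 + data2).hexdigest())
# The fixed version's second result matches "md5sum big2".
print(fixed2 == hashlib.md5(data2).hexdigest())
```

Both versions agree on big1, which is why a single test call looks
fine; the difference only shows up from the second call onward.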
-- 
In-Real-Life: Chris Torek, Wind River Systems
Intel require I note that my opinions are not those of WRS or Intel
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: gmail (figure it out)      http://web.torek.net/torek/index.html


