[Tutor] regarding checksum
Peter Otten
__peter__ at web.de
Wed Oct 26 04:34:41 EDT 2016
Clayton Kirkwood wrote:
> Small problem:
> Import zlib
> For file in files:
> checksum = zlib.adler32(file)
>
> traceback
> checksum = zlib.adler32(file)
> TypeError: a bytes-like object is required, not 'str'
>
> Obvious question, how do I make a bytes-like object. I've read through the
> documentation and didn't find a way to do this.
A checksum is calculated for a sequence of bytes (numbers in the range
0...255), but there are many ways to translate a string into such a byte
sequence. As an example, let's convert "mañana", first using UTF-8,
>>> list("mañana".encode("utf-8"))
[109, 97, 195, 177, 97, 110, 97]
then using Latin-1:
>>> list("mañana".encode("latin-1"))
[109, 97, 241, 97, 110, 97]
So which sequence should the checksum algorithm choose?
Instead of picking one at random it insists on getting bytes and requires
the user to decide:
>>> import zlib
>>> zlib.adler32("mañana".encode("utf-8"))
238748531
>>> zlib.adler32("mañana".encode("latin-1"))
178062064
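If you really do want the checksum of a string, one way to keep the encoding decision visible is to wrap it in a small function. This is only a sketch; the name text_checksum and its utf-8 default are my own choices, not something zlib provides:

```python
import zlib

def text_checksum(text, encoding="utf-8"):
    # Hypothetical helper: encode the string explicitly,
    # then compute the Adler-32 checksum of the resulting bytes.
    return zlib.adler32(text.encode(encoding))

utf8_sum = text_checksum("mañana")
latin1_sum = text_checksum("mañana", encoding="latin-1")
print(utf8_sum, latin1_sum)
```

The same text gives two different checksums, so whoever compares checksums later has to use the same encoding.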
However, your for loop
> For file in files:
> checksum = zlib.adler32(file)
suggests that you are interested in the checksum of the files' contents. To
get the bytes in the file you have to read the file in binary mode:
>>> files = "one", "two"
>>> for file in files:
...     with open(file, "rb") as f:
...         print(zlib.adler32(f.read()))
...
238748531
178062064
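Note that f.read() loads the whole file into memory. For big files you may prefer to read in chunks and pass the running checksum as the second argument of zlib.adler32(). A minimal sketch, assuming a helper name (file_checksum) and a chunk size that I made up for illustration:

```python
import zlib

def file_checksum(path, chunk_size=65536):
    # Hypothetical helper: compute Adler-32 without reading the whole file at once.
    checksum = 1  # Adler-32 starting value, the same as zlib's default
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            # Feed the previous result back in to continue the running checksum.
            checksum = zlib.adler32(chunk, checksum)
    return checksum
```

The result should be the same as zlib.adler32(f.read()) on the same file; only the memory usage differs.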