Problem reading/writing files

John Machin sjmachin at lexicon.net
Thu Aug 3 23:59:48 EDT 2006


smeenehan at hmc.edu wrote:
> This is a bit of a peculiar problem. First off, this relates to Python
> Challenge #12, so if you are attempting those and have yet to finish
> #12, be warned that there are potential spoilers here.
>
> I have five different image files shuffled up in one big binary file.
> In order to view them I have to "unshuffle" the data, which means
> moving bytes around. Currently my approach is to read the data from the
> original, unshuffle as necessary, and then write to 5 different files
> (2 .jpgs, 2 .pngs and 1 .gif).
>
> The problem is with the read() method. If I read a byte valued as 0x00
> (in hexadecimal), the read method returns a character with the value
> 0x20.

I doubt it. What platform? What version of Python? Have you opened the
file in binary mode, i.e. open('thefile', 'rb')? Show us the relevant
parts of your code, plus what caused you to conclude that read()
changed data on the fly in an undocumented fashion.
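
One way to test whether read() alters bytes is a round trip: write every
possible byte value to a scratch file, read it back in binary mode, and
compare. The sketch below is written for Python 3 (the session in this post
is Python 2). The helper name roundtrip_all_bytes and the temporary file are
assumptions made only for illustration.

```python
import os
import tempfile

def roundtrip_all_bytes():
    """Write bytes 0x00..0xFF to a scratch file, then read them back in binary mode."""
    original = bytes(range(256))
    fd, path = tempfile.mkstemp()
    try:
        # 'wb' and 'rb' ask for binary mode: no newline translation is applied.
        with os.fdopen(fd, 'wb') as f:
            f.write(original)
        with open(path, 'rb') as f:
            data = f.read()
    finally:
        os.remove(path)
    return original, data

original, data = roundtrip_all_bytes()
print(data == original)   # True if every byte survived the round trip
print(repr(data[:1]))     # repr of the first byte that was read back
```

If the comparison holds, the corruption is more likely to come from how the
file was opened or from the unshuffling code than from read() itself. On
Windows, for instance, a file opened without the 'b' flag undergoes newline
translation, which is one common way binary data gets altered.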

> When printed as strings, these two values look the same (null and
> space, respectively),

Use the repr() function when you want to see what's *really* in an
object:

#>>> hasnul = 'a\x00b'
#>>> hasspace = 'a\x20b'
#>>> print hasnul, hasspace
a b a b
#>>> print repr(hasnul), repr(hasspace)
'a\x00b' 'a b'
#>>>
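
The same check can be sketched in Python 3, where binary data read from a
file is a bytes object and iterating over it yields integers. The helper
name describe is hypothetical; it pairs the repr() with the numeric value of
each byte, so a NUL and a space cannot be mistaken for one another.

```python
hasnul = b'a\x00b'
hasspace = b'a\x20b'

def describe(data):
    """Return the repr of a bytes object plus the hex value of each byte."""
    return repr(data), [hex(b) for b in data]

print(describe(hasnul))
print(describe(hasspace))
```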


> but obviously this screws with the data and makes
> the resulting image file unreadable. I can add a simple if statement to
> correct this, which seems to make the .jpgs readable, but the .pngs
> still have errors and the .gif is corrupted, which makes me wonder if
> the read method is not doing this to other bytes as well.
>
> Now, the *really* peculiar thing is that I made a simple little file
> and used my hex editor to manually change the first byte to 0x00. When
> I read that byte with the read() method, it returned the correct value,
> which boggles me.
>
> Anyone have any idea what could be going on? Alternatively, is there a
> better way to shift about bytes in a non-text file without using the
> read() method (since returning the byte as a string seems to be what's
> causing the issue)?

"seems to be" != "is" :-)

There is no simple better way. We need to establish what you are
actually doing to cause this problem to seem to happen. Kindly answer
the questions above ;-)

Cheers,
John



