problem with read() write()

Alf P. Steinbach alfps at start.no
Sun Nov 1 04:44:45 EST 2009


* Gertjan Klein:
> Alf P. Steinbach wrote:
> 
>> So with 'w+' the only way to get garbage is if 'read' reads beyond the end of 
>> file, or 'open' doesn't conform to the documentation.
> 
> It does read beyond the end of file. This is perhaps the way the
> underlying C library works, but it looks like an "unexpected feature"
> (read: bug) to me.
> 
> I reproduced (with Python 2.5.2 on WinXP) the code the OP wrote after
> creating an empty (0-byte) test file; after the write() the read()
> returns random garbage. I can't imagine why anyone would want that
> behaviour. The file grew to be 4099 bytes after f.close(). I wrote
> 'hello' to it, so the length of garbage added was 4094 bytes, which I
> find a strange number also.

Could you post (copy and paste) the code, and description of results?


> I would have expected the read to return nothing. Can anyone explain or
> even defend this behaviour?

I'm just a Python newbie, but in C and C++ such things are usually down to 
"undefined behavior", that is, the program doing something that is implicitly or 
explicitly defined as undefined behavior by the language standard.

With UB the effect may then be something or nothing or anything or just what you 
expected; appearance of the infamous nasal demons is one possibility...

Quoting n869, which is the January 18th 1999 draft of the C99 standard:

   §7.19.5.3/6
   When a file is opened with update mode (’+’ as the second or third
   character in the above list of mode argument values), both input and
   output may be performed on the associated stream. However, output shall
   not be directly followed by input without an intervening call to the
   fflush function or to a file positioning function (fseek, fsetpos, or
   rewind), and input shall not be directly followed by output without an
   intervening call to a file positioning function, unless the input
   operation encounters end-of-file. Opening (or creating) a text file with
   update mode may instead open (or create) a binary stream in some
   implementations.

"Shall not" means UB. This applies to C "FILE*" handling.

AFAICS nothing except efficiency prevents the Python wrapper, if "FILE*" is what 
it uses, from automatically inserting an appropriate fflush or fseek.
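
A rough sketch of that idea, written at the Python level rather than inside the
interpreter. The class name UpdateFile and its bookkeeping are made up for
illustration; this is not how Python's file object is implemented, just one way
a wrapper could insert the repositioning call whenever the direction of I/O
changes:

```python
import os
import tempfile


class UpdateFile(object):
    """Hypothetical wrapper: repositions the stream on every read/write switch."""

    def __init__(self, path, mode="w+b"):
        self._f = open(path, mode)
        self._last = None  # "r", "w", or None right after open/seek

    def _switch(self, op):
        # When the direction changes, do a zero-offset seek from the current
        # position. It does not move the file pointer, but it counts as the
        # intervening file positioning call that the C rule requires.
        if self._last is not None and self._last != op:
            self._f.seek(0, os.SEEK_CUR)
        self._last = op

    def read(self, size=-1):
        self._switch("r")
        return self._f.read(size)

    def write(self, data):
        self._switch("w")
        return self._f.write(data)

    def seek(self, offset, whence=os.SEEK_SET):
        # An explicit seek already satisfies the rule, so reset the state.
        self._last = None
        return self._f.seek(offset, whence)

    def close(self):
        self._f.close()


# Usage: write, read straight away, then rewind and read again.
fd, path = tempfile.mkstemp()
os.close(fd)
f = UpdateFile(path)
try:
    f.write(b"hello")
    after_write = f.read()  # position is at end of file here
    f.seek(0)
    after_seek = f.read()
finally:
    f.close()
    os.remove(path)
```

The cost is one extra zero-offset seek per direction change, which I assume is
the kind of overhead that would count against doing it by default.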

And for a language used by so many newbies (this is positive!) I agree that it 
should ideally get rid of that UB (assuming that's what the problem is), or, if 
it doesn't already, mention that in the Python documentation.
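
What such a documentation note could show, as a minimal sketch (the helper name
write_then_read and the temporary file are only for the example): put a seek
between the write and the read, which is the intervening file positioning call
that the quoted paragraph asks for.

```python
import os
import tempfile


def write_then_read(path, data):
    # 'w+' creates or truncates the file and opens it for update.
    f = open(path, "w+")
    try:
        f.write(data)
        # Output must not be directly followed by input: reposition first.
        f.seek(0)
        return f.read()
    finally:
        f.close()


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    result = write_then_read(path, "hello")
finally:
    os.remove(path)
```

With the seek in place the read starts from the beginning of the file and gets
back what was just written, instead of whatever happens to be in the buffer.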


Cheers,

- Alf


