Bug 3.11.x behavioral, open file buffers not flushed til file closed.

Sun Mar 5 20:40:07 EST 2023

On 3/5/23 09:35, aapost wrote:
> I have run into this a few times and finally reproduced it. Whether it 
> is as expected I am not sure since it is slightly on the user, but I can 
> think of scenarios where this would be undesirable behavior. This 
> occurs on 3.11.1 and 3.11.2 using Debian 12 testing, in case the 
> reasoning lingers somewhere else.
> 
> If a file is still open, even if all the operations on the file have 
> ceased for a time, the tail of the written operation data does not get 
> flushed to the file until close is issued and the file closes cleanly.
> 
> 2 methods to recreate - 1st run from interpreter directly:
> 
> f = open("abc", "w")
> for i in range(50000):
>    f.write(str(i) + "\n")
> 
> you can cat the file and see it stops at 49626 until you issue an f.close()
> 
> a script to recreate:
> 
> f = open("abc", "w")
> for i in range(50000):
>    f.write(str(i) + "\n")
> while True:  # keep the process alive with the file still open
>    pass
> 
> cat out the file and same thing, it stops at 49626. A ctrl-c exit closes 
> the files cleanly, but if the process exits uncleanly, e.g. from a kill 
> command or something else catastrophic, the remaining buffer is lost.
> 
> Of course one SHOULD manage the closing of their files and this is 
> partially on the user, but if by design something is hanging on to a 
> file while it is waiting for something, then a crash occurs, they lose a 
> portion of what was assumed already complete...

 >Cameron
 >Eryk

Yeah, I later noticed open() has the buffering option in the docs, and 
the warning on a subsequent page:

Warning
Calling f.write() without using the with keyword or calling f.close() 
might result in the arguments of f.write() not being completely written 
to the disk, even if the program exits successfully.

I will have to set the buffering argument to 1 (line buffering, which 
only applies in text mode). I just hadn't thought about buffering in 
quite a while, since Python handles most of the things lower-level 
languages don't. I guess my (of course incorrect) assumptions leaned 
toward some sort of automatic flush, or an unbuffered default (not 
saying it should be).
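
For anyone who wants to see the difference, here is a rough sketch that 
compares the default buffering with buffering=1. The helper name 
bytes_on_disk_before_close and the use of a temp file are just my own 
choices for illustration; it only checks how much of the data another 
process (e.g. cat) could see while the file is still open.

```python
import os
import tempfile


def bytes_on_disk_before_close(buffering):
    """Write 50000 short lines, then compare the file size seen while the
    file is still open with the size after close().

    Hypothetical helper for illustration; buffering is passed straight
    through to open() (-1 = default, 1 = line buffering in text mode).
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        f = open(path, "w", buffering)
        for i in range(50000):
            f.write(str(i) + "\n")
        visible = os.path.getsize(path)  # what another process would see now
        f.close()                        # close() flushes whatever is left
        total = os.path.getsize(path)
        return visible, total
    finally:
        os.remove(path)


# Default buffering: the tail is still held in Python's buffer before close.
print("default  :", bytes_on_disk_before_close(-1))
# Line buffering: every write that contains a newline is flushed.
print("buffering=1:", bytes_on_disk_before_close(1))
```

With the default, the first number should come out smaller than the 
second; with buffering=1 the two should match, at the cost of a flush on 
every line.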

And I understand why it is the way it is from a developer standpoint. 
It was sort of a mental thing in the moment: I was in a sysadmin way of 
thinking, switching around from doing things in bash with multiple 
terminals, and forgetting the fundamental difference between the Python 
interpreter and a sequence of terminal commands.

That being said, while "with" is great for many use cases, I think its 
overuse causes concepts like flush and the underlying "whys" to atrophy 
(especially since flushing is obviously still an important concept). It 
also doesn't work well for quick and dirty work in the interpreter, 
building a file on the fly with a sequence of commands you haven't 
completely thought through yet: besides not wanting to close the file 
yet, the indentation requirement is annoying. f = open("fn", "w", 1) 
will be my go-to for that type of work now that I know. Again, just 
nitpicking, lol.
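
The other option, if you want to keep the default buffering and still 
leave the file open, is to call flush() yourself at the points where the 
data needs to be visible. A minimal sketch, assuming a throwaway temp 
file (the variable names are only for illustration); the os.fsync() call 
is optional and only matters if you also care about the OS or machine 
going down, not just the Python process:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, "w")              # default (block) buffering
for i in range(50000):
    f.write(str(i) + "\n")

before = os.path.getsize(path)   # tail is still sitting in Python's buffer
f.flush()                        # hand the buffered data to the OS
os.fsync(f.fileno())             # optionally ask the OS to commit it to disk
after = os.path.getsize(path)    # now the whole sequence is visible

f.write("more\n")                # the file is still open and usable
f.close()
final_size = os.path.getsize(path)
os.remove(path)

print(before, after, final_size)
```

After flush() the data survives a kill of the process, since it is no 
longer held only in the interpreter's memory.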