Creating a file with $SIZE

rbossy at jouy.inra.fr rbossy at jouy.inra.fr
Sat Mar 15 14:39:37 EDT 2008


Quoting Bryan Olson <fakeaddress at nowhere.org>:

> Robert Bossy wrote:
> > Bryan Olson wrote:
> >> Robert Bossy wrote:
> >>>> Robert Bossy wrote:
> >>>>> Indeed! Maybe the best choice for chunksize would be the file's buffer
> >>>>> size...
> >>
> >> That bit strikes me as silly.
> >>
> > The size of the chunk must be as little as possible in order to minimize
> > memory consumption. However below the buffer-size, you'll end up filling
> > the buffer anyway before actually writing on disk.
>
> First, which buffer? The file library's buffer is of trivial size,
> a few KB, and if we wanted to save even that we'd use os.open and
> have no such buffer at all. The OS may set up a file-specific
> buffer, but again those are small, and we could fill our file much
> faster with larger writes.
>
> Kernel buffers/pages are dynamically assigned on modern operating
> systems. There is no particular buffer size for the file if you mean
> the amount of kernel memory holding the written data. Some OS's
> do not buffer writes to disk files; the write doesn't return until
> the data goes to disk (though they may cache it for future reads).
>
> To fill the file fast, there's a large range of reasonable sizes
> for writing, but user-space buffer size - typically around 4K - is
> too small. 1 GB is often disastrously large, forcing paging to and
> from disk to access the memory. In this thread, Matt Nordhoff used
> 10MB; a fine size today, and probably for several years to come.
>
> If the OP is writing to a remote disk file to test network
> throughput, there's another size limit to consider. Network
> file-system protocols do not stream very large writes; the client has to
> break a large write into several smaller writes. NFS version 2 had
> a limit of 8 KB; version 3 removed the limit by allowing the server
> to tell the client the largest size it supports. (Version 4 is now
> out, in hundreds of pages of RFC that I hope to avoid reading.)

Wow. That's a lot of knowledge in a single post. Thanks for the information, Bryan.
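
For the archives, here is a minimal sketch of the chunked approach discussed
in this thread. The function name create_file and its parameters are my own
choices for illustration, and the 10 MB default is simply the size Matt
Nordhoff used; treat it as a starting point to tune, not a recommendation
for every system.

```python
import os

# 10 MB: the chunk size mentioned in this thread. This is an assumption to
# tune, not a universally optimal value.
CHUNK_SIZE = 10 * 1024 * 1024


def create_file(path, size, chunk_size=CHUNK_SIZE):
    """Create a file of exactly `size` bytes, filled with zero bytes.

    Data is written in pieces of at most `chunk_size` bytes so that memory
    use stays bounded no matter how large the file is.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    # Build one reusable chunk, never larger than the file itself.
    chunk = b"\0" * min(chunk_size, size)
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(len(chunk), remaining)
            # The final write may be shorter than a full chunk.
            f.write(chunk[:n])
            remaining -= n
    return os.path.getsize(path)
```

Calling create_file("test.bin", 10**9) would write roughly a gigabyte while
holding only one 10 MB chunk in memory at a time. For a network file system,
the client will still split each write into whatever size the protocol
allows, as Bryan describes.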

Cheers,
RB


