random writing access to a file in Python

Paul Rubin http
Sat Aug 26 18:54:40 EDT 2006


Claudio Grondi <claudio.grondi at freenet.de> writes:
> > Try the standard Unix/Linux sort utility.  Use the --buffer-size=SIZE
> > to tell it how much memory to use.
> I am on Windows and it seems, that Windows XP SP2 'sort' can work with
> the file, but not without a temporary file and space for the resulting
> file,  so triple of space of the file to sort must be provided.

Oh, sorry, I didn't see the earlier parts of the thread.  Anyway,
depending on the application, it's probably not worth the hassle of
coding something yourself, instead of just throwing more disk space at
the Unix utility.  But especially if the fields are fixed size, you
could just mmap the file and then do quicksort on disk.  Simplest
would be to just let the OS paging system take care of caching stuff
if you wanted to get fancy, you could sort in memory once the sorting
regions got below a certain size.

A huge amount of stuff has been written (e.g. about half of Knuth vol
3) about how to sort.  Remember too, that traditionally large-scale
sorting was done on serial media like tape drives, so random access
isn't that vital.



More information about the Python-list mailing list