shuffle the lines of a large file

Heiko Wundram modelnine at ceosg.de
Fri Mar 11 00:59:33 EST 2005


On Tuesday 08 March 2005 15:55, Simon Brunning wrote:
> Ah, but that's the clever bit; it *doesn't* store the whole list -
> only the selected lines.

But that means it only ever reads a few lines from the file; it never 
shuffles the whole file content... To shuffle the whole file, you'd either 
have to set lines=1 and throw away repeated lines in subsequent runs, or set 
lines higher and still deal with repeats in the results in some way (throw 
them away... :-). Either way, you'd have to store which lines you've already 
read (selected_lines), so in any case you'd need a line cache of 10^9 entries 
for this amount of data...
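
To make the bookkeeping concrete, here is a minimal sketch of the lines=1 
variant, not the recipe under discussion itself. The function names 
pick_unseen_line and shuffle_by_repeated_selection are made up for 
illustration; only selected_lines is the name used above. Each pass 
picks one not-yet-selected line uniformly at random, and the line number has 
to be remembered so later passes can skip it:

```python
import random


def pick_unseen_line(path, selected_lines, rng):
    """One pass over the file: pick one line not yet in selected_lines.

    Uses reservoir sampling with a sample size of 1, so only the current
    candidate line is held in memory during the pass.
    """
    choice = None
    seen = 0
    with open(path) as f:
        for lineno, line in enumerate(f):
            if lineno in selected_lines:
                continue  # already emitted in an earlier pass
            seen += 1
            # Replace the candidate with probability 1/seen.
            if rng.randrange(seen) == 0:
                choice = (lineno, line)
    return choice


def shuffle_by_repeated_selection(path, rng=random):
    """Yield every line of the file once, in random order (hypothetical helper).

    selected_lines grows by one entry per emitted line, so it ends up
    with as many entries as the file has lines.
    """
    selected_lines = set()
    while True:
        choice = pick_unseen_line(path, selected_lines, rng)
        if choice is None:
            return  # every line has been selected
        lineno, line = choice
        selected_lines.add(lineno)
        yield line
```

The lines themselves are never all held at once, but selected_lines ends up 
with one entry per line of the file, and each output line costs a full pass 
over the input. That is the cache mentioned above.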

That's just what I wanted to say...

-- 
--- Heiko.