Fast forward-backward (write-read)

Virgil Stokes <vs at it.uu.se>
Wed Oct 24 03:19:36 EDT 2012


On 24-Oct-2012 00:53, Steven D'Aprano wrote:
> On Tue, 23 Oct 2012 17:50:55 -0400, David Hutto wrote:
>
>> On Tue, Oct 23, 2012 at 10:31 AM, Virgil Stokes <vs at it.uu.se> wrote:
>>> I am working with some rather large data files (>100GB)
> [...]
>>> Finally, to my question --- What is a fast way to write these variables
>>> to an external file and then read them in backwards?
>> Don't forget to use timeit for an average OS utilization.
> Given that the data files are larger than 100 gigabytes, the time
> required to process each file is likely to be in hours, not microseconds.
> That being the case, timeit is the wrong tool for the job: it is
> optimized for timing tiny code snippets. You could use it, of course,
> but the added inconvenience doesn't gain you any added accuracy.
>
> Here's a neat context manager that makes timing long-running code simple:
>
>
> http://code.activestate.com/recipes/577896
Thanks for this link.
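
A minimal sketch of such a timer (not the linked recipe itself -- the ActiveState version is more featureful; class and attribute names here are illustrative):

```python
import time

class Timer:
    """Context manager that records wall-clock time of the `with` block."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        # Record elapsed seconds even if the block raised.
        self.elapsed = time.perf_counter() - self.start
        return False  # never suppress exceptions
```

Usage is just `with Timer() as t: process_file(...)`, then inspect `t.elapsed` -- convenient for runs measured in minutes or hours rather than microseconds.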
>
>
>
>> I'd suggest two list comprehensions for now, until I've reviewed it some
>> more:
> I would be very surprised if the poster will be able to fit 100 gigabytes
> of data into even a single list comprehension, let alone two.
You are correct, and I have been looking at working with blocks sized to the RAM 
available for processing.
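
For reference, one hedged sketch of the back-to-front read in RAM-sized blocks (function name and default block size are illustrative, not from the thread):

```python
import os

def read_blocks_backwards(path, block_size=64 * 1024 * 1024):
    """Yield blocks of a binary file from the end toward the beginning.

    Each yielded block is in normal file order internally; only the
    sequence of blocks is reversed. block_size should fit in RAM.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        while pos > 0:
            size = min(block_size, pos)  # first block may be short
            pos -= size
            f.seek(pos)
            yield f.read(size)
```

If the variables are fixed-width records, choosing block_size as a multiple of the record width guarantees that no record straddles a block boundary, so each block can be decoded (and iterated in reverse) independently.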
>
> This is a classic example of why the old external processing algorithms
> of the 1960s and 70s will never be obsolete. No matter how much memory
> you have, there will always be times when you want to process more data
> than you can fit into memory.
>
>
>
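The classic external pattern referred to above is: sort RAM-sized runs, spill them to disk, then stream a k-way merge. A toy sketch (all names illustrative; `heapq.merge` does the streaming merge):

```python
import heapq
import tempfile

def _spill(sorted_chunk):
    """Write one sorted run to a temp file, one integer per line."""
    fh = tempfile.NamedTemporaryFile("w", delete=False, suffix=".run")
    fh.writelines(f"{x}\n" for x in sorted_chunk)
    fh.close()
    return fh.name

def external_sort(numbers, run_size=1_000_000):
    """Sort an arbitrarily long iterable of ints using bounded memory."""
    runs, chunk = [], []
    for x in numbers:
        chunk.append(x)
        if len(chunk) >= run_size:
            runs.append(_spill(sorted(chunk)))
            chunk = []
    if chunk:
        runs.append(_spill(sorted(chunk)))
    files = [open(r) for r in runs]
    try:
        # Each run is already sorted, so merge streams them lazily.
        streams = [(int(line) for line in fh) for fh in files]
        yield from heapq.merge(*streams)
    finally:
        for fh in files:
            fh.close()
```

Only one run plus the merge heads is ever in memory at once, which is exactly why these 1960s-era techniques still apply to a 100 GB file.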
Thanks for your insights :-)


