Creating huge data in very less time.

Steven D'Aprano steven at REMOVE.THIS.cybersource.com.au
Tue Mar 31 04:15:13 EDT 2009


On Mon, 30 Mar 2009 22:44:41 -0700, venutaurus539 at gmail.com wrote:

> Hello all,
>             I've a requirement where I need to create around 1000
> files under a given folder with each file size of around 1GB. The
> constraints here are each file should have random data and no two files
> should be unique even if I run the same script multiple times. 

I don't understand what you mean. "No two files should be unique" means 
literally that at most *one* file is unique, and all the others are copies 
of each other.

Do you mean that no two files should be the same?


> Moreover
> the filenames should also be unique every time I run the script. One
> possibility is that we can use Unix time format for the file   names
> with some extensions. 

That's easy. Start a counter at 0, and every time you create a new file, 
name the file by that counter, then increase the counter by one.
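
A rough sketch of the counter idea, in Python 3. The function name, the 
"file%06d.dat" pattern and the check for names already on disk are 
assumptions added for illustration; the plain counter only guarantees 
unique names within a single run, so the sketch also skips any name that 
already exists in the folder.

```python
import itertools
import os

def unique_names(folder, pattern="file%06d.dat"):
    """Yield paths inside `folder` numbered by an increasing counter."""
    for n in itertools.count():  # 0, 1, 2, ...
        path = os.path.join(folder, pattern % n)
        # Assumption: skip names left over from earlier runs, so that
        # repeated runs never reuse a name that is still on disk.
        if not os.path.exists(path):
            yield path
```

Call next() on the generator each time you need a fresh filename.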


> Can this be done within few minutes of time. Is it
> possble only using threads or can be done in any other way. This has to
> be done in Windows.

Is it possible? Sure. In a couple of minutes? I doubt it. 1000 files of 
1GB each means you are writing 1TB of data to an HDD. The fastest HDDs can 
reach about 125 MB per second under ideal circumstances, so that will 
take at least 8 seconds per 1GB file, or 8000 seconds (over two hours) in 
total. If you try to write them all in parallel, you'll probably just make 
the HDD waste time seeking backwards and forwards from one place to another.
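
To make the arithmetic concrete, here is a minimal Python 3 sketch. Both 
function names and the 1 MB chunk size are assumptions chosen for 
illustration, and os.urandom is just one possible source of random data; 
on some systems it may itself be slower than the disk. The files are meant 
to be written one after another, not in parallel.

```python
import os

def estimate_seconds(n_files, file_bytes, bytes_per_second):
    """Lower bound on total write time at a sustained throughput."""
    return n_files * file_bytes / bytes_per_second

def write_random_file(path, size, chunk_size=1024 * 1024):
    """Write `size` bytes of random data to `path`, one chunk at a time."""
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(os.urandom(n))  # fresh random bytes for every chunk
            remaining -= n

# 1000 files of 1 GB at 125 MB/s, using decimal units (1 GB = 10**9 bytes)
print(estimate_seconds(1000, 10**9, 125 * 10**6))  # 8000.0 seconds
```

Writing in chunks keeps memory use small no matter how large each file is.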



-- 
Steven



