Creating huge data in very little time.

Dave Angel davea at ieee.org
Tue Mar 31 10:15:52 EDT 2009


I wrote a tiny DOS program called resize that simply did a seek out to a 
user-specified point and wrote zero bytes. One documented side effect of 
DOS was that writing zero bytes would truncate the file at that point.  
But it also worked to extend the file to that point without writing any 
actual data.  The net effect was that it adjusted the FAT without 
touching any of the data.  It was used frequently for file recovery, 
unformatting, and so on, and it was very fast.

Unfortunately, although the program still runs under NT (which includes 
Win 2000, XP, ...), the security system insists on zeroing all the 
intervening sectors, which obviously takes much time.

Still, if the data is not important (make the first sector unique, and 
the rest zeroes), this would probably be the fastest way to get all 
those files created.  Just write the file name in the first sector 
(since we'll separately make sure the filename is unique), and then 
seek out to a billion and write one more byte.  I won't assume that 
writing zero bytes would work on Unix.
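
For what it's worth, here is a minimal Python sketch of that idea (the 
function name make_big_file, the one-gigabyte size, and the file-naming 
scheme are just illustrative; on NTFS the OS may still zero-fill the 
gap, while most Unix filesystems will give you a sparse file almost 
instantly):

ONE_GB = 10**9  # target size in bytes (illustrative)

def make_big_file(path, size=ONE_GB):
    """Create a file of `size` bytes whose first block is unique.

    Writes the file name into the first sector, seeks near the end,
    and writes a single byte.
    """
    with open(path, "wb") as f:
        f.write(path.encode("utf-8"))   # unique first "sector": the file name
        f.seek(size - 1)                # jump (almost) to the end
        f.write(b"\0")                  # one byte; file is now `size` bytes long

# e.g. create 1000 uniquely named ~1 GB files:
# for i in range(1000):
#     make_big_file("bigfile_%04d.dat" % i)

Whether the seek actually avoids writing the intervening data depends on 
the filesystem, so it's worth timing on the target machine first.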

andrea wrote:
> On 31 Mar, 12:14, "venutaurus... at gmail.com" <venutaurus... at gmail.com>
> wrote:
>   
>> That time is reasonable. The randomness should be in such a way that
>> the MD5 checksum of no two files should be the same. The main reason for
>> having such a huge data is for doing stress testing of our product.
>>     
>
>
> If randomness is not necessary (as I understood), you can just create
> one single file and then modify one bit of it iteratively 1000
> times.
> That's enough to make the checksum change.
>
> Is there a way to create a file that big without actually writing
> anything in python (just give me the garbage that is already on the
> disk)?
>
>   
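
For completeness, a minimal sketch of the bit-flipping idea andrea 
describes above (make_variants and the copy-naming scheme are just 
illustrative, and it assumes the base file is at least `count` bytes 
long):

import shutil

def make_variants(base_path, count=1000):
    """Copy base_path `count` times, flipping one bit at a different
    offset in each copy so that every copy's MD5 checksum differs."""
    for i in range(count):
        copy_path = "%s.%04d" % (base_path, i)
        shutil.copyfile(base_path, copy_path)
        with open(copy_path, "r+b") as f:
            f.seek(i)                          # a different offset per copy
            byte = f.read(1)
            f.seek(i)
            f.write(bytes([byte[0] ^ 0x01]))   # flip the low bit of that byte

Note that copying the base file really does write all of its data each 
time, so this is slower than the sparse-file trick, but flipping a 
different bit in each copy guarantees that no two copies share an MD5.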


