creating/modifying sparse files on linux

Terry Reedy tjreedy at udel.edu
Wed Aug 17 17:07:58 EDT 2005


<draghuram at gmail.com> wrote in message 
news:1124304819.090356.200280 at f14g2000cwb.googlegroups.com...
> Is there any special support for sparse file handling in python?

I have not heard of any in several years, so I suspect not.  CPython, as
normally compiled, uses the standard C stdio library.  If your system and C
compiler provide a sparse-I/O library, you would probably have to compile
CPython specially to use it.

> options.size = 6442450944
> options.ranges = ["4096,1024","30000,314572800"]

options.ranges = [(4096,1024),(30000,314572800)] # makes below nicer

> fd = open("testfile", "w")
> fd.seek(options.size-1)
> fd.write("a")
> for drange in options.ranges:
>    off = int(drange.split(",")[0])
>    len = int(drange.split(",")[1])

off,len = map(int, drange.split(",")) # or
off,len = [int(s) for s in drange.split(",")] # or, for tuples as suggested above:
off,len = drange

>    print "off =", off, " len =", len
>    fd.seek(off)
>    for x in range(len):

If I read the above right, the second len is over 300,000,000, so range(len)
builds a list that needs a few gigabytes of memory.  I suspect this is where
you started thrashing ;-).  Instead:

  for x in xrange(len): # this is what xrange is for ;-)

>    fd.write("a")

Without indentation, this is a syntax error, so if your code ran at all, this
cannot be an exact copy.  Even with the xrange fix, 300,000,000
single-character writes will be slow.  I would expect a real application to
create or accumulate chunks larger than single chars.
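
As a rough illustration of writing in larger chunks, here is a minimal
sketch.  It is written for a current Python 3 rather than the Python 2 of
the quoted code, and the function name fill_ranges, the chunk_size default,
and the use of truncate() to set the file length are my assumptions, not
anything from the original post.

```python
import os


def fill_ranges(path, size, ranges, chunk_size=1 << 20, fill=b"a"):
    """Create a file of `size` bytes and fill each (offset, length) range.

    Hypothetical helper: data is written in chunks of up to `chunk_size`
    bytes instead of one character per write() call.
    """
    chunk = fill * chunk_size  # one reusable buffer of fill bytes
    with open(path, "wb") as fd:
        fd.truncate(size)  # set the file length without writing data
        for off, length in ranges:
            fd.seek(off)
            remaining = length
            while remaining > 0:
                n = min(remaining, chunk_size)
                fd.write(chunk[:n])
                remaining -= n
```

With the ranges from the question this would be called as
fill_ranges("testfile", 6442450944, [(4096, 1024), (30000, 314572800)]).
Whether the untouched regions actually occupy disk space depends on the
filesystem.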

> fd.close()
>
> This piece of code takes very long time and in fact I had to kill it as
> the linux system started doing lot of swapping. Am I doing something
> wrong here?

See above

> Is there a better way to create/modify sparse files?

Unless you can access built-in facilities, create your own mapping index.
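
One possible reading of "your own mapping index" is a dictionary that maps
block numbers to the blocks that have actually been written.  The sketch
below is only that: the class name SparseStore, the 4096-byte block size,
and the in-memory storage are all hypothetical choices for illustration
(Python 3 syntax), not an existing library.

```python
class SparseStore:
    """Hypothetical in-memory sparse store: only written blocks use memory."""

    def __init__(self, size, block_size=4096):
        self.size = size              # logical size in bytes
        self.block_size = block_size
        self.blocks = {}              # block index -> bytearray(block_size)

    def write(self, offset, data):
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside the logical size")
        pos = 0
        while pos < len(data):
            index, start = divmod(offset + pos, self.block_size)
            n = min(self.block_size - start, len(data) - pos)
            block = self.blocks.get(index)
            if block is None:
                # Allocate a zero-filled block on first touch.
                block = self.blocks[index] = bytearray(self.block_size)
            block[start:start + n] = data[pos:pos + n]
            pos += n

    def read(self, offset, length):
        if offset < 0 or length < 0:
            raise ValueError("negative offset or length")
        length = max(0, min(length, self.size - offset))
        out = bytearray(length)  # unwritten regions read back as zeros
        pos = 0
        while pos < length:
            index, start = divmod(offset + pos, self.block_size)
            n = min(self.block_size - start, length - pos)
            block = self.blocks.get(index)
            if block is not None:
                out[pos:pos + n] = block[start:start + n]
            pos += n
        return bytes(out)
```

A store created with SparseStore(6442450944) allocates nothing until
something is written, and a write of 1024 bytes at offset 4096 touches a
single block.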

Terry J. Reedy





