Manipulate Large Binary Files

Steve Holden steve at holdenweb.com
Wed Apr 2 12:09:03 EDT 2008


Derek Tracy wrote:
> I am trying to write a script that reads in a large binary file (over 
> 2 GB), saves the header (169088 bytes) into one file, then takes the 
> rest of the data and dumps it into another file.  I generated code that 
> works wonderfully for files under 2 GB in size, but the majority of the 
> files I am dealing with are over the 2 GB limit.
> 
> INPUT = open(infile, 'rb')
> header = INPUT.read(169088)
> 
> ary = array.array('H', INPUT.read())
> 
Replace this line with a loop that reads the rest of the file in chunks 
of 10 MB or 100 MB and writes each chunk to the output file as it goes. 
There is little reason to read the whole file into memory just to write 
it out again.
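
A minimal sketch of such a loop follows. The names split_file, 
header_out, data_out and CHUNK_SIZE are made up for illustration, and the 
10 MB chunk size is an arbitrary choice. It assumes the data after the 
header can be copied as raw bytes, which should match the array round 
trip in the quoted code because the array is written back out unchanged. 
On Python 2.5 the with statement needs 
"from __future__ import with_statement" at the top of the script.

```python
HEADER_SIZE = 169088            # header length in bytes, from the original post
CHUNK_SIZE = 10 * 1024 * 1024   # 10 MB per read; arbitrary, tune as needed


def split_file(infile, header_out, data_out,
               header_size=HEADER_SIZE, chunk_size=CHUNK_SIZE):
    """Write the first header_size bytes of infile to header_out
    and everything after them to data_out, one chunk at a time."""
    with open(infile, 'rb') as src:
        # The header is small, so a single read is fine.
        with open(header_out, 'wb') as dst:
            dst.write(src.read(header_size))

        # Copy the remainder without ever holding more than one chunk.
        with open(data_out, 'wb') as dst:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:       # an empty read means end of file
                    break
                dst.write(chunk)
```

Calling split_file(infile, outfile1, outfile2) would then stand in for 
the read, array and tofile steps in the quoted code. The standard 
library's shutil.copyfileobj(src, dst, length) does the same kind of 
chunked copy and could replace the while loop.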

regards
  Steve

> INPUT.close()
> 
> OUTF1 = open(outfile1, 'wb')
> OUTF1.write(header)
> 
> OUTF2 = open(outfile2, 'wb')
> ary.tofile(OUTF2)
> 
> 
> When I try to use the above on files over 2 GB I get:
>      OverflowError: requested number of bytes is more than a Python string can hold
> 
> Does anybody have an idea as to how I can get by this hurdle?
> 
> I am working in an environment that does not allow me to freely download 
> modules to use.  Python version 2.5.1
> 
> 
> 
> R/S --
> ---------------------------------
> Derek Tracy
> tracyde at gmail.com
> ---------------------------------
> 


-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC              http://www.holdenweb.com/



