Manipulate Large Binary Files

Derek Tracy tracyde at gmail.com
Wed Apr 2 14:09:45 EDT 2008


On Wed, Apr 2, 2008 at 10:59 AM, Derek Tracy <tracyde at gmail.com> wrote:
> I am trying to write a script that reads in a large binary file (over 2Gb), saves the header (169088 bytes) into one file, then takes the rest of the data and dumps it into another file.  I generated code that works wonderfully for files under 2Gb in size, but the majority of the files I am dealing with are over the 2Gb limit:
>
> import array
>
> INPUT = open(infile, 'rb')
> header = INPUT.read(169088)           # peel off the fixed-size header
>
> # read the entire remainder into memory as 2-byte unsigned shorts
> ary = array.array('H', INPUT.read())
>
> INPUT.close()
>
> OUTF1 = open(outfile1, 'wb')
> OUTF1.write(header)
> OUTF1.close()
>
> OUTF2 = open(outfile2, 'wb')
> ary.tofile(OUTF2)
> OUTF2.close()
>
>
> When I try to use the above on files over 2Gb I get:
>      OverflowError: requested number of bytes is more than a Python string can hold
>
> Does anybody have an idea as to how I can get by this hurdle?
>
> I am working in an environment that does not allow me to freely install third-party modules, so I am limited to the standard library.  Python version 2.5.1.
>
>
>
> R/S --
> ---------------------------------
> Derek Tracy
> tracyde at gmail.com
> ---------------------------------
>

I now have 2 solutions: one using functools.partial and the other using
array.  Rough sketches of both follow.
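
These are minimal sketches rather than the exact code I benchmarked:
infile/outfile1/outfile2 are the same placeholders as above, and the
chunk and item counts are guesses to be tuned.

The functools.partial version reads the payload in fixed-size chunks,
so the whole file never sits in memory at once:

import functools

CHUNK = 1024 * 1024   # bytes per read; tune to taste

infh = open(infile, 'rb')

out1 = open(outfile1, 'wb')
out1.write(infh.read(169088))   # header
out1.close()

out2 = open(outfile2, 'wb')
# iter() with a sentinel keeps calling infh.read(CHUNK) until it
# returns the empty string at EOF
for chunk in iter(functools.partial(infh.read, CHUNK), ''):
    out2.write(chunk)
out2.close()
infh.close()

The array version does the same job with array.fromfile()/tofile(),
pulling a bounded number of 2-byte items per pass (this assumes the
payload is an even number of bytes, since 'H' items are 2 bytes each):

import array

ITEMS = 512 * 1024   # 2-byte items per pass, i.e. 1Mb at a time

infh = open(infile, 'rb')

out1 = open(outfile1, 'wb')
out1.write(infh.read(169088))   # header
out1.close()

out2 = open(outfile2, 'wb')
while True:
    ary = array.array('H')
    try:
        # fromfile() raises EOFError on a short read, but keeps
        # whatever complete items it did manage to read
        ary.fromfile(infh, ITEMS)
    except EOFError:
        pass
    if len(ary) == 0:
        break
    ary.tofile(out2)
    if len(ary) < ITEMS:
        break
out2.close()
infh.close()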

Both are clocking in at the same time (1m 5sec for 2.6Gb).  Are there
any ways I can optimize either solution?  Would turning off the
read/write buffering increase speed?
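
(For anyone who wants to experiment with that: in Python 2 the
optional third argument to open() controls buffering -- 0 means
unbuffered, 1 means line buffered, and larger values request a buffer
of roughly that many bytes.  A minimal sketch, where the 8Mb figure is
only a guess to benchmark, not a recommendation:

infh = open(infile, 'rb', 8 * 1024 * 1024)    # large read buffer
outfh = open(outfile2, 'wb', 8 * 1024 * 1024) # large write buffer
)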

-- 
---------------------------------
Derek Tracy
tracyde at gmail.com
http://twistedcode.blogspot.com
---------------------------------


