[New-bugs-announce] [issue1142] code sample showing errors reading large files with py 2.5

christen report at bugs.python.org
Mon Sep 10 17:52:42 CEST 2007


New submission from christen:

Error reading >4 GB files under Windows

Try this:

import sys
import time

print(sys.version_info)
print(time.strftime('%Y-%m-%d %H:%M:%S'))

# Write 85,014,961 lines (>4 GB in total), reporting progress every 5 million lines.
start = time.time()
fichout = open('test.txt', 'w')
for i in xrange(85014961):
    if i % 5000000 == 0 and i > 0:
        print(i, time.time() - start)
    fichout.write(str(i) + ' ' * 59 + '\n')
fichout.close()
print('total lines written ', i)
print(i, time.time() - start)
print('*' * 50)

# Read the file back line by line; i ends as the index of the last line read.
fichin = open('test.txt')
start3 = time.time()
for i, li in enumerate(fichin):
    if i % 5000000 == 0 and i > 0:
        print(i, time.time() - start3)
fichin.close()
print('total lines read ', i)
print(time.time() - start)

It generates a >4 GB file, and not all lines are read back!
Example:
('total lines written ', 85014960)
('total lines read ', 85014950)
10 lines are missing.
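
As a rough check on the ">4 GB" figure, the size the script should produce can be computed without writing the file. The sketch below is not part of the original report: `total_digits` and `expected_size` are hypothetical helper names, and the 2-byte newline assumes Windows text mode turning '\n' into '\r\n'. It is written for a current Python rather than 2.5.

```python
def total_digits(n):
    """Sum of len(str(i)) for i in range(n), computed per digit-length band."""
    total = 0
    width = 1
    low = 0  # smallest integer in the current band (0 counts as one digit)
    while low < n:
        high = 10 ** width              # first integer with width + 1 digits
        count = min(n, high) - low      # how many written integers fall in this band
        total += count * width
        low = high
        width += 1
    return total


def expected_size(n_lines, pad=59, newline_len=1):
    """Bytes written by n_lines calls of write(str(i) + ' ' * pad + '\\n')."""
    return total_digits(n_lines) + n_lines * (pad + newline_len)


N_LINES = 85014961                                   # loop bound used in the report
size_lf = expected_size(N_LINES, newline_len=1)      # '\n' kept as one byte
size_crlf = expected_size(N_LINES, newline_len=2)    # assumption: '\n' stored as '\r\n'
over_4gib = size_lf > 2 ** 32 and size_crlf > 2 ** 32

print(size_lf, size_crlf, over_4gib)
```

Under these assumptions both sizes exceed 2**32 bytes, which is consistent with the report's description of the file.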

If you replace the write call with
fichout.write(str(i)+' '*59+'\n')

the file is under 4 GB and is read properly.
Tested on both 32-bit and 64-bit Windows XP machines.

It seems to work with Linux and BSD (I did not try this example there, but
had no problems with my home-made big files).
Problem: there are many >4 GB files for the human genome and other
biological applications. I am almost sure that people are making mistakes
without knowing it, because it took me a while to discover this...
Note: this does not happen with py 3k :-)
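
One possible way to cross-check a line count independently of text-mode iteration is to read the file in binary mode and count newline bytes. This is only a sketch, written for a current Python: `count_lines_binary` is a hypothetical helper, it assumes every line ends with '\n', and it has not been confirmed to avoid the problem described above on Python 2.5 under Windows.

```python
def count_lines_binary(path, chunk_size=1 << 20):
    """Count b'\\n' bytes in a file by reading fixed-size binary chunks."""
    count = 0
    with open(path, 'rb') as f:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b''):
            count += chunk.count(b'\n')
    return count
```

Comparing this count with the number of lines written (and with the count from the `enumerate` loop) would show whether lines are lost at the text-mode reading step.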

----------
components: Windows
messages: 55785
nosy: Richard.Christen at unice.fr
severity: urgent
status: open
title: code sample showing errors reading large files with py 2.5
type: behavior
versions: Python 2.5

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue1142>
__________________________________

