[Numpy-discussion] Efficient way to load a 1Gb file?

Torgil Svensson torgil.svensson at gmail.com
Sun Aug 14 11:31:24 EDT 2011


Try the fromiter function; it lets you pass an iterator that reads the
file line by line instead of preloading the whole file into memory.

import itertools
import numpy as np

file_iterator = open('filename.txt')
line_parser = lambda x: map(float, x.split('\t'))
# flatten the parsed rows lazily: fromiter needs a stream of scalars,
# so reshape the result afterwards if you need the rows back
a = np.fromiter(itertools.chain.from_iterable(line_parser(line) for line in file_iterator), dtype=float)

You also have the option of iterating over the file twice and passing
the "count" argument, so fromiter can preallocate the output array
instead of growing it as it goes.

//Torgil

On Wed, Aug 10, 2011 at 7:22 PM, Russell E. Owen <rowen at uw.edu> wrote:
> A coworker is trying to load a 1Gb text data file into a numpy array
> using numpy.loadtxt, but he says it is using up all of his machine's 6Gb
> of RAM. Is there a more efficient way to read such text data files?
>
> -- Russell
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


