Memory error while using pandas dataframe

Jason Swails jason.swails at gmail.com
Wed Jun 10 12:54:50 EDT 2015


On Mon, Jun 8, 2015 at 3:32 AM, naren <narencr7 at gmail.com> wrote:

> Memory Error while working with pandas dataframe.
>
> Description of Environment: Windows 7, Python 3.4.2 (32-bit version),
> pandas 0.16.0
>
> We are running into the error described below. Any help provided will be
> sincerely appreciated.
>
> We are able to read a 300MB CSV file into a dataframe using the read_csv
> function. While working with the dataframe we ran into a memory error. We
> used the pd.concat function to concatenate two dataframes. So we decided to
> use chunksize for lazy reading. Chunking returns an object of type
> TextFileReader.
>
>
> http://pandas.pydata.org/pandas-docs/stable/io.html#iterating-through-files-chunk-by-chunk
>
> We are able to iterate over this object once as a debugging measure. The
> iterator gets exhausted after iterating once. So we are not able to convert
> the TextFileReader object back into a dataframe, using the pd.concat
> function.
>
It looks like you already figured out what your problem is.  The
TextFileReader is exhausted (i.e., at EOF), so you end up getting None from
it.

What is your question?  You want to be able to iterate through
TextFileReader again?

If so, try rewinding the file object that you passed to pd.read_csv.  If you
saved a reference to the file object, just call "seek(0)" on that object.
If you didn't, access it as the "f" attribute on the TextFileReader object
and call "seek(0)" on that instead.

That might work.  Otherwise, you should be more specific with your question
and provide a full segment of code that is as small as possible to
reproduce the error you're seeing.
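
For what it's worth, here is a minimal sketch of the rewind idea, assuming
a seekable file object was passed to pd.read_csv.  The io.StringIO buffer
and the read_in_chunks helper are made up for illustration (they stand in
for your real file), and the sketch builds a fresh reader after each
seek(0) rather than reusing the exhausted one, since I am not certain an
exhausted TextFileReader picks up again after a rewind.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file; any seekable file object works.
csv_file = io.StringIO("a,b\n1,2\n3,4\n5,6\n7,8\n")


def read_in_chunks(file_obj, chunksize):
    """Rewind file_obj and return a fresh chunk iterator over it."""
    file_obj.seek(0)
    return pd.read_csv(file_obj, chunksize=chunksize)


# First pass: look at the chunks for debugging (this exhausts the reader).
rows_seen = sum(len(chunk) for chunk in read_in_chunks(csv_file, chunksize=2))

# Second pass: rewind, build a new reader, and concatenate its chunks.
df = pd.concat(read_in_chunks(csv_file, chunksize=2), ignore_index=True)
```

Note that concatenating every chunk rebuilds the full dataframe in memory,
so if memory is the real constraint it is probably better to reduce or
filter each chunk before the pd.concat call.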

HTH,
Jason