MemoryError and Pickle

Chris Kaynor ckaynor at zindagigames.com
Mon Nov 21 19:49:19 EST 2016


On Mon, Nov 21, 2016 at 3:43 PM, John Gordon <gordon at panix.com> wrote:
> In <o0vvtm$1rpo$1 at gioia.aioe.org> Fillmore <fillmore_remove at hotmail.com> writes:
>
>
>> Question for experts: is there a way to refactor this so that data may
>> be filled/written/released as the scripts go and avoid the problem?
>> code below.
>
> That depends on how the data will be read.  Here is one way to do it:
>
>     fileObject = open(filename, "w")
>     for line in sys.stdin:
>         parts = line.strip().split("\t")
>         fileObject.write("ta: %s\n" % parts[0])
>         fileObject.write("wa: %s\n" % parts[1])
>         fileObject.write("ua: %s\n" % parts[2])
>     fileObject.close()
>
> But this doesn't use pickle format, so your reader program would have to
> be modified to read this format.  And you'll run into the same problem if
> the reader expects to keep all the data in memory.

If you want to keep using pickle, you should be able to pickle each
item of the list to the file one at a time. As long as the file is
kept open (or reopened in append mode), each dump is written after the
previous one without overwriting it, and each load starts reading
where the previous pickle ended.

I haven't tested it, so there may be issues (if it fails, you can try
using dumps and writing to the file by hand):

Writing:
import pickle
import sys

with open(filename, 'wb') as fileObject:
    for line in sys.stdin:
        pickle.dump(line, fileObject)

Reading:
import pickle

with open(filename, 'rb') as fileObject:
    while True:
        try:
            line = pickle.load(fileObject)
        except EOFError:
            break  # no more pickles left in the file
        # do something with line
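
For what it's worth, the same idea can be wrapped in two small helpers.
This is only a sketch: the names append_items and iter_items are made up
for illustration, and it assumes the file holds nothing but back-to-back
pickles. Opening in 'ab' mode has the same effect as seeking to the end
before writing, and the generator hands back one item at a time so the
reader never holds the whole data set in memory.

```python
import pickle


def append_items(path, items, protocol=pickle.HIGHEST_PROTOCOL):
    """Append each item to the file as its own pickle (hypothetical helper)."""
    # 'ab' places every write at the end of the file, so pickles
    # written by earlier calls are preserved.
    with open(path, "ab") as f:
        for item in items:
            pickle.dump(item, f, protocol)


def iter_items(path):
    """Yield unpickled items one at a time until the file is exhausted."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # pickle.load raises EOFError when no data is left.
                return
```

The reader can then do "for line in iter_items(filename):" and process
each item as it arrives.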


It should also be noted that if you do not need to support multiple
Python versions, you may want to pass a protocol to pickle.dump to
use a better version of the format. -1 (or pickle.HIGHEST_PROTOCOL)
will use the latest, which is best if you only care about one version
of Python. 4 is currently the latest version (added in 3.4), which may
be useful if you need forward-compatibility but not
backwards-compatibility. 2 is the latest version available in Python 2
(added in Python 2.3). See
https://docs.python.org/3.6/library/pickle.html#data-stream-format for
more information.
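
A short illustration of the protocol argument, using pickle.dumps so the
result can be inspected directly. The record below is made-up sample
data. Pickles written with protocol 2 or later begin with a two-byte
header that names the protocol, which is what the last two lines check.

```python
import pickle

record = ("ta", "wa", "ua")  # made-up sample data

# -1 selects the highest protocol this interpreter supports.
newest = pickle.dumps(record, protocol=-1)

# Protocol 2 can still be read by Python 2.3 and later.
portable = pickle.dumps(record, protocol=2)

# Both round-trip to the same object.
assert pickle.loads(newest) == record
assert pickle.loads(portable) == record

# The second byte of the header is the protocol number.
assert portable[:2] == b"\x80\x02"
assert newest[1] == pickle.HIGHEST_PROTOCOL
```

The same protocol value can be passed as the third argument to
pickle.dump when writing to a file.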


