xlrd and cPickle.dump

John Machin sjmachin at lexicon.net
Tue Apr 1 08:27:50 EDT 2008


patrick.waldo at gmail.com wrote:
> Hi all,
> 
> Sorry for the repeat I needed to reform my question and had some
> problems...silly me.

Indeed. Is omitting the traceback part of the "reformation"?

> 
> The xlrd documentation says:
> "Pickleable.  Default is true. In Python 2.4 or earlier, setting to
> false will cause use of array.array objects which save some memory but
> can't be pickled. In Python 2.5, array.arrays are used
> unconditionally. Note: if you have large files that you need to read
> multiple times, it can be much faster to cPickle.dump() the xlrd.Book
> object once, and use cPickle.load() multiple times."
> 
> I'm using Python 2.4 and I have an extremely large excel file that I
> need to work with.

How many megabytes is "extremely large"? How many seconds does it take 
to open it with xlrd.open_workbook?

>  The documentation leads me to believe that cPickle
> will be a more efficient option, but I am having trouble pickling the
> excel file.  So far, I have this:
> 
> import cPickle,xlrd
> import pyExcelerator
> from pyExcelerator import *

You only need one of the above imports at the best of times, and for 
what you are attempting to do, you don't need pyExcelerator at all.

> 
> data_path = """C:\test.xls"""

It is extremely unlikely that you have a file whose basename begins with 
a TAB ('\t') character. Please post the code that you actually ran.
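
To see what that literal actually contains, here is a small illustration. It is only a sketch; the variable names are mine, not from your script. Triple quotes make no difference here, since escapes are processed the same way as in an ordinary string. The "\p" in your other path happens not to be a recognised escape, so that one survived by luck.

```python
# Illustration only: how Python reads backslashes in a non-raw string literal.
plain = "C:\test.xls"      # "\t" is parsed as a single TAB character
raw = r"C:\test.xls"       # raw string: the backslash and the "t" are kept
forward = "C:/test.xls"    # forward slashes are also accepted in Windows paths

print(repr(plain))
print(repr(raw))
print(len(plain), len(raw))  # the raw version is one character longer
```

Either the raw string or the forward-slash form names the file you presumably meant.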

> pickle_path = """C:\pickle.xls"""
> 
> book = xlrd.open_workbook(data_path)
> Data_sheet = book.sheet_by_index(0)
> 
> wb=pyExcelerator.Workbook()
> proc = wb.add_sheet("proc")
> 
> #Neither of these work
> #1) pyExcelerator try
> #cPickle.dump(book,wb.save(pickle_path))
> #2) Normal pickle try
> #pickle_file = open(pickle_path, 'w')
> #cPickle.dump(book, pickle_file)
> #file.close()
> 

and the last bit of the pre-reformation traceback was
"""
   File "C:\Python24\lib\copy_reg.py", line 69, in _reduce_ex
     raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle file objects
"""

I can reproduce that behaviour with Python 2.2, also with 2.1 (different 
error message, same meaning). However it works OK with Python 2.3.5, 
2.4.3, and 2.5.2. Precisely which version of Python 2.4 are you using? 
Are you in the habit of copying library files like copy_reg.py from one 
version to another?

The second argument of cPickle.dump is an open file.
wb.save(pickle_path) will write an empty/default spreadsheet file to the 
given path (this is utterly pointless) and then return None. So once you 
get over the first problem, you will have the second: None is not an 
open file. The whole pyExcelerator carry-on is quite irrelevant to your 
problem.
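
For reference, a minimal sketch of the dump/load round trip, with no pyExcelerator anywhere. The helper names, the stand-in object and the temporary file name are hypothetical; in your script the object would be whatever xlrd.open_workbook returned. I am assuming binary mode ('wb'/'rb') here, which is the safe choice on Windows, and the plain pickle module so the sketch runs anywhere; on Python 2, "import cPickle as pickle" is the faster equivalent.

```python
import os
import pickle
import tempfile


def dump_object(obj, path):
    # The second argument of dump() must be an open file object.
    f = open(path, "wb")
    try:
        pickle.dump(obj, f, pickle.HIGHEST_PROTOCOL)
    finally:
        f.close()  # close the handle that was actually opened


def load_object(path):
    f = open(path, "rb")
    try:
        return pickle.load(f)
    finally:
        f.close()


# Stand-in for the Book object, so the sketch does not need a spreadsheet.
book_stand_in = {"sheet_names": ["proc"], "rows": [[1, 2.5, "x"]]}

pickle_path = os.path.join(tempfile.gettempdir(), "book_demo.pickle")
dump_object(book_stand_in, pickle_path)
restored = load_object(pickle_path)
print(restored == book_stand_in)
os.remove(pickle_path)
```

Dump once after the slow open_workbook call, then use the load side on every later run.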

Please post the minimal pyExcelerator-free script that demonstrates your 
problem. Ensure that it includes the following line:
     import sys; print sys.version; print xlrd.__VERSION__
Also post the output and the traceback (in full).

Cheers,
John


