xlrd and cPickle.dump

patrick.waldo at gmail.com
Wed Apr 2 10:23:23 EDT 2008


Still no luck:

Traceback (most recent call last):
  File "C:\Python24\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 310, in RunScript
    exec codeObject in __main__.__dict__
  File "C:\text analysis\pickle_test2.py", line 13, in ?
    cPickle.dump(Data_sheet, pickle_file, -1)
PicklingError: Can't pickle <type 'module'>: attribute lookup __builtin__.module failed

My code remains the same, except that, following your suggestions, I
now open the pickle file in 'wb' mode and pass -1 as the protocol:

import cPickle,xlrd, sys

print sys.version
print xlrd.__VERSION__

data_path = """C:\\test\\test.xls"""
pickle_path = """C:\\test\\pickle.pickle"""

book = xlrd.open_workbook(data_path)
Data_sheet = book.sheet_by_index(0)

pickle_file = open(pickle_path, 'wb')
cPickle.dump(Data_sheet, pickle_file, -1)
pickle_file.close()
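
The traceback suggests that the Sheet object holds a reference to
something that cannot be pickled. A possible workaround, assuming only
the cell values are needed later, is to copy them into plain lists and
pickle those instead. The sketch below has not been run against a real
workbook: FakeSheet is a hypothetical stand-in that exposes only nrows
and row_values(), the same two xlrd calls used elsewhere in this post,
and sheet_to_rows is just an illustrative helper name.

```python
try:
    import cPickle as pickle   # Python 2
except ImportError:
    import pickle              # Python 3


class FakeSheet(object):
    """Hypothetical stand-in for an xlrd sheet (illustration only)."""

    def __init__(self, data):
        self._data = data
        self.nrows = len(data)

    def row_values(self, rowx, start_colx=0, end_colx=None):
        return self._data[rowx][start_colx:end_colx]


def sheet_to_rows(sheet):
    """Copy every row of the sheet into a plain list of lists."""
    rows = []
    for n in range(sheet.nrows):
        rows.append(sheet.row_values(n))
    return rows


sheet = FakeSheet([[u'book', u'fiction', u'literature'],
                   [u'book', u'non-fiction', u'biography']])
rows = sheet_to_rows(sheet)

# Plain lists of strings can be pickled; -1 selects the highest
# protocol available, as in the dump() call above.
blob = pickle.dumps(rows, -1)
restored = pickle.loads(blob)
```

With a real workbook, sheet_to_rows(book.sheet_by_index(0)) would take
the place of the stand-in, and the resulting list could be passed to
cPickle.dump() with the file opened in 'wb' mode.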

To begin with (I forgot to mention this before), I get this warning:
WARNING *** OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero

I'm not sure what this means.

> What do you describe as "simple manipulations"? Please describe your
> computer, including how much memory it has.

I have a 1.8 GHz HP dv6000 with 2 GB of RAM, which should be fast
enough for my programming projects.  However, when I try to print out
the rows in the Excel file, my computer gets very slow and choppy,
which makes experimenting slow and frustrating.  Maybe cPickle won't
solve this problem at all!  For this first part, I am trying to make
ID numbers for the different combinations of categories, topics, and
sub_topics.  So I will have [book,non-fiction,biography],
[book,non-fiction,history-general], [book,fiction,literature], etc.,
and I want each distinct combination to get its own ID:
[book,non-fiction,biography] = 1
[book,non-fiction,history-general] = 2
[book,fiction,literature] = 3
etc...

My code does this, except that sort returns None, which is strange.  I
just want an alphabetical sort by the first item of each row, which
sort should do automatically.  When I do a test like
>>> nest_list = [['bbc', 'cds'], ['jim', 'ex'], ['abc', 'sd']]
>>> nest_list.sort()
>>> nest_list
[['abc', 'sd'], ['bbc', 'cds'], ['jim', 'ex']]
it works fine, but not for my rows.
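
For what it's worth, the None is probably not a failure: list.sort()
sorts the list in place and returns None, so printing its return value
always shows None even though the list itself has been sorted. A small
sketch of the difference, using the same test list; sorted() (available
from Python 2.4 on) returns a new list instead.

```python
nest_list = [['bbc', 'cds'], ['jim', 'ex'], ['abc', 'sd']]

# sort() reorders the list in place and returns None.
result = nest_list.sort()

# sorted() leaves its argument alone and returns a new sorted list.
original = [['bbc', 'cds'], ['jim', 'ex'], ['abc', 'sd']]
new_list = sorted(original)
```

So calling rows.sort() on a line of its own and then printing rows
would show the sorted rows.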

Here's the code (unpickled/unsorted):
import xlrd, pyExcelerator

path_file = "C:\\text_analysis\\test.xls"
book = xlrd.open_workbook(path_file)
ProcFT_QC = book.sheet_by_index(0)
log_path = "C:\\text_analysis\\ID_Log.log"
logfile = open(log_path,'wb')

set_rows = []
rows = []
db = {}
n=0
while n<ProcFT_QC.nrows:
    rows.append(ProcFT_QC.row_values(n, 6,9))
    n+=1
print rows.sort() #Outputs None
ID = 1
for row in rows:
    if row not in set_rows:
        set_rows.append(row)
        db[ID] = row
        entry = str(ID) + '|' + str(row).strip('u[]') + '\r\n'
        logfile.write(entry)
        ID+=1
logfile.close()
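
A possible variation on the loop above, as a sketch only: sorting with
sorted() and using a dictionary keyed on the row (converted to a tuple,
since lists cannot be dictionary keys) avoids rescanning set_rows for
every row. The sample rows are made up for illustration, and assign_ids
is a hypothetical helper name, not part of xlrd.

```python
def assign_ids(rows):
    """Map each distinct row to an ID, numbered in sorted order from 1."""
    ids = {}
    for row in sorted(rows):
        key = tuple(row)          # tuples are hashable, lists are not
        if key not in ids:
            ids[key] = len(ids) + 1
    return ids


# Made-up sample data standing in for the values read from the sheet.
sample_rows = [
    ['book', 'non-fiction', 'biography'],
    ['book', 'fiction', 'literature'],
    ['book', 'non-fiction', 'biography'],        # duplicate, gets no new ID
    ['book', 'non-fiction', 'history-general'],
]

ids = assign_ids(sample_rows)

# Invert the mapping to get the ID -> row layout used by db above.
db = dict((row_id, list(key)) for key, row_id in ids.items())
```

In this sketch the IDs follow the sorted order, so the numbering would
only match a hand-made list if that list is sorted the same way.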

> Also, any good reason for sticking with Python 2.4?

I'm also trying to learn Zope/Plone, so I'm sticking with Python 2.4.


Thanks again


