newbie - merging xls files using xldt and xlwt

Albert-jan Roskam fomcl at yahoo.com
Wed Oct 15 11:42:44 EDT 2008


Hi John,

Thanks! Using a higher xlrd version did the trick! Regarding your other remarks:
-yep, input files with multiple sheets don't work yet. I kinda repressed that ;-) Spss outputs only single-sheet xls files, but I agree it's nicer if the programs works in other cases too.
-and no, I don't intend to use data fields. Wouldn't it be easier to convert those to string values if I ever came across them?

Thanks again!
Albert-Jan


--- On Wed, 10/15/08, John Machin <sjmachin at lexicon.net> wrote:

> From: John Machin <sjmachin at lexicon.net>
> Subject: Re: newbie - merging xls files using xldt and xlwt
> To: python-list at python.org
> Date: Wednesday, October 15, 2008, 3:14 PM
> On Oct 15, 9:16 pm, Albert-jan Roskam
> <fo... at yahoo.com> wrote:
> > Hi,
> >
> > I wrote the program below to merge all xls files in a
> given directory into one multisheet xls file. It uses xlwt
> and xlrd. The xls files I use for input are generated by
> Spss. When I open and re-save the files in Excel, the
> program works, but when I use the xls files as they were
> created by Spss, the program returns an error message (see
> below). Some details: Python 2.4, xlwt version 0.7.0, xlrd
> version 0.5.2, Win NT.
> >
> > Any ideas?
> 
> The version of xlrd that you are using is an antique. Go to
> http://pypi.python.org/pypi/xlrd and get the latest
> version. If that
> doesn't work, send me a copy of one of the files that
> is displaying
> this problem.
> >
> > Thanks in advance!
> >
> > Albert-Jan
> >
> > """
> > Merge all xls files in a given directory into one
> multisheet xls file.
> > The sheets get the orginal file name, without the
> extension.
> > File names should not exceed 31 characters, as this is
> the maximum
> > sheet name length
> > """
> >
> > import xlrd, xlwt
> > import glob, os.path
> >
> > def merge_xls (in_dir,
> out_file="d:/merged_output.xls"):
> >
> >     xls_files   = glob.glob(in_dir +
> "*.xls")
> >     sheet_names = []
> >     merged_book = xlwt.Workbook()
> >
> >     [sheet_names.append(os.path.basename(v)[:-4])
> for k, v in enumerate(xls_files)]
> 
> Wah! Try this:
> sheet_names = [os.path.basename(v)[:-4]) for v in
> xls_files]
> 
> 
> >     for k, xls_file in enumerate(xls_files):
> >         if len (sheet_names[k]) <= 31:
> >             book = xlrd.open_workbook(xls_file)
> >             ws =
> merged_book.add_sheet(sheet_names[k])
> >             for sheetx in range(book.nsheets):
> >                 sheet =
> book.sheet_by_index(sheetx)
> >                 for rx in range(sheet.nrows):
> >                     for cx in
> range(sheet.ncols):
> >                          ws.write(rx, cx,
> sheet.cell_value(rx, cx))
> 
> I presume that you're not too worried about any date
> data.
> 
> If an input file has more than 1 sheet, you are creating
> only one
> sheet in the output file, and overwriting cells.
> 
> >         else:
> >[snip]
> >
> > merge_xls(in_dir="d:/temp/")
> >
> > *** WARNING: OLE2 inconsistency: SSCS size is 0 but
> SSAT size is non-zero ***
> > put_cell 0 1
> >
> > Traceback (most recent call last):
> [snip]
> >     self._cell_types[rowx][colx] = ctype
> > IndexError: list assignment index out of range
> 
> Yeah, symptom of an xls file with a DIMENSIONS records that
> lies ...
> xlrd has like Excel become more resilient over time :-)
> 
> BTW, consider joining the python-excel group at
> http://groups.google.com/group/python-excel
> 
> Cheers,
> John
> --
> http://mail.python.org/mailman/listinfo/python-list


      



More information about the Python-list mailing list