Decimating Excel files

John Machin sjmachin at lexicon.net
Mon Feb 5 20:31:30 EST 2007


On Feb 6, 10:46 am, "Gabriel Genellina" <gagsl... at yahoo.com.ar> wrote:
> En Sat, 03 Feb 2007 18:52:10 -0300, mensana... at aol.com
> <mensana... at aol.com> escribió:
>
> > On Feb 3, 1:43?pm, gonzlobo <gonzl... at gmail.com> wrote:
> >> We have a data acquisition program that saves its output to Excel's
> >> .xls format. Unfortunately, the programmer was too stupid to write
> >> files the average user can read.
>
> >> I'd like some advice on how to go about:
> >> 1. Reading a large Excel file and chop it into many Excel files (with
> >> only 65535 lines per file)
>
> > An Excel sheet only has 65535 lines. Or do yo mean it has
> > multiple sheets?
>
> As I understand the problem, the OP has a program that generates the .xls
> files, but it's so dumb that writes files too large for Excel to read.
> I'd try the "xlrd" package - it is capable of reading Excel files on any
> platform.

Thanks for the plug, Gabriel. However xlrd is not the panacea for all
evils and all idiocies :-)

Excel "file sizes" are limited by the number of rows and columns that
a particular version's file format will support in each worksheet.
There may be a limit on the maximum number of worksheets in a file,
but I've never heard of this as a problem.

Before Excel 97 aka 8.0, the limits were 16384 rows x 256 columns.

Excel 97 (8.0) up to Excel 2003 (11.0) allow 65536 rows by 256
columns.

Excel 2007 aka 12.0 (just released) raises the limits to 2**20 rows x
16384 columns.

Let's presume the OP is talking about files written in the format
("BIFF8")  that is expected by Excel 97-2003.

In cell data records, there are 16 bits for an unsigned row number. If
the file writer writes zillions of rows, with row numbers modulo
65536, then not even xlrd can help the OP -- not out of the box;
rescue would be possible with a tweaked version *if* the rows were
written in sequential order.

There are also 16 bits for an unsigned column number. It is possible
to write 65536 columns, but I'd guess that Excel would refuse to open
the file, or go bananas. In any case the OP's problem seems to be with
too many rows.

In any case #2, what the OP said was [my emphasis added]:
"Unfortunately, the programmer was too stupid to write files *the
average user* can read." -- i.e. it's not  Excel being grumpy, it's
the average user.

It would help a great deal if the OP would say what the problem really
is ...

Cheers,
John




More information about the Python-list mailing list