What is the best data structure for a very simple spreadsheet?

vsoler vicente.soler at gmail.com
Sun Jan 3 07:38:28 EST 2010


On Jan 3, 1:28 pm, Steven D'Aprano <st... at REMOVE-THIS-
cybersource.com.au> wrote:
> On Sun, 03 Jan 2010 03:27:46 -0800, vsoler wrote:
> > My application would contain a limited set of "cells" represented by the
> > instances of a Cell class:
>
> > class Cell:
> > ...
>
> > A1=Cell(7)
> > A2=Cell(2*A1)
> > A3=Cell(3*A1+A2)
> > A4=Cell(A3*4)
>
> > Of course, A1 = 7, A2 = 14, A3 = 35 and A4 = 140
>
> > Now, I somehow want to be able to show a dependency tree
>
> > 1 level dependency trees
> >   A1: None
> >   A2: A1
> >   A3: A1, A2
> >   A4: A3
>
> > All levels dependency trees
>
> >   A1: None
> >   A2: A1
> >   A3: A1, A2
> >   A4: A3, A2, A1
>
> > Leaf + values dependency trees:
>
> >   A1: 7
> >   A2: A1=7, 2
> >   A3: 3, A1=7, 2
> >   A4: 3, A1=7, 2, 4
>
> > What I'd like to know is:
>
> > 1) what are, in your opinion, the basic elements of the Cell class?
>
> def Cell(object):
>     def __init__(self, payload):
>         self.payload = payload
>     def __str__(self):
>         return str(self.payload)
>     def __float__(self):
>         return float(self.payload)
>     def dependency(self):
>         try:
>             return self.payload.dependency()
>         except AttributeError:
>             return ['None']
>
> Cells can contain one of three things: a number, a string, or a formula.
> The first two can be supported by providing a built-in Python object
> (float or str) as payload. You can support formulae two ways that I can
> think of:
>
> (1) Provide a formula as a string, with a leading '='. Then, whenever you
> want to evaluate such a cell, you fetch the string from the cell, parse
> it, generate an arithmetic expression, and calculate it.
>
> (2) Instead of parsing the formula on every single spreadsheet refresh,
> use a couple of helper classes:
>
> class Indirect(object):
>     def __init__(self, ref, sheet=None):
>         if sheet is None:
>             self.sheet = default_sheet()
>         else:
>             self.sheet = sheet
>         self.ref = ref
>     def __str__(self):
>         return str(self.sheet[self.ref])
>     def float(self):
>         return float(self.sheet[self.ref])
>     def dependency(self):
>         return [self.ref]
>
> class Formula(object):
>     def __init__(self, x, y, op):
>         self.x = x
>         self.y = y
>         self.op = op
>     def eval(self):
>         return self.op(float(x), float(y))
>     def dependency(self):
>         return self.x.dependency(level) + self.y.dependency(level)
>
> Then do something like this:
>
> sheet = {}
> sheet['A1'] = Cell(7)
> sheet['A2'] = Cell(Formula(2, Indirect('A1'), operator.mul))
> sheet['A3'] = Cell(
>   Formula(
>     Formula(3, Indirect('A1'), operator.mul),
>     Indirect('A2'),
>     operator.add
>     ))
> sheet['A4'] = Cell(Formula(Indirect('A3'), 4, operator.mul))
>
> Then you only need to parse each human-readable formula like '=2*A1' once.
>
> > 2) Do I need a parser to evaluate the formulas like “3*A1+A2”?
>
> Yes.
>
> > Can you recommend one library that already contains one?
>
> Try PyParsing.
>
> > 3) Do I need a tree
> > data structure to represent my data? would the tree be an attribute of
> > the class instance?
>
> I suspect a dict will be faster.
>
> To get the dependencies of each cell:
>
> for key, value in sheet.items():
>     print key, value.dependency()
>
> Keep in mind I haven't tested ANY of this -- it is entirely stream of
> consciousness. I've never actually done this, so I have no idea whether
> it is a good approach or not, but it seems to me that it might be.
>
> --
> Steven

WOW!!!
After lunch I am going to read your post thoroughly, but I can already
see that you've put into it a lot of time and expertise.

Thank you



More information about the Python-list mailing list