Optimizing memory usage w/ HyperText package

Wed Dec 5 15:28:22 EST 2007

On Dec 5, 10:17 pm, Chris <cwi... at gmail.com> wrote:
> On Dec 5, 9:35 pm, je.s.t... at hehxduhmp.org wrote:
>
>
>
> > I've been using the HyperText module for a while now
> > (http://dustman.net/andy/python/HyperText/), and I really like it.
>
> > I've run into a situation where I have code to construct a table
> > and while it is normally perfect, there are times where the table
> > can get quite huge (e.g. 1000 columns, 100000 rows .... yes, the
> > question of "how on earth would someone render this table?" comes
> > up, but that's not the point here :) ), and the code I have generating
> > this starts choking and dying from excessive RAM usage.
>
> > I'm curious if people see a better way of going about this task and/or
> > believe that an alternative method of HTML generation here would be
> > better.
>
> > A (possibly somewhat pseudocode, as I'm doing this by hand) small example
> > of what I'm doing ...
>
> > inputs = [A, List, Of, Values, To, Go, Into, A, Table]
> > numcolumns = howManyColumnsIWant
>
> > out = ht.TABLE()
> > column = 0
> > for input in inputs:
> >     if (column == 0):
> >         tr = ht.TR()
> >     tr.append(ht.TD(input))
> >     column += 1
> >     if (column == numcolumns):
> >         out.append(tr)
> >         column = 0
>
> > As I said, this works fine for normal cases, but I've run into some situations
> > where I need this to scale not just into the hundreds of thousands but also
> > well into the millions - and that's just not happening.  Is there a better
> > way to do this (which involves direct HTML generation in Python), or am
> > I SOL here?
>
> for (i, input) in enumerate(inputs):
>   """Your Code
>   """
>   if not i % 1000:
>     pass  # Flush your data here.
>
> It's logical that you will run out of memory, as the code just keeps
> appending data instead of ever writing it out.  How you flush the
> data out is up to you; or, if it's as simple as what you have there,
> you could do something like:
>
> file_out.write('<TABLE>\n')
> for x in xrange(0, len(inputs)//numcolumns):
>   file_out.write('<TR>\n<TD>%s</TD>\n</TR>' % '</TD>\n<TD>'.join(inputs[(x*numcolumns):((x+1)*numcolumns)]) )
>   if not x % 500: file_out.flush()
> file_out.write('<TR>\n<TD>%s</TD>\n</TR>\n</TABLE>' % '</TD>\n<TD>'.join(inputs[x*numcolumns:]) )
> file_out.close()

Sorry, change the second last line from:

> file_out.write('<TR>\n<TD>%s</TD>\n</TR>\n</TABLE>' % '</TD>\n<TD>'.join(inputs[x*numcolumns:]) )

to:

remainder = len(inputs) % numcolumns
if remainder:
  # Only the leftover cells; the full rows were already written by the loop.
  file_out.write('<TR>\n<TD>%s</TD>\n</TR>\n</TABLE>' % '</TD>\n<TD>'.join(inputs[-remainder:]) )
else:
  file_out.write('\n</TABLE>')
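
The same idea can also be written as a streaming loop that never holds more than one row in memory. This is only a minimal sketch, assuming Python 3 and that the cells arrive from any iterable (a list, a generator, a database cursor). The name write_table and the use of html.escape are illustrative choices, not part of HyperText or of the code above.

```python
import io
from html import escape


def write_table(cells, numcolumns, out):
    """Stream cells to out as an HTML table, one row at a time."""
    out.write('<TABLE>\n')
    row = []
    for cell in cells:
        # Escape each value so '<' or '&' in the data cannot break the markup.
        row.append('<TD>%s</TD>' % escape(str(cell)))
        if len(row) == numcolumns:
            out.write('<TR>\n%s\n</TR>\n' % '\n'.join(row))
            row = []
    if row:
        # Leftover cells form a final, shorter row.
        out.write('<TR>\n%s\n</TR>\n' % '\n'.join(row))
    out.write('</TABLE>\n')


# Usage: an in-memory buffer here; a real file object works the same way.
buf = io.StringIO()
write_table(range(5), 2, buf)
print(buf.getvalue())
```

Because only the current row is buffered, memory use depends on numcolumns rather than on the total number of cells, and the file object's own buffering takes care of flushing.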