[Pandas-dev] Help with contributing

William Ayd william.ayd at icloud.com
Fri Oct 23 12:07:50 EDT 2020


Thanks for the interest, Robert. The best advice I can give is to break the problem up into very small pieces and approach improvements to the code base from there. You’ve written a function for your own purpose, which is awesome, but it is unlikely that pandas would adopt that function as-is. So instead I would suggest taking a step back and focusing on the problem that “to_html() is too slow”.

If that’s the problem you have nailed down I would then suggest trying to dig a little deeper by:
	1. Searching the existing issue tracker on GitHub for similar issues
	2. Profiling what exactly makes it slow

If #1 turns up an existing issue, any clarifications you can add to it (and of course PRs to solve it) would be helpful. If you don’t see an existing issue but can provide a timing profile, I would advise opening a dedicated issue with that information.
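
As a rough sketch of what step 2 could look like, the standard library's cProfile module can show where the time goes inside a single call. The helper name profile_call below is made up for illustration and is not part of pandas; it only wraps cProfile and pstats, and the default of 15 rows is an arbitrary choice.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func(*args, **kwargs) under cProfile.

    Returns (result, report), where report is a text table of the `top`
    entries sorted by cumulative time. `profile_call` is a hypothetical
    helper for illustration, not a pandas API.
    """
    profiler = cProfile.Profile()
    # runcall profiles exactly one invocation and hands back its return value
    result = profiler.runcall(func, *args, **kwargs)

    # Render the collected stats into a string instead of printing to stdout
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()
```

With a DataFrame df in hand, something like html, report = profile_call(df.to_html) followed by print(report) should give the kind of timing profile that is worth attaching to an issue.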

Hope that helps.

- Will

> On Oct 23, 2020, at 4:53 AM, Robert Butler <robertb at sccwrp.org> wrote:
> 
> Hi Pandas dev
>  
> I am developing a small web application using flask and pandas
> Part of it involved making an api call using fetch to the server, and the server had to return some data formatted as html
>  
> The code that executes during the fetch request does a lot, and when I used the dataframe.to_html() method it took a while
>  
> I spent some time making a function that converts a dataframe to html and I was able to get something that runs a lot faster, and the difference is pretty noticeable when the dataframe is a lot larger
>  
> When I looked at the source code, it seemed like it would be difficult to incorporate the function as a contribution, so I was wondering if I could get some help?
>  
> I have roughly 2 years of experience with programming in python, and did not major in CS, so I consider myself to be a beginner, which is why I struggle with something basic like making a contribution like this
>  
> I was wondering if I could get some help making this contribution?
>  
> This is the function, obviously you can tell I sort of wrote it in such a way that it would be convenient for my particular application
>  
> def htmltable(df, id = None, cssclass = None, enumeraterows = True):
>     '''
>         df is a pandas dataframe,
>         id is a css id you want to give to the table,
>         cssclass is a css class for the table,
>         enumeraterows actually only distinguishes even/odd rows with css classes
>     '''
>  
>     html = """
>     <table{}{}>
>         <colgroup>
>             {}
>         </colgroup>
>         <thead>
>             {}
>         </thead>
>         <tbody>
>             {}
>         </tbody>
>     </table>   
>     """.format(
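>         # add in the id (quoted so the attribute value is well formed)
>         f' id="{id}"' if id else "",
>  
>         # add the class (quoted so values with spaces, e.g. several classes, work)
>         f' class="{cssclass}"' if cssclass else "",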
>        
>         # colgroups
>         ''.join(['<col span="1" class="{}">'.format(colname) for colname in df.columns]),
>        
>         # column headers
>         ''.join(
>             [  
>                 # sticks on the outsides of the row after doing the join
>                 # (str() so non-string column labels do not break the join)
>                 '<tr><th scope="col">{}</th></tr>'.format(
>                     '</th><th scope="col">'.join(map(str, df.columns))
>                 )
>             ]
>         ),
>         # cells of table body
>         ''.join(
>             [
>                 # sticks on the outsides of the row after doing the join
>                 # adds even and odd css classes to each row, but only when enumeraterows is set
>                 '<tr{} id="rownumber-{}">{}</tr>'.format(
>                     (' class="row-even"' if i % 2 == 0 else ' class="row-odd"') if enumeraterows else "",
>                     i,
>                     x
>                 ) for i,x in
>                 # Zips columns together, then joins them with closing table cell tag and opening table cell tag between
>                 enumerate([''.join(
>                     list(
>                         map(
>                             lambda cell:
>                             '<td contenteditable="true" class="colname-{}">{}</td>'.format(cell['column_name'], cell['column_value']),
>                             row
>                         )
>                     )
>                 )
>                 for row in
>                     zip(*
>                         [
>                             df[col].apply(lambda x: {'column_name':col, 'column_value': x}) for col in df.columns
>                         ]
>                     )
>                 ])
>             ]   
>         )
>     )
>     return html
>  