[TriPython] Places to look at performance tuning

Ken MacKenzie ken at mack-z.com
Wed Jun 7 17:09:51 EDT 2017


OK, so here is what we've got so far.

Code:
I moved from the ORM model to using the expression language.  I also skipped
the DictBundle, but my marshaling statement is still along the lines of

return json.dumps([dict(r) for r in records], default=alchemyencode)

Something like that.  No discernible performance improvement, which is odd.
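
For reference, here is roughly what that path looks like end to end.  The
connection string and table name below are placeholders, not the real schema,
and alchemyencode is just my default handler for the Decimal and date/time
fields, more or less:

    import datetime
    import decimal
    import json

    from sqlalchemy import MetaData, Table, create_engine

    def alchemyencode(obj):
        # Fallback for types json.dumps() can't handle on its own.
        if isinstance(obj, decimal.Decimal):
            return float(obj)
        if isinstance(obj, (datetime.date, datetime.datetime)):
            return obj.isoformat()
        raise TypeError("%r is not JSON serializable" % (obj,))

    # Placeholder connection string and table name, not the real ones.
    engine = create_engine("mssql+pymssql://user:password@dbhost/dbname")
    metadata = MetaData()
    transactions = Table("transactions", metadata, autoload=True,
                         autoload_with=engine)

    with engine.connect() as conn:
        records = conn.execute(transactions.select()).fetchall()
        # dict(r) works on the RowProxy objects SQLAlchemy 1.x returns.
        body = json.dumps([dict(r) for r in records], default=alchemyencode)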

cProfile shows the biggest chunk of time spent in fetchone, iterating through
the result set, which I am guessing happens inside the statement above.
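
For what it's worth, the profiling is nothing fancy; roughly this pattern,
where fetch_and_marshal is a stand-in for whatever handler actually runs the
query and the json.dumps() step:

    import cProfile
    import pstats

    def fetch_and_marshal():
        # Stand-in for the real handler: run the query, then marshal to JSON.
        pass

    profiler = cProfile.Profile()
    profiler.enable()
    fetch_and_marshal()
    profiler.disable()

    # Top 20 entries by cumulative time; this is where fetchone shows up.
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)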

So, watching the CentOS box, we barely tick above 12% CPU for this.

Then we get to Windows.

So I click a refresh on the resource.

Immediately CPU goes up to 40%, network traffic matches, and they stay at
that level for most of the run.
But what about memory?  Well, the VM in question has all of 2 GB of RAM, and
frankly it sits at 95% at baseline.

I think we are getting to the problem here.

So the SQL Server side is working for all but the last 2 seconds of the
request, and the whole time its memory is basically maxed.  So that
might be the issue...
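
One thing I may try next, purely a sketch and untested against the real
schema: fetch the rows in batches instead of pulling everything with
fetchall(), so at least the web side is not holding the whole result set
while it marshals.  Whether that takes any pressure off the SQL Server box
is another question, and depending on the driver the full result may still
get buffered client side anyway:

    import json

    CHUNK = 1000

    def marshal_streamed(conn, stmt, encode):
        # conn: a SQLAlchemy connection; stmt: a select(); encode: the
        # Decimal/datetime fallback handler from above.
        # Build the JSON array incrementally, one batch of rows at a time.
        parts = []
        result = conn.execute(stmt)
        while True:
            rows = result.fetchmany(CHUNK)
            if not rows:
                break
            parts.extend(json.dumps(dict(r), default=encode) for r in rows)
        return "[" + ",".join(parts) + "]"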


On Tue, Jun 6, 2017 at 10:31 PM, David Handy <david at handysoftware.com>
wrote:

>    By the way, your I/O waiting could be on database sockets as well as on
>    communication with the browser.
>
>
>
>    On Tuesday, June 6, 2017 10:28pm, "David Handy" <david at handysoftware.com> said:
>
>    Hi Ken -
>
>
>
>    Your previous email said your web server is running on CentOS and your
>    database is MS SQL Server Express (running on a Windows server,
>    presumably.)
>
>
>
>    What does the CPU usage look like on your web server and database server,
>    respectively, during a test? (It would probably help to have a test where
>    you do several consecutive requests, to make CPU usage more obvious.)
>
>
>
>    David H
>
>
>
>    On Tuesday, June 6, 2017 5:31pm, "Ken MacKenzie" <ken at mack-z.com> said:
>
>    > Wait brain fart, that actually goes back to the IO bound issue.
>    >
>    > Ok so let me think about this. If it is IO bound why does it still show
>    > near 20 seconds for TTFB?
>    >
>    > steps, rough outline
>    >
>    > take request
>    > based on route send request to function 1
>    > function 1 adds some gravy then redirects to a core function
>    > core function processes the query string in the request and...
>    > does the ORM filter and group_by setup
>    > handles setting up the DictBundle
>    > for now also deals with date/time and decimal conversion on fields
>    > gets the data
>    > after all the data is back marshals it for the return (json by default
>    > but there is also a csv option)
>    > returns to function 1
>    > function 1 places returned marshaled data in resp.body and creates the
>    > resp.header
>    >
>    > When I was thinking IO bound I was thinking that it was the browser
>    > actually getting the response data over http.
>    >
>    > Perhaps my IO bound issues are in the creation of the objects or in
>    > marshaling the response.
>    >
>    > To explain function 1 and core: core is a core transaction module, and
>    > functions 1-3 are types of transactions (expense, revenue, budget), hence
>    > that division in the logic.
>    >
>    >
>    > On Tue, Jun 6, 2017 at 5:22 PM, Ken MacKenzie <ken at mack-z.com> wrote:
>    >
>    > >
>    > > (method 'poll' of 'select.poll' objects) seems to be the place where
>    > > I am hitting the performance bottleneck.
>    > >
>    > > To me I could think of that as either the ORM or the previous indexing
>    > > suggestion or both.
>    > >
>    > >
>    > >
>
