[TriPython] Places to look at performance tuning

James Whisnant jwhisnant at gmail.com
Tue Jun 6 16:44:04 EDT 2017


I recommend profiling your code to see exactly where it is slow. It may or
may not be what you expect. https://pypi.python.org/pypi/profilehooks is an
easy to use decorate your functions to profile them. It does sound like
some of your database queries are slow. This may help you pinpoint where
and why. Are you using the SQLAlchemy ORM or Core? The ORM can have a large
overhead (in some cases) as compared to the SQLAlchemy Core.


On Tue, Jun 6, 2017 at 4:28 PM, Ken MacKenzie <ken at mack-z.com> wrote:

>    At present the SQL table has about 8 fields that make up the primary
> key,
>    that PK creates an index and is the only index at present.
>    I don't think sql selection is my bottle neck. **here is why.
>    basically my route is like this:
>    /type/fiscal_year/fiscal_period/entity
>    entity is the only one that is "optional"
>    When an entity is specified I get a return in a second or two.**
> Granted a
>    much smaller record set.
>    When entity is omitted I get about 50x the result count and that is
> where
>    I get to a 20 second return time.
>    Those results at first made me think the problem was io bound on the
>    router till I saw that the first return byte was so behind.
>    Am I misunderstanding your suggestion that additional indexes would
>    improve the speed at which the DB returns the data, because I am
>    interpreting your suggestion as one to improve the selection speed.
>    On Tue, Jun 6, 2017 at 4:18 PM, George Gergues
>    <[1]george.gergues at gmail.com> wrote:
>
>      ** **For SQL table **add at least one index. it will improve table
>      scans.
>      ** **On Jun 6, 2017 11:44, "Ken MacKenzie" <[1][2]ken at mack-z.com>
> wrote:
>
>      ** ** **** **So I am in the demo and test phase of an early ReSTful
> API
>      for
>      ** ** **reporting.
>      ** ** **** **Currently a wider scale report set request hits the
>      following
>      ** ** **marks:
>      ** ** **** **TTFB: ~20s
>      ** ** **** **Record Count: ~92k
>      ** ** **** **Download Size: 15.8MB
>      ** ** **** **Details:
>      ** ** **** **Web Server: NGINX
>      ** ** **** **Python App Server: Gunicorn
>      ** ** **** **Web Framework: Falcon
>      ** ** **** **Python version: 3.5 (in a venv)
>      ** ** **** **DB: MS SQL Server Express using SQL Alchemy + pyodbc
>      ** ** **** **Webserver OS: CentOS 7
>      ** ** **** **Gunicorn is setup with 4 workers, on a private port,
> nginx
>      does a
>      ** ** **proxy
>      ** ** **** **pass to the port
>      ** ** **** **DB Details, the table in question has a total of about 8
>      million
>      ** ** **rows.**
>      ** ** **** **Sample query execution within SQL Server Mgmt Studio is
> ~7s
>      ** ** **** **So my question is which of the following would be a
> better
>      target
>      ** ** **to
>      ** ** **** **improve performance, or do I need to as my performance
>      should be
>      ** ** **** **considered good enough.** I mean the server in this case
> is
>      a
>      ** ** **surplus dual
>      ** ** **** **core desktop right now.
>      ** ** **** **add gzip compression to nginx for proxys
>      ** ** **** **switch gunicorn to use a unix socket instead of a tcp
> port
>      ** ** **** **consider leaner SQL and JSON marshaling requests instead
> of
>      ORM's
>      ** ** **and
>      ** ** **** **dictionary bundles.
>      ** ** **** **Appreciate and advice or suggestions.** Thank you.
>
>      ** ** **_______________________________________________
>      ** ** **TriZPUG mailing list
>      ** ** **[2][3]TriZPUG at python.org
>      ** ** **[3][4]https://mail.python.org/mailman/listinfo/trizpug
>      ** ** **[4][5]http://tripython.org is the Triangle Python Users Group
>
> References
>
>    Visible links
>    1. mailto:george.gergues at gmail.com
>    2. mailto:ken at mack-z.com
>    3. mailto:TriZPUG at python.org
>    4. https://mail.python.org/mailman/listinfo/trizpug
>    5. http://tripython.org/
>
> _______________________________________________
> TriZPUG mailing list
> TriZPUG at python.org
> https://mail.python.org/mailman/listinfo/trizpug
> http://tripython.org is the Triangle Python Users Group
>
>
-------------- next part --------------
   I recommend profiling your code to see exactly where it is slow. It may or
   may not be what you expect. [1]https://pypi.python.org/pypi/profilehooks
   is an easy to use decorate your functions to profile them. It does sound
   like some of your database queries are slow. This may help you pinpoint
   where and why. Are you using the SQLAlchemy ORM or Core? The ORM can have
   a large overhead (in some cases) as compared to the SQLAlchemy Core.

   On Tue, Jun 6, 2017 at 4:28 PM, Ken MacKenzie <[2]ken at mack-z.com> wrote:

     ** **At present the SQL table has about 8 fields that make up the
     primary key,
     ** **that PK creates an index and is the only index at present.
     ** **I don't think sql selection is my bottle neck. **here is why.
     ** **basically my route is like this:
     ** **/type/fiscal_year/fiscal_period/entity
     ** **entity is the only one that is "optional"
     ** **When an entity is specified I get a return in a second or two.**
     Granted a
     ** **much smaller record set.
     ** **When entity is omitted I get about 50x the result count and that is
     where
     ** **I get to a 20 second return time.
     ** **Those results at first made me think the problem was io bound on
     the
     ** **router till I saw that the first return byte was so behind.
     ** **Am I misunderstanding your suggestion that additional indexes would
     ** **improve the speed at which the DB returns the data, because I am
     ** **interpreting your suggestion as one to improve the selection speed.
     ** **On Tue, Jun 6, 2017 at 4:18 PM, George Gergues
     ** **<[1][3]george.gergues at gmail.com> wrote:

     ** ** **** **For SQL table **add at least one index. it will improve
     table
     ** ** **scans.
     ** ** **** **On Jun 6, 2017 11:44, "Ken MacKenzie"
     <[1][2][4]ken at mack-z.com> wrote:

     ** ** **** ** **** **So I am in the demo and test phase of an early
     ReSTful API
     ** ** **for
     ** ** **** ** **reporting.
     ** ** **** ** **** **Currently a wider scale report set request hits the
     ** ** **following
     ** ** **** ** **marks:
     ** ** **** ** **** **TTFB: ~20s
     ** ** **** ** **** **Record Count: ~92k
     ** ** **** ** **** **Download Size: 15.8MB
     ** ** **** ** **** **Details:
     ** ** **** ** **** **Web Server: NGINX
     ** ** **** ** **** **Python App Server: Gunicorn
     ** ** **** ** **** **Web Framework: Falcon
     ** ** **** ** **** **Python version: 3.5 (in a venv)
     ** ** **** ** **** **DB: MS SQL Server Express using SQL Alchemy +
     pyodbc
     ** ** **** ** **** **Webserver OS: CentOS 7
     ** ** **** ** **** **Gunicorn is setup with 4 workers, on a private
     port, nginx
     ** ** **does a
     ** ** **** ** **proxy
     ** ** **** ** **** **pass to the port
     ** ** **** ** **** **DB Details, the table in question has a total of
     about 8
     ** ** **million
     ** ** **** ** **rows.**
     ** ** **** ** **** **Sample query execution within SQL Server Mgmt
     Studio is ~7s
     ** ** **** ** **** **So my question is which of the following would be a
     better
     ** ** **target
     ** ** **** ** **to
     ** ** **** ** **** **improve performance, or do I need to as my
     performance
     ** ** **should be
     ** ** **** ** **** **considered good enough.** I mean the server in this
     case is
     ** ** **a
     ** ** **** ** **surplus dual
     ** ** **** ** **** **core desktop right now.
     ** ** **** ** **** **add gzip compression to nginx for proxys
     ** ** **** ** **** **switch gunicorn to use a unix socket instead of a
     tcp port
     ** ** **** ** **** **consider leaner SQL and JSON marshaling requests
     instead of
     ** ** **ORM's
     ** ** **** ** **and
     ** ** **** ** **** **dictionary bundles.
     ** ** **** ** **** **Appreciate and advice or suggestions.** Thank you.

     ** ** **** ** **_______________________________________________
     ** ** **** ** **TriZPUG mailing list
     ** ** **** ** **[2][3][5]TriZPUG at python.org
     ** ** **** **
     **[3][4][6]https://mail.python.org/mailman/listinfo/trizpug
     ** ** **** ** **[4][5][7]http://tripython.org is the Triangle Python
     Users Group

     References

     ** **Visible links
     ** **1. mailto:[8]george.gergues at gmail.com
     ** **2. mailto:[9]ken at mack-z.com
     ** **3. mailto:[10]TriZPUG at python.org
     ** **4. [11]https://mail.python.org/mailman/listinfo/trizpug
     ** **5. [12]http://tripython.org/

     _______________________________________________
     TriZPUG mailing list
     [13]TriZPUG at python.org
     [14]https://mail.python.org/mailman/listinfo/trizpug
     [15]http://tripython.org is the Triangle Python Users Group

References

   Visible links
   1. https://pypi.python.org/pypi/profilehooks
   2. mailto:ken at mack-z.com
   3. mailto:george.gergues at gmail.com
   4. mailto:ken at mack-z.com
   5. mailto:TriZPUG at python.org
   6. https://mail.python.org/mailman/listinfo/trizpug
   7. http://tripython.org/
   8. mailto:george.gergues at gmail.com
   9. mailto:ken at mack-z.com
  10. mailto:TriZPUG at python.org
  11. https://mail.python.org/mailman/listinfo/trizpug
  12. http://tripython.org/
  13. mailto:TriZPUG at python.org
  14. https://mail.python.org/mailman/listinfo/trizpug
  15. http://tripython.org/


More information about the TriZPUG mailing list