[TriPython] Prediction Model. Data Visualization.

Art artem.nesterenko at gmail.com
Wed Oct 11 13:19:25 EDT 2017


Francois,

Thank you for your response! I like the idea of building a heatmap or using
Yellowbrick rather than going with the bar graph.
Will see if I can make it happen...

I'd like to say a big thank you to everyone for your suggestions and links!
Now I have plenty of materials to work with.

It was a good idea to email this group :)

Best,
Art.

Art Nestsiarenka
email: artem.nesterenko at gmail.com
Cell: (919) 455-5055



On Wed, Oct 11, 2017 at 12:18 PM, Francois Dion <francois.dion at gmail.com>
wrote:

>    Art (and list members interested in visualization),
>
>    As Dave mentioned, donut charts work best for progress toward a goal,
>    i.e. a percentage, like a dashboard gauge, or for something where the
>    50% mark is important, say a win/loss indicator of the Carolina
>    Hurricanes against a visitor. Similarly, the ancestor of the donut
>    chart, the pie chart, is best suited for parts of a whole when you have
>    2 or 3 elements at most. Beyond that, it is almost impossible to figure
>    out the percentages and relative importance. Bar charts do much better
>    when there are more than 2 or 3 values.
>
>    A confusion matrix, in the simplest binary case, bins the 4 possible
>    outcomes of a classifier: true positive (you are part of the class and
>    I said so), false positive (you are not part of the class but I said
>    you were), true negative (you are not part of the class and I said so)
>    and false negative (you are part of the class but I said you were not).
>    Unsurprisingly, a confusion matrix is expected to be represented as a
>    matrix. The standard way to do this is a table (of actual against
>    predicted), hence the name. This has been the case since at least the
>    1950s (without doing an exhaustive search, just from memory). For
>    example, I just pulled Mike James' "Classification Algorithms" from
>    1985, page 83, and there it is. He also sums each row and column.
>
>    But, sure, the plain text table is a bit drab if you are looking for
>    maximum impact. So, that's where I was suggesting a heatmap. Or you can
>    use the Python package Yellowbrick.
>
>    Here's an example using seaborn's heatmap (and making sure I label the
>    axes, else it is useless). I used cmap="Greens":
>
>    [1]https://datasciencefrancois.tumblr.com/post/166291770900/confusion-matrix-with-a-single-color-sequential
>
>    I've had no problem using this with technical and non-technical
>    audiences. I have shown CMs like the above (and a variety of other
>    graphical and semigraphical displays) to business folks who then
>    proceeded to green-light further phases of fairly large data science
>    projects. Once they've seen one and got it, you never have to explain it
>    again. Without the heatmap colors, it was super challenging to have
>    people "get it".
>
>    Also, you might be interested in this list of books on visualization
>    (from my "ex-libris" series on LinkedIn):
>
>    [2]https://www.linkedin.com/pulse/ex-libris-data-scientist-part-v-visualization-francois-dion/
>
>    In particular, Stephen Few's "Show Me the Numbers: Designing Tables and
>    Graphs to Enlighten", along with Cairo's "The Functional Art", should
>    definitely be on everyone's reading list and will get you started if
>    you can't commit to reading 1 viz book per week for the next 2 years :)
>
>    Thanks,
>    Francois
>
>    On Wed, Oct 11, 2017 at 8:52 AM, Art <[3]artem.nesterenko at gmail.com>
>    wrote:
>
>      Donut graph:
>      [4]https://imgur.com/a/C7r8x
>      You should be able to see it now.
>      Art Nestsiarenka
>
>    --
>    [5]about.me/francois.dion - [6]www.pyptug.org - [7]www.3DFutureTech.info -
>    [8]@f_dion
>
> References
>
>    Visible links
>    1. https://datasciencefrancois.tumblr.com/post/166291770900/confusion-matrix-with-a-single-color-sequential
>    2. https://www.linkedin.com/pulse/ex-libris-data-scientist-part-v-visualization-francois-dion/
>    3. mailto:artem.nesterenko at gmail.com
>    4. https://imgur.com/a/C7r8x
>    5. http://about.me/francois.dion
>    6. http://www.pyptug.org/
>    7. http://www.3dfuturetech.info/
>    8. http://twitter.com/f_dion
>
> _______________________________________________
> TriZPUG mailing list
> TriZPUG at python.org
> https://mail.python.org/mailman/listinfo/trizpug
> http://tripython.org is the Triangle Python Users Group
>
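For anyone following along, the binary confusion matrix described in the quoted message (four outcomes, laid out as actual against predicted, with row and column sums) can be tallied in a few lines of plain Python. This is only a rough sketch: the function names and the sample labels are made up for illustration and do not come from anyone's actual model.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive=1):
    """Tally the four binary outcomes from paired actual/predicted labels."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive:
            # In the class: either we said so (TP) or we missed it (FN).
            counts["TP" if p == positive else "FN"] += 1
        else:
            # Not in the class: either we wrongly said so (FP) or not (TN).
            counts["FP" if p == positive else "TN"] += 1
    return counts


def as_matrix(counts):
    """Rows are actual (negative, positive); columns are predicted."""
    return [[counts["TN"], counts["FP"]],
            [counts["FN"], counts["TP"]]]


# Hypothetical labels, purely for illustration.
actual = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 1, 0, 1, 0, 0, 1, 0, 1]

counts = confusion_counts(actual, predicted)
matrix = as_matrix(counts)
row_sums = [sum(row) for row in matrix]        # totals per actual class
col_sums = [sum(col) for col in zip(*matrix)]  # totals per predicted class

print(matrix, row_sums, col_sums)
```

The row sums give how many cases were actually in each class, and the column sums give how many were predicted into each class, which is the same bookkeeping as summing each row and column of the table.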
>
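A minimal sketch of the seaborn heatmap suggested in the quoted message, with labeled axes and cmap="Greens". It assumes seaborn and matplotlib are installed; the counts, the class labels, the helper name and the output file name are placeholders, and this is not the code behind the linked Tumblr post.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import seaborn as sns


def plot_confusion_heatmap(matrix, labels, ax=None):
    """Draw a confusion matrix as an annotated single-color heatmap."""
    if ax is None:
        _, ax = plt.subplots(figsize=(4, 3))
    sns.heatmap(
        matrix,
        annot=True,          # write the count inside each cell
        fmt="d",             # counts are integers
        cmap="Greens",       # sequential single-color palette
        cbar=False,
        xticklabels=labels,
        yticklabels=labels,
        ax=ax,
    )
    # Label the axes; without them the picture is hard to read.
    ax.set_xlabel("Predicted")
    ax.set_ylabel("Actual")
    ax.set_title("Confusion matrix")
    return ax


# Hypothetical counts: rows are actual, columns are predicted.
matrix = [[4, 2], [1, 3]]
ax = plot_confusion_heatmap(matrix, ["negative", "positive"])
ax.figure.savefig("confusion_matrix.png", bbox_inches="tight")
```

Passing an existing `ax` lets the same helper drop the heatmap into a larger figure, for example next to a bar chart of the same results.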