[Chicago] Suggestions on improving histogram in matplotlib?

Lewit, Douglas d-lewit at neiu.edu
Sat Mar 21 20:50:29 CET 2015


Very helpful!  Thank you!  Not sure why you couldn't reproduce my results.
I tried Python versions 2 and 3, and both worked.  Oh well.

Thanks for mentioning the "align" keyword.  I must try that out and see how
that works.
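
For anyone else who wants to try it, here is a minimal sketch of how the align keyword might be used. It assumes Matplotlib is installed. bar_centers and plot_aligned are hypothetical helper names made up for illustration, not taken from Ryan's attached script. With integer bin edges 4 through 25 and align="left", each bar should be centered on the integer sum it counts.

```python
def bar_centers(edges, align="mid"):
    """Return the x position where hist() centers each bar for an align mode."""
    lefts, rights = edges[:-1], edges[1:]
    if align == "left":
        return list(lefts)       # bars centered on the left bin edges
    if align == "right":
        return list(rights)      # bars centered on the right bin edges
    return [(a + b) / 2 for a, b in zip(lefts, rights)]  # default: mid-bin


def plot_aligned(sums, n_dice=4):
    """Histogram of dice sums with bars sitting on the integer ticks."""
    # Imported here so bar_centers() can be used without Matplotlib.
    import matplotlib.pyplot as plt

    lo, hi = n_dice, 6 * n_dice
    edges = list(range(lo, hi + 2))          # 4, 5, ..., 25 for four dice
    fig, ax = plt.subplots()
    ax.hist(sums, bins=edges, align="left", edgecolor="black")
    ax.set_xticks(range(lo, hi + 1))         # one tick per possible sum
    return fig
```

Calling plot_aligned(sums) and then pyplot.show() should display the figure. bar_centers is only there to make the three align modes easy to compare by hand.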

Numpy is great, thanks!  And yes, I could probably use Numpy to simplify a
lot of things.  However, in the process of simplifying things I'm denying
myself a programming education.  Sometimes it's helpful to do things the
hard way or the long way.  In terms of production it may not make much
sense, but in terms of education it may make all the sense in the world.
For example, we're learning about Stacks right now in my Java class.  We're
creating our own Stack class based on our Node class.  Of course this is
really the long way of doing it because Java has its own builtin Stack
class!  Why reinvent the wheel?  But I understand the professor's point of
view.  He wants us to become more conscious of what is happening under the
hood, and that won't happen (or will happen very slowly) if we keep on
taking shortcuts to solve problems.

Sometimes I'm interested in just getting results, but sometimes I'm more
interested in how the computer figured out those results.  Does that make
any sense?

On Sat, Mar 21, 2015 at 11:06 AM, Ryan Nelson <rnelsonchem at gmail.com> wrote:

> Douglas,
>
> First of all, to answer your specific comments:
> * It is probably best not to use newline characters to move around the
> text like that. You can adjust the axes (plot area) to give the title more
> room. I couldn't reproduce your problem, though, so I can't comment
> directly on what you expected to see.
> * When you pass an integer to the hist bins keyword argument, you are
> not necessarily going to get bins that start exactly at integer values. You
> can however pass in a list (or array) of specific values that you want to
> use. (See below)
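
A rough sketch of both points above, assuming Matplotlib is installed. The helper names (roll_sums, half_integer_edges, plot_sums) and the title text are made up for illustration and are not from the attached script. The idea is to pass explicit bin edges at 3.5, 4.5, ..., 24.5 so that every integer sum falls in the middle of its own bin, and to use the pad argument of set_title for extra room above the plot rather than a leading newline.

```python
import random


def roll_sums(n_rolls, n_dice=4, seed=None):
    """Roll n_dice six-sided dice n_rolls times and return the list of sums."""
    rng = random.Random(seed)
    return [sum(rng.randint(1, 6) for _ in range(n_dice))
            for _ in range(n_rolls)]


def half_integer_edges(lo, hi):
    """Bin edges lo-0.5, lo+0.5, ..., hi+0.5, so each integer sits mid-bin."""
    return [x - 0.5 for x in range(lo, hi + 2)]


def plot_sums(sums, n_dice=4):
    """Histogram with one bin per possible sum, centered on the ticks."""
    # Imported here so the helpers above work without Matplotlib.
    import matplotlib.pyplot as plt

    lo, hi = n_dice, 6 * n_dice
    fig, ax = plt.subplots()
    ax.hist(sums, bins=half_integer_edges(lo, hi), edgecolor="black")
    ax.set_xticks(range(lo, hi + 1))
    # pad (in points) lifts the title away from the axes.
    ax.set_title("Sums of four dice", pad=20)
    return fig
```

Something like plot_sums(roll_sums(10000)) followed by pyplot.show() should draw it.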
>
> Anyway, I rewrote your script in a different style. See attached. First of
> all, Numpy is your friend. It is installed with Matplotlib, so you already
> have it available. The dice roll calculation with Numpy is much faster and
> the code is simpler. Also, for various reasons, you should create your
> histogram as a separate function call, and probably as a first step. I've
> never passed an argument to the show function, but there might be some
> times when it is necessary. I also passed an align keyword argument to the
> histogram function to align the bins on top of the ticks.
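
The attached script is not reproduced in the archive, so what follows is only a guess at what a NumPy version of the dice roll could look like. The function names are hypothetical. It uses numpy.random.default_rng, which needs NumPy 1.17 or later; on older versions numpy.random.randint(1, 7, size=...) should do the same job.

```python
import numpy as np


def roll_sums_numpy(n_rolls, n_dice=4, seed=None):
    """Return an array with the sum of n_dice dice for each of n_rolls rolls."""
    rng = np.random.default_rng(seed)
    # One row per roll, one column per die; the upper bound 7 is exclusive.
    rolls = rng.integers(1, 7, size=(n_rolls, n_dice))
    return rolls.sum(axis=1)


def count_sums(sums, n_dice=4):
    """Count how often each possible sum occurs, lowest possible sum first."""
    counts = np.bincount(sums, minlength=6 * n_dice + 1)
    return counts[n_dice:]      # drop the impossible sums 0 .. n_dice-1
```

The whole simulation is one array operation instead of a Python loop over every roll, which is where the speedup comes from.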
>
> Hope that helps. The folks at the Matplotlib user email list are very
> helpful.
>
> Ryan
>
>>
>>
>> Message: 1
>> Date: Fri, 20 Mar 2015 13:33:24 -0500
>> From: "Lewit, Douglas" <d-lewit at neiu.edu>
>> To: The Chicago Python Users Group <chicago at python.org>
>> Subject: [Chicago] Suggestions on improving histogram in matplotlib?
>> Message-ID: <CAPdZZGwuY7XPhwckv14Pj3FwLXsTpMyDjR9ocXGkENz4M+CVLw at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hey there,
>>
>> I'm attaching my script (done in Python 3 but it should work almost the
>> same in Python 2) and also a .png file that shows the resulting histogram.
>> It looks really good!  However, a couple of comments or criticisms.  First
>> off, if you look at my command *pyplot.title(.....)* you'll see that the
>> title begins with "\n" (the newline character) but it doesn't cause an
>> extra blank line above the title in my plot.  Any suggestions?  Or is that
>> just a minor bug in matplotlib?
>>
>> Also, please note that the rectangles of the histogram are not squarely
>> placed over the corresponding tickmarks, except when you get closer to the
>> center of the histogram, around 11, 12, 13, 14, 15, 16, 17, 18, etc.  But
>> further out the rectangles appear toward the side of the corresponding
>> tickmark.  Again, I'm not sure if that's a true "bug" or maybe I'm just
>> not interpreting the histogram properly or maybe that's how all
>> histograms should be drawn!  I really don't know, but if you have any
>> suggestions please let me know.
>>
>> In case you're wondering, this is the problem of rolling 4 dice and
>> recording their sums.  So the smallest sum should be 4 (1 + 1 + 1 + 1) and
>> the largest sum should be 24 (6 + 6 + 6 + 6).  The sum of 14 is the most
>> probable of all the sums, with a probability of a little over 11%.  The other
>> sums are less probable, and the least probable sums of all are 4 and 24.
>> The data look normally distributed with a mean of about 14, but I'm not
>> sure about the variance and standard deviation.  I didn't bother to compute
>> those, but they are probably not that difficult to compute from the raw
>> data generated by Python.
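
The variance does not have to come from simulated data. A small sketch that enumerates all 6**4 = 1296 outcomes and computes the exact values with the standard library only. The function names are made up for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def exact_distribution(n_dice=4, sides=6):
    """Map each possible sum to its exact probability as a Fraction."""
    counts = Counter(sum(faces)
                     for faces in product(range(1, sides + 1), repeat=n_dice))
    total = sides ** n_dice
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


def mean_and_variance(dist):
    """Exact mean and variance of a {value: probability} distribution."""
    mean = sum(s * p for s, p in dist.items())
    var = sum((s - mean) ** 2 * p for s, p in dist.items())
    return mean, var
```

For four dice this gives a mean of 14, a variance of 35/3 (standard deviation about 3.42), and P(14) = 146/1296, or about 11.3%.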
>>
>> The script won't work unless you have matplotlib as part of your Python
>> installation!  Matplotlib is pretty amazing, and produces graphics that
>> are on par with Maple, Mathematica and Matlab.  (Matplotlib's plots look very
>> "Matlabby" to me.  I'm not sure if that's a coincidence or if that was an
>> intentional part of the design.  What do you think? )
>>
>> Okay, thanks!  I appreciate any constructive suggestions on improving the
>> histogram so that it has professional textbook quality.
>>
>> Best,
>>
>> Douglas Lewit
>>
>
> _______________________________________________
> Chicago mailing list
> Chicago at python.org
> https://mail.python.org/mailman/listinfo/chicago
>
>