[Tutor] Statistics with python

Oscar Benjamin oscar.j.benjamin at gmail.com
Sun Oct 14 16:18:05 EDT 2018


I'm replying back to the tutor list. Can you reply there rather than
directly to me please?

Also I've moved your response below mine as that is the preferred
style on this list. My answer is below.

On Sun, 14 Oct 2018 at 16:05, Mariam Haji <mariamhaji01 at gmail.com> wrote:
>
> On Sat, Oct 13, 2018 at 10:24 PM Oscar Benjamin <oscar.j.benjamin at gmail.com> wrote:
>>
>> On Sat, 13 Oct 2018 at 11:23, Mariam Haji <mariamhaji01 at gmail.com> wrote:
>> >
>> > Hi guys,
>>
>> Hi Mariam
>>
>> > the question is as:
>> > If a sample of 50 patients is taken from a dataset what is the probability
>> > that we will get a patient above the age of 56?
>>
>> I can think of several ways of interpreting this:
>>
>> (a): You have a dataset consisting of 50 patients. You want to know
>> the probability that a patient chosen from that sample will be above
>> the age of 56.
>>
>> (b): You have a dataset consisting of 50 patients. You consider it to
>> be representative of a larger population of people. You would like to
>> use your dataset to estimate the probability that a patient chosen
>> from the larger population will be above the age of 56.
>>
>> (c): You have a larger dataset consisting of more than 50 patients.
>> You want to know the probability that a sample of 50 patients chosen
>> from the larger dataset will contain at least (or exactly?) one person
>> above the age of 56.
>>
>> (d): You have a larger dataset, but you will only analyse a sample of
>> 50 patients from it. You want to use statistics on that sample to
>> estimate the probability that a patient chosen from the larger dataset
>> will be above the age of 56.
>>
>> I can list more interpretations but I think it would be better to wait
>> for you to clarify.
>
> My dataset consists of 300+ patients and I want to analyse a sample of 50 patients from it.
> I want to know the probability that a patient chosen from the larger dataset
> will be above the age of 56.

Is this a homework problem or an actual problem?

If I had 300+ patients I would think that the best way to work out the
probability that a patient chosen from those 300+ was over the age of
56 would be to count how many are over the age of 56. Likewise if I
wanted to estimate how many would be over the age of 56 using a
smaller sample of 50 patients then I would also just count how many
are over the age of 56 in that smaller sample.
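
As a rough sketch of that counting approach, assuming the ages are
held in a plain Python list (the values and the fraction_over helper
below are made up for illustration and are not your data):

```python
import random

# Hypothetical ages standing in for the real dataset (assumption).
ages = [23, 35, 41, 45, 52, 55, 57, 60, 63, 70]

def fraction_over(values, threshold=56):
    """Return the fraction of values strictly above threshold."""
    return sum(v > threshold for v in values) / len(values)

# Proportion over 56 in the full dataset.
p_all = fraction_over(ages)

# Estimate of the same proportion from a smaller random sample.
# The sample size of 5 is arbitrary; with 300+ patients it would be 50.
random.seed(0)
sample = random.sample(ages, 5)
p_sample = fraction_over(sample)

print(p_all, p_sample)
```

The estimate from the sample will generally differ a little from the
proportion in the full dataset, and it will vary from one random
sample to the next.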

I'm going to guess that this is a homework problem and that you have
been asked to assume that the ages are normally distributed (which
they would not be in reality).

Your calculation for the standard deviation given in your earlier
email doesn't make any sense. You should calculate this using a
function that calculates the standard deviation. There is one in the
numpy module:

>>> import numpy
>>> ages = [35, 45, 55, 70]
>>> numpy.mean(ages)
51.25
>>> numpy.std(ages)
12.93010054098575
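
Note that numpy.std divides by n by default (ddof=0). If the ages are
a sample from a larger population you may want ddof=1, which divides
by n - 1. If the exercise really does ask you to assume normally
distributed ages (that is my guess, not something you have confirmed),
then a sketch of the probability calculation might look like the
following. The prob_above helper is my own name for illustration and
the ages are the same toy numbers as above:

```python
import math
import numpy

ages = [35, 45, 55, 70]  # toy data, not the real patients

mean = numpy.mean(ages)
std_pop = numpy.std(ages)             # ddof=0: divides by n
std_sample = numpy.std(ages, ddof=1)  # ddof=1: divides by n - 1

def prob_above(threshold, mu, sigma):
    """P(X > threshold) for X normally distributed with mean mu, sd sigma."""
    z = (threshold - mu) / sigma
    # Upper tail of the standard normal via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))

p_over_56 = prob_above(56, mean, std_sample)
print(p_over_56)
```

This only gives a sensible answer to the extent that the normal
assumption holds. Counting directly makes no such assumption.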

--
Oscar

