[Python-checkins] Apply edits from Allen Downey's review of the linear_regression docs. (GH-26176)

rhettinger webhook-mailer at python.org
Sun May 16 22:21:23 EDT 2021


https://github.com/python/cpython/commit/b3f65e819f552561294a66e350a9f5a3131f7df2
commit: b3f65e819f552561294a66e350a9f5a3131f7df2
branch: main
author: Raymond Hettinger <rhettinger at users.noreply.github.com>
committer: rhettinger <rhettinger at users.noreply.github.com>
date: 2021-05-16T19:21:14-07:00
summary:

Apply edits from Allen Downey's review of the linear_regression docs. (GH-26176)

files:
M Doc/library/statistics.rst
M Lib/statistics.py

diff --git a/Doc/library/statistics.rst b/Doc/library/statistics.rst
index 117d2b63cbea1..a65c9840b8113 100644
--- a/Doc/library/statistics.rst
+++ b/Doc/library/statistics.rst
@@ -631,25 +631,25 @@ However, for reading convenience, most of the examples show sorted sequences.
    Return the intercept and slope of `simple linear regression
    <https://en.wikipedia.org/wiki/Simple_linear_regression>`_
    parameters estimated using ordinary least squares. Simple linear
-   regression describes relationship between *regressor* and
-   *dependent variable* in terms of linear function:
+   regression describes the relationship between *regressor* and
+   *dependent variable* in terms of this linear function:
 
       *dependent_variable = intercept + slope \* regressor + noise*
 
    where ``intercept`` and ``slope`` are the regression parameters that are
-   estimated, and noise term is an unobserved random variable, for the
+   estimated, and noise represents the
    variability of the data that was not explained by the linear regression
-   (it is equal to the difference between prediction and the actual values
+   (it is equal to the difference between predicted and actual values
    of dependent variable).
 
    Both inputs must be of the same length (no less than two), and regressor
-   needs not to be constant, otherwise :exc:`StatisticsError` is raised.
+   needs not to be constant; otherwise :exc:`StatisticsError` is raised.
 
-   For example, if we took the data on the data on `release dates of the Monty
+   For example, we can use the `release dates of the Monty
    Python films <https://en.wikipedia.org/wiki/Monty_Python#Films>`_, and used
-   it to predict the cumulative number of Monty Python films produced, we could
-   predict what would be the number of films they could have made till year
-   2019, assuming that they kept the pace.
+   it to predict the cumulative number of Monty Python films
+   that would have been produced by 2019
+   assuming that they kept the pace.
 
    .. doctest::
 
@@ -659,14 +659,6 @@ However, for reading convenience, most of the examples show sorted sequences.
       >>> round(intercept + slope * 2019)
       16
 
-   We could also use it to "predict" how many Monty Python films existed when
-   Brian Cohen was born.
-
-   .. doctest::
-
-      >>> round(intercept + slope * 1)
-      -610
-
    .. versionadded:: 3.10
 
 
diff --git a/Lib/statistics.py b/Lib/statistics.py
index 507a5b2d79dce..5d38f855020f4 100644
--- a/Lib/statistics.py
+++ b/Lib/statistics.py
@@ -930,15 +930,15 @@ def linear_regression(regressor, dependent_variable, /):
     Return the intercept and slope of simple linear regression
     parameters estimated using ordinary least squares. Simple linear
     regression describes relationship between *regressor* and
-    *dependent variable* in terms of linear function::
+    *dependent variable* in terms of linear function:
 
         dependent_variable = intercept + slope * regressor + noise
 
-    where ``intercept`` and ``slope`` are the regression parameters that are
-    estimated, and noise term is an unobserved random variable, for the
-    variability of the data that was not explained by the linear regression
-    (it is equal to the difference between prediction and the actual values
-    of dependent variable).
+    where *intercept* and *slope* are the regression parameters that are
+    estimated, and noise represents the variability of the data that was
+    not explained by the linear regression (it is equal to the
+    difference between predicted and actual values of dependent
+    variable).
 
     The parameters are returned as a named tuple.
 


