From stanczakdominik at gmail.com Thu Oct 1 00:04:23 2020
From: stanczakdominik at gmail.com (Dominik Stańczak)
Date: Thu, 1 Oct 2020 06:04:23 +0200
Subject: [SciPy-Dev] Issues Inserting Graphical Overlay Using Matplotlib Patches
In-Reply-To: 
References: 
Message-ID: 

Hey,

Subplots already includes a call to figure, which creates a new active graph, as you put it. This is the empty one that pops out. I'd do this:

fig, graph = plt.subplots(figsize=(6, 6))
# just a single subplots call at the beginning, as you want it all on a single image
plt.xlabel('V1')
plt.ylabel('V2')
plt.title('Visualization of raw data')
plt.scatter(data.iloc[:, 0], data.iloc[:, 1])
plt.scatter(mean[0], mean[1])
# remove plt.figure(figsize=(6, 6)) here
graph.add_patch(ellipse)
plt.show()

I expect that to work, or at least to head in the right direction :)

Cheers,
Dominik

On Wed, Sep 30, 2020, 19:55 Stephen Malcolm wrote:
> Hello All,
>
> I'm having some trouble adding a graphical overlay, i.e. an ellipse, onto my plot.
> I wish to do this as I need to explain/portray the mean, standard deviation and outliers, and hence evaluate the suitability of the dataset.
>
> Could you please let me know what code I'm missing, or need to add, in order to insert this ellipse?
>
> I have no trouble plotting the data points and the mean using this code; however, the ellipse (width and height / standard deviation) doesn't appear.
> I get no errors; instead, I'm getting a separate graph (without data points or ellipse) below the plotted one.
>
> Please find my code below:
>
> #pandas used to read the dataset and return the data
> #numpy and matplotlib to represent and visualize the data
> #sklearn to implement the kmeans algorithm
>
> import pandas as pd
> import numpy as np
> import matplotlib.pyplot as plt
> from sklearn.cluster import KMeans
>
> #import the data
> data = pd.read_csv('banknotes.csv')
>
> #extract values
> x = data['V1']
> y = data['V2']
>
> #print range to determine normalization
> print("X_max : ", x.max())
> print("X_min : ", x.min())
> print("Y_max : ", y.max())
> print("Y_min : ", y.min())
>
> #normalize values
> mean_x = x.mean()
> mean_y = y.mean()
> max_x = x.max()
> max_y = y.max()
> min_x = x.min()
> min_y = y.min()
>
> for i in range(0, x.size):
>     x[i] = (x[i] - mean_x) / (max_x - min_x)
>
> for i in range(0, y.size):
>     y[i] = (y[i] - mean_y) / (max_y - min_y)
>
> #statistical analysis using mean and standard deviation
> import matplotlib.patches as patches
>
> mean = np.mean(data, 0)
> std_dev = np.std(data, 0)
>
> ellipse = patches.Ellipse([mean[0], mean[1]], std_dev[0]*2, std_dev[1]*2, alpha=0.25)
>
> plt.xlabel('V1')
> plt.ylabel('V2')
> plt.title('Visualization of raw data')
> plt.scatter(data.iloc[:, 0], data.iloc[:, 1])
> plt.scatter(mean[0], mean[1])
> plt.figure(figsize=(6, 6))
>
> fig, graph = plt.subplots()
> graph.add_patch(ellipse)
>
> Kind Regards,
> Stephen
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stephen_malcolm at hotmail.com Thu Oct 1 05:49:20 2020
From: stephen_malcolm at hotmail.com (Stephen Malcolm)
Date: Thu, 1 Oct 2020 09:49:20 +0000
Subject: [SciPy-Dev] Issues Inserting Graphical Overlay Using Matplotlib Patches
In-Reply-To: 
References: 
Message-ID: 

Hi Dominik,

Thanks for the feedback. Ok, so I ran your code.
However, I'm getting an error, which is shown below the code. Hoping you can shed some light? Appreciate your help on this.

import matplotlib.patches as patches

fig, graph = plt.subplots(figsize=(6, 6))
# just a single subplots call at the beginning, as you want it all on a single image
plt.xlabel('V1')
plt.ylabel('V2')
plt.title('Visualization of raw data')
plt.scatter(data.iloc[:, 0], data.iloc[:, 1])
plt.scatter(mean[0], mean[1])
# remove plt.figure(figsize=(6, 6)) here
graph.add_patch(ellipse)
plt.show()

The error is:

NameError                                 Traceback (most recent call last)
in ()
      3 import matplotlib.patches as patches
      4
----> 5 fig, graph = plt.subplots(figsize=(6, 6))
      6 # just a single subplots call at the beginning as you want it all on a single image
      7 plt.xlabel('V1')

NameError: name 'plt' is not defined

________________________________
From: SciPy-Dev on behalf of Dominik Stańczak
Sent: 01 October 2020 04:04
To: SciPy Developers List
Subject: Re: [SciPy-Dev] Issues Inserting Graphical Overlay Using Matplotlib Patches

> [earlier message quoted in full; trimmed]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From andrea.gavana at gmail.com Thu Oct 1 06:42:45 2020
From: andrea.gavana at gmail.com (Andrea Gavana)
Date: Thu, 1 Oct 2020 12:42:45 +0200
Subject: [SciPy-Dev] Issues Inserting Graphical Overlay Using Matplotlib Patches
In-Reply-To: 
References: 
Message-ID: 

import matplotlib.pyplot as plt?

On Thu, 1 Oct 2020 at 11.49, Stephen Malcolm wrote:
> [earlier message quoted in full; trimmed]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stephen_malcolm at hotmail.com Thu Oct 1 06:58:49 2020
From: stephen_malcolm at hotmail.com (Stephen Malcolm)
Date: Thu, 1 Oct 2020 10:58:49 +0000
Subject: [SciPy-Dev] Issues Inserting Graphical Overlay Using Matplotlib Patches
In-Reply-To: 
References: 
Message-ID: 

I tried your line of code, Andrea. I get the plot, but no ellipse. I think I still need matplotlib.patches, otherwise it's not possible to get the overlay/ellipse.

Sent from my iPhone

On 2020. Oct 1., at 12:43, Andrea Gavana wrote:
> import matplotlib.pyplot as plt?
>
> [remainder of quoted thread trimmed]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stephen_malcolm at hotmail.com Thu Oct 1 09:19:07 2020
From: stephen_malcolm at hotmail.com (Stephen Malcolm)
Date: Thu, 1 Oct 2020 13:19:07 +0000
Subject: [SciPy-Dev] Issues Inserting Graphical Overlay Using Matplotlib Patches
In-Reply-To: 
References: 
Message-ID: 

Hello All,

For your information, I managed to rectify the code to incorporate my ellipse onto one plot. I used some of Dominik's suggestions, but I also moved the 'fig, graph = plt.subplots(figsize=(6, 6))' line to just below the ellipse syntax, i.e. 'ellipse = patches.Ellipse([mean[0], mean[1]], std_dev[0]*2, std_dev[1]*2, alpha=0.25)'. That made all the difference. Thanks for all your input.

#statistical analysis using mean and standard deviation
import matplotlib.patches as patches

mean = np.mean(data, 0)
std_dev = np.std(data, 0)

ellipse = patches.Ellipse([mean[0], mean[1]], std_dev[0]*2, std_dev[1]*2, alpha=0.25)

fig, graph = plt.subplots(figsize=(6, 6))
graph.scatter(data.iloc[:, 0], data.iloc[:, 1])
graph.scatter(mean[0], mean[1])
plt.xlabel('V1')
plt.ylabel('V2')
plt.title('Visualization of raw data')
graph.add_patch(ellipse)

________________________________
From: SciPy-Dev on behalf of Andrea Gavana
Sent: 01 October 2020 10:42
To: SciPy Developers List
Subject: Re: [SciPy-Dev] Issues Inserting Graphical Overlay Using Matplotlib Patches

> [earlier messages quoted in full; trimmed]

-------------- next part --------------
An HTML attachment was scrubbed...
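The fix Stephen arrived at boils down to one rule: create the figure and axes first, then draw everything, including the patch, on those same axes. A minimal self-contained sketch of that pattern follows; synthetic data stands in for banknotes.csv, which is not available here, and a non-interactive backend is used so it runs headless.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window needs to open
import matplotlib.pyplot as plt
import matplotlib.patches as patches

# Synthetic stand-in for the banknotes data: 200 points in two columns
rng = np.random.default_rng(0)
data = rng.normal(loc=(0.0, 0.5), scale=(1.0, 2.0), size=(200, 2))

mean = data.mean(axis=0)
std_dev = data.std(axis=0)
ellipse = patches.Ellipse((mean[0], mean[1]),
                          std_dev[0] * 2, std_dev[1] * 2, alpha=0.25)

# One subplots() call up front; no later plt.figure(), which would open a
# second, empty figure and leave the ellipse attached to the wrong axes.
fig, graph = plt.subplots(figsize=(6, 6))
graph.scatter(data[:, 0], data[:, 1])
graph.scatter(mean[0], mean[1])
graph.add_patch(ellipse)
graph.set_xlabel('V1')
graph.set_ylabel('V2')
graph.set_title('Visualization of raw data')
```

Because the ellipse is added to the axes returned by the single subplots() call, it lands on the same figure as the scatter points instead of on a second, empty one.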
URL: 

From thomas.c.hodson at gmail.com Sat Oct 3 09:01:40 2020
From: thomas.c.hodson at gmail.com (Thomas Hodson)
Date: Sat, 3 Oct 2020 15:01:40 +0200
Subject: [SciPy-Dev] PR Idea: Allowing multiple axis arguments (axis = tuple of ints) in the stats package?
Message-ID: 

Hello!

I use scipy.stats.sem a lot and I would love it if it were able to take multiple axis arguments, as many numpy functions can. Looking at the source code, sem and other functions in scipy.stats are already implemented mostly in terms of numpy functions, so it seems like it would only require changing some parts of the logic. Taking stats.sem as an example, I think it would require changing

n = a.shape[axis]

to something like

n = product(a.shape[axis])

For masked arrays, a.count(axis) is used, which already works with multiple axes.

Before I start work on a PR, I wanted to ask if there is some reason that this change would be considered a bad idea. Otherwise, if I write a decent PR with tests, benchmarks and documentation that updates all the functions in scipy.stats that might reasonably take multiple axis arguments, is it likely to be accepted?

Thanks,
Tom

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stephen_malcolm at hotmail.com Sun Oct 4 16:31:13 2020
From: stephen_malcolm at hotmail.com (Stephen Malcolm)
Date: Sun, 4 Oct 2020 20:31:13 +0000
Subject: [SciPy-Dev] Re-running Kmeans on a dataset in Python
Message-ID: 

Hello,

I've written some code to run KMeans on a dataset (please see below), and I've plotted the results with my two clusters/centroids. However, I have to re-run KMeans several times and pull up different plots (showing the different centroid positions). Can someone point me in the right direction on how to write the extra code to perform this task? Then I have to conclude whether KMeans is stable; I believe that corresponds to the lowest sum of squared errors?

Thanking you in advance.
#pandas used to read the dataset and return the data
#numpy and matplotlib to represent and visualize the data
#sklearn to implement the kmeans algorithm

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

#import the data
data = pd.read_csv('file.csv')

#extract values
V1 = data['V1']
V2 = data['V2']

V1_V2 = np.column_stack((V1, V2))

km_res = KMeans(n_clusters=2).fit(V1_V2)
y_kmeans = km_res.predict(V1_V2)

plt.scatter(V1, V2, c=y_kmeans, cmap='viridis', s=50, alpha=0.5)
plt.xlabel('V1')
plt.ylabel('V2')
plt.title('Visualization of raw data')

clusters = km_res.cluster_centers_
plt.scatter(clusters[:, 0], clusters[:, 1], c='blue', s=150)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From powen.kao.tw at gmail.com Sun Oct 4 17:25:40 2020
From: powen.kao.tw at gmail.com (Po-Wen Kao)
Date: Sun, 4 Oct 2020 23:25:40 +0200
Subject: [SciPy-Dev] Enhance stats.shapiro() to take n-dimension input and nan_policy
Message-ID: 

Hi scipy-dev,

Recently I have been working on data analysis combining pandas and scipy. I realized that some APIs in the stats module take only 1-d arrays and can't omit np.nan values, so I created a PR (https://github.com/scipy/scipy/pull/12916) on the function that I use the most. If the community finds this feasible, I would be happy to help upgrade other APIs in the future as well. :)

In case you find this PR interesting, feel free to be a reviewer of the change. Thanks.

Best Regards,
Po-Wen Kao

-------------- next part --------------
An HTML attachment was scrubbed...
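Until an n-dimensional shapiro() with a nan_policy lands, the behavior the PR describes can be approximated from the outside. A sketch, assuming only numpy and scipy; the helper name shapiro_pvalues is made up for illustration and is not part of scipy's API:

```python
import numpy as np
from scipy import stats

def shapiro_pvalues(a, axis=0):
    """Shapiro-Wilk p-value for each 1-d slice along `axis`, ignoring NaNs."""
    def pval(x):
        x = x[~np.isnan(x)]          # hand-rolled nan_policy='omit'
        return stats.shapiro(x)[1]   # shapiro returns (statistic, pvalue)
    return np.apply_along_axis(pval, axis, a)

rng = np.random.default_rng(42)
a = rng.normal(size=(50, 3))
a[0, 1] = np.nan                     # a missing value, silently omitted
p = shapiro_pvalues(a, axis=0)       # one p-value per column, shape (3,)
```

np.apply_along_axis is a convenience loop, not vectorization, so a native n-d implementation inside scipy.stats would still be a real improvement over this workaround.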
URL: 

From grlee77 at gmail.com Mon Oct 5 01:02:52 2020
From: grlee77 at gmail.com (Gregory Lee)
Date: Mon, 5 Oct 2020 01:02:52 -0400
Subject: [SciPy-Dev] Discontinued Rackspace Open Compute Discount
In-Reply-To: 
References: <470871448.2376.1581182952898@wamui-napoleon.atl.sa.earthlink.net>
Message-ID: 

On Sat, Feb 15, 2020 at 5:11 AM Olivier Grisel wrote:
> Tyler, Tom: I added you as co-owners of the scipy-wheels-nightly organization.
> Gregory, Tyler, Tom: I added you as co-owners of the multibuild-wheels-staging organization.

Hi Olivier,

I was previously added as a co-owner of multibuild-wheels-staging. Can you add me to scipy-wheels-nightly as well (username: grlee77)? (This is for the scikit-image team.)

> --
> Olivier
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From laurin.steidle at uni-hamburg.de Mon Oct 5 06:26:04 2020
From: laurin.steidle at uni-hamburg.de (Laurin Steidle)
Date: Mon, 5 Oct 2020 12:26:04 +0200
Subject: [SciPy-Dev] Proposal for a new ODE Solver
Message-ID: <6560341b-66a9-2467-d9da-8e97c5a198f0@uni-hamburg.de>

Hi all,

I've proposed the implementation of the modified Patankar-Runge-Kutta (MPRK) ODE solver (original message sent Thu, 27 Aug 2020). I've continued with the development, but now I am stuck at a point where a decision needs to be made. I've presented it in more detail in the GitHub issue, but to summarize quickly: MPRK requires the ODE to be of the form

    dy/dt = p(t, y) + d(t, y),   where p is positive and d is negative,

instead of the scipy default

    dy/dt = f(t, y)

This opens up three options on how to proceed from here (that I can think of):

1. Allow the new format to be used in solve_ivp.
   (creates two disjoint ODE solver sets in solve_ivp)
2. Create a new "solve_ivp"-like function.
   (creates a naming conflict, as both methods solve IVPs)
3. Abandon the new solver.

I personally don't like any of these options very much, but I am inclined towards the first; the reasons are outlined in the GitHub issue. Any thoughts on this?

Best,
Laurin

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at gmail.com Tue Oct 6 16:44:20 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 6 Oct 2020 21:44:20 +0100
Subject: [SciPy-Dev] PR Idea: Allowing multiple axis arguments (axis = tuple of ints) in the stats package?
In-Reply-To: 
References: 
Message-ID: 

On Sat, Oct 3, 2020 at 2:02 PM Thomas Hodson wrote:
> [earlier message quoted in full; trimmed]

Good question, thanks for asking. At first sight it does seem appealing, but it is also a bit worrying that we may end up with something less consistent. Right now in scipy.stats there are O(100) instances of the axis keyword, and they all work the same (int or None), with the exception of gstd and iqr, which take multiple integers.
NumPy, on the other hand, is far messier: whether or not a tuple of ints is accepted is less predictable, and it's also more common to use "axes" when a tuple of ints is accepted. So I'd think it's fine if you indeed do it for all functions in scipy.stats, provided there are no unexpected complications in the implementation (e.g. it makes functions with nan_policy or shape prediction much harder to get right).

Cheers,
Ralf

> Thanks,
> Tom
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rejones7 at msn.com Mon Oct 12 17:35:26 2020
From: rejones7 at msn.com (rondall jones)
Date: Mon, 12 Oct 2020 21:35:26 +0000
Subject: [SciPy-Dev] Resubmission of autosol.py
Message-ID: 

I am resubmitting an enhanced version of my autosol.py module. I have uploaded the code to scipy\linalg, and have uploaded a silent full-coverage test to scipy\linalg\tests.

WHAT IS IN THIS MODULE

Autosol.py is a package of automatically regularized solvers for dense linear algebraic systems. The term "regularize" (see Wikipedia) means to add extra information to a set of equations (or other application) in order to move a poor result toward what is needed. In our case this extra information is mainly three things: 1) requesting that the solution exclude abnormal behavior caused by unavoidable noise in the data; 2) adding to the system of regular least-squares equations some specific equations that are known to be exactly true, such as "the solution sums to 1.0"; 3) giving the solver specific bounds on the answer by way of inequality constraints, such as requesting a non-negative solution.

arls() is the main automatically regularized least-squares solver for Ax = b. The equations can be easy to solve, mildly or strongly ill-conditioned, or singular. "arls" stands for "Automatically Regularized Least Squares."
I have changed the names of the routines in autosol from the original long names to more scipy-like short names. I have left the module name as it was.

arlseq() builds on arls() to add Equality Constraints. There can be any number of equality constraints. Conflicts or other issues between these equations are resolved. Each equation is either solved exactly or rejected as incompatible with the other equations. This is normal behavior for such solvers.

arlsgt() builds on arlseq() to add Inequality Constraints. These often have the form of "the sum of the solution elements must be 100", or "the solution must start at 1.0", etc. There can be any number of inequality constraints. Inequality constraints are promoted to equality constraints when they are violated. This is, again, normal behavior for such solvers. See Lawson & Hanson's "Solving Least Squares Problems" book.

arlsnn() builds on arlsgt() to provide a simple non-negativity constraint, like scipy's nnls().

WHAT IS SIGNIFICANT ABOUT THESE ROUTINES?

These solvers are different from anything currently in scipy. (Skip this paragraph if you are not interested in the mathematical details!) Briefly, if Ax = b and A = U*S*Vt (an SVD), then U*S*Vt * x = b can be written S*(Vt*x) = Ut*b. The "Discrete Picard Condition", which is the basic insight that autosol uses, says that the right-hand side, Ut*b, must decrease faster than the singular values in S. In ill-conditioned problems that condition fails, often dramatically, and of course it fails by definition if A is singular. In these cases a careful analysis of the vector g = S+ * Ut*b can produce what might be called a "usable rank" (which is smaller than the numerical rank), which then allows us to produce an estimate of the error in b, which then directly leads to an appropriate value for the lambda regularization parameter used in Tikhonov regularization. Thus, we can regularize Ax = b with no user input hints and with no failure-prone iterative calculations.
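The mechanics described above -- comparing how Ut*b decays against the singular values, then applying Tikhonov regularization -- can be illustrated with a minimal NumPy sketch. This is not autosol's algorithm: here the lambda is chosen by hand rather than derived automatically from the Picard analysis, and the Hilbert matrix is just a standard ill-conditioned test case:

```python
import numpy as np

def tikhonov_svd_solve(A, b, lam):
    # Solve min ||A x - b||^2 + lam^2 ||x||^2 via the SVD of A.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Filter factors damp the contributions of small singular values.
    f = s**2 / (s**2 + lam**2)
    return Vt.T @ (f * (U.T @ b) / s)

# An ill-conditioned test problem: an 8x8 Hilbert matrix.
n = 8
A = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)
x_true = np.ones(n)
rng = np.random.default_rng(0)
b = A @ x_true + 1e-6 * rng.standard_normal(n)  # small data noise

# Picard-style diagnostic: the ratios |Ut*b| / s blow up for the
# small singular values, signalling that regularization is needed.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.abs(U.T @ b) / s)

x_naive = np.linalg.lstsq(A, b, rcond=None)[0]  # typically ruined by noise
x_reg = tikhonov_svd_solve(A, b, lam=1e-5)      # damped solution
print(np.linalg.norm(x_naive - x_true), np.linalg.norm(x_reg - x_true))
```

The tail of the printed Picard ratios grows by many orders of magnitude here, which is exactly the failure of the condition that the email describes; the regularized solution stays close to x_true while the naive one does not.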
This method is very robust. It can handle multiple difficulties such as linear dependencies between rows, small singular values, zero singular values, high error levels in b, etc., with no danger whatsoever that a numeric overflow or other error will occur. If the SVD does not crash (and such crashes are very rare indeed) then autosol's algorithms will complete normally. There are no error modes to report. And there is no need for any special user input to guide the process or limit iteration counts, etc.

The other solvers -- arlseq(), arlsgt(), and arlsnn() -- build in a somewhat classic fashion on arls(), using traditional methods, though those methods have been generalized or enhanced in places, such as in how problematic Equality Constraints are rejected.

HOW DOES THIS PACKAGE HELP SCIPY?

In contrast to autosol, all current solvers in scipy work "blindly". That is, they work without knowing whether the Ax = b problem they are solving is actually ill-conditioned or not. For example, lstsq() only diagnoses a problem when A is fully singular. But even a moderately ill-conditioned system will cause lstsq() to deliver a bad, high-norm, often oscillatory solution without notifying the user. This happens because the singular values only have to be small compared to the error level in b -- not near zero -- to produce a pathological result.

On the other hand, lsmr(), which is the current primary routine for handling ill-conditioned systems (and often produces beautiful results), is also "blind" in that it never knows for sure whether the problem to which it is applying a regularization algorithm is actually ill-conditioned. The result is that it will often provide an inappropriately smoothed answer to a perfectly easy problem. Also, lsmr() is prone to producing erratic results with minor changes to a problem.
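The "blindness" point about damped solvers can be seen directly: lsmr() applies whatever damping it is given whether or not the problem needs it, so a well-conditioned system solved with a nonzero damp comes back biased, while lstsq() recovers it essentially exactly. A small sketch (the test matrix and the damp value are arbitrary choices, picked only to make the bias visible):

```python
import numpy as np
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(1)
# A well-conditioned 50x5 least-squares problem with a known solution.
A = rng.standard_normal((50, 5)) + 5.0 * np.eye(50, 5)
x_true = np.arange(1.0, 6.0)
b = A @ x_true

x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
# lsmr() happily applies the requested damping even though this
# particular problem needs no regularization at all.
x_damped = lsmr(A, b, damp=1.0)[0]

print(np.linalg.norm(x_lstsq - x_true))   # tiny: lstsq is exact here
print(np.linalg.norm(x_damped - x_true))  # biased toward zero by the damping
```

In fairness, lsmr() with damp=0 would also solve this exactly; the point is only that the damping is applied without any check of whether the problem is ill-conditioned.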
I have prepared a brief illustration of these problems here: http://www.rejones7.net/Autosol/DemoArlsVsLsmr/index.htm

Additionally, the non-negative solver provided in autosol has advantages over the classic nnls() in scipy. The classic nnls() algorithm (from Lawson & Hanson's classic "Solving Least Squares Problems") works very hard at producing the smallest possible residual, and often ends with a failure to converge. It also causes problems by zeroing an excessive number of columns. For example, in a recent consultation regarding a biochemical problem, nnls() on average deleted over 50% of the columns in the problem to produce a non-negative result. But arlsnn() on average deleted only 17% to achieve non-negativity, thus remaining more relevant to the model. And the slightly higher residual obtained by arlsnn() is usually trivial. Also, arlsnn() cannot fail to converge.

WHAT ISSUES ARE THERE?

One question involves computational cost. I have timed a number of problems being solved with lsmr() and autosol's arls(). The timing is almost identical. On the other hand, lstsq() is greatly faster than lsmr() and arls(), so there is a cost for regularization, whether with lsmr() or arls(). The advantage of autosol's algorithm is that it knows what difficulties it is dealing with: it handles well-conditioned problems with full accuracy, like lstsq(), and handles ill-conditioned problems as well as lsmr(), without ever foolishly applying a regularization technique to a well-conditioned problem.

Of course, no algorithm or software package is without flaws. Like lsmr() -- but less often than lsmr() -- autosol's algorithms will fail to see ill-conditioning where some exists, or "see" ill-conditioning that is not really present. We have tuned the detection method in the primary solver, arls(), for a decade, in a C++ environment, and have enhanced its reliability significantly during redevelopment of these routines for Python.
It should be significantly more reliable than lsmr(), according to the behavior I have seen in testing.

Some history: I developed the first version of the key algorithms in this package as part of a Ph.D. dissertation in the 1980s. Due to a career change I did not pursue further development of the methods for a long time. But I eventually improved the algorithms, moved the code from Fortran to C++, and made the resulting codes available on my web site for a decade or so. During this time the web site drew interesting customers from around the world, whose problems proved the usefulness of the heuristics in the arls() algorithm. Recently I decided to migrate the code to Python -- which has been fascinating -- and that resulted in some further refinements. See http://www.rejones7.net/ for both the C++ material and this Python material, plus related information and examples.

I am hoping that the scipy community will accept this package -- with any necessary changes that are required by scipy -- and let users learn about it as an alternative to the more traditional solvers currently in scipy. It is worth noting that the current solvers, except for lsmr(), date from as far back as the 1960s. Maybe it is time to let in some newer technology.

COMMENT

I have thoroughly tested these routines. But I certainly hope this group will try interesting problems with it and push its limits. Please forward me any problems with unexpected results.

I look forward to your feedback.

Ron Jones
rejones7 at msn.com
http://www.rejones7.net/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stefanv at berkeley.edu Mon Oct 12 20:22:06 2020 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Mon, 12 Oct 2020 17:22:06 -0700 Subject: [SciPy-Dev] Resubmission of autosol.py In-Reply-To: References: Message-ID: <536f6cca-82cc-47e3-b431-4851e281880d@www.fastmail.com>

Hi Ron,

Thank you for your email and contribution; I am not very familiar with the algorithms you implemented, but I hope that we will be able to find someone to review who is. Is this the work that is being discussed at https://github.com/scipy/scipy/pull/12755 ?

Before the code can be included in SciPy, it will need a bit of polishing to fit in with the rest of the package (each test goes in its own function, modes are input as legible strings instead of ints, ensure all docstrings render correctly, etc.). I see Ilhan has already provided some feedback on the pull request.

Thank you again for contributing this work; I think having algorithms that are aware of failure cases when solving linear equations would be valuable.

Best regards,
Stéfan

On Mon, Oct 12, 2020, at 14:35, rondall jones wrote:
> [Ron's original message quoted in full; snipped]

_______________________________________________
SciPy-Dev mailing list
SciPy-Dev at python.org
https://mail.python.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rejones7 at msn.com Mon Oct 12 21:08:05 2020 From: rejones7 at msn.com (rondall jones) Date: Tue, 13 Oct 2020 01:08:05 +0000 Subject: [SciPy-Dev] Resubmission of autosol.py In-Reply-To: <536f6cca-82cc-47e3-b431-4851e281880d@www.fastmail.com> References: , <536f6cca-82cc-47e3-b431-4851e281880d@www.fastmail.com> Message-ID:

PR 12755 was an earlier submission. I have expanded the package a lot since then. You can ignore that one and look at the version I uploaded today. Thanks for your interest. I think any knowledgeable reviewers will find it very interesting.

Ron

Sent from my iPhone

On Oct 12, 2020, at 6:23 PM, Stefan van der Walt wrote:

Hi Ron,

Thank you for your email and contribution; I am not very familiar with the algorithms you implemented, but I hope that we will be able to find someone to review who is. Is this the work that is being discussed at https://github.com/scipy/scipy/pull/12755 ?

Before the code can be included in SciPy, it will need a bit of polishing to fit in with the rest of the package (each test goes in its own function, modes are input as legible strings instead of ints, ensure all docstrings render correctly, etc.). I see Ilhan has already provided some feedback on the pull request.

Thank you again for contributing this work; I think having algorithms that are aware of failure cases when solving linear equations would be valuable.
Best regards,
Stéfan

On Mon, Oct 12, 2020, at 14:35, rondall jones wrote:
[Ron's original message quoted in full; snipped]

_______________________________________________
SciPy-Dev mailing list
SciPy-Dev at python.org
https://mail.python.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rejones7 at msn.com Mon Oct 12 21:45:47 2020 From: rejones7 at msn.com (rondall jones) Date: Tue, 13 Oct 2020 01:45:47 +0000 Subject: [SciPy-Dev] Resubmission of autosol.py In-Reply-To: <536f6cca-82cc-47e3-b431-4851e281880d@www.fastmail.com> References: , <536f6cca-82cc-47e3-b431-4851e281880d@www.fastmail.com> Message-ID:

Today's new submittal is in "rejones7/scipy". The test program is silent now, as required. I did not know about the "one test per function" rule.

Ron Jones

________________________________
From: SciPy-Dev on behalf of Stefan van der Walt
Sent: Monday, October 12, 2020 6:22 PM
To: scipy-dev at python.org
Subject: Re: [SciPy-Dev] Resubmission of autosol.py

Hi Ron,

Thank you for your email and contribution; I am not very familiar with the algorithms you implemented, but I hope that we will be able to find someone to review who is. Is this the work that is being discussed at https://github.com/scipy/scipy/pull/12755 ?
Before the code can be included in SciPy, it will need a bit of polishing to fit in with the rest of the package (each test goes in its own function, modes are input as legible strings instead of ints, ensure all docstrings render correctly, etc.). I see Ilhan has already provided some feedback on the pull request. Thank you again for contributing this work; I think having algorithms that are aware of failure cases when solving linear equations would be valuable. Best regards, St?fan On Mon, Oct 12, 2020, at 14:35, rondall jones wrote: I am resubmitting an enhanced version of my autosol.py module. I have uploaded the code to scipy\linalg, and have uploaded a silent full-coverage test to scipy\linalg\tests. WHAT IS IN THIS MODULE Autosol.py is a package of automatically regularized solvers for dense linear algebraic systems. The term ?regularize? (see Wikipedia) means to add extra information to a set of equations (or other application) in order to move a poor result more toward what is needed. In our case this extra information is mainly three things: 1) requesting the solution to exclude abnormal behavior caused by unavoidable noise in the data; 2) adding to the system of regular least-squares equations some specific equations that are known to be exactly true, such as ?the solution sums to 1.0?; 3) giving the solver specific bounds on the answer by way of inequality constraints, such as requesting a non-negative solution. arls() is the main automatically regularized least squares solver for Ax =b. The equations can be easy to solve, mildly or strongly ill-conditioned, or singular. ?arls? stands for ?Automatically Regularized Least Squares.? I have changed the names of the routines in autosol from the original long names to more scipy-like short names. I have left the module name as it was. arlseq() builds on arls() to add Equality Constraints. There can be any number of equality constraints. Conflicts or other issues between these equations are resolved. 
Each equation is either solved exactly or rejected as incompatible with the other equations. This is normal behavior for such solvers. arlsgt() builds on arlseq() to add Inequality Constraints. These often have the form of ?the sum of the solution elements must be 100?, or ?the solution must start at 1.0?, etc. There can be any number of inequality constraints. Inequality constraints are promoted to equality constraints when they are violated. This is, again, normal behavior for such solvers. See Lawson & Hansons? ?Solving Least Squares Problems? book. arlsnn() builds on arlsgt() to provide a simple non-negativity constraint, like scipy?s nnls(). WHAT IS SIGNIFICANT ABOUT THESE ROUTINES? These solvers are different than anything currently in scipy. (Skip this paragraph if you are not interested in the mathematical detais!) Briefly, if Ax = b, and A = USVt (an SVD) then USVt * x = b can be written S(Vt*x)= Ut*b. The ?Discrete Picard Condition? that is the basic insight that autosol uses says that the right-hand side, Ut*b, must decrease faster than the singular values in S. In ill-conditioned problems that condition fails, often dramatically, and of course it fails by definition if A is singular. In these cases a careful analysis of the vector g = (S+ * Ut*b) can produce what might be call a ?usable rank? (which is smaller than the numerical rank) which then allows us to produce an estimate of the error in b, which then directly leads to an appropriate value for the lambda regularization parameter used in Tikhonov regularization. Thus, we can regularize Ax=b with no user input hints, and with no failure-prone iterative calculations. This method is very robust. It can handle multiple difficulties such as linear dependencies between rows, small singular values, zero singular values, high error levels in b, etc., etc., with no danger whatsoever that a numeric overflow or other error will occur. 
If the SVD does not crash (and such crashes are very rare indeed), then autosol's algorithms will complete normally. There are no error modes to report. And there is no need for any special user input to guide the process or limit iteration counts, etc.

The other solvers, arlseq(), arlsgt(), and arlsnn(), build in a somewhat classic fashion on arls(), using traditional methods, though those methods have been generalized or enhanced in places, such as how problematic Equality Constraints are rejected.

HOW DOES THIS PACKAGE HELP SCIPY?

In contrast to autosol, all current solvers in scipy work "blindly". That is, they work without knowing whether the Ax = b problem they are solving is actually ill-conditioned or not. For example, lstsq() only diagnoses a problem when A is fully singular. But even a moderately ill-conditioned system will cause lstsq() to deliver a bad, high-norm, often oscillatory solution without notifying the user. This happens because the singular values only have to be small compared to the error level in b -- not near zero -- to produce a pathological result.

On the other hand, lsmr(), which is the current primary routine for handling ill-conditioned systems (and often produces beautiful results), is also "blind" in that it never knows for sure whether the problem to which it is applying a regularization algorithm is actually ill-conditioned. The result is that it will often provide an inappropriately smoothed answer to a perfectly easy problem. Also, lsmr() is prone to producing erratic results with minor changes to a problem. I have prepared a brief illustration of these problems here: http://www.rejones7.net/Autosol/DemoArlsVsLsmr/index.htm

Additionally, the non-negative solver provided in autosol has advantages over the classic nnls() in scipy. The classic nnls() algorithm (from Lawson & Hanson's classic Solving Least Squares Problems) works very hard at producing the smallest possible residual, and often ends with a failure to converge.
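As a quick aside, the "blind" high-norm lstsq() behavior described above is easy to reproduce on a small ill-conditioned system (an illustrative example, not taken from Ron's demo page):

```python
import numpy as np

# Hilbert matrices are a standard example of severe ill-conditioning.
n = 10
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

x_true = np.ones(n)
rng = np.random.default_rng(0)
b = H @ x_true + 1e-6 * rng.standard_normal(n)  # tiny noise in b

# lstsq completes without complaint, but the noise is amplified by the
# tiny singular values and the solution norm explodes.
x, *_ = np.linalg.lstsq(H, b, rcond=None)
print(np.linalg.norm(x))  # far larger than norm(x_true), which is sqrt(10)
```

The singular values here are small relative to the 1e-6 noise level but not near zero, so lstsq()'s rank test sees nothing wrong; this is exactly the regime the announcement argues a Picard-condition diagnostic would flag.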
It also causes problems by zeroing an excessive number of columns. For example, in a recent consultation regarding a biochemical problem, nnls() on average deleted over 50% of the columns in the problem to produce a non-negative result, while arlsnn() on average deleted only 17% to achieve non-negativity, thus remaining more relevant to the model. And the slightly higher residual obtained by arlsnn() is usually trivial. Also, arlsnn() cannot fail to converge.

WHAT ISSUES ARE THERE?

One question involves computational cost. I have timed a number of problems being solved with lsmr() and autosol's arls(). The timing is almost identical. On the other hand, lstsq() is greatly faster than lsmr() and arls(), so there is a cost for regularization, whether with lsmr() or arls(). The advantage of autosol's algorithm is that it knows what difficulties it is dealing with: it handles well-conditioned problems with full accuracy, like lstsq(), and handles ill-conditioned problems as well as lsmr(), without ever foolishly applying a regularization technique to a well-conditioned problem.

Of course, no algorithm or software package is without flaws. Like lsmr() -- but less often than lsmr() -- autosol's algorithms will fail to see ill-conditioning where some exists, or "see" ill-conditioning that is not really present. We have tuned the detection method in the primary solver, arls(), for a decade, in a C++ environment, and have enhanced its reliability significantly during redevelopment of these routines for Python. It should be significantly more reliable than lsmr(), according to the behavior I have seen in testing.

Some history: I developed the first version of the key algorithms in this package as part of a Ph.D. dissertation in the 1980s. Due to a career change I did not pursue further development of the methods for a long time.
But I eventually improved the algorithms, moved the code from Fortran to C++, and made the resulting codes available on my web site for a decade or so. During this time the web site drew interesting customers from around the world, whose problems proved the usefulness of the heuristics in the arls() algorithm. Recently I decided to migrate the code to Python -- which has been fascinating -- and that resulted in some further refinements. See http://www.rejones7.net/ for both the C++ material and this Python material, plus related information and examples.

I am hoping that the scipy community will accept this package -- with any necessary changes that are required by scipy -- and let users learn about it as an alternative to the more traditional solvers currently in scipy. It is worth noting that the current solvers, except for lsmr(), date from as far back as the 1960s. Maybe it is time to let in some newer technology.

COMMENT

I have thoroughly tested these routines. But I certainly hope this group will try interesting problems with it and push its limits. Please forward me any problems with unexpected results.

I look forward to your feedback.

Ron Jones
rejones7 at msn.com
http://www.rejones7.net/

From ralf.gommers at gmail.com Tue Oct 13 06:24:04 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 13 Oct 2020 11:24:04 +0100
Subject: [SciPy-Dev] Resubmission of autosol.py
In-Reply-To:
References: <536f6cca-82cc-47e3-b431-4851e281880d@www.fastmail.com>
Message-ID:

On Tue, Oct 13, 2020 at 2:46 AM rondall jones wrote:

> Today's new submittal is in "rejones7/scipy".
> The test program is silent now, as required. I did not know about "one
> test per function rule".

Can you just update the pull request you have open with the new code?
Reviewing code that's not in a PR is way too cumbersome.

Best,
Ralf

> [quoted messages from Ron Jones and Stefan van der Walt, including the full autosol.py announcement above, trimmed]

From ralf.gommers at gmail.com Tue Oct 13 06:27:46 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 13 Oct 2020 11:27:46 +0100
Subject: [SciPy-Dev] Resubmission of autosol.py
In-Reply-To:
References: <536f6cca-82cc-47e3-b431-4851e281880d@www.fastmail.com>
Message-ID:

On Tue, Oct 13, 2020 at 11:24 AM Ralf Gommers wrote:
> On Tue, Oct 13, 2020 at 2:46 AM rondall jones wrote:
>> Today's new submittal is in "rejones7/scipy".
>> The test program is silent now, as required. I did not know about "one
>> test per function rule".
>
> Can you just update the pull request you have open with the new code?
> Reviewing code that's not in a PR is way too cumbersome.

Ah, I see you did update https://github.com/scipy/scipy/pull/12755; looks like we're all good here. The above sentence sounded like you wanted us to look at code in your own fork.

Ralf
From rejones7 at msn.com Tue Oct 13 14:04:50 2020
From: rejones7 at msn.com (rondall jones)
Date: Tue, 13 Oct 2020 18:04:50 +0000
Subject: [SciPy-Dev] Resubmission of autosol.py
In-Reply-To:
References: <536f6cca-82cc-47e3-b431-4851e281880d@www.fastmail.com>
Message-ID:

Thanks, Ralf.
But I have never used GitHub before trying to submit the package a couple of months ago, and I find it terribly confusing. So please let me know if I am not doing something right. If there is anything else I need to do to move the process along, please let me know. You can also call me: 505-934-8799.

Ron

> [quoted messages from Ralf Gommers and Stefan van der Walt, including the full autosol.py announcement above, trimmed]

From ralf.gommers at gmail.com Tue Oct 13 18:27:30 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 13 Oct 2020 23:27:30 +0100
Subject: [SciPy-Dev] Resubmission of autosol.py
In-Reply-To:
References: <536f6cca-82cc-47e3-b431-4851e281880d@www.fastmail.com>
Message-ID:

On Tue, Oct 13, 2020 at 7:05 PM rondall jones wrote:
> Thanks, Ralf.
> But I have never used Github before trying to submit the package a couple
> of months ago and I find it terribly confusing.
> So please let me know if I am not doing something right.
No worries, things seem to be going all right - you'll get used to it :)

Also a small comment on mailing lists for future posts: the convention here is to bottom-post (so replies go right below the content you are replying to); that makes email threads easier to follow.

Cheers,
Ralf

From rejones7 at msn.com Thu Oct 15 15:38:49 2020
From: rejones7 at msn.com (rondall jones)
Date: Thu, 15 Oct 2020 19:38:49 +0000
Subject: [SciPy-Dev] changes to autosol.py / arls.py
Message-ID:

To Ilhan, Ralf, Stefan, and others:

Ilhan really wants arls() to give the user some feedback about the solution. I didn't understand his point at first, but I eventually realized that I can and should return the artifacts I compute about the problem. And I really like the result; Ilhan was right. (There are no changes whatsoever to the algorithms.)

Also, Ilhan objected to the name autosol, feeling that it implies too much. I had already changed all the routine names to arlsxxx, for "automatically regularized least squares". I guess I should have changed the file name too, and am doing that as well. I hope that doesn't foul up your review process.

So I just uploaded arls() and its test_arls() with these changes:

* each solver now returns:
  * the traditional numerical rank of A (I checked other current solvers in scipy to be sure I was doing this in the same exact way)
  * the "usable rank", which involves both A and b and is the crucial artifact needed to do autoregularization
  * the estimate of the right-hand-side RMS error, sigma. Some people will be surprised to see that this is possible.
  * the resulting Tikhonov regularization parameter, lambda
* the in-code comments have been updated
* the test_arls() code has been updated to use one function per test, as best as I understand how to do that at this point

Unfortunately I don't know how to delete the old autosol.py and test_autosol.py.
Maybe one of you code managers will have to do that?

Thanks for your interest and tolerance of my areas of ignorance. By the way, I am not seeking to REPLACE any current scipy solvers; arls should only be an additional feature for a long time. But I think it will be a significant asset for scipy. I have mostly publicized these codes via my web site in the past, as C++ has no central code repository like Python's.

Rondall Jones

From ralf.gommers at gmail.com Fri Oct 16 11:08:18 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 16 Oct 2020 16:08:18 +0100
Subject: [SciPy-Dev] changes to autosol.py / arls.py
In-Reply-To:
References:
Message-ID:

On Thu, Oct 15, 2020 at 8:39 PM rondall jones wrote:

> To Ilhan, Ralf, Stefan, others...
>
> Ilhan really wants arls() to give the user some feedback about the solution.
> I didn't understand his/your point at first but I eventually realized that
> I can and should return the artifacts I compute about the problem.
> And I really like the result. Ilhan was right.
> (There are no changes whatsoever to algorithms.)
>
> Also, Ilhan objected to the name, autosol, feeling that it implies too much.
> I had already changed all the routine names to arlsxxx, for
> "automatically regularized least squares".
> I guess I should have changed the file name, too, and am doing that too.
> I hope that doesn't foul up your review process.

That does seem like a better name. And no worries, renames are perfectly fine and won't interfere with PR review.

> So I just uploaded arls() and its test_arls() with these changes:
>
> - each solver now returns:
>   - the traditional numerical rank of A (I checked other current solvers
>     in scipy to be sure I was doing this in the same exact way)
>   - the "usable rank" which involves both A and b, and is the crucial
>     artifact needed to do autoregularization
>   - the estimate of the right-hand-side RMS error, sigma.
Some people > will be surprised to see that this is possible. > - the resulting Tikhonov regularization parameter, lambda > - The in-code comments have been updated > - The test_arls() code has been updated to use one function per test, > as best as I understand how to do it at this point > > Unfortunately I don't know how to delete the old autosol.py and > test_autosol.py. > Maybe one of you code managers have to do that? > On your local branch you do `git rm autosol.py` and then commit the result and push it to your fork. Unless you mean you are only using the GitHub UI for adding code. That won't work for reasonably complex code changes, it really is only meant for tiny fixes like doc updates. It's well worth taking the time to learn Git (I know it's a pain to start with). > Thanks for your interest and tolerance of my areas of ignorance. > By the way, I am not seeking to REPLACE any current scipy solvers. Arls > should only be an additional feature for a long time. > Yes, understood. If you had removed existing functionality, you'd have heard howls of protest quickly:) But I think it will be a significant asset for scipy. > I have mostly publicized these codes strictly via my web site in the past, > as C++ has no central code repository like python. > I suggest to keep the rest of this conversation on GitHub, no need to keep the mailing list subscribers up to date. The habit is simply to make them aware once and take a decision on adding functionality yes/no on the list; anything else is easier on GitHub. Cheers. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlucas7 at vt.edu Fri Oct 16 11:48:00 2020 From: rlucas7 at vt.edu (rlucas7 at vt.edu) Date: Fri, 16 Oct 2020 11:48:00 -0400 Subject: [SciPy-Dev] changes to autosol.py / arls.py In-Reply-To: References: Message-ID: There is another existing procedure not in scipy AFAIK but in literature that is referred to as ARLS. 
A Google search for "Alternating recursive least squares" will show the initialism used in various places. Perhaps a different name to avoid potential user confusion? One proposal is "auto_reg_ls", but there might be objections to that too. Naming is difficult.

Sincerely,
-Lucas Roberts

> On Oct 16, 2020, at 11:08 AM, Ralf Gommers wrote:
> [...]

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From tyler.je.reddy at gmail.com Sat Oct 17 18:22:36 2020 From: tyler.je.reddy at gmail.com (Tyler Reddy) Date: Sat, 17 Oct 2020 16:22:36 -0600 Subject: [SciPy-Dev] ANN: SciPy 1.5.3 Message-ID:

Hi all,

On behalf of the SciPy development team I'm pleased to announce the release of SciPy 1.5.3, which is a bug fix release that includes Linux ARM64 wheels for the first time.
Sources and binary wheels can be found at: https://pypi.org/project/scipy/ and at: https://github.com/scipy/scipy/releases/tag/v1.5.3 One of a few ways to install this release with pip: pip install scipy==1.5.3 ========================== SciPy 1.5.3 Release Notes ========================== SciPy 1.5.3 is a bug-fix release with no new features compared to 1.5.2. In particular, Linux ARM64 wheels are now available and a compatibility issue with XCode 12 has been fixed. Authors ====== * Peter Bell * CJ Carey * Thomas Duvernay + * Gregory Lee * Eric Moore * odidev * Dima Pasechnik * Tyler Reddy * Simon Segerblom Rex + * Daniel B. Smith * Will Tirone + * Warren Weckesser A total of 12 people contributed to this release. People with a "+" by their names contributed a patch for the first time. This list of names is automatically generated, and may not be fully complete. Issues closed for 1.5.3 ------------------------------ * `#9611 `__: Overflow error with new way of p-value calculation in kendall... * `#10069 `__: scipy.ndimage.watershed_ift regression in 1.0.0 * `#11260 `__: BUG: DOP853 with complex data computes complex error norm, causing... * `#11479 `__: RuntimeError: dictionary changed size during iteration on loading... * `#11972 `__: BUG (solved): Error estimation in DOP853 ODE solver fails for... * `#12543 `__: BUG: Picture rotated 180 degrees and rotated -180 degrees should... 
* `#12613 `__: Travis X.4 and X.7 failures in master * `#12654 `__: scipy.stats.combine_pvalues produces wrong results with method='mudholkar_george' * `#12819 `__: BUG: Scipy Sparse slice indexing assignment Bug with zeros * `#12834 `__: BUG: ValueError upon calling Scipy Interpolator objects * `#12836 `__: ndimage.median can return incorrect values for integer inputs * `#12860 `__: Build failure with Xcode 12 Pull requests for 1.5.3 ----------------------------- * `#12611 `__: MAINT: prepare for SciPy 1.5.3 * `#12614 `__: MAINT: prevent reverse broadcasting * `#12617 `__: MAINT: optimize: Handle nonscalar size 1 arrays in fmin_slsqp... * `#12623 `__: MAINT: stats: Loosen some test tolerances. * `#12638 `__: CI, MAINT: pin pytest for Azure win * `#12668 `__: BUG: Ensure factorial is not too large in mstats.kendalltau * `#12705 `__: MAINT: \`openblas_support\` added sha256 hash * `#12706 `__: BUG: fix incorrect 1d case of the fourier_ellipsoid filter * `#12721 `__: BUG: use special.sindg in ndimage.rotate * `#12724 `__: BUG: per #12654 adjusted mudholkar_george method to combine p... * `#12726 `__: BUG: Fix DOP853 error norm for complex problems * `#12730 `__: CI: pin xdist for Azure windows * `#12786 `__: BUG: stats: Fix formula in the \`stats\` method of the ARGUS... * `#12795 `__: CI: Pin setuptools on windows CI * `#12830 `__: [BUG] sparse: Avoid using size attribute in LIL __setitem__ * `#12833 `__: BUG: change list of globals items to list of a copy * `#12842 `__: BUG: Use uint16 for cost in NI_WatershedElement * `#12845 `__: BUG: avoid boolean or integer addition error in ndimage.measurements.median * `#12864 `__: BLD: replace the #include of libqull_r.h with with this of qhull_ra.h... * `#12867 `__: BUG: Fixes a ValueError yielded upon calling Scipy Interpolator... 
* `#12902 `__: CI: Remove 'env' from pytest.ini * `#12913 `__: MAINT: Ignore pytest's PytestConfigWarning Checksums ========= MD5 ~~~ fd75941594b22a322f63a27d53c1bdda scipy-1.5.3-cp36-cp36m-macosx_10_9_x86_64.whl 85d547117ae6d9fd447120c1768ff2d0 scipy-1.5.3-cp36-cp36m-manylinux1_i686.whl 8e8ca0d9123c4f4bf59f74ec474fc936 scipy-1.5.3-cp36-cp36m-manylinux1_x86_64.whl 855db93424cf97e5b4685c4cf74be346 scipy-1.5.3-cp36-cp36m-manylinux2014_aarch64.whl 20405ee4157858d33e38c62b873b6420 scipy-1.5.3-cp36-cp36m-win32.whl be3f3ba1018e7b1f3a30738aa502a4b5 scipy-1.5.3-cp36-cp36m-win_amd64.whl 12be163517b748a025cd399e6e9ce7df scipy-1.5.3-cp37-cp37m-macosx_10_9_x86_64.whl b2d761876d1f1d27e289b08f44034ede scipy-1.5.3-cp37-cp37m-manylinux1_i686.whl 1cd15a513ad76c64a4353a63eee6b3a8 scipy-1.5.3-cp37-cp37m-manylinux1_x86_64.whl 00bb4f63f4ee40869193c8e639da3274 scipy-1.5.3-cp37-cp37m-manylinux2014_aarch64.whl f783a76397dfe2a73752ac86f32fa474 scipy-1.5.3-cp37-cp37m-win32.whl 3c2927e6adf9b522f7f1745308b4d1f2 scipy-1.5.3-cp37-cp37m-win_amd64.whl faaa0cab95b6f352508120b8f628aaae scipy-1.5.3-cp38-cp38-macosx_10_9_x86_64.whl 77e8cb222c1868f01f048a444ef49260 scipy-1.5.3-cp38-cp38-manylinux1_i686.whl a2a035b15f78106090b9550f1383b40f scipy-1.5.3-cp38-cp38-manylinux1_x86_64.whl 86d52a963596a4cf6e1930ac5cf79a03 scipy-1.5.3-cp38-cp38-manylinux2014_aarch64.whl 8dd29aceb8dae5b5cc4f8100d8b35423 scipy-1.5.3-cp38-cp38-win32.whl 14f617e43a37827a29e8ee9ad97eda4b scipy-1.5.3-cp38-cp38-win_amd64.whl ecf5c58e4df1d257abf1634d51cb9205 scipy-1.5.3.tar.gz 61bcc473115587a66acfb42d3021479b scipy-1.5.3.tar.xz a9c549f19c661ab1b44e5b2cc707d4c1 scipy-1.5.3.zip SHA256 ~~~~~~ f574558f1b774864516f3c3fe072ebc90a29186f49b720f60ed339294b7f32ac scipy-1.5.3-cp36-cp36m-macosx_10_9_x86_64.whl e527c9221b6494bcd06a17f9f16874406b32121385f9ab353b8a9545be458f0b scipy-1.5.3-cp36-cp36m-manylinux1_i686.whl b9751b39c52a3fa59312bd2e1f40144ee26b51404db5d2f0d5259c511ff6f614 scipy-1.5.3-cp36-cp36m-manylinux1_x86_64.whl 
d5e3cc60868f396b78fc881d2c76460febccfe90f6d2f082b9952265c79a8788 scipy-1.5.3-cp36-cp36m-manylinux2014_aarch64.whl 1fee28b6641ecbff6e80fe7788e50f50c5576157d278fa40f36c851940eb0aff scipy-1.5.3-cp36-cp36m-win32.whl ffcbd331f1ffa82e22f1d408e93c37463c9a83088243158635baec61983aaacf scipy-1.5.3-cp36-cp36m-win_amd64.whl 07b083128beae040f1129bd8a82b01804f5e716a7fd2962c1053fa683433e4ab scipy-1.5.3-cp37-cp37m-macosx_10_9_x86_64.whl e2602f79c85924e4486f684aa9bbab74afff90606100db88d0785a0088be7edb scipy-1.5.3-cp37-cp37m-manylinux1_i686.whl aebb69bcdec209d874fc4b0c7ac36f509d50418a431c1422465fa34c2c0143ea scipy-1.5.3-cp37-cp37m-manylinux1_x86_64.whl bc0e63daf43bf052aefbbd6c5424bc03f629d115ece828e87303a0bcc04a37e4 scipy-1.5.3-cp37-cp37m-manylinux2014_aarch64.whl 8cc5c39ed287a8b52a5509cd6680af078a40b0e010e2657eca01ffbfec929468 scipy-1.5.3-cp37-cp37m-win32.whl 0edd67e8a00903aaf7a29c968555a2e27c5a69fea9d1dcfffda80614281a884f scipy-1.5.3-cp37-cp37m-win_amd64.whl 66ec29348444ed6e8a14c9adc2de65e74a8fc526dc2c770741725464488ede1f scipy-1.5.3-cp38-cp38-macosx_10_9_x86_64.whl 12fdcbfa56cac926a0a9364a30cbf4ad03c2c7b59f75b14234656a5e4fd52bf3 scipy-1.5.3-cp38-cp38-manylinux1_i686.whl a1a13858b10d41beb0413c4378462b43eafef88a1948d286cb357eadc0aec024 scipy-1.5.3-cp38-cp38-manylinux1_x86_64.whl 5163200ab14fd2b83aba8f0c4ddcc1fa982a43192867264ab0f4c8065fd10d17 scipy-1.5.3-cp38-cp38-manylinux2014_aarch64.whl 33e6a7439f43f37d4c1135bc95bcd490ffeac6ef4b374892c7005ce2c729cf4a scipy-1.5.3-cp38-cp38-win32.whl a3db1fe7c6cb29ca02b14c9141151ebafd11e06ffb6da8ecd330eee5c8283a8a scipy-1.5.3-cp38-cp38-win_amd64.whl ddae76784574cc4c172f3d5edd7308be16078dd3b977e8746860c76c195fa707 scipy-1.5.3.tar.gz cab92f8dab54a3be66525ea23e4f6568145abd1e94681cce19258a140f4de416 scipy-1.5.3.tar.xz 32dd070e203363e3a63bc184afb1d8e165c0a36a49b83b097642f78ef4da6077 scipy-1.5.3.zip -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From warren.weckesser at gmail.com Mon Oct 19 19:50:38 2020 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Mon, 19 Oct 2020 19:50:38 -0400 Subject: [SciPy-Dev] ANN: SciPy 1.5.3 In-Reply-To: References: Message-ID:

On 10/17/20, Tyler Reddy wrote:
> Hi all,
>
> On behalf of the SciPy development team I'm pleased to announce
> the release of SciPy 1.5.3, which is a bug fix release that includes
> Linux ARM64 wheels for the first time.

Thanks Tyler! A lot of work goes into a SciPy release, so I'm grateful you continue to manage the releases so well.

Warren

> [...]

From rejones7 at msn.com Tue Oct 20 20:35:59 2020 From: rejones7 at msn.com (rondall jones) Date: Wed, 21 Oct 2020 00:35:59 +0000 Subject: [SciPy-Dev] Re ENH 12755: adding a fast ill-condition sensing capability to arls() Message-ID:

Early comments regarding this package mentioned the preference for integration in some manner with existing scipy solvers. It took me a while to see a way to do that. The problem was the lack of a quick method for detecting the likelihood of ill-conditioning (including singularity) which would make a regularized solver preferable to a standard solver like lstsq(). But I was able to develop such a quick method, called "strange()" -- because it detects strange behavior that is usually indicative of an ill-conditioned system. This method takes only about 4n^2 flops, rather than O(n^3) which the regularized solvers seem to take. So: lsmr and arls take a few times as long as lstsq. And lstsq takes a few times as long as "strange()". So inside arls() it is low-cost to call strange() first to see if it is safe to call the faster lstsq(). The special solvers -- arlseq, arlsgt, and arlsnn -- all get the benefit of this change as they depend on arls().
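The gating idea described here -- run a cheap ill-conditioning probe before committing to an expensive regularized solve -- can be sketched generically. The strange() heuristic itself is not shown in this thread, so the snippet below substitutes a simple column-norm ratio, a well-known cheap lower bound on the 2-norm condition number; `cond_lower_bound` and `gated_solve` are hypothetical names for illustration, not part of arls:

```python
import numpy as np

def cond_lower_bound(A):
    # Ratio of largest to smallest column norm: a cheap O(m*n) lower
    # bound on cond_2(A). This is a generic stand-in, NOT the strange()
    # heuristic from the proposed PR.
    norms = np.linalg.norm(A, axis=0)
    if norms.min() == 0.0:
        return np.inf
    return norms.max() / norms.min()

def gated_solve(A, b, cond_tol=1e8, damp=1e-6):
    # If the cheap probe sees no sign of trouble, use plain least squares;
    # otherwise fall back to a Tikhonov-damped solve (toy regularization).
    if cond_lower_bound(A) < cond_tol:
        x, *_ = np.linalg.lstsq(A, b, rcond=None)
        return x
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + damp * np.eye(n), A.T @ b)
```

Because the probe only gives a lower bound, it can miss ill-conditioning that the column norms do not reveal; that is the trade-off any cheap pre-check has to make.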
But note that this capability applies only to the case of single right-hand-sides, as the benefit of this approach dissolves rapidly when multiple right-hand-sides are used: all their solutions in arls() share a single call for the SVD. Also updating the in-code user documentation.

Ron Jones
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From amardeepjk at gmail.com Fri Oct 23 14:01:28 2020 From: amardeepjk at gmail.com (Amardeep Singh) Date: Sat, 24 Oct 2020 02:01:28 +0800 Subject: [SciPy-Dev] help needed In-Reply-To: References: Message-ID:

Hi All,

I am a new joiner, using a MacBook. Can someone please guide me on how to debug SciPy using CLion? I am able to build it, but I'm not sure how to debug the C code.

ProductName: Mac OS X
ProductVersion: 10.15.7
BuildVersion: 19H2

thx
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From samwallan at icloud.com Tue Oct 27 02:11:59 2020 From: samwallan at icloud.com (Sam Wallan) Date: Mon, 26 Oct 2020 23:11:59 -0700 Subject: [SciPy-Dev] Determining License for fortran inclusion in SciPy Message-ID: <1C4A1FCD-2073-4F21-9A07-CBA0F46BF3BF@icloud.com>

Hello All,

We are working to add support for the Tukey-Kramer statistical test, and it relies upon an involved computation of the critical values of the Studentized Range, for which a Fortran routine has been published. Using this Fortran would be much quicker in terms of development and likely quicker in computation than our best Python implementation.

The code is published in a journal article, but I'm not sure if that means that the code is publicly available, or what license or copyright might be retained. I have contacted one of the original authors to ask him about it and am awaiting a response. The paper the code is in is available from JSTOR, and the raw code is available in various places on the internet, both in original form and sometimes with improvements.
Algorithm AS 190: Probabilities and Upper Quantiles for the Studentized Range
R. E. Lund and J. R. Lund
Journal of the Royal Statistical Society. Series C (Applied Statistics)
Vol. 32, No. 2 (1983), pp. 204-210 (7 pages)
Published By: Wiley
DOI: 10.2307/2347300

In SciPy there is already use of several algorithms also published by the Journal of the Royal Statistical Society (these can also be found on JSTOR). In git history I see they were added some 18 years ago:

- PRHO: Algorithm AS 89, Appl. Statist. (1975) Vol. 24, No. 3, p. 377, in scipy/stats/statlib/spearman.f
- POLY: Algorithm AS 181.2, Appl. Statist. (1982) Vol. 31, No. 2, in scipy/stats/statlib/swilk.f

I've investigated online for other licensed use of these algorithms, but I have not seen anything concrete. The closest use I was able to find is a direct translation of the algorithm we are interested in, AS 190, to Java by Sandia National Laboratories, which "is released under the open source BSD License."

Does anyone have experience with this source, or insight into whether the inclusion of this Fortran in SciPy is fair game?

Cheers,
Sam Wallan
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From ralf.gommers at gmail.com Tue Oct 27 06:22:10 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 27 Oct 2020 10:22:10 +0000 Subject: [SciPy-Dev] Determining License for fortran inclusion in SciPy In-Reply-To: <1C4A1FCD-2073-4F21-9A07-CBA0F46BF3BF@icloud.com> References: <1C4A1FCD-2073-4F21-9A07-CBA0F46BF3BF@icloud.com> Message-ID:

On Tue, Oct 27, 2020 at 6:12 AM Sam Wallan wrote:
> Hello All,
>
> We are working to add support for the Tukey-Kramer statistical test, and it relies upon an involved computation of the critical values of the Studentized Range, for which a Fortran routine has been published. Using this Fortran would be much quicker in terms of development and likely quicker in computation than our best Python implementation.
> The code is published in a journal article, but I'm not sure if that means that the code is publicly available, or what license or copyright might be retained. I have contacted one of the original authors to ask him about it and am awaiting a response. The paper the code is in is available from JSTOR, and the raw code is available in various places on the internet, both in original form and sometimes with improvements.

What you need here is permission to distribute this code under the BSD license SciPy uses (or another compatible license) from the copyright holder. I didn't check, but most likely JSTOR holds the copyright and not the original authors. If that's the case, you'll need permission from the journal. We've had multiple cases of this in the past where we did receive such permission, IIRC from ACM journals.

Cheers,
Ralf

> [...]

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From hans.dembinski at gmail.com Tue Oct 27 10:00:16 2020 From: hans.dembinski at gmail.com (Hans Dembinski) Date: Tue, 27 Oct 2020 15:00:16 +0100 Subject: [SciPy-Dev] Adding non-parametric methods to scipy.stats In-Reply-To: <8765F3BD-7FEF-4457-9F30-010AA69AA493@gmail.com> References: <6a5dee0d-3d55-409d-6bf8-b86e3f465275@tik.ee.ethz.ch> <9ed88efe-5504-3d60-ab73-3ffc7cce1edf@tik.ee.ethz.ch> <91A9C8F6-63B5-408C-9A1E-29EF4997FC37@gmail.com> <8765F3BD-7FEF-4457-9F30-010AA69AA493@gmail.com> Message-ID: <96A59547-952B-46F7-B360-42DCDD50A655@gmail.com>

Dear Warren, all,

I am following up on my message from June about integrating a general bootstrap library into scipy. Daniel and I have been busy with finishing our rewrite of the resample library and we released version 1.0.1 for general use on August 24. I have been busy with other stuff, that's why I didn't come back sooner, sorry.

Docs: https://resample.readthedocs.io/en/master
Source: https://github.com/resample-project/resample
PyPI: https://pypi.org/project/resample

resample is a pure Python implementation written from scratch using only scipy and numpy as dependencies and a BSD 3-clause license. It should be suitable for inclusion in scipy. I believe we have converged on a high-quality Pythonic interface that offers both a powerful low-level API for experts and a convenient high-level API for practitioners. Our implementations were optimised to make efficient use of numpy to offload the hot loops into C and to avoid creation of unnecessary copies and temporary arrays.
What resample offers:

- Ordinary, balanced, and parametric bootstrap resampling with stratification of N-dimensional data
- Jackknife resampling of N-dimensional data
- For both bootstrap and jackknife resampling: computation of bias and/or variance of an estimator (that would be a generic Python function which maps data samples to N-dimensional output)
- Bootstrap confidence intervals (BCa and percentile)
- A battery of non-parametric permutation-based tests like the Wilcoxon signed-rank test, https://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test
- Accessible docs in numpydoc format

The bootstrap and jackknife functionalities are completely generic. One can compute confidence intervals (the BCa method is state-of-the-art) for any statistical estimator, including arbitrarily complicated ones obtained from machine learning, and also for the quantile, which was the original topic of this thread.

So far we only have 34 stars on Github, but that is mostly because we did not advertise. I believe our library has the potential to be very popular if we actually start advertising, but neither Daniel nor I am very interested in doing public relations. We both have full-time jobs and developing resample is a hobby to us. We would be happy to have resample in SciPy so that our work can benefit from the visibility that SciPy enjoys, while SciPy can benefit from the functionality that resample offers.

Best regards,
Hans

PS: My credentials in case you need them: I have been programming in Python for 15 years as a scientist working on big data. I have expertise in both user-friendly interface design and hardware-near numerical programming. I am the author of the Boost.Histogram library in C++14 on www.boost.org and co-author of the corresponding Python module boost-histogram. I contributed to matplotlib and maintain the iminuit Python module, a numerical minimiser and error computation tool that is popular in high energy physics.
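For readers who have not seen it, the percentile bootstrap interval mentioned above can be sketched in a few lines of plain numpy. This is an illustrative toy, not resample's API; `percentile_ci` is a hypothetical name:

```python
import numpy as np

def percentile_ci(data, statistic, n_boot=2000, alpha=0.05, rng=None):
    # Textbook percentile bootstrap: resample with replacement, evaluate
    # the statistic on each resample, and take empirical quantiles.
    rng = np.random.default_rng(rng)
    data = np.asarray(data)
    n = len(data)
    idx = rng.integers(0, n, size=(n_boot, n))  # bootstrap index sets
    stats = np.array([statistic(data[i]) for i in idx])
    lo, hi = np.quantile(stats, [alpha / 2.0, 1.0 - alpha / 2.0])
    return lo, hi
```

The BCa interval that resample also provides corrects this for bias and skewness (acceleration); the percentile method above is the simplest member of that family.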
PPS: Last week, I had the opportunity to listen live to a talk from Brad Efron himself, the inventor of the bootstrap. Fantastic guy.

> On 18. Jun 2020, at 17:53, Hans Dembinski wrote:
>
> Dear Warren, (Daniel in CC)
>
>> On 18. Jun 2020, at 16:15, Warren Weckesser wrote:
>>
>> On 6/18/20, Hans Dembinski wrote:
>>> Dear all,
>>>
>>> since there was no reply to my first attempt, I am repeating my message. Daniel Saxton and I are working on a Python library called `resample`, which implements the bootstrap and jackknife. We would like to work toward merging bootstrap functions into Scipy and it would be great to get some feedback about this. We would be pleased to collaborate with people who are already working on this in Scipy. We are both pretty decent programmers, knowledgeable about statistics in general and the bootstrap in particular.
>>
>> Thanks, Hans. We would be very interested in adding bootstrap methods to SciPy!
>>
>> I might not get to it for a few days, but I'll take a look at your library and see if it makes sense to incorporate it into SciPy. If any other SciPy devs can get to it sooner, please take a look!
>
> that is excellent, thanks! The basic functionality is all there. We are currently working on refining the interface, the docs need more work, and we want to add more unit tests. Currently, the project is not at a quality level fit for SciPy, but I am sure we can get there.
>
> Best regards,
> Hans

From charlesr.harris at gmail.com Wed Oct 28 22:12:30 2020 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 28 Oct 2020 20:12:30 -0600 Subject: [SciPy-Dev] NumPy 1.19.3 release Message-ID:

Hi All,

On behalf of the NumPy team I am pleased to announce that NumPy 1.19.3 has been released. NumPy 1.19.3 is a small maintenance release with two major improvements:

- Python 3.9 binary wheels on all supported platforms,
- OpenBLAS fixes for Windows 10 version 2004 fmod bug.
This release supports Python 3.6-3.9 and is linked with OpenBLAS 0.3.12 to avoid some of the fmod problems on Windows 10 version 2004. Microsoft is aware of the problem, and users should upgrade once its fix becomes available; the workaround in this release is limited in scope.

NumPy wheels for this release can be downloaded from PyPI; source archives, release notes, and wheel hashes are available on GitHub. Linux users will need pip >= 19.3 in order to install manylinux2010 and manylinux2014 wheels.

*Contributors*

A total of 8 people contributed to this release. People with a "+" by their names contributed a patch for the first time.

- Charles Harris
- Chris Brown +
- Daniel Vanzo +
- E. Madison Bray +
- Hugo van Kemenade +
- Ralf Gommers
- Sebastian Berg
- @danbeibei +

*Pull requests merged*

A total of 10 pull requests were merged for this release.

- #17298: BLD: set upper versions for build dependencies
- #17336: BUG: Set deprecated fields to null in PyArray_InitArrFuncs
- #17446: ENH: Warn on unsupported Python 3.10+
- #17450: MAINT: Update test_requirements.txt.
- #17522: ENH: Support for the NVIDIA HPC SDK nvfortran compiler
- #17568: BUG: Cygwin Workaround for #14787 on affected platforms
- #17647: BUG: Fix memory leak of buffer-info cache due to relaxed strides
- #17652: MAINT: Backport openblas_support from master.
- #17653: TST: Add Python 3.9 to the CI testing on Windows, Mac.
- #17660: TST: Simplify source path names in test_extending.

Cheers,

Charles Harris

From warren.weckesser at gmail.com Wed Oct 28 23:34:07 2020
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Wed, 28 Oct 2020 23:34:07 -0400
Subject: [SciPy-Dev] NumPy 1.19.3 release
In-Reply-To:
References:
Message-ID:

On 10/28/20, Charles R Harris wrote:
> Hi All,
>
> On behalf of the NumPy team I am pleased to announce that NumPy 1.19.3 has
> been released.
> NumPy 1.19.3 is a small maintenance release with two major improvements:
>
> - Python 3.9 binary wheels on all supported platforms,
> - OpenBLAS fixes for the Windows 10 version 2004 fmod bug.
>
> This release supports Python 3.6-3.9 and is linked with OpenBLAS 0.3.12
> to avoid some of the fmod problems on Windows 10 version 2004. Microsoft
> is aware of the problem, and users should upgrade once its fix becomes
> available; the workaround in this release is limited in scope.
>
> NumPy wheels for this release can be downloaded from PyPI; source
> archives, release notes, and wheel hashes are available on GitHub. Linux
> users will need pip >= 19.3 in order to install manylinux2010 and
> manylinux2014 wheels.
>
> *Contributors*
>
> A total of 8 people contributed to this release. People with a "+" by
> their names contributed a patch for the first time.
>
> - Charles Harris
> - Chris Brown +
> - Daniel Vanzo +
> - E. Madison Bray +
> - Hugo van Kemenade +
> - Ralf Gommers
> - Sebastian Berg
> - @danbeibei +
>
> *Pull requests merged*
>
> A total of 10 pull requests were merged for this release.
>
> - #17298: BLD: set upper versions for build dependencies
> - #17336: BUG: Set deprecated fields to null in PyArray_InitArrFuncs
> - #17446: ENH: Warn on unsupported Python 3.10+
> - #17450: MAINT: Update test_requirements.txt.
> - #17522: ENH: Support for the NVIDIA HPC SDK nvfortran compiler
> - #17568: BUG: Cygwin Workaround for #14787 on affected platforms
> - #17647: BUG: Fix memory leak of buffer-info cache due to relaxed strides
> - #17652: MAINT: Backport openblas_support from master.
> - #17653: TST: Add Python 3.9 to the CI testing on Windows, Mac.
> - #17660: TST: Simplify source path names in test_extending.
>
> Cheers,
>
> Charles Harris

Thanks for managing the release, Chuck!
Warren

From ndbecker2 at gmail.com Thu Oct 29 09:02:46 2020
From: ndbecker2 at gmail.com (Neal Becker)
Date: Thu, 29 Oct 2020 09:02:46 -0400
Subject: [SciPy-Dev] buglet in signal.bilinear_zpk example
Message-ID:

The labels on the two plots in the example here are swapped: https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.bilinear_zpk.html#scipy.signal.bilinear_zpk

The first plot is actually the z domain and the second plot is the s domain.

--
*Those who don't understand recursion are doomed to repeat it*

From andyfaff at gmail.com Thu Oct 29 17:06:52 2020
From: andyfaff at gmail.com (Andrew Nelson)
Date: Fri, 30 Oct 2020 08:06:52 +1100
Subject: [SciPy-Dev] Reviewing/dev/merge traction on some optimize PRs
Message-ID:

Hi all,

there are a few optimize PRs that I could use some help with in the review/dev/merge cycle. They mainly address issues that several people have reported, related to the numerical-differentiation changes made to the minimize methods in 1.5. In order of importance:

https://github.com/scipy/scipy/pull/12889
https://github.com/scipy/scipy/pull/13009
https://github.com/scipy/scipy/pull/12998

(https://github.com/scipy/scipy/pull/11263 still needs me to do a rebase and fix a merge conflict, and I don't think there are reviewer objections to it.)

I'm asking for help because these are non-trivial changes, and I'd like to see them merged (or the underlying issues addressed) well before 1.6, so that any further problems have a chance to be resolved. Also, they all touch the same area of code, so merging one may create further merge conflicts in the others.

--
_____________________________________
Dr. Andrew Nelson
_____________________________________
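Returning to the bilinear_zpk report above: the mapping the two panels illustrate is the bilinear transform z = (2*fs + s) / (2*fs - s), which sends every stable s-plane pole (negative real part) to a point inside the unit circle. The sketch below reimplements that math in plain NumPy for illustration; it follows the same formulas as scipy.signal.bilinear_zpk but is not the SciPy implementation, and the filter values are made up rather than taken from the linked example:

```python
import numpy as np

def bilinear_zpk_sketch(z, p, k, fs):
    """Map s-plane zeros/poles/gain to the z-plane via the bilinear transform.

    Assumes len(p) >= len(z), as for a proper transfer function.
    """
    z, p = np.atleast_1d(z), np.atleast_1d(p)
    fs2 = 2.0 * fs
    z_z = (fs2 + z) / (fs2 - z)  # transformed zeros
    p_z = (fs2 + p) / (fs2 - p)  # transformed poles
    # Zeros at s = infinity map to z = -1; append them to match degrees.
    degree = len(p) - len(z)
    z_z = np.append(z_z, -np.ones(degree))
    # Gain compensation from the change of variables s -> 2*fs*(z-1)/(z+1).
    k_z = k * np.real(np.prod(fs2 - z) / np.prod(fs2 - p))
    return z_z, p_z, k_z

# A stable analog pole pair (negative real part) sampled at fs = 10 Hz.
z_s = np.array([])
p_s = np.array([-1.0 + 2.0j, -1.0 - 2.0j])
z_d, p_d, k_d = bilinear_zpk_sketch(z_s, p_s, k=1.0, fs=10.0)
print(np.abs(p_d))  # both magnitudes < 1: stable in the z domain too
```

This also shows why the swapped labels matter: the panel whose poles sit in the left half-plane is the s domain, and the panel whose poles sit inside the unit circle is the z domain.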